Reengineering Classification at the USPTO Marti Hearst, Chief IT Strategist, USPTO PIUG Conference May 4, 2010


Reengineering Classification at the USPTO

Marti Hearst, Chief IT Strategist, USPTO

PIUG Conference

May 4, 2010

Questions to Consider

• What is classification used for?
• What currently works well?
• What is currently problematic about the classification system?
• How can we fix these problems?

How is classification used?

• As a search aid
  – Grouping related art together
  – Narrowing down what needs to be looked at
• To assign work
  – Associated with art units
  – Assigning applications to examiners

What are its strengths?

• Familiar
• For some art areas, is nicely detailed.

What are the problems?

• Time-consuming to update

• Backwards looking (does not anticipate new directions in technology)

• The non-patent literature is not classified

• Is difficult to understand:

– For new examiners

– For many in the external community

• Is not harmonized internationally

What are some goals?

• Note: these are just ideas to discuss.
  – Make it easier for examiners, managers, or others to suggest new classes as they arise.
  – Engage the external community in suggesting up-and-coming classes.
  – Engage the community in classifying NPL.

What are some goals?

• Note: these are just ideas to discuss.
  – Flexible, adaptive to rapidly-developing technology
  – Easier to browse classified documents dynamically
  – Easier to classify complex topics that span many fields
  – Easier to classify across different points of view (structure vs function, for instance)
  – More aligned with modern classification practices and technology.

Modern Classification

• Today, most online systems that use classes use faceted classification.
  – Not just e-commerce, but also digital libraries
    • Bioscience (gopubmed.org, nextbio.com)
    • Computer Science (dblp.l3s.de)
    • Worldcat library catalog
    • U Chicago DL (http://lens.lib.uchicago.edu/)
    • Image collections

Examples shown: lens.lib.uchicago.edu, worldcat.org, http://gopubmed.org/web/goweb

How to apply to the USPC?

• Speech Signal Processing
. Psychoacoustic
. For storage or transmission
.. Neural Network
.. Transformation
... Orthogonal functions
.. Frequency
... Specialized Information
.... Pitch
..... Voiced or Unvoiced
.... Formant
.... Silence decision
.. Voice recognition
... Preliminary matching
... Endpoint detection
.. Word recognition
... Preliminary matching
... Endpoint detection
... Specialized models
.... Markov
..... Hidden Markov Models (HMM)
...... Training of HMM
....... With insufficient amount of training data
...... HMM network
. Synthesis
.. Neural network
.. Transformation

How to apply to the USPC?

... Specialized models
.... Markov
..... Hidden Markov Models (HMM)
...... Training of HMM
....... With insufficient amount of training data
...... HMM network

.. Neural Network

.. Transformation
... Orthogonal functions
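To make the dot-indent notation concrete, here is a minimal Python sketch that turns a few of the titles above into full paths from the root. The function name `parse_dot_outline` and the ASCII-dot input format are assumptions made for illustration, not a USPTO tool or file format. The sketch also shows one consequence of a single hierarchy: "Neural Network" has to be repeated under more than one parent.

```python
def parse_dot_outline(lines):
    """Return one full path (tuple of titles) per outline line.

    Each line starts with N dots, where N is its depth below the root
    (hypothetical input format, modelled on the dot-indent notation above).
    """
    paths = []
    stack = []  # stack[d] is the current title at depth d
    for line in lines:
        text = line.strip()
        depth = len(text) - len(text.lstrip("."))
        title = text[depth:].strip()
        del stack[depth:]      # leave only the ancestors of this entry
        stack.append(title)
        paths.append(tuple(stack))
    return paths


# A subset of the speech-processing entries listed above.
outline = [
    "Speech Signal Processing",
    ". For storage or transmission",
    ".. Neural Network",
    ".. Transformation",
    "... Orthogonal functions",
    ". Synthesis",
    ".. Neural network",
]

paths = parse_dot_outline(outline)

# The same method appears under two different parents.
nn_paths = [p for p in paths if p[-1].lower() == "neural network"]
for p in nn_paths:
    print(" > ".join(p))
```

Because each entry has exactly one parent, a method that is relevant to several problems has to be listed once per problem.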

How to Facet the USPC?

• Speech Signal Properties
. Psychoacoustic
. Pitch
.. Voiced or unvoiced
. Formant

• Speech Signal Problems
. Voice recognition
. Word recognition
. Phrase recognition
. Noise reduction
. Generation

• Speech Signal Applications
. Text to speech
. Speech to text
. Meeting recording

• Machine Learning Methods
. HMMs
.. HMM Network
. Neural Nets
.. Linear
.. Sigmoidal
. Maximum Entropy

• Machine Learning Techniques issues
. Limited training data
. Training techniques

• Data Transformations
. Orthogonal functions
. Quantization
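As a rough illustration of how facets like these could drive browsing, the Python sketch below tags a few documents with values from independent facets, filters on any combination, and counts the remaining values for a navigation sidebar. The facet names and values come from the list above; the document ids, their assignments, and the helper names `select` and `facet_counts` are hypothetical.

```python
from collections import Counter

# Hypothetical documents. Each maps facet name -> set of assigned values.
DOCS = {
    "doc-A": {
        "Speech Signal Problems": {"Word recognition"},
        "Machine Learning Methods": {"HMMs"},
        "Machine Learning Techniques issues": {"Limited training data"},
    },
    "doc-B": {
        "Speech Signal Problems": {"Noise reduction"},
        "Data Transformations": {"Orthogonal functions"},
    },
    "doc-C": {
        "Speech Signal Applications": {"Text to speech"},
        "Speech Signal Problems": {"Generation"},
        "Machine Learning Methods": {"Neural Nets"},
    },
    "doc-D": {
        "Speech Signal Problems": {"Word recognition"},
        "Machine Learning Methods": {"Neural Nets"},
    },
}


def select(docs, wanted):
    """Ids of documents that carry every requested facet -> value pair."""
    return sorted(
        doc_id
        for doc_id, facets in docs.items()
        if all(value in facets.get(facet, set()) for facet, value in wanted.items())
    )


def facet_counts(docs, doc_ids, facet):
    """How many of the given documents carry each value of one facet."""
    counts = Counter()
    for doc_id in doc_ids:
        counts.update(docs[doc_id].get(facet, set()))
    return counts


# Start from a problem, then see which methods remain to narrow by.
hits = select(DOCS, {"Speech Signal Problems": "Word recognition"})
counts = facet_counts(DOCS, hits, "Machine Learning Methods")
narrowed = select(DOCS, {
    "Speech Signal Problems": "Word recognition",
    "Machine Learning Methods": "HMMs",
})
print(hits, dict(counts), narrowed)
```

Each facet is assigned independently, so a method such as HMMs is recorded once and can be combined with any problem or application instead of being repeated under each branch.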

Potential Advantages

• More familiar to outsiders
• More flexible for navigation
• Easy to include multiple points of view
• Potentially easier to capture cross-domain similarities (more systematic than x-class)
• May integrate well with IPC and F-Terms
• May be easier to automate assignment
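One way to picture the cross-domain point is an inverted index from facet values to documents. In the Python sketch below, the records, the art-area labels ("speech", "image analysis"), and the function name `cross_domain_matches` are all hypothetical; only the facet names and values are taken from the slides.

```python
from collections import defaultdict

# Hypothetical records: (document id, art area, facet, value).
RECORDS = [
    ("speech-1", "speech", "Machine Learning Methods", "HMMs"),
    ("speech-1", "speech", "Speech Signal Problems", "Word recognition"),
    ("speech-2", "speech", "Data Transformations", "Quantization"),
    ("image-1", "image analysis", "Machine Learning Methods", "HMMs"),
    ("image-2", "image analysis", "Data Transformations", "Quantization"),
    ("image-3", "image analysis", "Machine Learning Methods", "Neural Nets"),
]


def cross_domain_matches(records):
    """Facet values that occur in more than one art area, with their documents."""
    index = defaultdict(lambda: defaultdict(set))
    for doc_id, area, facet, value in records:
        index[(facet, value)][area].add(doc_id)
    return {
        key: {area: sorted(ids) for area, ids in areas.items()}
        for key, areas in index.items()
        if len(areas) > 1
    }


shared = cross_domain_matches(RECORDS)
for (facet, value), areas in shared.items():
    print(f"{facet} / {value}: {areas}")
```

With shared facets, documents from different art areas that use the same method surface through a single lookup, without a separate cross-reference being recorded for each pair of classes.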

Is this a good idea?

• We’re doing a pilot test on classes in 2600.
• There are of course many issues to consider.
• Let’s discuss!