MyIntelliPatent
Second generation semantic technologies for patent analysis
Alberto Ciaramella - IntelliSemantic
Marco Ciaramella - IntelliSemantic
PATINFO 2015 - 10/6/2015 Ilmenau
This presentation
Semantic technologies in patent solutions are sometimes controversial.
The first part of this presentation provides a framework and a fair overview of what exists now, and anticipates some coming evolutions that belong to second generation semantic technologies.
The second part of this presentation provides IntelliSemantic specific examples.
Content
IntelliSemantic: an introduction
Patent tasks, phases and technologies
Second generation semantic technologies
Semantic technologies demos
Embedded in MyIntelliPatent
TOPAS originated technologies
Conclusions and follow-up
Status
Introduction (ok)
Patent tasks (almost ok, but still requires 2 slides to conclude)
Second generation semantic technologies (the most significant section: it has to be substantially rewritten and simplified)
MyIntelliPatent with semantics and demo (ok)
Other semantic technologies and demo (not difficult to do)
Conclusions (not difficult to do)
Speakers
Alberto Ciaramella background:
Intellisemantic CEO/founder in 2005.
Before that:
Researcher and Research Manager at CSELT, the research branch of Telecom Italia, for speech and Natural Language Processing.
Competitive Intelligence Manager at Loquendo SpA, the CSELT spin-off for speech and language processing.
Marco Ciaramella background:
Intellisemantic product manager since 2009.
Before that:
Project officer at Engineering.
Technology consultant at HP.
IntelliSemantic
This slide has been produced initially for the PDG meeting, but it is now more precise
IntelliSemantic
The company
Solutions
The patent information challenge
IntelliSemantic
Founded in 2005 in Torino, in the incubator of the Politecnico di Torino.
Competences: natural language processing.
Research activities:
partner of the FP7 co-funded TOPAS project (Tool Platform for Patent Analysis and Summarization).
internal R&D activities for MyIntelliPatent.
partner of some research projects on open data and NLP, co-funded by the Piemonte or Veneto regions.
It is polite to introduce the company. It is not easy to summarize a company in one slide, but we will try: this is IntelliSemantic in a slide.
The obvious difficult questions are about the company size, the market adoption of IntelliSemantic solutions and so on: be ready to answer!
IntelliSemantic solutions
On the information side, the number of patents worldwide is continuously increasing, and so is the effort required for any kind of patent-related task.
On the user side, the number of companies whose business can be affected by patent information is increasing and now includes a significant percentage of SMEs, which can be even tighter on costs.
But if patent analyses are performed less frequently or less deeply than required, a company can incur:
higher costs, if it fails to spot in due time a competitor that can invalidate its research efforts.
fewer benefits, if it does not have the time to extract hidden suggestions from the patent literature.
The patent information challenge
Since the number of patents to monitor is continuously increasing, the effort required for any kind of patent analysis is increasing as well. Another factor that increases this effort is the growing variety of relevant languages, with an increasing number of patents available only in languages other than English. This is a supplier side factor. On the user side, an increasing number of companies, even SMEs, need to perform these analyses.
A solution to this challenge is to deliver smarter tools that allow professionals to concentrate on the higher value-added part of their activity.
Smarter tools can include features such as:
Patent specific knowledge management, to:
learn, accumulate, and reuse the knowledge of the company professionals.
provide a structured approach for different use cases.
Intelligent language technologies to automatically extract the knowledge embedded in the text, such as the most relevant entities and passages, and to identify the patent document structure as well.
How to solve this challenge
The focus is clearly on company professionals. In any case, we have to be ready for the possible question: what about external consultants?
Patent tasks, phases and technologies
Tasks
Phases
Technologies
Semantic technologies in more detail
Patent informatics solutions
Patent informatics solutions can be categorized according to three different dimensions:
tasks.
interaction phases.
technologies used.
This framework is useful:
to compare solutions.
to identify the potential benefits of a new technology on different tasks and interaction phases.
Tasks
monitoring:
new published applications, status evolutions of already known documents.
searches:
Prior art, validity, freedom to operate.
analyses:
Technologies, competitors.
Interaction phases
query:
by metadata, by text, by reference.
patent set results analysis:
extracts distributions (e.g. by applicant).
identifies correlations.
ranks documents to analyze in more detail.
single patent analysis:
identifies main sections.
identifies main topics.
navigates through topics and sections.
Phases: some conclusions
the query and the patent set results analysis are characterized by recall and precision:
the recall is measured by (relevant results found / total relevant results in the database).
the precision is measured by (relevant results found / total results found).
recall and precision typically trade off against each other: increasing one tends to decrease the other.
a safe strategy is to maximize the recall of the query, then use precise and efficient technologies to analyze results.
the single patent analysis can become more efficient by using suitable technologies.
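The two ratios above can be checked on a toy example. This is only a minimal sketch: the patent identifiers and the two queries are invented for illustration, and the function name `recall_precision` is an assumption, not part of any product.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for retrieved ids versus relevant ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant results found
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical database containing 4 relevant patents for the task at hand.
relevant = {"P1", "P2", "P3", "P4"}

# A narrow query finds few documents, all relevant.
narrow_query = {"P1", "P2"}
# A broad query finds every relevant document, plus some noise.
broad_query = {"P1", "P2", "P3", "P4", "P5", "P6", "P7", "P8"}

print(recall_precision(narrow_query, relevant))  # (0.5, 1.0)
print(recall_precision(broad_query, relevant))   # (1.0, 0.5)
```

The broad query illustrates the "safe strategy": full recall at query time, with the loss of precision to be recovered in the results analysis phase.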
Tasks and phases: requirements
Technology generations (1)
based on metadata only:
e.g. querying by IPC and applicant.
text-based, the most popular of which are:
boolean, e.g. querying by "speaker recognition" AND "hidden Markov models". Results are either included or not.
vector based, i.e. by comparing the words of the query with the words of each result, both represented as term vectors. Results are ranked.
vector based with term dependencies. A noteworthy example is Latent Semantic Analysis. Results can be clustered.
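The difference between the boolean and the vector based approach can be sketched in a few lines. This is an illustration under simplifying assumptions: whitespace tokenization, raw term counts with cosine similarity, and three invented documents. Latent Semantic Analysis is not shown, since it additionally requires a matrix decomposition of the term-document matrix.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def boolean_and(query_terms, doc):
    """Boolean retrieval: the document is included only if it contains every term."""
    words = set(tokenize(doc))
    return all(term in words for term in query_terms)


def cosine(a, b):
    """Vector based retrieval: cosine similarity between term-count vectors."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


# Invented mini collection, for illustration only.
docs = {
    "D1": "speaker recognition with hidden markov models",
    "D2": "speaker verification using neural networks",
    "D3": "piston speed control",
}
query = "speaker recognition markov"

# Boolean: results are included or not.
boolean_hits = [d for d, text in docs.items() if boolean_and(tokenize(query), text)]
# Vector based: all results are ranked by similarity.
ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)

print(boolean_hits)  # ['D1']
print(ranked)        # ['D1', 'D2', 'D3']
```

Note that D2 is excluded by the boolean query but still appears in the ranked list, because it shares one term with the query.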
Some more details are provided in http://en.wikipedia.org/wiki/Information_retrieval , which actually lists 3*3 cases. In our slide, however, we have included only the three most popular methods.
Technology generations (2)
vector based technologies with term interdependencies have been called semantic technologies in patent informatics, since Latent Semantic Analysis (LSA) is the most popular in this class.
LSA clusters co-occurring terms, hence it simulates an intelligent behaviour.
these technologies are typically more focused on recall.
Technology generations (3)
second generation semantic solutions can be defined as those having at least one of these characteristics:
being user controllable, e.g. by relying on user defined lexicons.
including patent specific algorithms, e.g. patent segmentation.
these technologies are typically more focused on precision.
Technologies: conclusions
We ordered technologies by time, without implying that a technology is superior to others simply because it is the most recent, or that an older technology is to be deprecated.
Technologies of different generations can coexist in the same application:
for different tasks and phases.
for different objectives, such as increasing the recall or increasing the precision.
Define your requirements first, then select a technology, but:
A new technology can suggest new requirements.
Second generation semantic technologies
A taxonomy
Technologies and tasks enabled
A high level functions taxonomy
entities extraction:
Generic entities or tags.
Qualified entities, i.e. entities of a specific type only, such as measurements, substances or methods.
entities relationships identification:
short range: to relate entities in the same sentence.
long range: to relate claims and description.
patent structure identification:
the patent is a structured text.
the role of an entity is section specific, hence different in prior art or in claims.
Technologies and application
The technologies mentioned in the previous slide can be used in very different ways, since they can be used:
for different phases.
stand alone or in combination.
for enhancing a manual or an automatic process.
The most important issues in selecting these technologies are:
to figure out their advantages on the application side.
to select the most appropriate combination of application and technology.
Generic entity (or tag) extractor
a tag is a word (e.g. inductor) or a sequence of words (e.g. speaker verification) having a well defined meaning.
from the implementation point of view we have to distinguish two phases:
the off-line annotation.
the real time user interactions with annotated documents.
this distinction also applies to the other technologies mentioned in the following.
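The split between the two phases can be sketched as follows. This is an assumption-laden illustration, not the MyIntelliPatent implementation: the vocabulary, the three patent texts and the function names `annotate` and `docs_with_tag` are hypothetical, and tags are matched by a simple case-insensitive whole-word search.

```python
import re
from collections import defaultdict

# Hypothetical vocabulary, reusing the two example tags of this slide.
VOCABULARY = ["inductor", "speaker verification"]


def annotate(collection, vocabulary):
    """Off-line phase: scan every text once and build tag -> {doc_id: count}."""
    index = defaultdict(dict)
    for tag in vocabulary:
        pattern = re.compile(r"\b" + re.escape(tag) + r"\b", re.IGNORECASE)
        for doc_id, text in collection.items():
            count = len(pattern.findall(text))
            if count:
                index[tag][doc_id] = count
    return index


def docs_with_tag(index, tag):
    """Real time phase: a dictionary lookup, with no text scanning."""
    return sorted(index.get(tag, {}))


# Invented patent snippets, for illustration only.
patents = {
    "P1": "An inductor is coupled to a second inductor.",
    "P2": "A method for speaker verification over a telephone line.",
    "P3": "Speaker verification using an inductor-based microphone.",
}

index = annotate(patents, VOCABULARY)            # done once, off-line
print(docs_with_tag(index, "inductor"))          # ['P1', 'P3']
print(docs_with_tag(index, "speaker verification"))  # ['P2', 'P3']
```

The costly text processing happens only in `annotate`; user interactions then work on the stored annotations.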
Examples of applications enabled
to build up topic specific vocabularies from collections of topic specific patent sets.
for queries: to extract the most relevant topics in a patent and suggest them to the user in tasks like validity search and prior art search.
for patent set analyses:
to identify patents citing the same topics.
to score patents by topic richness.
to identify the topic distribution (by applicant, by year).
for single patent analysis: to navigate a patent document by following the same topic.
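Once patents carry tags, the patent set analyses listed above reduce to simple set and counting operations. The following sketch assumes already annotated patents; the records, the vocabulary and the function names are invented for illustration.

```python
from collections import Counter

# Hypothetical annotated patents: metadata plus the tags found by an extractor.
patents = [
    {"id": "P1", "applicant": "A", "year": 2012,
     "tags": {"inductor", "resonant circuit"}},
    {"id": "P2", "applicant": "B", "year": 2013,
     "tags": {"inductor"}},
    {"id": "P3", "applicant": "A", "year": 2013,
     "tags": {"inductor", "speaker verification", "resonant circuit"}},
]


def shared_tags(a, b):
    """Topics cited by both patents."""
    return a["tags"] & b["tags"]


def richness(patent, vocabulary):
    """Topic richness: how many vocabulary tags the patent mentions."""
    return len(patent["tags"] & vocabulary)


def distribution(collection, tag, field):
    """Number of patents mentioning `tag`, grouped by a metadata field."""
    return Counter(p[field] for p in collection if tag in p["tags"])


vocabulary = {"inductor", "resonant circuit"}
by_richness = sorted(patents, key=lambda p: richness(p, vocabulary), reverse=True)

print([p["id"] for p in by_richness])                  # ['P1', 'P3', 'P2']
print(distribution(patents, "inductor", "year"))       # Counter({2013: 2, 2012: 1})
print(distribution(patents, "resonant circuit", "applicant"))  # Counter({'A': 2})
```

The year distribution is the kind of table that lets a user spot the first patents using a concept and the concepts that are most popular now.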
Qualified entities
Measurements, which can include:
physical unit (e.g. volt) and prefix (e.g. milli).
numbers (e.g. 10) and numerals (e.g. ten).
closed intervals (e.g. between 1 and 2 nm).
open intervals (e.g. up to 1 nm).
tolerance values.
Citations of patents and non patent literature.
Substances, such as aluminium.
Processes, such as redundancy control.
Technical qualities, such as piston speed.
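A small part of the measurement cases listed above can be sketched with regular expressions. This is a toy illustration, not the TOPAS extractor: the unit and prefix inventories are deliberately tiny assumptions, and numerals ("ten") and tolerance values are not covered.

```python
import re

# Assumed, deliberately tiny inventories; a real extractor would need far more.
NUM = r"(\d+(?:\.\d+)?)"
UNIT = r"([mkn]?)(V|m)\b"  # optional prefix (milli, kilo, nano) + unit (volt, metre)

# Most specific patterns first, so that intervals are not re-read as single values.
PATTERNS = [
    ("closed_interval", re.compile(rf"between\s+{NUM}\s+and\s+{NUM}\s*{UNIT}")),
    ("open_interval", re.compile(rf"up to\s+{NUM}\s*{UNIT}")),
    ("value", re.compile(rf"{NUM}\s*{UNIT}")),
]


def extract_measurements(text):
    """Return (kind, values, prefix, unit) tuples in order of appearance."""
    found, taken = [], []
    for kind, pattern in PATTERNS:
        for m in pattern.finditer(text):
            # Skip matches overlapping a span already claimed by an earlier pattern.
            if any(m.start() < end and start < m.end() for start, end in taken):
                continue
            taken.append(m.span())
            *values, prefix, unit = m.groups()
            found.append((m.start(), kind, [float(v) for v in values], prefix, unit))
    return [item[1:] for item in sorted(found)]


text = "a thickness between 1 and 2 nm, a gap up to 1 nm and a bias of 10 mV"
for measurement in extract_measurements(text):
    print(measurement)
# ('closed_interval', [1.0, 2.0], 'n', 'm')
# ('open_interval', [1.0], 'n', 'm')
# ('value', [10.0], 'm', 'V')
```

Keeping prefix and unit as separate fields makes it possible to compare measurements expressed with different prefixes in a later step.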
Examples of applications enabled
for patent set analyses:
to identify patents mentioning specific measurements and ranges.
to categorize patents as more related to substances, to methods, or to a combination of both.
Structure identification functions
to identify the structure of the description:
first level, such as technical field, background art, summary of invention, description of drawings, preferred embodiment.
second level, such as preferred embodiments.
third level, such as objective, advantages.
to identify the structure of claims:
interclaim, such as dependent and independent claims.
intraclaim, such as preamble, transition, aspects.
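The first level description structure and the claim structure can be sketched with simple heuristics. This is only an illustration under strong assumptions: section headings appear on their own line with exactly the wording listed above, dependent claims refer to their parent with phrases like "according to claim N", and the transition word list is a small invented sample. Real patents are far less regular.

```python
import re

SECTION_HEADINGS = {
    "technical field", "background art", "summary of invention",
    "description of drawings", "preferred embodiment",
}


def segment_description(text):
    """First level segmentation: map each heading to the text that follows it."""
    sections, current = {}, None
    for line in text.splitlines():
        key = line.strip().lower()
        if key in SECTION_HEADINGS:
            current = key
            sections[current] = []
        elif current is not None and line.strip():
            sections[current].append(line.strip())
    return {heading: " ".join(body) for heading, body in sections.items()}


CLAIM_REF = re.compile(r"\b(?:according to|as claimed in|of)\s+claim\s+(\d+)", re.IGNORECASE)
TRANSITION = re.compile(r"\b(comprising|consisting of|wherein)\b", re.IGNORECASE)


def claim_parents(claims):
    """Interclaim structure: claim number -> parent claim, or None if independent."""
    parents = {}
    for number, text in claims.items():
        match = CLAIM_REF.search(text)
        parents[number] = int(match.group(1)) if match else None
    return parents


def split_claim(text):
    """Intraclaim structure: (preamble, transition, body) around the first transition."""
    match = TRANSITION.search(text)
    if not match:
        return text.strip(), None, None
    return (text[:match.start()].strip(" ,"), match.group(1).lower(),
            text[match.end():].strip(" ."))


description = (
    "Technical field\nThe invention relates to inductors.\n"
    "Background art\nKnown inductors are bulky.\n"
    "Summary of invention\nA compact inductor is proposed."
)
claims = {
    1: "An inductor comprising a coil and a core.",
    2: "The inductor according to claim 1, wherein the core is ferrite.",
    3: "A method of manufacturing an inductor.",
}

print(segment_description(description)["background art"])  # Known inductors are bulky.
print(claim_parents(claims))   # {1: None, 2: 1, 3: None}
print(split_claim(claims[1]))  # ('An inductor', 'comprising', 'a coil and a core')
```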
Examples of applications enabled
patent segmentation only:
patent set analyses: to select specific patent sections, such as background art, and compare them.
single patent analysis: to build a patent document directory, which can facilitate the navigation of the patent document.
combined with entity extraction:
these technologies combine naturally, since the meaning of an entity can depend on the patent segment.
MyIntelliPatent and semantics
MyIntelliPatent
Structured interaction
Tags
Tasks supported
Demo
MyIntelliPatent
A smart solution for patent intelligence tasks.
MyIntelliPatent includes the company specific knowledge, since it is provided as a password-protected Software as a Service and repository. A company can build and access its specific vocabularies, patent sets and patent annotations.
MyIntelliPatent supports structured interactions, as detailed in the following.
MyIntelliPatent includes intelligent language technologies, as detailed in the following.
Fine. In any case, this is the main slide and it could be partially rewritten.
Rather than as a personalized solution, it is better to present it as a knowledge management system enhanced with NL features.
Structured interaction
Queries by metadata, by a reference patent, by a reference text or even by a patent list.
A first level results analysis through QuickView.
A second level analysis and statistics including metadata through Search/Statistics.
A third level analysis and statistics including tags through Tag and Search/Statistics.
This slide is nice, but a little too complicated here. It is better to replace it with the table in the quick user manual.
Second level analysis by Search/Statistics
The Search/Statistics page allows the user to identify the most relevant patents (by family size, by citations) or the most interesting ones (by applicant), to extract different kinds of results tables, to order these results by different criteria, and to extract statistics. The examples shown here are based on metadata only. Tags allow more refined analyses, as shown in the following slide.
This is just a slide presenting a screenshot with a very interesting, although specific, feature, which emphasizes the importance of tagging, an important feature of MyIntelliPatent.
Other more general screenshots could be provided, such as the screenshot presenting the ordered list of results. In any case we prefer not to add other screenshots, since:
They can evolve with the evolution of the product, hence we could have an additional problem in the maintenance of this presentation.
This presentation is typically followed by the product presentation, in which these details are more appropriate.
Linguistic intelligence: Tags
A tag is a word (e.g. inductor) or a sequence of words (e.g. speaker verification) having a well defined meaning.
Tags are a distinguishing feature of MyIntelliPatent.
MyIntelliPatent can:
suggest a topic specific vocabulary from a set of topic specific patents.
allow the user to edit this suggested vocabulary.
apply the finally edited vocabulary to all collections, in such a way that the vocabulary tags found in a patent become new text-specific metadata.
Different topic specific vocabularies can be present in the same platform.
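The suggest, edit and apply cycle described above can be sketched as follows. This is not the MyIntelliPatent algorithm, which is not described in this presentation: the suggestion heuristic (keep words and word pairs occurring in at least `min_docs` topic patents), the stopword list and all the texts are assumptions made for illustration.

```python
from collections import Counter

STOPWORDS = {"a", "an", "the", "of", "for", "and", "to", "is", "in", "with"}


def candidates(text):
    """Candidate tags: single words and two-word sequences without stopwords."""
    words = [w.strip(".,;:()").lower() for w in text.split()]
    words = [w for w in words if w]
    terms = [w for w in words if w not in STOPWORDS]
    terms += [f"{a} {b}" for a, b in zip(words, words[1:])
              if a not in STOPWORDS and b not in STOPWORDS]
    return terms


def suggest_vocabulary(topic_docs, min_docs=2):
    """Step 1: suggest the terms that occur in at least `min_docs` topic patents."""
    doc_freq = Counter()
    for text in topic_docs:
        doc_freq.update(set(candidates(text)))
    return sorted(term for term, n in doc_freq.items() if n >= min_docs)


def apply_vocabulary(collection, vocabulary):
    """Step 3: attach the vocabulary tags found in each patent as new metadata."""
    return {doc_id: sorted(tag for tag in vocabulary if tag in text.lower())
            for doc_id, text in collection.items()}


# Invented topic specific patent texts.
topic_docs = [
    "A method for speaker verification.",
    "Speaker verification with a neural network.",
    "An apparatus for speaker verification and speech coding.",
]
suggested = suggest_vocabulary(topic_docs)
print(suggested)  # ['speaker', 'speaker verification', 'verification']

# Step 2: the user edits the suggestion, here keeping only the two-word tag.
edited = {"speaker verification"}

collection = {
    "P1": "Speaker verification over a telephone line.",
    "P2": "A piston speed sensor.",
}
print(apply_vocabulary(collection, edited))
# {'P1': ['speaker verification'], 'P2': []}
```

The user editing step is what makes the technology user controllable: the automatic suggestion is only a starting point.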
Motivate why tagging adds intelligence
Some examples of tag use
This slide will be rewritten merging this information with the information of the table in the quick introduction
Extracting a tags vocabulary
The Edit & Tag page allows the user to extract the most relevant tags from a set of patents, to analyze these suggested tags, to edit them, and to confirm the user validated vocabulary of tags. The user can also copy and paste his/her own suggested vocabulary.
First level analysis by QuickView
This level of analysis provides a quick view of the patent applicant, title, summary and extracted tags, which is a good proxy for identifying the interest of the patent for the user. In case of doubt, he/she can directly access the whole document from this page. This level of analysis can be enough for some tasks, such as quick prior art searches.
Tags in third level analysis: an example
Tags allow the user to identify the most relevant concepts in a patent and to extend the analysis based on metadata. This table summarizes the number of patents by year using a specific tag, and makes it possible to identify the first patents using a concept and the most popular concepts now.
This slide can be retained as a useful example.
TOPAS demo
The TOPAS project, participants and results
A demo with Patent description and Measurements extraction
IntelliSemantic planned exploitation
TOPAS demo
This demo exemplifies some second generation semantic technologies not yet integrated in MyIntelliPatent.
This demo was developed by IntelliSemantic for the FP7 research project TOPAS (Tool Platform for Patent Analysis and Summarization), which is summarized in the following.
The EU co-funded research project TOPAS (Tool Platform for Patent Analysis and Summarization) studied, prototyped and tested some second generation semantic technologies for English, German and French.
TOPAS was a 24-month FP7 Capacities project, under grant agreement number FP7-SME-2011 286639, from October 2011 to September 2013.
TOPAS research project
The 5 TOPAS participants were Bruegman Software, IALE, IntelliSemantic, University of Stuttgart and University Pompeu Fabra.
The universities transferred all rights to the technologies.
The SMEs have full ownership of the TOPAS technologies and are mutually independent in their exploitation.
TOPAS had qualified advisors to provide feedback on the application side; among them we can mention EPO, Fraunhofer and some companies and consultants.
TOPAS participants
TOPAS prototyped and tested solutions for:
Qualified entities extraction.
Entities relationship identification.
Patent segmentation.
Patent summarization (not detailed here).
In English, German and French.
An overview of the project results was recently published in WPI magazine, March 2015, in the paper "Towards content oriented patent document processing: intelligent patent analysis and summarization".
TOPAS results
Patent description first level
This screenshot exemplifies the first level patent segmentation used to analyze a patent set, e.g. to focus the analysis of results on specific sections, in this case the background art.
Large grain patent description segmentation is performant and efficient as well, and it has to be promoted.
Measurement extraction
This screenshot exemplifies the measurement extraction used to analyze a single patent, i.e. to retrieve the patent sections citing measurements and to extract the meanings of these measurements.
IntelliSemantic has further developed some of the TOPAS technologies and is ready to exploit them:
Integrated in new MyIntelliPatent releases, e.g. to extend patent set analyses and single patent analysis.
As technology engines to be integrated into the customer platform, to extend it with features like patent segmentation and qualified entities extraction:
this last solution is more suitable for advanced users, such as patent offices and big companies.
IntelliSemantic TOPAS exploitation
For more information
Visit us at stand 4 for more details about
MyIntelliPatent.
Other semantic technologies.
And/or:
Contact IntelliSemantic
e-mail [email protected]
tel. +39 011 9550 380
for a Web Conference presentation.
Status: ok
Comment.
This slide motivates the audience to visit the IntelliSemantic stand, since the solution is more feature rich than presented here. It does not provide a German telephone number, since the German number presently costs too much.
Licence
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported Licence.
To view a copy of this licence visit:
http://creativecommons.org/licenses/by-nc-sa/3.0/