Inventor Mobility Index Thorsten Doherr Zentrum für Europäische Wirtschaftsforschung Center of...
- Slide 1
- Inventor Mobility Index. Thorsten Doherr, Zentrum für Europäische
Wirtschaftsforschung, Center of Economic Research, Mannheim,
Germany
- Slide 2
- Mission
  - Problem: Two inventors with the same name are not necessarily the
    same person. Defining an inventor only by name results in too much
    false mobility, especially for inventors with common names.
    Restricting the definition too much (e.g. name and home address)
    will cancel any mobility.
  - Mission: You have to decide whether two patents from inventors
    with the same name are actually from the same person or from
    different persons who share the same name.
  - Tools: The complete patent data.
- Slide 3
- Plausibility Rules
  Two inventors with the same name are the same person
  - if they are inventing for the same applicant
  - if they have the same home address
  - if they are working with the same co-inventors
  - if one is citing the other
  - if they have patents in the same area of technology (ipc)
  Inventor: A single inventor entry in a patent document.
  Person: All inventors with a specific name that are linked by at
  least one plausibility rule.
- Slide 4
- Harmonization of Applicants: SearchEngine
  The SearchEngine is an in-house developed software package
  specialized in company address matching. It implements the
  following steps:
  - Normalizing the search fields (company name, address fields) by
    transforming them to uppercase, replacing special letters with
    their common (phonetic) representation (e.g. Ü → UE, ß → SS),
    compressing abbreviations (e.g. S.P.A. → SPA) and replacing
    special characters with blanks.
  - Creating a dictionary containing all the words of the search
    fields along with their occurrence. To preserve the context,
    every search field has its own chapter. The occurrence is the
    basis for the heuristic search algorithm. There are also
    supporting tables that link the dictionary entries back to the
    company table.
  - The search algorithm separates a search term into words. Each
    word is associated with the occurrence counter of the appropriate
    dictionary entry. The occurrence reflects the identification
    potential of the word: a word with a low occurrence has a high
    identity, because the resulting list of potential hits is small.
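The normalization and dictionary steps could be sketched roughly as follows. This is an illustration in Python, not the SearchEngine's actual code (the deck lists Visual FoxPro as the development environment). The replacement table, the function names `normalize` and `build_dictionary`, and the choice to count each word once per company record are all assumptions.

```python
import re
from collections import Counter

# Assumed replacement table; the slide only gives UE and SS as examples.
SPECIAL_LETTERS = {"Ä": "AE", "Ö": "OE", "Ü": "UE", "ß": "SS"}


def normalize(field: str) -> str:
    """Uppercase, replace special letters, compress abbreviations, blank the rest."""
    text = field.upper()  # note: Python already turns "ß" into "SS" here
    for letter, replacement in SPECIAL_LETTERS.items():
        text = text.replace(letter, replacement)
    # Compress dotted abbreviations such as S.P.A. into SPA.
    text = re.sub(r"\b(?:[A-Z]\.){2,}", lambda m: m.group(0).replace(".", ""), text)
    # Replace every remaining special character with a blank.
    text = re.sub(r"[^A-Z0-9 ]", " ", text)
    return " ".join(text.split())


def build_dictionary(companies, fields):
    """Return (dictionary, postings): one chapter per search field.

    dictionary[field][word] is the occurrence of the word in that field;
    postings[field][word] links the word back to the company rows.
    """
    dictionary = {f: Counter() for f in fields}
    postings = {f: {} for f in fields}
    for company_id, company in enumerate(companies):
        for f in fields:
            for word in set(normalize(company[f]).split()):
                dictionary[f][word] += 1
                postings[f].setdefault(word, set()).add(company_id)
    return dictionary, postings


companies = [
    {"name": "Lear Corporation ITALIA S.p.A."},
    {"name": "Müller & Söhne S.p.A."},
]
dictionary, postings = build_dictionary(companies, ["name"])
```

Keeping one chapter per field means that a word such as a city name is counted separately when it appears in a company name and when it appears in an address field.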
- Slide 5
- Harmonization of Applicants: Example of the SearchEngine Algorithm
  DICTIONARY, Chapter: APPLICANT_NAME
  ENTRY         OCCURS   IDENTITY
  CORPORATION       16   1/16   = 0.062500
  ITALIA           491   1/491  = 0.002037
  LEAR               4   1/4    = 0.250000
  SPA             6119   1/6119 = 0.000163
  Searching for: Lear Corporation ITALIA S.p.A.
  WORD          IDENTITY   SHARE
  LEAR          0.250000   79.441%
  CORPORATION   0.062500   19.860%
  ITALIA        0.002037    0.647%
  SPA           0.000163    0.052%
  SUM           0.314700   100%
  Result:
  NAME                             IDENTITY
  LEAR CORPORATION ITALIA S.p.A.   100.000%
  Lear Corporation Italia S.r.l.    99.947%
  LEAR ITALIA SEATING S.p.A.        80.139%
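The arithmetic of this example can be retraced with a short sketch. It assumes that each search word is weighted by 1/occurrence, that the weights are scaled to sum to 100%, and that a candidate's identity is the summed share of the search words it contains. The function names are hypothetical; only the occurrence counts come from the slide.

```python
# Occurrence counts taken from the dictionary chapter APPLICANT_NAME on the slide.
OCCURS = {"CORPORATION": 16, "ITALIA": 491, "LEAR": 4, "SPA": 6119}


def word_shares(search_words, occurs):
    """Weight each search word by 1/occurrence and scale the weights to sum to 1."""
    raw = {w: 1.0 / occurs[w] for w in search_words if w in occurs}
    total = sum(raw.values())
    return {w: value / total for w, value in raw.items()}


def identity(search_words, candidate_words, occurs):
    """Sum the shares of all search words that also appear in the candidate."""
    shares = word_shares(search_words, occurs)
    return sum(share for w, share in shares.items() if w in candidate_words)


search = ["LEAR", "CORPORATION", "ITALIA", "SPA"]
candidates = {
    "LEAR CORPORATION ITALIA S.p.A.": {"LEAR", "CORPORATION", "ITALIA", "SPA"},
    "Lear Corporation Italia S.r.l.": {"LEAR", "CORPORATION", "ITALIA", "SRL"},
    "LEAR ITALIA SEATING S.p.A.": {"LEAR", "ITALIA", "SEATING", "SPA"},
}
for name, words in candidates.items():
    print(f"{name:32s} {100 * identity(search, words, OCCURS):8.3f}%")
```

Under these assumptions the three identities agree with the slide to within about 0.001 percentage points; the small remainder is presumably due to rounding in the original computation.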
- Slide 6
- Finalization
  - A cutoff limit for the identity is applied to filter all results
    (e.g. 90%).
  - The resulting list of matching pairs is not symmetric: A can be
    linked to B, but B is not required to be linked to A.
  - Linked pairs create a network.
  - Network analysis: if A is linked to B and B is linked to C, the
    analysis identifies the group A, B, C.
  - The network analysis is re-iterated for groups that are too
    large, with an increased cutoff limit for their members.
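One way to picture the network analysis and its re-iteration is the sketch below. It treats every link at or above the cutoff as undirected and groups nodes with a union-find structure. The parameters `max_size` and `step` are assumptions: the slide does not say what counts as a too large group or by how much the cutoff is raised.

```python
def groups_from_links(links, cutoff):
    """Group nodes connected by links whose identity reaches the cutoff.

    links: iterable of (a, b, identity) tuples; direction is ignored.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b, ident in links:
        root_a, root_b = find(a), find(b)  # registers both nodes
        if ident >= cutoff:
            parent[root_a] = root_b

    groups = {}
    for node in parent:
        groups.setdefault(find(node), set()).add(node)
    return list(groups.values())


def analyse(links, cutoff=0.90, max_size=3, step=0.02):
    """Re-run the grouping with a higher cutoff for groups above max_size."""
    links = list(links)
    result = []
    for group in groups_from_links(links, cutoff):
        if len(group) > max_size and cutoff + step <= 1.0:
            inner = [(a, b, i) for a, b, i in links if a in group and b in group]
            result.extend(analyse(inner, cutoff + step, max_size, step))
        else:
            result.append(group)
    return result


links = [("A", "B", 0.95), ("B", "C", 0.91)]
print(analyse(links, cutoff=0.90, max_size=3))  # one group: A, B, C
print(analyse(links, cutoff=0.90, max_size=2))  # split into {A, B} and {C}
```

Raising the cutoff only inside an oversized group leaves the small, already plausible groups untouched.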
- Slide 7
- Harmonization of Inventor Names
  - The SearchEngine is of limited use here because it is most
    efficient with search terms consisting of multiple words. The
    main problems are typing errors and misspellings.
  - Creating phonetic representations of the name using the Metaphone
    algorithm by Lawrence Philips, 1990.
    - Phonetic algorithms map similar sounding words (names) to the
      same representation, which can be indexed: direct database
      access.
    - Originally the results they delivered were manually validated
      because of their strong tendency for false positives: automated
      matching requires an automated validation process.
  - Automated comparison of the retrieved names with the searched
    name.
    - The function is based on the least relative character position
      deltas and requires two words as parameters: it cannot be used
      for index based direct access and needs phonetic indexing to
      quickly generate a list of potential candidates.
    - Tolerance for typing errors increases with the length of the
      words: longer words are more prone to typing errors.
- Slide 8
- Harmonization of Inventor Names: Example for the Metaphone Search
  Search key: MR BRTN
  FIRST NAME   LAST NAME
  MAURO        BARATONI
  MARIO        BERRETTONI
  MARIO        BERTINI
  MARIO        BERTON
  MAURO        BERTONI
  MAURO        BORDIN
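How a phonetic key turns this lookup into a direct index access can be sketched as below. The key function here is a deliberately crude stand-in (drop vowels, collapse doubled letters, treat D as T) and is not the Metaphone algorithm; it merely happens to give MR and BRTN for the names on this slide. All function names are hypothetical.

```python
from collections import defaultdict

VOWELS = set("AEIOU")


def simple_key(word):
    """Crude stand-in for a phonetic code. NOT Metaphone, illustration only."""
    key = []
    previous = ""
    for ch in word.upper():
        if ch == previous:          # collapse doubled letters (RR, TT)
            continue
        previous = ch
        if not ch.isalpha() or ch in VOWELS:
            continue                # keep consonants only
        key.append("T" if ch == "D" else ch)  # treat D like T
    return "".join(key)


def build_index(names):
    """Map (first name key, last name key) to the names sharing that key."""
    index = defaultdict(list)
    for first, last in names:
        index[(simple_key(first), simple_key(last))].append((first, last))
    return index


names = [
    ("MAURO", "BARATONI"), ("MARIO", "BERRETTONI"), ("MARIO", "BERTINI"),
    ("MARIO", "BERTON"), ("MAURO", "BERTONI"), ("MAURO", "BORDIN"),
    ("THORSTEN", "DOHERR"),
]
index = build_index(names)

# Candidate retrieval is a single dictionary lookup on the phonetic key.
candidates = index[(simple_key("MARIO"), simple_key("BERTONI"))]
print(candidates)
```

The candidate list still contains clearly different names, which is why a second, word-by-word comparison is needed to validate each hit.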
- Slide 9
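- Harmonization of Inventor Names: Example for the Least Relative
Character Position Deltas
  Compared words: CZARNITZKI and CHARNIZKI
  [Figure: both words laid out on a relative position scale from 0 to
  1.0, with ticks at 0, 0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875
  and 1.0; the terms shown add up to 1.875]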
- Slide 10
- Plausibility Rules
  Two inventors with the same name are the same person
  - if they are inventing for the same applicant.
  - if they have the same home address.
  - if they are working with the same co-inventors.
  - if one is citing the other.
  - if they have patents in the same area of technology (ipc).
  Inventor: A single inventor entry in a patent document.
  Person: All inventors with a specific name that are linked by at
  least one plausibility rule.
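The five rules can be read as pairwise tests on inventor entries. The sketch below is one possible encoding, assuming each entry carries a harmonized name, applicant and address plus sets of co-inventors, cited patents and IPC classes. The record layout, the field names and the sample data are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class InventorEntry:
    """A single inventor entry in a patent document (assumed layout)."""
    patent: int
    name: str
    applicant: str
    address: str
    co_inventors: frozenset = frozenset()
    cites: frozenset = frozenset()   # patent numbers cited by this patent
    ipc: frozenset = frozenset()     # technology classes of this patent


def matching_rules(a, b):
    """Return the plausibility rules that link two entries with the same name."""
    if a.name != b.name:
        return []
    rules = []
    if a.applicant == b.applicant:
        rules.append("applicant")
    if a.address == b.address:
        rules.append("address")
    if a.co_inventors & b.co_inventors:
        rules.append("co-inventor")
    if b.patent in a.cites or a.patent in b.cites:
        rules.append("citation")
    if a.ipc & b.ipc:
        rules.append("ipc")
    return rules


def link_pairs(entries):
    """List (patent, patent, rules) for every pair linked by at least one rule."""
    pairs = []
    for a, b in combinations(entries, 2):
        rules = matching_rules(a, b)
        if rules:
            pairs.append((a.patent, b.patent, rules))
    return pairs


entries = [
    InventorEntry(1, "MARIO ROSSI", "ACME SPA", "ROMA", ipc=frozenset({"B60N"})),
    InventorEntry(2, "MARIO ROSSI", "ACME SPA", "MILANO", ipc=frozenset({"H01L"})),
    InventorEntry(3, "MARIO ROSSI", "OTHER SRL", "MILANO", cites=frozenset({1})),
    InventorEntry(4, "MARIO ROSSI", "THIRD SPA", "NAPOLI", ipc=frozenset({"A61K"})),
]
print(link_pairs(entries))
```

In this made-up example entries 1, 2 and 3 are linked into one person although entry 3 has a different applicant, while entry 4 remains a separate person with the same name.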
- Slide 11
- All Patents of an Inventor Name
  [Figure: the patents of one inventor name shown as nodes 1 to 22]
- Slide 12
- The Same Applicant Rule
  [Figure: patent nodes 1 to 22]
- Slide 13
- The Same Home Address Rule
  [Figure: patent nodes 1 to 22]
- Slide 14
- The Co-Inventor Rule
  [Figure: patent nodes 1 to 22]
- Slide 15
- The Citation Rule
  [Figure: patent nodes 1 to 22]
- Slide 16
- The IPC Rule
  [Figure: patent nodes 1 to 22]
- Slide 17
- Italian Inventor Mobility Index
  Quantities shown: patents from Italian applicants and inventors;
  different harmonized inventor names; nodes after applying the same
  applicant rule; nodes after applying the same home address rule;
  nodes after applying the co-inventor rule; nodes after applying the
  citation rule; nodes after applying the ipc rule.
  Values shown: 123356, 49101, 60268, 53316, 53572, 52504, 50276.
  Main Database: Espace Bulletin (March 2010)
  Citations: EPO Patstat (September 2010), OECD
  Development: Microsoft Visual FoxPro 9.0
- Slide 18
- Traversal of a Network Table
  FROM   TO
  1      2
  1      5
  2      1
  2      5
  2      7
  5      1
  5      2
  6      7
  7      2
  7      6
  GROUP  MEMBER
  1      1
  1      2
  1      5
  1      7
  1      6
  [Figure: network with the nodes 1 to 8]
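The step from the FROM/TO table to the GROUP/MEMBER table can be illustrated with a breadth-first traversal. This is a sketch in Python rather than the original implementation; naming each group after its smallest node is an assumption that happens to match the group number 1 on the slide.

```python
from collections import defaultdict, deque

# FROM/TO pairs as read from the slide.
NETWORK = [(1, 2), (1, 5), (2, 1), (2, 5), (2, 7),
           (5, 1), (5, 2), (6, 7), (7, 2), (7, 6)]


def traverse(network):
    """Return (group, member) rows; a group is named after its smallest node."""
    neighbours = defaultdict(list)
    for src, dst in network:
        neighbours[src].append(dst)
        neighbours[dst].append(src)  # follow links in both directions

    group_of = {}
    rows = []
    for start in sorted(neighbours):
        if start in group_of:
            continue                 # already reached from an earlier start
        group_of[start] = start
        queue = deque([start])
        while queue:
            node = queue.popleft()
            rows.append((start, node))
            for nxt in neighbours[node]:
                if nxt not in group_of:
                    group_of[nxt] = start
                    queue.append(nxt)
    return rows


for group, member in traverse(NETWORK):
    print(group, member)
```

Starting at node 1, the traversal reaches 2 and 5 directly, then 7 through 2, and finally 6 through 7, which is the member order 1, 2, 5, 7, 6 listed in the GROUP/MEMBER table.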