Inventor Mobility Index
Thorsten Doherr
Zentrum für Europäische Wirtschaftsforschung (Centre for European Economic Research), Mannheim, Germany
Slide 2
Mission

Problem:
- Two inventors with the same name are not necessarily the same person.
- Defining an inventor only by name results in too much false mobility, especially for inventors with common names.
- Restricting the definition too much (e.g. name and home address) will cancel any mobility.

Mission:
- Decide whether two patents from inventors with the same name are actually from the same person or from different persons who share the same name.

Tools:
- The complete patent data
Slide 3
Plausibility Rules

Two inventors with the same name are the same person
- if they are inventing for the same applicant,
- if they have the same home address,
- if they are working with the same co-inventors,
- if one is citing the other, or
- if they have patents in the same area of technology (IPC).

Inventor: A single inventor entry in a patent document.
Person: All inventors with a specific name that are linked by at least one plausibility rule.
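The plausibility rules above can be read as a pairwise test on inventor entries. The following is a minimal sketch under stated assumptions: the record layout `InventorEntry` and the function name `plausibly_same_person` are hypothetical and do not come from the presentation, and names are assumed to be harmonized already.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InventorEntry:
    """One inventor entry in one patent document (hypothetical record layout)."""
    patent_id: str
    name: str                              # harmonized inventor name
    applicant: str                         # harmonized applicant
    home_address: str
    co_inventors: frozenset = frozenset()  # names of the other inventors
    cited_patents: frozenset = frozenset() # patent ids cited by this patent
    ipc_classes: frozenset = frozenset()   # areas of technology


def plausibly_same_person(a: InventorEntry, b: InventorEntry) -> bool:
    """True if both entries share a name and at least one rule links them."""
    if a.name != b.name:
        return False
    return (
        a.applicant == b.applicant                  # same applicant
        or a.home_address == b.home_address         # same home address
        or bool(a.co_inventors & b.co_inventors)    # shared co-inventor
        or a.patent_id in b.cited_patents           # one cites the other
        or b.patent_id in a.cited_patents
        or bool(a.ipc_classes & b.ipc_classes)      # same area of technology
    )
```

Entries that pass this test form the links of a network; a person is then a connected group of entries with the same name.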
Slide 4
Harmonization of Applicants: SearchEngine

The SearchEngine is an in-house developed software package specialized in company address matching. It implements the following steps:
- Normalizing the search fields (company name, address fields) by transforming them to uppercase, replacing special letters with their common (phonetic) representation (e.g. Ü → UE, ß → SS), compressing abbreviations (e.g. S.P.A. → SPA) and replacing special characters with blanks.
- Creating a dictionary containing all the words of the search fields along with their occurrence. To preserve the context, every search field has its own chapter. The occurrence is the basis for the heuristic search algorithm. Supporting tables link the dictionary entries back to the company table.
- The search algorithm separates a search term into words. Each word is associated with the occurrence counter of the appropriate dictionary entry. The occurrence reflects the identification potential of the word: a word with a low occurrence has a high identity, because the resulting list of potential hits is small.
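The normalization and dictionary steps described above can be sketched as follows. This is an illustration, not the SearchEngine itself: the function names, the replacement table, and the choice to count each word once per record are assumptions.

```python
import re
from collections import Counter

# Assumed replacement table; the slide only names UE and SS as targets.
SPECIAL_LETTERS = {"Ä": "AE", "Ö": "OE", "Ü": "UE"}


def normalize(text: str) -> str:
    """Uppercase, transliterate, compress abbreviations, blank out the rest."""
    text = text.upper()  # str.upper() already turns "ß" into "SS"
    for letter, replacement in SPECIAL_LETTERS.items():
        text = text.replace(letter, replacement)
    # Compress dotted abbreviations such as "S.P.A." to "SPA".
    text = re.sub(r"\b(?:[A-Z]\.){2,}", lambda m: m.group().replace(".", ""), text)
    # Replace every remaining special character with a blank.
    text = re.sub(r"[^A-Z0-9]+", " ", text)
    return text.strip()


def build_dictionary(records, fields):
    """Per search field ("chapter"), count the records containing each word."""
    dictionary = {field: Counter() for field in fields}
    for record in records:
        for field in fields:
            words = set(normalize(record.get(field, "")).split())
            dictionary[field].update(words)
    return dictionary


companies = [
    {"name": "Lear Corporation ITALIA S.p.A.", "city": "Torino"},
    {"name": "LEAR ITALIA SEATING S.p.A.", "city": "Torino"},
]
chapters = build_dictionary(companies, ["name", "city"])
print(normalize("Lear Corporation ITALIA S.p.A."))
print(chapters["name"].most_common(3))
```

Keeping one counter per field mirrors the "chapter" idea: a word such as ITALIA can be common in company names and rare in street names, so its identification potential depends on the field.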
Slide 5
Harmonization of Applicants: Example of the SearchEngine Algorithm

DICTIONARY - Chapter: APPLICANT_NAME

ENTRY         OCCURS   IDENTITY
CORPORATION       16   1/16   = 0.062500
ITALIA           491   1/491  = 0.002037
LEAR               4   1/4    = 0.250000
SPA             6119   1/6119 = 0.000163

Searching for

Lear       Corporation   ITALIA     S.p.A.
LEAR       CORPORATION   ITALIA     SPA        SUM
0.250000   0.062500      0.002037   0.000163   0.314700
79.441%    19.860%       0.647%     0.052%     100%

Result

NAME                             IDENTITY
LEAR CORPORATION ITALIA S.p.A.   100.000%
Lear Corporation Italia S.r.l.    99.947%
LEAR ITALIA SEATING S.p.A.        80.139%
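The arithmetic of this example can be reproduced in a few lines. The sketch below assumes that a candidate's identity is the sum of the shares of the search words it contains, which is consistent with the figures on the slide; the function names are hypothetical, and the occurrence counts are the ones from the dictionary chapter above.

```python
def identity_weights(search_words, occurrences):
    """Give each search word its share of the summed identities (1/occurrence)."""
    raw = {w: 1.0 / occurrences[w] for w in search_words if occurrences.get(w)}
    total = sum(raw.values())
    return {w: value / total for w, value in raw.items()}


def identity(candidate_words, weights):
    """Add up the shares of the search words found in the candidate."""
    return sum(share for word, share in weights.items() if word in candidate_words)


# Occurrence counts from the APPLICANT_NAME chapter of the example.
occurrences = {"CORPORATION": 16, "ITALIA": 491, "LEAR": 4, "SPA": 6119}
weights = identity_weights(["LEAR", "CORPORATION", "ITALIA", "SPA"], occurrences)

# Candidates are given in their normalized form.
for name in ("LEAR CORPORATION ITALIA SPA",
             "LEAR CORPORATION ITALIA SRL",
             "LEAR ITALIA SEATING SPA"):
    print(f"{name:30s} {identity(set(name.split()), weights):.3%}")
```

The rare word LEAR carries almost 80% of the weight, so a candidate missing the very common SPA loses next to nothing, while one missing CORPORATION drops to about 80%. The printed values differ from the slide at most in the last printed digit.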
Slide 6
Finalization

- A cutoff limit for the identity is applied to filter all results (e.g. 90%).
- The resulting list of matching pairs is not symmetric: A can be linked to B, but B is not required to be linked to A.
- Linked pairs create a network.
- Network analysis: if A is linked to B and B is linked to C, the analysis identifies the group A, B, C.
- Groups that are too large are re-analyzed with an increased cutoff limit for their members.
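The finalization steps above amount to finding connected groups in the network of linked pairs. The sketch below uses a union-find structure, which is one common way to do this and not necessarily the technique used in the original software; `max_size` and `step` are hypothetical parameters, since the presentation does not say when a group counts as too large or by how much the cutoff is raised.

```python
def group_links(links, cutoff=0.90):
    """Group items connected by links whose identity reaches the cutoff.

    links: iterable of (a, b, identity); direction is ignored, so a one-way
    link from A to B is enough to put A and B into the same group.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, ident in links:
        root_a, root_b = find(a), find(b)  # registers both items
        if ident >= cutoff and root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for item in parent:
        groups.setdefault(find(item), set()).add(item)
    return list(groups.values())


def group_with_reiteration(links, cutoff=0.90, max_size=50, step=0.02):
    """Re-run the analysis with a higher cutoff for groups above max_size."""
    links = list(links)
    result = []
    for group in group_links(links, cutoff):
        if len(group) > max_size and cutoff + step <= 1.0:
            inner = [l for l in links if l[0] in group and l[1] in group]
            result.extend(group_with_reiteration(inner, cutoff + step, max_size, step))
        else:
            result.append(group)
    return result


pairs = [("A", "B", 0.99), ("B", "C", 0.91), ("C", "D", 0.50), ("E", "F", 0.99)]
print(group_links(pairs))
print(group_with_reiteration(pairs, max_size=2, step=0.05))
```

With the default cutoff, A, B and C end up in one group although A and C are never linked directly. Limiting the group size to two raises the cutoff for that group and splits off C, whose only link is comparatively weak.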
Slide 7
Harmonization of Inventor Names

The SearchEngine is of limited use because it is most efficient with search terms consisting of multiple words. The main problems are typing errors and misspellings.

Creating phonetic representations of the name using the Metaphone algorithm by Lawrence Philips, 1990
- Phonetic algorithms create a single representation for similar-sounding words (names), which can be indexed for direct database access.
- Originally, their results were validated manually because of their strong tendency for false positives. Automated matching requires an automated validation process.

Automated comparison of the retrieved names with the searched name
- The function is based on the least relative character position deltas and requires two words as parameters, so it cannot be used for index-based direct access.
- It needs phonetic indexing to quickly generate a list of potential candidates.
- Tolerance for typing errors increases with the length of the words, because longer words are more prone to typing errors.
Slide 8
Harmonization of Inventor Names: Example for the Metaphone Search

FIRST NAME   LAST NAME
MR           BRTN

MAURO        BARATONI
MARIO        BERRETTONI
MARIO        BERTINI
MARIO        BERTON
MAURO        BERTONI
MAURO        BORDIN
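How a phonetic key turns name lookup into direct index access can be sketched as follows. Note that `crude_key` is a deliberately simplified stand-in and not the Metaphone algorithm: it only drops vowels, collapses doubled letters and merges a few similar-sounding consonants, whereas Metaphone has many more rules. All names in the code are hypothetical.

```python
from collections import defaultdict

VOWELS = set("AEIOUY")
# A few merges that Metaphone also performs; the real rule set is much larger.
SOUND_ALIKE = str.maketrans({"D": "T", "Z": "S", "V": "F"})


def crude_key(word: str) -> str:
    """Simplified phonetic key (stand-in for Metaphone, not equivalent to it)."""
    key = []
    previous = ""
    for ch in word.upper().translate(SOUND_ALIKE):
        # Keep consonants, skipping the second letter of a doubled pair.
        if ch.isalpha() and ch not in VOWELS and ch != previous:
            key.append(ch)
        previous = ch
    return "".join(key)


def build_index(names):
    """Map the combined key of first and last name to the matching names."""
    index = defaultdict(list)
    for first, last in names:
        index[crude_key(first) + crude_key(last)].append((first, last))
    return index


names = [
    ("MAURO", "BARATONI"), ("MARIO", "BERRETTONI"), ("MARIO", "BERTINI"),
    ("MARIO", "BERTON"), ("MAURO", "BERTONI"), ("MAURO", "BORDIN"),
    ("THORSTEN", "DOHERR"),
]
index = build_index(names)
candidates = index[crude_key("MARIO") + crude_key("BERTONI")]
print(candidates)
```

For the names of this example the stand-in produces the same keys as shown on the slide (MR and BRTN). The candidate list is intentionally broad, which is why a second, stricter comparison of each candidate with the searched name is needed.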
Slide 9
Harmonization of Inventor Names: Example for the Least Relative Character Position Deltas

[Figure: the names CZARNITZKI and CHARNIZKI aligned on a relative position scale from 0 to 1.0 in steps of 0.125; the position deltas sum to 1.875]
Slide 10
Plausibility Rules

Two inventors with the same name are the same person
- if they are inventing for the same applicant,
- if they have the same home address,
- if they are working with the same co-inventors,
- if one is citing the other, or
- if they have patents in the same area of technology (IPC).

Inventor: A single inventor entry in a patent document.
Person: All inventors with a specific name that are linked by at least one plausibility rule.
Slide 11
All Patents of an Inventor Name

[Figure: network of the patents of one inventor name, shown as nodes numbered 1 to 22]

Italian Inventor Mobility Index

Stages shown:
- patents from Italian applicants and inventors
- different harmonized inventor names
- nodes after applying the same applicant rule
- nodes after applying the co-inventor rule
- nodes after applying the citation rule
- nodes after applying the same home address rule
- nodes after applying the IPC rule

Counts shown: 123356, 49101, 60268, 53316, 53572, 52504, 50276

Data sources (main database and citations): Espace Bulletin (March 2010), EPO Patstat (September 2010), OECD
Development: Microsoft Visual FoxPro 9.0