Dr. Kai Simon Large-Scale Patent Classification at the European Patent Office

II-SDV 2015, 20 - 21 April, in Nice

ABOUT AVERBIS

Founded: 2007

Location: Freiburg im Breisgau

Team: Domain & IT-Experts

Focus: Leverage structured & unstructured information

Current Sectors: Pharma, Health, Automotive, Publishers & Libraries

SELECTED CUSTOMERS

PORTFOLIO

Products

Information Discovery Platform

Text Mining

Search and Analysis

Terminology Management

PATENT CLASSIFICATION AT EPO

Tender No. 1585

1) Pre-Classification of unpublished patents into departments

2) Re-Classification of published patents if the category system changes

Our Motivation:

• Great Classification Use-Case

– Big Data (80 million patents available)

– Large-scale category system (>250,000 CPC codes)

– Tough constraints on classification quality and response time

• Text Mining Success Story

http://www.epo.org/about-us/annual-reports-statistics/annual-report/2014.html

OLD CLASSIFICATION PROCESS

[Diagram: PATENTS → CLASSIFICATION → DEPARTMENTS]

CLASSIFICATION COMPLEXITY

~250,000 CPC Codes

~1,500 Ranges

250 Departments

CLASSIFICATION PROCESS

[Diagram: PATENTS → CLASSIFICATION → DEPARTMENTS]

NEW CLASSIFICATION PROCESS

[Diagram: PATENTS → CLASSIFICATION → DEPARTMENTS]

STATUS & OUTLOOK

• Quality Evaluation [passed]

• Going Live [April]

• Continuous Optimization

COME VISIT OUR BOOTH

Products

Information Discovery Platform

Text Mining

Search and Analysis

Terminology Management

For further questions, please contact:

Noël Lochtenbergh & Dr. Kai Simon

+49 (0)761 203 97690

[email protected]

NUMBER OF STAFF

Status: December 2008

SOME FACTS

• About 650k training documents from 2005 to 2013

• Supervised learning: lightweight, fast linear support vector machine

• Timings (16 cores, 128 GB RAM)

– Feature Extraction: ~14 minutes

– Training of Classifiers: ~10 minutes

– Classification: < 2 seconds
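
The pipeline named above (feature extraction, then training linear support vector machine classifiers, then classification) can be illustrated with a minimal sketch. This is not the Averbis implementation. The bag-of-words features, the Pegasos-style stochastic gradient training, the one-vs-rest setup, the toy documents and the department names "engines" and "pharma" are all assumptions made for illustration only.

```python
import math
import random
from collections import Counter


def extract_features(text):
    """Bag-of-words term frequencies, L2-normalised (assumed feature set)."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {token: c / norm for token, c in counts.items()}


def score(weights, features):
    """Linear decision value: sparse dot product of weights and features."""
    return sum(weights.get(token, 0.0) * value for token, value in features.items())


def train_binary_svm(samples, targets, epochs=20, lam=0.01, seed=0):
    """Pegasos-style SGD on the L2-regularised hinge loss; targets are +1/-1."""
    rng = random.Random(seed)
    weights = {}
    order = list(range(len(samples)))
    step = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            step += 1
            eta = 1.0 / (lam * step)  # decaying learning rate
            violated = targets[i] * score(weights, samples[i]) < 1.0
            shrink = 1.0 - 1.0 / step  # equals 1 - eta * lam (regularisation)
            for token in weights:
                weights[token] *= shrink
            if violated:  # hinge loss is active: move towards the sample
                for token, value in samples[i].items():
                    weights[token] = weights.get(token, 0.0) + eta * targets[i] * value
    return weights


def train_one_vs_rest(texts, labels):
    """One binary linear SVM per label (here: per hypothetical department)."""
    samples = [extract_features(t) for t in texts]
    return {
        label: train_binary_svm(samples, [1.0 if l == label else -1.0 for l in labels])
        for label in sorted(set(labels))
    }


def classify(models, text):
    """Return the label whose classifier gives the highest decision value."""
    features = extract_features(text)
    return max(models, key=lambda label: score(models[label], features))


# Hypothetical toy training data; real input would be patent texts.
TEXTS = [
    "combustion engine piston cylinder",
    "turbocharger engine exhaust valve",
    "tablet compound dosage pharmaceutical",
    "antibody compound therapeutic dosage",
]
LABELS = ["engines", "engines", "pharma", "pharma"]

models = train_one_vs_rest(TEXTS, LABELS)
predicted = classify(models, "engine piston with exhaust valve")
print(predicted)
```

In a one-vs-rest setup like this, classification cost grows linearly with the number of target classes, since each class contributes one sparse dot product. That is one plausible reason a linear model suits a tight response-time budget, though the slides do not state how the production system is structured.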