© 2015 WissensWert Seminare Dr. Kirch-Verfuß WissensWert Tutorial Mind the language gap! Multilingual Searching in Practice

Mind the language gap! Multilingual Searching in Practice


WissensWert Tutorial

Mind the language gap!

Multilingual Searching in Practice


About WissensWert Consulting

We provide expert support in the field of patent information:

• Patent searching in the fields of chemistry, engineering and materials (metals, polymers, paper, textiles)

• Consultancy in the selection and implementation of patent information databases and software

• Training on all aspects of patent information in German and English, as open seminars and webinars as well as in-house. Subjects include worldwide free patent databases, basic searching, professional searching, patent classification, legal status, citations, results processing, and more.

www.wissenswert-seminare.de

Phone 02361-9040-273 Mail [email protected]


Multilingual information retrieval

What is it?

Making digitally stored information items available,

regardless of the language they are written in.


Warning to language freaks

A Prussian got lost in Munich. A group of Bavarian locals is waiting at a tram stop. The Prussian asks one of them in High German:

"Können Sie mir sagen, wo es hier zum Bahnhof geht?" ("Can you tell me the way to the station?") No reaction.

The Prussian now tries in English:

"Excuse me, can you tell me the way to the station?" Again, no reaction.

Now he tries in French, Spanish, Russian and Japanese. No reaction.

The Prussian moves on, quite frustrated.

Says one of the Bavarians:

"Hund sans scho, de Preissn. Fünf Spracha hot a kennt." ("Foxy they are, these Prussians. Five languages he knew.")

Says one of the others:

"Des scho – aber wos hats eam gnutzt?" ("True, but what did it help him?")

Source: www.uni-marburg.de/fb09/igs/mitarbeiter/wiese/linguistenwitze

Multilingual searching

• Nowadays, searching in multiple languages does not demand skills in a variety of languages, except English.

• But there is a demand for more general skills in state-of-the-art search technology.

As an example: Additive manufacturing

[Images; source: www.eos.info]

Additive manufacturing as an example

[Screenshots: journal and website]

Development of patent applications in the field of additive manufacturing

[Chart; source: Patbase]

Development of technical literature in the field of additive manufacturing

[Chart: number of publications per publication year, 2000 to 2015, vertical axis 0 to 450; source: INSPEC]

Additive manufacturing

Further terms:

• Generative manufacturing
• Sheet layer manufacturing
• Rapid Manufacturing
• Rapid prototyping
• 3D-printing
• Three-dimensional printing
• 3D-shaping
• Three-dimensional shaping
• Electron Beam Melting
• Laser Deposition
• Laser Melting
• Laser Sintering
• Powder Bed Fusion
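A term list like the one for additive manufacturing is typically combined into a single OR query. The following is a minimal sketch only: the quoting and the `OR` operator are assumptions about a generic search syntax, not the syntax of any particular patent database, and `build_or_query` is a hypothetical helper name.

```python
# Terms taken from the slide; the query syntax is a generic assumption.
TERMS = [
    "Additive manufacturing",
    "Generative manufacturing",
    "Sheet layer manufacturing",
    "Rapid Manufacturing",
    "Rapid prototyping",
    "3D-printing",
    "Three-dimensional printing",
    "3D-shaping",
    "Three-dimensional shaping",
    "Electron Beam Melting",
    "Laser Deposition",
    "Laser Melting",
    "Laser Sintering",
    "Powder Bed Fusion",
]


def build_or_query(terms):
    """Quote each term as a phrase and join the phrases with OR."""
    quoted = ['"{}"'.format(term) for term in terms]
    return " OR ".join(quoted)


query = build_or_query(TERMS)
print(query)
```

Each database has its own operators for phrases, truncation and proximity, so the generated string would have to be adapted before use.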

Agenda


• Usage of language – in general, in science and in patents

• Patent information searching with words

Using the English language

Using machine translation

Using Crosslingual searching

• Non patent literature searching with words

• The future of multilingual searching


10 world languages (in millions of people)

Language      Native speakers   Worldwide speakers   Internet users
English             375               1,500               851
Chinese             982               1,100               705
Spanish             330                 420               245
Arabic              206                 300               156
Portuguese          216                 235               132
Japanese            127                 128               115
Russian             165                 275               103
French               79                 370                92
German              105                 185                84
Korean               78                  78                 ?

Sources: www.weltsprachen.net, http://www.internetworldstats.com/stats7.htm

English – THE universal language on the internet

Source: http://blog.unbabel.com/2015/06/10/top-languages-of-the-internet/

What is special about the English language?

English

• is the universal language of the Internet,

• is the language of international communication,

• is the intermediate language for machine translation


What is special about the English language?

• English is often used by non-native speakers and writers.

• Many texts published on the internet are written in a simplified variant of English, contain errors and might even be hard to understand.

What is special about the German language?

• Compound constructions like Additivherstellungsvorrichtung (in English: additive manufacturing device)

These require particular handling for information searching:

Special indexing of German text, i.e. breaking the word into its constituent parts

Need for left truncation when searching

• Diacritics and special characters: ä, ö, ü, ß
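The handling of German compounds and diacritics described above can be sketched as follows. This is an illustrative toy, not how any particular database indexes German text: the three-word `LEXICON`, the list of linking elements and the function names are all assumptions made for the example.

```python
# Hypothetical mini-lexicon; a real indexer would use a large dictionary.
LEXICON = {"additiv", "herstellung", "vorrichtung"}
# Linking elements that may follow a constituent (assumed, incomplete list).
LINKERS = ("", "s", "es", "n", "en")
# Common transliteration of German special characters.
FOLD_TABLE = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue", "ß": "ss",
})


def split_compound(word, lexicon=LEXICON):
    """Split a compound into lexicon words, longest head first.

    Returns a list of constituents, or None if no complete split exists.
    """
    word = word.lower()
    if not word:
        return []
    for end in range(len(word), 0, -1):
        head = word[:end]
        for linker in LINKERS:
            if linker and not head.endswith(linker):
                continue
            stem = head[: len(head) - len(linker)]
            if stem in lexicon:
                rest = split_compound(word[end:], lexicon)
                if rest is not None:
                    return [stem] + rest
    return None


def fold_diacritics(text):
    """Replace ä, ö, ü, ß with their ASCII transliterations."""
    return text.translate(FOLD_TABLE)


def matches_left_truncation(word, ending):
    """Emulate a left-truncated search such as *vorrichtung."""
    return word.lower().endswith(ending.lower())


parts = split_compound("Additivherstellungsvorrichtung")
print(parts)
print(fold_diacritics("Wörter"))
print(matches_left_truncation("Additivherstellungsvorrichtung", "vorrichtung"))
```

Splitting at index time lets a search for "Vorrichtung" find the compound directly; without it, the searcher has to fall back on left truncation.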

What is special about the Finnish language?

"When Finns speak, everybody listens – it's just that nobody else understands."
Bill Farmer (Knight-Ridder newspapers)

• Special wording – an example:

cigarette lighter = savukkeensytyttimen

• Finnish has numerous grammatical cases, similar to the Hungarian language.

Concerning information searching:

Problems with machine translation

What is special about the Asian languages?

• No whitespace between words

• Whole sentences are written as continuous strings of characters.

Difficulties in indexing Asian text

Solution: Word segmenter software helps to find the most plausible splitting of a sentence into words.

Example (CN104582803):

从技术角度看,该魔方的缺点是要相当复杂的实施小的构建方体,这导致形状复杂并且成本高。

The disadvantage of this puzzle cube is rather complicated execution of the small building cubes, from the technical point of view, which makes the shape complicated and high cost.
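To illustrate what a word segmenter does, here is a minimal sketch of forward maximum matching, one of the simplest dictionary-based approaches. It is an assumption chosen for illustration; production segmenters usually rely on statistical models to find the most plausible split. The four-word `LEXICON` is hypothetical.

```python
# Hypothetical mini-lexicon of Chinese words.
LEXICON = {"技术", "角度", "魔方", "缺点"}
MAX_LEN = max(len(word) for word in LEXICON)


def segment(text, lexicon=LEXICON, max_len=MAX_LEN):
    """Greedy forward maximum matching.

    At each position take the longest lexicon word that fits;
    characters not covered by the lexicon become single-character tokens.
    """
    words = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


tokens = segment("从技术角度看")
print(tokens)
```

The phrase is the opening of the example sentence ("from the technical point of view"); the index can then store the individual words instead of one unbroken character string.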

Various types of languages, each with its own way of expression

• Scientific language

• Technical language

• Language of media publications

• Professional Jargon

• Patentese


Scientific language – Example

Larsen, Michael Byrnes; Thesis (Ph.D.) University

of Washington, 2015

Fundamental and Applied Investigations in

Solid-State Polymer Mechanochemistry


Which language is used in science?

[Chart: scientific articles in Scopus by language and time period]

Technical language – Example

Xu, Hongxiao et al.; Proceedings of the Fraunhofer

Direct Digital Manufacturing Conference 2014

Towards Improved Thermal Management of Laser

Beam Melting Processes


Language of media publications – Example

Forbes Tech, May 2, 2012

"Additive manufacturing technologies create a world of possibilities that can take an organization in an entirely new direction and help launch new businesses and business models," said Wohlers of Wohlers Associates.

Professional Jargon – Example

FORMRISE GmbH http://formrise.com/

Laser sintering is a worthwhile manufacturing

method not only for the product launch process,

but also when forecasts of the number of units to

produce are unclear.


Patentese – an example

A glass: half full? Half empty?

Open-ended cylinder, horizontally bisected by liquid H2O

Patentese – Example

EP2890466A1

METHOD FOR TRANSFORMING A THREE DIMENSIONAL DIGITAL MODEL IN A SPACE OBJECT MADE OF SHEET MATERIAL

A general problem – diversity of language

• One concept can be verbalised by a variety of linguistic expressions.

• Several concepts can be verbalised by the same linguistic expression.

Synonyms

A synonym is a word or phrase that means exactly or nearly

the same as another word or phrase in the same language.

• manufacturing
• production
• making
• preparing
• fabrication
• assembling
• composition

Synonyms – an example

Pigeon: the word in everyday use

Dove: the word found in poetical or symbolic context

Source: Fowler's Modern English Usage

Homonyms

A homonym is a word that is said or spelled the same way as other words but has different meanings.

Example: note

• tone (in music)
• memo
• remark
• bank note

„Homonymic“ phrasing

EP1925219A Channel wall element

The invention is about a tobacco channel in a machine for cigarette production.

[Images of other things the title might suggest: steel pilings, sewage tunnel]

„Homonymic“ phrasing

Additive manufacturing is

• either

a kind of 3D shaping

• or

production of additives


Agenda


• Usage of language – in general, in science and in patents

• Patent information searching with words

Using the English language

Using machine translation

Using Crosslingual searching

• Non patent literature searching with words

• The future of multilingual searching


English - the Language of patents?

• Patents are published in the official language or

one of the official languages of the publishing patent office.

• For most offices this is the national language.

• Some patent offices allow for English applications without

later translation, with correspondence in English and finally

with English patent publications, e.g. Sweden.


English - the Language of patents?

Regional patent offices have their own language regimes –

for example:

• EAPO (Russia, Azerbaijan, Armenia, Belarus, Kazakhstan ...)

100 % Russian

• EPO

70 % English, 23 % German, 7 % French

• WIPO

62 % English, 15 % Japanese, 11 % German, 4 % Chinese,

4 % French, < 1 % Arabic, < 1 % Korean, < 1 % Portuguese,

< 1 % Russian and < 1 % Spanish


Patent offices with most applications (WIPO 2013)

Office   Applications
CN          825,136
US          571,612
JP          328,436
KR          204,589
EP          147,987
DE           63,167
RU           44,914
IN           43,031
CA           34,741
BR           30,884
AU           29,717
UK           22,938
FR           16,886
MX           15,444

Publishing languages (estimated)

Language      Share
Chinese        35%
English        35%
Japanese       14%
Korean          9%
German          2%
Russian         2%
Portuguese      1%
French          1%
Spanish         1%

What is special with patent information?

• Patent families

• Patent abstracts

• Full text of patent documents

Patent family – definitions

A patent family in general

A collection of related patent applications covering the same

or similar technical content.

A simple patent family specifically

A set of patent documents protecting the (most likely) same

invention filed in several countries which are related to each

other by one common priority filing.


Patent families affecting search results

DEPATISnet search for "Additive manufacturing"

Search field          Documents
Title                    750
Abstract                 997
Title and Abstract      1252
Description               12
Claims                     2

Content of the DEPATISnet patent database:

• > 90 million patent documents

• with title and abstract

• Full text for DE documents only

Patent families affecting search results

DEPATISnet search for "Additive manufacturing"

Search field          Documents   Families
Title                    761         376
Abstract                1008         602
Title and Abstract      1266         711

The DEPATISnet result set is processed so that it has only one member per family.

As soon as the text of one family member contains the search terms, the document is included in the result set.
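The difference between counting documents and counting families can be sketched in a few lines. This is an illustrative model only, not DEPATISnet's implementation; the document identifiers, family identifiers and texts are invented for the example.

```python
# Invented sample data: five documents in three families.
DOCUMENTS = [
    {"id": "DOC-A", "family": "FAM-1", "text": "Additive manufacturing device"},
    {"id": "DOC-B", "family": "FAM-1", "text": "Vorrichtung zur additiven Fertigung"},
    {"id": "DOC-C", "family": "FAM-1", "text": "Device for additive manufacturing"},
    {"id": "DOC-D", "family": "FAM-2", "text": "Powder for additive manufacturing"},
    {"id": "DOC-E", "family": "FAM-3", "text": "Extruder und 3D-Druckkopf"},
]


def matching_documents(documents, phrase):
    """Return the documents whose text contains the phrase (case-insensitive)."""
    needle = phrase.lower()
    return [doc for doc in documents if needle in doc["text"].lower()]


def matching_families(documents, phrase):
    """A family is a hit as soon as one of its members matches."""
    return {doc["family"] for doc in matching_documents(documents, phrase)}


doc_hits = matching_documents(DOCUMENTS, "additive manufacturing")
family_hits = matching_families(DOCUMENTS, "additive manufacturing")
print(len(doc_hits), "documents in", len(family_hits), "families")
```

In this model FAM-1 is found although one of its members has only German text, while FAM-3, which has no English-language member at all, is missed by an English query.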

[Screenshot: patent family with its priority document; two family members have English text, one has no English text. Source: SIP Invention Manager]

A German utility model

DE 202015004336 U

Vorrichtung zum Auftragen zumindest eines Werkstoffes, Extruder, 3D-Druckkopf, 3D-Drucker, Werkzeugmaschine und Steuereinrichtung
(English: Device for applying at least one material, extruder, 3D print head, 3D printer, machine tool and control device)

[Screenshot: patent family list in DEPATISnet]

How often do we have to expect no family members?

[Chart: patent families by origin and number of offices, in %; source: WIPO 2014]

Better not to rely on English-language family members!

What is special with patent information?

• Patent families

• Patent abstracts

• Full text of patent documents


Patent abstracts

• Abstracts written by the applicant

• Abstracts written by a patent office

• Abstracts from patent abstracting services


Abstracts written by the applicant

Regulated by standards and rules

WIPO Standard ST.12

"The abstract should enable the reader thereof […] to ascertain quickly the character of the subject matter covered by the technical disclosure."

DPMA guidelines on patent abstracts

The abstract is for technical information purposes only. It should allow a quick overview of the content of the invention and be usable in documentation systems.

The aim of an abstract …

• is to support post search review of search results

• is to support a rough understanding of an

invention without reading the full document

• is not primarily to support the discovery of the

document by searching the abstract text.


Abstracts written by patent offices

Offices that generate an English-language abstract:

• WIPO – World Intellectual Property Organization

• SIPO – Chinese Patent Office

• KIPO – Korean Patent Office

• JPO – Japanese Patent Office

• …

Except for WIPO, the preparation of the English abstract takes some time (around three months).

Abstract in Bulgarian language

WO English abstract used all over the family


Original Bulgarian Patent abstract

BG111248 Изобретението се отнася до методът за преобразуване на

триизмерен дигитален модел в пространствен обект от листов материал и

може да намери приложение за изследване на триизмерни дигитални модели

и създаване на пространствени обекти от листов материал, които

представляват точно копие на съществуващ обект (музеен експонат,

скулптура и други) или свободно генериран такъв, както и за мащабиране на

модели на движими и недвижими културно-исторически паметници на

културата, архитектурни сгради и туристически обекти. Методът включва

поставяне на триизмерен дигитален обект (1) в геометрична форма (2),

създаване на формообразуващи елементи чрез разрязване на обект с

равнина, изрязването им и последващото им подреждане до изграждане на

пространствен обект, при което се образуват едновременно два

самостоятелни обекта - позитив и матрица, като и двата са изградени от

листов материал.


Google Translate

https://translate.google.com

Patent abstract – Google machine translation

BG111248, the Bulgarian abstract shown under "Original Bulgarian Patent abstract", machine translated into English:

The present invention relates to a method for converting a three-dimensional digital model of the spatial object of sheet material and can be applied to study the three-dimensional digital models and creating spatial objects from sheet material, which are replicas of an existing object (a museum exhibit, sculpture and other) or freely generated therefrom, and to scale models of movable and immovable cultural and historical monuments, architectural buildings and tourist sites. The method involves placing a three-dimensional digital object (1) in the geometric shape (2) creation of shaping elements by cutting the object plane, cutting them and their subsequent arrangement to build a spatial object forming simultaneously two separate sites - positive and a matrix, both of which are made of sheet material.

Source: Google Translate

WIPO‘s English patent abstract

WO 14005202 The invention relates to a method for transforming a three

dimensional digital model in a space object made of sheet material. The

method allows the shape forming elements of the sheet matrix to be

permanently or temporarily fixed so that it can be used for circulation copies

or unique pieces of the space object in material. The method according to

the invention included the object is a three dimensional digital model (1),

which is brought in a geometry form (2) comprising the whole three

dimensional digital model - the positive (1) and representing its matrix (2),

whereby the outer contour of the three dimensional digital model (1),

determined by a multitude of points belonging simultaneously to the three

dimensional digital model (1) and to the geometry form (2), determines the

dividing contour (3).
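The Google machine translation of BG111248 and WIPO's English abstract describe the same invention in different words, which matters for keyword searching. The sketch below compares only the first clause of each text as quoted on the slides; the tokenisation (lower-cased runs of letters) is a simplifying assumption.

```python
import re

# First clause of each English version, as quoted on the slides.
MACHINE_TRANSLATION = (
    "The present invention relates to a method for converting a "
    "three-dimensional digital model of the spatial object of sheet material"
)
WIPO_ABSTRACT = (
    "The invention relates to a method for transforming a three dimensional "
    "digital model in a space object made of sheet material."
)


def vocabulary(text):
    """Return the set of lower-cased words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


mt_words = vocabulary(MACHINE_TRANSLATION)
wipo_words = vocabulary(WIPO_ABSTRACT)

only_mt = mt_words - wipo_words
only_wipo = wipo_words - mt_words
print("Only in the machine translation:", sorted(only_mt))
print("Only in the WIPO abstract:", sorted(only_wipo))
```

A search for "space object" would therefore match only the WIPO wording, and a search for "spatial object" only the machine translation.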


Patent offices offering English abstracts

[Bar chart: applications per patent office, CN US JP KR EP DE RU IN CA BR AU UK FR MX]

Patent abstract languages (estimated, WIPO 2013)

English abstract       94%
No English abstract     6%

The delay can be 3 months.

English patent abstracts in Espacenet

Content of the Espacenet patent database:

• > 90 million patent documents

• with searchable titles and abstracts

• only as far as an English abstract is available

English-language abstracts are available for all patent

applications from PCT minimum documentation countries

dating from 1970 onwards, in some cases even earlier.


English patent abstracts in Espacenet

For patent documents not originally in English and without a

corresponding document (family member) in English an

abstract may be available as a translation into English of the

original abstract.

In some cases the EPO has translated neither the title nor the

abstract into English (FR applications, DE utility models etc.).

In this case the patent document cannot be found in Espacenet using text search!

DE utility model without patent family

DE 202015004336 U

Vorrichtung zum Auftragen zumindest eines Werkstoffes, Extruder, 3D-Druckkopf, 3D-Drucker, Werkzeugmaschine und Steuereinrichtung
(English: Device for applying at least one material, extruder, 3D print head, 3D printer, machine tool and control device)

[Screenshot: Espacenet]

English patent abstracts

Espacenet search for "Additive manufacturing" compared to DEPATISnet

Search field          DEPATISnet documents   DEPATISnet families   Espacenet families
Title                         761                    376                  427
Abstract                     1008                    602                  628
Title and Abstract           1266                    711                  761

Patent abstracting services

• Derwent World Patents Index (DWPI)

• Chemical Abstracts Service (CAS)

• …

Derwent WPI

Derwent documents are created per family and consist of an

English abstract and further coding.

The process of abstracting a new invention (by specialists) involves

• reviewing the entire document with particular emphasis on the claimed invention

• capturing and clearly describing the invention …

The aim of an abstract …

• is to support post search review of search results

• is to support a rough understanding of an

invention without reading the full document

• is not primarily to support the discovery of the

document by searching the abstract text.


What is special with patent information?

• Patent families

• Patent abstracts

• Full text of patent documents


Limited copyright for patents

• Patent offices have to disclose patents to the public.

• Patent offices offer free patent databases to the public.

• Patent documents are available for free or for a fee that merely covers costs.

• But:

Patent text and figures must not be copied without citation.

Searching patent text for free

Fulltext availability in free patent databases

• DEPATISnet – searchable full text for DE documents only

• Espacenet – searchable full text for EP documents only, using a separate database

• Patentscope – searchable full text for documents from 25 offices in the official language, some in non-Latin script

Searching text using Patentscope

Search for "Additive manufacturing"

Search field          DEPATISnet documents   Patentscope documents
Title                         761                     633
Abstract                     1008                     845
Title and Abstract           1266                   1,057
Fulltext                        -                   4,079
Claims only                     -                   1,108

Patentscope content:

• > 49 million patent documents

• from 43 patent offices

• full text from 25 offices, some in non-Latin script

Searching text using Patentscope

Stemming – a curse or a blessing?

Example

Searching Patentscope for "Additive Manufacturing"

Stemming activated: 14,405

Stemming disabled: 4,079

Reason

"additive" has the same stem as "additives" and "additional"
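Why stemming inflates the result set can be shown with a toy suffix stripper. This is explicitly not Patentscope's stemmer: the suffix list, the function names and the four sample texts are assumptions made for illustration.

```python
import re

# Toy suffix list, longest first; real stemmers use far more elaborate rules.
SUFFIXES = ("ional", "ives", "ive", "ion", "s")


def toy_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokens(text, stem):
    """Lower-cased words of the text, optionally stemmed."""
    words = re.findall(r"[a-z]+", text.lower())
    return [toy_stem(w) for w in words] if stem else words


def count_hits(texts, phrase, stem):
    """Count texts containing the phrase as a contiguous word sequence."""
    wanted = tokens(phrase, stem)
    n = len(wanted)
    hits = 0
    for text in texts:
        words = tokens(text, stem)
        if any(words[i:i + n] == wanted for i in range(len(words) - n + 1)):
            hits += 1
    return hits


# Invented sample texts.
TEXTS = [
    "Additive manufacturing of metal parts",
    "Additional manufacturing steps are required",
    "Plant for additives manufacturing",
    "Laser sintering device",
]

exact_hits = count_hits(TEXTS, "additive manufacturing", stem=False)
stemmed_hits = count_hits(TEXTS, "additive manufacturing", stem=True)
print(exact_hits, stemmed_hits)
```

With stemming switched on, the texts about "additional manufacturing steps" and "additives manufacturing" are retrieved as well, which mirrors the jump in hits reported for Patentscope.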

Fee based full-text patent databases

• Infoapps Sem-IP

• LexisNexis Total Patent

• Minesoft Patbase

• Questel Orbit

• SIP Invention Navigator

• Thomson Innovation

• STN

• Proquest Dialog

Patbase content:

• > 100 million patent documents

• from > 100 patent offices

• full text from 27 offices, some in non-Latin script

Searching text using Patbase

Search for "Additive manufacturing"

Search field          DEPATISnet documents   Patentscope documents   Patbase documents   Patbase compared to Patentscope
Title                         761                     633                   871                 + 38 %
Abstract                     1008                     845                 1,342                 + 59 %
Title and Abstract           1266                   1,057                 1,565                 + 48 %
Fulltext                        -                   4,079                 7,234                 + 77 %
Claims                          -                   1,108                 2,381                 + 115 %
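The comparison column can be recomputed from the document counts. The short sketch below rounds each relative increase of Patbase over Patentscope to the nearest whole percent; the variable and function names are chosen for this example only.

```python
# (Patentscope documents, Patbase documents) per search field, from the slide.
COUNTS = {
    "Title": (633, 871),
    "Abstract": (845, 1342),
    "Title and Abstract": (1057, 1565),
    "Fulltext": (4079, 7234),
    "Claims": (1108, 2381),
}


def percent_increase(base, other):
    """Relative increase of other over base, rounded to whole percent."""
    return round((other - base) / base * 100)


increases = {
    field: percent_increase(patentscope, patbase)
    for field, (patentscope, patbase) in COUNTS.items()
}
for field, value in increases.items():
    print("{}: + {} %".format(field, value))
```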

Searching text using Patbase

Search for "Additive manufacturing"

Search field          DEPATISnet documents   Patentscope documents   Patbase documents   Patbase patent families
Title                         761                     633                   871                   423
Abstract                     1008                     845                 1,342                   644
Title and Abstract           1266                   1,057                 1,565                   770
Fulltext                        -                   4,079                 7,234                 2,757
Claims                          -                   1,108                 2,381                   850

Searchable text available for a patent family in PatBase


Searchable machine translated text available in Patbase

If a patent family has no English-language member, Patbase provides searchable machine translations of the title, abstract and full text.


Searchable machine translated text available in Patbase

DE 201520004336 U – Description – Machine translation

DE 201520004336 U – Description


Agenda

• Usage of language – in general, in science and in patents

• Patent information searching with words

Using the English language

Using machine translation

Using Crosslingual searching

• Non patent literature searching with words

• The future of multilingual searching


Machine translation

An often-told story about machine-translating a well-known saying:

The spirit is willing, but the flesh is weak.

Machine-translated into Russian and back again:

The vodka is great, but the meat is lousy.

This story is not true!


Machine translation

A translation machine cannot understand the meaning of a text, e.g. when translating this sentence:

• English

There are seven words in this sentence.

• German

Es gibt sieben Wörter in diesem Satz.

• French

Il y a sept mots dans cette phrase.

The French translation is linguistically correct, but it has eight words, so the statement is no longer true.
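A quick word count makes the point concrete. This is only a sketch that splits on whitespace; the helper name `word_count` is an assumption.

```python
sentences = {
    "English": "There are seven words in this sentence.",
    "German": "Es gibt sieben Wörter in diesem Satz.",
    "French": "Il y a sept mots dans cette phrase.",
}


def word_count(sentence: str) -> int:
    """Count whitespace-separated words."""
    return len(sentence.split())


for language, sentence in sentences.items():
    print(language, word_count(sentence))
```

The English and German sentences each contain seven words, the French one eight: the translation is faithful word for word, yet what it says about itself has become false.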


Some basics on machine translation

There are three main approaches:

• Rule-Based Machine Translation

“is like breaking a secret office’s secret code”

• Statistical Machine Translation

“an electronic bulldozer allowed to steal other people’s work”

• Hybrid Machine Translation

Something in between

Quotes from David Bellos’s book


Rule-Based Machine Translation

An approach based on linguistic information about the source and target languages, retrieved mainly from dictionaries and grammars that cover the main semantic, morphological and syntactic regularities of each language.

Example: JPO translation machine
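The principle can be sketched with a tiny dictionary and a single grammar rule. This is a toy illustration only and says nothing about how the JPO system is built; the lexicon, the tags and the function name `translate_rule_based` are assumptions, and the lexicon ignores grammatical gender.

```python
# Hypothetical mini-lexicon with part-of-speech tags (English -> French).
LEXICON = {
    "the": ("la", "DET"),
    "red": ("rouge", "ADJ"),
    "car": ("voiture", "NOUN"),
}


def translate_rule_based(sentence: str) -> str:
    # Step 1: dictionary lookup, word by word.
    tagged = [LEXICON[word] for word in sentence.lower().split()]
    # Step 2: grammar rule: English ADJ NOUN becomes French NOUN ADJ.
    i = 0
    while i < len(tagged) - 1:
        if tagged[i][1] == "ADJ" and tagged[i + 1][1] == "NOUN":
            tagged[i], tagged[i + 1] = tagged[i + 1], tagged[i]
            i += 2
        else:
            i += 1
    return " ".join(word for word, _ in tagged)


print(translate_rule_based("the red car"))
```

A real rule-based system needs thousands of such dictionary entries and rules for every language pair, which is why building one is so labour-intensive.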


„OCR Technologies and machine translation“ by Keiji Asano, HATSUMEI‐TSUSHIN Co.,Ltd. EPO EMW 2014

at the Japanese patent office


„OCR Technologies and machine translation“ by Keiji Asano, HATSUMEI‐TSUSHIN Co.,Ltd. EPO EMW 2014


Statistical machine translation

This approach works by detecting patterns in hundreds of millions of documents that have previously been translated by humans and making intelligent guesses based on the findings.

Example: Google Translate
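In miniature, the idea is to count how human translators rendered a phrase and to pick the most frequent rendering. The sketch below is a toy and not how Google Translate is implemented; the phrase pairs and function names are made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical aligned phrase pairs, standing in for a large
# corpus of human translations (English -> German).
PARALLEL_PAIRS = [
    ("additive manufacturing", "additive Fertigung"),
    ("additive manufacturing", "additive Fertigung"),
    ("additive manufacturing", "generative Fertigung"),
    ("additive", "Zusatzstoff"),
]


def build_phrase_table(pairs):
    """Count how often each target phrase was used for each source phrase."""
    table = defaultdict(Counter)
    for source, target in pairs:
        table[source][target] += 1
    return table


def most_likely_translation(table, phrase):
    """Pick the target phrase seen most often for this source phrase."""
    return table[phrase].most_common(1)[0][0]


table = build_phrase_table(PARALLEL_PAIRS)
print(most_likely_translation(table, "additive manufacturing"))
```

The quality of such a system depends directly on the amount and quality of the human translations it can learn from.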


http://works.bepress.com/cgi/viewcontent.cgi?article=1054&context=uwe_muegge


Machine translation at the EPO

MT is more necessary than ever for global patent documentation:

• The volume of patent collections is growing; systematic manual translation is no longer an option

• Machine translation provides multilingual access to patent documents

• Support for all EPO member state languages, as well as Asian languages and Russian

• Service offered free of charge to users of Espacenet and the European publication server

Source: Auke Hoekstra, European Patent Office, EPO East Meets West 2014


Machine translation specialised in patents

The EPO on Google Translate in Espacenet

The cooperation with Google […] has already led to a significant improvement in the quality of the machine translation of patents.

This was achieved by the introduction of several hundred thousand high quality translations of patents in the languages provided by the EPO, which Google used to ‘train’ its Google translate system.

It takes a statistical approach, comparing the source document sentence by sentence to millions of patent documents previously translated by humans.

The final translation profits from this “previous learning” by the translation engine.

We know that confidentiality is crucial for users of information provided by the EPO, so our agreement with Google ensures that nobody has access to information about your searches or translations.


Espacenet machine translation: Patent Translate for 32 languages

[Figure: languages covered, grouped into English, German, French and languages using non-Latin script. Source: EPO 2014]


Patentscope machine translation

Patentscope allows for translation

• of titles and abstracts in the result list

• of the full text of the description or claims

Patentscope offers a choice of translation machines


Machine translation specialised in patents

WIPO Translate

is a statistical machine translation tool that was developed in-house. It can translate patent documents in 14 language pairs: between English and Chinese, French, German, Japanese, Korean, Russian and Spanish, in both directions.
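The figure of 14 follows if each of the seven listed languages is paired with English in both directions, which is how the slide reads. A quick check (variable names are illustrative):

```python
partners = ["Chinese", "French", "German", "Japanese", "Korean", "Russian", "Spanish"]

# Each partner language is paired with English in both directions.
pairs = [("English", lang) for lang in partners] + [(lang, "English") for lang in partners]

print(len(pairs))
```

Note that under this reading there is no direct pair between two non-English languages, for example Chinese to German.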


Patentscope result list

Patentscope result list translation

Patentscope full text translation

While Patentscope’s translation machine works on the fly, you can monitor the progress of the translation.


Translation machines work on the fly

• In free patent databases, machine-translated text is not stored and indexed.

• It is generated anew on every request.

• This is why machine-translated text is not searchable in free patent databases.

• This is different for fee-based patent databases.
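The difference comes down to what goes into the search index. The sketch below builds a minimal inverted index twice, once from original text only and once including stored translations; the document IDs, texts and function names are hypothetical and do not reflect any database's internals.

```python
from collections import defaultdict

# Hypothetical documents: original text plus an English machine translation.
DOCS = {
    "CN-1": {"original": "增材制造 方法", "translation": "additive manufacturing method"},
    "EP-1": {"original": "additive manufacturing device", "translation": None},
}


def build_index(docs, include_translations):
    """Map each token to the set of documents that contain it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        texts = [doc["original"]]
        if include_translations and doc["translation"]:
            texts.append(doc["translation"])  # stored, therefore indexable
        for text in texts:
            for token in text.lower().split():
                index[token].add(doc_id)
    return index


def search(index, term):
    return sorted(index.get(term.lower(), set()))


on_the_fly = build_index(DOCS, include_translations=False)
stored = build_index(DOCS, include_translations=True)
print(search(on_the_fly, "manufacturing"))
print(search(stored, "manufacturing"))
```

Only the index that includes the stored translation finds the Chinese document with an English search term.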


How Patbase machine translation works

• PatBase stores English machine translations for Japanese, Chinese, Korean, Russian and Thai text.

• When clicking on the non-Latin full-text link, the original text and, where available, the stored machine translation are displayed side by side for easier comparison.

• There is also the option to translate non-Latin full text again on the fly, to profit from any improvements to the translation machine.


Machine translated text available in PatBase


There is also the option to translate non-Latin full text again on the fly, to profit from any improvements to the translation machine.


How Patbase machine translation works

• For some languages (CN, KR, JP, RU, TH) the translated text is indexed and thus available for searching in English.


Searching text using Patbase

Search for „Additive manufacturing“

                      DEPATISnet   Patentscope   Patbase      Patbase: searching the stored
                      documents    documents     documents    machine translations in English
Title                 761          633           871
Abstract              1,008        845           1,342
Title and Abstract    1,266        1,057         1,565
Fulltext              -            4,079         7,234        14,541 documents, 13,024 additional
Claims                -            1,108         2,381


Machine translated text searched in PatBase

Example: CN104802412


Agenda

• Usage of language – in general, in science and in patents

• Patent information searching with words

Using the English language

Using machine translation

Using Crosslingual searching

• Non patent literature searching with words

• The future of multilingual searching


What is cross-language searching?

Searching a multilingual text corpus with a multilingual search query

Suitable terms can be selected from various sources:

• Translation machines like Google Translate

• Dictionaries

• Thesauri (with foreign language synonyms)

• Terminologies


Search terms in foreign languages

Now copy and paste the translated term into the search mask of a database that supports searching in non-Latin script.


WIPO Pearl – a multilingual terminology

Multilingual terminology portal built up by specialists in a controlled process, comprising

• 29 subject fields

• 16,000 technical concepts

• 105,000 terms offered in up to ten languages:

  • English, French, German, Spanish, Portuguese

  • Japanese, Chinese, Korean

  • Russian, Arabic

www.wipo.int/reference/en/wipopearl/


Concept Map presenting 29 subject fields

Hovering over a field shows its complete title and the number of concepts it contains.

http://www.wipo.int/reference/en/wipopearl


Subject field „Manufacturing and product handling“ with its 17 subfields

Again, hovering over a subfield shows its complete title and the number of concepts it contains.


Concept map with 66 concepts and relations


Concept maps available in up to ten languages (examples shown: DE, ES)


Detailed information about each concept and its synonyms for all available languages


Context and comments


Google translated term

添加剂制造


Patbase

Keyword searches can be run in Chinese, Japanese, Korean, Russian and Thai in the original language with non-Latin characters. The non-Latin search tool helps to identify documents that might be missed by an exclusively English keyword search.

There is a Term Translator tool (powered by WIPO CLIR) within the Non-Latin search form.


Patbase Non-Latin search


Searching text using Patbase

Search for „Additive manufacturing“

                      DEPATISnet   Patentscope   Patbase      Patbase: searching the full text in
                      documents    documents     documents    non-Latin script (CN, JP, KR, RU, TH)
Title                 761          633           871
Abstract              1,008        845           1,342
Title and Abstract    1,266        1,057         1,565
Fulltext              -            4,079         7,234        6,817 documents, 6,085 additional
Claims                -            1,108         2,381


Searching text using Patbase

Search for „Additive manufacturing“

                      Patbase      Patbase: searching the stored          Patbase: searching the full text in
                      documents    machine translations in English       non-Latin script (CN, JP, KR, RU, TH)
Title                 871
Abstract              1,342
Title and Abstract    1,565
Fulltext              7,234        14,541 documents, 13,024 additional    6,817 documents, 6,085 additional
Fulltext total        24,520
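The total of 24,520 is smaller than the sum of the English full-text hits and the two "additional" figures, which suggests that some documents were found by both extra searches. The sketch below derives the implied overlap. It assumes that both "additional" figures are counted relative to the 7,234 English full-text hits; that reading is an interpretation, not something the slide states.

```python
english_fulltext = 7234      # Patbase full-text search in English
extra_via_mt = 13024         # additional documents via stored machine translations
extra_via_non_latin = 6085   # additional documents via non-Latin full text
reported_total = 24520       # total given on the slide

naive_sum = english_fulltext + extra_via_mt + extra_via_non_latin
implied_overlap = naive_sum - reported_total

print(naive_sum, implied_overlap)
```

Under this assumption, roughly 1,800 documents were retrieved by both the machine-translation search and the non-Latin search, so the two approaches complement each other far more than they duplicate each other.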


Patbase results for a search in Chinese

Example: CN104802412


WIPO CLIR

CLIR stands for Cross-Lingual Information Retrieval.

CLIR is available in 13 languages:

English, French, German, Spanish, Portuguese, Japanese, Russian, Chinese, Korean, Italian, Swedish, Dutch


WIPO CLIR

• CLIR is a tool that can propose synonyms for keywords you enter.

• It can also translate your original input and the generated synonyms into 12 other languages.

• It can automatically search your collection of terms, synonyms and translations.


WIPO CLIR

In supervised expansion mode you can fine-tune the search query that CLIR builds before it is run:

• Select technical fields

• Select proposed synonyms

• Specify proximity

• Choose languages

• Choose text elements to be searched (TI, AB, DE, CL)


Patentscope CLIR – Search string part 1

ALLTXT:

(JA_TI:("添加 製造"~2 OR "アディティブ製造"~2 OR "混和製造"~2 OR "添加生産"~2 OR "アディティブ生産"~2 OR "混和生産"~2) OR JA_AB:("添加 製造"~2 OR "アディティブ製造"~2 OR "混和製造"~2 OR "添加生産"~2 OR "アディティブ生産"~2 OR "混和生産"~2)) OR


Patentscope CLIR – Search string part 2

(KO_TI:("제조첨가제"~2 OR "제조첨가제의"~2 OR "제조첨가물"~2 OR "제조코팅물로서의"~2 OR "제조첨가"~2 OR "제조첨가재료"~2 OR "생산첨가제"~2

OR "제조방법첨가제"~2 OR "생산첨가제의"~2 OR "생산첨가물"~2 OR

"제조방법첨가제의"~2 OR "생산코팅물로서의"~2 OR "제조방법첨가물"~2 OR

"제조방법코팅물로서의"~2) OR KO_AB:("제조 첨가제"~2 OR "제조첨가제의"~2

OR "제조첨가물"~2 OR "제조코팅물로서의"~2 OR "제조첨가"~2 OR "제조첨가재료"~2 OR "생산첨가제"~2 OR "제조방법첨가제"~2 OR "생산첨가제의"~2 OR "생산첨가물"~2 OR "제조방법첨가제의"~2 OR "생산코팅물로서의"~2 OR "제조방법첨가물"~2 OR "제조방법코팅물로서의"~2)) OR


Patentscope CLIR – Search string part 3

(ZH_TI:("外加剂制造"~2 OR "附加制造"~2 OR "添加制造"~2 OR "助剂制造"~2

OR "外加剂生产"~2 OR "附加生产"~2 OR "添加生产"~2 OR "助剂生产"~2 OR

"外加剂制作"~2 OR "附加制作"~2 OR "添加制作"~2 OR "助剂制作"~2) OR

ZH_AB:("外加剂 制造"~2 OR "附加制造"~2 OR "添加制造"~2 OR "助剂制造"~2

OR "外加剂生产"~2 OR "附加生产"~2 OR "添加生产"~2 OR "助剂生产"~2 OR

"外加剂制作"~2 OR "附加制作"~2 OR "添加制作"~2 OR "助剂制作"~2)))

5743 documents
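A query of this shape can be assembled mechanically from a list of translated terms. The sketch below only mimics the syntax visible in the search strings above (field prefix, quoted phrase, `~2` proximity, `OR`); the function names and the two sample terms are illustrative assumptions, and this is not the code CLIR uses.

```python
def build_field_query(field: str, terms: list[str], proximity: int = 2) -> str:
    """Return e.g. ZH_TI:("a"~2 OR "b"~2) for the given field and terms."""
    clauses = [f'"{term}"~{proximity}' for term in terms]
    return f"{field}:(" + " OR ".join(clauses) + ")"


def build_language_query(lang: str, terms: list[str]) -> str:
    """Search title (TI) and abstract (AB) of one language, as in the slides."""
    parts = [build_field_query(f"{lang}_{section}", terms) for section in ("TI", "AB")]
    return "(" + " OR ".join(parts) + ")"


print(build_language_query("ZH", ["添加制造", "附加制造"]))
```

Joining one such block per language with `OR` gives a string like the three-part query shown on the slides.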


Sometimes language is not the answer

In this case the answer could be:

• Thesauri and descriptors

• Codes, e.g. Derwent Manual Codes

• Patent classification, e.g. IPC

These work best when integrated into the database used.


Derwent Manual Codes


IPC


Development of B33Y-classified patent documents compared to full-text searching for „additive manufacturing“


Agenda

• Usage of language – in general, in science and in patents

• Patent information searching with words

Using the English language

Using machine translation

Using Crosslingual searching

• Non patent literature searching with words

• The future of multilingual searching


Non-patent literature – NPL

What is it?

• Scientific articles

• Books

• Standards

• Conference proceedings

• Company publications, e.g. product information

• Publications on the internet

• „Gray literature“

• …


Sources for literature searching

• Internet search engines, most commonly Google

• Google Scholar

• Abstracting services

• Citation services

• Standards databases

• …


Reasons for (multilingual) NPL searching

• Scientists will find publications outside their regular range of journals.

• Information specialists can identify worldwide publications

  • using abstracting services or

  • through the internet using machine translation

• Patent information specialists have to search for NPL to complete novelty or FTO searches.


Multilingual NPL as part of patent searching

PCT Minimum Documentation: non-patent literature is part of the examiner’s search duty. Examples:

• Kobunshi Ronbunshu (Japanese Journal of Polymer Science and Engineering)

• Korean Journal of Traditional Knowledge

• Žurnal obsej himii (Russian Journal of General Chemistry)


Source EPO

at the EPO


http://oversea.cnki.net


Source: STN


Multilingual searching using Google

• Choose the search language

• Use Google Translate to find suitable terms in the selected language

• Search for these terms using Google

• Use the Google toolbar option to highlight terms in the resulting document list

• Use Google’s webpage translation


Korean

Chinese, simpl.

Chinese

Japanese


Google toolbar


Google translate for search terms

Now copy and paste the translated term into Google.


Machine translated result page


Korean

For some languages a keyboard is available


Google Scholar

• A freely accessible web search engine that indexes the full text or metadata of scholarly literature across an array of publishing formats and disciplines.

• The Google Scholar index includes most peer-reviewed online journals of Europe and America's largest scholarly publishers, plus scholarly books and other non-peer-reviewed journals.

• Google Scholar is searchable in 13 languages.

Source: Wikipedia, 12.10.15


Google Scholar

• Google does not publish the size of Google Scholar's database; third-party researchers estimated it to contain roughly 160 million documents as of May 2014.

• Google Scholar resembles the subscription-based tools Elsevier's Scopus and Thomson Reuters' Web of Science.

Source: Wikipedia, 12.10.15


Origin of literature


Using Google Scholar in English


Using Google Scholar in Chinese with highlighting and machine translation


Multilingual NPL abstracting services

Abstracts written by specialists

• CAS (NPL and patents)

• Medline

• Inspec

• …


[Chart: Development of CAS content (NPL and patents) in different languages. Series: English, Chinese, German, French, Russian. Years shown: 1974, 1984, 1994, 2004, 2009, 2014. Vertical axis from 0 to 1,200,000.]


Multilingual searching using language tools

• Terminology

• Thesaurus

• Taxonomy

• Ontology

• Classification


WIPO Pearl – terminology available in up to ten languages (examples shown: DE, ES)


Thesaurus

• A reference work that lists words grouped together according to similarity of meaning (containing synonyms and sometimes antonyms).

• A thesaurus may contain foreign language synonyms.

• This is in contrast to a dictionary, which provides definitions for words and generally lists them in alphabetical order.

Source: Wikipedia


INSPEC – Content

Multidisciplinary Database

• Physics

• Electrical & Electronics Engineering

• Manufacturing, Production & Mechanical Engineering

• Computing & Control Engineering

• Information Technology


INSPEC – Content

• more than 15 million records

• from 1898 to date

• > 3,800 journals

• other publications: conference proceedings, books, reports, dissertations, patents


Inspec Thesaurus – Selected items

BT Broader term

TT Top term

NT Narrower term

RT Related term
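Relations such as BT, NT and RT can be used to widen a query automatically. The sketch below shows the idea with a single made-up entry; the terms are illustrative and not taken from the Inspec Thesaurus, and the function name `expand` is an assumption.

```python
# Hypothetical thesaurus entry; the terms are illustrative, not taken from INSPEC.
THESAURUS = {
    "rapid prototyping": {
        "BT": ["manufacturing processes"],
        "NT": ["stereolithography"],
        "RT": ["three-dimensional printing"],
    },
}


def expand(term: str, relations=("NT", "RT")) -> list[str]:
    """Return the term plus its narrower and related terms for an OR search."""
    entry = THESAURUS.get(term, {})
    expanded = [term]
    for relation in relations:
        expanded.extend(entry.get(relation, []))
    return expanded


print(" OR ".join(f'"{t}"' for t in expand("rapid prototyping")))
```

Because thesaurus terms are assigned by indexers regardless of the language of the original document, such an expansion also reaches non-English records.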


INSPEC documents in non-English languages

[Bar chart: INSPEC documents on „Additive manufacturing“ in Chinese, Japanese, German and Russian. Vertical axis from 0 to 300.]

Plus > 5,000 documents in English


Sources for literature searching

• Internet search engines, most commonly Google

• Google Scholar

• Abstracting services

• Citation services

• Standards databases

• …


Asian Science Citation Index – ASCI

ASCI is committed to providing authoritative, trusted and significant information by covering the most important and influential journals.

http://ascidatabase.com


„Asian Documentation & Patents and Standards: an update“ (2015) by Niclas Morey, EPO


Agenda

• Usage of language – in general, in science and in patents

• Patent information searching with words

Using the English language

Using machine translation

Using Crosslingual searching

• Non patent literature searching with words

• The future of multilingual searching


The future

• What will probably not become reality?

• What can we expect?

• What will come next?

• What is quite far away?

• What is really far away?


What will probably not become reality?

„I would like to see patent offices and others writing patent abstracts using a thesaurus with recommended wording and spelling, which would suggest the ‚right‘ word if a variant was used.“

Stephen van Dulken 2014


The future – What can we expect?

Joining search tools with language tools such as thesauri, taxonomies and ontologies:

• This works nearly perfectly for patent classification in patent databases.

• This works well for thesauri in some databases, e.g. INSPEC.

• This has to be enabled for further systems, e.g. a combination of Patentscope and WIPO Pearl.


What will come next?

• WIPO: Further patent offices’ documents, data and full text will be included.

• EPO: Full-text search for some patent offices’ documents has been announced.

• More databases with searchable machine-translated text

• Increasing user acceptance of machine-translated text, to justify further development efforts and funding


What is quite far away?

• Better automatic processing of linguistic constructions, especially for Asian languages

• Multilingual access to non-textual documents, e.g. images, videos, etc.


What is really far away?

A change of THE global language?

Will English be the Lingua Franca of the future?

Other languages that were once important have fallen back:

• Latin – the universal language in medieval times

• French – the language of politics and diplomacy in the 18th and 19th centuries

• German – the language of science until recently

• Esperanto – never really made it


Thank you for listening!

Contact:

WissensWert Seminare – Beratung

Dr. Gabriele Kirch-Verfuß

[email protected]

T: +49-2361-9040-273

www.wissenswert-wm.de

Subscribe to our newsletter: [email protected]


Literature – books

• Carol Peters, Martin Braschler, Paul Clough: Multilingual Information Retrieval. Springer, 2012

• David Bellos: Is That a Fish in Your Ear? Faber & Faber, reprint 2012


Literature – journal articles

• Stephen van Dulken: Do you know English? The challenge of the English language for patent searchers. World Patent Information, 2014, vol. 39, pages 35-40

• Stephen Adams: The text, the full text and nothing but the text: Part 1. World Patent Information, 2010, vol. 32, pages 22-29

• Stephen Adams: The text, the full text and nothing but the text: Part 2. World Patent Information, 2010, vol. 32, pages 120-128
