KNIME used in integrated Operations and Economics Research, Forecasting and Budgeting

KNIME UGM 2012 | February 1st 2012 | Zurich, Switzerland

Marc Richter | Office for Harmonization in the Internal Market (EU Trade Marks and Designs) | Alicante, Spain



DISCLAIMER

Any views or opinions expressed herein are those of the author and do not necessarily reflect the views of OHIM, or those of its management or staff.


ToC

• OHIM: Institutional role and strategy
• The links between operations and economics research, forecasting and budgeting
• Internal KNIME uses & experiments
• Cooperation Fund and other joint projects
• Conclusions


OHIM: Institutional role and strategy


OHIM: basic institutional information


• Non-profit-making EU agency
• Established in the late 90s
• Registering trade marks and designs that are valid in all 27 EU member states → Intellectual / Industrial Property (IP) Office
• 1 million Community Trade Marks filed by Sept 2011
• Around 700 internal staff (1000+ incl. contractors)
• Annual income of around €180m (fully self-financed)
• Supervised by the European Commission, but having legal, administrative and financial autonomy


OHIM Industrial Property (IP) Rights

Registered Community Design (RCD) [RCD 000181607-0001]

• Protects the outward appearance of a product or part of it, resulting from the lines, contours, colours, shape, texture, materials and/or its ornamentation
• Limited renewals (max. 25 years)
• Mostly a formality-driven process

Community Trade Mark (CTM) [CTM No. 6314546]

• Can be any sign which serves in business to distinguish the goods and services of one undertaking from those of other undertakings and over which the owner has an exclusive right
• Unlimited renewals
• Complex legal process & requirements


CTM registration process: an example

For fruit retail services, home computers and fusion reactors

[Flowchart with annual volume annotations: ~100,000/yr., ~93,000/yr., ~18,000/yr.]

Flowchart source: http://oami.europa.eu/ows/rw/pages/CTM/regProcess/regProcess.en.do


2011: Strategic Management approach


• Strategic Plan metrics
  − Two pillars: Organisational Excellence | International Cooperation
  − Three goals: Organisation | Quality & Timeliness | Convergence of practices
  − Six lines of action (LoA)
  − 33 key initiatives (KI)

• Strategy implementation
  − Strategy map with 20 linked strategic objectives (mapping all KIs)
  − One strategic programme per line of action (for all KIs)
  − May 2011 reorganisation to support the Strategy (next slide)

• Strategy progress monitoring
  − Corporate Balanced Scorecard (BSC)
    o 45 aggregate corporate indicators
    o 127 mid-level indicators
    o Even more detail at bottom level
  − Departmental scorecard deployment, adding further breakdowns at bottom level


The Economics and Statistics Service shall be responsible for all activities related to the following tasks:

• Economic development
• IP trends
• New emerging markets and new IP rights
• Social and demographic data including IP demand projections
• CTM and RCD statistics

Unofficial motto (quoting Chief Economist Nathan Wajsman):

“We torture the numbers until they confess”


The links between operations and economics research, forecasting and budgeting


Basic planning need: Public sector balance


Balance = Income - Expenses = 0

Balance = (gov't funding + fees after refunds) - (operational + strategic + overhead costs) = 0

OHIM balance = fees after refunds - (operational + strategic + overhead costs) = 0

→ All income and most costs (operations) depend exclusively on filings

Fees after refunds:
• Fee level
• Demand at fee level
  − New applications
  − Renewals
  − Defensive actions
• (Fee refunds)

Questions about demand: Why? How? When? From where? In which language? How conflictive? How difficult?
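The balance identities above can be sketched in a few lines of Python. This is only an illustration, not OHIM's planning model: the function names, the assumption that fee income is fee level times demand minus refunds, and all figures are hypothetical and chosen so that the balance comes out at zero.

```python
def fees_after_refunds(fee_level, demand, refunds=0.0):
    """Fee income net of refunds (assumed form: fee level x demand - refunds)."""
    return fee_level * demand - refunds


def balance(fee_income, operational, strategic, overhead, govt_funding=0.0):
    """Income minus expenses. govt_funding defaults to 0 for a self-financed office."""
    income = govt_funding + fee_income
    expenses = operational + strategic + overhead
    return income - expenses


# Hypothetical figures, for illustration only.
fee_income = fees_after_refunds(fee_level=1_000.0, demand=100, refunds=5_000.0)
ohim_balance = balance(
    fee_income,
    operational=60_000.0,
    strategic=20_000.0,
    overhead=15_000.0,
)
print(f"fees after refunds: {fee_income:.0f}, balance: {ohim_balance:.0f}")
```

Because demand enters the income side directly and, through workload, the operational cost side as well, a change in the filing forecast moves both terms of the balance.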


Operational cost waterfall with feedback loop

Diagram boxes, in slide order:
• Demand by procedure, area, language
• Workload & stocks / backlogs
• Service level commitments & case handling approach
• Staff needs & mean salary levels by competence
• Satisfaction by procedure, area, language, country
• Multipliers (media)


Draft internal analytic flow model for OHIM


[Flow diagram nodes: Absolute Grounds, Classification, Formalities, Publication, Registered, Opposition, Renewal, Appeal, Cancellation, GC, CJ, Fees check]

[Legend: All cases | Some cases, neutral path | Some cases, escalated path | Dead cases]

[Richter 2010]


Generic TM flow draft & parameters of interest

[Flow diagram. Stages: EXA, Pre-PUB (PP), PUB / OPP wait, REG, OPP, REN, Boards / Courts, CANC. Flow labels start from "Volume (non-duplicates) by FC, FY, L/C" and accumulate parameters along the flow, e.g. "Volume by FC, FY, L/C, EXA DEF / DEL", "Volume by FC, FY, L/C, EXA/PP DEF / DEL", "Volume by FC, FY, L/C, EXA/PP/OPP DEF / DEL, PT, OC", "Volume by FC, FY, L/C, EXA/PP/OPP/APP DEF / DEL, PT, OC" and "Volume by FC, FY, L/C, EXA/PP/OPP/CANC DEF / DEL, PT, OC".]

Legend: FC: Filing channel | FY: Filing year | L/C: Language / country | DEF: Deficiency/(-ies) | DEL: Delay | PT: Parties | OC: Outcome(s)

[Richter 2010]


Typical questions, methods & data sources


• How many applications will I get from the anonymous market?
  → econometrics
• How many filings come from not-so-anonymous returning filers?
  → behaviour analysis
• Who does not return? Why? (“churn”)
  → behaviour analysis
• Who returns and does not identify oneself as returning filer?
  → data quality assurance
• Who renews how often? Who files refused applications (i.e. waste)? Who opposes? Appeals? Why? What can we do about it?
  → behaviour analysis

Data sources: External economic / market data | Internal register data


Conclusions

• IP Offices generally like easy application filings and tend to dislike complicated escalations
• However, in practice they are extremely dependent on market actors’ use of their services (only 2 of “four P’s” partially controlled)
  − Product, Price, Place, Promotion
• Unfortunately, practice and real life show complex interactions and have many dimensions
• As a consequence, IP Office registers are underused treasure vaults of valuable information – particularly underused in trade marks and designs, and particularly valuable for internal purposes


A glimpse at OHIM predictor candidates

case characteristic | category samples | scale | values | df | multiples (per right) | suitability (ease and quality)
--- | --- | --- | --- | --- | --- | ---
CTM/ER type | word, fig, 3d, … | cat | ~5 | ~4 | no | very high
CTM/ER nature | indiv/collect | cat | 2 | 1 | no | very high
ER kind | nat. mark, nat. unreg., … | cat | ~4 | ~3 | no | very high
CTM origin | e-filing, fax, mail, ir | cat | 4 | 3 | no | very high
CTM/ER age | | metric | | 1 | no | very high
CTM/ER classes | [01; 02; … 45] | cat | 45 | 44 | yes | very high
CTM/ER pairwise similarities | highly dissimilar -> highly similar | ord | 5 | 4 | yes (sign, g&s) | low (wide gaps)
OPP page count | | metric | | 1 | no | medium (unreliable)
OPP grounds | | cat | 6 | 5 | yes | very high
OPP 1st language | EN, DE, ES, FR, IT | cat | 5 | 4 | no | very high
OPP 2nd language | | cat | 22 | 21 | no | very high
OPP procedural language | EN, DE, ES, FR, IT | cat | 5 | 4 | no | very high
OPP procedural language flag | 1, 2 | cat | 2 | 1 | no | very high
CTM 1st language | EN, DE, ES, FR, IT | cat | 5 | 4 | no | very high
CTM 2nd language | | cat | 22 | 21 | no | very high
CTM procedural language | EN, DE, ES, FR, IT | cat | 5 | 4 | no | very high
CTM procedural language flag | 1, 2 | cat | 2 | 1 | no | very high
CTM/ER owner(s)/rep(s) | see below | | | | |

owner/rep characteristic | category samples | scale | values | df | multiples (per party) | suitability (ease and quality)
--- | --- | --- | --- | --- | --- | ---
nationality | DE, US, ES… | cat | ~300 | 299 | yes | very high
OHIM client count | | metric | | 1 | yes | high
OHIM product count | | metric | | 1 | yes | high
OHIM success count | | metric | | 1 | yes | high
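For the categorical predictors above, the df column equals the number of values minus one, which is consistent with reference-level dummy coding. The sketch below illustrates that coding for the five first languages; the function name `dummy_code` and the choice of the first level as reference category are assumptions made for illustration, not a description of the nodes used in the original workflows.

```python
def dummy_code(value, levels):
    """Encode one categorical value as len(levels) - 1 indicator columns.

    The first level acts as the reference category and maps to all zeros.
    """
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return [1 if value == level else 0 for level in levels[1:]]


# The five first languages listed in the table.
LANGUAGES = ["EN", "DE", "ES", "FR", "IT"]

encoded_es = dummy_code("ES", LANGUAGES)   # one indicator set
encoded_en = dummy_code("EN", LANGUAGES)   # reference level, all zeros
degrees_of_freedom = len(encoded_es)       # values - 1
```

With 45 classes or roughly 300 nationalities the same coding yields 44 and 299 columns, which is why high-cardinality predictors quickly dominate the model size.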


Predictor interactions and complications


• Language and Nice class preferences (and choices) strongly depend on country
• Some classes co-occur more frequently than others
• Actor market focus, relative strength & strategy will impact filing actions
• Actor size depends on market, as indicated by country & class use
• Actor country and behaviour interact to some degree (e.g. EU vs. national system use, defence behaviour, strongest/most conflictive industries, filing habits, use/non-use of legal counsel, etc.)
• Constraints and structural/systematic interactions show in the data (e.g. legal counsel actors always from within EU)


Internal KNIME uses & experiments


• Planning use:
  − Time-series extrapolations for the Strategic Plan (SP) period (2011-2015)
  − Balanced Scorecard indicators
  − Quality control (QC) sample design and analysis

• Text mining for
  − Opposition decision classification
  − Experimental RSS feed monitoring

• Client classification experiments
  − Bayesian classification: SME / large-scale enterprise (survey reverse engineering)
  − Filing habits: fully electronic, mixed or fully paper

• User satisfaction survey
  − Data extraction and linguistic preference determination
  − Satisfaction modelling and analysis (forthcoming)

• Testing predictions of opposition outcomes (next slide)
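As an illustration of the time-series extrapolation use listed above, the following sketch fits an ordinary least-squares linear trend and projects it over a planning period. The linear trend is a stand-in technique chosen for brevity, not necessarily the method used in the KNIME workflows, and the yearly figures are made up.

```python
def fit_linear_trend(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def extrapolate(intercept, slope, future_xs):
    """Project the fitted trend onto future periods."""
    return [intercept + slope * x for x in future_xs]


# Hypothetical yearly volumes (arbitrary units), for illustration only.
years = [2006, 2007, 2008, 2009, 2010]
volumes = [10.0, 12.0, 14.0, 16.0, 18.0]

intercept, slope = fit_linear_trend(years, volumes)
forecast = extrapolate(intercept, slope, range(2011, 2016))
```

A straight line ignores seasonality and structural breaks, so in practice it serves as a baseline against which richer models are compared.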


Evidence for modern, model-free methods outperforming classic ones on IP data:

– Outcome prediction test of OHIM opposition outcomes (win full, win partially, lose)

– Predictors: Opposition grounds, PC1 of 6-dim governance index of both parties’ and their agents’ countries

– Some preliminary results (in-sample figures):

• 33.3 % chance of naïve guessing (3 outcomes, equal size sampling)

• 38.3 % using logistic regression (42.7 % with country dummies)

• 60.5 % using a random tree
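The figures above can be put on a common footing by measuring each model against the naïve baseline. The sketch below does only that arithmetic; the variable names are illustrative, and the accuracies are the in-sample percentages quoted on the slide.

```python
N_OUTCOMES = 3  # win full, win partially, lose (equal-size sampling)

# With equally sized outcome groups, guessing is right 1 time in N_OUTCOMES.
baseline = 100.0 / N_OUTCOMES

# In-sample accuracies (percent) as reported above.
in_sample_accuracy = {
    "logistic regression": 38.3,
    "logistic regression + country dummies": 42.7,
    "random tree": 60.5,
}

# Percentage-point gain of each model over naive guessing.
gain_over_baseline = {
    model: accuracy - baseline for model, accuracy in in_sample_accuracy.items()
}

for model, gain in gain_over_baseline.items():
    print(f"{model}: +{gain:.1f} percentage points over guessing")
```

Since these are in-sample figures, part of the tree's advantage may reflect its ability to fit the training data closely; an out-of-sample comparison would be needed to confirm the ranking.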


Cooperation Fund and other joint projects


Background: OHIM Cooperation Fund

The CF is a major undertaking by any standards:

- Total budget of €50 million
- 200 individuals working on the Fund
  - This figure will at times touch almost 500
  - This is composed of a multitude of different experts, not all of whom work full time
- Approximately 450 man years now planned across the lifetime of the Fund
- 257 national project implementations over the next four years
- It’s bigger than many National Offices


According to current estimates, 100,000 man days or 450 full-time equivalents have been foreseen for the whole lifespan of the Cooperation Fund.

[Chart by quarter, Q1 2011 to Q4 2015; vertical axis from 0 to 9,000. Series: PACKET 1 - External Provider, PACKET 1 - OHIM, PACKET 1 - WG, PACKET 1 - Integration, PACKET 2 - External Provider, PACKET 2 - OHIM, PACKET 2 - WG, PACKET 2 - Integration, PSO - PSO]


CF project 126: Harmonized Forecasting


• Project scope
  − Try to forecast / predict whatever may be of interest to IP Offices (as long as somehow related to trade marks and designs)
  − Comparison of forecasting practice vs. state-of-the-art
  − Cover Breiman’s “two cultures of modelling”

• Participants (commitments)
  − EU offices: Denmark, Spain, Poland, Portugal, UK, Hungary
  − International offices: EPO (Munich), WIPO (Geneva)

• Basic project metrics
  − Kick-off next week
  − 24 months duration
  − Budget: €759k
  − Some KNIME contributions (if all goes according to plan)


CF126 Forecasting: IT architecture vision


[Architecture diagram]

Components:
• KNIME Server (computational back-end)
• Cooperation Fund: project web platform
• Internal & external data sources
• Optional result storage
• Documented & deployed model access for end users
• Analyst access

Flows:
• Trigger deployed model(s)
• Return model results


Other cooperation activities


• Economics and Statistics Service research projects

– IP Value, Growth and Innovation study (with European Patent Office, Munich)

– Scenarios for the Future project

– IP Economics research network (as at 01/2012 with USPTO, UK-IPO, EPO and WIPO, as well as several universities)

• OHIM Strategic Plan implementation projects

– Balanced Scorecard indicator deployment

– Project metric monitoring

– Sentiment metrics development (users, staff, media)

• EU Observatory on Counterfeiting and Piracy

– IP infringement magnitude assessment

– Seizures data analysis

• OHIM benchmarking:

– “TM5” indicator exchanges with US, JP, CN and KR

– Indicator and data exchanges with national offices, national, international and non-government organisations


IP Value, Growth and Innovation study


Pre-analysis step: company data matching (fuzzy / distance-based)

Data sets to be matched:
• “Amadeus” company microdata
• EPO patent applicants
• OHIM TM&D applicants
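A minimal sketch of what fuzzy, distance-based name matching can look like, using only Python's standard `difflib`. It is not the matching procedure used in the study: the company names, the list of legal-form suffixes, the 0.85 threshold and the helper names are all hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical list of legal-form tokens to ignore when comparing names.
LEGAL_SUFFIXES = {"gmbh", "ag", "ltd", "inc", "sa", "sl", "plc"}


def normalise(name):
    """Lower-case, strip punctuation and drop legal-form tokens."""
    tokens = name.lower().replace(".", "").replace(",", " ").split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def similarity(a, b):
    """Similarity in [0, 1] between two normalised company names."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def best_match(name, candidates, threshold=0.85):
    """Return the most similar candidate, or None if none reaches the threshold."""
    if not candidates:
        return None
    score, candidate = max((similarity(name, c), c) for c in candidates)
    return candidate if score >= threshold else None


# Made-up names standing in for records from two different registers.
register_a = "Example Widgets GmbH"
register_b = ["Exampel Widgets G.m.b.H.", "Other Company Ltd"]
match = best_match(register_a, register_b)
```

Comparing every name against every candidate scales quadratically, so on full registers some form of blocking (for example by country) would usually precede the pairwise comparison.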


Conclusions


• KNIME has a promising future for broad IP Office use
• An OHIM-run and paid-for server may serve as a “trade mark economics hub”, tying in external experts
• Adding some missing functionality would be nice ☺ - e.g.
  − Better workflow metadata management (think Zotero tags)
  − Newer version of Weka (with package manager support)
  − Metric response models:
    o Simplify access to metric responses predicted by regression nodes (linear, polynomial, Weka SimpleCART)
    o Add new regression functions capable of handling “Big Data” (i.e. not “standard R” ☺): restricted cubic splines, MARS, …


Questions? In person @ UGM today or via

(+34) 965 13 9100 (switchboard)
(+34) 965 13 8711 (personal extension)
[email protected]

Marc Richter
Office for Harmonization in the Internal Market
(Trade Marks and Designs)
Avenida de Europa, 4
E-03008 Alicante
SPAIN