Michele Filannino COMP80122: final presentation Manchester, 29/02/2012 temporal expressions identification in biomedical texts

Temporal expressions identification in biomedical texts


Page 1: Temporal expressions identification in biomedical texts

Michele Filannino

COMP80122: final presentation

Manchester, 29/02/2012

temporal expressions identification in biomedical texts

Page 2: Temporal expressions identification in biomedical texts

/ 23, 29/02/2012, Michele Filannino

presentation temporal expressions

where we are

■ Computer science

● natural language processing

▶ information extraction

★ temporal expressions extraction


Page 3: Temporal expressions identification in biomedical texts


temporal expression definition

■ natural language phrase that denotes a temporal entity: an interval or an instant (Ferro et al.)¹

● She has been at work for more than a month

● He wrapped up a three-hour meeting with the Iraqi president in Baghdad today.

¹ L. Ferro, I. Mani, B. Sundheim, and G. Wilson, “TIDES temporal annotation guidelines, v. 1.0.2,” MITRE, 2001


Page 4: Temporal expressions identification in biomedical texts


why?

■ user’s perspective

● temporal aspects of events and entities provide a natural mechanism for organising information.

■ machine’s perspective

● improvements in

▶ question answering, summarisation, browsing


Page 5: Temporal expressions identification in biomedical texts


why clinical domain?

■ diagnosis explanation

■ disease progression modelling

■ analysis of effectiveness of treatment


Page 6: Temporal expressions identification in biomedical texts


scientific interest

[Figure: bar chart of Google Scholar results per year, 2000 to 2012, for the query “temporal expressions” AND “clinical”; vertical axis from 0 to 70.]

Source: Google Scholar (last update 27/02/2012)

Page 7: Temporal expressions identification in biomedical texts


temporal forms¹

¹ J. Poveda, M. Surdeanu, and J. Turmo, “An analysis of Bootstrapping for the Recognition of Temporal Expressions”, 2009

■ time or date references

● 11pm, February 14th, 2005

■ time references that anchor on another time

● one hour after midnight, two weeks before Christmas

■ durations

● few months, two days, five years

■ recurring times

● every third month, twice in the hour


Page 8: Temporal expressions identification in biomedical texts


temporal forms¹

¹ J. Poveda, M. Surdeanu, and J. Turmo, “An analysis of Bootstrapping for the Recognition of Temporal Expressions”, 2009

■ context-dependent times

● today, last year

■ vague references

● somewhere in the middle of June, the near future

■ times indicated by an event

● the day S. Berlusconi resigned

▶ an event is considered a cover term for situations that happen or occur
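The surface forms listed above can be illustrated with a small pattern-based recogniser. This is only a sketch under stated assumptions: the vocabularies (`MONTH`, `NUMBER`, `UNIT`), the pattern labels, and the function `find_timexes` are hypothetical names introduced here, and the patterns cover just a few of the slide's examples. Vague and event-anchored expressions are not handled, which hints at why hand-written patterns alone are not enough.

```python
import re

# Hypothetical vocabularies, kept deliberately tiny for illustration.
MONTH = (r"(?:January|February|March|April|May|June|July|August|"
         r"September|October|November|December)")
NUMBER = r"(?:\d+|one|two|three|four|five|few)"
UNIT = r"(?:hour|day|week|month|quarter|year)s?"

# Ordered from most to least specific; earlier patterns win on overlap.
PATTERNS = [
    ("recurring", re.compile(rf"\bevery (?:\w+ )?{UNIT}\b", re.I)),
    ("date", re.compile(rf"\b{MONTH} \d{{1,2}}(?:st|nd|rd|th)?(?:, \d{{4}})?", re.I)),
    ("time", re.compile(r"\b\d{1,2} ?(?:am|pm)\b", re.I)),
    ("duration", re.compile(rf"\b{NUMBER} {UNIT}\b", re.I)),
    ("context_dependent",
     re.compile(r"\b(?:today|yesterday|tomorrow|last year)\b", re.I)),
]


def find_timexes(text):
    """Return (start, end, label, matched_text) tuples in textual order."""
    found = []
    for label, pattern in PATTERNS:
        for match in pattern.finditer(text):
            # Skip spans that overlap one already accepted by an earlier pattern.
            if any(match.start() < end and start < match.end()
                   for start, end, _, _ in found):
                continue
            found.append((match.start(), match.end(), label, match.group()))
    return sorted(found)


if __name__ == "__main__":
    for span in find_timexes("11pm, February 14th, 2005"):
        print(span)
```

Note that "three-hour" in the earlier example sentence is not matched, because the duration pattern expects a space between the number and the unit.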


Page 9: Temporal expressions identification in biomedical texts


methodology

■ annotation

● recognition

▶ automatically detect and delimit expressions

▶ mostly machine-learning techniques

● normalisation

▶ assign attribute values to all the recognised expressions

▶ using a shared and formal format

▶ mostly rule-based techniques

■ reasoning or searching
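As an illustration of the rule-based side of normalisation, the sketch below maps simple duration expressions to ISO 8601 duration strings, the kind of shared, formal format the normalisation step targets. The function `normalise_duration` and both lookup tables are hypothetical and introduced only for this example; it is not the normaliser described in this presentation, and anything outside its tiny vocabulary is returned as `None` rather than guessed.

```python
import re

# Hypothetical lookup tables, kept small on purpose.
WORD_TO_NUMBER = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
                  "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10}

# unit -> (ISO 8601 designator, True if it goes after the "T" time marker)
UNITS = {
    "hour": ("H", True),
    "day": ("D", False),
    "week": ("W", False),
    "month": ("M", False),
    "year": ("Y", False),
}


def normalise_duration(expression):
    """Map e.g. 'two days' to 'P2D'; return None when no rule applies."""
    match = re.fullmatch(r"(\d+|[a-z]+) ([a-z]+?)s?", expression.strip().lower())
    if match is None:
        return None
    quantity, unit = match.groups()
    number = int(quantity) if quantity.isdigit() else WORD_TO_NUMBER.get(quantity)
    if number is None or unit not in UNITS:
        return None  # vague quantities such as "few" are left unnormalised
    designator, is_time = UNITS[unit]
    return f"P{'T' if is_time else ''}{number}{designator}"


if __name__ == "__main__":
    for phrase in ("two days", "five years", "3 hours", "few months"):
        print(phrase, "->", normalise_duration(phrase))
```

Returning `None` for unsupported input keeps the rule set honest: a wrong normalised value is harder to spot later than a missing one.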


Page 10: Temporal expressions identification in biomedical texts


Source: TRIOS TimeBank v.0.1

example: raw text

That means Unisys must pay about $100 million in interest every quarter, on top of $27 million in dividends on preferred stock.


Page 11: Temporal expressions identification in biomedical texts


Source: TRIOS TimeBank v.0.1

example: recognition

That means Unisys must <ev>pay</ev> about $100 million in interest <te>every quarter</te>, on top of $27 million in dividends on preferred stock.


Page 12: Temporal expressions identification in biomedical texts


Source: TRIOS TimeBank v.0.1

example: normalisation

That means Unisys must <EVENT eid="e110" ...>pay</EVENT> about $100 million in interest <TIMEX3 tid="t256" type="SET" value="P1Q" temporalFunction="false" functionInDocument="NONE" quant="every">every quarter</TIMEX3>, on top of $27 million in dividends on preferred stock.

<TLINK lid="l32" relType="BEFORE" relatedToEvent="e110" eventID="e107"/>

<TLINK lid="l26" relType="OVERLAP" eventID="e110" relatedToTime="t256"/>
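Annotations in this form can be read back with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree` on a copy of the sentence above. Two things are assumptions made for the example: the fragment is wrapped in a hypothetical `<s>` root element so that it parses as one document, and the `EVENT` attributes elided with "..." on the slide are simply left out. The helper names are also hypothetical.

```python
import xml.etree.ElementTree as ET

# The slide's sentence, wrapped in an assumed <s> root element.
# Attributes elided with "..." on the slide are omitted, not reconstructed.
FRAGMENT = (
    '<s>That means Unisys must <EVENT eid="e110">pay</EVENT> about $100 million '
    'in interest <TIMEX3 tid="t256" type="SET" value="P1Q" '
    'temporalFunction="false" functionInDocument="NONE" quant="every">'
    'every quarter</TIMEX3>, on top of $27 million in dividends on preferred '
    'stock.<TLINK lid="l26" relType="OVERLAP" eventID="e110" '
    'relatedToTime="t256"/></s>'
)


def extract_timexes(xml_text):
    """Collect id, type, normalised value and surface text of each TIMEX3."""
    root = ET.fromstring(xml_text)
    return [
        {"tid": t.get("tid"), "type": t.get("type"),
         "value": t.get("value"), "text": t.text}
        for t in root.iter("TIMEX3")
    ]


def links_for_time(xml_text, tid):
    """Return (event id, relation type) pairs linked to the given timex id."""
    root = ET.fromstring(xml_text)
    return [(link.get("eventID"), link.get("relType"))
            for link in root.iter("TLINK")
            if link.get("relatedToTime") == tid]


if __name__ == "__main__":
    print(extract_timexes(FRAGMENT))
    print(links_for_time(FRAGMENT, "t256"))
```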


Page 13: Temporal expressions identification in biomedical texts


lack of corpora


Page 14: Temporal expressions identification in biomedical texts


my contributions

■ built the first timex corpus from all the freely available timexes

● {timex, type, normalised_value, utterance_reference}

● 2822 different timexes

■ built a normaliser

● as an extension of TRIOS (University of Rochester)

● accuracy improved from 62.57% to 71.66%
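One way to picture the corpus records and the accuracy measure is sketched below. The `TimexRecord` class mirrors the four fields listed above, while the `accuracy` function assumes the plain definition "share of timexes whose predicted value equals the annotated value"; the slides do not spell out the exact metric. The records and the lookup-table normaliser in the usage part are made-up toy data, not entries from the real corpus.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class TimexRecord:
    """One corpus entry: {timex, type, normalised_value, utterance_reference}."""
    timex: str
    type: str
    normalised_value: str
    utterance_reference: Optional[str] = None


def accuracy(records: Iterable[TimexRecord],
             normalise: Callable[[str], Optional[str]]) -> float:
    """Fraction of records whose predicted value matches the annotation."""
    records = list(records)
    if not records:
        return 0.0
    correct = sum(1 for r in records if normalise(r.timex) == r.normalised_value)
    return correct / len(records)


# Toy data, invented for this example only.
TOY_CORPUS = [
    TimexRecord("two days", "DURATION", "P2D"),
    TimexRecord("five years", "DURATION", "P5Y"),
    TimexRecord("today", "DATE", "2012-02-29", "29/02/2012"),
    TimexRecord("last year", "DATE", "2011", "29/02/2012"),
]

# A stand-in normaliser: a lookup table that misses one expression.
TOY_RULES = {"two days": "P2D", "five years": "P5Y", "today": "2012-02-29"}

if __name__ == "__main__":
    print(accuracy(TOY_CORPUS, TOY_RULES.get))
```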


Page 15: Temporal expressions identification in biomedical texts


human mistakes

utterance    expression              type      annotation
-            three years before      DATE      FUTURE_REF
26/09/2011   this morning            DATE      1998-02-06TMO
-            two decades             DURATION  P20Y
-            the summer of 1862      DATE      FUTURE_REF
-            centuries               DURATION  PXE
-            the last half of ‘80s   DATE      198

Page 16: Temporal expressions identification in biomedical texts


my to-do list

[Progress bar, 0 to 30 days: 22 days elapsed, 8 days remaining]

✓ study the literature

✓ build a corpus of timexes

✓ build a normaliser

■ release my timexes corpus freely

■ literature review

Page 17: Temporal expressions identification in biomedical texts

Thank you.