NLP Tales in Biomedicine (introductory presentation for the Auckland NLP MeetUp group by Anna...

DESCRIPTION

Slides from the talk "NLP Tales in Biomedicine", given to the Auckland NLP MeetUp group in June 2014: http://www.meetup.com/Natural-Language-Processing-in-NZ/events/184030662/

Mining text to answer biomedical questions is a fascinating applied research area. The biomedical domain is one of the first 'big data' domains. It attracts people from the domain itself who are passionate about answering pressing scientific questions, as well as computer scientists and linguists who see a domain with great standards, resources and numerous applications. During this talk I will give you a brief overview of different NLP problems in the biomedical domain and make comparisons to mainstream NLP applications (e.g., search) and other, more commercial domains (e.g., voice of customer). My aim is to introduce you to a domain with state-of-the-art solutions, free high-quality resources and well-developed methodologies. If I inspire anyone to work on challenging biomedical problems, that will be a bonus!

NLP Tales in Biomedicine

Anna Divoli (@annadivoli)

nelshami.deviantart.com

Auckland NLP MeetUp June 2014

Biology recap

Traits Discussions

DNA > Genes > Proteins > Phenotype: Function > Emotions

Disease Medical Notes (scientific literature)

Information of interest

Genes / Proteins: specific information for database annotation

Gene names: tinman, lilliputian, dreadlocks, lush, cheap date, methuselah, Van Gogh, maggie, brainiac, grim, reaper, cleopatra, swiss cheese, ken and barbie, kenny, out cold, lava lamp, hamlet, sonic hedgehog, werewolf, half pint, fucK, drop dead, chardonnay, agnostic, I’m not dead yet…

imp.princeton.edu/static//css/images/network.png

Information of interest

Genes / Proteins: relationships & network building

www.frontiersin.org/files/Articles/77923

Proteins: their sub-cellular location, their structure, the conditions of their expression, their interactions, disease associations…

Disease – Drug: interactions, adverse effects, secondary indications…

Other entities: organs/tissues, metabolites/chemicals, phenotypes…

Detecting: methodologies & findings in experimental papers, paradigm shifts…

Systems for specific: diseases, pathways, drug targets, organisms…

Examples of information of interest

Don Swanson’s ABC model:

dietary fish oil
causes
reduction of: blood viscosity, platelet aggregability, vascular reactivity
ameliorates
Raynaud’s disease

- Swanson, D. R. (1986). Fish oil, Raynaud's syndrome and undiscovered public knowledge. Perspectives in Biology and Medicine, 30(1): 7-18.
- Swanson, D. R. (1987). Two medical literatures that are logically but not bibliographically connected. Journal of the American Society for Information Science, 38: 228-233.

Literature-based Discovery: Text mining!
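The fish oil example can be sketched in a few lines of code. This is a minimal illustration of the ABC idea only, assuming a toy "literature" in which each paper is reduced to the set of concepts it mentions. The records, the function names (`cooccurrence_index`, `abc_candidates`) and the plain co-occurrence criterion are all assumptions made for illustration; they are not Swanson's actual procedure or data.

```python
from collections import defaultdict

# Toy "literature": each record is the set of concepts one paper mentions.
# These records are illustrative placeholders, not real citations.
papers = [
    {"dietary fish oil", "blood viscosity"},
    {"dietary fish oil", "platelet aggregability"},
    {"dietary fish oil", "vascular reactivity"},
    {"blood viscosity", "Raynaud's disease"},
    {"platelet aggregability", "Raynaud's disease"},
    {"vascular reactivity", "Raynaud's disease"},
]


def cooccurrence_index(records):
    """Map each concept to the set of concepts it shares a paper with."""
    index = defaultdict(set)
    for concepts in records:
        for concept in concepts:
            index[concept] |= concepts - {concept}
    return index


def abc_candidates(records, a_term):
    """Return {C: set of linking B terms} for concepts C that never
    co-occur with a_term directly but share at least one B term with it."""
    index = cooccurrence_index(records)
    b_terms = set(index[a_term])
    candidates = defaultdict(set)
    for b in b_terms:
        for c in index[b]:
            # Skip the starting term and anything already directly linked.
            if c != a_term and c not in b_terms:
                candidates[c].add(b)
    return dict(candidates)


hypotheses = abc_candidates(papers, "dietary fish oil")
for c_term, linking_terms in hypotheses.items():
    print(c_term, "via", sorted(linking_terms))
```

On real literature the intermediate terms number in the thousands, so a practical system would need to rank or filter candidates; that step is omitted here.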

Ontologies

OBO Foundry: GO, ChEBI…

Medical: UMLS, SNOMED CT, ICD…

MeSH

Search

Search query: SAF LTR

Looking for: interactions between SAF and viral LTR elements (SAF is a transcription factor; LTR stands for ‘long terminal repeat’)

but also:
SAF: Single And Free
LTR: Long Term Relationship

It is better to use domain-specific resources on occasions like this.
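One way a domain-specific resource could help with the "SAF LTR" query is to expand known abbreviations before searching. The sketch below is only an illustration under stated assumptions: `BIOMEDICAL_LEXICON` is a hypothetical one-entry lexicon standing in for a curated resource, `expand_query` is a made-up helper, and the AND/OR syntax is generic rather than that of any particular search engine.

```python
# Hypothetical mini-lexicon (an assumption for illustration); a real system
# would draw abbreviation expansions from a curated biomedical resource.
BIOMEDICAL_LEXICON = {
    "LTR": ["long terminal repeat"],
}


def expand_query(query, lexicon):
    """Rewrite a whitespace-separated query so that each abbreviation found
    in the lexicon is OR-ed with its domain-specific long forms."""
    parts = []
    for token in query.split():
        expansions = lexicon.get(token.upper(), [])
        if expansions:
            alternatives = [token] + ['"' + e + '"' for e in expansions]
            parts.append("(" + " OR ".join(alternatives) + ")")
        else:
            # Unknown tokens are passed through unchanged.
            parts.append(token)
    return " AND ".join(parts)


expanded = expand_query("SAF LTR", BIOMEDICAL_LEXICON)
print(expanded)
```

Adding the long form pulls the query toward documents that use the biomedical sense of the abbreviation, which a general-purpose search has no reason to prefer.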

Social Biomedicine

News, Patient Forums, Blogs, Twitter…

344 online forum posts on Depression
Source: www.patient.co.uk
Date: July 2013

Citation Analysis

From: clinical meta-analysis… to: detect information for knowledge augmentation and summarization…

Growing field: publications over the past 20 years

“text mining” ontology

Summary

Entities – Relationships/Interactions

Resources: Databases, Ontologies, Corpora…

Networks: Systems Biology, Translational Medicine, Literature-based Discovery

End Users – Search

Social Biomedicine

Citation analysis

… and this is just a 30 min introduction…

Resources

Databases (from genes to literature):
http://www.ncbi.nlm.nih.gov/
http://www.ebi.ac.uk/services

Ontologies & Linked Data:
http://www.obofoundry.org/
http://www.nlm.nih.gov/research/umls/
http://linkedlifedata.com/

Corpora:
http://compbio.ucdenver.edu/ccp/corpora/obtaining.shtml
http://www.nactem.ac.uk/resources.php
