Discover more, discover faster. - Linguamatics

Discover more, discover faster.

High performance, flexible NLP-based text mining for life sciences

Life sciences organizations face the challenge of filtering ever-increasing volumes of textual data to gain actionable insights for key decision-making.

The volume, variety and velocity of data are increasing exponentially. The big question is: how do we make the best use of these data?

Key benefits

Save time during R&D with better decision support

Reduce resources and experimental costs

Generate new opportunities - answer questions that couldn't otherwise be answered

Create visual summaries from unstructured text - for rapid understanding and evaluation

Gain competitive advantage - reveal weak signals, sentiment and novel relationships

“It’s not information overload, it’s filter failure.”

Clay Shirky

Linguamatics I2E has been used for many applications in life sciences across the drug discovery-development lifecycle, including biomarker discovery, drug repurposing, clinical trial analytics, analysis of chemical safety/toxicity signatures, and market intelligence through the analysis of social media.

Linguamatics’ agile NLP text mining software, I2E, provides rapid knowledge discovery from unstructured and semi-structured text.

Using I2E, knowledge can be extracted from a wide range of content sources such as scientific literature, patents, clinical trials data, electronic health records (EHRs), news feeds and proprietary content. This knowledge can then be used to answer high-value questions in real time.

Linguamatics’ text mining technology is more effective for knowledge discovery than traditional search and is now well established and proven across the life science industry, including pharmaceuticals, biotechnology, healthcare, consumer products, agrochemicals, government and more.

Solutions & applications in life sciences

Advanced text analytics delivers value along the pipeline

“Anyone considering text mining will find a specialist with unrivalled domain knowledge in Linguamatics”

Jason Stamper, 451 Research

[Figure: I2E applications along the drug discovery-development pipeline.

Business goals: race to patent, race to market, maximise market value.

Pipeline stages: pre-discovery, drug discovery, pre-clinical, Phase I clinical trials, Phase II clinical trials, Phase III clinical trials, regulatory review, scale up to manufacture, post-marketing surveillance.

Applications shown: opportunity scouting, patent analysis, drug repurposing, KOL identification, social media analysis, biomarker discovery, comparative effectiveness, competitive intelligence, SAR, mutation/expression analysis, toxicity analysis and prediction, safety, pharmacovigilance, target ID/selection, trial site selection and study design, HEOR, gene-disease mapping, regulatory submission QC.]

Making the most of your information assets

These case studies show some of the ways I2E has been used to capture valuable information from life sciences literature, saving time and increasing productivity across the drug discovery pipeline.

Targets: identification, validation and selection

Selecting the best targets in the drug discovery process is crucial for optimizing return on R&D spend across a portfolio of research projects. Researchers use text mining to establish target ranking based on efficacy and safety. Methods include providing links to biological pathways and processes, and supporting gene expression analysis in specific tissues and species.

A variety of pathway databases exist but their scope may be limited to a small number of premier journals. In addition, there is often a lack of contextual information to focus the specific pathway analysis. Text mining approaches complement pathway database searches by providing both target context and access to up-to-date results from a much more comprehensive range of documents.

At a top-10 global pharmaceutical company, Linguamatics I2E forms part of a standard reusable framework for novel target selection in use for a variety of R&D projects. Our advanced NLP capabilities and intuitive reporting make it simple for scientists to see assertions and drill down to supporting evidence in source documents.

I2E customers have reported tenfold time savings in this type of literature analysis. For 100 scientists, this is equivalent to savings of 10 FTE years or approximately $1m/year.
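The arithmetic behind a figure of this kind can be sketched as follows. This is an illustrative back-of-envelope model, not Linguamatics' own calculation: the function name `fte_years_saved`, the share of working time spent on literature analysis (one ninth, chosen only so that the result reproduces the quoted 10 FTE years) and the cost of $100,000 per FTE year (implied by $1m divided by 10 FTE years) are all assumptions.

```python
def fte_years_saved(scientists: int, share_of_time: float, speedup: float) -> float:
    """FTE years freed per year when a task that takes `share_of_time`
    of each scientist's year becomes `speedup` times faster."""
    effort_before = scientists * share_of_time  # FTE years spent on the task today
    effort_after = effort_before / speedup      # FTE years after the speedup
    return effort_before - effort_after


# Assumption: literature analysis takes about 11% (1/9) of each scientist's time.
LITERATURE_SHARE = 1 / 9
# Assumption: one FTE year costs $100,000 (implied by $1m / 10 FTE years).
COST_PER_FTE_YEAR = 100_000

saved = fte_years_saved(scientists=100, share_of_time=LITERATURE_SHARE, speedup=10)
cost_saved = saved * COST_PER_FTE_YEAR
print(f"{saved:.1f} FTE years, ${cost_saved:,.0f} per year")
# prints "10.0 FTE years, $1,000,000 per year"
```

With a different time share or cost per FTE year, the same function gives the corresponding estimate for another organization.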

Providing a competitive edge with patent analytics

Patent literature often provides the first mention of much critical data for novel chemistry and biology - for example, compound structure, protein target, intended disease area. Access to these valuable data can provide a competitive edge for pharmaceutical researchers.

However, patent literature is notoriously hard to search: patents can be hundreds or thousands of pages long and contain complex information, often written with obfuscation rather than communication in mind. Matching up mentions of chemical structures or gene targets in one part of a document with properties and other information mentioned somewhere else, especially in tables, is very challenging.

A traditional keyword search/document retrieval approach is time-consuming and tedious as many hundreds or thousands of patents may be retrieved, particularly for in-depth patent analytics such as opposition searches or patent landscaping.

Because a natural language processing approach involves understanding the meaning of the text, text mining using I2E enables a rapid and deep analysis of patent documents, potentially saving millions of dollars. I2E can search for numerical information, chemicals by name or structure, classes of drug targets or therapeutic area, focus the search on specific regions of the patent documents, or follow claim chains across a patent.

A top-10 pharma company used the chemistry search capability built into I2E, along with sophisticated query strategies and algorithms to pre-process tables, in order to extract detailed numeric and biological information from patent documents. The company was able to perform more rapid freedom-to-operate searches in comparison to standard patent search.

Clinical trial analytics

Clinical trials are used to gather safety and efficacy data on new drugs in development or existing drugs tested for new indications. Although some information in published clinical trial reports is well structured and searchable using keywords, much of the key information lies in unstructured text.

I2E is essential to extract and synthesize the high value information that is found only in these unstructured regions. This can then be used to:

Select trial sites more effectively, and find precedents for study design protocols

Gain actionable information about competitors’ worldwide clinical development activities, for example monitoring progress of competitors’ trials, or finding other companies running clinical trials in the same therapeutic area

Uncover in-licensing opportunities, by finding sponsors running early-stage clinical trials in particular therapeutic areas

According to our customers, using I2E can reduce the time for site selection by over 80% and the time spent on patient recruitment by at least 25%.

Biomarkers: identifying the crucial link between pre-clinical and clinical information

Mechanistic studies of compounds require specific traits to be measured; similarly, clinical studies of patients need to have a biomarker to quantify effects of disease progression and treatments. These biomarkers can take different forms, e.g. enzymes with varying activity, changes in expression levels of particular genes, or the presence or absence of individual metabolites. The flexibility of I2E allows the user to search for any of these data types and to find relationships between items that link therapeutics to phenotypic effects.

At a top-10 pharmaceutical company, I2E is used to create a database of candidate biomarkers by mining MEDLINE and full-text articles that can then be queried by scientists. I2E can also scan the literature for specific disease biomarkers on a day-to-day basis, to maintain the currency of the in-house database.

[Figure: Investigating indirect gene-gene relationships between a drug compound, Raptiva, and a disease, psoriasis, using interaction network visualization.]

“Scientific literature must be in a computationally accessible format to be used for systems biology studies, and custom curation is frequently needed. However, text analytics speeds creation of custom annotation by as much as an order of magnitude, lowering the barrier to accessing the wealth of information available in scientific literature.”

Library & Information Services, Top-20 Biotech

Clinical and post-market safety

Organizations increasingly require auditable methods to check whether signals indicating adverse or toxicity-related events appear in clinical records. If events do occur, companies need to be able to react fast to find out if they are caused by the drug, are side effects of the original disease or are the result of external factors.

Text mining can be used to review clinical reports to search for signals of adverse events. For example, Linguamatics I2E has been used to highlight different adverse event profiles at different dosages. Researchers can also search medical records for particular adverse effects, code the effects found and assess for drug associations. The linguistic capabilities of I2E are critical in distinguishing between new effects, a history of an effect, the lack of an effect, or the lack of a history of an effect.

“Once you see what I2E can do, you won't want to go back to wading through irrelevant documents.”

Associate Director, Safety Assessment, Top-10 Global Pharmaceutical Company

© 2015 Linguamatics Ltd. The Linguamatics logo is a trademark of Linguamatics Ltd. All rights reserved. All other trademarks mentioned in this document are the property of their respective owners.

Linguamatics 324 Cambridge Science Park

Milton Road Cambridge CB4 0WG UK

Tel: +44 (0)1223 651910

Linguamatics 1900 West Park Drive Suite 280

Westborough MA 01581 USA

Tel: +1 617 674 3256

www.linguamatics.com

About Linguamatics

Linguamatics is the world leader in deploying innovative natural language processing (NLP)-based text mining for high-value knowledge discovery and decision support. Linguamatics I2E is used by top commercial, academic and government organizations, including 17 of the top 20 global pharmaceutical companies, the US Food and Drug Administration (FDA) and leading US healthcare providers. I2E can be used to mine a wide variety of text resources, such as scientific literature, patents, Electronic Health Records (EHRs), clinical trials data, news feeds, social media and proprietary content.

Linguamatics is committed to excellence in healthcare informatics and is a corporate member of AMIA and HIMSS. The company operates globally, with headquarters in Cambridge, UK, and a U.S. office in Westborough, MA.

Linguamatics is a winner of the Queen’s Award for Enterprise 2014 for International Trade.

For further information, visit www.linguamatics.com

About I2E

I2E is an agile, scalable, high-performance text mining system that facilitates discovery and knowledge synthesis from unstructured text in large document collections.

I2E has a proven track record in delivering best-of-breed text mining capabilities across a broad range of application areas. Its agile nature allows query strategies to be tuned to deliver the precision and recall needed for specific tasks, at enterprise scale.

There are two ways to connect to I2E’s unique capabilities: either by deploying I2E Enterprise in-house, or via I2E OnDemand, our Software-as-a-Service (SaaS) version of I2E.

For more information, visit www.linguamatics.com or www.whatistextmining.com

To fully evaluate the unique and compelling benefits that I2E can bring to your organization, please contact your local Linguamatics representative or email us at [email protected].

For more information or a demonstration, please call us on +44 1223 651910

[email protected]
