Discover more, discover faster.
High performance, flexible NLP-based text mining for life sciences
Life Sciences organizations face the challenge of filtering ever-increasing
volumes of textual data to gain actionable insights for key decision-making.
The volume, variety and velocity of data are increasing exponentially. The big question
is, how do we make the best use of these data?
Key benefits
Save time during R&D with better decision support
Reduce resources and experimental costs
Generate new opportunities - answer questions that couldn't otherwise be answered
Create visual summaries from unstructured text - for rapid understanding and evaluation
Gain competitive advantage - reveal weak signals, sentiment and novel relationships
“It’s not information overload, it’s filter failure.”
Clay Shirky
Linguamatics I2E has been used for many applications
in life sciences across the drug discovery-development
lifecycle, including biomarker discovery, drug
re-purposing, clinical trial analytics, analyzing chemical
safety/toxicity signatures, and market intelligence
through the analysis of social media.
Linguamatics’ agile NLP text mining software, I2E,
provides rapid knowledge discovery from
unstructured and semi-structured text.
Using I2E, knowledge can be extracted from a wide
range of content sources such as scientific literature,
patents, clinical trials data, electronic health records
(EHRs), news feeds and proprietary content. This
knowledge can then be used to answer high value
questions in real time.
Linguamatics’ text mining technology is more
effective for knowledge discovery than traditional
search and is now well established and proven across
the life science industry, including pharmaceuticals,
biotechnology, healthcare, consumer products,
agrochemicals, government and more.
Solutions & applications in life sciences
Advanced text analytics delivers value along the pipeline
“Anyone considering text mining will find
a specialist with unrivalled domain
knowledge in Linguamatics”
Jason Stamper, 451 Research
[Figure: applications of text analytics along the drug discovery and development pipeline]
Goals: Race to patent; Race to market; Maximise market value
Pipeline stages: Pre-discovery; Drug discovery; Pre-clinical; Phase I clinical trials; Phase II clinical trials; Phase III clinical trials; Regulatory review; Scale up to manufacture; Post marketing surveillance
Applications: Opportunity scouting; Patent analysis; Drug repurposing; KOL identification; Social media analysis; Biomarker discovery; Comparative effectiveness; Competitive intelligence; SAR; Mutation/expression analysis; Toxicity analysis and prediction; Safety; Pharmacovigilance; Target ID/selection; Trial site selection and study design; HEOR; Gene-disease mapping; Regulatory submission QC
Making the most of your information assets
These case studies show some of the ways I2E has been used to
capture valuable information from life sciences literature, saving time
and increasing productivity across the drug discovery pipeline.
Targets: identification, validation and selection
Selecting the best targets in the drug discovery
process is crucial for optimizing return on R&D spend
across a portfolio of research projects. Researchers
use text mining to establish target ranking based on
efficacy and safety. Methods include providing links
to biological pathways and processes, and supporting
gene expression analysis in specific tissues and
species.
A variety of pathway databases exist but their scope
may be limited to a small number of premier
journals. In addition, there is often a lack of
contextual information to focus the specific pathway
analysis. Text mining approaches complement
pathway database searches by providing both target
context and access to up-to-date results from a much
more comprehensive range of documents.
At a top-10 global pharmaceutical company,
Linguamatics I2E forms part of a standard reusable
framework for novel target selection in use for a
variety of R&D projects. Our advanced NLP
capabilities and intuitive reporting make it simple for
scientists to see assertions and drill down to
supporting evidence in source documents.
I2E customers have reported tenfold
time savings in this type of literature
analysis. For 100 scientists, this is
equivalent to savings of 10 FTE years or
approximately $1m/year.
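The arithmetic behind these figures can be sketched as follows. The brochure states only the tenfold speed-up, the 100 scientists, the 10 FTE years and the approximate $1m/year. The share of time spent on literature analysis (about one ninth) and the cost per FTE year ($100,000) are back-calculated assumptions that make the numbers consistent, not values from the source. The function name is hypothetical.

```python
def fte_years_saved(scientists: int, share_of_time: float, speedup: float) -> float:
    """FTE years saved per year when a task that takes `share_of_time` of each
    scientist's year becomes `speedup` times faster."""
    time_before = scientists * share_of_time  # FTE years spent on the task today
    time_after = time_before / speedup        # FTE years spent after the speed-up
    return time_before - time_after


# Assumptions (NOT from the brochure): literature analysis takes about one
# ninth of each scientist's time, and one FTE year costs $100,000.
ASSUMED_SHARE = 1 / 9
ASSUMED_COST_PER_FTE = 100_000

saved = fte_years_saved(scientists=100, share_of_time=ASSUMED_SHARE, speedup=10)
print(f"{saved:.1f} FTE years, about ${saved * ASSUMED_COST_PER_FTE:,.0f} per year")
```

Under these assumptions the result matches the quoted 10 FTE years and roughly $1m/year. A different time share or cost per FTE would scale the savings proportionally.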
Providing a competitive edge with patent analytics
Patent literature often provides the first mention of
much critical data for novel chemistry and biology -
for example, compound structure, protein target,
intended disease area. Access to these valuable data
can provide a competitive edge for pharmaceutical
researchers.
However, patent literature is notoriously hard to
search – patents can be hundreds or thousands of
pages long and contain complex information often
written with obfuscation rather than communication in
mind. Matching up mentions of chemical structures or
gene targets in one part of a document with properties
and other information mentioned somewhere else,
especially in tables, is very challenging.
A traditional keyword search/document retrieval
approach is time-consuming and tedious as many
hundreds or thousands of patents may be retrieved,
particularly for in-depth patent analytics such as
opposition searches or patent landscaping.
Because a natural language processing approach
involves understanding the meaning of the text, text
mining using I2E enables a rapid and deep analysis of
patent documents, potentially saving millions of
dollars. I2E can search for numerical information,
chemicals by name or structure, and classes of drug
targets or therapeutic areas. It can also focus the
search on specific regions of a patent document, or
follow claim chains across a patent.
A top-10 Pharma company used the chemistry search
capability built into I2E, along with sophisticated
query strategies and algorithms to pre-process
tables, in order to extract detailed numeric and
biological information from patent documents.
The company was able to perform freedom-to-operate
searches more rapidly than with standard patent
search.
Clinical trial analytics
Clinical trials are used to gather safety and efficacy
data on new drugs in development or existing drugs
tested for new indications. Although some
information in published clinical trial reports is well
structured and searchable using keywords, much of
the key information lies in unstructured text.
I2E is essential to extract and synthesize the high
value information that is found only in these
unstructured regions. This can then be used to:
Select trial sites more effectively, and find precedents for study design protocols
Gain actionable information about competitors' worldwide clinical development activities, for example monitoring progress of competitors’ trials, or finding other companies running clinical trials in the same therapeutic area
Uncover in-licensing opportunities, by finding sponsors running early-stage clinical trials in particular therapeutic areas
According to our customers: using I2E,
the time for site selection can be reduced
by over 80%. For patient recruitment, time
spent can be reduced by at least 25%.
Biomarkers: identifying the crucial link between pre-clinical and clinical information
Mechanistic studies of compounds require specific
traits to be measured; similarly, clinical studies of
patients need to have a biomarker to quantify effects
of disease progression and treatments. These
biomarkers can take different forms, e.g. enzymes
with varying activity, changes in expression levels of
particular genes, or the presence or absence of
individual metabolites. The flexibility of I2E allows the
user to search for any of these data types and to find
relationships between items that link therapeutics to
phenotypic effects.
At a top-10 pharmaceutical company, I2E is used to
create a database of candidate biomarkers by mining
MEDLINE and full-text articles that can then be
queried by scientists. I2E can also scan the literature
for specific disease biomarkers on a day-to-day basis,
to maintain the currency of the in-house database.
Investigating indirect gene-gene relationships between a drug
compound, Raptiva, and a disease, Psoriasis, using interaction
network visualization.
“Scientific literature must be in a
computationally accessible format to be
used for systems biology studies, and
custom curation is frequently needed.
However, text analytics speeds creation
of custom annotation by as much as an
order of magnitude, lowering the barrier
to accessing the wealth of information
available in scientific literature.”
Library & Information Services,
Top-20 Biotech
Clinical and post-market safety
Organizations increasingly require auditable methods
to check whether signals indicating adverse or
toxicity related events appear in clinical records. If
events do occur, companies need to be able to react
fast to find out if they are caused by the drug, are
side effects of the original disease or are the result
of external factors.
Text mining can be used to review clinical reports to
search for signals of adverse events. For example,
Linguamatics I2E has been used to highlight different
adverse event profiles at different dosages.
Researchers can also search medical records for
particular adverse effects, code the effects found
and assess for drug associations. The linguistic
capabilities of I2E are critical in providing a
distinction between new effects, a history of an
effect, the lack of an effect, or the lack of a history
of an effect.
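The four-way distinction described above can be illustrated with a deliberately simple rule-based sketch. This is not how I2E works, since the brochure does not describe its internals. The cue patterns and the function name `classify_mention` are hypothetical, and real clinical text needs much richer linguistic handling, for example limiting a negation to the term it actually governs.

```python
import re

# Hypothetical cue patterns, chosen for illustration only.
NEGATION = re.compile(r"\b(no|denies|without|negative for)\b", re.IGNORECASE)
HISTORY = re.compile(r"\b(history of|previous|prior)\b", re.IGNORECASE)


def classify_mention(sentence: str) -> str:
    """Assign one of four coarse statuses to a sentence mentioning an effect."""
    negated = bool(NEGATION.search(sentence))
    historical = bool(HISTORY.search(sentence))
    if negated and historical:
        return "no history of effect"
    if negated:
        return "effect absent"
    if historical:
        return "history of effect"
    return "new effect"


examples = [
    "Patient developed a rash after the second dose.",
    "Patient has a history of rash.",
    "No rash observed.",
    "No history of rash.",
]
for text in examples:
    print(f"{classify_mention(text):22} | {text}")
```

A keyword search for "rash" would return all four sentences alike. Even this crude cue matching separates them, which is the kind of distinction the paragraph above refers to.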
“Once you see what I2E can do, you won't
want to go back to wading through
irrelevant documents.”
Associate Director, Safety Assessment,
Top-10 Global Pharmaceutical Company
© 2015 Linguamatics Ltd. The Linguamatics logo is a trademark of Linguamatics Ltd. All rights reserved. All other trademarks mentioned in this document are the property of their respective owners.
Linguamatics 324 Cambridge Science Park
Milton Road Cambridge CB4 0WG UK
Tel: +44 (0)1223 651910
Linguamatics 1900 West Park Drive Suite 280
Westborough MA 01581 USA
Tel: +1 617 674 3256
www.linguamatics.com
About Linguamatics
Linguamatics is the world leader in deploying
innovative natural language processing (NLP)-based
text mining for high-value knowledge discovery and
decision support. Linguamatics I2E is used by top
commercial, academic and government
organizations, including 17 of the top 20 global
pharmaceutical companies, the US Food and Drug
Administration (FDA) and leading US healthcare
providers. I2E can be used to mine a wide variety
of text resources, such as scientific literature,
patents, Electronic Health Records (EHRs), clinical
trials data, news feeds, social media and proprietary
content.
Linguamatics is committed to excellence in
healthcare informatics and is a corporate member
of AMIA and HIMSS. The company operates globally,
with headquarters in Cambridge, UK, and a U.S.
office in Westborough, MA.
Linguamatics is a winner of the Queen’s Award for
Enterprise 2014 for International Trade.
For further information, visit
www.linguamatics.com
About I2E
I2E is an agile, scalable, high performance text
mining system that facilitates discovery and
knowledge synthesis from unstructured text in large
document collections.
I2E has a proven track record in delivering best of
breed text mining capabilities across a broad range
of application areas. Its agile nature allows tuning
of query strategies to deliver the precision and recall
needed for specific tasks, but at enterprise scale.
There is a choice of ways in which you can connect
to I2E’s unique capabilities: either by deploying
I2E Enterprise in-house, or via I2E OnDemand,
our Software-as-a-Service (SaaS) version of I2E.
For more information, visit www.linguamatics.com
or www.whatistextmining.com
To fully evaluate the unique and compelling benefits that I2E can bring to your
organization, please contact your local Linguamatics representative or email us at
For more information or a demonstration,
please call us on +44 1223 651910