Artificial intelligence and machine learning for precision
medicine with the special focus on oncology- the state
of the art
Pekka Neittaanmäki
Dean of the Faculty of Information Technology
Professor, Department of Mathematical Information Technology
University of Jyväskylä
Sami Äyrämö
Research Coordinator
Department of Mathematical Information Technology
University of Jyväskylä
Khaula Zeeshan
Web Intelligence and service engineering
Department of Mathematical Information Technology
University of Jyväskylä
TEKES-HANKE: Value from Public Health Data with Cognitive Computing
CONTENTS
FUTURE IS WHAT WE CALL ARTIFICIAL INTELLIGENCE
MIND BLOWING AI APPLICATIONS
AI AND PRECISION MEDICINE – ROAD TO SUPER INTELLIGENCE
AI AND MACHINE LEARNING
AI DRIVING PRECISION MEDICINE IN 2017
PREDICTIVE MODELS FOR PRECISION MEDICINE
FOCUS ON ONCOLOGY AS SPECIAL CASE
NEW ERA OF ONCOLOGY - STATE OF THE ART
CANCER PROGNOSIS AND PREDICTION – ML AT WORK
ML PREDICTIVE TOOLS AND MODELS
DEEP LEARNING IN ONCOLOGY – FIGHTING CANCER
DATA TYPES – GENOMICS AND CLINICAL
NEXT GENERATION SEQUENCING
MICROARRAY TECHNOLOGY AND GENE EXPRESSION DATA
AI AND GENOMICS
COMMERCIAL AI/ML SOLUTIONS FOR ONCOLOGY
ML TECHNIQUES FOR DIFFERENT CANCER TYPES – SURVEY TABLES
CANCER DATABASES
NATIONAL AND INTERNATIONAL RESEARCH GROUPS
Future is what we call Artificial Intelligence
Artificial intelligence is the technology of the future, and research in this field is going to
change the face of the modern world. Exploring extraterrestrial planets, cars that drive
themselves, robots serving coffee in cafes, computers assisting doctors in treating patients
and making accurate predictions, and robots working in fields, on roads as traffic guards, and in
homes and offices alongside humans: none of this is science fiction any longer, and it will soon
be an everyday reality.
http://www.bbc.com/future/story/20161114-how-we-built-machines-that-can-think-for-themselves
https://www.youtube.com/watch?v=0XmUaHf-11A&feature=youtu.be
https://www.youtube.com/watch?v=5J5bDQHQR1g
https://www.youtube.com/watch?v=fP5zFpsThqk
Mind blowing AI applications
Humanesque behavior, mind and actions make Artificial Intelligence!
Mind-blowing AI applications representing humanesque behavior
http://www.pcworld.com/article/220685/tech_of_the_future_today_breakthroughs_in_artificial_intelligence.html
Police Robots
They are powered by solar panels and are equipped with surveillance cameras.
http://edition.cnn.com/2014/02/24/tech/robot-cops-rule-kinshasa/index.html
Autonomous Car - A Neural Network Drives a Car
The software learns from its human counterparts to identify the multiple features and
objects encountered in the driving experience like lane markings, streetlights, bushes and,
of course, other cars.
http://www.lostateminor.com/2017/06/16/artificial-intelligence-learning-drive-car/
Autonomous Robot Surgeon - Smart Tissue Autonomous Robot (STAR): STAR solved
the soft tissue challenge by integrating a few different technologies. Its vision system relied
on near-infrared fluorescent (NIRF) tags placed in the intestinal tissue; a specialized NIRF
camera tracked those markers while a 3D camera recorded images of the entire surgical
field. Combining all this data allowed STAR to keep its focus on its target. The robot made its
own plan for the suturing job, and it adjusted that plan as tissues moved during the
operation.
http://spectrum.ieee.org/the-human-os/robotics/medical-robots/autonomous-robot-surgeon-bests-human-surgeons-in-world-first
Latest trends in Artificial Intelligence, reshaping the modern age!
● Deep learning
● AI as manpower
● Autonomous vehicles
● Medicine
● Internet of things (IoT)
● Emotional understanding
● Shopping and customer service
http://www.techrepublic.com/article/7-trends-for-artificial-intelligence-in-2016-like-2015-on-steroids/
Artificial Intelligence and Precision Medicine - Road to
Superintelligence
● AI making smart wearable devices
● Providing Machine learning tools for medical diagnosis
● AI framework for simulating clinical decisions and predicting disease outcomes
https://phys.org/news/2017-06-artificial-intelligence-health-revolution.html
http://www.aiimjournal.com/
How AI is affecting the medical domain
Our next doctor could very well be a bot. Bots, or automated programs, are likely to play a
key role in finding cures for some of the most difficult-to-treat diseases and conditions.
http://www.healthcentral.com/slideshow/8-ways-artificial-intelligence-is-affecting-the-medical-field#slide=10
https://www.chipin.com/artificial-intelligence-health-care/
http://www.cs.cmu.edu/~neill/papers/ieee-is2013.pdf
http://www.newvision.co.ug/new_vision/news/1455810/artificial-intelligence-health-care
https://thenextweb.com/artificial-intelligence/2017/04/13/artificial-intelligence-revolutionizing-healthcare/#.tnw_YZSfu1AG
The AI Boogie Man-Watson
https://www.youtube.com/watch?v=ZPXCF5e1_HI
Watson is not alone. Other AI computer applications, including chatbots, promise to assist
humans in practicing medicine: to advise, counsel, treat, and correct defects, disease, and
illness. In fact, telemedicine or remote clinical services (virtual office visits, for example) are
predicted to increase 700% by 2020, according to MIT Sloan’s calculations.
https://hitinfrastructure.com/news/artificial-intelligence-key-in-ibm-watson-health-partnerships
● Artificial Intelligence Uses EHRs as Smart Analytics Tools
● IBM Watson Artificial Intelligence Improves Cancer Treatment
● BI Artificial Intelligence Supports Health IT Analytics
Each year, more and more medications, scientific developments, treatments and technology
enter the healthcare area. It is unreasonable to expect a healthcare provider to be able to
sift through all that information on their own. After all, we are just humans.
Supercomputers and AI can help make quickly finding the right information a reality.
https://www.fool.com/investing/2017/03/19/ibms-watson-is-tackling-healthcare-with-artificial.aspx
Artificial Intelligence and Machine Learning
One of the basic requirements of intelligent behavior is learning; there is no intelligence
without learning. Machine learning is therefore one of the major branches of artificial
intelligence and the most rapidly developing subfield of AI research.
Machine learning algorithms have been used for years in the medical domain for intelligent
data analysis, not only to predict the outcome of a given treatment but also
to find hidden relationships within medical data. Some of the most commonly used
state-of-the-art ML algorithms in the medical domain are:
Assistant-R, Assistant-I, LFC, naive Bayesian classifier, semi-naive Bayesian classifier, ANN
(backpropagation with weight elimination), k-NN.
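To make one of the listed algorithms concrete, the following is a minimal sketch of k-NN classification. The toy feature vectors, the labels "benign" and "malignant", and the function name `knn_predict` are assumptions made for illustration; they do not come from the cited studies.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training sample
    distances = [
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    ]
    distances.sort(key=lambda pair: pair[0])
    # Count the labels of the k closest samples and return the most frequent
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two numeric features per sample
train_X = [(1.0, 1.1), (1.2, 0.9), (0.8, 1.0), (3.0, 3.2), (3.1, 2.9), (2.9, 3.0)]
train_y = ["benign", "benign", "benign", "malignant", "malignant", "malignant"]

prediction = knn_predict(train_X, train_y, (2.8, 3.1), k=3)
print(prediction)
```

In practice the features would usually be rescaled first, since Euclidean distance is sensitive to the units of each feature.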
http://www.sciencedirect.com/science/article/pii/S093336570100077X
http://spectrum.ieee.org/the-human-os/biomedical/diagnostics/ai-predicts-heart-attacks-more-accurately-than-standard-doctor-method
● Heart murmur detection using ANN and modified neighbor annealing methods
● Prediction of apoptosis proteins based on evolutionary information and SVM
● Identifying risk factors and diagnosing ovarian cancer recurrence
● ML based identification of protein-protein interactions
● Prediction of heart attacks and strokes by AI
● Prediction of Autism from infant brain scan
http://www.infoworld.com/article/3199295/artificial-intelligence/primer-how-to-tell-if-ai-or-machine-learning-is-real.html
AI assists in the home treatment of heart patients in Finland
http://sciencebusiness.net/health/research-reports/finland-artificial-intelligence-assists-in-the-home-treatment-of-heart-patients/
Machine learning reshaping diagnostic medicine
Much of the diagnostic data is image-based, such as X-rays, MRI scans, and ultrasound
imagery, but can also include things like genomic profiles, epidemiological data, blood tests,
biopsy results, and even medical research papers. As a result, there is a wealth of data
available for training neural networks and for other machine learning techniques.
https://www.top500.org/news/machine-learning-will-reshape-diagnostic-medicine/
http://analytics-magazine.org/healthcare-analytics/
State of the art ML applications in health domain
Burgeoning applications of ML in pharma and medicine are glimmers of a potential future in
which synchronicity of data, analysis, and innovation is an everyday reality. McKinsey
estimates that big data and machine learning in pharma and medicine could generate a
value of up to $100B annually, based on better decision-making, optimized innovation,
improved efficiency of research and clinical trials, and new tool creation for physicians,
consumers, insurers, and regulators.
https://www.techemergence.com/applications-machine-learning-in-pharma-medicine/
http://spectrum.ieee.org/the-human-os/biomedical/diagnostics/in-hospital-intensive-care-units-ai-could-predict-which-patients-are-likely-to-die
http://it.toolbox.com/blogs/accessible-bi/applications-of-machine-learning-in-healthcare-diagnosis-76933
https://medium.com/health-ai/artificial-intelligence-in-health-care-weekly-roundup-8-da5dcf3ff449
AI driving Precision Medicine in 2017
Precision medicine (PM) is a medical model that proposes the customization of
healthcare, with medical decisions, practices, or products being tailored to the individual
patient. In this model, diagnostic testing is often employed for selecting appropriate and
optimal therapies based on the context of a patient’s genetic content or other molecular or
cellular analysis. Tools employed in precision medicine can include molecular diagnostics,
analytics and imaging.
https://www.brighttalk.com/webcast/9293/243547/machine-learning-towards-precision-medicine
http://www.cio.com/article/3157477/healthcare/how-ai-and-blockchain-are-driving-precision-medicine-in-2017.html
https://content.medicine.ai/what-is-medicine-ai-b2e5a2a9fbf4
AI techniques have been applied in cardiovascular medicine to explore novel genotypes and
phenotypes in existing diseases, improve the quality of patient care, enable cost-
effectiveness, and reduce readmission and mortality rates. Over the past decade, several
machine-learning techniques have been used for cardiovascular disease diagnosis and
prediction.
http://www.medscape.com/viewarticle/880843
The basic idea behind precision medicine is that large quantities of health data can be
analyzed to determine how small differences among people affect their health outcomes.
That analysis can then help people understand how their unique traits make them more or
less susceptible to a given disease or condition.
https://www.linkedin.com/pulse/artificial-intelligence-precision-treatment-transform-daniel-burrus
Predictive models and Precision medicine
Appropriate diagnosis is fundamental in medicine because it sets the basis for the prediction
of disease outcome at the single patient level (prognosis) and decisions regarding the most
appropriate therapy. However, given the large series of social, clinical and biological factors
that determine the likelihood of an individual's future outcome, prognosis only partly
depends on diagnosis and aetiology and treatment is not decided solely on the basis of the
underlying diagnosis. Predictive models, combined with deep analytics of big medical data, are
helping clinicians move towards high-precision medicine with more confidence and
reliability. Approaches that take due account of prognosis limit the lingering risk of
overdiagnosis and maximize the value of prognostic information in the clinical decision
process.
http://www.thejournalofprecisionmedicine.com/wp-content/uploads/2015/01/Robert-Hunter-Article1.pd
Predictive Analytics and Electronic Health Records (EHR)
Predictive analytics is fueling a transformation from a focus on the volume of procedures to
the value of outcome. Predictive tools are helping providers — both doctors’ groups and
hospitals — assess patients’ risk of contracting a whole host of diseases and conditions.
They can come up with individualized regimens by tapping into electronic medical records
to identify the types of patients who are most likely to respond to a particular type of
therapy. They can pinpoint treatments that sustain health in a more precise way than ever
before. And they can identify individuals who are likely to stop benefiting from a specific
regimen at a given time. For the volume-to-value paradigm shift in healthcare, predictive
analytics, though rarely visible, is the essential enabler.
https://hbr.org/2016/04/making-predictive-analytics-a-routine-part-of-patient-care?referral=03759&cm_vc=rr_item_page.bottom
https://hbr.org/2014/10/predictive-medicine-depends-on-analytics
https://www.elsevier.com/connect/seven-ways-predictive-analytics-can-improve-healthcare
Modern Statistical methodologies
● Trellis Graphics: Scatterplot matrix
● Generalized linear models: Smooth in time Logistic Regression
● Time Sliced Log Linear Regression
● Survival Time Analysis: Cox Regression
● Smooth Regression: Local Regression, Spline Regression
● Tree Based Methods: Regression Tree, Classification Tree
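Two of the model families in the list above can be written compactly. The forms below are the standard textbook ones, not taken from the cited sources. The notation is assumed here for illustration: $x_1, \dots, x_p$ are the predictors, $\beta_0, \dots, \beta_p$ the coefficients, and $h_0(t)$ the baseline hazard.

```latex
% Logistic regression: probability of a binary outcome given the predictors
\[
  \Pr(Y = 1 \mid x) =
  \frac{1}{1 + \exp\!\bigl(-(\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p)\bigr)}
\]
% Cox regression: hazard at time t, scaled relative to a baseline hazard h_0(t)
\[
  h(t \mid x) = h_0(t)\,\exp(\beta_1 x_1 + \dots + \beta_p x_p)
\]
```

In the Cox model, $\exp(\beta_j)$ is read as the hazard ratio for a one-unit increase in $x_j$ with the other predictors held fixed.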
Predictive models from development to validation to clinical impact
http://www.canceropole-gso.org/download/fichiers/2774/1_Moons.pdf
Regression Analysis in Medical Research
Regression analysis: a mathematical measure of the average relationship between two or
more variables, expressed in the original units of the data. It is used to predict the
unknown value of one variable (the response) from the known values of one or more other
variables, known as predictors.
http://biostat.mc.vanderbilt.edu/wiki/pub/Main/ClinStat/model.pdf
https://www.ncbi.nlm.nih.gov/pubmed/6729549
Multiple regression analysis: a powerful technique for predicting the
unknown value of a variable from the known values of two or more other variables, also
known as predictors.
https://explorable.com/multiple-regression-analysis
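As an illustration of multiple regression, the sketch below fits an ordinary least squares model with two predictors by solving the normal equations. The data are synthetic, generated from y = 2 + 3*x1 - x2, and the function names are hypothetical. A real analysis would normally use a statistics package rather than hand-written elimination.

```python
def fit_linear_regression(X, y):
    """Ordinary least squares: solve (A^T A) b = A^T y, where A is X plus an intercept column."""
    A = [[1.0] + list(row) for row in X]
    n = len(A[0])
    # Build the normal equations as an augmented matrix [A^T A | A^T y]
    M = [
        [sum(r[i] * r[j] for r in A) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(A, y))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def predict(coeffs, x):
    """Apply the fitted model: intercept plus weighted sum of the predictors."""
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], x))

# Synthetic data generated from y = 2 + 3*x1 - 1*x2 (no noise)
X = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 5.0)]
y = [3.0, 7.0, 7.0, 11.0, 12.0]

coeffs = fit_linear_regression(X, y)  # [intercept, slope for x1, slope for x2]
print(coeffs, predict(coeffs, (6.0, 2.0)))
```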
Regression Models in Medicine by Markku Nurminen
A knowledge-based report on diagnostic, etiognostic, and prognostic regression models in
medicine.
https://www.researchgate.net/publication/303891610_Diagnostic_etiognostic_prognostic_regression_models_in_medicine
Regression Modelling and Validation Strategies
http://biostat.mc.vanderbilt.edu/wiki/pub/Main/ClinStat/model.pdf
Monte Carlo Model and Simulations
Monte Carlo methods (or Monte Carlo experiments) are a broad class of computational
algorithms that rely on repeated random sampling to obtain numerical results. They are
often used in physical and mathematical problems and are most useful when it is difficult or
impossible to use other mathematical methods.
This technique is widely used for diagnostic and predictive applications. Monte Carlo
methods are used in imaging, nuclear medicine, risk assessment and many other areas.
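A small sketch of the repeated random sampling idea is given below. It estimates the fraction of photons that cross a uniform slab without interacting and compares the estimate with the closed-form Beer-Lambert value. The attenuation coefficient, the slab thickness and the function name are assumptions chosen for illustration. Real nuclear medicine codes also model scattering, energy and geometry.

```python
import math
import random

def simulate_transmission(mu, thickness, n_photons, seed=0):
    """Estimate the fraction of photons crossing a slab without interacting."""
    rng = random.Random(seed)
    transmitted = 0
    for _ in range(n_photons):
        # Free path length sampled from an exponential distribution with rate mu
        path = rng.expovariate(mu)
        if path > thickness:
            transmitted += 1
    return transmitted / n_photons

mu = 0.2         # assumed attenuation coefficient, 1/cm
thickness = 5.0  # assumed slab thickness, cm
estimate = simulate_transmission(mu, thickness, n_photons=100_000)
exact = math.exp(-mu * thickness)  # Beer-Lambert law, for comparison
print(estimate, exact)
```

The estimate approaches the exact value as the number of simulated photons grows, with an error that shrinks roughly as one over the square root of the sample size.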
https://www.amazon.com/Monte-Carlo-Calculations-Nuclear-Medicine/dp/0750304790
http://pinlab.hcuge.ch/pdf/IAEA02.pdf
http://www.ingentaconnect.com/content/tandf/gsar/2015/00000026/00000006/art00002
http://omlc.org/software/mc/
Testing the accuracy of Predictive Models
http://www.plottingsuccess.com/3-predictive-model-accuracy-tests-0114/
External validation of clinical prediction models
http://www.bmj.com/content/bmj/353/bmj.i3140.full.pdf
https://www.nap.edu/read/13395/chapter/7
Focus on Oncology as special case
Oncology is the branch of medicine that deals with the prevention, diagnosis and treatment
of cancer. With the advent of new technologies such as artificial intelligence and cognitive
science, a tremendous amount of research has been done in past decades, and its potential is
increasing each year. Oncologists are now looking towards a new era of cancer prognosis
and prediction assisted by state-of-the-art algorithms based on machine learning
techniques, including artificial neural networks (ANNs), Bayesian networks (BNs), support
vector machines (SVMs), and decision trees (DTs). Modeling of cancer progression provides
oncologists with a decision support system built on predictive models, resulting in more
effective and accurate decision making and a better understanding of cancer progression.
Cancer prognosis and prediction are concerned with three predictive tasks:
● Prediction of cancer susceptibility (risk assessment)
● Prediction of cancer recurrence/local control
● Prediction of cancer survival
Challenges while treating cancer
Cancer is a highly heterogeneous disease consisting of many subtypes. The early diagnosis and
prognosis of a cancer type have become a necessity in cancer research. Applications of
artificial intelligence and machine learning techniques have opened a new avenue of
research in bioinformatics and biomedicine: digging into data pools to extract valuable
information for fighting cancer through reliable and timely diagnosis, prognosis
and prediction. Prediction of cancer outcome usually refers to life expectancy,
survivability, progression and treatment sensitivity.
https://www.cancer.gov/research/areas/public-health
https://www.cancer.gov/research/areas/biology
https://www.cancer.gov/research/areas/disparities
Cancer facts and figures 2017
Cancer is one of the leading causes of death worldwide. In 2015,
about 90.5 million people had cancer. About 14.1 million new cases occur each year, and
cancer causes about 8.8 million deaths a year. The most common types of cancer in males are
lung cancer, prostate cancer, colorectal cancer and stomach cancer. In females, the most
common types are lung cancer, breast cancer, colorectal cancer and cervical cancer. In
children, acute lymphoblastic leukemia and brain tumors are the most common types of cancer.
The risk of cancer increases significantly with age.
https://www.cancer.org/content/dam/cancer-org/research/cancer-facts-and-statistics/annual-cancer-facts-and-figures/2017/cancer-facts-and-figures-2017.pdf
New era of Oncology - State of the art
For the first time, artificial intelligence has been used to discover the exact
interventions needed to obtain a specific, previously unachievable result in vivo,
providing new insight into the biophysics of cancer and raising broad implications for
biomedicine.
https://www.fronteo-healthcare.com/en/cpmais/en/index.html
https://www.sciencedaily.com/releases/2017/01/170127113030.htm
http://esmoopen.bmj.com/content/2/2/e000198
http://www.bbc.com/news/health-36482333
AI and big genomic data:
https://www.bbvaopenmind.com/en/fight-against-cancer-with-artificial-intelligence-and-big-data/
https://www.mesotheliomahelpnow.com/blog/artificial-intelligence-cancer-patients/
Big Data and artificial intelligence, combined with genetic analysis, allow researchers to
search for and find patterns among patients with rare diseases, who may be separated by
distance but carry the same mutation. The ultimate goal is to create a huge digital medical
data library, a kind of big data of medicine, which respects the privacy of the patient but
accelerates diagnosis and treatment.
http://jamanetwork.com/journals/jamaoncology/article-abstract/2330621
AI Improving oncology safety through post approval analytics
AI can affect patient care through: 1) innovations and improvements in efficiency,
2) patient safety in the care delivery process, and 3) efficacy of care delivery.
http://www.mwestonchapman.com/artificial-intelligence-improving-safety-through-post-approval-analytics/
http://www.cbsnews.com/news/artificial-intelligence-making-a-difference-in-cancer-care
AI helping in finding cancer cells
http://sciencenewsjournal.com/artificial-intelligence-helps-find-cancer-cells/
https://news.microsoft.com/stories/computingcancer/
AI powered Microscope to detect cancer cells-State of the art
A new microscope developed by researchers uses AI to detect cancer cells in
blood samples. Faster and more accurate than contemporary techniques, it can analyze
36 million images every second without damaging the blood samples. Deep learning is a
widely used form of artificial intelligence that applies complex algorithms to extract meaning
from data, leading to better decision making.
The photos are processed using deep learning, which runs data through a mass of
algorithms to efficiently and accurately “read” the information. Deep learning has also been
used to analyze patients’ genes, allowing identification of diseases or cancers that might
otherwise go undetected, and has the potential to further our understanding of cancer-forming
mutations.
https://futurism.com/microscope-uses-artificial-intelligence-find-cancer-cells/
http://www.breitbart.com/tech/2016/04/28/ai-powered-microscope-helps-root-out-cancer
http://www.popsci.com/artificial-intelligence-helps-diagnose-cancer
Cancer Prognosis and Prediction-Machine learning at work
Several studies have shown that the application of ML techniques has improved the accuracy
of cancer prediction by 15%-20% in recent years. Cancer research is evolving
continuously. Oncologists have applied different
screening techniques to detect cancer types before they cause symptoms. There are also
statistical predictive models for early prediction. However, to meet the challenging task of
accurately predicting disease outcome, ML methods have become a popular tool for
medical researchers. These techniques can discover and identify patterns and relationships
in complex datasets, and can effectively predict the future
outcomes of a cancer.
http://europepmc.org/articles/PMC4348437
https://www.researchgate.net/publication/285779472_Machine_Learning_
n_Genomic_Medicine_A_Review_of_Computational_Problems_and_Data_Sets
http://csbj.org/articles/e2015004.pdf
https://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2008-6.pdf
Machine Learning Predictive Models-Different techniques
For cancer diagnosis and prognosis, many supervised ML techniques have been
applied for years, but they have not been adopted in daily clinical routine because of a
lack of external validation. The most common ML techniques include artificial neural
networks (ANNs), Bayesian networks (BNs), support vector machines (SVMs) and
decision trees (DTs).
Clinical or genomic data samples are given as inputs with several features, and each
feature can take different types of values. The data is preprocessed to address
quality issues. Preprocessing steps include dimensionality reduction, feature
selection and feature extraction. ML techniques then classify the data into predefined
classes.
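The preprocessing stage described above can be sketched as follows. This is only one simple possibility: min-max scaling followed by a variance filter as a basic form of feature selection. The threshold, the toy sample values and the function names are assumptions made for illustration, not the procedure of any cited study.

```python
from statistics import pvariance

def min_max_scale(rows):
    """Rescale every feature (column) to the range [0, 1]."""
    columns = list(zip(*rows))
    scaled_columns = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant column carries no information; map it to all zeros
        scaled_columns.append([(v - lo) / span if span else 0.0 for v in col])
    return [list(row) for row in zip(*scaled_columns)]

def select_by_variance(rows, threshold):
    """Return the indices of features whose variance exceeds `threshold`."""
    columns = list(zip(*rows))
    return [i for i, col in enumerate(columns) if pvariance(col) > threshold]

# Hypothetical samples: three features each, the second one constant
samples = [
    [50.0, 1.0, 0.10],
    [60.0, 1.0, 0.90],
    [70.0, 1.0, 0.20],
    [80.0, 1.0, 0.80],
]
scaled = min_max_scale(samples)
kept = select_by_variance(scaled, threshold=0.01)
reduced = [[row[i] for i in kept] for row in scaled]
print(kept, reduced)
```

The reduced rows would then be passed to a classifier. The constant feature is dropped because it cannot help separate the classes.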
http://www.nature.com/ctg/journal/v5/n1/full/ctg201319a.html
Methods of validation: hold-out method, random sampling, cross-validation, bootstrap
http://www.sciencedirect.com/science/article/pii/S2001037014000464
http://newatlas.com/machine-learning-predicts-breast-cancer-treatment-responses/39510/
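Two of the validation methods named above, cross-validation and the bootstrap, differ mainly in how the samples are split. The sketch below only generates the index splits, and a model would be trained and scored on each one. The function names and the sample count of 10 are assumptions for illustration.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

def bootstrap_indices(n_samples, seed=0):
    """Draw a bootstrap sample (with replacement) and its out-of-bag indices."""
    rng = random.Random(seed)
    sample = [rng.randrange(n_samples) for _ in range(n_samples)]
    out_of_bag = sorted(set(range(n_samples)) - set(sample))
    return sample, out_of_bag

splits = list(k_fold_indices(10, k=5))
sample, oob = bootstrap_indices(10)
print(splits[0], sample, oob)
```

A single (train, test) pair from `k_fold_indices` corresponds to a hold-out split. In the bootstrap, the out-of-bag indices serve as the test set for the model trained on the resampled data.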
A Data-mining approach: Data collected from hematopoietic SCT (HSCT) centers are
becoming more abundant and complex owing to the formation of organized registries and
incorporation of biological data. Typically, conventional statistical methods are used for the
development of outcome prediction models and risk scores. However, these analyses carry
inherent properties limiting their ability to cope with large data sets with multiple variables
and samples. Machine learning (ML), a field stemming from artificial intelligence, is part of a
wider approach for data analysis termed data mining (DM). It enables prediction in complex
data scenarios, familiar to practitioners and researchers. Technological and commercial
applications are all around us, gradually entering clinical research.
http://www.nature.com/bmt/journal/v49/n3/full/bmt2013146a.html
http://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-015-0015-0
Predict-ML Tool for clinical data
https://link.springer.com/article/10.1186/s13755-016-0018-1
Artificial Neural Network
In machine learning and cognitive science, artificial neural networks (ANNs) are a
family of models inspired by biological neural networks (the central nervous systems
of animals, in particular the brain) and are used to estimate or approximate functions
that can depend on a large number of inputs and are generally unknown.
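A minimal sketch of how such a network maps inputs to an output is shown below: a forward pass through one hidden layer with sigmoid activations. The weights are hand-picked for illustration so that the network computes XOR. In practice they would be learned from data, for example by backpropagation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """One fully connected layer: weighted sum of inputs plus bias, then sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_w, inputs)) + b)
        for neuron_w, b in zip(weights, biases)
    ]

# Hand-picked (not learned) weights, chosen for illustration only
HIDDEN_W = [[20.0, 20.0], [20.0, 20.0]]
HIDDEN_B = [-10.0, -30.0]  # hidden unit 0 behaves like OR, unit 1 like AND
OUTPUT_W = [[20.0, -20.0]]
OUTPUT_B = [-10.0]         # output is high for "OR and not AND", i.e. XOR

def forward(x1, x2):
    hidden = layer([x1, x2], HIDDEN_W, HIDDEN_B)
    return layer(hidden, OUTPUT_W, OUTPUT_B)[0]

outputs = {(a, b): round(forward(a, b)) for a in (0, 1) for b in (0, 1)}
print(outputs)
```

XOR is a useful toy case because no single neuron can represent it. The hidden layer is what lets the network approximate a function that is not linearly separable.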
http://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol2/ds12/article2.html
http://www.phil.gu.se/ann/annimabintro.pdf
https://www.researchgate.net/publication/228349773_Artificial_Neural_Network_in_Medicine
http://onlinelibrary.wiley.com/doi/10.1111/jan.12691/abstract
ANN in medical imaging
http://www.ehealthlab.cs.ucy.ac.cy/oldmedinfo/documents/medinf_03_NeuralNetworksMedicalImaging.pdf
https://pure.strath.ac.uk/portal/files/34307047/medical_imaging_ann_v1.pdf
ANN for breast cancer diagnosis and prognosis
http://neuroph.sourceforge.net/tutorials/PredictingBreastCancer/PredictingBreastCancer.html
http://www.ijcaonline.org/journal/number26/pxc387783.pdf
http://www.sciencedirect.com/science/article/pii/S0957417408001103
ML and breast cancer
http://newatlas.com/machine-learning-predicts-breast-cancer-treatment-responses/39510/
Breast cancer risk prediction models
https://epi.grants.cancer.gov/cancer_risk_prediction/breast.html
Breast cancer risk assessment tool
https://www.cancer.gov/bcrisktool/
Prognosis of prostate cancer by artificial neural networks
The ANN yielded high rates of reliability. It can help doctors make quick and reliable
diagnoses without added risk, and it makes a wait-and-see policy a better option for
monitoring patients with low prostate cancer risk, for whom biopsy should be avoided.
http://www.sciencedirect.com/science/article/pii/S095741741000237X
ANN for survival prediction in colon cancer
http://molecular-cancer.biomedcentral.com/articles/10.1186/1476-4598-4-29
ANN advanced technology for forecasting and clustering-NeuroXL
NeuroXL Clusterizer and Predictor are both powerful, easy-to-use and affordable solutions
for advanced prediction and clustering of medical data. Both are designed as add-ins to
Microsoft Excel, are easy to learn and do not require that data be exported out of or
imported into Excel. They provide a cost-effective way to harness the power of artificial
intelligence for a wide variety of applications.
http://neuroxl.com/applications/medicine/neural-networks-in-medicine/index.htm
Predicting prostate biopsy outcome: artificial neural networks and
polychotomous regression are equivalent models
A polychotomous logistic regression (PR) model and an artificial neural network (ANN) were
compared for predicting biopsy results, particularly for clinically significant prostate cancer (PC).
https://link.springer.com/article/10.1007/s11255-010-9750-7
ANN in Medicine world map
http://www.phil.gu.se/ann/annworld.html
Other Machine Learning Techniques
Other ML techniques have also been used. Bayesian networks, for example, have been applied
to cancer recurrence prediction in oral cancer, to survival prediction in breast cancer and to
susceptibility prediction in colon carcinomatosis.
http://www.sciencedirect.com/science/article/pii/S2001037014000464
Support Vector Machine: In machine learning, support vector machines (SVMs, also
support vector networks) are supervised learning models with associated learning
algorithms that analyze data for classification and regression analysis.
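As a rough sketch of the classification case, the code below trains a linear SVM by sub-gradient descent on the regularised hinge loss. This is one simple training scheme among several and is not the method of any cited paper. The toy data are assumed to be already standardised, and the hyperparameters `lam`, `epochs` and `lr` are arbitrary illustrative values.

```python
def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Minimise the L2-regularised hinge loss by sub-gradient descent.

    Labels in `y` must be +1 or -1.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:
                # Sample is inside the margin or misclassified: push it out
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Only the regulariser contributes to the sub-gradient
                w = [wj - lr * lam * wj for wj in w]
    return w, b

def predict(w, b, x):
    """Classify by the sign of the decision function w.x + b."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Hypothetical standardised features; -1 and +1 are the two classes
X = [(-1.0, -1.0), (-1.5, -0.5), (-0.5, -1.5), (1.0, 1.0), (1.5, 0.5), (0.5, 1.5)]
y = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(X, y)
print(w, b, predict(w, b, (2.0, 2.0)))
```

Non-linear problems are usually handled by replacing the dot product with a kernel function, which this linear sketch does not cover.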
http://www.sciencedirect.com/science/article/pii/S1110866510000241
Support Vector Machines combined with Feature selection for breast cancer
diagnosis
http://www.sciencedirect.com/science/article/pii/S0957417408000912
Automated diagnostic systems for breast cancer detection and high precision
accuracy of Support vector Machines
http://www.sciencedirect.com/science/article/pii/S0957417406002442
Cancer prognosis using Support Vector regression in image modality
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3095462/
Latent Space Support vector machine for cancer diagnosis
http://www.sciencedirect.com/science/article/pii/S1877050914012721
Deep Learning in oncology-Applications in fighting cancer
Deep learning is a class of machine learning algorithms that are based on the (unsupervised)
learning of multiple levels of features or representations of the data. Higher-level features
are derived from lower-level features to form a hierarchical representation.
Deep learning plays a vital role in the early detection of cancer. A study reported by NVIDIA
showed that deep learning reduced the error rate in breast cancer diagnosis by 85%.
https://blogs.nvidia.com/blog/2016/09/19/deep-learning-breast-cancer-diagnosis/
https://www.techemergence.com/deep-learning-in-oncology/
Deep Learning a tool for increased efficiency and accuracy
https://www.nature.com/articles/srep26286
Google Deep learning system for Pathologists
https://9to5google.com/2017/03/03/google-deep-learning-cancer-diagnosis/
https://research.googleblog.com/2017/03/assisting-pathologists-in-detecting.html
Deep Learning approach for cancer detection and gene expression
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5177447/
https://rpubs.com/EdMwa/241538
Leveraging deep learning to predict breast cancer proliferation scores with
Apache Spark and Apache SystemML
https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/56151
http://systemml.apache.org/
DeepGene: An advanced cancer type classifier based on deep
learning and somatic point mutations
DeepGene is an advanced cancer type classifier based on deep learning and somatic point
mutation data. Experiments indicate that DeepGene outperforms three widely adopted
existing classifiers, which is mainly attributed to its deep learning module that is able to
extract the high-level features between combinatorial somatic point mutations and cancer
types.
https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-1334-9
http://ascopubs.org/doi/abs/10.1200/JCO.2017.35.4_suppl.164
MODCELL Predictive Model: Mechanistic predictive model with wide range of
applications in medicine;
● Personalized Medicine
● Virtual Clinical Trials
● Drug Target Identification
http://www.alacris.de/modcell/
Predictive model for cancer
http://www.oncodesign.com/assets/files/Webinars/Webinar_PDX_Predictive_Cancer_Models_for_Better_Precision_Medicine.pdf
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC467154
Data types-Genomic and Clinical
Cancer is fundamentally a disease of the genome, caused by changes in the DNA, RNA, and
proteins of a cell that push cell growth into overdrive. Identifying the genomic alterations
that arise in cancer can help researchers decode how cancer develops and improve upon
the diagnosis and treatment of cancers based on their distinct molecular abnormalities.
https://gdc.cancer.gov/resources-tcga-users/tcga-code-tables/data-types
http://www.nature.com/articles/srep41674
https://wiki.nci.nih.gov/display/TCGA/Data+Levels+and+Data+Types
Genomic Profiling of Multiple Data types
Genomic profiling of multiple data types in the same set of tumors has gained prominence.
https://academic.oup.com/bioinformatics/article/25/22/2906/180866/Integrative-clustering-of-multiple-genomic-data
Next Generation Sequencing
Next generation sequencing (NGS), massively parallel sequencing, and deep sequencing are
related terms for a DNA sequencing technology that has revolutionized genomic research.
Using NGS, an entire human genome can be sequenced within a single day.
https://www.ncbi.nlm.nih.gov/pubmed/25108476
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3841808/
https://www.illumina.com/science/technology/next-generation-sequencing.html
Fighting cancer with AI and DNA sequencing using Big Data
http://www.wired.co.uk/video/fighting-cancer-with-dna-sequencing-big-data-ai
Cancer Genome Atlas
The Cancer Genome Atlas (TCGA) is a platform that aims to improve understanding of the
molecular basis of cancer through the application of genome analysis technologies, and to
explore the entire spectrum of genomic changes involved in human cancer.
https://tcga-data.nci.nih.gov/docs/publications/tcga/?
https://biospecimens.cancer.gov/relatedinitiatives/overview/tcga.asp
Microarray technology and Gene Expression data
The DNA microarray is a tool used to determine whether the DNA from a particular
individual contains a mutation in genes like BRCA1 and BRCA2. The chip consists of a small
glass plate encased in plastic. Some companies manufacture microarrays using methods
similar to those used to make computer microchips. On the surface, each chip contains
thousands of short, synthetic, single-stranded DNA sequences, which together add up to the
normal gene in question, and to variants (mutations) of that gene that have been found in
the human population.
http://www.premierbiosoft.com/tech_notes/microarray.html
https://www.genome.gov/10000533/dna-microarray-technology/
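The probe-matching idea described above can be sketched in a few lines of Python. This is a minimal illustration only: it assumes perfect Watson-Crick pairing, and the 9-base targets, the sample fragment, and the function names are made up for the example (real probes are longer and gene-specific).

```python
# Watson-Crick base pairing used for probe/target hybridization.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the strand that would pair perfectly with `seq`."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def matching_probes(sample, probes):
    """Names of probes whose perfect-match target occurs in the sample strand."""
    return [name for name, probe in probes.items()
            if reverse_complement(probe) in sample]


# Hypothetical 9-base targets, not taken from any real gene.
normal_target = "ATGGATTTA"
variant_target = "ATGGCTTTA"   # single-base change (A -> C) at position 5
probes = {
    "normal": reverse_complement(normal_target),
    "variant": reverse_complement(variant_target),
}

sample = "CCGT" + variant_target + "GGAC"   # made-up sample fragment
print(matching_probes(sample, probes))
```

Only the probe designed against the variant sequence matches this sample, which is how a chip can separate the normal gene from a known mutation.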
Gene expression data for cancer classification and Prediction
Gene expression is the process by which information from a gene is used in the synthesis of
a functional gene product. These products are often proteins, but in non-protein-coding
genes such as transfer RNA or small nuclear RNA genes, the product is a functional RNA.
http://hanj.cs.illinois.edu/pdf/is03_cancer.pdf
http://www.sciencedirect.com/science/article/pii/S187705091500561X
https://www.ncbi.nlm.nih.gov/books/NBK6624/
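As a rough illustration of the gene-to-product step for a protein-coding gene, the sketch below transcribes a DNA coding strand into mRNA and translates it codon by codon. The sequence is made up, the codon table is deliberately partial (only the codons this example needs), and the function names are assumptions for the example.

```python
# Partial codon table: only the codons needed for this toy example.
CODON_TABLE = {
    "AUG": "Met", "GCC": "Ala", "AAA": "Lys", "UGG": "Trp",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def transcribe(coding_strand):
    """mRNA carries the coding strand's sequence with U in place of T."""
    return coding_strand.replace("T", "U")


def translate(mrna):
    """Read codons from the first AUG until a stop codon is reached."""
    start = mrna.find("AUG")
    if start == -1:
        return []
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "Stop":
            break
        protein.append(residue)
    return protein


gene = "ATGGCCAAATGGTAA"   # made-up coding sequence, not a real gene
mrna = transcribe(gene)
print(mrna, translate(mrna))
```

A codon missing from the partial table raises a `KeyError`; a full 64-codon table would be needed for real sequences.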
Gene expression profiling data are used for the classification and prediction of cancer,
and neural networks are one method used to obtain meaningful results from such data.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1282521/
https://www.ncbi.nlm.nih.gov/pubmed/1940735
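To make the idea of classifying samples from expression values concrete, here is a minimal sketch of a single artificial neuron trained with the perceptron rule. It is not the method of the cited studies, which use larger networks over many genes. The expression values, the two "genes", and the labels are invented for illustration.

```python
def train_perceptron(samples, labels, epochs=100, lr=0.1):
    """Train a single-neuron classifier with the perceptron rule."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            predicted = 1 if activation > 0 else 0
            update = lr * (y - predicted)
            if update != 0:
                # Move the decision boundary toward the misclassified sample.
                weights = [w + update * xi for w, xi in zip(weights, x)]
                bias += update
                errors += 1
        if errors == 0:   # every training sample classified correctly
            break
    return weights, bias


def predict(weights, bias, x):
    """Return 1 if the weighted sum is positive, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


# Made-up expression levels for two hypothetical genes (not real data).
samples = [[2.0, 0.5], [1.8, 0.7], [2.2, 0.4],   # label 1: "tumor"
           [0.4, 1.9], [0.6, 2.1], [0.5, 1.7]]   # label 0: "normal"
labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_perceptron(samples, labels)
print([predict(weights, bias, x) for x in samples])
```

The toy data are linearly separable, so the perceptron rule converges. Real expression data have far more genes than samples, which is why the published studies pair networks with feature selection and cross validation.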
Microarray and clinical data yielded excellent prediction results using an
artificial neural network
http://ieeexplore.ieee.org/document/5162409/
http://cbbp.thep.lu.se/~markus/publications/papers/nm0601_673.pdf
Artificial intelligence and Genomics
Genomics is the study of the genome, the complete set of genetic material within an organism.
It involves sequencing and analyzing the genome.
Deep Genomics-AI meets the Human Genome
Using artificial intelligence (AI), deep learning algorithms, and complex data sets, the entire
healthcare industry could be revolutionized — from diagnostics to gene therapies to
personalized medicine. Deep Genomics holds the key to unlocking the biggest disruptions in
the medical, life sciences, and pharmaceutical industries.
https://www.vlab.org/events/deep-genomics/
https://www.deepgenomics.com/news/
AI cracking the Genomic mysteries with Deep Genomics
http://www.businessinsider.com/how-deep-genomics-is-using-ai-to-solve-genetic-mysteries-2015-9?r=US&IR=T&IR=T
How AI and genomic sequencing are helping cancer patients (Sophia DDM)
http://www.techrepublic.com/article/how-ai-and-next-generation-genomic-sequencing-is-helping-cancer-patients/
Intersection of AI and Genomics
https://www.wilsoncenter.org/article/brave-new-world-the-intersection-genomics-and-artificial-intelligence
DNA Avatar
https://www.wilsoncenter.org/blog-post/your-dna-avatar-what-happens-when-artificial-intelligence-meets-cutting-edge-genetics
https://aitrends.com/features/deep-genomics-applies-artificial-intelligence-personalized-medicine/
Genomic Mapping by Deep Learning
http://medcitynews.com/2015/07/deep-learning-artificial-intelligence-genome/
https://www.forbes.com/sites/mikemontgomery/2016/12/22/in-cancer-fight-artificial-intelligence-is-a-smart-move-for-everyone/#6ab724704064
Watson for Genomics
http://researcher.watson.ibm.com/researcher/view_group.php?id=5347
http://www.healthcareitnews.com/news/ai-can-speed-precision-medicine-new-york-genome-center-ibm-watson-study-shows
Commercial AI/ML solutions for Oncology
Oncology's Incessant Grip
http://socialarma.com/column/oncologys-incessant-grip
IBM Watson for Oncology
https://www.ibm.com/watson/health/oncology-and-genomics/oncology/
Watson Genomics Analytics for Cancer
http://wwwna.sanfordhealth.org/sioux-falls/watson-genomic-analytics
Watson for Oncology SlideShare
https://www.slideshare.net/InsideDNA/watson-genomics
Integrated Genomic solution for Oncology by Philips and Illumina
http://www.philips.com/a-w/about/news/archive/standard/news/press/2017/20170109-philips-and-illumina-join-forces-to-offer-integrated-genomics-solutions-for-oncology.htm
Berg's AI Solution for pancreatic and prostate cancer
http://fortune.com/2015/04/07/pancreatic-cancer-research-berg/
HALO® Deep Learning Technology for Automated Breast Cancer Metastasis Staging
http://tissuepathology.com/2017/05/01/halo-deep-learning-technology-awarded-for-automated-breast-cancer-metastasis-staging/#axzz4nBj52nv8
Microsoft Fighting Cancer
https://news.microsoft.com/stories/computingcancer/
ai4gi-For gastrointestinal cancer
http://ai4gi.com/our-vision/
Google for treating cancer and Genomic Research
https://edgylabs.com/google-ai-cancer-diagnosis/
http://hitconsultant.net/2014/10/16/google-partners-create-cancer-genomics-cloud/
https://9to5google.com/2017/03/03/google-deep-learning-cancer-diagnosis/
https://research.googleblog.com/2017/03/assisting-pathologists-in-detecting.html
GENOME 7-Providing cutting edge AI solutions for cancer
http://genome7.com/en/index.cfm#.WXCQeoVOI2w
Amazon and cancer genomic research
http://www.frontlinegenomics.com/review/12588/amazon-cloud-solutions-support-cancer-genomics-research/
Predicting breast cancer biopsies for malignancies using ML
https://www.bing.com/videos/search?q=ML+commercial+models+for+cancer+prognosis&&view=detail&mid=4FAA79AC9CA8B26CC2BF4FAA79AC9CA8B26CC2BF&FORM=VRDGAR
AI diagnoses skin cancer with dermatologist-level accuracy; skin cancer app
https://blogs.nvidia.com/blog/2017/05/23/ai-app-skin-cancer-diagnosis/
https://cosmosmagazine.com/technology/artificial-intelligence-diagnoses-skin-cancers-as-well-as-dermatologists
https://skinvision.com/
http://news.stanford.edu/2017/01/25/artificial-intelligence-used-identify-skin-cancer/
Machine Learning Techniques for different cancer types - Survey Table
Cancer Type | Clinical Endpoint | Machine Learning Algorithm | Benchmark | Improvement (%) | Training Data | Reference
bladder | recurrence | fuzzy logic | statistics | 16 | mixed | Catto et al, 2003
bladder | recurrence | ANN | N/A | N/A | clinical | Fujikawa et al, 2003
bladder | survivability | ANN | N/A | N/A | clinical | Ji et al, 2003
bladder | recurrence | ANN | N/A | N/A | clinical | Spyridonos et al, 2002
brain | survivability | ANN | statistics | N/A | genomic | Wei et al, 2004
breast | recurrence | clustering | statistics | N/A | mixed | Dai et al, 2005
breast | survivability | decision tree | statistics | 4 | clinical | Delen et al, 2005
breast | susceptibility | SVM | random | 19 | genomic | Listgarten et al, 2004
breast | recurrence | ANN | N/A | N/A | clinical | Mattfeldt et al, 2004
breast | recurrence | ANN | N/A | N/A | mixed | Ripley et al, 2004
breast | recurrence | ANN | statistics | 1 | clinical | Jerez-Aragones et al, 2003
breast | survivability | ANN | statistics | N/A | clinical | Lisboa et al, 2003
breast | treatment response | ANN | N/A | N/A | proteomic | Mian et al, 2003
breast | survivability | clustering | statistics | 0 | clinical | Seker et al, 2003
breast | survivability | fuzzy logic | statistics | N/A | proteomic | Seker et al, 2002
breast | survivability | SVM | N/A | N/A | clinical | Lee et al, 2000
breast | recurrence | ANN | expert | 5 | mixed | De Laurentiis et al, 1999
breast | survivability | ANN | statistics | 1 | clinical | Lundin et al, 1999
breast | recurrence | ANN | statistics | 23 | mixed | Marchevsky et al, 1999
breast | recurrence | ANN | N/A | N/A | clinical | Naguib et al, 1999
breast | survivability | ANN | N/A | N/A | clinical | Street, 1998
breast | survivability | ANN | expert | 5 | clinical | Burke et al, 1997
breast | recurrence | ANN | statistics | N/A | mixed | Mariani et al, 1997
breast | recurrence | ANN | expert | 10 | clinical | Naguib et al, 1997
cervical | survivability | ANN | N/A | N/A | mixed | Ochi et al, 2002
colorectal | recurrence | ANN | statistics | 12 | clinical | Grumett et al, 2003
colorectal | survivability | ANN | statistics | 9 | clinical | Snow et al, 2001
colorectal | survivability | clustering | N/A | N/A | clinical | Hamilton et al, 1999
colorectal | recurrence | ANN | statistics | 9 | mixed | Singson et al, 1999
colorectal | survivability | ANN | expert | 11 | clinical | Bottaci et al, 1997
esophageal | treatment response | SVM | N/A | N/A | proteomic | Hayashida et al, 2005
esophageal | survivability | ANN | statistics | 3 | clinical | Sato et al, 2005
leukemia | recurrence | decision tree | N/A | N/A | proteomic | Masic et al, 1998
liver | recurrence | ANN | statistics | 25 | genomic | Rodriguez-Luna et al, 2005
liver | recurrence | SVM | N/A | N/A | genomic | Iizuka et al, 2003
liver | susceptibility | ANN | statistics | -2 | clinical | Kim et al, 2003
liver | survivability | ANN | N/A | N/A | clinical | Hamamoto et al, 1995
lung | survivability | ANN | N/A | N/A | clinical | Santos-Garcia et al, 2004
lung | survivability | ANN | statistics | 9 | mixed | Hanai et al, 2003
lung | survivability | ANN | N/A | N/A | mixed | Hsia et al, 2003
lung | survivability | ANN | statistics | N/A | mixed | Marchevsky et al, 1998
lung | survivability | ANN | N/A | N/A | clinical | Jefferson et al, 1997
lymphoma | survivability | ANN | statistics | 22 | genomic | Ando et al, 2003
lymphoma | survivability | ANN | expert | 10 | mixed | Futschik et al, 2003
lymphoma | survivability | ANN | N/A | N/A | genomic | O'Neill and Song, 2003
lymphoma | survivability | ANN | expert | N/A | genomic | Ando et al, 2002
lymphoma | survivability | clustering | N/A | N/A | genomic | Shipp et al, 2002
head/neck | survivability | ANN | statistics | 11 | clinical | Bryce et al, 1998
neck | treatment response | ANN | N/A | N/A | clinical | Drago et al, 2002
ocular | survivability | SVM | N/A | N/A | genomic | Ehlers and Harbour, 2005
osteosarcoma | treatment response | SVM | N/A | N/A | genomic | Man et al, 2005
pleural mesothelioma | survivability | clustering | N/A | N/A | genomic | Pass et al, 2004
prostate | treatment response | ANN | N/A | N/A | mixed | Michael et al, 2005
prostate | recurrence | ANN | statistics | 0 | clinical | Porter et al, 2005
prostate | treatment response | ANN | N/A | N/A | clinical | Gulliford et al, 2004
prostate | recurrence | ANN | statistics | 16 | mixed | Poulakis et al, 2004a
prostate | recurrence | ANN | statistics | 11 | mixed | Poulakis et al, 2004b
prostate | recurrence | SVM | statistics | 6 | clinical | Teverovskiy et al, 2004
prostate | recurrence | ANN | statistics | 0 | clinical | Kattan, 2003
prostate | recurrence | genetic algorithm | N/A | N/A | mixed | Tewari et al, 2001
prostate | recurrence | ANN | statistics | 0 | clinical | Ziada et al, 2001
prostate | susceptibility | decision tree | N/A | N/A | clinical | Crawford et al, 2000
prostate | recurrence | ANN | statistics | 13 | clinical | Han et al, 2000
prostate | treatment response | ANN | N/A | N/A | proteomic | Murphy et al, 2000
prostate | recurrence | naïve Bayes | statistics | 1 | clinical | Zupan et al, 2000
prostate | recurrence | ANN | N/A | N/A | clinical | Mattfeldt et al, 1999
prostate | recurrence | ANN | statistics | 17 | clinical | Potter et al, 1999
prostate | recurrence | ANN | N/A | N/A | mixed | Naguib et al, 1998
skin | survivability | ANN | expert | 14 | clinical | Kaiserman et al, 2005
skin | recurrence | ANN | expert | 27 | proteomic | Mian et al, 2005
skin | survivability | ANN | expert | 0 | clinical | Taktak et al, 2004
skin | survivability | genetic algorithm | N/A | N/A | clinical | Sierra and Larranga, 1998
stomach | recurrence | ANN | expert | 28 | clinical | Bollschweiler et al, 2004
throat | recurrence | fuzzy logic | N/A | N/A | clinical | Nagata et al, 2005
throat | recurrence | ANN | statistics | 0 | genomic | Kan et al, 2004
throat | survivability | decision tree | statistics | N/A | proteomic | Seiwerth et al, 2000
thoracic | treatment response | ANN | N/A | N/A | clinical | Su et al, 2005
thyroid | survivability | decision tree | statistics | N/A | clinical | Kukar et al, 1997
trophoblastic | survivability | genetic algorithm | N/A | N/A | clinical | Marvin et al, 1999
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2675494/
Publications relevant to ML methods used for cancer susceptibility prediction
Publication | Method | Cancer type | No of patients | Type of data | Accuracy | Validation method | Important features
Ayer T et al. [19] | ANN | Breast cancer | 62,219 | Mammographic, demographic | AUC = 0.965 | 10-fold cross validation | Age, mammography findings
Waddell M et al. [44] | SVM | Multiple myeloma | 80 | SNPs | 71% | Leave-one-out cross validation | snp739514, snp521522, snp994532
Listgarten J et al. [45] | SVM | Breast cancer | 174 | SNPs | 69% | 20-fold cross validation | snpCY11B2 (+) 4536 T/C, snpCYP1B1 (+) 4328 C/G
Stajadinovic et al. [46] | BN | Colon carcinomatosis | 53 | Clinical, pathologic | AUC = 0.71 | Cross-validation | Primary tumor histology, nodal staging, extent of peritoneal cancer
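The validation methods listed above (10-fold, 20-fold, and leave-one-out cross validation) all follow the same splitting scheme. The sketch below shows that scheme in plain Python. The function name is an assumption for this example, and the samples are split in their given order, whereas in practice they are usually shuffled (and often stratified by class) first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k-fold cross validation.

    Every sample is used for testing exactly once; k == n_samples gives
    leave-one-out cross validation.
    """
    indices = list(range(n_samples))
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# 53 samples (the smallest cohort listed above) split into 10 folds.
folds = list(k_fold_indices(53, 10))
print([len(test) for _, test in folds])
```

A model is trained on each `train` list and scored on the matching `test` list, and the reported figure is the average over folds. With small cohorts, leave-one-out uses nearly all of the data for training in every round.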
Publications relevant to ML methods used for cancer recurrence prediction
Publication | ML method | Cancer type | No of patients | Type of data | Accuracy | Validation method | Important features
Exarchos K et al. [24] | BN | Oral cancer | 86 | Clinical, imaging tissue genomic, blood genomic | 100% | 10-fold cross validation | Smoker, p53 stain, extra-tumor spreading, TCAM, SOD2
Kim W et al. [47] | SVM | Breast cancer | 679 | Clinical, pathologic, epidemiologic | 89% | Hold-out | Local invasion of tumor
Park C et al. [48] | Graph-based SSL algorithm | Colon cancer, breast cancer | 437, 374 | Gene expression, PPIs | 76.7%, 80.7% | 10-fold cross validation | BRCA1, CCND1, STAT1, CCNB1
Tseng C-J et al. [49] | SVM | Cervical cancer | 168 | Clinical, pathologic | 68% | Hold-out | pathologic_S, pathologic_T, cell type RT target summary
Eshlaghy A et al. [34] | SVM | Breast cancer | 547 | Clinical, population | 95% | 10-fold cross validation | Age at diagnosis, age at menarche
Publications relevant to ML methods used for cancer survival prediction
http://www.sciencedirect.com/science/article/pii/S2001037014000464
Publication | ML method | Cancer type | No of patients | Type of data | Accuracy | Validation method | Important features
Chen Y-C et al. [50] | ANN | Lung cancer | 440 | Clinical, gene expression | 83.50% | Cross validation | Sex, age, T_stage, N_stage, LCK and ERBB2 genes
Park K et al. [26] | Graph-based SSL algorithm | Breast cancer | 162,500 | SEER | 71% | 5-fold cross validation | Tumor size, age at diagnosis, number of nodes
Chang S-W et al. [32] | SVM | Oral cancer | 31 | Clinical, genomic | 75% | Cross validation | Drink, invasion, p63 gene
Xu X et al. [51] | SVM | Breast cancer | 295 | Genomic | 97% | Leave-one-out cross validation | 50-gene signature
Gevaert O et al. [52] | BN | Breast cancer | 97 | Clinical, microarray | AUC = 0.851 | Hold-Out | Age, angioinvasion, grade, MMP9, HRASLA and RAB27B genes
Rosado P et al. [53] | SVM | Oral cancer | 69 | Clinical, molecular | 98% | Cross validation | TNM_stage, number of recurrences
Delen D et al. [54] | DT | Breast cancer | 200,000 | SEER | 93% | Cross validation | Age at diagnosis, tumor size, number of nodes, histology
Kim J et al. [36] | SSL Co-training algorithm | Breast cancer | 162,500 | SEER | 76% | 5-fold cross validation | Age at diagnosis, tumor size, number of nodes, extension of tumor
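The tables above report performance either as plain accuracy or as AUC (area under the ROC curve), and the two are not directly comparable. The sketch below computes both from the same predictions. The labels and scores are made up for illustration, and the function names are assumptions for this example.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples classified correctly at a fixed threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Made-up labels (1 = event) and model scores, for illustration only.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
print(roc_auc(labels, scores), accuracy(labels, scores))
```

Accuracy depends on the chosen threshold and on class balance, while AUC summarizes ranking quality across all thresholds. This is one reason the figures reported by different studies should be compared with care.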
Cancer Databases
Databases for oncogenomic research are biological databases dedicated to cancer data and
oncogenomic research. They can be a primary source of cancer data, offer a certain level of
analysis (processed data) or even offer online data mining.
https://en.wikipedia.org/wiki/List_of_databases_for_oncogenomic_research
The National Cancer Database
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2234447/
SEER Cancer Research Database
https://healthcaredelivery.cancer.gov/seermedicare
ESMO-European Society for Medical Oncology
http://www.esmo.org/Research/Research-Groups-Databases-and-Tools
NCRI Cancer research database
http://www.ncri.org.uk/what-we-do/research-database/
International/National Research groups by cancer type and topic
http://www.esmo.org/Research/Research-Groups-Databases-and-Tools/Cancer-research-groups-by-type
Cancer Research Groups in Finland
https://www.docrates.com/en/treatments/patient-satisfaction/finland-leading-country-in-cancer-care/
Finnish Cancer Registry
http://www.cancer.fi/syoparekisteri/en/research/
Finnish Center of Excellence in Cancer Genetics Research
http://www.helsinki.fi/coe/
University of Helsinki
http://research.med.helsinki.fi/cancerbio/Research/Index.html
University of Eastern Finland
https://www.sciencedaily.com/releases/2016/06/160621094243.htm
University of Tampere
http://www.uta.fi/bmt/institute/research/nykter/index.html
University of Oulu
http://www.oulu.fi/medicine/research-groups/cancer-research-and-translational-medicine-research-unit
http://www.oulu.fi/biocenter/research