Big Data in Biomedicine
Yuri Nikolsky, Director for Science, Biomed Cluster
DNA sequencing timeline
• 1953: DNA structure established by James Watson and Francis Crick
  - 4 “letters”: A (adenine), G (guanine), T (thymine), C (cytosine)
• 1961: Genetic code for protein synthesis cracked by Marshall Nirenberg
• 1977: Rapid DNA sequencing by Frederick Sanger
• 1983: First genetic disease mapped: Huntington’s disease
• 1986: Automated DNA sequencer invented by Lee Hood at Caltech
• 1990: Human Genome Project begins
• 1998: Next-generation sequencing (“NGS”) methods invented
• 2001: Draft of the human genome published by Celera and NIH
• 2014:
  - Over 1m genomes sequenced
  - BGI is the market leader in services
  - Clinical sequencing is becoming mainstream
Cost of sequencing vs. # individuals sequenced
Democratization of DNA sequencing
Heather Dewey-Hagborg, Artist, Ph.D. Student
http://biogenfutur.es/
What’s wrong with knowing your genome sequence?
BRAF mutations in melanoma
Wagle N et al. JCO 2011;29:3085-3096
©2011 by American Society of Clinical Oncology
“NATURE OF THE ACTION”
1. This proposed class action alleges that 23andMe, Inc. (“Defendant”) falsely and misleadingly advertises their Saliva Collection Kit/Personal Genome Service (“PGS”) as providing “health reports on 240+ conditions and traits”, “drug response”, “carrier status”, among other things, when there is no analytical or clinical validation for the PGS for its advertised uses.
2. In addition, Defendant uses the information it collects from the DNA tests consumers pay to take to generate databases and statistical information that it then markets to other sources and the scientific community in general, even though the test results are meaningless.
3. Despite Defendant’s failure to receive marketing authorization or approval from the Food and Drug Administration (“FDA”), Defendant has slowly increased its list of indications for the PGS, and initiated new marketing campaigns, including television advertisements in violation of the Federal Food, Drug and Cosmetic Act (“FDC Act”).
Timeline of sequencing applications
Geographic Information System (GIS) of a human being
Pharmacogenomics
• Most drugs don’t work in most patients
• Variants to assure efficacy or avoid side effects. Over 100 drugs with FDA genetic labels
• Huge number of alleles for individual drug response
• Typical odds ratio is 3-40 fold; much higher than for diseases. This is probably due to direct selection by a drug
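For readers unfamiliar with the odds ratio quoted above, here is a minimal sketch of how it is computed from a 2x2 table of variant carriers versus non-carriers. The function name and all counts are hypothetical and chosen only for illustration; they are not data from this talk.

```python
def odds_ratio(carriers_with_effect, carriers_without_effect,
               noncarriers_with_effect, noncarriers_without_effect):
    """Odds ratio of a 2x2 table: (a / b) / (c / d) = (a * d) / (b * c)."""
    if carriers_without_effect == 0 or noncarriers_with_effect == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    return (carriers_with_effect * noncarriers_without_effect) / (
        carriers_without_effect * noncarriers_with_effect
    )


# Hypothetical counts (assumption, not from the slides):
# 30 of 40 carriers show the drug effect, 20 of 100 non-carriers do.
print(odds_ratio(30, 10, 20, 80))  # 12.0
```

An odds ratio of 12 means the odds of the drug effect are twelve times higher in carriers than in non-carriers, which falls inside the 3-40 fold range mentioned on the slide.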
Problem: Companion biomarkers and patient stratification for clinical trials and therapy
[Diagram labels: Patient A; Molecular data; Biomarker Panel Content; Stratification Tool; Responder; Non-responder]
- New opportunity: clinical sequencing for biomarkers, which was not available until 2014
- Not only big pharma, but ALL entities in drug development (small biotechs, NIH, foundations, universities) must do patient stratification in trials
New opportunity: market for companion biomarkers
ClinicalTrials.gov (US government clinical trial registry)
• 12,300 drugs in clinical development (Integrity, Thomson Reuters, 2013)
• > $500B-1T in total development costs at stake
• 10% improvement to predict drug failure saves $100M per drug (FDA)
Pharmacogenomics
Pharmacogenetics & Pain Management
CYP2D6 Gene Enzyme
Codeine, <2 copies: LOW MORPHINE, LOW RELIEF
Codeine, >2 copies: HIGH MORPHINE, HIGH TOXICITY RISK
Frequencies of CYP2D6 Alleles in major race/ethnic Groups
Allele     African Amer.  Caucasian  Middle Eastern  East Asian  Americas  Oceanian
1 copy     0.4%           0.7%       3.8%            0.29        0.8%      11.8%
4 copies   2.1%           0.2%       0%              0%          0.5%      0%
Crews et al. Clinical Pharmacogenetics Implementation Consortium (CPIC) Guidelines for Codeine Therapy in the Context of CYP2D6 Genotype. Clinical Pharmacology & Therapeutics, vol. 91, no. 2, Feb 2012.
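The copy-number rule on the codeine slide can be sketched as a small lookup. This is only an illustration of the two cases shown there; the function name is hypothetical, and the behaviour for exactly 2 copies is an assumption, since the slide does not state it. It is not clinical guidance.

```python
def codeine_response(cyp2d6_copies: int) -> str:
    """Map CYP2D6 gene copy number to the outcome shown on the slide."""
    if cyp2d6_copies < 0:
        raise ValueError("copy number cannot be negative")
    if cyp2d6_copies < 2:
        # Slide: fewer than 2 copies, little codeine is converted to morphine.
        return "low morphine, low relief"
    if cyp2d6_copies > 2:
        # Slide: more than 2 copies, excess morphine is produced.
        return "high morphine, high toxicity risk"
    # Assumption: exactly 2 copies is treated as the ordinary case.
    return "typical response (assumed; not stated on the slide)"


for copies in (1, 2, 4):
    print(copies, "->", codeine_response(copies))
```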
Genotyping for defining the origin
• Cheaper and faster than whole genome sequencing
• Can be used for identification of risk group
[Figure legend: True / Predicted]
SPA algorithm for human origin (23andMe)
Modeling Admixture (Genus LLC)
Earlier studies by National Genographic (2012) demonstrated that there are K=9 population clusters.
Nine reference populations:
• North East Asian, Mediterranean, South African
• South West Asian, Native American, Oceanian
• South East Asian, Northern European, Sub-Saharan African
ADMIXTURE tool in a supervised mode: assigns similarity to K known populations based on their allele frequencies.
Data: Combination of National Genographic and 1000 genomes populations (a total of 54 populations).
D.H. Alexander, J. Novembre, and K. Lange. Fast model-based estimation of ancestry in unrelated individuals. Genome Research, 19:1655–1664, 2009.
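To make the supervised idea concrete, here is a minimal sketch of estimating one individual's ancestry proportions over K reference populations whose allele frequencies are held fixed. It uses a plain EM update for the binomial admixture likelihood, which is a simplification and not the optimizer ADMIXTURE itself uses. The function name, the toy frequencies, and the genotype are all hypothetical.

```python
def estimate_ancestry(genotype, ref_freqs, iterations=200, eps=1e-6):
    """Estimate ancestry proportions q over K fixed reference populations.

    genotype  -- per-SNP counts (0, 1 or 2) of the counted allele
    ref_freqs -- K lists, each giving that allele's frequency per SNP
    """
    k, m = len(ref_freqs), len(genotype)
    # Clamp frequencies away from 0 and 1 so no denominator is ever zero.
    f = [[min(max(p, eps), 1.0 - eps) for p in pop] for pop in ref_freqs]
    q = [1.0 / k] * k  # start from equal proportions
    for _ in range(iterations):
        new_q = [0.0] * k
        for j, g in enumerate(genotype):
            p_alt = sum(q[i] * f[i][j] for i in range(k))
            p_ref = sum(q[i] * (1.0 - f[i][j]) for i in range(k))
            for i in range(k):
                # Expected number of this individual's two alleles at SNP j
                # that were drawn from population i.
                new_q[i] += (g * q[i] * f[i][j] / p_alt
                             + (2 - g) * q[i] * (1.0 - f[i][j]) / p_ref)
        q = [v / (2 * m) for v in new_q]  # 2 alleles per SNP
    return q


# Toy example (hypothetical values): two reference populations, three SNPs.
reference = [[0.9, 0.1, 0.8],   # population A
             [0.1, 0.9, 0.2]]   # population B
print(estimate_ancestry([2, 0, 2], reference))
```

In this toy case the genotype matches population A at every SNP, so nearly all of the weight ends up on the first proportion. In practice K would be 9 and the reference frequencies would come from the 54 populations mentioned above.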
[Figure legend: True / Predicted; methods compared: GPS, SPA]
Genomic and post-genomic areas of the BMT Cluster
Genomics and bioinformatics companies at Skolkovo
• BioSoft (БиоСофт): computational system for predicting gene activity
• Genotek (Генотек): genotyping, medical genomics
• Knomics (Кномикс): system for interpreting microbiota data
• ПОНКЦ: individualized combinations for cancer patients
• RosGenDiagnostika (РосГенДиагностика): mutation analysis for cancer patients
• ПарСек: prenatal tests for trisomy and other congenital disorders
• Genus (Генус): ancestry determination, population genotyping
Big Data in Biomedicine 2015
• Competition for projects in bioinformatics, omics biology, and translational medicine
  - Algorithms and software
  - Data analysis and storage
  - Biomarkers and target discovery
• Joint competition between the BMT and IT clusters
• Applications accepted from March 1, 2015; final on July 16