Page 1: Genome Big Data

Genome Big Data

Adrián Báez, 16/06/2014

Page 2: Genome Big Data

DNA

Genes

Proteins

Genome

Genomics

Biomedicine

Page 3: Genome Big Data

Sequencing

Assembly

Page 4: Genome Big Data

Fragments

Reads

FASTQ file

Genome sequencing
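
As an illustration of what the reads in a FASTQ file look like to a program, here is a minimal sketch of a reader. It assumes the common four-lines-per-record layout (a header starting with "@", the sequence, a separator starting with "+", and a quality string of the same length as the sequence) and the Phred+33 quality encoding. The names `FastqRecord`, `read_fastq` and `phred_scores`, as well as the two example reads, are hypothetical and not taken from the slides.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    """One read: its identifier, its bases and its per-base quality string."""
    name: str
    sequence: str
    quality: str


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ stream, assuming four lines per record."""
    while True:
        header = handle.readline()
        if not header:  # empty string means end of file
            return
        header = header.rstrip("\r\n")
        if not header:  # tolerate blank lines between records
            continue
        sequence = handle.readline().rstrip("\r\n")
        separator = handle.readline().rstrip("\r\n")
        quality = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode a quality string into Phred scores (assumes Phred+33)."""
    return [ord(char) - offset for char in quality]


# Two made-up reads, only to exercise the reader.
EXAMPLE = "@read1\nGATTACA\n+\nIIIIIII\n@read2\nACGT\n+\n!!!!\n"
records = list(read_fastq(io.StringIO(EXAMPLE)))

for record in records:
    print(record.name, record.sequence, phred_scores(record.quality))
```

Reading one record at a time keeps memory use flat regardless of file size, which matters once read files grow from megabytes to gigabytes.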

Page 5: Genome Big Data

2003: Human Genome Project ends (1990-2003), 2.7 billion dollars

2014: Illumina launches HiSeq X Ten, 1000 dollars/genome

“Forty such machines would be able to sequence more genomes in one year than had been produced by all other sequencers to date.”

Genome sequencing

Page 6: Genome Big Data

Reads: MB ~ GB (HDD)

×20

Assembly, intermediate data structures: GB ~ TB (RAM)

÷400

Original sequence: MB ~ GB (HDD)

Organism            Reads     Assembly (RAM)   Result
Escherichia coli    82.4 MB   1.64 GB          3.8 MB
Trypanosoma cruzi   1 GB      13.75 GB         38.6 MB
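
The figures above can be turned into expansion factors with a few lines of arithmetic. The sketch below only reuses the sizes quoted on this slide and assumes decimal units (1 GB = 1000 MB); the names `FOOTPRINTS_MB` and `expansion_factors` are hypothetical. For Escherichia coli the assembly RAM comes out at roughly 20 times the reads and roughly 430 times the final sequence, which appears to be what the slide's ×20 and ÷400 annotations refer to.

```python
GB = 1000.0  # assumption: decimal units, so 1 GB = 1000 MB

# Sizes in MB, copied from the slide.
FOOTPRINTS_MB = {
    "Escherichia coli": {"reads": 82.4, "assembly_ram": 1.64 * GB, "result": 3.8},
    "Trypanosoma cruzi": {"reads": 1 * GB, "assembly_ram": 13.75 * GB, "result": 38.6},
}


def expansion_factors(sizes):
    """Return how much larger the in-RAM assembly is than its input and output."""
    return {
        "ram_over_reads": sizes["assembly_ram"] / sizes["reads"],
        "ram_over_result": sizes["assembly_ram"] / sizes["result"],
    }


FACTORS = {name: expansion_factors(sizes) for name, sizes in FOOTPRINTS_MB.items()}

for name, factors in FACTORS.items():
    print(
        f"{name}: RAM is {factors['ram_over_reads']:.1f}x the reads "
        f"and {factors['ram_over_result']:.0f}x the assembled sequence"
    )
```

The two organisms give different factors, so the ×20 and ÷400 figures are best read as orders of magnitude rather than constants.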

Genome assembly

Page 7: Genome Big Data

Instituto Universitario de Enfermedades Tropicales y Salud Pública de Canarias

Current system: Web assembly and analysis

Page 8: Genome Big Data

Future work: Big Data solutions

Instituto Universitario de Enfermedades Tropicales y Salud Pública de Canarias

Page 9: Genome Big Data

Data transfer

Biotorrents

Implementing Big Data

Security and privacy

Advanced encryption algorithms

Custom hardware solutions instead of cloud computing

Consent forms to share personal genome data

Data storage

Lack of an integral, economic and safe solution

Page 10: Genome Big Data

Sequencing/assembly projects

Google Scholar: papers that mention genome sequencing or assembly

Human Genome Project

Cancer Genome Project

Pine Genome Project

Dog Genome Project

Pediatric Cancer Genome Project

Bovine Genome Project

Mammoth Genome Project

Pear Genome Project

Fugu Genome Project

Page 11: Genome Big Data

Thanks for your attention

