Initial sequencing and analysis of the human genome
Averya Johnson, Nick Patrick, Aaron Lerner, Joel Burrill
Computer Science 4G
October 18, 2005


How and why the Human Genome Project was started

A dynamic interplay of goals and aspirations initially drove scientists to undertake the monumental task of sequencing the human genome.

Planning the project

• Early 1980s: scientists realized what a global genome project would require and could accomplish: it could accelerate biomedical research, but it would require global cooperation

• 1984-1986: first discussions about the idea of sequencing the entire human genome

• 1988: the U.S. endorses the idea but recognizes that the project must encompass several things: creation of genetic, physical, and sequence maps of the genome; development of new genetic technology to support the program; and research into ethical, legal, and social issues

Progress of the project

• Early 1990s: groups began to collaborate and sequence, running pilot projects to determine whether the overall project was feasible

• 1995: construction of genetic and physical maps of the human and mouse genomes; sequencing of the yeast and worm genomes

• Late 1990s: Human Genome Organization created to coordinate efforts

• October 7, 2000: human genome draft sequence released

[Photo: J. Craig Venter, head of Celera, and Francis Collins, head of the Human Genome Project]

How the human genome was sequenced and the technologies involved

Many technologies were vital to sequencing human genetic data on a large scale during the Human Genome Project. These technologies, together with group collaboration and effective execution of their applications, were essential to making the project a success.

Technologies

• Whole-genome shotgun sequencing vs. hierarchical shotgun sequencing: the project decided to use the hierarchical technique (Celera, however, used both techniques)

• Technologies for gathering data and improving its quality: fluorescence-based sequence detection, specially designed polymerases, gel electrophoresis

• Automated sequencing techniques: automated, faster, standardized sequencing algorithms

Generating sequence data

1. Cloning selected genome sequences
2. Sequencing the clones using hierarchical shotgun sequencing
3. Assembling sequenced clones into an overall, finished sequence
    1. Filtering: eliminate contaminated segments
    2. Layout: associate sequences with locations on a physical genomic map
    3. Merging: order, orient, and connect overlapping sequences using computer algorithms
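To make the filtering and merging steps more concrete, here is a minimal Python sketch. It is not the project's actual assembly software: the function names (`filter_contaminated`, `overlap`, `greedy_merge`), the greedy strategy, the exact-match overlap rule, and the toy reads are all assumptions chosen for illustration. Real assemblers also handle sequencing errors, reverse-complement reads, and the layout step, which needs a physical map and is omitted here.

```python
from typing import List


def filter_contaminated(fragments: List[str], contaminant: str) -> List[str]:
    """Filtering (toy version): drop fragments that contain a known contaminant sequence."""
    return [f for f in fragments if contaminant not in f]


def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that exactly matches a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_merge(fragments: List[str], min_len: int = 3) -> List[str]:
    """Merging (toy version): repeatedly join the pair with the largest overlap."""
    # Remove exact duplicates (keeping order) and fragments contained in another.
    frags = list(dict.fromkeys(fragments))
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]

    while len(frags) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # no remaining overlaps: the result stays in several pieces
        merged = frags[best_i] + frags[best_j][best_len:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Hypothetical reads; "GGGG" stands in for a contaminant such as vector sequence.
reads = ["ATGCGT", "CGTACC", "ACCTTG", "TTGGGGAA"]
clean = filter_contaminated(reads, "GGGG")
print(greedy_merge(clean))  # ['ATGCGTACCTTG']
```

When no overlaps remain, the function returns several separate pieces rather than one sequence, which is a small-scale analogue of the gaps left in a draft assembly.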

Group collaboration

• Important principles related to data sharing
    1. Global effort: collaboration open to any sequencing center from any nation
    2. Public, rapidly released data: all data released rapidly into public databases accessible to all groups involved in the project

• Collaboration was extremely important and efficient
• Sequence data was generated all around the world at different rates using different techniques
• However, the data could be directly integrated because of standardized analysis procedures and rapidly released, readily available data

The result: a draft sequence

• Integrated draft sequence of the human genome released on October 7, 2000

• Important to note that this is a draft sequence: it still contains errors and gaps

• A work in progress: data is still being added, the physical genomic map is being improved, and new clones are being sequenced to close the gaps and reduce errors

What scientists learned from the Human Genome Project

Scientists have been able to draw many conclusions from the sequence data gathered by the Human Genome Project, including direct conclusions about how different aspects of the sequence influence genes and human development.

Patterns in the human genome sequence

Variation in GC content: why do some regions of the genome have a higher GC content while others have a lower one?

CpG islands: similar to GC content in that there are regions where the CpG dinucleotide occurs much more frequently than elsewhere in the genome
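Both patterns can be measured directly from sequence text. The sketch below is an illustration rather than the project's analysis pipeline: the function names, the non-overlapping window scheme, and the toy sequences are assumptions. The observed/expected CpG ratio uses the commonly cited form (count of CG × length) / (count of C × count of G); the cutoffs used to call a region a CpG island vary between studies and are left out.

```python
from typing import List


def gc_content(seq: str) -> float:
    """Fraction of bases in `seq` that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq: str, window: int) -> List[float]:
    """GC content of consecutive, non-overlapping windows of `window` bases.

    A trailing partial window is ignored.
    """
    return [
        gc_content(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, window)
    ]


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (count of CG * length) / (count of C * count of G).

    Returns 0.0 when the sequence has no C or no G, where the ratio is undefined.
    """
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


# Toy sequence: an AT-rich stretch followed by a GC-rich, CpG-dense stretch.
toy = "ATATTAAT" + "CGCGGCGC"
print(windowed_gc(toy, 8))     # per-window GC content
print(cpg_obs_exp(toy[8:]))    # CpG ratio of the GC-rich half
```

Scanning a long sequence with `windowed_gc` shows how GC content varies from region to region, and applying `cpg_obs_exp` to each window highlights stretches where CpG is unusually frequent.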

Repeat content of the human genome

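Transposon-derived repeats: about 45% of the human genome is composed of various transposable elements

Age distribution: transposable elements can be analyzed to estimate their age with reasonable accuracy, for example by measuring how far each copy has diverged from the consensus sequence of its family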
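One common way to estimate the age of a repeat copy is to measure how much it has diverged from the consensus sequence of its family, since older copies have had more time to accumulate mutations. The sketch below illustrates that idea only. It assumes the two sequences are already aligned and of equal length, uses the Jukes-Cantor correction as a simple stand-in for whatever substitution model a real analysis would use, and takes the substitution rate as a caller-supplied parameter, because no rate is given in these slides. All function names are hypothetical.

```python
import math


def p_distance(copy: str, consensus: str) -> float:
    """Fraction of aligned positions that differ, ignoring positions with a gap ('-')."""
    if len(copy) != len(consensus):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(copy.upper(), consensus.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no comparable positions")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance: d = -3/4 * ln(1 - 4p/3)."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def estimated_age(distance: float, rate_per_site_per_year: float) -> float:
    """Rough age in years: substitutions per site divided by the assumed yearly rate."""
    return distance / rate_per_site_per_year


# Toy aligned sequences differing at 1 of 10 positions.
p = p_distance("ACGTACGTAC", "ACGTACGTAA")
d = jukes_cantor(p)
# The rate passed here is a placeholder for illustration, not a measured value.
print(p, d, estimated_age(d, 1e-9))
```

The estimate is only as good as the assumed rate and the consensus sequence, which is why such ages are approximate.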

Repeat content of the human genome

Comparison with other organisms: three distinct differences were found when the transposable elements of other sequenced genomes were compared with those of the human genome

Distribution of transposable elements: like GC content, transposable elements occur more frequently in some portions of the genome than in others

Gene content of the human genome

Non-coding RNAs: there are four major groups of non-coding RNAs

Protein-coding genes: identifying them was one of the more difficult parts of the project, but also one of the most important

Applications in medicine and the future of human genome research

The Human Genome Project was not just about producing a nucleotide sequence; the data has many real-world applications. And although the human genome has largely been sequenced, scientists still have a long way to go in understanding and finalizing the draft sequence.

Applications in medicine and biology

• Identifying disease genes: will allow more rapid identification of susceptibility to a disease

• Finding drug targets: will help us understand how diseases work within the body and develop personalized medicine and better treatments

• Applications to basic biology: will allow us to understand more fully how body processes work

The future of human genome research

What is still left to do?

• Finish the sequence: the data still contains gaps and errors

• Identify all genes and proteins: much is still unknown about the genes in the human genome and the proteins they produce

• Sequence other genomes: conclusions about the human genome can be drawn by comparing it with the genomes of other organisms

• Understand the function of sequences: scientists still have much to figure out about what sequences code for and how they work