
BIOINFORMATICS


DESCRIPTION

BIOINFORMATICS. Instructor: Dr. Towhidkhah. Prepared by: Reza Safari (85233515). DEFINITION. Any use of computers to handle biological information (T. K. Attwood, …, Introduction to Bioinformatics, 1999). Under this definition, topics such as medical imaging, image analysis, AI, and neural networks are part of bioinformatics.


Page 1: BIOINFORMATICS

BIOINFORMATICS

Instructor: Dr. Towhidkhah

Prepared by: Reza Safari (85233515)

Page 2: BIOINFORMATICS

DEFINITION

Any use of computers to handle biological information (T. K. Attwood, …, Introduction to Bioinformatics, 1999).

Under this definition, topics such as medical imaging, image analysis, AI, and neural networks are part of bioinformatics.

In practice, the term means using computers to characterize the molecular components of living things (computational molecular biology).

Fredj Tekaia (Institut Pasteur): The mathematical, statistical, and computing methods that aim to solve biological problems using DNA and amino acid sequences and related information.

Page 3: BIOINFORMATICS

Definition…

The term bioinformatics is relatively new; it entered the literature in 1991.

In the 1960s, efforts were made to build databases, develop algorithms, and make biological discoveries through sequence analysis. This work was called molecular evolution.

Constituent elements of bioinformatics:

Biology

Computer science (computational biology)

Mathematics (biomathematics)

Informatics

Statistics

Page 4: BIOINFORMATICS

Bioinformatics vs. computational biology

Bioinformatics is concerned with the information.

Computational biology is concerned with the hypotheses.

Bioinformatics is also often described as an applied subfield of the more general discipline of biomedical informatics.

Page 5: BIOINFORMATICS

[Diagram labels: tool-users; tool-makers; bioinformatics; public health informatics; medical informatics; infrastructure; databases; algorithms]

Page 6: BIOINFORMATICS


Page 7: BIOINFORMATICS

Why did bioinformatics emerge?

Laboratory research is time-consuming and costly.

Biological data have grown explosively over the past few decades.

The volume of data doubles every 15 months.

A single genetics laboratory produces 100 gigabytes of data per day.

Such large data sets need computer databases so that the data can be stored, categorized, and indexed, and need tools that make the data easy to access and analyse.

At the start of the genomic revolution, special attention was paid to generating and maintaining data in order to store biological information (amino acid and nucleotide sequences).

Notable advances in computer technology (CPU, disk storage, internet).
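The 15-month doubling time quoted on this slide implies exponential growth. A minimal sketch of the arithmetic, assuming the doubling time stays constant over the whole period (the function name `growth_factor` is illustrative and not part of the original slides):

```python
def growth_factor(months: float, doubling_months: float = 15.0) -> float:
    """Factor by which a quantity grows after `months`,
    assuming it doubles every `doubling_months` (constant rate)."""
    return 2.0 ** (months / doubling_months)


if __name__ == "__main__":
    # Print the compounded growth for a few time spans.
    for years in (1, 5, 10):
        print(f"{years:>2} years: x{growth_factor(12 * years):.1f}")
```

Under that assumption, ten years (120 months) is eight doublings, so the data volume grows by a factor of 2^8 = 256.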

Page 8: BIOINFORMATICS

Growth of GenBank

[Figure: base pairs of DNA (billions) and sequences (millions) by year, 1982 to 2002. Updated 8-12-04: >40b base pairs.]

Page 9: BIOINFORMATICS

Central dogma of molecular biology: DNA → RNA → protein

Central dogma of bioinformatics and genomics: genome → transcriptome → proteome
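The DNA, RNA, protein flow on this slide can be sketched in a few lines. This is an illustrative toy, not a method from the slides: it assumes the input is the coding (sense) strand, uses the standard genetic code, and the function names `transcribe` and `translate` are made up for the example.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def transcribe(dna: str) -> str:
    """DNA coding strand -> mRNA (T replaced by U)."""
    return dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """mRNA -> protein, reading frame 0 and stopping at the first stop codon."""
    coding = mrna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(coding) - 2, 3):
        aa = CODON_TABLE[coding[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)
```

For example, `translate(transcribe("ATGGCCTAA"))` reads the codons ATG, GCC, TAA and yields the two-residue peptide `MA`.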

Page 10: BIOINFORMATICS

Aims of Bioinformatics

1. Biological databases

A database is a large, organized body of persistent data, usually associated with computerized software designed to update, query, and retrieve components of the data stored within the system.

Simple database: a simple file of records, each holding the same set of information.

Additional requirements: easy access, and a method for extracting only the information needed to answer a specific question.
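The idea of records with the same fields, plus a way to extract only what a question needs, can be sketched with Python's built-in `sqlite3` module. The table layout, the accession strings (`EX0001` and so on), and the function names are hypothetical and chosen only for illustration; real sequence databases are far richer.

```python
import sqlite3

# Hypothetical records: (accession, organism, sequence). Not real entries.
RECORDS = [
    ("EX0001", "Homo sapiens", "ATGGCC"),
    ("EX0002", "Mus musculus", "ATGGCA"),
    ("EX0003", "Homo sapiens", "ATGTTT"),
]


def build_db(records):
    """Create an in-memory database where every record has the same fields."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sequences ("
        "accession TEXT PRIMARY KEY, organism TEXT, sequence TEXT)"
    )
    conn.executemany("INSERT INTO sequences VALUES (?, ?, ?)", records)
    return conn


def accessions_for(conn, organism):
    """Extract only the information needed: accessions for one organism."""
    rows = conn.execute(
        "SELECT accession FROM sequences WHERE organism = ? ORDER BY accession",
        (organism,),
    )
    return [row[0] for row in rows]
```

A query such as `accessions_for(build_db(RECORDS), "Homo sapiens")` returns only the matching accessions instead of the whole file.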

Page 11: BIOINFORMATICS

There are three major public DNA databases

GenBank: housed at NCBI (National Center for Biotechnology Information)

EMBL: housed at EBI (European Bioinformatics Institute)

DDBJ: housed in Japan

Page 12: BIOINFORMATICS

List of URLs

Page 13: BIOINFORMATICS

NCBI (National Center for Biotechnology Information)

www.ncbi.nlm.nih.gov

Entrez: a unique search and retrieval system

Access to many databases

For example, the Entrez Protein DB cross-links to the Entrez Taxonomy DB (for finding taxonomic information on the species from which a protein sequence was derived).
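The protein-to-taxonomy cross-link can also be requested programmatically. The slides do not mention this; the sketch below assumes NCBI's E-utilities web interface and its `elink.fcgi` endpoint, and only builds the request URL without sending it. The protein ID used in the usage note is a placeholder, not a real record.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-utilities (an assumption; not named in the slides).
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def protein_to_taxonomy_url(protein_id: str) -> str:
    """Build an ELink URL asking for Taxonomy records linked to a Protein record."""
    params = {"dbfrom": "protein", "db": "taxonomy", "id": protein_id}
    return f"{EUTILS}/elink.fcgi?{urlencode(params)}"
```

Calling `protein_to_taxonomy_url("12345")` returns a URL that could be opened in a browser or fetched with any HTTP client.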

Page 14: BIOINFORMATICS

Entrez integrates…

• the scientific literature;
• DNA and protein sequence databases;
• 3D protein structure data;
• population study data sets;
• assemblies of complete genomes

Page 15: BIOINFORMATICS


Entrez is a search and retrieval system that integrates NCBI databases

Page 16: BIOINFORMATICS

Four ways to access DNA and protein sequences

[1] Entrez Gene with RefSeq
[2] UniGene
[3] European Bioinformatics Institute (EBI) and Ensembl (separate from NCBI)
[4] ExPASy Sequence Retrieval System (separate from NCBI)

Page 17: BIOINFORMATICS

2. Data analysis

The information in these databases is useless until analysed.

Bioinformatics tools can be used to obtain the sequences of genes or proteins.

Sequences can be analysed in many ways:

Assembling

Mapping

Comparing: a comparison of genes within a species or between different species can show similarities between protein functions or relationships between species (used to construct phylogenetic trees).

Phylogenetics: understanding the relationships between different kinds of life.
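The simplest form of sequence comparison is percent identity. The sketch below is illustrative only: it assumes the two sequences are already aligned, have equal length, and contain no gaps, whereas real comparisons first compute an alignment. The function name is made up for the example.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two pre-aligned sequences agree."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)
```

Computing this value for every pair of species gives a similarity matrix, which is one possible starting point for building a phylogenetic tree.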

Page 18: BIOINFORMATICS

Analysis of:

Gene expression: measuring mRNA levels with techniques such as EST and SAGE. The measurements are noise-prone, so statistical tools are developed to separate signal from noise. One application is in tumor cells.

Identifying genes that are expressed differentially in an affected cell provides a basis for explaining the cause of illness and highlights potential drug targets.
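A common first look at differential expression is the log2 fold change between affected and normal samples. The sketch below is a deliberately simple filter, not the statistical treatment the slide alludes to: the pseudocount, the threshold of 1.0 (a two-fold change), and all names and values are assumptions made for illustration.

```python
import math
from statistics import mean


def log2_fold_change(affected, normal, pseudocount=1.0):
    """log2 ratio of mean expression; the pseudocount avoids division by zero."""
    return math.log2((mean(affected) + pseudocount) / (mean(normal) + pseudocount))


def differential_genes(expression, threshold=1.0):
    """Names of genes whose |log2 fold change| reaches the threshold.

    `expression` maps gene name -> (affected values, normal values).
    """
    return sorted(
        gene
        for gene, (affected, normal) in expression.items()
        if abs(log2_fold_change(affected, normal)) >= threshold
    )
```

In practice, replicate measurements and a significance test would be needed before calling a gene differentially expressed, precisely because the data are noisy.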

Page 19: BIOINFORMATICS

Analysis of:

Regulation: complex events starting with an extracellular signal such as a hormone and leading to an increase or decrease in the activity of one or more proteins. Bioinformatics techniques have been applied to explore various steps in this process.

Protein expression: protein microarrays, high-throughput mass spectrometry (HT MS).

Mutations in cancer: point mutations. Detection methods measure several hundred thousand sites throughout the genome and generate terabytes of data per experiment.

Page 20: BIOINFORMATICS

Prediction of protein structure:

The amino acid sequence (primary structure) can be determined from the sequence of the gene that codes for it.

Prediction of secondary, tertiary, … protein structures.

Using homology to predict gene function: similar sequence suggests similar function.

Which parts of a protein are important in structure formation and in interaction with other proteins?

Homology modeling

Hemoglobin and leghemoglobin (same structure and function, different amino acids)

Page 21: BIOINFORMATICS

Comparative genomics:

Establishing the correspondence between genes (orthology analysis) or other genomic features.

Gene level (point mutation), chromosome level (duplication, lateral transfer, inversion, deletion, …), whole-genome level (hybridization, polyploidization, …)

RAPID SPECIATION

Page 22: BIOINFORMATICS

3. Evolutionary biology

The study of the origin and descent of species and their change over time.

New insight into the molecular basis of disease.

Investigating the function of homologs of a disease gene.

Homology: two genes sharing a common evolutionary history.

Finding evolutionary relationships between different forms of life.

Closely related organisms have similar sequences.

Protein family: proteins that show significant sequence similarity.

Protein folds: distinct protein building blocks.

Reconstruct the evolutionary relationship between two species.

Estimate the time of divergence.
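The slides do not say how divergence time is estimated. One textbook approach, shown here as an assumption rather than the author's method, corrects the observed fraction of differing sites with the Jukes-Cantor model and then applies a molecular clock, t = d / (2r). The substitution rate in the usage note is a hypothetical value.

```python
import math


def p_distance(a: str, b: str) -> float:
    """Observed fraction of differing sites between two pre-aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance d = -3/4 * ln(1 - 4p/3)."""
    if not 0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def divergence_time(d: float, rate: float) -> float:
    """Years since divergence under a strict clock.

    `rate` is substitutions per site per year along one lineage; both
    lineages accumulate changes, hence the factor 2.
    """
    return d / (2 * rate)
```

For instance, a corrected distance of 0.02 with a hypothetical rate of 1e-9 substitutions per site per year gives about ten million years.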

Page 23: BIOINFORMATICS

Bioinformatics and evolutionary biology

Trace the evolution of a large number of organisms by measuring changes in their DNA.

Compare entire genomes and predict important factors.

Build complex computational models of populations to predict the outcome of the system over time.

Track and share information on an increasingly large number of species.

Page 24: BIOINFORMATICS

Measuring the biodiversity of an ecosystem:

The total genomic complement of a particular environment, from all of the species present.

Collect species names, descriptions, genetic information, status and size of populations, habitat needs, …

Genetic health of a breeding pool (agriculture)

Endangered populations (in silico)

Page 25: BIOINFORMATICS

4. Modeling biological systems

Computer simulations of cellular subsystems to analyse and visualize the complex connections of cellular processes.

Artificial life (virtual evolution) attempts to understand evolutionary processes via computer simulation of simple (artificial) life forms.

Protein-protein docking: protein structures are determined by X-ray crystallography (XRC) and NMR. The aim is to predict protein-protein interactions from these 3D shapes alone.

The most straightforward application of the database is to predict the function of uncharacterised proteins through their homology to characterised proteins.

Page 26: BIOINFORMATICS

Protein modeling:

DNA sequences encode proteins with specific functions.

In the absence of a protein structure, researchers use protein or molecular modeling to try to predict the 3D structure.

Templates are used to predict the target.

Helpful in proposing and testing biological hypotheses.

A starting point for confirming a structure through XRC and NMR.

An increasingly important tool for scientists working to understand normal and disease-related processes in living organisms.

Changing an undesired action of an enzyme.

Page 27: BIOINFORMATICS

5. Genome mapping

Serves as a scaffold for orienting sequence information.

Past: manually mapping a genomic region was a time-consuming and painstaking process.

Now: thanks to new technologies, a number of high-quality genome-wide maps are available.

Computational maps make gene hunting faster, cheaper, and more practical.

With these advances, the researcher's burden has shifted from mapping a genome to navigating a vast number of web sites and databases.

Page 28: BIOINFORMATICS

6. Map Viewer

A tool for visualizing whole genomes or single chromosomes.

Whole-genome view: displays a schematic of all of an organism's chromosomes.

Map view: shows one or more detailed maps for a single chromosome.

Using Map Viewer, researchers can find answers to questions such as:

Where does a particular gene exist within an organism's genome?

Which genes are located on a particular chromosome, and in what order?

What is the corresponding sequence data for a gene that exists in a particular chromosome region?

What is the distance between two genes?
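The questions above reduce to lookups on gene coordinates. The sketch below is not Map Viewer's interface; it only illustrates the underlying idea with a hypothetical in-memory table, made-up gene names and positions, and illustrative function names.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # first base of the gene
    end: int    # last base of the gene


# Hypothetical coordinates, deliberately not sorted.
GENES = [
    Gene("geneB", "chr1", 9000, 12000),
    Gene("geneA", "chr1", 1000, 5000),
    Gene("geneC", "chr2", 300, 900),
]


def locate(genes, name):
    """Where does a particular gene lie in the genome?"""
    for gene in genes:
        if gene.name == name:
            return gene
    raise KeyError(name)


def genes_on(genes, chrom):
    """Which genes are on a chromosome, and in what order?"""
    on_chrom = (g for g in genes if g.chrom == chrom)
    return [g.name for g in sorted(on_chrom, key=lambda g: g.start)]


def distance(genes, a, b):
    """Gap in bases between two genes on the same chromosome (0 if they overlap)."""
    ga, gb = locate(genes, a), locate(genes, b)
    if ga.chrom != gb.chrom:
        raise ValueError("genes are on different chromosomes")
    first, second = sorted((ga, gb), key=lambda g: g.start)
    return max(0, second.start - first.end)
```

With this toy table, `genes_on(GENES, "chr1")` lists geneA before geneB, and `distance(GENES, "geneA", "geneB")` is the 4000-base gap between them.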

Page 29: BIOINFORMATICS

An important aspect of a complete genome is distinguishing between coding and non-coding regions.

The biggest excitement: the availability of complete genome sequences for different organisms.
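A crude first pass at separating coding from non-coding sequence is to scan for open reading frames (ORFs). The sketch below is a toy under strong assumptions: it searches only the forward strand, treats every ATG as a possible start, and ignores introns and statistical evidence, all of which real gene finders must handle. The names and the `min_codons` parameter are illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 2):
    """Return (start, end) half-open intervals of ATG...stop ORFs, forward strand only.

    `min_codons` counts the codons before the stop codon, start codon included.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))  # include the stop codon
                start = None
    return sorted(orfs)
```

Everything outside the reported intervals is a candidate non-coding region in this simplified picture.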

Page 30: BIOINFORMATICS

>100,000 species are represented in GenBank

all species: 128,941

viruses: 6,137

bacteria: 31,262

archaea: 2,100

eukaryota: 87,147

Page 31: BIOINFORMATICS

Human Genome Project

The greatest achievement of bioinformatics methods.

Page 32: BIOINFORMATICS

A typical scenario

Post-natal genotyping

Assess susceptibility or immunity to specific diseases and pathogens

Unique combination of vaccines

Minimising healthcare costs

Early detection of illness

Page 33: BIOINFORMATICS

Rapid progress of bioinformatics

Advances in the diagnosis, treatment, and prevention of many genetic diseases.

Bioinformatics has transformed biology from a purely lab-based science into an information science.

Page 34: BIOINFORMATICS


Page 35: BIOINFORMATICS


Page 36: BIOINFORMATICS
