BIO-INFORMATICS GROUP-1 JALPAIGURI GOVERNMENT ENGINEERING COLLEGE Computer Science and Engineering

  • BIO-INFORMATICS 06/16/2015

    GROUP-1 1

    Table Of Contents

    Bioinformatics-Introduction

    What Is Bioinformatics?

    What is a Cell?

    Basic Structure of a Cell

    Types of cell

    What is DNA?

    What is RNA?

    What is Protein?

    The Central Dogma

    Analyzing Protein Sequences

    A brief history of sequence analysis

    Reading protein sequences from N to C

    Working with protein 3-D structures

    Protein bioinformatics covered

    Analyzing DNA Sequences

    Reading DNA sequences the right way

    The two sides of a DNA sequence

    Palindromes in DNA sequences

    Analyzing RNA Sequences

    RNA structures: Playing with sticky strands

    More on nucleic acid nomenclature

    DNA Coding Regions: Pretending to Work with Protein Sequences

    Turning DNA into proteins: The genetic code

    More with coding DNA sequences

    DNA/RNA bioinformatics covered

    Working with Entire Genomes

    Genomics

    How Most People Use Bioinformatics

    Becoming an Instant Expert with PubMed/Medline

    Finding out about a protein by its name

    Searching PubMed using authors' names

    Searching PubMed using fields

    Searching PubMed using limits

    A few more tips about PubMed

    Retrieving Protein Sequences

    More advanced ways to retrieve protein sequences

    Retrieving a list of related protein sequences

    Retrieving DNA Sequences

    Not all DNA is coding for protein

    Going from protein sequences to DNA sequences

    Retrieving the DNA sequence relevant to my protein

    Using BLAST to Compare My Protein Sequence

    Using Nucleotide Sequence Databases

    Reading into Genes and Genomes

    Prokaryotes: Small bugs, simple genes

    Eukaryotes: Bigger bugs, complex genes

    Making Use (and Sense) of GenBank

    Making sense of the GenBank entry of a prokaryotic gene

    Making sense of the GenBank entry of a eukaryotic mRNA

    Making sense of a GenBank eukaryotic genomic entry

    Working with related GenBank entries

    Retrieving GenBank entries without accession numbers

    Using a Gene-Centric Database

    Working with Whole-Genome Databases

    Working with complete viral genomes

    Working with complete bacterial genomes

    More bacterial genomics at TIGR

    Microbes from the environment at DoE

    Exploring the Human Genome

    Finding out about the Ensembl project

    Using Protein and Specialized Sequence Databases

    Protein Alignment: Often More Informative Than DNA Alignment

    Definitions: Homology, Similarity, Identity

    Gaps

    Scoring Matrices

    Dayhoff Model: Accepted Point Mutations

    PAM1 Matrix, PAM250 and Other PAM Matrices

    Practical Usefulness of PAM Matrices in Pairwise Alignment

    Pairwise Alignment and Limits of Detection: The Twilight Zone

    Pairwise Alignment, Homology, and Evolution of Life

    From Translated ORFs to Mature Proteins

    ORFs: What we see is NOT what we get

    A personal final destination for each protein

    A combinatorial diversity of folds and functions

    Name and origin of the protein

    The References and Cross-References

    Finally, the sequence itself

    Finding Out More about our Protein

    Finding out more about modified amino acids

    Some advanced biochemistry sites

    Finding out more about biochemical pathways

    Finding out more about protein structures

    Finding out more about major protein families

    Working with a Single DNA Sequence

    Catching Errors Before It's Too Late

    Removing vector sequences

    Cases when you shouldn't discard your sequence

    Computing/Verifying a Restriction Map

    Designing PCR Primers

    Analyzing DNA Composition

    Establishing the G+C content of our sequence

    Counting words in DNA sequences

    Counting long words in DNA sequences

    Experimenting with other DNA composition analyses

    Finding internal repeats in our sequence

    Identifying genome-specific repeats in our sequence

    Finding Protein-Coding Regions

    ORFing your DNA sequence

    Analyzing your DNA sequence with GeneMark

    Finding internal exons in vertebrate genomic sequences

    Complete gene parsing for eukaryotic genomes

    Analyzing your sequence with GenomeScan

    Assembling Sequence Fragments

    Managing large sequencing projects with public software

    Assembling your sequences with CAP3

    Working with RNA

    Predicting, Modeling and Drawing RNA Secondary Structures

    Noncoding RNA

    Noncoding RNAs in the Rfam Database

    Transfer RNA

    Ribosomal RNA

    Small Nuclear RNA

    Small Nucleolar RNA

    MicroRNA

    Short Interfering RNA

    Using Mfold

    Interpreting mfold results

    Forcing interaction in mfold

    Searching Databases and Genomes for RNA Sequences

    Finding tRNAs in a genome

    Using PatScan to look for RNA patterns

    Finding the New RNAs: miRNAs and siRNAs

    Doing RNA Analysis for Free over the Internet

    Studying evolution with ribosomal RNA

    Finding the small, non-coding RNA we need

    Generic RNA resources

    Working with a Single Protein Sequence

    Doing Biochemistry on a Computer

    Predicting the main physico-chemical properties of a protein

    Interpreting ProtParam results

    Digesting a protein in a computer

    Doing Primary Structure Analysis

    Looking for transmembrane segments

    Looking for coiled-coil regions

    Predicting Post-Translational Modifications in Your Protein

    Looking for PROSITE patterns

    Interpreting ScanProsite results

    Finding Known Domains in Your Protein

    Choosing the right collection of domains

    Finding domains with InterProScan

    Interpreting InterProScan results

    Finding domains with the CD server

    Interpreting and understanding CD server results

    Finding domains with Motif Scan

    Discovering New Domains in Your Proteins

    More Protein Analysis for Free over the Internet

    Similarity Searches on Sequence Databases

    Understanding the Importance of Similarity

    The Most Popular Data-Mining Tool Ever: BLAST

    BLASTing protein sequences

    Understanding your BLAST output

    BLASTing DNA sequences

    The BLAST way of doing things

    Controlling BLAST: Choosing the Right Parameters

    Controlling the sequence masking

    Changing the BLAST alignment parameters

    Controlling the BLAST output

    Making BLAST Iterative with PSI-BLAST

    PSI-BLASTing protein sequences

    Avoiding mistakes when running PSI-BLAST

    Discovering and using protein domains with BLAST and PSI-BLAST

    Similarity Searches for Free over the Internet

    Comparing Two Sequences

    Making Sure You Have the Right Sequences and the Right Methods

    Choosing the right sequences

    Choosing the right method

    Making a Dot Plot

    Choosing the right dot-plot flavor

    Using Dotlet over the Internet

    Doing biological analysis with a dot plot

    Making Local Alignments over the Internet

    Choosing the right local-alignment flavor

    Using Lalign to find the ten best local alignments

    Interpreting the Lalign output

    Making Global Alignments over the Internet

    Using Lalign to Make a Global Alignment

    Aligning Proteins and DNA

    Free Pairwise Sequence Comparisons over the Internet

    Building a Multiple Sequence Alignment

    Finding Out if a Multiple Sequence Alignment Can Help You

    Identifying situations where multiple alignments do not help

    Helping your research with multiple sequence alignments

    Choosing the Right Sequences

    The kinds of sequences you're looking for

    Gathering your sequences with online BLAST servers

    Choosing the Right Method of Multiple Sequence Alignment

    Using ClustalW

    Aligning sequences and structures with T-Coffee

    Crunching large datasets with MUSCLE

    Interpreting Your Multiple Sequence Alignment

    Recognizing the good parts in a protein alignment

    Taking your multiple alignment further

    Comparing Sequences That You Can't Align

    Making multiple local alignments with the Gibbs sampler

    Searching conserved patterns

    Internet Resources for Doing Multiple Sequence Comparisons

    Making multiple alignments with ClustalW around the clock

    Finding your favorite alignment method

    Searching for motifs or patterns

    Working with Protein 3-D Structures

    From Primary to Secondary Structures

    Predicting the secondary structure of a protein sequence

    Predicting additional structural features

    From the Primary Structure to the 3-D Structure

    Retrieving and displaying a 3-D structure from a PDB site

    Guessing the 3-D structure of your protein

    Looking at sequence features in 3-D

    Finding proteins with similar shapes

    Finding other PDB viewers

    Classifying your PDB structure

    Doing homology modeling

    Folding proteins in a computer

    Threading sequences onto PDB structures

    Looking at structures in movement

    Predicting interactions

    Gene Expression: Microarray Data Analysis

    Introduction

    Microarray Data Analysis Software and Data Sets

    Reproducibility of Microarray Experiments

    Microarray Data Analysis: Preprocessing

    Scatter Plots and MA Plots

    Global and Local Normalization

    Accuracy and Precision

    Robust Multi-array Analysis (RMA)

    Microarray Data Analysis: Inferential Statistics

    Expression Ratios

    Hypothesis Testing

    Corrections for Multiple Comparisons

    Significance Analysis of Microarrays (SAM)

    From t-Test to ANOVA

    Microarray Data Analysis: Descriptive Statistics

    Hierarchical Cluster Analysis of Microarray Data

    Partitioning Methods for Clustering: k-Means Clustering

    Clustering Strategies: Self-Organizing Maps

    Principal Components Analysis: Visualizing Microarray Data

    Supervised Data Analysis for Classification of Genes or Samples

    Functional Annotation of Microarray Data

    Perspective

    Pitfalls

    Building Phylogenetic Trees

    Finding Out What Phylogenetic Trees Can Do for Us

    Preparing Your Phylogenetic Data

    Choosing the right sequences for the right tree

    Preparing your multiple sequence alignment

    Building the Kind of Tree we Need

    Computing your tree

    Knowing what's what in our tree

    Displaying your phylogenetic tree

    Doing Phylogeny for Free over the Internet

    Finding online resources

    Finding generic resources

    Collections of orthologous genes