
Nucleic Acid Secondary Structure AND Primer Selection Bioinformatics 90-07




Graphic Display in GCG

Configuring Graphics Languages and Devices

•GIF (Graphics Interchange Format) – GIF87a, GIF89a

•HPGL (HP Graphics Language) – ColorPro, HP7470, HP7475, HP7550, HP7580, LaserJet3

•PNG (Portable Network Graphics) – For WWW Browser

•PostScript

•ReGIS

•Sixel

•Tektronix

•Xwindows – Download x-win412.exe

Program: CodonPreference, DotPlot, Figure, Frames, FrameSearch /PLOt, GapShow, GrowTree, HelicalWheel, Isoelectric, MapPlot, Moment, PepPlot, PileUp, PAUPDisplay, PlasmidMap, PlotFold, PlotSimilarity, PlotStructure, PlotTest, PrettyBox, Prime, StatPlot, TestCode, WordSearch -PLOt


Exercise 07-1

Configuring X-windows

Download x-win412.exe from ftp://163.25.92.42

Double click x-win412.exe, accept all default settings.

Start x-win32

Connect to GCG via TELNET

gcg 2% go
gcg 3% xwindows
Use XWindows graphics with what device:
  Color Workstation
  Gray Scale Workstation
  Monochrome Workstation
Please choose one ( * COLORWORKSTATION * )
Plotting Configuration set to:
  Language: xwindows
  Device: COLORWORKSTATION
  Port or Queue: GCG_Graphics

gcg 4% plottest

GIF & PostScript


Nucleic Acid Secondary Structure

Stemloop and Mfold

In nucleic acids, inverted repeat sequences may indicate foldback (self-pairing) structures.

Identifying Inverted Repeats: Stemloop

Calculating RNA Folding: Mfold

Displaying Folding Structures: Plotfold/Dotplot


STEMLOOP

StemLoop finds stems (inverted repeats) within a sequence. You specify the minimum stem length (number of nucleotides in a paired stretch), the minimum and maximum loop sizes (length of the nucleotide sequence between the paired regions), and the minimum number of bonds per stem.

 217 AGGCTGCAGTG AGCCGTGAT   11, 25
     |||||| ||||          C
 257 TCCGGCCTCAC GTCACCGCG

Labels in the figure: start (217), end (257), size (11), quality (25), stem (the paired region on the left).

Vertical bars ('|') indicate the base pairs. The associated loop is shown to the right of the stem. If either the stem or the loop is too long to be displayed in its entirety on the line, only the part that fits on the line is shown. The first and last coordinates of the stem are displayed on the left, and the length of the stem (size), the number of bonds in the stem (quality), and the loop size are shown on the right.
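The kind of search StemLoop performs can be illustrated with a minimal Python sketch. This is not the GCG implementation: it accepts only perfect Watson-Crick stems (no G-T pairs, no mismatches inside a stem), and the function name find_stems, its default parameters, and the bond weights (3 for G-C, 2 for A-T) are assumptions made for illustration.

```python
# Bond weights are an assumption for this sketch: 3 per G-C pair, 2 per A-T pair.
PAIR_BONDS = {("G", "C"): 3, ("C", "G"): 3, ("A", "T"): 2, ("T", "A"): 2}


def find_stems(seq, min_stem=6, min_loop=3, max_loop=20):
    """Return (start, end, size, quality, loop) for perfect inverted repeats.

    start and end are 1-based coordinates of the outermost base pair.
    """
    seq = seq.upper().replace("U", "T")
    n = len(seq)
    stems = []
    for i in range(n):
        for j in range(i + 1, n):
            # Skip pairs that can be extended outward; such a stem is
            # reported once, from its outermost pair.
            if i > 0 and j < n - 1 and (seq[i - 1], seq[j + 1]) in PAIR_BONDS:
                continue
            # Extend the stem inward while the bases keep pairing.
            size = 0
            while i + size < j - size and (seq[i + size], seq[j - size]) in PAIR_BONDS:
                size += 1
            # Trim inner pairs until the loop is at least min_loop long.
            while size > 0 and j - i - 2 * size + 1 < min_loop:
                size -= 1
            loop = j - i - 2 * size + 1
            if size >= min_stem and loop <= max_loop:
                quality = sum(
                    PAIR_BONDS[(seq[i + k], seq[j - k])] for k in range(size)
                )
                stems.append((i + 1, j + 1, size, quality, loop))
    return stems


if __name__ == "__main__":
    for stem in find_stems("ACGTGCTTTTGCACGT"):
        print("start=%d end=%d size=%d quality=%d loop=%d" % stem)
```

The quadratic scan over all outer pairs is fine for short teaching sequences; a production tool would also score G-T pairs and tolerate mismatches, as the example output above does.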


STEMLOOP

Output formats

 221 TGCAGTG AGCCGTG   7, 18
     |||||||
 248 ACGTCAC CGCGCTA   14

Loop  Start  End  Size  Quality
   1     35   54     8       18

*.stem

*.pnt DOTPLOT

1) See the stems
2) See the stem coordinates
3) File the stems (*.fld)
4) File the stems as points for DOTPLOT
5) Choose new parameters
6) Get a different sequence

Sort stems by: 1) Position 2) Quality 3) Size


MFOLD

Using energy minimization criteria, any predicted "optimal" secondary structure for an RNA or DNA molecule depends on the model of folding and the specific folding energies used to calculate that structure. Different optimal foldings may be calculated if the folding energies are changed even slightly. Because of uncertainties in the folding model and the folding energies, the "correct" folding may not be the "optimal" folding determined by the program. You may therefore want to view many optimal and suboptimal structures within a few percent of the minimum energy. You can use the variation among these structures to determine which regions of the secondary structure you can predict reliably. For instance, a region of the RNA molecule containing the same helix in most calculated optimal and suboptimal secondary structures may be more reliably predicted than other regions with greater variation.

Mfold output file: *.mfold
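To make the idea of computing an "optimal" folding concrete, here is a minimal Python sketch of a much simpler technique: Nussinov-style base-pair maximization. It is deliberately not Mfold's method, since Mfold minimizes free energy with experimentally derived parameters, while this sketch just maximizes the number of nested pairs. The function names and the minimum hairpin loop of 3 bases are assumptions for illustration.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(a, b):
    return (a, b) in PAIRS


def nussinov(seq, min_loop=3):
    """Return (pair_count, dot_bracket) maximizing nested base pairs."""
    seq = seq.upper().replace("T", "U")
    n = len(seq)
    best = [[0] * n for _ in range(n)]
    # Fill the table for increasingly long subsequences seq[i..j].
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case: base j stays unpaired
            for k in range(i, j - min_loop):  # case: base j pairs with base k
                if can_pair(seq[k], seq[j]):
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    # Trace back one structure that achieves the maximum.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if can_pair(seq[k], seq[j]):
                left = best[i][k - 1] if k > i else 0
                if left + 1 + best[k + 1][j - 1] == best[i][j]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return (best[0][n - 1] if n else 0), "".join(structure)


if __name__ == "__main__":
    count, fold = nussinov("GGGAAAUCCC")
    print(count, fold)
```

Even this toy model usually has several structures that tie for the best score, which is a small-scale version of the reason the text recommends inspecting suboptimal foldings rather than trusting a single answer.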


MFOLD

How to read *.mfold?

Survey of optimal and suboptimal foldings
A) sub-optimal energy plot
B) p-num plot

Sampling of optimal and suboptimal foldings
C) circles
D) domes
E) mountains
F) squiggles

PLOTFOLD


PLOTFOLD

A) sub-optimal energy plot

The energy dotplot indicates all of the base pairs involved in all optimal and suboptimal secondary structures within the energy increment you specify. The plot takes the form of a two-dimensional graph where both axes of the graph represent the same RNA sequence. Each point drawn in the graph indicates a base pair between the ribonucleotides whose positions in the sequence are the coordinates of that point on the graph.
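The layout of such a dot plot can be sketched in a few lines of Python. This only illustrates the coordinate convention described above, not PlotFold's graphics: the function name dotplot_rows and the text rendering with '*' and '.' are assumptions, and the base pairs are supplied by hand rather than read from a *.mfold file.

```python
def dotplot_rows(pairs, length):
    """Render base pairs as a text grid; both axes are the same sequence.

    pairs holds 1-based (i, j) positions. Each pair is drawn once, in the
    upper-right triangle (row = 5' base, column = 3' base).
    """
    grid = [["." for _ in range(length)] for _ in range(length)]
    for i, j in pairs:
        low, high = min(i, j), max(i, j)
        grid[low - 1][high - 1] = "*"
    return ["".join(row) for row in grid]


if __name__ == "__main__":
    # A hairpin whose stem pairs positions 1-10, 2-9 and 3-8.
    for row in dotplot_rows({(1, 10), (2, 9), (3, 8)}, 10):
        print(row)
```

A helix shows up as a short run of points perpendicular to the main diagonal, which is the pattern to look for in the energy dotplot.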


PLOTFOLD

B) p-num plot

This plot shows the amount of variability in pairing at each position in the sequence in all predicted foldings within the increment of the optimal folding energy you specify.
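One way to quantify this variability is to count, for each position, how many different partners it pairs with across the predicted foldings. The Python sketch below assumes that definition; the function name p_num and the hand-written list of base pairs are illustrative and are not taken from the GCG programs.

```python
from collections import defaultdict


def p_num(pairs, length):
    """Count the distinct pairing partners of each position (1-based).

    pairs is the union of base pairs (i, j) found in all foldings within
    the chosen energy increment.
    """
    partners = defaultdict(set)
    for i, j in pairs:
        partners[i].add(j)
        partners[j].add(i)
    return [len(partners[pos]) for pos in range(1, length + 1)]


if __name__ == "__main__":
    # Position 2 pairs with 9 in one folding and with 8 in another.
    print(p_num({(1, 10), (2, 9), (2, 8), (3, 8)}, 10))
```

Positions with a count of 1 pair the same way in every folding considered, so they are the ones whose structure is predicted most consistently.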


PLOTFOLD

C) circles


PLOTFOLD

D) domes


PLOTFOLD

E) mountains

The program plots representative secondary structures that satisfy the energy increment and window size criteria you specify.


PLOTFOLD

F) squiggles


Exercise 07-2

Stemloop & X-windows

Open the file “exercise07-2.doc” and follow the steps.

gcg2 4% fetch gb:d00063
d00063.gb_pl1
gcg2 5% stemloop d00063.gb_pl1

There are 16 stems. Would you like to

  1) See the stems
  2) See the stem coordinates
  3) File the stems
  4) File the stems as points for DOTPLOT
  5) Choose new parameters
  6) Get a different sequence
  Q)uit?

Please choose one (* 1 *):

Try 1-4

Sort stems by:

  1) Position
  2) Quality
  3) Size
  Q)uit

Please choose one (* 1 *):


Exercise 07-3

Mfold

Open the file “Exercise07-3.doc” and follow the steps.

gcg2 4% fetch gb:j02061
J02061.gb_vi
gcg2 5% mfold j02061.gb_vi j02061.mfold

$ Mfold

(Linear) MFOLD what sequence ? j02061.gb_vi

Begin (* 1 *) ?
End (* 121 *) ?

What should I call the energy matrix output file (* j02061.mfold *) ?

Plotfold


Primer Selection

Specificity - %GC - Dimer - Hairpin - Tm

Nucleotide sequences

Amino Acid sequences

CONSENSUS

Pileup, Pretty, Prettybox

Primer Selection Program: Prime

Amino Acid / Nucleotide

backtranslate

Confirm by BLAST


•Primer Length: Minimum - Maximum

•PCR Product Length: Minimum - Maximum

•Maximum number of primers or PCR products in output (range 1 thru 2500)

•Primer DNA concentration (nM) (range .1 thru 500.0)

•Salt concentration (mM) (range .1 thru 500.0)

•Select: forward primers only, reverse primers only, or primers on both strands for PCR

•Set maximum overlap (in base pairs) between predicted PCR products

•Forward strand primer extension must include position

•Reverse strand primer extension must include position

•Reject duplicate primer binding sites on template

•Specify primer 3' clamp (using IUB ambiguity codes)

•Primer % G+C: Minimum (range 0.0 thru 100.0) - Maximum

•Primer Melting Temperature (degrees Celsius): Minimum (range 0.0 thru 200.0) - Maximum

•Maximum difference between melting temperatures of two primers in PCR (degrees Celsius) (range 0.0 thru 25.0)

•Product % G+C: Minimum (range 0.0 thru 100.0) - Maximum

•Product Melting Temperature (degrees Celsius): Minimum (range 0.0 thru 200.0) - Maximum
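Several of these constraints (primer length, % G+C, melting temperature, and the Tm difference within a pair) are easy to check by hand. The Python sketch below shows one way to do so under stated assumptions: it uses the Wallace rule of thumb, Tm = 2(A+T) + 4(G+C), which is only a rough estimate for short oligonucleotides and is not claimed to be the calculation Prime performs. All function names and default thresholds are hypothetical.

```python
def gc_percent(primer):
    """Percentage of G and C bases in the primer."""
    primer = primer.upper()
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer):
    """Rough melting temperature in degrees Celsius (Wallace rule)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def passes_filters(primer, min_len=18, max_len=22,
                   min_gc=40.0, max_gc=55.0, min_tm=50.0, max_tm=65.0):
    """Apply length, % G+C and Tm windows (thresholds are illustrative)."""
    return (min_len <= len(primer) <= max_len
            and min_gc <= gc_percent(primer) <= max_gc
            and min_tm <= wallace_tm(primer) <= max_tm)


def tm_difference_ok(forward, reverse, max_diff=2.0):
    """Check that the two primers of a PCR pair melt at similar temperatures."""
    return abs(wallace_tm(forward) - wallace_tm(reverse)) <= max_diff


if __name__ == "__main__":
    candidate = "ATGCATGCATGCATGCATGC"
    print(gc_percent(candidate), wallace_tm(candidate), passes_filters(candidate))
```

Dimer and hairpin checks, which the slide title also lists, need a pairing search and are left out of this sketch.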


Exercise 07-4

Primer Selection

Use the human npm cDNA sequence to design a pair of primers that will copy the whole coding sequence so that it is translated in frame.

THEN

Check the specificity of the primers by using BLAST.
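As a starting point for the exercise, the following Python sketch shows how a primer pair flanking a whole coding sequence relates to the template: the forward primer copies the first bases of the coding region, and the reverse primer is the reverse complement of its last bases. The function names, the 20-base default length, and the toy sequence are assumptions; the real coordinates of the npm coding region must be taken from the sequence annotation.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def flanking_primers(cdna, cds_start, cds_end, length=20):
    """Return (forward, reverse) primers spanning the whole coding region.

    cds_start and cds_end are 1-based, inclusive coordinates of the coding
    sequence (start codon through stop codon) within the cDNA.
    """
    cds = cdna[cds_start - 1:cds_end].upper()
    if len(cds) < 2 * length:
        raise ValueError("coding sequence is too short for two primers")
    forward = cds[:length]                        # same sense as the mRNA
    reverse = reverse_complement(cds[-length:])   # anneals to the sense strand
    return forward, reverse


if __name__ == "__main__":
    toy_cdna = "CC" + "ATGAAACCCGGGTTTTAG" + "GG"  # made-up sequence, not npm
    print(flanking_primers(toy_cdna, 3, 20, length=6))
```

Because the forward primer starts exactly at the start codon, the amplified product keeps the reading frame; both primers would still need the % G+C, Tm, and BLAST specificity checks described above.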