Integrated DNA Technologies Use of NCBI Databases in qPCR Assay Design Elisabeth Wagner, PhD Scientific Applications Specialist


Page 1: Use of NCBI Databases in qPCR Assay Design

Integrated DNA Technologies

Use of NCBI Databases in qPCR Assay Design

Elisabeth Wagner, PhD
Scientific Applications Specialist

Page 2: Use of NCBI Databases in qPCR Assay Design


Session Outcomes

You will:

- Learn which NCBI tools are useful for designing qPCR assays
- Become proficient using tools for qPCR design in the IDT SciTools® suite
- Navigate the features and tools available on the NCBI website
  - Obtain sequence information for your gene of interest
  - Perform a BLAST search for assay specificity
  - Search for SNPs
- Understand how to proceed with a basic qPCR design

Page 3: Use of NCBI Databases in qPCR Assay Design


qPCR Design Covers A Lot of Ground

There are many uses for quantitative PCR. Some examples:

- Gene expression
- Copy number variation
- Genotyping
- Multi-species analysis
- Splice variant specific (or common) expression

We will address the general considerations for design in this session, and cover more specific examples later this afternoon.

Page 4: Use of NCBI Databases in qPCR Assay Design


SciTools® Overview

http://www.idtdna.com/pages/scitools

Several tools are available in the IDT SciTools® suite to assist with qPCR design:

1. RealTime PCR Tool
2. PrimerQuest® Tool
3. OligoAnalyzer® Tool
4. PrimeTime® Predesigned qPCR Assay Database

Page 5: Use of NCBI Databases in qPCR Assay Design


NCBI Databases Overview:

1. Obtain sequence information for your gene of interest: NCBI Nucleotide or Gene

2. Perform a BLAST search for assay specificity: NCBI BLAST

3. Search for SNPs: NCBI dbSNP

NCBI lets you access all of the information needed for design in one location.

Page 6: Use of NCBI Databases in qPCR Assay Design


Using NCBI Databases for Custom qPCR Assay Design

Page 7: Use of NCBI Databases in qPCR Assay Design

NCBI Overview (National Center for Biotechnology Information)

- Founded in 1988 as part of the United States National Library of Medicine
- Houses a series of databases relevant to biotechnology and biomedicine
- Curates GenBank, a database of over 1 × 10^12 bp of DNA sequences
- Gene database, which integrates gene-specific information from numerous species
- dbSNP, which is a database of reported single nucleotide polymorphisms (SNPs)
- Contains the BLAST sequence similarity search program
- Maintains PubMed, a journal database for biomedical literature
- Much, much more information!

Page 8: Use of NCBI Databases in qPCR Assay Design

NCBI Database Search: Sequence Information for qPCR Assay Design

http://www.ncbi.nlm.nih.gov/

Page 9: Use of NCBI Databases in qPCR Assay Design

NCBI Sequence Files

Files:

- Can be entered by anyone
- May or may not be checked for accuracy
- May contain contaminated sequence (plasmid or other)
- May contain annotation errors

Accession numbers:

- Letters at the beginning indicate the type of file
- Nucleotide sequences start with 1 or 2 letters:

Page 10: Use of NCBI Databases in qPCR Assay Design

The RefSeq Database

- non-redundant
- explicitly linked nucleotide and protein sequences
- ongoing curation by NCBI staff and collaborators, with reviewed records indicated
- includes data validation and format consistency
- distinct accession numbers
  - all accessions include an underscore '_' character
- different versions are tracked

Page 11: Use of NCBI Databases in qPCR Assay Design

RefSeq Accession Numbers

mRNAs and Proteins
  NM_123456  Curated mRNA
  NP_123456  Curated protein
  NR_123456  Curated non-coding RNA
  XM_123456  Predicted mRNA
  XP_123456  Predicted protein
  XR_123456  Predicted non-coding RNA

Gene Records
  NG_123456  Reference genomic sequence

Chromosome
  NC_123455  Microbial replicons, organelle genomes, human chromosomes
  AC_123455  Alternate assemblies

Assemblies
  NT_123456  Contig
  NW_123456  WGS supercontig
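The prefix scheme on this slide can be sketched as a small lookup. This is only an illustration built from the prefixes listed above; the function name `classify_refseq_accession` and the fallback messages are hypothetical, and RefSeq defines more prefixes than the slide shows.

```python
# Prefix meanings copied from the slide; not an exhaustive RefSeq list.
REFSEQ_PREFIXES = {
    "NM": "Curated mRNA",
    "NP": "Curated protein",
    "NR": "Curated non-coding RNA",
    "XM": "Predicted mRNA",
    "XP": "Predicted protein",
    "XR": "Predicted non-coding RNA",
    "NG": "Reference genomic sequence",
    "NC": "Microbial replicons, organelle genomes, human chromosomes",
    "AC": "Alternate assemblies",
    "NT": "Contig",
    "NW": "WGS supercontig",
}


def classify_refseq_accession(accession):
    """Return the record type implied by a RefSeq accession prefix."""
    # All RefSeq accessions contain an underscore; split on the first one.
    prefix, underscore, _rest = accession.strip().partition("_")
    if not underscore:
        return "Not a RefSeq accession (no underscore)"
    return REFSEQ_PREFIXES.get(prefix.upper(), "Prefix not listed on the slide")


if __name__ == "__main__":
    for acc in ["NM_123456", "XR_123456.2", "AB123456"]:
        print(acc, "->", classify_refseq_accession(acc))
```

A version suffix such as `.2` does not affect the lookup, because only the text before the underscore is used.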

Page 12: Use of NCBI Databases in qPCR Assay Design

Accessing Sequence Information in NCBI


NCBI

Page 13: Use of NCBI Databases in qPCR Assay Design

NCBI Gene Database Information: Gene Search


Page 14: Use of NCBI Databases in qPCR Assay Design

Sequence Data Searches Using Nucleotide

Sequence Files

- mRNA and genomic
- Transcript variants

http://www.ncbi.nlm.nih.gov/nuccore

Page 15: Use of NCBI Databases in qPCR Assay Design

GenBank Information

Page 16: Use of NCBI Databases in qPCR Assay Design

Data Retrieval: Graphics View


Page 17: Use of NCBI Databases in qPCR Assay Design

Data Retrieval: FASTA Sequence Format
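For readers who prefer to script this step, the following is a minimal sketch of reading FASTA text and of building a download link for a Nucleotide record. It is not part of the original slides: the helper names `parse_fasta` and `efetch_fasta_url` are hypothetical, the accession is the placeholder used elsewhere in this deck, and the URL assumes NCBI's E-utilities `efetch` endpoint with its `db`, `id`, `rettype`, and `retmode` parameters. The code only builds the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Assumed E-utilities endpoint for programmatic record retrieval.
EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession):
    """Build an efetch URL requesting a nucleotide record as FASTA text."""
    params = {"db": "nuccore", "id": accession, "rettype": "fasta", "retmode": "text"}
    return EFETCH_BASE + "?" + urlencode(params)


def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


if __name__ == "__main__":
    example = ">NM_123456.1 placeholder record\nACGTACGT\nTTGACA\n"
    print(parse_fasta(example))
    print(efetch_fasta_url("NM_123456"))
```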


Page 18: Use of NCBI Databases in qPCR Assay Design


Using PrimerQuest® Tool for Custom qPCR Designs

Page 19: Use of NCBI Databases in qPCR Assay Design


PrimerQuest® Tool for Generating Custom qPCR Designs

Highly customizable tool

Page 20: Use of NCBI Databases in qPCR Assay Design


You Can Use NCBI Accession Number or FASTA Sequence

Page 21: Use of NCBI Databases in qPCR Assay Design

Once a Sequence Is Entered, 3 Defaults Become Available

Often you will need to adjust the parameters of the tool to meet experimental design requirements

Page 22: Use of NCBI Databases in qPCR Assay Design


PrimerQuest® Tool Assay Output

Page 23: Use of NCBI Databases in qPCR Assay Design

Parameter Changes Depend on the Assay Required

Before changing anything, make sure you have selected the correct assay

Sometimes you simply need to increase the number of designs returned

It is unlikely that you will need to change these parameters

Page 24: Use of NCBI Databases in qPCR Assay Design


Directing the Design to a Specific Region

Target a particular “junction”

Page 25: Use of NCBI Databases in qPCR Assay Design


Examples

Excluded region 260-280

Excluded region-probe 260-280

Target region 260-280

Page 26: Use of NCBI Databases in qPCR Assay Design

Changing Primer/Probe Parameters

If the target is particularly biased (AT or GC rich), you may need to change primer/probe parameters (i.e., length)

Page 27: Use of NCBI Databases in qPCR Assay Design

Once the Initial Design Is Completed, Back to NCBI

Use NCBI tools to:

- Check whether the assay is specific (BLAST)
- Ensure there are no SNPs to worry about (dbSNP)

Use the IDT OligoAnalyzer® Tool to:

- Check primers (and probe) for secondary structure and dimer formation

Page 28: Use of NCBI Databases in qPCR Assay Design


Using NCBI BLAST to Check for Primer Specificity

Page 29: Use of NCBI Databases in qPCR Assay Design


What is BLAST?—Getting to BLAST

http://www.ncbi.nlm.nih.gov/

Or http://blast.ncbi.nlm.nih.gov/Blast.cgi

Page 30: Use of NCBI Databases in qPCR Assay Design

What is BLAST (Basic Local Alignment Search Tool)?

- BLAST stands for Basic Local Alignment Search Tool and is provided by the National Center for Biotechnology Information (NCBI)
- Aligns a user-defined query (sequence) to a wide variety of databases
- Can translate the query or the database to align sequences
- Can align 2 or more sequences together
- Uses a heuristic algorithm to create alignments very fast
  - Breaks sequences into “words” and searches the database for matches
  - Reassembles these matches based on the criteria entered

Page 31: Use of NCBI Databases in qPCR Assay Design


What is BLAST?—Basic BLAST

Page 32: Use of NCBI Databases in qPCR Assay Design


How BLAST Works—Words

BLAST divides the query sequence into subsets called “words”, which the algorithm uses to perform the alignment

Example (35 nt sequence): CGATCGGGCATCACACAAAGTTATGTAGTAGAAAT

All possible words that can be generated from the sequence are used for the alignment

With a 7-letter word, the maximum number of words for this sequence is 29 (35 − 7 + 1)
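The word count above can be checked with a short sketch. It only enumerates the overlapping words of the example query; it does not reproduce how BLAST indexes or scores them, and the function name `query_words` is hypothetical.

```python
def query_words(query, word_size=7):
    """Return every overlapping word of length word_size in the query."""
    if word_size > len(query):
        return []
    # A sequence of length L yields L - w + 1 overlapping words of size w.
    return [query[i:i + word_size] for i in range(len(query) - word_size + 1)]


if __name__ == "__main__":
    query = "CGATCGGGCATCACACAAAGTTATGTAGTAGAAAT"  # 35 nt example from the slide
    words = query_words(query, word_size=7)
    print(len(query), "nt ->", len(words), "words")
    print("first:", words[0], "last:", words[-1])
```

Lowering the word size increases the number of words generated from the same query, which is one reason a smaller word size makes the search more sensitive for short sequences such as primers.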

Page 33: Use of NCBI Databases in qPCR Assay Design

Overview—Definitions

- Hit: A sequence to which the query is aligned and is returned in the results of BLAST
- Identity: The extent of exact matches between 2 sequences (e.g., ACGT and ACGG have 75% identity)
- Similarity = Positives (in BLAST scoring)
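The identity example can be reproduced with a few lines. This sketch assumes two already-aligned sequences of equal length with no gaps, which is a simplification of what BLAST reports; `percent_identity` is a hypothetical helper name.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions that match exactly between two equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    print(percent_identity("ACGT", "ACGG"))  # 75.0, as in the slide example
```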

Page 34: Use of NCBI Databases in qPCR Assay Design


How BLAST Works—Scores

The BLAST raw score is converted to a bit score for each alignment using parameters based on statistics described in Karlin and Altschul (1990) (www.ncbi.nlm.nih.gov/pmc/articles/PMC53667/pdf/pnas01031-0226.pdf).

- A high score does not necessarily indicate that the query is unique
- The score is only dependent on the alignment, length of the sequence, and the length of the database
- The E-value is the expected number of random sequences that have an equivalent sequence alignment
  - Calculated using the max bit score and the length of the query and database
  - Tells you the relative strength of the alignment
  - Shorter sequences have higher E-values because the probability of finding that sequence is higher
- A low E-value does not mean you have a unique match!
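The relationship between raw score, bit score, and E-value can be sketched with the standard Karlin-Altschul formulas: bit score S' = (λS − ln K) / ln 2 and E = m · n · 2^(−S'), where m is the query length and n the database length. This is a simplified illustration: λ and K depend on the scoring system and must be supplied by the caller, the numbers in the example are made up, and BLAST itself applies length corrections that are ignored here.

```python
import math


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to a bit score: (lam * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def e_value(bits, query_len, db_len):
    """Expected number of chance alignments with at least this bit score."""
    # Search space is approximated as query length times database length.
    return query_len * db_len * 2.0 ** (-bits)


if __name__ == "__main__":
    db_len = 3_000_000_000  # illustrative database size in bases
    for query_len, bits in [(20, 40.0), (100, 180.0)]:
        print(query_len, "nt query, bit score", bits,
              "-> E =", e_value(bits, query_len, db_len))
```

A short primer can reach only a modest bit score even when it matches perfectly, so its E-value stays comparatively high. This is why E-values for primer-length queries should be read alongside query coverage rather than on their own.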

Page 36: Use of NCBI Databases in qPCR Assay Design


Select the Correct Database

“Others” is the most general option but contains a lot of sequences. If possible, use the Human- or Mouse-specific databases

For species with completed genome projects, consider using “NCBI Genomes” to limit BLAST results

Page 37: Use of NCBI Databases in qPCR Assay Design


Change the parameters of the BLAST scoring

Select less rigorous algorithm

Change Word size to “7”

Page 38: Use of NCBI Databases in qPCR Assay Design


Looking at the Results

The Graphic Summary can immediately give you a sense of what the overall results are

Hover over each result in the graphic to identify the sequence name

Page 39: Use of NCBI Databases in qPCR Assay Design


Then Look at Results List

Look at E-value and Query Coverage. Look for jumps in either/both.

Looks like the assay is specific to the transcripts of a single gene

Ignore the “alternate” chromosome assemblies

Page 40: Use of NCBI Databases in qPCR Assay Design


Investigate details of alignment

Check the distance between primer binding sites if looking at mRNA

Open Graphics result in a new tab/window

Page 41: Use of NCBI Databases in qPCR Assay Design


BLAST Shows Primer Aligned to Sequence

Zoom out with “-” sign

You can grab within window and drag sequence side to side

Page 42: Use of NCBI Databases in qPCR Assay Design


The Target Gene is on Chromosome 6

This looks promising with primers on different exons.

Page 43: Use of NCBI Databases in qPCR Assay Design


But We Had Other Chromosomal Hits……

“real” transcript

Pseudogene: doesn’t look transcribed

Primers (red bar indicates mismatch)

Page 44: Use of NCBI Databases in qPCR Assay Design


And Another One……

Another pseudogene.

But what’s this?

Intron of a transcribed gene, so this sequence is potentially present in RNA samples. Recommend avoiding if possible

Page 45: Use of NCBI Databases in qPCR Assay Design


Using NCBI to Check for SNPs

Page 46: Use of NCBI Databases in qPCR Assay Design


While Assessing BLAST Results, Also Assess for SNPs

Page 47: Use of NCBI Databases in qPCR Assay Design


Investigate SNPs in Primer Binding Sites

Page 48: Use of NCBI Databases in qPCR Assay Design


Assessing SNP Data

Tells you it’s a single base substitution

Indicates alternate forms (here recorded on opposite strand)

Indicates allele frequency if known

Sometimes more frequency data at bottom of page

Page 49: Use of NCBI Databases in qPCR Assay Design


SNP Data Roughly Divided by Risk

Trusted source

Very low frequency

No data, likely not going to be problematic

Significant risk. Look to redesign if possible

Page 50: Use of NCBI Databases in qPCR Assay Design

Using OligoAnalyzer® Tool to Check Primers and Probes

Page 51: Use of NCBI Databases in qPCR Assay Design


Checking Primers with OligoAnalyzer® Tool

PrimerQuest® design tools give you the “best” assays for the region specified

They check for self- and hetero-dimers, but this is only part of the scoring system used

An assay may be “better” even with dimer issues if it scores well on other parameters

Go to the OligoAnalyzer Tool

- Perform self-dimer checks for primers and probe
- Perform heterodimer checks on all primer/probe combinations (especially important to include all combinations when multiplexing)
- Check hairpin structures

- Look for stability of < -9 kcal/mol
- Or multiple hairpins forming with < -4 kcal/mol
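If ΔG values are copied out of the tool by hand, the two thresholds can be applied with a small helper. This is a sketch under one reading of the slide, namely that any single structure more stable than −9 kcal/mol is worth a closer look, as are two or more hairpins each more stable than −4 kcal/mol. That reading, the function name `flag_structures`, and the example numbers are assumptions, and the code does not query OligoAnalyzer.

```python
def flag_structures(dimer_dgs, hairpin_dgs, strong_cutoff=-9.0, hairpin_cutoff=-4.0):
    """Return warnings for delta-G values (kcal/mol) below the slide's cutoffs."""
    warnings = []
    # More negative delta-G means a more stable structure.
    if any(dg < strong_cutoff for dg in list(dimer_dgs) + list(hairpin_dgs)):
        warnings.append("structure more stable than -9 kcal/mol")
    stable_hairpins = [dg for dg in hairpin_dgs if dg < hairpin_cutoff]
    if len(stable_hairpins) >= 2:
        warnings.append("multiple hairpins more stable than -4 kcal/mol")
    return warnings


if __name__ == "__main__":
    # Made-up values for illustration only.
    print(flag_structures(dimer_dgs=[-10.2, -5.1], hairpin_dgs=[-1.3]))
    print(flag_structures(dimer_dgs=[-5.0], hairpin_dgs=[-4.5, -4.8]))
    print(flag_structures(dimer_dgs=[-3.0], hairpin_dgs=[-1.0]))
```

A flag is only a prompt to inspect the predicted structure. ΔG alone does not say whether a dimer has an extendable 3′ end.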

Page 52: Use of NCBI Databases in qPCR Assay Design


Assessing Dimer Data

Looks stable (< -9 kcal/mol)

But this is not “dangerous”; avoid if possible, but OK

Looks stable (< -9 kcal/mol)

Not extendable, not a problem

Doesn’t look stable (> -9 kcal/mol)

Danger of extension, exponential amplification!

Page 53: Use of NCBI Databases in qPCR Assay Design


Assessing Hairpin Structures

Based on UNAFold predictions

Page 54: Use of NCBI Databases in qPCR Assay Design

IDT PrimeTime® Predesigned qPCR Database


Page 55: Use of NCBI Databases in qPCR Assay Design


Primer and Probe Design Criteria for PrimeTime® Assays

Primers

- equal Tm (60–63 °C)
- 15–30 bases in length
- no runs of 4 or more Gs
- amplicon size 50–150 bp (max 400 bp)

Probe

- probe length no longer than 30–35 bases
- Tm value 4–10 °C higher than primers
- no runs of 4 or more consecutive Gs
- G+C content 30–80%
- no G at the 5′ end
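The sequence-only criteria in this list can be screened with a short script. This is a sketch: it checks length, G runs, G+C content, and the 5′ base, but not Tm, which depends on the thermodynamic model and reaction conditions and is better taken from the design tool. The function names are hypothetical, and treating 35 bases as the probe length ceiling is an assumption drawn from the "30–35" range on the slide.

```python
def check_primer(seq):
    """Return a list of criteria from the slide that the primer fails."""
    seq = seq.upper()
    problems = []
    if not 15 <= len(seq) <= 30:
        problems.append("length outside 15-30 bases")
    if "GGGG" in seq:
        problems.append("run of 4 or more Gs")
    return problems


def check_probe(seq, max_len=35):
    """Return a list of sequence-based criteria that the probe fails."""
    seq = seq.upper()
    problems = []
    if not seq:
        return ["empty sequence"]
    if len(seq) > max_len:
        problems.append("probe longer than %d bases" % max_len)
    if "GGGG" in seq:
        problems.append("run of 4 or more Gs")
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / len(seq)
    if not 30.0 <= gc_percent <= 80.0:
        problems.append("G+C content outside 30-80%")
    if seq.startswith("G"):
        problems.append("G at the 5' end")
    return problems


if __name__ == "__main__":
    # Made-up sequences for illustration only.
    print(check_primer("ACGTACGTACGTACGTAC"))
    print(check_primer("ACGGGGTACGTACGTAC"))
    print(check_probe("GACGTACGTACGTACGTACGT"))
```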

Page 56: Use of NCBI Databases in qPCR Assay Design


PrimeTime Results

Page 57: Use of NCBI Databases in qPCR Assay Design


Questions?