Page 1: Http://cs273a.stanford.edu 1 UCSC Genome Browser Tutorial   The UCSC Toolset & Portal to the Human

http://cs273a.stanford.edu 1

UCSC Genome Browser Tutorial

http://genome.ucsc.edu/

http://genome-test.cse.ucsc.edu/

The UCSC Toolset & Portal to the Human Genome

• Genome Browser
• Table Browser

“I was blind and now I can see”

http://cs273a.stanford.edu 2

UCSC Genome Browser

[version9a]

http://www.openhelix.com/downloads/ucsc/ucsc_home.shtml

3

The UCSC Homepage: http://genome.ucsc.edu

navigate

General information

Specific information: new features, current status, etc.

4

The Genome Browser Gateway: start page choices, December 2006

Make your Gateway choices:

1. Select Clade

2. Select species: search 1 species at a time

3. Assembly: the official backbone DNA sequence

Practically speaking, there is no such thing as a genome. There is only a genome assembly. Assemblies update frequently. Think moving target...

5

Everything in Genomics is a Moving Target

• The genomes
• Their annotations
• The Portals
• Our understanding of Biology

Conclusion: write code that can be run... and rerun, and rerun, and rerun, and rerun.

6

The Genome Browser Gateway: start page, basic search

7

The Genome Browser Gateway: start page choices, December 2006

Make your Gateway choices:

1. Select Clade

2. Select species: search 1 species at a time

3. Assembly: the official backbone DNA sequence

4. Position: location in the genome to examine

5. Image width: how many pixels in display window; 5000 max

6. Configure: make fonts bigger + other choices


8

The Genome Browser Gateway: start page, basic search

text/ID searches

Helpful search examples, suggestions below

Use this Gateway to search by:
• Gene names, symbols
• Chromosome number: chr7, or region: chr11:1038475-1075482
• Keywords: kinase, receptor
• IDs: NP, NM, OMIM, and more…

See lower part of page for help with format
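The chromosome and region formats accepted by the Gateway (chr7, or chr11:1038475-1075482) are also easy to handle in a script. A minimal sketch in Python; the names `Position` and `parse_position` are hypothetical and not part of any UCSC tool, and tolerating thousands separators is an assumption.

```python
import re
from typing import NamedTuple, Optional


class Position(NamedTuple):
    chrom: str
    start: Optional[int]  # None means "whole chromosome"
    end: Optional[int]


# A chromosome name, optionally followed by ":start-end".
# Commas inside the numbers (1,038,475) are tolerated.
_POSITION_RE = re.compile(r"^(chr\w+)(?::([\d,]+)-([\d,]+))?$")


def parse_position(text: str) -> Position:
    """Parse 'chr7' or 'chr11:1038475-1075482' into a Position."""
    match = _POSITION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a chromosome or region: {text!r}")
    chrom, start, end = match.groups()
    if start is None:
        return Position(chrom, None, None)
    start_i = int(start.replace(",", ""))
    end_i = int(end.replace(",", ""))
    if start_i > end_i:
        raise ValueError(f"start is after end in {text!r}")
    return Position(chrom, start_i, end_i)
```

Gene names, keywords and IDs are not handled here; those need the Gateway's own search.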


9

The Genome Browser Gateway: sample search for human TP53

Sample search: human, March 2006 assembly, tp53

select

Select from results list. An ID search may go right to a viewer page, if unique.

10

Overview of the whole Genome Browser page

(mature release)

Genome viewer section

Groups of data:
• Mapping and Sequencing Tracks
• Genes and Gene Prediction Tracks
• mRNA and EST Tracks
• Expression and Regulation
• Comparative Genomics
• Variation and Repeats
• ENCODE Tracks

11

Different species, different tracks, same software

• Species may have different data tracks
• Layout, software, functions the same

12

Sample Genome Viewer image, TP53 region

base position

STS markers

Known genes

RefSeq genes

GenBank seqs

repeats

17 species compared

SNPs

single species compared

13

Visual Cues on the Genome Browser

Track colors may have meaning. For example, in the Known Gene track:

• If there is a corresponding PDB entry, = black
• If there is a corresponding NCBI Reviewed seq, = dark blue
• If there is a corresponding NCBI Provisional seq, = light blue

Tick marks: a single location (STS, SNP)

For some tracks, the height of a bar indicates increased likelihood of an evolutionary relationship (conservation track)

Intron, and direction of transcription: <<< or >>>

[Figure: gene diagram with exons joined by introns marked < < <, plus 5' UTR and 3' UTR]

14

Options for Changing Images: Upper Section

• Change your view or location with controls at the top
• Use “base” to get right down to the nucleotides
• Configure: to change font, window size, more…

Specify a position

fonts, window, more

Walk left or right

Zoom in

Zoom out

click to zoom 3x and re-center

15

Annotation Track display options

Some data is ON or OFF by default

Links to info and/or filters

• Menu links to info about the tracks: content, methods
• You change the view with pulldown menus

enforce changes

After making changes, REFRESH to enforce the change

Change track view

16

Annotation Track options, defined

Hide: removes a track from view

Dense: all items collapsed into a single line

Squish: each item = separate line, but 50% height + packed

Pack: each item separate, but efficiently stacked (full height)

Full: each item on separate line
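The five display modes can also be set when linking to the browser, using URL parameters of the form trackName=mode. A minimal sketch of building such a link; `browser_url` is a hypothetical helper, and the assembly name (hg18) and track name (knownGene) in the usage note are assumptions that vary by genome and release.

```python
from urllib.parse import urlencode

# The display modes defined on this slide.
VISIBILITY_MODES = ("hide", "dense", "squish", "pack", "full")


def browser_url(db, position, track_modes,
                base="http://genome.ucsc.edu/cgi-bin/hgTracks"):
    """Build a Genome Browser link that sets a display mode per track.

    track_modes maps a track name to one of VISIBILITY_MODES.
    """
    for track, mode in track_modes.items():
        if mode not in VISIBILITY_MODES:
            raise ValueError(f"unknown display mode for {track}: {mode!r}")
    params = {"db": db, "position": position}
    params.update(track_modes)
    return base + "?" + urlencode(params)
```

For example, `browser_url("hg18", "chr11:1038475-1075482", {"knownGene": "pack"})` returns a link that opens the region from the Gateway slide with one track in pack mode.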

17

Reset, Hide, Configure or Refresh to change settings

• You control the views
• Use pulldown menus
• Configure options page

reset, back to defaults

start from scratch

enforce any changes (hide, full, squish…)

18

Annotation Track options, if altered… important point: the browser remembers!

• Session information (the position you were examining)
• Track choices (squish, pack, full, etc.)
• Filter parameters (if you changed the colors of any items, or the subset to be displayed)

…are all saved on your computer. When you come back in a couple of days to use it again, these will still be set. You may, or may not, intend this.

To clear your “cart” or parameters, click default tracks

OR

19

Saved Sessions

20

Click Any Viewer Object for Details

Example: click your mouse anywhere on the TP53 line

Click the item

New web page opens

Many details and links to more data about TP53

21

Click annotation track item for details pages

Not all genes have this much detail. Different annotation tracks carry different data.

informative description

other resource links

microarray data

mRNA secondary structure

links to sequences

protein domains/structure

homologs in other species

Gene Ontology™ descriptions

mRNA descriptions

pathways

22

Get DNA, with Extended Case/Color Options

Use the DNA link at the top

Plain or Extended options

Change colors, fonts, etc.

23

Get Sequence from Details Pages

Click a track, go to Sequence section of details page

Click the line

Click the item

sequence section on detail page

24

Accessing the BLAT tool

• Rapid searches by INDEXING the entire genome
• Works best with high similarity matches
• See documentation and publication for details

Kent, WJ. Genome Res. 2002. 12:656

BLAT = BLAST-like Alignment Tool

25

BLAT tool overview: www.openhelix.com/sampleseqs.html

Make choices

• DNA limit: 25000 bases
• Protein limit: 10000 aa
• 25 total sequences

Paste one or more sequences

Or upload

submit
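Before pasting, a batch of sequences can be checked against the limits on this slide. A minimal sketch, assuming FASTA-formatted input and treating the base and amino acid limits as per-sequence limits (an assumption; check the BLAT page for the current rules). The names `read_fasta` and `check_submission` are hypothetical.

```python
DNA_LIMIT = 25000      # bases, from the slide
PROTEIN_LIMIT = 10000  # amino acids, from the slide
MAX_SEQUENCES = 25     # total sequences, from the slide


def read_fasta(text):
    """Return a list of (name, sequence) pairs from FASTA text."""
    records, name, chunks = [], None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        else:
            if name is None:
                name = "unnamed"  # raw sequence pasted without a header
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def check_submission(text, is_protein=False):
    """Return a list of human-readable problems; empty means OK."""
    records = read_fasta(text)
    limit = PROTEIN_LIMIT if is_protein else DNA_LIMIT
    problems = []
    if len(records) > MAX_SEQUENCES:
        problems.append(
            f"{len(records)} sequences submitted; limit is {MAX_SEQUENCES}")
    for name, seq in records:
        if len(seq) > limit:
            problems.append(f"{name}: {len(seq)} letters; limit is {limit}")
    return problems
```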

26

BLAT results, with links

• Results with demo sequences, settings default; sort = Query, Score
• Score is a count of matches: higher number, better match
• Click browser to go to Genome Browser image location (next slide)
• Click details to see the alignment to genomic sequence (2nd slide)

sorting

go to browser/viewer

go to alignment detail

27

BLAT results, browser link

• From browser click in BLAT results
• A new line with your Sequence from BLAT Search appears!

query

click to flip frame

• Watch out for reading frame! Click - - - > to flip frame
• Base position = full and zoomed in enough to see amino acids

28

BLAT results, alignment details

Your query

Genomic match, color cues

Side-by-side alignment

yours

genomic

29

Understand BLAT's Limitations

BLAT was designed to rapidly align sequence from one genome back to itself (e.g., EST/cDNA data)

It can and it does miss clear hits at times

BLAT actually allows for a single mismatch, but it also removes k-mers with excessive counts for efficiency.

Not suitable for cross-species mapping.
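To see why an indexed search can miss clear hits, here is a toy k-mer index. This is only a sketch of the idea, not BLAT's implementation (see Kent 2002 for that): the values of k and the count cutoff are arbitrary, the function names are hypothetical, and the single-mismatch tolerance mentioned on the slide is not modeled.

```python
from collections import defaultdict


def build_index(genome, k=4, max_count=3):
    """Index non-overlapping k-mers of the genome by start position.

    K-mers seen more than max_count times are dropped, trading
    sensitivity for speed.
    """
    index = defaultdict(list)
    for pos in range(0, len(genome) - k + 1, k):
        index[genome[pos:pos + k]].append(pos)
    return {kmer: hits for kmer, hits in index.items()
            if len(hits) <= max_count}


def seed_hits(index, query, k=4):
    """Look up every overlapping k-mer of the query.

    Returns (query_offset, genome_offset) pairs for exact seed matches.
    """
    hits = []
    for qpos in range(len(query) - k + 1):
        for gpos in index.get(query[qpos:qpos + k], ()):
            hits.append((qpos, gpos))
    return hits
```

With a toy genome of "ACGT", five copies of "AAAA", then "GGCA", the over-represented k-mer AAAA is removed from the index, so a query of "AAAA" finds no seed even though it occurs in the genome.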

30

Bunch More Goodies – Click Around

31

Bibliography:

http://genome.ucsc.edu/goldenPath/pubs.html

• The UCSC Genome Browser Database: update 2008, update 2007, and earlier.
• UCSC Genome Browser Tutorial
• UCSC Genome Browser: Deep support for molecular biomedical research
• The UCSC Known Genes, 2006.
• The UCSC Gene Sorter, 2007.
• Piloting the Zebrafish Genome Browser, 2006.

32

UCSC Genome Browser

[version9a]

33

Genome Browser Database

Primary table: positions, names, etc.

Underlying Database (MySQL)

Auxiliary table: related data

visualize search & download
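The primary/auxiliary split can be pictured as two tables joined on an item name. The sketch below uses an in-memory SQLite database as a stand-in for MySQL; the table names, column names and rows are all made-up placeholders, not the real UCSC schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Primary table: positions and names (placeholder rows)
    CREATE TABLE primary_track (
        name TEXT, chrom TEXT, chromStart INTEGER, chromEnd INTEGER);
    -- Auxiliary table: related data keyed by the same name
    CREATE TABLE auxiliary_info (name TEXT, description TEXT);

    INSERT INTO primary_track VALUES ('itemA', 'chr4', 100, 200);
    INSERT INTO primary_track VALUES ('itemB', 'chr4', 500, 900);
    INSERT INTO primary_track VALUES ('itemC', 'chr7', 100, 200);
    INSERT INTO auxiliary_info VALUES ('itemA', 'placeholder description A');
""")


def items_in_region(conn, chrom, start, end):
    """Items overlapping [start, end), with auxiliary data if present."""
    return conn.execute(
        """
        SELECT p.name, p.chromStart, p.chromEnd, a.description
        FROM primary_track AS p
        LEFT JOIN auxiliary_info AS a ON a.name = p.name
        WHERE p.chrom = ? AND p.chromStart < ? AND p.chromEnd > ?
        ORDER BY p.chromStart
        """,
        (chrom, end, start),
    ).fetchall()
```

The position query is what a viewer needs to draw a region; the join is what a details page or a download needs.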

34

The Table Browser

Open browser

http://genome.ucsc.edu/

35

Table Browser: Choose Genome

In the Human genome (hg16), search for simple repeats on a chromosome 4 location with copy number more than 10 and download the sequence.

Choose Genome

36

Table Browser: Choose Table to Search

In the Human genome (hg16), search for simple repeats on a chromosome 4 location with copy number more than 10 and download the sequence.

Choose Data Table

37

Table Browser: Describe Table

Describe table

38

Table Browser: Choose Region to Search

In the Human genome (hg16), search for simple repeats on a chromosome 4 location with copy number more than 10 and download the sequence.

Choose Region to Search

39

Table Browser: Upload Locations to Search

Paste

Upload

40

Table Browser: Filter to Refine Search

Create Filter

In the Human genome (hg16), search for simple repeats on a chromosome 4 location with copy number more than 10 and download the sequence.

Submit Filter
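The same filter can be pictured in a few lines of code: keep rows in the chosen chromosome 4 window whose copy number is above 10. A minimal sketch, assuming rows with chrom, chromStart, chromEnd and copyNum fields; `Repeat`, `filter_repeats` and the example rows are hypothetical and made up for illustration.

```python
from typing import Iterable, List, NamedTuple


class Repeat(NamedTuple):
    chrom: str
    chromStart: int  # 0-based start
    chromEnd: int    # end, exclusive
    copyNum: float   # number of copies of the repeat unit


def filter_repeats(rows: Iterable[Repeat], chrom: str, start: int,
                   end: int, min_copies: float) -> List[Repeat]:
    """Keep repeats overlapping chrom:[start, end) with copyNum > min_copies."""
    return [
        r for r in rows
        if r.chrom == chrom
        and r.chromStart < end
        and r.chromEnd > start
        and r.copyNum > min_copies
    ]


# Made-up rows, for illustration only.
EXAMPLE_ROWS = [
    Repeat("chr4", 1000, 1060, 20.0),
    Repeat("chr4", 5000, 5030, 4.5),
    Repeat("chr5", 1000, 1060, 30.0),
    Repeat("chr4", 900000, 900090, 15.0),
]
```

Calling `filter_repeats(EXAMPLE_ROWS, "chr4", 0, 10000, 10)` keeps only the first row: the second has too few copies, the third is on another chromosome, and the fourth lies outside the window.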

41

Table Browser: Output Data

Output data

In the Human genome (hg16), search for simple repeats on a chromosome 4 location with copy number more than 10 and download the sequence.

42

Table Browser: Output Formats

Output formats

Text Fields

43

Table Browser: Fasta Sequence Output

Sequence

44

Table Browser: Database Format Outputs

Database

45

Table Browser: Custom Track Output

Custom Track

46

Table Browser: Hyperlinks Output

Hyperlinks

47

Table Browser: Obtaining Output

Adding a name creates a file on the desktop; leaving it blank creates output in the browser (exception: custom track).

Data Summary

48

Table Browser: Output configuration

Sequence Format

Get Sequence

49

Table Browser: Intersecting Data

Find simple repeats (copy number > 10) within known genes and download the sequence.

Intersect

2nd Table

Any Overlap

Submit
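The "Any Overlap" choice on this slide amounts to an interval test between two tables. A minimal sketch, assuming 0-based, half-open coordinates; `Interval`, `overlaps` and `intersect_any` are hypothetical names, and the quadratic loop is for clarity, not speed.

```python
from typing import List, NamedTuple, Sequence


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based start
    end: int    # end, exclusive
    name: str


def overlaps(a: Interval, b: Interval) -> bool:
    """True if a and b share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def intersect_any(items: Sequence[Interval],
                  others: Sequence[Interval]) -> List[Interval]:
    """Keep every item that overlaps at least one interval in others."""
    return [a for a in items if any(overlaps(a, b) for b in others)]
```

Here `items` would be the filtered simple repeats and `others` the known genes; the result is the narrower set summarized on the next slide.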

50

Table Browser: Intersecting Data Narrows Search

Filtered simple repeats, intersected (overlapping) with known genes

Summary

Filtered simple repeats

51

Table Browser: Downloading Sequence Data

Sequence Format

Get Sequence

52

Table Browser: Correlating Data Tables

Correlate 2 Datasets

Get Results

53

Custom Tracks: Table Browser Searches

Create Track

Get Output

54

Custom Tracks: Name and Configure Track

Download track file to desktop

Name Track: SRepeatKGenes

Describe Track: Intersection …

Choose default view in browser

In Genome Browser

55

Custom Tracks: Open Track in Genome Browser

Open Details

Compare

“…caused by an expanded, unstable trinucleotide repeat…”

56

Custom Tracks: Track in Table Browser

Custom tracks also are available for filtering and intersections on the Table Browser

57

Custom Tracks: User-generated Data in Track

Custom Tracks Link

Custom Track How-to

58

Custom Tracks: Four Steps to Create Track

Four steps to create a custom track:

1. Define track characteristics
2. Define browser characteristics
3. Format your data
4. Upload and view your track
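The first three steps map onto the parts of a small text file: a track line, a browser line, and the data lines. A minimal sketch that writes such a file in BED style; `make_custom_track` is a hypothetical helper and the feature in the usage note is made up. See the format FAQ at genome.ucsc.edu for the full set of options.

```python
def make_custom_track(name, description, position, features,
                      visibility="pack"):
    """Return custom track text: browser line, track line, BED rows.

    features is an iterable of (chrom, start, end, label) tuples.
    The track name is written unquoted, so it should not contain spaces.
    """
    lines = [
        # Browser characteristics: where the viewer should open.
        f"browser position {position}",
        # Track characteristics: name, description, default display mode.
        f'track name={name} description="{description}" '
        f"visibility={visibility}",
    ]
    # Formatted data: one tab-separated BED row per feature.
    for chrom, start, end, label in features:
        lines.append(f"{chrom}\t{start}\t{end}\t{label}")
    return "\n".join(lines) + "\n"
```

For example, `make_custom_track("SRepeatKGenes", "toy example", "chr4:1-1000", [("chr4", 100, 200, "repeat1")])` gives three lines of text that can be pasted or uploaded for the fourth step.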

59

Custom Tracks: Submit Track

Copy and paste small or simple tracks

Submit File

http://genome.ucsc.edu/FAQ/FAQformat

60

Custom Tracks: Track Appears in Genome Browser

61

Custom Tracks: Track Characteristics

Default view of custom track is “pack”

Default view of other tracks set

62

Custom Tracks: Track Appears in Table Browser

Custom Track also appears in Table Browser

63

Custom Tracks from Outside Sources

Custom Tracks Link

Contributed Track

64

Bibliography:

http://genome.ucsc.edu/goldenPath/pubs.html

• The UCSC Table Browser, 2004.
• Bejerano et al., Nature Methods, 2005.
• The UCSC Proteome Browser
• Phylogenomic Resources at the UCSC Genome Browser