ProRepeat: a comprehensive directory of exact tandem repeats in proteins

prorepeat.bioinformatics.nl

www.bioinformatics.nl

PolyQ and neurodegenerative diseases

9 diseases caused by polyQ repeats:
• HD
• DRPLA
• SCA 1, 2, 3, 6, 7, 17
• Kennedy’s disease (SBMA)

[Figure: domain diagram of a transcription factor from NH3- to -COOH, with transcriptional regulation, DNA binding and hormone binding domains; labels T1, T2, T3 and Region 1, Region 2, Region 3]

Androgen receptor (AR)

PolyQ in AR

Typical tract length: 9-35 residues, average of 20-25 depending on ethnic origin

PolyQ tract length has important consequences:
■ shorter tracts: prostate cancer susceptibility
■ longer tracts: feminization syndromes
■ over 40 residues: SBMA (spinal and bulbar muscular atrophy) or Kennedy’s disease

Collection of polyQ repeats

• 792 human individuals, available from an earlier study (Edwards, 1992)
• 26 armadillo individuals sequenced by CP
• 77 mammals and marsupials from protein database

Céline Poux, RU

What about repeats in other proteins?

ProRepeat database

• Data sources: UniProt and RefSeq
• Limited to exact tandem repeats
• Standard, linear-time suffix tree algorithm
• Stored in Oracle 10g
• Interface in PHP5

Minimum number of repetitions per unit length:

Unit length    Repetitions
1              ≥ 5
2              ≥ 4
3              ≥ 3
4 .. N         ≥ 2

Maarten van den Bosch, WUR
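The thresholds above can be illustrated with a short sketch. Note that this is a naive quadratic scan written for readability, not the linear-time suffix tree algorithm used by ProRepeat; the function names and the `max_unit_length` cut-off are assumptions made for the example.

```python
def min_repetitions(unit_length: int) -> int:
    """Minimum repetitions for a unit of this length, per the table above."""
    thresholds = {1: 5, 2: 4, 3: 3}
    return thresholds.get(unit_length, 2)  # unit lengths 4..N need >= 2


def is_primitive(unit: str) -> bool:
    """True if the unit is not itself a repetition of a shorter unit (e.g. 'QQ')."""
    k = len(unit)
    return not any(k % d == 0 and unit == unit[:d] * (k // d) for d in range(1, k))


def find_exact_tandem_repeats(seq: str, max_unit_length: int = 10):
    """Naive scan returning (start, unit, repetitions) tuples; start is 0-based.

    max_unit_length is an arbitrary cut-off chosen for this example.
    """
    hits = []
    n = len(seq)
    for k in range(1, max_unit_length + 1):
        i = 0
        while i + k <= n:
            unit = seq[i:i + k]
            reps = 1
            # Count how many consecutive copies of the unit follow position i.
            while seq[i + reps * k:i + (reps + 1) * k] == unit:
                reps += 1
            if is_primitive(unit) and reps >= min_repetitions(k):
                hits.append((i, unit, reps))
                i += reps * k  # continue after the repeat
            else:
                i += 1
    return hits


print(find_exact_tandem_repeats("MAQQQQQQDEDEDEDEK"))
```

For the toy sequence in the last line, the scan reports a run of six Q residues and four copies of the DE unit; a run of only four Q residues would fall below the threshold for unit length 1 and would not be reported.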

Simple query syntax:

e.g. “Q” or “DE”

DE is equivalent to ED; DEF is equivalent to EFD and FDE
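One way to picture this equivalence is to reduce every repeat unit to a canonical rotation before comparing. The sketch below picks the lexicographically smallest rotation; that choice, and the function names, are assumptions for illustration and not necessarily how ProRepeat normalises units.

```python
def canonical_unit(unit: str) -> str:
    """Return the lexicographically smallest rotation of a repeat unit."""
    rotations = [unit[i:] + unit[:i] for i in range(len(unit))]
    return min(rotations) if rotations else unit


def same_repeat_unit(a: str, b: str) -> bool:
    """Two units are equivalent if one is a rotation of the other."""
    return canonical_unit(a) == canonical_unit(b)


print(same_repeat_unit("DE", "ED"))    # rotations of each other
print(same_repeat_unit("DEF", "FDE"))  # rotations of each other
print(same_repeat_unit("DEF", "DFE"))  # a permutation, but not a rotation
```

Rotation matters because a tandem repeat of DE read one residue later looks like a repeat of ED; a permutation such as DFE is a different unit.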

Or use ProSite syntax:

e.g. “[DE]-{P}-X(0,1).”
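To make the ProSite example concrete, the sketch below translates a small subset of the syntax into a Python regular expression: `[..]` for allowed residues, `{..}` for excluded residues, `x`/`X` for any residue, `(n)` or `(n,m)` for repetition, and `<`/`>` as sequence anchors. The function name is hypothetical, and treating an uppercase `X` as a wildcard follows the slide's example rather than any documented ProRepeat behaviour.

```python
import re

# One pattern element: optional '<', a residue class, optional repetition, optional '>'.
_ELEMENT = re.compile(
    r"(<?)(\[[A-Za-z]+\]|\{[A-Za-z]+\}|[A-Za-z])(?:\((\d+)(?:,(\d+))?\))?(>?)"
)


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple ProSite-style pattern into a Python regex string."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError("unsupported pattern element: " + element)
        start, core, low, high, end = match.groups()
        if core.startswith("["):
            regex = core.upper()                      # any of the listed residues
        elif core.startswith("{"):
            regex = "[^" + core[1:-1].upper() + "]"   # any residue except these
        elif core in ("x", "X"):
            regex = "."                               # any residue
        else:
            regex = core.upper()                      # one specific residue
        if low is not None:
            regex += "{" + low + ("," + high if high is not None else "") + "}"
        parts.append(("^" if start else "") + regex + ("$" if end else ""))
    return "".join(parts)


regex = prosite_to_regex("[DE]-{P}-X(0,1).")
print(regex)
print(re.search(regex, "AADAK").group(0))
```

The example pattern reads as: D or E, followed by any residue except P, followed by zero or one residue of any kind.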

Taxonomic distributions of hits

Sorting/grouping options: identifier, repeat unit, repetitions, unit length, length, start location, end location, protein, taxonomy, ontology

Link to DNA data

• DNA coding sequences of available repeats are also stored in the database
• Extracted from EMBL and/or RefSeq

Hong Luo, WUR

Link to DNA data / errors

Approximately 3% of corresponding nucleotide sequences cannot be retrieved

Errors caused by:
• No links to nucleotide database (35%): NO_ANNOTATED_CDS, no EMBL links
• Annotation errors in the nucleotide database (65%)

[Figure: number of different units per unit size per proteome. X-axis: unit length; series: H. sapiens, A. thaliana, C. elegans, S. cerevisiae, P. troglodytes, G. gallus, R. norvegicus, M. musculus]

Guido Kappé, RU

[Figure: single amino acid (SAA) repeat length distribution in Homo sapiens. X-axis: total SAA repeat length (aa), from 5 to 20 and >20; legend lists one-letter amino acid codes A to Z]
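A distribution of this kind can be tallied along the following lines. This is a minimal sketch that assumes protein sequences are available as plain strings; the function name and the example sequences are made up, and the bins (5 to 20, then >20) mirror the axis of the figure.

```python
import re
from collections import Counter


def saa_length_distribution(sequences, min_length=5, cap=20):
    """Count single amino acid runs per (residue, length bin).

    Runs shorter than min_length are ignored; runs longer than cap
    are pooled in a '>cap' bin.
    """
    counts = Counter()
    for seq in sequences:
        # (.)\1* matches a maximal run of one repeated character.
        for run in re.finditer(r"(.)\1*", seq):
            length = len(run.group(0))
            if length >= min_length:
                label = str(length) if length <= cap else ">" + str(cap)
                counts[(run.group(1), label)] += 1
    return counts


# Made-up example sequences, for illustration only.
example = ["MQQQQQQA", "AAAAAK" + "Q" * 25]
for key, value in sorted(saa_length_distribution(example).items()):
    print(key, value)
```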

[Figure: amino acid distribution in Homo sapiens. X-axis: amino acid; series: "All prot. - Rep.", "Rep. - SAA", "SAA"]

[Figure: amino acid distribution in Arabidopsis thaliana. X-axis: amino acid; series: "All prot. - Rep.", "Rep. - SAA", "SAA"]

Current work

• Annotation of repeats versus function
• Adding imperfect tandem repeats, a.k.a. approximate tandem repeats (ATR), to the database
• Offering remote access via web services (WSDL and BioMoby)
• Expansion of the analysis capabilities of the interface

PolyQ in AR (reprise)

• Impure tracts, interrupted mainly by CAA, CCG and CGG codons, are longer and more variable than pure CAG tracts
• The presence of other codons is better explained by codon duplication than by multiple point mutations: interrupting codons are part of the elongation process, rather than hampering its dynamics as proposed previously
• Negative correlation between the lengths of the different CAG tracts: this suggests a maximal expansion length that the protein can handle without being deleterious

Céline Poux, RU
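The distinction between pure and impure tracts can be made concrete with a small sketch. It assumes the tract has already been excised from the coding sequence in the correct reading frame; the function names and the example sequences are hypothetical.

```python
from collections import Counter


def split_codons(cds: str):
    """Split an in-frame DNA string into codons, dropping any trailing partial codon."""
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def describe_tract(cds: str):
    """Summarise a CAG tract: codon count, purity, and interrupting codons."""
    codons = split_codons(cds.upper())
    interruptions = Counter(codon for codon in codons if codon != "CAG")
    return {
        "codons": len(codons),
        "pure": not interruptions,          # True when every codon is CAG
        "interruptions": dict(interruptions),
    }


# Made-up tracts, for illustration only.
print(describe_tract("CAGCAGCAGCAG"))
print(describe_tract("CAGCAGCAACAGCAG"))
```

Note that CAA also encodes glutamine, so a tract interrupted only by CAA is impure at the DNA level while remaining a pure polyQ stretch in the protein; CCG and CGG interruptions change the amino acid as well.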

Acknowledgements

Wageningen University and Research Centre: Maarten van den Bosch, Hong Luo, Mark Kramer, Harm Nijveen

Radboud University, Nijmegen: Guido Kappé, Céline Poux, Wilfried W. de Jong

This work was supported in part by project grants from NWO/BMI (GK, CP) and the NBIC/BioAssist program (HN).

prorepeat.bioinformatics.nl

Thank you for your attention!

See also our posters on phylogenetic domain visualisation (TreeDomViewer) and microarray (re)annotation at the ISMB.

Post-doc positions available: contact Jack.Leunissen@wur.nl or jack@bioinformatics.nl
