Computational Skills Course, week 4
Mike Gilchrist, NIMR, May-July 2011
WEEK FOUR
Individual project plans
Laurent
RNA-seq: reads from Solexa
Alignment on the transcriptome
Normalization
Analysis of the transcriptional dynamic of chosen gene regulatory networks
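As one possible illustration of the normalization step, the minimal Python sketch below scales raw per-gene read counts to counts per million (CPM), so that libraries sequenced to different depths can be compared. The counts and the gene name GeneX are invented placeholders, and CPM is only one of several normalization choices; the slide does not say which method the project uses.

```python
def counts_per_million(counts):
    """Scale raw read counts for one library to counts per million mapped reads."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library has no mapped reads")
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}


# Hypothetical raw counts for one forelimb library (placeholder values only).
forelimb_library = {"Ptch1": 150, "Shh": 50, "GeneX": 800}
cpm = counts_per_million(forelimb_library)
```

Applying the same function to each forelimb and hindlimb library would put all time points on a common scale before profiles are compared.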
[Figure: Ptch1 (y-axis 0 to 4000) and Shh (y-axis 0 to 500) in forelimb and hindlimb, plotted at 9.5, 10, 10.5 and 11]
Aims of the project:
- Comparing the transcriptional profiles of forelimbs and hindlimbs over an embryonic time-course:
- Identify additional candidate “limb-type modifiers”
- Compare the transcriptional profile dynamics of common forelimb/hindlimb GRNs between FL and HL to establish a limb-type signature
Veronique
From about E11.5: reduced Lhx6 levels
From E13.5: reduced Sst levels
P40 - P60: physiological defects at inhibitory dendritic synapses
After P60: spontaneous seizures
Our lab generated a hypomorphic Lhx6 allele that expresses reduced levels of mRNA. This allele specifically affects the differentiation of a subset of cortical interneurons, which results in the development of seizures. These unique mutants allow the study of the mechanisms that specify cortical interneuron subtypes and of the generation of seizures.
P15: mRNA-Seq experiment
mRNA extracted from cortex
+/-: controls (4)
-/-: nulls (3)
LacZ/-: hypomorphs (3)
Questions:
1) Molecular processes affected
2) Molecular markers for cell types affected
Guilherme
How do binding sites for the T-box transcription factor Brachyury change over time during frog embryogenesis?
George
Data
• mRNA-seq in chick neural cells + and - Shh
• database of chick transcription factors (TFs)
• ChIP-seq analysis identifying binding sites of several TFs in mouse neural cells responding to Shh
Analysis
• map mRNA-seq data to chick genome
• measure gene expression levels and identify differentially expressed genes
• identify subset of regulated genes that are TFs
• identify mouse orthologs of differentially expressed TFs (and genes)
• identify clusters of TFs binding near regulated genes
• ask whether there is (i) any enrichment for clusters of binding sites near regulated genes; (ii) any correlation between combination of TFs bound and type of regulation; (iii) predictive value in the ChIP-seq data for the regulation of gene expression.
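Question (i) can be framed as an over-representation test. The sketch below is one hedged way to do it, using a one-sided hypergeometric test written with the standard library only; the function name, its arguments and the example numbers are illustrative assumptions, not part of the project plan.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, regulated, with_cluster, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    total_genes  : all genes considered
    regulated    : genes called differentially expressed
    with_cluster : genes with a TF binding-site cluster nearby
    overlap      : genes that are both regulated and near a cluster
    """
    upper = min(regulated, with_cluster)
    denominator = comb(total_genes, with_cluster)
    tail = sum(
        comb(regulated, k) * comb(total_genes - regulated, with_cluster - k)
        for k in range(overlap, upper + 1)
    )
    return tail / denominator


# Toy numbers: 10 genes, 5 regulated, 5 near a cluster, all 5 overlapping.
p_value = hypergeom_enrichment_p(10, 5, 5, 5)
```

A small p-value would suggest that clusters sit near regulated genes more often than expected by chance, under the assumption that every gene is equally likely to have a nearby cluster.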
James
The ‘PROJECT’ - Siggi Sato (Parasitology)
Gene expression in P. falciparum
Nuclear genome
Identify genetic elements determining the limit of the intron
Protein/RNA factors for splicing and controlling organelles
Organellar genomes (Plastid, Mitochondrion)
Identify genetic elements for replication and transcription
New anti-P. falciparum
Finding new substances and identifying their targets
Alaremycin (patent filed)
MRC-T “small molecules”
( Others? )
Siggi
http://www.ensembl.org/Mus_musculus/Location/
The molecular regulation of IL-10 and IL-12 in innate cells: Investigating the differential production of IL-10 and IL-12 in commonly used inbred mouse strains
• Key initial questions include:
o Are there differences (SNPs/deletions) in the IL-10/IL-12/type I IFN loci?
o Are these differences in regulatory elements (TF binding sites/3’UTR) or protein coding regions?
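The second question amounts to checking which annotated features each variant position falls in. The following is a minimal sketch under stated assumptions: the feature labels and coordinates are invented placeholders, and real coordinates would come from a genome annotation such as the Ensembl location view.

```python
def classify_variant(position, features):
    """Return the labels of all features whose inclusive [start, end] span contains position."""
    labels = [label for label, start, end in features if start <= position <= end]
    return labels or ["unannotated"]


# Hypothetical features on one locus: (label, start, end), 1-based inclusive.
locus_features = [
    ("TF binding site", 100, 120),
    ("coding exon", 500, 650),
    ("3'UTR", 900, 1100),
]

snp_in_exon = classify_variant(600, locus_features)
snp_elsewhere = classify_variant(300, locus_features)
```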
Ashleigh
Are there any genes in the Xenopus tropicalis genome that do not have a corresponding EST in the Xenopus laevis database? And vice versa?
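Once genes have been compared against the EST collection (for example by a sequence similarity search), the question reduces to a set difference. The sketch below assumes the comparison has already produced (gene, EST) hit pairs; all identifiers are made up for illustration.

```python
def genes_without_match(gene_ids, hits):
    """Return the genes that have no hit at all, preserving input order."""
    matched = {gene for gene, _est in hits}
    return [gene for gene in gene_ids if gene not in matched]


# Hypothetical X. tropicalis gene IDs and (gene, X. laevis EST) hit pairs.
tropicalis_genes = ["geneA", "geneB", "geneC"]
search_hits = [("geneA", "est001"), ("geneA", "est002"), ("geneC", "est107")]

unmatched = genes_without_match(tropicalis_genes, search_hits)
```

Swapping the roles of the two species (ESTs as queries, genes as targets) would give the "vice versa" list with the same function.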
Alex
DB IMGT
RF http://imgt.cines.fr/cgi-bin/IMGTlect.jv?query=5+AB019437
AC AB019437
SP Human
GL IGHV
GN V7-81
NA caggtgcagctggtgcagtctggccatgaggtgaagcagcctggggcctcagtgaaggtc
NA tcctgcaaggcttctggttacagtttcaccacctatggtatgaattgggtgccacaggcc
NA cctggacaagggcttgagtggatgggatggttcaacacctacactgggaacccaacatat
NA gcccagggcttcacaggacggtttgtcttctccatggacacctctgccagcacagcatac
NA ctgcagatcagcagcctaaaggctgaggacatggccatgtattactgtgcgagata
AA QVQLVQSGHEVKQPGASVKVSCKASGYSFTTYGMNWVPQAPGQGLEWMGWFNTYTGNPTY
AA AQGFTGRFVFSMDTSASTAYLQISSLKAEDMAMYYCAR
//
Flat file database of mouse and human sequences from databases IMGT, ABG, NCBI and VBASE2 in EMBL format.
Load (how?) into MySQL (database design: tables and primary key?)
Remove redundancy in sequences but retain pointers to other fields.
Flexible querying, with output of different sets of sequences, e.g. for a BLAST search.
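One possible answer to the "how?" and "tables and primary key?" questions is sketched below. It is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, assumes each record consists of two-letter tagged lines terminated by "//", and uses toy records rather than real database entries. Each unique nucleotide sequence is stored once in a sequences table, and every source entry keeps a pointer (seq_id) to it, so redundancy is removed without losing the other fields.

```python
import sqlite3


def parse_records(text):
    """Yield one dict per record; repeated NA/AA lines are concatenated."""
    record = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        ends_record = line.endswith("//")
        if ends_record:
            line = line[:-2].strip()
        if line:
            tag, _, value = line.partition(" ")
            value = value.strip()
            if tag in ("NA", "AA"):
                record[tag] = record.get(tag, "") + value
            else:
                record[tag] = value
        if ends_record and record:
            yield record
            record = {}


def load(conn, records):
    """Create the two tables (an assumed design) and insert the records."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS sequences (
            seq_id INTEGER PRIMARY KEY,
            na TEXT NOT NULL UNIQUE,
            aa TEXT
        );
        CREATE TABLE IF NOT EXISTS entries (
            source_db TEXT NOT NULL,
            accession TEXT NOT NULL,
            species TEXT,
            gene_name TEXT,
            seq_id INTEGER NOT NULL REFERENCES sequences(seq_id),
            PRIMARY KEY (source_db, accession)
        );
    """)
    for rec in records:
        # A sequence already present is skipped, so each one is stored once.
        conn.execute("INSERT OR IGNORE INTO sequences (na, aa) VALUES (?, ?)",
                     (rec["NA"], rec.get("AA")))
        seq_id = conn.execute("SELECT seq_id FROM sequences WHERE na = ?",
                              (rec["NA"],)).fetchone()[0]
        conn.execute("INSERT OR REPLACE INTO entries VALUES (?, ?, ?, ?, ?)",
                     (rec["DB"], rec["AC"], rec.get("SP"), rec.get("GN"), seq_id))
    conn.commit()


# Toy records: two source databases describing the same sequence.
SAMPLE = """\
DB IMGT
AC TOY0001
SP Human
GN V-toy
NA caggtgcag
NA ctggtgcag
AA QVQLVQ
//
DB VBASE2
AC TOY0002
SP Human
GN V-toy
NA caggtgcagctggtgcag
AA QVQLVQ
//
"""

conn = sqlite3.connect(":memory:")
load(conn, parse_records(SAMPLE))
n_sequences = conn.execute("SELECT COUNT(*) FROM sequences").fetchone()[0]
n_entries = conn.execute("SELECT COUNT(*) FROM entries").fetchone()[0]
```

A query joining entries to sequences on seq_id could then write any chosen subset (for example, all human sequences) to FASTA for a BLAST search.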
Jose
Identify differences in histone patterns between IL-10 secreting vs non-secreting T helper cells
FACS purify CD4+ CD44lo CD25- Foxp3GFP- 10BiT- T cells from SPN
Culture with:
- HEL peptide
- DCs
- Skewing cytokines/blocking antibodies
FACS purify CD4+ 10BiT+ vs CD4+ 10BiT-
T cells: 10BiT (IL-10 reporter) Foxp3GFP TCR7 Rag1-/-
ChIP-Seq: Histone Modification
IL-10 Associated Histone Modification Pattern of T helper Subsets
Gene status         Histone pattern
Permissive TSS      H3K4me3
Transcribed gene    H3K4me3 + H3K36me3
Bivalent domain     H3K4me3 + H3K27me3
Repressed TSS       H3K27me3
Poised enhancer     H3K4me1
Active enhancer     H3K4me1 + H3K27Ac
Histone Modification Pattern Maps
List of histone marks of all genes in the different subsets
Assign histone patterns to genes in the different T helper cell subsets
Compare histone patterns in the different T helper cell subsets, gaining insight into “housekeeping” vs activation vs lineage defining genes
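The assignment step can be sketched as a lookup from the set of marks observed at a gene to the gene status in the table on the slide. This is a minimal illustration, not the project's pipeline: it assumes marks have already been called as present or absent per gene, requires an exact match to a listed combination, and writes the enhancer acetylation mark as H3K27Ac, which is assumed to be what the slide intends.

```python
# Mark combinations and statuses taken from the slide's table.
STATUS_BY_MARKS = {
    frozenset({"H3K4me3"}): "Permissive TSS",
    frozenset({"H3K4me3", "H3K36me3"}): "Transcribed gene",
    frozenset({"H3K4me3", "H3K27me3"}): "Bivalent domain",
    frozenset({"H3K27me3"}): "Repressed TSS",
    frozenset({"H3K4me1"}): "Poised enhancer",
    frozenset({"H3K4me1", "H3K27Ac"}): "Active enhancer",
}


def classify(marks):
    """Return the gene status for an exact combination of marks, else 'Unassigned'."""
    return STATUS_BY_MARKS.get(frozenset(marks), "Unassigned")


# Hypothetical mark calls for one gene in two T helper subsets.
status_in_subset_a = classify(["H3K4me3", "H3K27me3"])
status_in_subset_b = classify(["H3K4me3", "H3K36me3"])
```

Comparing the status of the same gene across subsets, as in the two calls above, is the basis for separating genes whose pattern is constant from those whose pattern changes with lineage.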
[Figure: T helper subsets (Foxp3, Rort, Tbet, Gata3) with their skewing cytokines (IL-2, IL-4, IL-6, IL-12, IFN-γ, TGF-β) and co-expression of IL-10 with IL-4, IFN-γ, TGF-β or IL-17; IL-10? 10BiT+ vs 10BiT-]
Leona
Khokha Lab – Computational Goal
• Define All Exons in X. tropicalis
• How?
  • Combine current gene models, transcriptome assemblies, available and soon to be available RNA-seq
• Why?
  • Genome sequence/annotation - imperfect
  • Exon Capture Sequencing – mutant gene identification
  • Gap Capture – gene model improvements
  • Analysis of RNA-seq data
• When?
  • Tomorrow would be good
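Combining exon evidence from several sources can be pictured as taking the union of overlapping intervals. The sketch below shows only that one step, for a single chromosome and strand, with invented coordinates; it is an assumption about how the sources might be combined, not the lab's method.

```python
def merge_exons(intervals):
    """Merge overlapping (start, end) intervals into a sorted, non-overlapping list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend it if needed.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(interval) for interval in merged]


# Hypothetical exon calls from gene models, assemblies and RNA-seq.
gene_model_exons = [(100, 200), (400, 500)]
assembly_exons = [(150, 250)]
rnaseq_exons = [(480, 520), (700, 750)]

all_exons = merge_exons(gene_model_exons + assembly_exons + rnaseq_exons)
```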
Mustafa
Mary
One exercise I am trying now is to predict the potential PfSUB2 substrates (PfSUB2 is a subtilisin-like protease with relative sequence-specificity at the cleavage site) in the Plasmodium falciparum protein database. I downloaded the predicted protein sequences for one chromosome (Chr 13) and, using grep, detected 38 sequences. After editing the output with sed to make it look like a FASTA file (with line numbers as identifiers for each sequence), I queried it against the chr13 protein database (blastp with max_target_seqs 1) to obtain accession numbers.
I am yet to write/try any script.
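A first script for this exercise might look like the sketch below. It reads a FASTA file, joins wrapped sequence lines, and reports each motif match together with the sequence identifier, so the accession is kept and no BLAST step is needed to recover it. The motif shown is a made-up placeholder, not the PfSUB2 cleavage-site pattern, and the two sequences are toy data.

```python
import re


def read_fasta(text):
    """Yield (identifier, sequence) pairs, joining wrapped sequence lines."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].split()[0], []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def find_motif(fasta_text, pattern):
    """Return (identifier, 1-based position, matched text) for every match."""
    regex = re.compile(pattern)
    hits = []
    for name, sequence in read_fasta(fasta_text):
        for match in regex.finditer(sequence):
            hits.append((name, match.start() + 1, match.group()))
    return hits


# Toy FASTA; the motif in protA spans a line break.
TOY_FASTA = """\
>protA toy protein
MKAS
GLLA
>protB toy protein
MKLLV
"""

PLACEHOLDER_MOTIF = "A[ST]G"  # hypothetical pattern, for illustration only
motif_hits = find_motif(TOY_FASTA, PLACEHOLDER_MOTIF)
```

Because the sequence lines are joined before searching, a motif that is split across two lines of the file is still found, which a line-by-line grep would miss.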
What I would like to do: I have a Pfmsp7 knock-out parasite line which shows an invasion phenotype. Once the technical hurdles are passed (like getting rid of rRNA sequences with polyA in them), we want to analyse it by RNA-seq in order to identify genes that may have been affected.
Madhu