BIF-30806 Group Project Group (A)rabidopsis: David Nieuwenhuijse Matthew Price Qianqian Zhang Thijs Slijkhuis Species: C. elegans Project: Advanced (+Basic)


BIF-30806 Group Project

Group (A)rabidopsis:
David Nieuwenhuijse
Matthew Price
Qianqian Zhang
Thijs Slijkhuis

Species: C. elegans

Project: Advanced (+Basic)


Progress Report


Project Overview

◦ Dataset Preparation
◦ Transcriptome Construction Pipeline
◦ Differentially Expressed Genes
◦ Gene Function
◦ Biological Explanation
◦ Co-expressed Genes Modules
◦ Functional Description & Explanation
◦ Module Conservation b/w species
◦ Gene Expression (Basic Project)
◦ Relationship to Transcript Properties
◦ Visualisation of Interaction Network


Results so far

David Nieuwenhuijse
◦ GeneID and GO term extraction tool
◦ Cytoscape GO enrichment analysis
◦ Finding automatic GO enrichment tool for pipeline
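As an illustration of what an automatic GO enrichment step in the pipeline might compute, here is a minimal sketch of a one-sided hypergeometric test per GO term. It is not the group's tool: the function names, the `gene_to_go` mapping and the toy GO identifiers are all assumptions made for this example, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_pvalue(population, annotated, sample, hits):
    """P(X >= hits) when `sample` genes are drawn without replacement from
    `population` genes, of which `annotated` carry the GO term."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


def go_enrichment(gene_to_go, selected):
    """Return (term, hits, p_value) tuples sorted by ascending p-value.

    gene_to_go: dict mapping gene ID -> set of GO terms (hypothetical input;
                only annotated genes form the background).
    selected:   iterable of gene IDs, e.g. differentially expressed genes.
    """
    background = set(gene_to_go)
    selected = set(selected) & background

    # Invert the mapping: GO term -> genes annotated with it.
    term_to_genes = {}
    for gene, terms in gene_to_go.items():
        for term in terms:
            term_to_genes.setdefault(term, set()).add(gene)

    results = []
    for term, genes in term_to_genes.items():
        hits = len(genes & selected)
        if hits == 0:
            continue  # terms absent from the selection cannot be enriched
        p = hypergeom_pvalue(len(background), len(genes), len(selected), hits)
        results.append((term, hits, p))
    return sorted(results, key=lambda r: r[2])
```

Because the background is restricted to annotated genes, the result depends on how many genes carry any GO annotation at all.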

Qianqian Zhang
◦ Created a shell script for running the Cuffdiff, Gffread and Samtools programs
◦ Obtained the gene lists of the most differentially expressed genes and the highest expressed genes
◦ Visualization of differentially expressed genes with the cummeRbund package: density plot, scatter plot, volcano plot, p-value distribution plot, MA plot, etc.
◦ Basic statistics of differentially expressed genes
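The gene lists mentioned above could be produced along the following lines. This is a minimal sketch, not the group's script: it assumes a tab-separated Cuffdiff `gene_exp.diff` file with the usual `gene`, `status`, `value_1`, `value_2` and `p_value` columns, and the function names are made up for this example.

```python
import csv


def read_diff(diff_path):
    """Load a tab-separated Cuffdiff gene_exp.diff file as a list of dicts."""
    with open(diff_path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def top_differential_genes(diff_path, n=100):
    """Names of the n genes with the smallest p-value.

    Only rows whose test status is "OK" are considered, since other rows
    (assumed here to be e.g. NOTEST) carry no meaningful p-value.
    """
    rows = [r for r in read_diff(diff_path) if r["status"] == "OK"]
    rows.sort(key=lambda r: float(r["p_value"]))
    return [r["gene"] for r in rows[:n]]


def top_expressed_genes(diff_path, n=100):
    """Names of the n genes with the highest expression in either sample."""
    rows = read_diff(diff_path)
    rows.sort(
        key=lambda r: max(float(r["value_1"]), float(r["value_2"])),
        reverse=True,
    )
    return [r["gene"] for r in rows[:n]]
```

Sorting on `q_value` instead of `p_value` would rank by the multiple-testing-corrected significance; which one to use is a design choice the slides do not settle.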


Results so far

Matthew Price
◦ Script for listing the top 100 expressed genes
◦ Script for determining GC content, transcript length and intron length
◦ Script for computing the correlation between each transcript property and the expression level

Thijs Slijkhuis
◦ Created a shell script that:
    ◦ Downloads the source files
    ◦ Converts SRA files into FASTQ files
    ◦ Runs bowtie2-build
    ◦ Runs tophat
    ◦ Runs cufflinks
◦ Programmed a script that sorts the cuffdiff output by p-value (significance of differential expression) and extracts the gene names from it
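The transcript-property scripts listed on this slide (GC content, transcript length, correlation with expression) are not shown in the presentation. A minimal sketch of the idea, assuming transcript sequences and expression values are already loaded into plain dictionaries keyed by transcript ID, might look like this. All names are hypothetical, Pearson correlation is an assumption (the slides do not say which measure was used), and intron length is left out because it needs the exon coordinates from the annotation.

```python
from math import sqrt


def gc_content(seq):
    """Fraction of G and C bases in a nucleotide sequence (0.0 if empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long number lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def property_correlations(transcripts, expression):
    """Correlate GC content and length with expression.

    transcripts: dict transcript ID -> nucleotide sequence (assumed input)
    expression:  dict transcript ID -> expression value, e.g. FPKM
    Only IDs present in both dictionaries are used.
    """
    ids = sorted(set(transcripts) & set(expression))
    levels = [expression[i] for i in ids]
    return {
        "gc_content": pearson([gc_content(transcripts[i]) for i in ids], levels),
        "length": pearson([len(transcripts[i]) for i in ids], levels),
    }
```

Expression values are often heavily skewed, so a rank-based measure or a log transform may be more appropriate than raw Pearson correlation.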


Issues/Challenges

Co-expressed Genes Modules
◦ WGCNA package not usable in our case
◦ Use cummeRbund package to get heatmaps

GO enrichment analysis
◦ Not many genes are annotated in the GO database.
◦ Gene IDs of the differentially expressed genes are not compatible with the NCBI database.

Transcript sequences
◦ Not all expressed transcripts in the .gtf file can be matched to their corresponding sequence in the FASTA file.
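To quantify the transcript-sequence issue, one could list the transcript IDs that appear in the .gtf file but have no FASTA record. The sketch below assumes that the GTF attribute column contains `transcript_id "..."` and that each FASTA header starts with the same ID as its first whitespace-delimited token. Both are assumptions about the group's files, and the function names are hypothetical.

```python
import re

# Matches the transcript_id attribute in the ninth GTF column.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def gtf_transcript_ids(gtf_lines):
    """Collect every transcript_id found in the lines of a GTF file."""
    ids = set()
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        match = TRANSCRIPT_ID.search(fields[8])
        if match:
            ids.add(match.group(1))
    return ids


def fasta_ids(fasta_lines):
    """Collect the first token of every non-empty FASTA header line."""
    return {
        line[1:].split()[0]
        for line in fasta_lines
        if line.startswith(">") and len(line.strip()) > 1
    }


def unmatched_transcripts(gtf_lines, fasta_lines):
    """Transcript IDs present in the GTF but absent from the FASTA."""
    return sorted(gtf_transcript_ids(gtf_lines) - fasta_ids(fasta_lines))
```

Both functions accept any iterable of lines, so open file handles can be passed directly, for example `unmatched_transcripts(open(gtf_path), open(fasta_path))` with paths of your own.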


Thank you for your attention!