MASTER DEGREE in BIOINFORMATICS - Inserm · MASTER DEGREE in BIOINFORMATICS M2 - 2015-2016 ... multiview machine learning. ... only responsible for the making of two crucial products

Paris Diderot University

MASTER DEGREE in BIOINFORMATICS

M2 - 2015-2016

Internship Project

Supervisors: Mohamed Elati & Costas Bouyioukos

Name: ELATI

First Name: Mohamed

Professional Address: 5 rue Henri Desbruères, Genopole Bat. Genavenir 6, Evry 91030

Position: Associate Professor (Maître de Conférences) at the Université d'Evry-Val-d'Essonne.

Laboratory: Modelling and Engineering Genome Architecture (MEGA)

Institution: Institute of Systems and Synthetic Biology (iSSB), CNRS, Genopole, Université d'Evry-Val-d'Essonne.

Teams involved: François Képès (DR CNRS), head of the MEGA team, iSSB; Rachida Tahar (CR IRD), MERIT team.

Tel: 01 69 47 44 43 email: [email protected]

Title: “Elucidating the complex interplay between genome regulatory, expression and architecture by an integrated multiview -omics analysis”

Keywords: Statistical analysis and modelling of multi-omics data / Chromosome folding and structure / multiview machine learning.

Description: See overleaf

Additional questions:

Does this project constitute the first steps of a PhD thesis that will be supported by a PhD fellowship? Yes

Have you had the opportunity to supervise a French student before? Yes

Do you have any special accommodation or fellowship for foreign students? Yes

Please provide us with any additional details you think important about your team.

Elucidating the complex interplay between genome regulatory, expression and architecture by an integrated multiview -omics analysis

Supervisors: Mohamed Elati & Costas Bouyioukos

Contact: [email protected]

Labs involved: MEGA iSSB, François Képès; MERIT IRD, Rachida Tahar

September 29, 2015

1 Background

Global genome expression both affects and organises the three-dimensional (3D) folding and architecture of the genome. Interactions between genome features such as genes, promoters, and trans- and cis-regulatory elements, at various scales, provide an additional mechanism for controlling genome expression.

Figure 1: Interplay between genome structure, genome expression and genome layout.

At the epigenomic level, this regulation is realised through foci in which the above-mentioned players concentrate locally to form transcription factories. The increasing availability of high-dimensional multi-omics datasets, such as Hi-C contact maps, epigenetic modifications and transcriptomics data, together with the development of novel multivariate analysis techniques, makes it possible to explore systematically the complex interplay between the 3D structure of the genome and the regulation of its expression. In this project the student will use publicly available high-throughput multi-omics datasets, as well as an array of tools and techniques for genome analysis and statistical learning developed by the MEGA team, to explore and discover key elements of the relationship between genome architecture and expression, with a view to improving our knowledge of genomes and our ability to understand and synthesise them.

2 Project Description

The project concerns the study of the interplay between genome expression and architecture in two unicellular eukaryotic organisms of extreme biological importance. Plasmodium falciparum is the malaria parasite, responsible for hundreds of thousands of deaths every year in the developing world. Saccharomyces cerevisiae is baker's yeast, the microorganism that is not only responsible for the making of two crucial products for human nutrition (bread and beer) but also a workhorse for the synthetic production of scores of important biomolecules (antibiotics, drugs, etc.).

We collect the 3D genome sequencing information for Plasmodium [1, 2] and yeast [4], together with the plethora of publicly available gene expression datasets for both organisms. We first apply GREAT:SCAN:patterns, a genomic analysis tool developed in the MEGA team (https://absynth.issb.genopole.fr/Bioinformatics/tools/GREAT) [3]. We then apply a series of statistical learning techniques based on canonical analysis [6], statistical learning tools developed in house [5] and multi-omics analysis techniques, in order to identify biological principles that connect genome architecture with expression regulation.
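To give a flavour of the canonical-analysis step, here is a minimal sketch in Python. It is neither GREAT:SCAN:patterns nor the sparse method of [6]: it implements plain (non-sparse) canonical correlation analysis with NumPy and runs it on synthetic data. The function name `cca`, the two views `arch` and `expr`, the regularisation value and all matrix sizes are assumptions made for illustration only.

```python
import numpy as np


def cca(X, Y, n_components=2, reg=1e-6):
    """Plain canonical correlation analysis via whitening and SVD.

    X: (n, p) first view, Y: (n, q) second view, rows are shared observations.
    Returns projection matrices A (p, k), B (q, k) and the k canonical correlations.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]

    # Sample covariance blocks; a small ridge keeps the inverses stable.
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)

    def inv_sqrt(S):
        # Inverse square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    Wx, Wy = inv_sqrt(Sxx), inv_sqrt(Syy)

    # Singular values of the whitened cross-covariance are the canonical correlations.
    U, s, Vt = np.linalg.svd(Wx @ Sxy @ Wy)
    A = Wx @ U[:, :n_components]
    B = Wy @ Vt.T[:, :n_components]
    return A, B, s[:n_components]


# Synthetic example (assumed data, not real Hi-C or expression measurements):
# one hidden factor z drives one feature in each view.
rng = np.random.default_rng(0)
n = 300                                 # e.g. genes or genomic bins
z = rng.normal(size=n)

arch = rng.normal(size=(n, 5))          # hypothetical architecture-derived features
arch[:, 0] += 2.0 * z
expr = rng.normal(size=(n, 4))          # hypothetical expression across conditions
expr[:, 1] += 2.0 * z

A, B, corrs = cca(arch, expr)

# Correlation of the first pair of canonical variates, computed directly.
u = (arch - arch.mean(axis=0)) @ A[:, 0]
v = (expr - expr.mean(axis=0)) @ B[:, 0]
first_pair_corr = np.corrcoef(u, v)[0, 1]

print("canonical correlations:", corrs)
print("corr of first canonical variates:", first_pair_corr)
```

In this toy setting the first canonical pair should pick up the shared factor while the second reflects only noise. With real data the number of features typically far exceeds the number of observations, which is the situation that sparse or regularised variants such as those in [6] are designed for.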

3 Requirements

Student applicants are expected to have a working knowledge of a statistical analysis package (e.g. R) and/or a programming language (Python, Perl or Java, in this order of preference), experience with the command line, and a casual knowledge of biology. Knowledge of biology is not essential at the outset, but the student is expected to have acquired it by the end of the project. Knowledge of programming is also not essential, but a desire to learn how to code and an open attitude towards doing biology “in silico” are highly desirable.

References

[1] Talleh Almelli, Gregory Nuel, Emmanuel Bischoff, Agnes Aubouy, Mohamed Elati, Christian William Wang, Marie-Agnes Dillies, Jean-Yves Coppee, Georges Nko Ayissi, Leonardo Kishi Basco, Christophe Rogier, Nicaise Tuikue Ndam, Philippe Deloron, and Rachida Tahar. Differences in gene transcriptomic pattern of Plasmodium falciparum in children with cerebral malaria and asymptomatic carriers. PLoS One, 9(12):e114401, 2014.

[2] Ferhat Ay, Evelien M. Bunnik, Nelle Varoquaux, Sebastiaan M. Bol, Jacques Prudhomme, Jean-Philippe Vert, William Stafford Noble, and Karine G. Le Roch. Three-dimensional modeling of the P. falciparum genome during the erythrocytic cycle reveals a strong connection between genome architecture and gene expression. Genome Research, 24(6):974–988, Jun 2014.

[3] Costas Bouyioukos, Mohamed Elati, and Francois Kepes. Hydrocarbon and Lipid Microbiology Protocols, chapter Protocols for probing genome architecture of regulatory networks in hydrocarbon and lipid microorganisms, pages –. Springer Protocols Handbooks. Springer, Heidelberg, 2015. In Press.

[4] Zhijun Duan, Mirela Andronescu, Kevin Schutz, Sean McIlwain, Yoo Jung Kim, Choli Lee, Jay Shendure, Stanley Fields, C. Anthony Blau, and William S. Noble. A three-dimensional model of the yeast genome. Nature, 465(7296):363–367, May 2010.

[5] Mohamed Elati, Remy Nicolle, Ivan Junier, David Fernandez, Rim Fekih, Julio Font, and Francois Kepes. PreCisIon: PREdiction of CIS-regulatory elements improved by gene’s positION. Nucleic Acids Research, 41(3):1406–1415, Feb 2013.

[6] Kim-Anh Le Cao, Pascal G. P. Martin, Christele Robert-Granie, and Philippe Besse. Sparse canonical methods for biological data integration: application to a cross-platform study. BMC Bioinformatics, 10:34, 2009.
