
GenoCensus™: Analysis and Visualisation of Genomic Copy Number Variation



E'Krame Ayari, Abdel Benajjou, Jean-Paul Saraiva, Stephanie Maillard, Pierre Lindenbaum, Emmanuel Martin, Philippe Gesnouin, Frederic Tores and Peter Brooks

IntegraGen SA, Evry, France

GenoCensus™ is a stand-alone Java program for Unix or Windows that processes files of numerical data generated from analysis of microarray images.

The program invokes BioConductor R modules for some tasks.

IntegraGen has developed 500 kb resolution BAC arrays (IntegraChip™) with 5244 clones and by default the program uses annotations of chromosomal locations that are specific for these BAC clones.

Preliminary experiments indicate that BAC arrays are capable of detecting lower levels of mosaicism than high density oligonucleotide arrays.

IntegraChip™ manufacture is sufficiently robust to permit inter-slide “single colour” analysis.

Cy5 : Test DNA

Cy3 : Reference DNA

Spot replicates: 4 replicate spots per clone (2 spots in each block, each block in 2 zones).

5244 BAC clones, with a median of 0.5 Mb between clones.

• Numerous controls, including Cot DNA and rice BACs.

• Each block has 2 dilution series of pools of BACs for internal QC of normalization.

Clone identity by end sequencing

0.5 Mb median resolution

20% euchromatin coverage

Image analysis and GenoCensus™ workflow:

Array image (TIFF)
→ Image analysis (GenePix, or R SpotSegmentation)
→ gpr file or flat file (txt)
→ Filters (intensity, background, replicate variation…)
→ Normalisation (Block Lowess)
→ Group replicates
→ Gain/Loss regions (CBS, BioConductor R module DNAcopy)
→ Genomic position profiles

GenoCensus™ processes array image numerical data such as from GenePix (Axon) gpr files. The next revision of GenoCensus will include image analysis with the BioConductor module SpotSegmentation.

Data processing options are easily modified to adapt the analysis to different experimental contexts.
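As an illustration of the filtering and replicate-grouping steps, here is a minimal Python sketch. It is not GenoCensus code: the function name, the input layout and both cut-off values (`min_intensity`, `max_replicate_sd`) are assumptions chosen for the example, and the real program may filter and combine replicates differently.

```python
import math
from collections import defaultdict
from statistics import median, pstdev


def group_replicates(spots, min_intensity=200.0, max_replicate_sd=0.2):
    """Filter spots and collapse replicate spots to one log2 ratio per clone.

    spots: iterable of (clone_id, cy5, cy3) background-subtracted intensities,
           where Cy5 is the test DNA and Cy3 the reference DNA.
    Both thresholds are hypothetical defaults, not GenoCensus settings.
    """
    by_clone = defaultdict(list)
    for clone_id, cy5, cy3 in spots:
        # Intensity filter: drop spots that are too dim in either channel.
        if cy5 < min_intensity or cy3 < min_intensity:
            continue
        by_clone[clone_id].append(math.log2(cy5 / cy3))

    grouped = {}
    for clone_id, ratios in by_clone.items():
        # Require at least two surviving replicates per clone.
        if len(ratios) < 2:
            continue
        # Replicate-variation filter: drop clones whose replicates disagree.
        if pstdev(ratios) > max_replicate_sd:
            continue
        grouped[clone_id] = median(ratios)
    return grouped
```

With four replicate spots per clone, the median is robust to a single outlying spot, which is one reason it might be preferred over the mean in a sketch like this.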

IntegraChip™ provides high signal-to-noise ratios and homogeneous low ratios for autosomal clones, so low levels of mosaicism are readily detected. The figure shows that about 10% of the cells in a stem cell culture have 3 copies of chromosome 12.
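To give a sense of the signal size involved, the following Python sketch computes the log2 ratio expected for a mosaic sample and inverts it to estimate the mosaic fraction. It assumes an idealised array with no ratio compression and a diploid reference; the function names are hypothetical and the calculation is not taken from GenoCensus.

```python
import math


def expected_log2_ratio(fraction, copies, normal_copies=2):
    """Expected log2(test/reference) when `fraction` of cells carry `copies`."""
    mean_copies = fraction * copies + (1.0 - fraction) * normal_copies
    return math.log2(mean_copies / normal_copies)


def mosaic_fraction(log2_ratio, copies, normal_copies=2):
    """Invert expected_log2_ratio: fraction of cells carrying `copies`."""
    if copies == normal_copies:
        raise ValueError("copies must differ from normal_copies")
    return normal_copies * (2.0 ** log2_ratio - 1.0) / (copies - normal_copies)
```

Under these assumptions, 10% of cells with 3 copies gives a mean copy number of 2.1 and a log2 ratio of about 0.07, which is why low noise across autosomal clones matters for detecting such a shift.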


IntegraChip™ V7

Whole genome coverage

Stem Cell Heterogeneity

Summary

Chromosomal profiles can be viewed with or without cytogenetic bands. This example shows a dye reversal from a “trio design”. Plotting average ratios of the dye swap is an option.
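One common way to average a dye-swap pair is sketched below in Python. It assumes that both hybridisations are expressed as log2(Cy5/Cy3) per clone, so that the reversed array carries the opposite sign; the function name and the dictionary layout are illustrative and not part of GenoCensus.

```python
def dye_swap_average(forward, reverse):
    """Average a dye-swap pair of log2(Cy5/Cy3) ratios per clone.

    forward: {clone_id: log2 ratio} with test DNA in Cy5.
    reverse: {clone_id: log2 ratio} with test DNA in Cy3 (dyes swapped).
    The reversed ratio is negated before averaging; clones missing from
    either array are skipped.
    """
    common = forward.keys() & reverse.keys()
    return {clone: (forward[clone] - reverse[clone]) / 2.0 for clone in common}
```

A dye-specific bias adds the same offset to both arrays, so it cancels in the difference while a true copy number change is retained.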

Dye Swap Trios

Click-drag zooming functionality.

GenoCensus™ whole genome view


Identification of recurrent gain/loss patterns on chromosome 11 among many tumour samples (courtesy CIT, LNCC, Paris).

Comparison of BAC arrays with Illumina HapMap 317k arrays

IntegraGen’s full-service Illumina platform enabled comparison of the 2 platforms with identical DNA samples. Illumina’s gain/loss calling algorithm forces a compromise between resolution and the number of false positive calls.

The region of loss is more clearly distinguished with the IntegraChip™ BAC array, which implies that the BAC array will be more sensitive in detecting low levels of mosaicism.

Pink bars: regions called as losses by Illumina software. The deletion includes both regions. The interruption is coincident with a CNV.

Detection of tumour DNA aberrations in the presence of a large excess of DNA from normal cells.

Tumour DNA with multiple gains and losses.

Regions of gain and loss beyond the user-specified threshold are tabulated for export.
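The thresholding and export step could look like the Python sketch below. The segment tuple layout, the default threshold of 0.2 and the column names are assumptions for illustration; they are not the GenoCensus export format.

```python
import csv
import io


def call_regions(segments, threshold=0.2):
    """Keep segments whose mean log2 ratio lies beyond +/- threshold.

    segments: iterable of (chrom, start, end, mean_log2_ratio).
    Returns a list of (chrom, start, end, mean_log2_ratio, status).
    """
    calls = []
    for chrom, start, end, mean in segments:
        if mean >= threshold:
            status = "gain"
        elif mean <= -threshold:
            status = "loss"
        else:
            continue  # within the normal range, not tabulated
        calls.append((chrom, start, end, mean, status))
    return calls


def to_tsv(calls):
    """Serialise called regions as tab-separated text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["chrom", "start", "end", "mean_log2_ratio", "status"])
    for chrom, start, end, mean, status in calls:
        writer.writerow([chrom, start, end, f"{mean:.3f}", status])
    return buffer.getvalue()
```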

Currently we use Circular Binary Segmentation by invoking the BioConductor module “DNAcopy”. Other algorithms can be readily integrated into the Java program. Hot links connect directly to the Toronto TCAG browser to determine whether the region contains documented CNVs or segmental duplications.
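For readers unfamiliar with Circular Binary Segmentation, the Python sketch below illustrates its central idea: search for the contiguous arc of clones whose mean log2 ratio differs most from the rest of the chromosome, then judge that maximum against permuted data. This is a simplified teaching example and not the DNAcopy implementation; it finds a single arc only, omits recursion on the resulting segments, and uses a hypothetical default of 200 permutations.

```python
import math
import random
from itertools import accumulate


def best_arc(values):
    """Return (statistic, i, j) for the arc values[i:j] that differs most.

    The statistic is |mean inside - mean outside| scaled by
    sqrt(1/n_inside + 1/n_outside). It is not divided by the standard
    deviation, which is identical for every permutation of the same data.
    """
    n = len(values)
    sums = [0.0] + list(accumulate(values))
    total = sums[n]
    best = (0.0, 0, n)
    for i in range(n):
        for j in range(i + 1, n + 1):
            inside = j - i
            outside = n - inside
            if outside == 0:
                continue  # the arc must leave some clones outside
            inside_mean = (sums[j] - sums[i]) / inside
            outside_mean = (total - sums[j] + sums[i]) / outside
            z = abs(inside_mean - outside_mean) / math.sqrt(1 / inside + 1 / outside)
            if z > best[0]:
                best = (z, i, j)
    return best


def permutation_p_value(values, n_perm=200, seed=0):
    """Estimate how often shuffled data reach the observed maximum statistic."""
    rng = random.Random(seed)
    observed = best_arc(values)[0]
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if best_arc(shuffled)[0] >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

The exhaustive search is quadratic in the number of clones per chromosome, which is tolerable for a few hundred BAC clones but would need the faster strategies of a production implementation for dense arrays.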

Inter-array analysis of copy number changes.

Tumour DNA heterogeneity

Recurrent patterns in tumour DNA

Calling regions of gains and losses