Analysis of GEO datasets using GEO2R Parthav Jailwala CCR Collaborative Bioinformatics Resource...

Analysis of GEO datasets using GEO2R

Parthav Jailwala

CCR Collaborative Bioinformatics Resource

CCR/NCI/NIH

Outline

• Background on GEO datasets
• What is GEO2R and how can it help you
• How to use GEO2R
• Options and features
• Limitations and caveats
• Hands-on exercise

• An international public repository that archives and freely distributes high-throughput microarray & NGS data submitted by the scientific community

• About a billion individual gene expression measurements, derived from over 100 organisms and addressing a wide range of biological issues

• Data can be explored, queried and visualized using user-friendly Web-based tools

GEO data organization

[ GPLxxx ] [ GSMxxx ] [ GSExxx ]

[ GDSxxx ]
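The four accession prefixes above correspond to the GEO record types Platform (GPL), Sample (GSM), Series (GSE) and curated DataSet (GDS). As a minimal sketch, the mapping can be written as a small lookup; the function name `classify_accession` is hypothetical and exists only for illustration.

```python
import re

# Accession prefixes and the GEO record type each one denotes.
RECORD_TYPES = {
    "GPL": "Platform",
    "GSM": "Sample",
    "GSE": "Series",
    "GDS": "DataSet",
}


def classify_accession(accession: str) -> str:
    """Return the GEO record type for an accession such as 'GSE18388'."""
    match = re.fullmatch(r"(GPL|GSM|GSE|GDS)(\d+)", accession.strip().upper())
    if match is None:
        raise ValueError(f"not a GEO accession: {accession!r}")
    return RECORD_TYPES[match.group(1)]


print(classify_accession("GSE18388"))
```

GEO2R works on Series records, so an accession that classifies as anything other than "Series" is not a valid starting point for the tool.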

What kinds of data does GEO host?

• GEO was designed around the common features of most of the high-throughput and parallel molecular abundance-measuring technologies in use today. These include:

– Gene expression profiling by microarray or next-generation sequencing – Non-coding RNA profiling by microarray or next-generation sequencing– Chromatin immunoprecipitation (ChIP) profiling by microarray or next-

generation sequencing– Genome methylation profiling by microarray or next-generation

sequencing– Genome variation profiling by array (arrayCGH)– SNP arrays– Serial Analysis of Gene Expression (SAGE)– Protein arrays

What is GEO2R ?

• Interactive web tool that allows users to compare two or more groups of Samples in a GEO Series in order to identify genes that are differentially expressed across experimental conditions

• Uses the GEOquery and limma R packages from the Bioconductor project

• Simple interface that allows users to perform R statistical analysis without command line expertise

• Does not rely on curated ‘DataSets’ and interrogates the original Series Matrix data file directly
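To give a feel for what a two-group comparison computes for a single gene, the sketch below derives a log fold change and an ordinary Welch t-statistic from log2 expression values. This is a deliberate simplification: limma fits linear models and uses moderated t-statistics, which this code does not reproduce. The function name, the sign convention (second group minus first) and the input values are assumptions made for illustration.

```python
import math
import statistics


def compare_groups(group_a, group_b):
    """Return (log fold change, Welch t-statistic) for one gene.

    Both inputs are lists of log2 expression values, one per Sample.
    The fold change is reported as group_b minus group_a (an assumed convention).
    Each group needs at least two values, and the variances must not both be zero.
    """
    mean_a = statistics.mean(group_a)
    mean_b = statistics.mean(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    log_fc = mean_b - mean_a
    std_err = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return log_fc, log_fc / std_err


# Made-up log2 values for one gene in two groups of three Samples.
log_fc, t_stat = compare_groups([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(log_fc, t_stat)
```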

How to use GEO2R

• Enter a Series accession number

– Follow a link from a Series record OR
– Enter a Series accession number

• Define Sample groups

– At least 2 and up to 10 groups can be defined

• Assign Samples to each group

– Not all samples in a series need to be selected

• Perform the test

– Assess sample value distributions
– Edit default test parameters

• Interpret the results

– Table of the top 250 genes ranked by p-value
– Select columns to be included in the output table
– Edit the test parameters, then click Recalculate to apply the edits
– Download the tab-delimited table and open it in Excel
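The downloaded tab-delimited table can also be filtered programmatically instead of in Excel. The sketch below is one way to do that, assuming the table has columns named `ID`, `adj.P.Val`, `logFC` and `Gene.symbol`; check the header of your own download, since the columns depend on what was selected in the output table. The rows in `SAMPLE_TABLE` and the thresholds are invented for illustration.

```python
import csv
import io

# Hypothetical excerpt of a downloaded table; the values are made up.
SAMPLE_TABLE = (
    "ID\tadj.P.Val\tP.Value\tlogFC\tGene.symbol\n"
    "probe_1\t0.001\t0.00001\t2.5\tGeneA\n"
    "probe_2\t0.20\t0.01\t-0.4\tGeneB\n"
    "probe_3\t0.03\t0.0009\t-1.8\tGeneC\n"
)


def significant_rows(handle, max_adj_p=0.05, min_abs_logfc=1.0):
    """Return (ID, symbol, adjusted p, logFC) tuples passing both thresholds."""
    reader = csv.DictReader(handle, delimiter="\t")
    hits = []
    for row in reader:
        adj_p = float(row["adj.P.Val"])
        log_fc = float(row["logFC"])
        if adj_p <= max_adj_p and abs(log_fc) >= min_abs_logfc:
            hits.append((row["ID"], row["Gene.symbol"], adj_p, log_fc))
    # Most significant first, matching the ranking used in the results table.
    return sorted(hits, key=lambda hit: hit[2])


hits = significant_rows(io.StringIO(SAMPLE_TABLE))
for hit in hits:
    print(hit)
```

To run it on a real download, pass an open file handle in place of the `io.StringIO` object.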

Options and features

• Value distribution

– Number summary or boxplot
– Median-centered values indicate that the data are normalized and cross-comparable
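The check described above can be approximated outside the tool. The sketch below computes a five-number summary per Sample and flags whether the medians sit close together; the Sample names, the values and the `tolerance` threshold are all hypothetical, and the quartiles come from Python's `statistics.quantiles`, which may differ slightly from what GEO2R reports.

```python
import statistics


def five_number_summary(values):
    """Return (min, first quartile, median, third quartile, max)."""
    ordered = sorted(values)
    q1, _, q3 = statistics.quantiles(ordered, n=4)
    return (ordered[0], q1, statistics.median(ordered), q3, ordered[-1])


def medians_are_aligned(samples, tolerance=0.5):
    """True when the spread of per-Sample medians is within the tolerance."""
    medians = [statistics.median(values) for values in samples.values()]
    return max(medians) - min(medians) <= tolerance


# Made-up log2 values for three Samples; the third is shifted upward.
samples = {
    "sample_a": [6.1, 7.0, 7.9, 8.4, 9.2],
    "sample_b": [6.0, 7.1, 8.0, 8.5, 9.0],
    "sample_c": [9.5, 10.2, 11.0, 11.9, 12.4],
}

for name, values in samples.items():
    print(name, five_number_summary(values))
print("medians aligned:", medians_are_aligned(samples))
```

A Sample whose median is far from the others, like `sample_c` here, suggests the values are not directly comparable without further normalization.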

• Options

– Apply adjustment of p-values
– Apply log transformation to the data
– Category of Platform annotation to display on results: NCBI generated (preferred) or Submitter supplied
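As an illustration of what a p-value adjustment does, the sketch below implements the Benjamini-Hochberg false discovery rate procedure, a widely used method for multiple-testing correction. Treat it as an example of the idea rather than a reproduction of GEO2R's output: the tool offers its own choice of adjustment methods, and the input p-values here are invented.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

Each adjusted value is at least as large as the raw p-value it came from, which is why fewer genes pass a fixed cutoff after adjustment.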

• Profile graph

• R script

Limitations & caveats

• Check that Sample values are comparable

– Assess the value distribution boxplot
– Review the GEO Series experiment description

• Data type restriction

– Some GEO data do not have data tables (e.g. high-throughput sequencing or genome tiling arrays)

• Within-Series restriction

– No cross-series comparisons

• 255 Sample limit

• 10 minute timeout
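Some of these limits can be checked before starting an analysis. The sketch below validates a planned group design against the 2 to 10 group range and the 255 Sample limit stated in these slides. Applying the 255 limit to the Samples assigned to groups, and rejecting a Sample that appears in two groups or a group with no Samples, are assumptions made for this sketch; the group names and accessions are hypothetical.

```python
MIN_GROUPS = 2
MAX_GROUPS = 10
MAX_SAMPLES = 255


def check_group_design(groups):
    """Return a list of problems found in a {group name: [Sample IDs]} mapping."""
    problems = []
    if not MIN_GROUPS <= len(groups) <= MAX_GROUPS:
        problems.append(f"need {MIN_GROUPS} to {MAX_GROUPS} groups, got {len(groups)}")
    all_samples = [s for members in groups.values() for s in members]
    if len(all_samples) > MAX_SAMPLES:
        problems.append(f"{len(all_samples)} Samples exceeds the limit of {MAX_SAMPLES}")
    # Assumption: each Sample belongs to at most one group.
    if len(set(all_samples)) != len(all_samples):
        problems.append("a Sample is assigned to more than one group")
    for name, members in groups.items():
        if not members:
            problems.append(f"group {name!r} has no Samples")
    return problems


design = {
    "control": ["GSM0001", "GSM0002"],
    "treated": ["GSM0003", "GSM0004"],
}
print(check_group_design(design))
```

An empty list means the design passes these checks; it says nothing about the 10 minute timeout, which depends on how long the analysis takes to run.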

Summary statistics from Limma

Hands-on exercise

• Google: GSE18388
• Microarray Analysis of Space-flown Murine Thymus Tissue

Further learning resources on GEO2R

• Full description:
– http://www.ncbi.nlm.nih.gov/geo/info/geo2r.html

• YouTube video:
– https://www.youtube.com/watch?v=EUPmGWS8ik0

• Example walkthrough:
– http://www.bioinformatics.polimi.it/masseroli/bcbmm/material/practices/E2_GEO2R_Bioconductor_Tutorial.docx
