Developing GUI Microarray Analysis Tools
Keith Satterley, Bioinformatics, WEHI, Nov. 15 2005
The Walter and Eliza Hall Institute of Medical Research
Overview
1. R, Environment, tools & resources.
2. Graphical tools.
3. LimmaGUI and AffylmGUI.
4. Example Analysis.
5. Resources available.
6. Future Developments.
The R Project for Statistical Computing
• R is a language and environment for statistical computing and graphics. R is released under the GNU General Public License.
• R is a free software environment that compiles and runs on a wide variety of platforms, including Unix variants, Windows and MacOS.
• S was developed by John Chambers and colleagues at Bell Labs. R can be considered a different implementation of S.
• R was initially written by Robert Gentleman and Ross Ihaka of the Statistics Department of the University of Auckland.
• Since mid-1997 a large group of individuals has contributed to R by sending code and bug reports.
• The R URL is http://www.r-project.org/
The R Project for Statistical Computing
• R has an effective data handling and storage facility.
• R has a suite of operators for calculations on arrays, in particular matrices.
• R provides a vast number of useful statistical tools, many of which have been painstakingly tested.
• R produces publication-quality graphics in a variety of formats, including JPEG, PostScript, EPS, PDF and BMP.
• R is a well-developed, simple and effective programming language.
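As a small illustration of the operators for arrays and matrices mentioned above, the following sketch uses only base R. The matrices A and B and their values are made up for the example and are not from the talk.

```r
# Two small matrices (values chosen only for illustration)
A <- matrix(c(2, 1, 1, 3), nrow = 2)
B <- matrix(1:4, nrow = 2)

elementwise <- A * B      # element-by-element product
product     <- A %*% B    # matrix product
A_inv       <- solve(A)   # matrix inverse
At          <- t(A)       # transpose

# A times its inverse should be the identity matrix (up to rounding)
identity_check <- all.equal(A %*% A_inv, diag(2))
```

Note that `*` works element by element, while `%*%` is the matrix product.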
The R Project for Statistical Computing
• R allows users to add additional functionality by defining new functions.
• C, C++ and Fortran code can be linked and called at run time.
• R can be extended (easily) via packages.
• There are about eight packages supplied with the R distribution, and many more are available through the CRAN family of Internet sites.
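A minimal sketch of adding functionality by defining a new function. The helper name log2_offset and the sample intensities are hypothetical, chosen only to fit the microarray theme.

```r
# Hypothetical helper: convert intensities to the log2 scale,
# adding a small offset so that zeros do not produce -Inf.
log2_offset <- function(x, offset = 1) {
  stopifnot(is.numeric(x), all(x + offset > 0))
  log2(x + offset)
}

intensities <- c(0, 1, 3, 7, 1023)
log2_offset(intensities)
```

Once defined, the function can be called like any built-in R function.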
Resources for R
• Frequently Asked Questions: http://www.ci.tuwien.ac.at/%7Ehornik/R/R-FAQ.html
• Archives: CRAN (see next).
• Mailing Lists:
– [email protected]
– [email protected]
– [email protected]
• Bug-tracking System: http://bugs.r-project.org/
Resources for R
• CRAN = Comprehensive R Archive Network.
• CRAN is a network of ftp and web servers around the world that store identical, up-to-date versions of code and documentation for R.
• Australia
– http://cran.au.r-project.org/ PlanetMirror, Brisbane
– http://cran.ms.unimelb.edu.au/ University of Melbourne
• Austria
– http://cran.at.r-project.org/ Technische Universitaet Wien
• Brasil
– http://cran.br.r-project.org/ Universidade Federal do Paraná
– http://www.insecta.ufv.br/CRAN/ Federal University of Vicosa
– http://cran.fiocruz.br/ Oswaldo Cruz Foundation, Rio de Janeiro
– http://lmq.esalq.usp.br/CRAN/ University of Sao Paulo, Piracicaba
– http://www.vps.fmvz.usp.br/CRAN/ University of Sao Paulo, Sao Paulo
• Canada
– http://cran.stat.sfu.ca/ Simon Fraser University, Burnaby
– http://probability.ca/cran/ University of Toronto
• China
– http://www.lmbe.seu.edu.cn/CRAN/ Southeast University, Nanjing
• Denmark
– http://cran.dk.r-project.org/ dotsrc.org, Aalborg
• France
– http://cran.fr.r-project.org/ CICT, Toulouse
– http://cran.univ-lyon1.fr/ Dept. of Biometry & Evol. Biology, University of Lyon
– http://mirror.internet.tp/cran/ Boese Internet, Paris
• Germany
– http://cran.r-mirror.de/ Stefan Drees, Berlin
– http://pangora.org/cran/ Pangora GmbH, Hamburg
– http://cran.miscellaneousmirror.org/ Miscellaneousdata.de, Koeln
– http://umfragen.sowi.uni-mainz.de/CRAN/ University of Mainz
– http://cran.mirrorplus.org/ mirrorplus.org, Muenchen
• Hungary
– http://cran.hu.r-project.org/ Semmelweis University
• Italy
– http://cran.arsmachinandi.it/ Ars Machinandi, Arezzo
– http://microarrays.unife.it/CRAN/ Universita di Ferrara
– http://rm.mirror.garr.it/mirrors/CRAN/ Garr Mirror, Milano
– http://dssm.unipa.it/CRAN/ Universita degli Studi di Palermo
• Israel
– http://cran.active.co.il/ Activetech Ltd, Tel-Aviv
• Japan
– ftp://ftp.u-aizu.ac.jp/pub/lang/R/CRAN University of Aizu
– http://cran.md.tsukuba.ac.jp/ University of Tsukuba
• Korea
– http://bibscvs.snu.ac.kr/R/ Seoul National University
• Netherlands
– http://cran.nedmirror.nl/ Nedmirror, Amsterdam
• Poland
– http://novum.am.lublin.pl/CRAN/ Skubiszewski Medical University, Lublin
– http://r.meteo.uni.wroc.pl/ University of Wroclaw
• Portugal
– http://cran.pt.r-project.org/ Universidade do Porto
• Slovenia
– http://www.fastmirrors.org/cran/ Fastmirrors.org, Besnica
– http://www.wsection.com/cran/ Wsection.com, Ljubljana
• South Africa
– http://cbio.uct.ac.za/CRAN/ University of Cape Town
– http://cran.za.r-project.org/ Rhodes University
• Spain
– http://cran.es.r-project.org/ Spanish National Research Network, Madrid
• Switzerland
– http://cran.ch.r-project.org/ ETH Zuerich
– http://www.imsv.unibe.ch/cran/ Universitaet Bern
– http://cran.prokmu.com/ Prokmu Hosting, Bern
• Turkey
– http://godel.cs.bilgi.edu.tr/mirror/cran/ Istanbul Bilgi University
• Taiwan
– http://cran.cs.pu.edu.tw/ Providence University, Taichung
– http://cran.csie.ntu.edu.tw/ National Taiwan University, Taipei
• UK
– http://cran.uk.r-project.org/ University of Bristol
– http://www.sourcekeg.co.uk/cran/ Sourcekeg, London
• USA
– http://cran.cnr.Berkeley.edu University of California, Berkeley, CA
– http://cran.stat.ucla.edu/ University of California, Los Angeles, CA
– http://cran.ssds.ucdavis.edu/ University of California, Davis, CA
– http://rh-mirror.linux.iastate.edu/CRAN/ Iowa State University, Ames, IA
– http://www.biometrics.mtu.edu/CRAN/ Michigan Technological University, Houghton, MI
– http://cran.wustl.edu/ Washington University, St. Louis, MO
– http://www.ibiblio.org/pub/languages/R/CRAN/ University of North Carolina, Chapel Hill, NC
– http://cran.us.r-project.org/ Pair Networks, Pittsburgh, PA
– http://lib.stat.cmu.edu/R/CRAN/ Statlib, Carnegie Mellon University, Pittsburgh, PA
– http://cran.hostingzero.com/ Hosting Zero, Dallas, TX
– http://cran.fhcrc.org/ Fred Hutchinson Cancer Research Center, Seattle, WA
CRAN Mirrors – 475 packages
Resources for R
• Features of R.
– Graphical abilities.
– Package System.
– Objects in R.
Graphical Capabilities in R
• On Unix (inc. Mac OS X) X11 is used.
• On MS Windows it uses the MS Windows system commands.
• This is not a GUI, but a graphics device for plotting and drawing.
• There are high level, low level and interactive plotting commands.
• plot(x) is a high level command.
– If x is a time series, this produces a time-series plot.
– If x is a numeric vector, it produces a plot of the values in the vector against their index in the vector.
– If x is a complex vector, it produces a plot of imaginary versus real parts of the vector elements.
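The three cases of plot(x) listed above can be tried with the following sketch. The data values are invented, and the output is sent to a temporary PDF file (an assumption made so the example also runs without a screen device).

```r
# Draw to a throwaway PDF file so the sketch also runs without a screen device
pdf(file = tempfile(fileext = ".pdf"))

x_ts      <- ts(c(5, 7, 6, 9, 8, 11), start = 2000)  # time series
x_numeric <- c(3.2, 1.5, 4.8, 2.9)                    # numeric vector
x_complex <- complex(real = 1:4, imaginary = c(2, 0, -1, 3))

plot(x_ts)        # time-series plot (values against time)
plot(x_numeric)   # values against their index 1, 2, 3, 4
plot(x_complex)   # imaginary parts against real parts

invisible(dev.off())
```

In an interactive session the pdf() and dev.off() calls can be dropped, and the plots appear in the default graphics window.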
Graphical Capabilities in R
• Low-level plotting commands can be used to add extra information (such as points, lines or text) to the current plot.
• abline(a, b)
– Adds a line of slope b and intercept a to the current plot.
• title(main, sub)
– Adds a title main to the top of the current plot and, optionally, a sub-title sub at the bottom.
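A short sketch of how low-level commands build on a high-level plot. The data and labels are illustrative only, and the temporary PDF device is an assumption so that the code runs without a display.

```r
pdf(file = tempfile(fileext = ".pdf"))  # off-screen device, for illustration

x <- 1:10
y <- 2 * x + 1

plot(x, y)                          # high-level: starts a new plot
abline(a = 1, b = 2, lty = 2)       # low-level: line with intercept 1, slope 2
points(5, 11, pch = 19)             # low-level: add one filled point
text(5, 11, "(5, 11)", pos = 4)     # low-level: label it
title(main = "y = 2x + 1", sub = "added with title()")

invisible(dev.off())
```

Each low-level call draws on top of whatever the last high-level call produced, so the order of the commands matters.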
An R command line Example
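• library(limma) #load the limma package.
• setwd("C:/aaa-R/swirl/") #set the working directory to the data folder.
• getwd() #confirm the working directory.
• list.files() #list the files in it.
• targets <- readTargets("SwirlTargetsFile.txt") #read the targets file describing the arrays.
• targets #display it.
• RG <- read.maimages(targets$FileName, source="spot") #read the SPOT image-analysis output files.
• RG #display the red/green intensity object.
• par(fg="yellow",bg="green") #set foreground and background colours.
• plot(RG$R,lwd=3) #plot the red-channel intensities.
• abline(2000,1,lwd=5,col="black") #add a line with intercept 2000 and slope 1.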
R Graphics
R Graphics (cont.)
[Figure: histogram "PM Intensity distribution for PreS2"; x-axis: log2(PM Intensity); y-axis: Frequency]
Bioconductor Graphics
R Packages
• Packages provide a mechanism for loading code and attached documentation.
• Packaging automatically checks the package and creates various documentation files from one source.
• Creates distributable win.binary (.zip), mac.binary (.tgz) or source (.tar.gz) files.
• Packages can specify dependent or suggested packages.
R Packages (cont.)
• install.packages() can install a package and all its dependencies (and their dependencies…): the essential ones and, if requested, the suggested ones (which may be needed for examples etc.).
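The choice between essential and suggested dependencies is made through the dependencies argument of install.packages(). The wrapper below is a hypothetical helper, not part of R or of the talk, sketched under the assumption that a CRAN mirror has already been selected.

```r
# Hypothetical helper: install only the packages that are not yet available.
# dependencies = NA (the install.packages() default) pulls in the essential
# dependencies; dependencies = TRUE also pulls in the suggested packages.
install_if_missing <- function(pkgs, dependencies = NA) {
  is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  missing_pkgs <- pkgs[!is_installed]
  if (length(missing_pkgs) > 0) {
    install.packages(missing_pkgs, dependencies = dependencies)
  }
  invisible(missing_pkgs)  # names of the packages that had to be installed
}

# "stats" ships with R, so nothing is downloaded here
install_if_missing("stats")
```

Calling install_if_missing(pkgs, dependencies = TRUE) would additionally install the suggested packages of anything that is missing.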
Objects in R
• The entities R operates on are technically known as objects.
• The class of an object determines how it will be treated by what are known as generic functions.
• For example print, plot or summary will react according to what sort of object they are called to work on.
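How generic functions react to the class of an object can be seen in a small S3 sketch. The class name spot_summary, the constructor and the data are all invented for this illustration.

```r
# A toy S3 class; the class name "spot_summary" is made up for this sketch
make_spot_summary <- function(name, intensities) {
  structure(list(name = name, intensities = intensities),
            class = "spot_summary")
}

# Methods that the generics print() and summary() will dispatch to
print.spot_summary <- function(x, ...) {
  cat("Array", x$name, "with", length(x$intensities), "spots\n")
  invisible(x)
}

summary.spot_summary <- function(object, ...) {
  c(min = min(object$intensities), max = max(object$intensities))
}

s <- make_spot_summary("swirl.1", c(120, 450, 980))
print(s)           # uses print.spot_summary
summary(s)         # uses summary.spot_summary
summary(c(1, 2))   # same generic, default method for a numeric vector
```

The same call, summary(), gives different results for s and for a plain numeric vector because the method is chosen from the class of the argument.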
Bioconductor
• URL is http://www.bioconductor.org/
• Bioconductor is an open source and open development software project for the analysis and comprehension of genomic data.
• The Bioconductor core team is based primarily at the Fred Hutchinson Cancer Research Center.
• Aims to promote high-quality documentation and reproducible research.
• Aims to provide access to a wide range of powerful statistical and graphical methods for the analysis of genomic data.
Bioconductor
• R and the R package system are the main vehicles for designing and releasing software.
• Bioconductor has a commitment to full open source discipline. All contributions are expected to exist under an open source license such as GPL2 or BSD.
Bioconductor
• Features of the Bioconductor site.
– Packages – code
– Packages – metadata
– Version management system
Bioconductor Packages
• 140 code packages listed
• aCGH 1.4.0 Classes and functions for Array Comparative Genomic Hybridization data.
• affxparser 1.2.0 Affymetrix File Parsing SDK
• affy 1.8.1 Methods for Affymetrix Oligonucleotide Arrays
• affycomp 1.6.0 Graphics Toolbox for Assessment of Affymetrix Expression Measures
• affydata 1.6.0 Affymetrix Data for Demonstration Purpose
• affylmGUI 1.4.0 GUI for affy analysis using limma package
• affypdnn 1.4.0 Probe Dependent Nearest Neighbours (PDNN) for the affy package
• affyPLM 1.6.0 Methods for fitting probe-level models
• affyQCReport 1.8.0 QC Report Generation for affyBatch objects
• altcdfenvs 1.4.0 alternative cdfenvs
• ~~~~~~
• limma 2.2.0 Linear Models for Microarray Data
• limmaGUI 1.6.0 GUI for limma package
• ~~~~~~
• vsn 1.8.0 Variance stabilization and calibration for microarray data
• webbioc 1.2.0 Bioconductor Web Interface
• widgetInvoke 1.2.0 Evaluation widgets for functions
• widgetTools 1.6.0 Creates an interactive tcltk widget
• xcms 1.2.0 LC/MS and GC/MS Data Analysis
• PLUS
• 250 metadata packages
• From:
• ag 1.10.0 Affymetrix Arabidopsis Genome Array Annotation Data (ag)
• agahomology 1.10.0 A data package containing annotation data for agahomology
• To:
• zebrafishcdf 1.10.0 zebrafishcdf
• zebrafishprobe 1.10.0 Probe sequence data for microarrays of type zebrafish
Bioconductor – uses the Subversion version management system
• Subversion! http://svnbook.red-bean.com/en/1.1/svn-book.html
• Subversion is a free/open-source version control system (it replaces CVS).
• That is, Subversion manages files and directories over time.
• Subversion clients can access their repository across networks, which allows the version repository to be accessed by many users simultaneously.
Bioconductor – Version management system
• Subversion uses a Copy-Modify-Merge solution, rather than a Lock-Modify-Unlock procedure.
• The repository remembers every change ever written to it: a client can ask historical questions like “What did this directory contain last Wednesday?” or “Who was the last person to change this file, and what changes did they make?”
Graphical User Interfaces
• The items that make up a GUI, such as buttons, labels and menus, are known as widgets.
• Tcl/Tk is a tool for creating and interacting with widgets.
• Tcl/Tk runs on Unix, Windows and Mac OS X.
Tcl/Tk
• Tcl/Tk needs to be installed on the computer as well as R.
• There are prewritten libraries of Tcl/Tk tools, e.g. TkTable.
• The R package tcltk needs to be installed in R.
• The tcltk R package is an interface between the R language and Tcl/Tk commands.
GUI Programs
• On Windows Tcl/Tk talks to the MS Windows graphical window system.
• On Unix (& Mac), Tcl/Tk talks to the X Windows system, hence X11 must be started first.
• 1. Run X11 on Unix & Mac.
• 2. Load the R package tcltk using:
• library(tcltk)
• library(affylmGUI) for example (actually affylmGUI will automatically load tcltk).
R tcltk example
• This can be used to test if tcltk (or Tcl/Tk) is working correctly:
• >library(tcltk)
• >tt <- tktoplevel()
• >lbl <- tklabel(tt, text="Hello, World!")
• >tkpack(lbl)
• >but <- tkbutton(tt, text="OK")
• >tkpack(but)
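In the test above the OK button does nothing when pressed. A possible extension, sketched here with a made-up wrapper function hello_window and window title, attaches an R function to the button through the command argument so that pressing it closes the window. The guard is an assumption so that the code is skipped in sessions without Tk or without a user present.

```r
# Sketch: the same window, but with a button that does something.
# Only runs in an interactive session where Tk is available.
hello_window <- function() {
  tt  <- tcltk::tktoplevel()
  tcltk::tkwm.title(tt, "tcltk test")
  lbl <- tcltk::tklabel(tt, text = "Hello, World!")
  # command= takes an R function that is called when the button is pressed
  but <- tcltk::tkbutton(tt, text = "OK",
                         command = function() tcltk::tkdestroy(tt))
  tcltk::tkpack(lbl)
  tcltk::tkpack(but)
  invisible(tt)
}

if (interactive() && capabilities("tcltk")) {
  hello_window()
}
```

Passing R functions as widget callbacks in this way is the basic mechanism on which a Tcl/Tk-based GUI in R is built.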
R tcltk testing tools
• To check the path that Tcl/Tk uses to find libraries:
– >tclvalue("auto_path")
– [1] "{C:\\R\\rw2020\\R-2.2.0/Tcl/lib/tcl8.4}
– C:/R/rw2020/R-2.2.0/Tcl/lib ./lib
– C:/R/rw2020/R-2.2.0/Tcl/lib/tk8.4
– C:/R/rw2020/R-2.2.0/library/tcltk/exec"
• To add an extra path to search, use:
– >addTclPath("C:/bin")
– >tclvalue("auto_path")
– [1] "{C:\\R\\rw2020\\R-2.2.0/Tcl/lib/tcl8.4}
– C:/R/rw2020/R-2.2.0/Tcl/lib ./lib
– C:/R/rw2020/R-2.2.0/Tcl/lib/tk8.4
– C:/R/rw2020/R-2.2.0/library/tcltk/exec C:/bin"
• For a list of package commands:
– >ls("package:tcltk")
Help Commands in R
• help(mean) #Help window on mean function.
• ?mean #Same as help(mean).
• help.search("regression") #Help files with alias or concept or title matching 'regression' using fuzzy matching.
• help.start() #Browser into R docs.
• The browser shows links into the R Language Definition, Installation & Administration of R, package writing, package documentation, FAQs etc.
Some Useful R Commands for the GUI user!
• getwd() #Get working directory.
• setwd() #Set working directory.
• list.files() #List files in working directory.
• ls() #List objects in workspace.
• rm(list=ls()) #Remove all objects (recommended at start of a session).
• savehistory(file="History.txt") #Save the command history to History.txt.
• source(file="C:/path/to/filename/file.R", echo=TRUE) #Reads commands from file.R and executes them.
• installed.packages() #Detailed info on all packages installed.
• summary(RG) #Displays basic data about object RG.
• library(limmaGUI) #Loads limmaGUI package.
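Several of these commands can be tried together without touching real data. The sketch below works in a temporary directory, and the script contents are invented for the illustration.

```r
# Work in a scratch directory so nothing in the real workspace is touched
scratch <- tempfile("session")
dir.create(scratch)
old_wd <- setwd(scratch)            # setwd() returns the previous directory

# Write a tiny script (contents are illustrative) and run it with source()
writeLines(c("x <- 1:10", "x_mean <- mean(x)"), "file.R")
list.files()                        # the script is now in the working directory
source("file.R", echo = TRUE)       # echo = TRUE prints each command as it runs

ls()                                # x and x_mean are now in the workspace
setwd(old_wd)                       # restore the original working directory
```

Keeping analysis commands in a script and running them with source() makes a session easy to repeat later.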
Cross Platform Issues
• Installation issues vary between platforms.
• MS Windows – can be installed in C:\R by an ordinary user.
• Unix – can be installed by an ordinary user, but installations are duplicated if multiple users do so.
• Mac OS X – special procedures are necessary.
LimmaGUI
• limmaGUI is a Graphical User Interface (GUI) based on R-Tcl/Tk for the exploration and linear modelling of data from two-colour spotted microarray experiments, especially the assessment of differential expression in complex experiments.
• Swirl Example Analysis.
AffylmGUI
• AffylmGUI enables the user to perform quality assessment, low-level analysis and linear modeling of data from Affymetrix GeneChips®, with the ultimate goal of identifying differentially expressed genes.
• Estrogen Example Analysis
WEHI website Resources
• WEHI Bioinformatics home page: http://bioinf.wehi.edu.au/
• Microarray Data Analysis: http://bioinf.wehi.edu.au/marray/index.html
• LIMMA: Linear Models for Microarray Data: http://bioinf.wehi.edu.au/limma/index.html
• limmaGUI: http://bioinf.wehi.edu.au/limmaGUI/
• affylmGUI: http://bioinf.wehi.edu.au/affylmGUI/
• James Wettenhall's Bioinformatics Home Page: http://bioinf.wehi.edu.au/folders/james/
• R-Tcl/Tk Examples, Worked Examples for limma/affylmGUI at http://bioinf.wehi.edu.au/limmaGUI/R/library/limmaGUI/doc/DocIndex.html
Future Directions for AffylmGUI
• additional plots to aid in quality assessment of a set of chips, including RNA degradation plots;
• calculation and display of QC parameters recommended by Affymetrix (Affymetrix, 2004), such as percent present, ratios of 3’/5’ expression for hybridization controls and the like;
• fitting of mixed linear models where there is technical replication;
• support for other single-channel platforms.
Future Directions for LimmaGUI
• additional plots to aid in quality assessment of a set of chips;
• fitting of mixed linear models where there is technical replication;
• fitting of mixed linear models where there is biological replication;
• ?
Acknowledgments
• James Wettenhall
• Gordon Smyth
• Ken Simpson
• Terry Speed
• Bioinformatics – many seminars on microarrays!