1
BioMed Central Page 1 of 1 (page number not for citation purposes) BMC Systems Biology Open Access Poster presentation Variance stabilising transformations for NMR metabolomics data Helen Parsons* and Mark Viant Address: School of Biosciences, The University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK Email: Helen Parsons* - [email protected] * Corresponding author Classifying the fingerprint of an NMR spectrum is a crucial step in many metabolomics experiments. Since many clas- sification techniques such as principal component analy- sis (PCA) depend upon variance discrepancies, it is important to first maximise any contribution from wanted class variance between biological samples and minimise any contribution from unwanted technical var- iance arising from the preparation of the samples and measurement of the NMR metabolic fingerprints. The generalised logarithm (glog) transform was developed to stabilise the variance between technical replicates in a two component error model [1] and has also been applied to NMR spectra previously [2]. To increase the effectiveness of the transform on NMR spectra, the glog was extended to include a baseline offset term. This decreases the unwanted noise contribution on the transformed spectra. The extended glog transformation is given as: for z the transformed intensity and y the original intensity of the spectra. y 0 and λ are transformation parameters which are to be found. Here we have applied the extended glog transform to tech- nical replicates of NMR spectra of tissue extracts from marine mussels, to determine the optimised transforma- tion parameters λ and y 0 . Next we applied the optimised transformation to a data set comprised of two classes of NMR spectra from stressed and unstressed mussels. Fol- lowing transformation, the results show significantly bet- ter separation of the classes on a PCA scores plot than can be achieved with both untransformed data and also data transformed using Pareto scaling, a widely used method in NMR metabolomics [3]. In conclusion, we have dem- onstrated the value of the extended glog transformation to stabilise the technical variance in an NMR metabolomics dataset and have achieved significantly improved classifi- cation of NMR fingerprints from stressed and unstressed animals. References 1. Rocke D, Lorenzato S: A two-component model of measure- ment error in analytical chemistry. Technometrics 1995, 37(2):176-184. 2. Purohit P, Rocke D, Viant M, Woodruff D: Discrimination models using variance-stabilizing transformation of metabolomic NMR data. OMICS 2004, 8(2):118-130. 3. Keun H, Ebbels T, Antti H, Bollard M, Beckonert O, Holmes E, Lindon J, Nicholson J: Improved analysis of multivariate data by varia- ble stability scaling: application to NMR-based metabolic profiling. Analytica Chimica Acta 2003, 490(1):265-276. from BioSysBio 2007: Systems Biology, Bioinformatics and Synthetic Biology Manchester, UK. 11–13 January 2007 Published: 8 May 2007 BMC Systems Biology 2007, 1(Suppl 1):P22 doi:10.1186/1752-0509-1-S1-P22 <supplement> <title> <p>BioSysBio 2007: Systems Biology, Bioinformatics, Synthetic Biology</p> </title> <editor>John Cumbers, Xu Gu, Jong Sze Wong</editor> <note>Meeting abstracts – A single PDF containing all abstracts in this Supplement is available <a href="www.biomedcentral.com/content/files/pdf/1752-0509-1-S1-full.pdf">here</a>.</note> <url>http://www.biomedcentral.com/content/pdf/1752-0509-1-S1-info.pdf</url> </supplement> This abstract is available from: http://www.biomedcentral.com/1752-0509/1?issue=S1 © 2007 Parsons and Viant; licensee BioMed Central Ltd. z y y y y = + + ( ) ln ( ) ( ) 0 0 2 λ

Variance stabilising transformations for NMR metabolomics data

Embed Size (px)

Citation preview

Page 1: Variance stabilising transformations for NMR metabolomics data

BioMed Central

Page 1 of 1(page number not for citation purposes)

BMC Systems Biology

Open AccessPoster presentationVariance stabilising transformations for NMR metabolomics dataHelen Parsons* and Mark Viant

Address: School of Biosciences, The University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK

Email: Helen Parsons* - [email protected]

* Corresponding author

Classifying the fingerprint of an NMR spectrum is a crucialstep in many metabolomics experiments. Since many clas-sification techniques such as principal component analy-sis (PCA) depend upon variance discrepancies, it isimportant to first maximise any contribution fromwanted class variance between biological samples andminimise any contribution from unwanted technical var-iance arising from the preparation of the samples andmeasurement of the NMR metabolic fingerprints. Thegeneralised logarithm (glog) transform was developed tostabilise the variance between technical replicates in a twocomponent error model [1] and has also been applied toNMR spectra previously [2]. To increase the effectivenessof the transform on NMR spectra, the glog was extendedto include a baseline offset term. This decreases theunwanted noise contribution on the transformed spectra.The extended glog transformation is given as:

for z the transformed intensity and y the original intensityof the spectra. y0 and λ are transformation parameterswhich are to be found.

Here we have applied the extended glog transform to tech-nical replicates of NMR spectra of tissue extracts frommarine mussels, to determine the optimised transforma-tion parameters λ and y0. Next we applied the optimisedtransformation to a data set comprised of two classes ofNMR spectra from stressed and unstressed mussels. Fol-lowing transformation, the results show significantly bet-

ter separation of the classes on a PCA scores plot than canbe achieved with both untransformed data and also datatransformed using Pareto scaling, a widely used methodin NMR metabolomics [3]. In conclusion, we have dem-onstrated the value of the extended glog transformation tostabilise the technical variance in an NMR metabolomicsdataset and have achieved significantly improved classifi-cation of NMR fingerprints from stressed and unstressedanimals.

References1. Rocke D, Lorenzato S: A two-component model of measure-

ment error in analytical chemistry. Technometrics 1995,37(2):176-184.

2. Purohit P, Rocke D, Viant M, Woodruff D: Discrimination modelsusing variance-stabilizing transformation of metabolomicNMR data. OMICS 2004, 8(2):118-130.

3. Keun H, Ebbels T, Antti H, Bollard M, Beckonert O, Holmes E, LindonJ, Nicholson J: Improved analysis of multivariate data by varia-ble stability scaling: application to NMR-based metabolicprofiling. Analytica Chimica Acta 2003, 490(1):265-276.

from BioSysBio 2007: Systems Biology, Bioinformatics and Synthetic BiologyManchester, UK. 11–13 January 2007

Published: 8 May 2007

BMC Systems Biology 2007, 1(Suppl 1):P22 doi:10.1186/1752-0509-1-S1-P22

<supplement> <title> <p>BioSysBio 2007: Systems Biology, Bioinformatics, Synthetic Biology</p> </title> <editor>John Cumbers, Xu Gu, Jong Sze Wong</editor> <note>Meeting abstracts – A single PDF containing all abstracts in this Supplement is available <a href="www.biomedcentral.com/content/files/pdf/1752-0509-1-S1-full.pdf">here</a>.</note> <url>http://www.biomedcentral.com/content/pdf/1752-0509-1-S1-info.pdf</url> </supplement>

This abstract is available from: http://www.biomedcentral.com/1752-0509/1?issue=S1

© 2007 Parsons and Viant; licensee BioMed Central Ltd.

z y y y y= − + − +( )ln ( ) ( )0 02 λ