
REVIEW Open Access

Significance of bioinformatics in research of chronic obstructive pulmonary disease
Hong Chen1 and Xiangdong Wang1,2*

Abstract

Chronic obstructive pulmonary disease (COPD) is an inflammatory disease characterized by the progressive deterioration of pulmonary function and increasing airway obstruction, with high mortality all over the world. The advent of high-throughput omics techniques has provided an opportunity to gain insights into the disease pathogenesis and processes that contribute to its heterogeneity, and to find target-specific and disease-specific therapies. As an interdisciplinary field, bioinformatics supplies vital information for an integrative understanding of COPD. This review focuses on the application of bioinformatics in COPD studies, including biomarker searching and systems biology. We also present the requirements and challenges in applying bioinformatics to COPD research and interpret these results from the perspective of clinical physicians.

Keywords: bioinformatics, genomics, proteomics, chronic obstructive pulmonary disease, biomarkers

Introduction
Chronic obstructive pulmonary disease (COPD) is an inflammatory disease characterized by the progressive deterioration of pulmonary function and increasing airway obstruction [1,2]. It can be caused by inflammatory responses triggered by noxious particles or gas, most commonly from tobacco smoking, and is accompanied by chronic bronchitis and emphysema [3,4]. Some patients go on to require long-term oxygen therapy or even lung transplantation [3]. COPD was ranked as the fourth leading cause of death worldwide and is estimated to become the third leading cause of mortality by 2020 [5]. According to data from China, COPD ranks as the fourth leading cause of death in urban areas and the third in rural areas [6]. The high mortality and morbidity of COPD, and its chronic progressive nature, have prompted the need to investigate the underlying mechanisms and to identify biomarkers for diagnosis, prognosis and drug targets.

The understanding of COPD has been advanced by molecular biology approaches, genetically modified animals, virally administered genes, and high-throughput transcriptional profiling approaches. High-throughput methodologies, such as genomics and proteomics, are commonly used. The variety of biological data, mainly in the form of DNA, RNA and protein sequences, is placing heavy demands on computer science and computational biology. Bioinformatics, which includes many sub-disciplines such as genomics, proteomics and systems biology, is an integration of mathematical, statistical, and computational methods to analyze biological, biochemical, and biophysical data. Compared to wet-lab methods, bioinformatics focuses on data mining via computational means. Sophisticated bioinformatics techniques have been developed to analyze the vast amount of data generated from genomics and proteomics studies, such as gene and protein function, interactions, and metabolic and regulatory pathways. However, combining the computational results with clinical data remains a great challenge for both bench scientists and bedside physicians.

In COPD studies, there are usually three ways to analyze 'omics' data: 1) search for correlations between a single gene or protein and some clinical features in order to find diagnostic or prognostic biomarkers; 2) integrate clinical and wet-lab information, or omics data from different levels, for database establishment and computational models. In this review, we discuss the application of bioinformatics in COPD studies. We also present the requirements and challenges in applying bioinformatics to COPD research, and give some suggestions on how to interpret these results as clinical physicians.

* Correspondence: [email protected]
1 Department of Pulmonary Medicine, Zhongshan Hospital, Fudan University, Shanghai, China
Full list of author information is available at the end of the article

Chen and Wang Journal of Clinical Bioinformatics 2011, 1:35
http://www.jclinbioinformatics.com/content/1/1/35

JOURNAL OF CLINICAL BIOINFORMATICS

© 2011 Chen and Wang; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Application of bioinformatics in biomarker searching
The diagnosis of COPD is based on the presence of typical symptoms of cough and shortness of breath, together with the presence of risk factors, and is confirmed by spirometry. Therefore, the search for better biomarkers with high specificity and sensitivity that indicate the staging and severity of COPD remains a major concern for clinical physicians. The main value of biomarkers in COPD would be in early diagnosis and in providing early proof of drug efficacy during treatment [7]. A biomarker for COPD is expected to be detectable in human lung fluids or tissues, sensitive to the progress of COPD, disease-specific to COPD and associated with the status of patients [8]. In these studies, selected genes or proteins are usually combined with several clinical features, such as disease susceptibility and lung function, via statistical methods, e.g. logistic regression.
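To make the statistical step concrete, the following is a minimal sketch of a logistic regression relating a candidate marker to disease status. It is not the method of any cited study: the data are synthetic, the two features (a standardized marker level and a smoking flag) are assumptions chosen for illustration, and the function names `fit_logistic` and `predict_proba` are hypothetical. The model is fitted by plain gradient ascent on the log-likelihood, using only the Python standard library.

```python
import math

def sigmoid(z):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=5000):
    """Fit weights (intercept first) by batch gradient ascent on the log-likelihood."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = yi - sigmoid(z)          # residual on the probability scale
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w

def predict_proba(w, x):
    """Predicted probability of disease for one feature vector."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

# Synthetic, illustrative data only: [standardized marker level, smoker (0/1)].
X = [[-1.2, 0], [-0.8, 0], [-0.5, 1], [-0.1, 0], [0.0, 1],
     [0.3, 0], [0.6, 1], [0.9, 1], [1.1, 0], [1.5, 1]]
y = [0, 0, 0, 0, 1, 1, 0, 1, 1, 1]   # 1 = COPD, 0 = control (invented labels)

w = fit_logistic(X, y)
marker_odds_ratio = math.exp(w[1])   # odds ratio per unit increase in the marker
p_high = predict_proba(w, [1.5, 0])
p_low = predict_proba(w, [-1.2, 0])
print("weights:", [round(v, 3) for v in w])
print("marker odds ratio:", round(marker_odds_ratio, 3))
```

An odds ratio above 1 for the marker coefficient would indicate that higher marker levels go with higher odds of disease after adjusting for the smoking flag. In practice, an established statistics package with standard errors and confidence intervals would be preferable to this hand-written fit.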

Genomics
It is believed that many genetic factors increase a person's risk of developing COPD [9]. The high mortality and morbidity associated with COPD, and its chronic and progressive nature, have prompted the use of molecular genetic studies in an attempt to identify susceptibility factors for the disease. The advent of high-throughput methodologies to study genetic background variability and epigenetic regulation has allowed us to explain the individual variability in susceptibility to human diseases.

Single nucleotide polymorphism (SNP) analysis is a common method in COPD studies. These analyses usually study a group of candidate genes, and then perform statistical tests in different populations. Gene-related susceptibility can be approached by testing or unbiased study designs [10]. By SNP microarray, many genetic factors were found to be related to the individual risk of developing COPD [9]. Apart from the recognized deficiency of alpha-1 antitrypsin [9], genomics studies in COPD found that other gene alleles, such as IREB2 [11], CYP2E1 and NAT2 [12], CYP1A1, CYP1A2 and CYBA [13], and TNF-α [14], were associated with COPD susceptibility. The PIM3 allele of the alpha-1 antitrypsin gene was associated with the pathogenesis of COPD in the Indian population [15]. Polymorphisms in SP-A1 and SP-A2 [16], and COX2 and p53 risk alleles [17], might be genetic factors contributing to the susceptibility to COPD. On the other hand, COPD could influence single gene expression as well, such as that of the cathepsin inhibitor cystatin A [18].

COPD is characterized by a decline in lung function during disease progression. Therefore, beyond susceptibility, other case-control genomics studies have focused on the association of several gene alleles with lung function. The 105V/114V alleles of GSTP1 and the 113H/139H alleles of mEPHX, and the combination of genotypes with the same alleles, were associated with imbalanced oxidative stress and lung function in patients [19]. Polymorphisms in ADAM33 were associated with COPD and lung function decline in long-term smokers [20] and in the general population [21]. The variants of the eNOS -786C, -922G, and 4A alleles in endothelial cells, and their combinations, contribute to disturbed pulmonary function and oxidative stress in COPD [22].

An interleukin-13 polymorphism in the promoter region may modulate the adverse effects of cigarette smoking on pulmonary function in long-term cigarette smokers [23]. These genomics findings suggested that environmental influences were important in COPD. Genetic polymorphisms contributed to the development of COPD, especially to the decline in lung function. The diversity in human genes could help us to understand the susceptibility among different ethnicities and different populations. The dissection of the genetic basis of complex diseases and the development of highly individualized therapies remain lofty but achievable goals [24].
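The statistical test applied to a candidate SNP in a case-control design can be sketched as a chi-square test on a 2x2 table of allele counts. The example below is an illustration under stated assumptions, not the analysis of any cited study: the allele counts are invented, the function name `allele_chi_square` is hypothetical, and no continuity correction or multiple-testing adjustment is applied. The p-value uses the identity that, for one degree of freedom, the chi-square survival function equals erfc(sqrt(x/2)).

```python
import math

def allele_chi_square(case_minor, case_major, ctrl_minor, ctrl_major):
    """Pearson chi-square test (1 df, no continuity correction) on allele counts.

    Returns the chi-square statistic, its p-value, and the allelic odds ratio.
    """
    a, b, c, d = case_minor, case_major, ctrl_minor, ctrl_major
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = (a * d) / (b * c)
    return chi2, p_value, odds_ratio

# Invented counts: 200 cases and 200 controls, two alleles per person.
chi2, p_value, odds_ratio = allele_chi_square(
    case_minor=120, case_major=280, ctrl_minor=80, ctrl_major=320
)
print(f"chi2={chi2:.3f}, p={p_value:.5f}, OR={odds_ratio:.3f}")
```

When many SNPs are screened at once, as on a SNP microarray, the per-SNP p-values would need correction for multiple testing before any association is reported.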

Proteomics
Proteomics is the systematic study of the many and diverse properties of protein profiles in a parallel manner, with the aim of providing detailed descriptions of the structure, function and control of biological systems in health and disease [25]. A major research objective is to search for biomarkers in complex biological fluids. Proteomic analysis opens avenues to investigate protein profiles of cells, biopsies and fluids, explore protein-based mechanisms of human diseases, define subgroups of disease, identify novel biomarkers for diagnosis, therapy and prognosis of multiple diseases, and discover new targets for drug development. In particular, the application of complementary approaches, including gel- and liquid chromatography mass spectrometry-based proteomic techniques, to sputum and/or bronchoalveolar lavage may provide a better understanding of the proteome differentially expressed among the courses, severities and populations of COPD [7]. We have previously reviewed the clinical studies on COPD proteomics, highlighted the proteomic-oriented methods applied and evaluated the diagnostic or prognostic value of potential biomarkers [8]. Those studies mainly focused on disease classification, biomarker detection, or identification of mechanisms, and these three components are related to each other in COPD (Figure 1).

An important goal of proteomic studies is to understand the biological roles of specific proteins and to develop new therapeutic targets [26]. Although few proteomics studies have been performed in COPD patients, several proteins have been regarded as potential biomarkers. For example, matrix metalloproteinase (MMP)-13 and thioredoxin-like 2 were increased in the lungs of patients with COPD [27]. The serine and MMP proteinase network was considered an important feature in predicting clinical worsening of airway obstruction [28]. Pulmonary surfactant A was found to be linked to the pathogenesis of COPD and could be considered a potential COPD biomarker [29]. Proteomic screening of sputum yields potential biomarkers of inflammation [30]. Airway and parenchymal phenotypes of COPD were suggested to be associated with unique systemic serum biomarker profiles [31]. Proteomic profiling could improve the understanding of the molecular mechanisms involved in cigarette smoking-related COPD by identifying plasma proteins that correlate with declined lung function [32]. The concentrations of neutrophil defensins 1 and 2, calgranulin A, and calgranulin B were elevated in smokers with COPD when compared to asymptomatic smokers [33]. Other candidates, like serum amyloid A [34], plasma retinol-binding protein, apolipoprotein E, inter-alpha-trypsin inhibitor heavy chain H4, and glutathione peroxidase [35], have also been detected in the plasma of COPD patients by proteomics approaches.

Metabolomics
Metabolomics is a global approach to understanding the regulation of metabolic pathways and metabolic networks of a biological system [36]. The urinary metabolites trigonelline, hippurate and formate were identified as being associated with the baseline lung function of COPD patients and were considered to reflect lifestyle differences affecting overall health [37]. Another way to analyze omics data is to cluster the data to form a specific pattern for different groups by principal component analysis (PCA). The combination of PCA and metabolomics identified the metabolic fingerprint of exhaled breath condensate of COPD patients [38].
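How PCA separates groups can be illustrated with a small sketch that extracts the first principal component of a metabolite table and projects each sample onto it. Everything numeric here is an assumption: the intensities are invented, the three columns merely borrow the metabolite names mentioned above, and the helper names (`center`, `covariance`, `first_component`) are hypothetical. The leading eigenvector is found by power iteration so that only the Python standard library is needed; a real analysis would typically scale the variables first and use a numerical library.

```python
import math

def center(rows):
    """Subtract the column mean from every value."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered data."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_component(cov, iters=500):
    """Leading eigenvector and eigenvalue of a covariance matrix by power iteration."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p))
    return v, eigenvalue

# Invented intensities; columns: trigonelline, hippurate, formate.
samples = [
    [5.1, 8.0, 2.1], [4.8, 7.6, 2.0], [5.3, 8.3, 2.2],   # hypothetical controls
    [3.0, 5.1, 1.9], [2.7, 4.8, 2.1], [3.2, 5.4, 2.0],   # hypothetical COPD patients
]

centered = center(samples)
cov = covariance(centered)
pc1, eigenvalue = first_component(cov)
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))
scores = [sum(x * w for x, w in zip(row, pc1)) for row in centered]
print("PC1 loadings:", [round(x, 3) for x in pc1])
print("explained variance ratio:", round(explained, 3))
print("PC1 scores:", [round(s, 2) for s in scores])
```

In this toy table the two groups fall on opposite sides of zero along the first component, which is the kind of pattern a metabolic fingerprint analysis looks for.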

Multiplexed ELISA
A recent advancement in ELISA is the multiplexed ELISA, which can determine multiple proteins within a single tissue sample. It has recently been shown to be more sensitive than standard ELISA once optimized for a particular cytokine [39-41] and could be a promising diagnostic assay in lung diseases [42]. Moreover, the measurement of multiple cytokines is required for many diseases, particularly those like COPD that arise from a complex process of initiation and progression of an inflammation network. Simultaneous detection of multiple cytokines will provide a more powerful tool to quantitatively measure cytokines in different stages of COPD. With the help of this high-throughput measurement platform, we could integrate both biological and clinical data to inform predictive multiscale models ranging from the molecular to the organ levels, as shown in Figure 2. For example, the integration of the IL-9 pathway and the CCR3 pathway between biological function and pathology demonstrated cell proliferation-related remodeling, intracellular signal-associated inflammatory responses and over-activation of kinases-correlated emphysema in the pathogenesis of COPD. We propose that this new approach will increase our insight into the disease process and has great potential to identify new biomarkers for disease diagnosis as well as novel therapeutic targets.

Systems biology and database establishment
The difficulties encountered whilst exploring pathogenesis and searching for biomarkers may be due in part to the complex nature of COPD, which comprises a broad spectrum of histopathological findings and respiratory symptoms [43]. All genetic information and molecular knowledge need to be semantically incorporated and associated with clinical and experimental data. Systems biology in COPD (as reviewed before [44]) presented a manifold understanding of the complexity of COPD, and therefore advanced biomedical research and drug development. This approach relies on global genome, transcriptome, proteome, and metabolome data sets collected in cross-sectional patient cohorts with high-throughput measurement platforms and integrated with biological and clinical data to inform predictive multidimensional models ranging from the molecular to the organ levels [45].

Figure 1 Overview of the utility of bioinformatics in chronic obstructive pulmonary disease. Both genomics and proteomics provide information on candidates. By searching in various datasets and combining with clinical profiles, 'omics' studies may help to explain questions on disease classification, biomarker detection, and identification of mechanism.

Comandini et al. [46] analyzed a number of published studies by creating a smoker dataset on which to perform data-mining analysis. They utilized Ingenuity Pathways Analysis, a web-based application that enables the identification of relationships, biological mechanisms, functions, and pathways of relevance associated with the molecules under study. Their findings supported the central role of anti-oxidant genes in the smoking population and suggested that Nrf2 may be a COPD risk biomarker. A new knowledge base for COPD was generated from clinical and experimental data based on the BioXM software platform [47]. This integrated database reduced the implementation time and effort for the knowledge base compared to similar systems and provided a free, comprehensive, easy-to-use resource for all COPD-related clinical research.

Clinical profiles could also be considered a form of omics data, since they provide a large quantity of patient information in a direct, conservative way. An in-silico study applied various explorative analysis techniques (PCA, MCA, MDS) and unsupervised clustering methods (KHM) to a large dataset, acquired from 415 COPD patients, to assess the presence of hidden structures in the data corresponding to the different COPD phenotypes observed in clinical practice. This study may be considered a methodological example showing possible applications of intelligent data analysis and visual exploratory techniques to investigate clinical aspects of chronic pathologies for which a mathematical reference model is generally missing [48].

How to understand bioinformatics as clinical physicians
COPD has been approached with genomic and proteomic technologies that allow us to identify patterns of gene/protein expression that track with clinical disease, or to identify new pathways involved in disease pathogenesis. The results from these initial studies highlight the potential of these omics approaches to reveal novel insights into the pathogenesis of COPD and to provide new tools to improve diagnosis, clinical classification, course prediction, and response to therapy. Existing knowledge such as genotype-phenotype relations or signal transduction pathways must be semantically integrated and dynamically organized into structured networks that are connected with clinical and experimental data. This will require collaboration among multidisciplinary groups with expertise in the respective technologies, bioinformatics, and clinical medicine for the disease. More and more clinical physicians have begun to realize the promise of these studies and their potential to revolutionize the diagnosis and treatment of COPD, while obstacles still exist between these laboratory findings and their application in clinical practice. Omics results need to be interpreted through translational medicine and systems biology. Since the vocabulary of systems biology is different from that of molecular biology or clinical research, the biggest challenge is the shift in thinking [49].

Figure 2 The integration of IL-9 pathway and CCR3 pathway between biological function and pathology demonstrated cell proliferation-related remodeling, intracellular signal-associated inflammatory responses and over-activation of kinases-correlated emphysema in the pathogenesis of COPD.

Instead of a single-factor approach, which is highly effective in the lab, we need to think globally as clinical physicians. We need to shift from an approach that tries to explain lung pathogenesis by "one molecule, one cell type" to an approach that looks at the network of interactions between multiple molecules, pathways, and cells. Given that all the samples are collected from humans, it would be of great significance to standardize patient groups. Criteria from clinical informatics and medical informatics, including age, gender, smoking history, staging, complications and clinical signs as well as examinations, should be fully considered before and after any omics investigation. We also need to pay attention to the relation between clinical data and laboratory findings. For this to occur, a well-done history and physical examination would help to supplement these laboratory figures by providing multiple features of human COPD.

Although all of these exciting technological advances, which exponentially increase the level of knowledge about every disease and model, serve as facilitators of integration, they do not inherently provide integrative models of disease. Therefore, we propose that digitizing essential clinical profiles, such as symptoms and signs, by questionnaires and/or scores, would provide a direct view for physicians and shrink the distance between lab discovery and clinical condition. The combination of epidemiologists, clinicians, geneticists and specialists in bioinformatics, in addition to specialists in disciplines less familiar to epidemiologists, is critical in order to be prepared for new phenotypic characterizations based on the transcriptome and proteome [50]. Even though bioinformatics has advanced considerably, it still calls for more collaboration to fulfill its potential (Figure 3). Interdisciplinary teams should allow us to access omics datasets integratively and generate a global model of COPD. It is important for proteomic scientists to pay special attention to exploring the combination of advanced proteomic biotechnology, clinical proteomics, tissue imaging and profiling, and organ dysfunction score systems, in order to improve the clinical outcomes of these patients [51]. There is still a great need to explore the COPD-specific and/or related transcriptional factors and regulation networks generated from omics and bioinformatics, as in other diseases [41].

Conclusions
The use of high-throughput techniques for gene and protein expression profiling and of computerized databases has become a mainstay of biomedical research. There is a need to perform omics studies on patients with COPD, describing the association with the disease in terms of specificity, severity, progress and prognosis and monitoring the efficacy of therapies. These omics analyses highlight ways to investigate protein profiles of cells, biopsies and fluids, explore protein-based mechanisms of human diseases, identify novel biomarkers for diagnosis, therapy and prognosis of multiple diseases and discover new targets for drug development, as shown in Figure 3. Although the number of clinical studies on COPD is limited, they still serve as an excellent starting point for proteomic research in such a complex disease. The analysis of protein profiles that are up- or down-regulated, modified, or secreted in the airways during the disease may yield vital evidence to understand the pathogenesis and discover new therapeutic targets for the disease (Figure 4) [52]. With many guidelines now in place and model studies on which to design future experiments, there is reason to be optimistic that candidate protein biomarkers will be discovered using proteomics and translated into clinical assays [53]. With better standardization of study design and the implementation of novel technologies to reach the optimal research standard, there is enough reason to be optimistic about the future of omics research and its clinical implications [54]. Clinical bioinformatics on COPD could be achieved from the combination of clinical informatics, medical informatics, bioinformatics and informatics through collaborations among clinicians, bioinformaticians, computer scientists, biologists, and mathematicians.

Figure 3 Bioinformatics begins to take part in COPD investigation. The collaboration of epidemiologists, clinicians, geneticists and specialists in bioinformatics, in addition to specialists in disciplines less familiar to epidemiologists, is critical for further study. More cooperation is still needed.

Author details
1Department of Pulmonary Medicine, Zhongshan Hospital, Fudan University, Shanghai, China. 2Biomedical Research Center, Zhongshan Hospital, Fudan University, Shanghai, China.

Authors' contributions
HC mainly wrote the manuscript, as well as the revision. XDW conceived of the review and edited and wrote part of the manuscript. All authors read and approved the final manuscript.

Competing interests
The authors declare that they have no competing interests.

Received: 3 October 2010 Accepted: 20 December 2011
Published: 20 December 2011

References
1. Celli BR, MacNee W: Standards for the diagnosis and treatment of patients with COPD: a summary of the ATS/ERS position paper. Eur Respir J 2004, 23(6):932-946.

2. Pauwels RA, Rabe KF: Burden and clinical features of chronic obstructive pulmonary disease (COPD). Lancet 2004, 364(9434):613-620.

3. Hogg JC, Chu F, Utokaparch S, Woods R, Elliott WM, Buzatu L, Cherniack RM, Rogers RM, Sciurba FC, Coxson HO, et al: The nature of small-airway obstruction in chronic obstructive pulmonary disease. N Engl J Med 2004, 350(26):2645-2653.

4. Rabe KF, Hurd S, Anzueto A, Barnes PJ, Buist SA, Calverley P, Fukuchi Y, Jenkins C, Rodriguez-Roisin R, van Weel C, et al: Global strategy for the diagnosis, management, and prevention of chronic obstructive pulmonary disease: GOLD executive summary. Am J Respir Crit Care Med 2007, 176(6):532-555.

5. Lopez AD, Murray CC: The global burden of disease, 1990-2020. Nat Med 1998, 4(11):1241-1243.

6. Fang X, Wang X, Bai C: COPD in China: the burden and importance of proper management. Chest 2011, 139(4):920-929.

7. Casado B, Iadarola P, Luisetti M, Kussmann M: Proteomics-based diagnosis of chronic obstructive pulmonary disease: the hunt for new markers. Expert Rev Proteomics 2008, 5(5):693-704.

8. Chen H, Wang D, Bai C, Wang X: Proteomics-based biomarkers in chronic obstructive pulmonary disease. J Proteome Res 2010, 9(6):2798-2808.

9. Pauwels RA, Buist AS, Calverley PM, Jenkins CR, Hurd SS: Global strategy for the diagnosis, management, and prevention of chronic obstructive

Resources

Lungs Laser Capture Microdissection:Isolation of Specific CellsTissue frozen section Selected cells

cDNA Microarray:Key Genes

Identification

Lists of differentiallyexpressed genes

[Figure 4 graphic: an excerpt of a probe set/gene table (mouse probe set IDs with gene annotations, for example cytochrome b-245 alpha polypeptide, cathepsin C, granzyme A, and S100 calcium binding proteins A8 and A9), with the workflow labels: Statistical analysis; Criteria for differentially expressed genes; Targets; Pathophysiologies; Hierarchical clustering; Real-time quantitative PCR to assess gene expression.]

Figure 4 Clinical bioinformatics can be generated from the analysis of COPD-specific pathological alterations. Patients with COPD can be selected by clinical informatics and criteria, and the specific area is selected by micro-dissection. The tissue can be used for genomic (or proteomic) analysis, and identified biomarkers are validated for the understanding of the pathogenesis.
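The "criteria for differentially expressed genes" step labelled in Figure 4 is not specified in the figure, so the following is only a minimal sketch of one common way such a filter might be written. The cut-offs (an absolute log2 fold change of at least 1 and an absolute Welch t statistic of at least 3), the function names, the probe names, and the toy intensities are all illustrative assumptions, not values or methods taken from the review.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances).

    Each sample needs at least two values, because statistics.variance
    computes the sample variance.
    """
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    diff = mean(a) - mean(b)
    if se == 0.0:
        # No within-group spread: report 0 for equal means, +/-inf otherwise.
        return 0.0 if diff == 0.0 else math.copysign(math.inf, diff)
    return diff / se


def differentially_expressed(expr, cases, controls,
                             min_abs_log2fc=1.0, min_abs_t=3.0):
    """Return sorted probe IDs that pass both (assumed) cut-offs.

    expr maps probe ID -> {sample ID: log2 intensity}.
    """
    hits = []
    for probe, values in expr.items():
        a = [values[s] for s in cases]
        b = [values[s] for s in controls]
        log2fc = mean(a) - mean(b)  # difference of log2 means
        if abs(log2fc) >= min_abs_log2fc and abs(welch_t(a, b)) >= min_abs_t:
            hits.append(probe)
    return sorted(hits)


# Hypothetical toy data: three COPD samples (c1-c3), three controls (n1-n3).
expr = {
    "probe_A": {"c1": 9.0, "c2": 9.2, "c3": 8.8, "n1": 7.0, "n2": 7.1, "n3": 6.9},
    "probe_B": {"c1": 5.0, "c2": 5.1, "c3": 4.9, "n1": 5.0, "n2": 5.2, "n3": 4.8},
}
hits = differentially_expressed(expr, ["c1", "c2", "c3"], ["n1", "n2", "n3"])
print(hits)
```

In a workflow like the one in Figure 4, probes that survive such a filter would then be passed on to hierarchical clustering and confirmed by real-time quantitative PCR. A real analysis would also correct for multiple testing across thousands of probes, which this sketch omits.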

Chen and Wang Journal of Clinical Bioinformatics 2011, 1:35
http://www.jclinbioinformatics.com/content/1/1/35


pulmonary disease. NHLBI/WHO Global Initiative for Chronic Obstructive Lung Disease (GOLD) Workshop summary. Am J Respir Crit Care Med 2001, 163(5):1256-1276.

10. Silverman EK, Spira A, Pare PD: Genetics and genomics of chronic obstructive pulmonary disease. Proc Am Thorac Soc 2009, 6(6):539-542.

11. DeMeo DL, Mariani T, Bhattacharya S, Srisuma S, Lange C, Litonjua A, Bueno R, Pillai SG, Lomas DA, Sparrow D, et al: Integration of genomic and genetic approaches implicates IREB2 as a COPD susceptibility gene. Am J Hum Genet 2009, 85(4):493-502.

12. Arif E, Vibhuti A, Alam P, Deepak D, Singh B, Athar M, Pasha MA: Association of CYP2E1 and NAT2 gene polymorphisms with chronic obstructive pulmonary disease. Clin Chim Acta 2007, 382(1-2):37-42.

13. Vibhuti A, Arif E, Mishra A, Deepak D, Singh B, Rahman I, Mohammad G, Pasha MA: CYP1A1, CYP1A2 and CYBA gene polymorphisms associated with oxidative stress in COPD. Clin Chim Acta 2010, 411(7-8):474-480.

14. Zhan P, Wang J, Wei SZ, Qian Q, Qiu LX, Yu LK, Song Y: TNF-308 gene polymorphism is associated with COPD risk among Asians: meta-analysis of data for 6,118 subjects. Mol Biol Rep 2011, 38(1):219-227.

15. Gupta J, Bhadoria DP, Lal MK, Kukreti R, Chattopadhaya D, Gupta VK, Dabur R, Yadav V, Chhillar AK, Sharma GL: Association of the PIM3 allele of the alpha-1-antitrypsin gene with chronic obstructive pulmonary disease. Clin Biochem 2005, 38(5):489-491.

16. Saxena S, Kumar R, Madan T, Gupta V, Muralidhar K, Sarma PU: Association of polymorphisms in pulmonary surfactant protein A1 and A2 genes with high-altitude pulmonary edema. Chest 2005, 128(3):1611-1619.

17. Arif E, Vibhuti A, Deepak D, Singh B, Siddiqui MS, Pasha MA: COX2 and p53 risk-alleles coexist in COPD. Clin Chim Acta 2008, 397(1-2):48-50.

18. Butler MW, Fukui T, Salit J, Shaykhiev R, Mezey JG, Hackett NR, Crystal RG: Modulation of Cystatin A Expression in Human Airway Epithelium Related to Genotype, Smoking, COPD, and Lung Cancer. Cancer Res 2011, 71(7):2572-2581.

19. Vibhuti A, Arif E, Deepak D, Singh B, Qadar Pasha MA: Genetic polymorphisms of GSTP1 and mEPHX correlate with oxidative stress markers and lung function in COPD. Biochem Biophys Res Commun 2007, 359(1):136-142.

20. Sadeghnejad A, Ohar JA, Zheng SL, Sterling DA, Hawkins GA, Meyers DA, Bleecker ER: Adam33 polymorphisms are associated with COPD and lung function in long-term tobacco smokers. Respir Res 2009, 10:21.

21. van Diemen CC, Postma DS, Vonk JM, Bruinenberg M, Schouten JP, Boezen HM: A disintegrin and metalloprotease 33 polymorphisms and lung function decline in the general population. Am J Respir Crit Care Med 2005, 172(3):329-333.

22. Arif E, Ahsan A, Vibhuti A, Rajput C, Deepak D, Athar M, Singh B, Pasha MA: Endothelial nitric oxide synthase gene variants contribute to oxidative stress in COPD. Biochem Biophys Res Commun 2007, 361(1):182-188.

23. Sadeghnejad A, Meyers DA, Bottai M, Sterling DA, Bleecker ER, Ohar JA: IL13 promoter polymorphism 1112C/T modulates the adverse effect of tobacco smoking on lung function. Am J Respir Crit Care Med 2007, 176(8):748-752.

24. Desai AA, Hysi P, Garcia JG: Integrating genomic and clinical medicine: searching for susceptibility genes in complex lung diseases. Transl Res 2008, 151(4):181-193.

25. Patterson SD, Aebersold RH: Proteomics: the first decade and beyond. Nat Genet 2003, 33(Suppl):311-323.

26. Marko-Varga G, Fehniger TE: Proteomics and disease–the challenges for technology and discovery. J Proteome Res 2004, 3(2):167-178.

27. Lee EJ, In KH, Kim JH, Lee SY, Shin C, Shim JJ, Kang KH, Yoo SH, Kim CH, Kim HK, et al: Proteomic analysis in lung tissue of smokers and COPD patients. Chest 2009, 135(2):344-352.

28. Sepper R, Prikk K: Proteomics: is it an approach to understand the progression of chronic lung disorders? J Proteome Res 2004, 3(2):277-281.

29. Ohlmeier S, Vuolanto M, Toljamo T, Vuopala K, Salmenkivi K, Myllarniemi M, Kinnula VL: Proteomics of human lung tissue identifies surfactant protein A as a marker of chronic obstructive pulmonary disease. J Proteome Res 2008, 7(12):5125-5132.

30. Gray RD, MacGregor G, Noble D, Imrie M, Dewar M, Boyd AC, Innes JA, Porteous DJ, Greening AP: Sputum proteomics in inflammatory and suppurative respiratory diseases. Am J Respir Crit Care Med 2008, 178(5):444-452.

31. Bon JM, Leader JK, Weissfeld JL, Coxson HO, Zheng B, Branch RA, Kondragunta V, Lee JS, Zhang Y, Choi AM, et al: The influence of radiographic phenotype and smoking status on peripheral blood biomarker patterns in chronic obstructive pulmonary disease. PLoS One 2009, 4(8):e6865.

32. York TP, van den Oord EJ, Langston TB, Edmiston JS, McKinney W, Webb BT, Murrelle EL, Zedler BK, Flora JW: High-resolution mass spectrometry proteomics for the identification of candidate plasma protein biomarkers for chronic obstructive pulmonary disease. Biomarkers 2010, 15(4):367-377.

33. Merkel D, Rist W, Seither P, Weith A, Lenter MC: Proteomic study of human bronchoalveolar lavage fluids from smokers with chronic obstructive pulmonary disease by combining surface-enhanced laser desorption/ionization-mass spectrometry profiling with mass spectrometric protein identification. Proteomics 2005, 5(11):2972-2980.

34. Bozinovski S, Hutchinson A, Thompson M, Macgregor L, Black J, Giannakis E, Karlsson AS, Silvestrini R, Smallwood D, Vlahos R, et al: Serum amyloid a is a biomarker of acute exacerbations of chronic obstructive pulmonary disease. Am J Respir Crit Care Med 2008, 177(3):269-278.

35. Bandow JE, Baker JD, Berth M, Painter C, Sepulveda OJ, Clark KA, Kilty I, VanBogelen RA: Improved image analysis workflow for 2-D gels enables large-scale 2-D gel-based proteomics studies–COPD biomarker discovery study. Proteomics 2008, 8(15):3030-3041.

36. Nicholson JK, Connelly J, Lindon JC, Holmes E: Metabonomics: a platform for studying drug toxicity and gene function. Nat Rev Drug Discov 2002, 1(2):153-161.

37. McClay JL, Adkins DE, Isern NG, O’Connell TM, Wooten JB, Zedler BK, Dasika MS, Webb BT, Webb-Robertson BJ, Pounds JG, et al: (1)H nuclear magnetic resonance metabolomics analysis identifies novel urinary biomarkers for lung function. J Proteome Res 2010, 9(6):3083-3090.

38. de Laurentiis G, Paris D, Melck D, Maniscalco M, Marsico S, Corso G, Motta A, Sofia M: Metabonomic analysis of exhaled breath condensate in adults by nuclear magnetic resonance spectroscopy. Eur Respir J 2008, 32(5):1175-1183.

39. Trune DR, Larrain BE, Hausman FA, Kempton JB, Macarthur CJ: Simultaneous measurement of multiple ear proteins with multiplex ELISA assays. Hear Res 2010.

40. Elshal MF, McCoy JP: Multiplex bead array assays: performance evaluation and comparison of sensitivity to ELISA. Methods 2006, 38(4):317-323.

41. Ray CA, Bowsher RR, Smith WC, Devanarayan V, Willey MB, Brandt JT, Dean RA: Development, validation, and implementation of a multiplex immunoassay for the simultaneous determination of five cytokines in human serum. J Pharm Biomed Anal 2005, 36(5):1037-1044.

42. Farlow EC, Patel K, Basu S, Lee BS, Kim AW, Coon JS, Faber LP, Bonomi P, Liptay MJ, Borgia JA: Development of a multiplexed tumor-associated autoantibody-based blood test for the detection of non-small cell lung cancer. Clin Cancer Res 2010, 16(13):3452-3462.

43. Bhattacharya S, Srisuma S, Demeo DL, Shapiro SD, Bueno R, Silverman EK, Reilly JJ, Mariani TJ: Molecular biomarkers for quantitative and discrete COPD phenotypes. Am J Respir Cell Mol Biol 2009, 40(3):359-367.

44. Agusti A, Sobradillo P, Celli B: Addressing the complexity of chronic obstructive pulmonary disease: from phenotypes and biomarkers to scale-free networks, systems biology, and P4 medicine. Am J Respir Crit Care Med 2011, 183(9):1129-1137.

45. Auffray C, Adcock IM, Chung KF, Djukanovic R, Pison C, Sterk PJ: An integrative systems biology approach to understanding pulmonary diseases. Chest 2010, 137(6):1410-1416.

46. Comandini A, Marzano V, Curradi G, Federici G, Urbani A, Saltini C: Markers of anti-oxidant response in tobacco smoke exposed subjects: a data-mining review. Pulm Pharmacol Ther 2010, 23(6):482-492.

47. Maier D, Kalus W, Wolff M, Kalko SG, Roca J, Marin de Mas I, Turan N, Cascante M, Falciani F, Hernandez M, et al: Knowledge management for systems biology a general and visually driven framework applied to translational medicine. BMC Syst Biol 2011, 5:38.

48. Paoletti M, Camiciottoli G, Meoni E, Bigazzi F, Cestelli L, Pistolesi M, Marchesi C: Explorative data analysis techniques and unsupervised clustering methods to support clinical assessment of Chronic Obstructive Pulmonary Disease (COPD) phenotypes. J Biomed Inform 2009, 42(6):1013-1021.

49. Studer SM, Kaminski N: Towards systems biology of human pulmonary fibrosis. Proc Am Thorac Soc 2007, 4(1):85-91.

50. Kauffmann F: Post-genome respiratory epidemiology: a multidisciplinary challenge. Eur Respir J 2004, 24(3):471-480.


51. Wang X, Adler KB, Chaudry IH, Ward PA: Better understanding of organ dysfunction requires proteomic involvement. J Proteome Res 2006, 5(5):1060-1062.

52. Plymoth A, Yang Z, Lofdahl CG, Ekberg-Jansson A, Dahlback M, Fehniger TE, Marko-Varga G, Hancock WS: Rapid proteome analysis of bronchoalveolar lavage samples of lifelong smokers and never-smokers by micro-scale liquid chromatography and mass spectrometry. Clin Chem 2006, 52(4):671-679.

53. Reisdorph NA, Reisdorph R, Bowler R, Broccardo C: Proteomics methods and applications for the practicing clinician. Ann Allergy Asthma Immunol 2009, 102(6):523-529.

54. Lin JL, Bonnichsen MH, Nogeh EU, Raftery MJ, Thomas PS: Proteomics in detection and monitoring of asthma and smoking-related lung diseases. Expert Rev Proteomics 2010, 7(3):361-372.

doi:10.1186/2043-9113-1-35
Cite this article as: Chen and Wang: Significance of bioinformatics in research of chronic obstructive pulmonary disease. Journal of Clinical Bioinformatics 2011 1:35.
