GenePool Annual Report 2010

GenePool, The University of Edinburgh Genomics Facility, Ashworth Laboratories, King’s Buildings, Edinburgh EH9 3JT http://genepool.bio.ed.ac.uk e: [email protected]

The GenePool: delivering the genomics revolution

The University of Edinburgh Genomics Facility

Annual Report 2010

Ashworth Laboratories, King’s Buildings, Edinburgh EH9 3JT

http://genepool.bio.ed.ac.uk

e: [email protected]

Contents

Executive Summary

Introduction to the GenePool

Highlights of the Year 2009-2010

The Coming Year

GenePool Staff

GenePool Instrumentation

Research Highlights: Publications

Research Highlights: Projects Completed 2009-2010

GenePool Funding Sources

GenePool Contact Information

GenePool Annual Report 2010, published 01 September 2010

written by Prof Mark Blaxter (Director) and Dr Karim Gharbi (Scientific Manager)

digital copies available from http://genepool.bio.ed.ac.uk

Executive Summary

The last year has been one of rapid change and consolidation for the GenePool. On 1st July 2009, we officially started our MRC Hub project, with the arrival of our new Scientific Manager Dr Karim Gharbi. Karim has taken on the mantle of managing the operations of the GenePool, and in particular steering the facility through a two-fold increase in staff and instrumentation. Karim brought not only skills in science management, but also much needed expertise in transmission genetics and genomics. He is now the first point of contact for most researchers wishing to access the GenePool facility, offering experimental and project design advice as well as expert overview of the whole next generation sequencing landscape.

With the MRC Hub award, the GenePool has appointed six new staff members: wet lab technologists, bioinformaticians to build the GenePool’s ability to deliver the analyses users require, and support staff to handle the increased project management load. We have been lucky to recruit skilled and dedicated new staff, and these additions to our team have already made a significant impact in improving lab and bioinformatic workflows.

As the MRC Hub project started, we finalised purchase of our second Illumina GAIIx instrument, and completed outright purchase of the Roche 454 Titanium instrument we had on placement from Roche. This sequencing capacity increase was matched by improvements in wet lab infrastructure, including Covaris focussed ultrasound and Hydroshear instruments for DNA shearing, improved size selection equipment and upgrades to the cooling and other infrastructure. On the data analysis side we installed an Isilon 36 Tbyte storage system (recently upgraded to 72 Tbyte) and purchased several high-end compute instruments, including one with 256 Gbyte RAM for genome assembly tasks. We continue to use the University of Edinburgh’s Edinburgh Compute and Data Facility heavily for both analysis and secure storage.

A major goal of the first year of the MRC Hub award was to trial different targeted resequencing methodologies for both bespoke and ‘whole exome’ resequencing studies. We have used and analysed both in-solution and solid-phase oligonucleotide hybridisation capture technologies (Agilent and Nimblegen), and also analysed custom PCR-amplicon libraries.

A second goal was to develop custom multiplexing adapter sets across the range of short-read sequencing applications. This is ongoing, and we will soon be able to offer deep multiplexing for many different sequencing approaches. For RAD sequencing, we have adapters compatible with SbfI overhangs available for up to 64 samples per lane.

Through the year to June 2010, GenePool staff were authors on five peer-reviewed publications relating to sequencing work, ranging from genome resequencing of Drosophila melanogaster, through metagenetic screening of littoral environments, to digital transcriptomics of Trypanosoma parasites. GenePool and/or the NBAF facility was credited or acknowledged on many more.

As the year closed, we received delivery of our Illumina HiSeq 2000 instrument, only the second in the UK and one of the first 10 in Europe. Once this 200 Gbase/week machine is in production, we will be prepared to meet the challenges of the next year of terabases of DNA sequence generation and analysis.

Introduction to the GenePool

The GenePool is a leading next-generation genomics facility based in the Institute of Evolutionary Biology in the School of Biological Sciences of the University of Edinburgh. Using high-throughput sequencing instrumentation, and high-end computing facilities, we deliver collaborative access to cutting-edge genomics tools to the academic community.

This is the first annual report from the GenePool, despite the fact that the facility (under an evolving series of names) has been delivering DNA sequence data to the biological sciences community for over 14 years. The ‘sequencing service’, as we were first known, was set up after a successful application by Prof. Rick Maizels and Prof. Josephine Pemberton to NERC for funding to purchase one of the then-cutting-edge ABI 373 gel-based fluorescent automatic sequencers. This instrument, and the quality of service delivered by Jill Lovell, the technician hired to run it, generated the enthusiasm and institutional support for development of a highly efficient Sanger dideoxy sequencing service for the School of Biological Sciences under the direction of Mark Blaxter. The need for the local sequencing facility and the high quality of service delivered were recognised through the award of additional infrastructure grants from BBSRC and the Darwin Trust to purchase first an ABI 377 gel-based instrument, and then two ABI 3730 capillary-based Sanger sequencing instruments in the early 2000s.

To fully exploit the installation in the facility, and following successful participation in the NERC Environmental Genomics programme, Mark Blaxter bid for, and was awarded, an ongoing contract with NERC Services and Facilities to deliver Sanger sequencing and bioinformatics support to NERC science through the Molecular Genetics Facility. At peak (2008), the Sanger sequencing facility processed nearly 200,000 reactions per year and employed five staff.

In 2006 and 2007 it became clear that the next generation of DNA sequencing instrumentation was approaching maturity, and also that the kinds of platforms that were being developed were well suited to installation in smaller but highly-skilled facilities such as the Sequencing Service.

Expressed sequence tags being sequenced on a GenePool ABI 3730 Sanger sequencing instrument

GenePool throughput, 1996-2007 (Sanger sequencing) and for the first 8 months of 2008 (including Illumina and Roche 454)

Our user community, across the breadth of biological disciplines from medicine to ecology, was also clamouring for access to higher throughput sequencing at reasonable cost. Thus, in late 2007, we negotiated purchase of our first ‘next generation’ sequencing instrument, and concomitantly relaunched the facility under the GenePool banner. Upon installation in early 2008, the first run of the Illumina GAI more than doubled the GenePool’s lifetime sequence output: in the decade to 2008 we had generated 0.8 Gbase of Sanger data, and in the first month of 2008 we generated our second gigabase of 36-base Illumina SOLEXA reads (see graph opposite). In spring 2008, we added a Roche 454 FLX instrument to offer our user base access to the longer reads delivered by the pyrosequencing technology. To exploit the ever-increasing throughputs possible with the Illumina platform, the GAI was upgraded to a GAII, and then to a GAIIx over the next 18 months, and a second GAIIx was purchased in early summer 2009. The Roche 454 FLX was upgraded in spring 2009 to permit delivery of the longer reads available in the ‘Titanium’ version of 454 chemistry.

The increase in data generation capacity was matched by recruitment of both wet lab technology specialists to perform the exacting molecular biological manipulations required for preparation of DNA and RNA samples for sequencing, and skilled bioinformaticians to develop and implement analytical solutions to make sense of the deluge of data. The GenePool’s user base grew to include collaborators from across the UK, but especially from Scotland and the SULSA universities’ academic base. The NERC MGF contract was renewed, and next-generation sequencing was added to the menu of analyses available to NERC science through the rebranded NERC Biomolecular Analysis Facility (NBAF).

In early 2009, the MRC announced its intention to fund academic sequencing centres across the UK, and the GenePool, in collaboration with five other institutions (The Roslin Institute and the MRC Human Genetics Unit in the University of Edinburgh, the University of Glasgow, the University of Dundee and the University of Aberdeen) under Mark Blaxter’s leadership, was successful in winning a £2.5 M grant to become one of four MRC UK sequencing hubs. This investment has permitted purchase of additional instrumentation, appointment of additional staff (including, importantly, ‘spoke’ bioinformaticians in the collaborating institutions) and development of the GenePool’s expertise across the spectrum of next generation sequencing applications.

Today, the GenePool employs 18 staff, and runs two Sanger sequencing ABI 3730 instruments, two Illumina GAIIx instruments, an Illumina HiSeq 2000, and a Roche 454 Titanium instrument. The GenePool can generate and analyse over 100 Gbase of DNA sequence data a week, and yet, importantly, also continues to offer a single-sample Sanger sequencing service to users.

Using the GenePool Illumina GAI to resequence the genome of a wild strain of the nematode Caenorhabditis elegans identifies over 44,000 single nucleotide polymorphisms compared to the reference sequence.

Highlights of the Year 2009-2010

In the year to the end of June 2010, we completed and delivered to our collaborators a total of 98 next-generation sequencing projects. We ran our Illumina Genome Analyzers 68 times and carried out 66 sequencing runs on our Roche 454 instrument, producing over 600 Gbase of raw data.

At the beginning of the year we installed a second GAIIx, and had the existing GAII upgraded to GAIIx. These instruments have performed reasonably through the year. There were global issues with Illumina reagent quality in late 2009, which we also experienced. In spring 2010 we had a prolonged Illumina instrument downtime for a complex series of reasons. These issues affected our ability to deliver service to users on schedule. The Roche 454 Titanium instrument has performed well, although, again, global issues with Roche sequencing reagents compromised run quality in late 2009 and the early part of 2010.

The Sanger dideoxy sequencing service, based on two ABI 3730 instruments, has seen a year-on-year decrease in requests for sequencing, but still generated over 100,000 reads (both DNA sequencing and genotyping) in 2009-2010.

NextGenBUG and other community meetings

Contact with our user community through open meetings is very important to us. The GenePool coordinates the Next Generation Bioinformatics User Group (NextGenBUG), a bimonthly forum for bioinformatics researchers across Scotland. In 2009-2010, there were six NextGenBUG meetings, including a very successful genome assembly workshop. NextGenBUG meetings have included international as well as local invited speakers, and also presentations from technology companies. The GenePool also hosted a NERC-sponsored RAD sequencing meeting, and this is likely to become an annual event. We were also represented at several SULSA technology roadshows across Scotland. The MRC Hubs awards, the opening of BBSRC’s TGAC in Norwich and the founding of several additional next-generation sequencing facilities across the UK led the GenePool, along with Nottingham’s DeepSeq and Liverpool’s Centre for Genomic Research, to organise the first UK all-hands next-generation facilities meeting in Nottingham in August 2010.

Increasing our computing power

As the throughput of the GenePool has increased, so our users have demanded ever more sophisticated analyses of data generated. To meet these demands and to make our informatics pipelines more robust, we have made several major compute equipment investments in 2009-2010. To provide local data storage for both raw data from the sequencing instruments and also shared workspace for projects in the process of analysis, we have invested in an Isilon multi-Tbyte storage system. This technology uses proprietary software and dedicated hardware to provide a very resilient and expandable solution, and we currently have 72 Tbyte of local storage capacity. Larger datasets require larger computers, and so through the year we have upgraded bioinformaticians’ workstations and also invested in a high-memory instrument (256 Gbyte RAM) to permit genome assembly of medium sized (up to 1 Gbase) genomes. We have also negotiated with the Edinburgh Compute and Data Facility to access multi-terabyte, secure, offsite, long-term storage and their 1500-node ‘eddie’ cluster for parallel computing tasks.

Staff recruitment and training

The MRC Hub award included, in addition to funding for the Scientific Manager, funding for wet lab technologists, bioinformaticians and an administrator. Through the year we have been recruiting to the team, and have been fortunate in attracting excellent candidates. All the posts we intended to fill in 2009-2010 have been filled. Two staff members went on maternity leave this year, and we have also been fortunate to have been able to recruit excellent cover for these posts. Staff have been trained in instrument use and key bioinformatics tools, and attended the Roche 454, Applied Biosystems SOLiD, and Illumina European User Group meetings.

Spoke bioinformaticians across Scotland

Integral to the MRC Hub project is the co-funding of “spoke” bioinformatician posts in collaborating institutions at the Universities of Dundee, Glasgow and Aberdeen, the Roslin Institute and the MRC Human Genetics Unit in Edinburgh. Following a successful series of interviews over the year, spoke bioinformaticians are now in post in all five institutions and have started delivering bespoke analyses and support to GenePool users.

AB SOLiD trial

The AB SOLiD technology is an alternative to Illumina SOLEXA for generating multi-gigabase short-read datasets. We, in collaboration with AB and two local academic groups (Mark Woolhouse, IIIR, and David Burt, Roslin), performed a trial install of a SOLiD v3.5 instrument in the GenePool as part of our MRC Hub programme. We constructed fragment and mate-pair libraries and ran the instrument several times. Data were analysed by GenePool bioinformaticians as well as by AB. After the six-month trial, we decided not to proceed to purchase of a SOLiD instrument at this time.

NERC NBAF Facility

During the year we participated in the reapplication to NERC Services and Facilities for continuing funding of the NBAF portion of our activities. In late spring 2010, NERC S&F reviewed three of the five nodes of NBAF, and we have been informed that a recommendation to fund for a further five years has been made.

The Coming Year

In 2011, our plans are to bed in the expansion engendered by the MRC Hub award, develop our wet lab and bioinformatics capabilities to fully exploit our installed sequencing technology base, and to keep a close watch on the developments afoot in the field of single-molecule, ‘third generation’ sequencing technologies with a view to installing one or several platforms in future years.

In June 2010, we took delivery of an Illumina HiSeq 2000, one of the first three to be delivered and installed in the UK. This instrument is specified at approximately four times the throughput of the GAIIx, and we intend that it will become the workhorse of our Facility. The instrument is currently undergoing install and testing, and we hope that it will be fully operational, delivering up to 200 Gbase per week, in September 2010.

Other developments from Illumina include the roll-out of 150 base paired end read chemistries for the GAIIx, and we will trial this in Q4. We will also be extensively testing additional simplifications of the library preparation process for both Roche 454 and Illumina technologies, using sample preparation robots.

In collaboration with other NERC and MRC hubs we are in the advanced stages of discussion concerning the installation of new laboratory information management system (LIMS) software in the GenePool. Our current wiki-based LIMS is unlikely to scale gracefully to the challenge of increased throughput, and we hope to decide which LIMS platform to purchase in Q4.

The GenePool, in collaboration with Mark Blaxter’s research group, Per Smiseth of Edinburgh University, Chris Jiggins of Cambridge University, and Matthew Hegarty of Aberystwyth University, was successful in July 2010 in being awarded two BBSRC and one NERC grants to develop and apply the RAD sequencing methodology to novel organisms. These awards will result in the appointment of two additional staff in the GenePool to develop and deliver RAD sequencing technology to our collaborators.

A key role of the GenePool within the Scottish and UK next-generation sequencing scene is our coordination of meetings and workshops, and we will continue to promote our facility, and best practice in next-generation data acquisition and analysis through this means. Already planned for 2010-2011 are an international RAD sequencing workshop (August 2010), a workshop focussing on next-generation transcriptome assembly (November 2010) and the second UK all-hands next generation sequencing meeting (July 2011). We will also continue to coordinate and promote the bimonthly NextGenBUG meetings across Scotland, and attend vendor-coordinated technology user group meetings. GenePool staff will also attend other national and international meetings to both promote GenePool and to keep abreast of developments in this fast moving field.

Installing the Illumina HiSeq 2000 in the GenePool, July 2010. The HiSeq 2000 achieves its increased throughput partly by running two flow cells (each of which is larger than a GAIIx flowcell) at a time.

GenePool Staff

The GenePool’s mix of high-end equipment, high-throughput data generation, and high-quality analyses could not be run without a cadre of outstanding staff.

Professor Mark Blaxter Director.

Mark is a research scientist with interests in genome evolution and function in a wide range of nonvertebrate animals, especially nematodes and other meiofauna (see http://www.nematodes.org). He has directed the GenePool since its inception (as the Sequencing Service) in 1996.

Dr Karim Gharbi Scientific Manager.

Karim has a research background in fish genetics and genomics, and joined the GenePool in July 2009 as scientific manager. He has oversight of the whole of the facility’s workflows and outputs, whilst continuing to investigate fish genomes using next-generation tools.

Jill Lovell Sanger Sequencing Lab Manager.

Jill has been running the GenePool Sanger sequencing facility since it started in 1996. She is currently on maternity leave.

Andrew Gillies Sanger Sequencing Technologist.

Andy has been running the GenePool’s two ABI 3730 capillary sequencing instruments for over 10 years, and is acting service head while Jill is on maternity leave.

Dr Jenna Mann Sanger Sequencing Technologist.

Jenna has a background in DNA barcoding, and is part of the Sanger sequencing team, delivering tens of thousands of reads per month to users. Jenna joined the team in Spring 2010.

Marian Thomson Senior Next Generation Sequencing Technologist.

Marian joined the GenePool as the first of our next generation instruments (GAI) was installed in early 2008, and now leads a team of 5 technologists dedicated to Illumina sequencing. Marian has decades of molecular biology experience in a range of labs across Edinburgh.

Karolina Grabara Next Generation Sequencing Technologist (Illumina).

Karolina joined the GenePool in 2009 from a local genetics services company. She has been working on the Illumina platforms, particularly focussing on library construction for digital transcriptomics and genome re-sequencing, and is currently on maternity leave.

Jenna Nicholls Next Generation Sequencing Technologist (Illumina).

Jenna is an experienced sequencing technologist, who was previously part of the Sanger team, moved on to work in the Chemistry department, and has recently rejoined the GenePool (2009) to work on the Illumina platforms. Jenna has a particular role in preparing Illumina libraries and long-insert mate-pair libraries for sequencing on both the Illumina and 454 platforms.

Alexi Balmuth Next Generation Sequencing Technologist (Illumina).

Alexi joined the GenePool from the John Innes Institute in early 2010, and works on Illumina library production, specialising in targeted re-sequencing applications.

Anna Montazam Next Generation Sequencing Technologist (Roche 454).

Anna joined the GenePool on the ABI 3730 team, and now runs the Roche 454 sequencer, producing libraries from genomic DNA and RNA.

Denis Cleven Next Generation Sequencing Technologist (Roche 454).

Denis has a background in forensic genetics, and works with Anna to deliver 454 Titanium data to collaborators.

Nicola Wrobel Next Generation Sequencing Technologist.

Nicola has a background in Sanger sequencing and drove the 454 platform. After recently returning from maternity leave, she is now the GenePool’s RNA library specialist, delivering samples to both Illumina and 454 platforms.

Urmi Trivedi Next Generation Sequencing Lead Bioinformatician.

Urmi joined the GenePool in 2008 from the Leeds MSc in Bioinformatics, as we installed our first GAI. She currently runs the bioinformatics team, and has particular skills in genome resequencing and de novo assembly.

Stephen Bridgett Next Generation Sequencing Bioinformatician.

Stephen is also a graduate of the Leeds Bioinformatics MSc course, and focuses on Roche 454 sequence analyses.

Timothee Cezard Next Generation Sequencing Bioinformatician.

Tim joined the GenePool from the Vancouver Genome Sequencing Center in 2010, and has particular skills in read mapping and analysis of RNASeq, ChIP and related experiments.

Nick Moir GenePool Systems Administrator.

Nick joined the GenePool from the University of Edinburgh’s computer systems support team, and he now has responsibility for installation and maintenance of the GenePool’s growing bank of high-end computing and data storage instruments.

Dr. John Davey Support Bioinformatician.

John works part time in the GenePool, delivering bioinformatics support to collaborators in the Ashworth Laboratories and the Centre for Immunity, Infection and Evolution. The rest of his time is occupied on research projects using next generation data.

Christine Bradbury Administrator.

Christine recently joined the GenePool from Napier University. She provides administrative and clerical support to GenePool staff, and is also responsible for processing sample submissions and maintaining the website.

Anne Wyllie Financial Administrator.

Anne is the GenePool’s financial assistant, primarily processing user orders and invoices.

Sujai Kumar also assisted with GenePool bioinformatics analyses for a year (2008-2009) before joining the University of Edinburgh postgraduate student programme.

We are also very grateful for support from the School of Biological Sciences finance and administrative staff (Jayne Glendinning, David Clark, David Walker and their team in Finance; Kim Lainson, Sharon Grieg and their team in HR).

GenePool Instrumentation

Sample Processing.

Preparation of samples for sequencing (whether traditional Sanger or next-generation sequencing) requires a well-equipped laboratory. The GenePool occupies modern laboratory space in the Ashworth Laboratories complex, with dedicated, temperature controlled major instrument rooms, wet and bioinformatics laboratory space and linked offices. We have instruments for extraction and purification of nucleic acids, a wide range of PCR machines and robotics for preparation of sequencing libraries. Many of these are available for use by other researchers: for example our Covaris focused acoustics DNA shearing instrument is used by collaborators preparing RAD sequencing libraries.

Data generation.

The GenePool operates four different sequencing platforms: two ABI 3730 capillary Sanger sequencers, a Roche 454 Titanium pyrosequencer, two Illumina GAIIx sequencers and an Illumina HiSeq 2000 sequencer.

| Platform | Instruments (year installed) | Throughput | Read length | Major uses |
| --- | --- | --- | --- | --- |
| ABI 3730 | 2 (2002 & 2004) | 48 samples per run cycle of 2 hrs | up to 800 bases; also used for microsatellite fragment analyses | Sanger sequencing, genotyping |
| Roche 454 Titanium | 1 (2008) | Up to 1 million sequences per 12 hr run | 350-450 bases | Genome sequencing and resequencing, targeted resequencing, amplicon sequencing (metagenetic analyses), transcriptome sequencing |
| Illumina GAIIx | 2 (2008 & 2009) | 400 million 100 base sequences per 10 day run | up to 150 bases | Genome sequencing and resequencing, targeted resequencing, RNASeq, small RNA (miRNA) sequencing, ChIP and RNAIP sequencing, RADSeq |
| Illumina HiSeq 2000 | 1 (2010) | 2 billion 100 base sequences per 8 day run | up to 100 bases | Genome sequencing and resequencing, targeted resequencing, RNASeq, small RNA (miRNA) sequencing, ChIP and RNAIP sequencing, RADSeq |
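The Illumina throughput figures in the table above can be converted into gigabases per run and per week with simple arithmetic. The sketch below is illustrative only: the class and function names are hypothetical, and the weekly figure assumes instruments run back to back with no downtime, which is an idealisation rather than a description of actual operations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    """Throughput figures for one platform, as quoted in the instrumentation table."""
    name: str
    instruments: int      # number of instruments installed
    reads_per_run: float  # sequences produced per run
    read_length: int      # bases per sequence
    run_days: float       # duration of one run in days

    def gbase_per_run(self) -> float:
        # Yield of a single run on a single instrument, in gigabases.
        return self.reads_per_run * self.read_length / 1e9

    def gbase_per_week(self) -> float:
        # Idealised weekly yield across all instruments of this type,
        # assuming continuous back-to-back runs (an assumption, not a measured rate).
        return self.instruments * self.gbase_per_run() * 7 / self.run_days


# Values taken from the table; only the Illumina rows quote a read count and length.
PLATFORMS = [
    Platform("Illumina GAIIx", 2, 400e6, 100, 10),
    Platform("Illumina HiSeq 2000", 1, 2e9, 100, 8),
]


def report() -> None:
    for p in PLATFORMS:
        print(f"{p.name}: {p.gbase_per_run():.0f} Gbase per run, "
              f"{p.gbase_per_week():.0f} Gbase per week (idealised)")


if __name__ == "__main__":
    report()
```

With the table's figures this gives 40 Gbase per GAIIx run and 200 Gbase per HiSeq 2000 run. Under the no-downtime assumption, the weekly totals come to 56 Gbase for the two GAIIx instruments combined and 175 Gbase for the HiSeq 2000.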

Data Analysis.

The GenePool makes use of both locally installed compute facilities and the University of Edinburgh’s Edinburgh Compute and Data Facility (ECDF). As well as powerful user workstations (4-16 core, 16-32 Gbyte RAM Linux instruments), we also have shared facilities such as a 256 Gbyte, 32-core high-end instrument for genome assembly and other exacting tasks. We have secure local storage (72 Tbyte Isilon system), and use the ECDF for secure off-site backup.

Research highlights: Publications

The GenePool has a clear policy regarding publication of data produced and analyses performed within the facility. We expect at the least acknowledgement and, for projects where staff have played key roles in developing novel wet lab or bioinformatics analyses, authorship.

Publications with GenePool staff as co-authors

Bradley AJ, Lurain NS, Ghazal P, Trivedi U, Cunningham C, Baluchova K, Gatherer D, Wilkinson GW, Dargan DJ, Davison AJ. 2009. High-throughput sequence analysis of variants of human cytomegalovirus strains Towne and AD169. J. Gen. Virol. 90, 2375-2380.

Ferguson L, Lee SF, Chamberlain N, Nadeau N, Joron M, Baxter S, Wilkinson P, Papanicolaou A, Kumar S, Thuan-Jin Clark R, Davidson C, Glithero R, Beasley H, Vogel H, Ffrench-Constant RH, Jiggins CD. 2010. Characterization of a hotspot for mimicry: assembly of a butterfly wing transcriptome to genomic sequence at the HmYb/Sb locus. Mol. Ecol. 19 s1: 240-254.

Green S, Studholme DJ, Laue BE, Dorati F, Lovell H, Arnold D, Cottrell JE, Bridgett S, Blaxter M, Huitema E, Thwaites R, Sharp PM, Jackson RW, Kamoun S. 2010. Comparative genome analysis provides insights into the evolution and adaptation of Pseudomonas syringae pv. aesculi on Aesculus hippocastanum. PLoS ONE 5(4): e10224.

Keightley PD, Trivedi U, Thomson M, Oliver F, Kumar S, and Blaxter ML. 2009. Analysis of the genome sequences of three Drosophila melanogaster spontaneous mutation accumulation lines. Genome Res. 19: 1195-1201.

Veitch NJ, Johnson PCD, Trivedi U, Terry S, Wildridge D, MacLeod A. 2010. Digital gene expression analysis of two life cycle stages of the human-infective parasite, Trypanosoma brucei gambiense reveals differentially expressed clusters of co-regulated genes. BMC Genomics 11: 124.

Publications with GenePool support acknowledged

Cunningham C, Gatherer D, Hilfrich B, Baluchova K, Dargan DJ, Thomson M, Griffiths PD, Wilkinson GWG, Schulz T, Davison AJ. 2009. Sequences of complete human cytomegalovirus genomes from infected cell cultures and clinical specimens. J. Gen. Virol. 91: 605-615.

Granneman S, Kudla G, Petfalski E, Tollervey D. 2009. Identification of protein binding sites on U3 snoRNA and pre-rRNA by UV cross-linking and high-throughput analysis of cDNAs. Proc. Natl Acad. Sci. USA 106: 9613-9618.

Granneman S, Petfalski E, Swiatkowska A, Tollervey D. 2010. Cracking pre-40S ribosomal subunit structure by systematic analyses of RNA-protein cross-linking. EMBO J., doi:10.1038/emboj.2010.86

Obbard DJ, Welch JJ, Kim KW, Jiggins FM. 2009. Quantifying adaptive evolution in the Drosophila immune system. PLoS Genet. 5: e1000698.

Simmer F, Buscaino A, Kos-Braun IC, Kagansky A, Boukaba A, Urano T, Kerr AR, Allshire RC. 2010. Hairpin RNA induces secondary small interfering RNA synthesis and silencing in trans in fission yeast. EMBO Rep. 11:112-118.

Research Highlights: Projects Completed 2009-2010

GenePool projects completed between 1 July 2009 and 30 June 2010.

Project number | Principal Investigator | Institution | Platform | Application
2008034 | David Shuker | University of Edinburgh | Illumina solexa | Deep SAGE
2008035 | Ken Forbes | University of Edinburgh | Illumina solexa | Genome re-sequencing
2008041 | Peter Simmonds | University of Edinburgh | Roche 454 | De novo genome
2008046 | Peter Hedley | Scottish Crop Research Institute | Roche 454 | De novo transcriptome
2008049 | Bob Dalziel | University of Edinburgh | Illumina solexa | Small RNA
2008053 | John Hopkins | University of Edinburgh | Illumina solexa | Deep SAGE
2008054 | Nigel Saunders | University of Edinburgh | Illumina solexa | Genome re-sequencing
2008055 | Seiran Sumner | Institute of Zoology, London | Roche 454 | De novo transcriptome
2008057 | Hendrik Schaefer | University of Warwick | Roche 454 | De novo genome
2008069 | Bob Dalziel | University of Edinburgh | Illumina solexa | Deep SAGE
2008070 | Betty Devitt | University of Edinburgh | Illumina solexa | Genome re-sequencing
2008074 | Robin Allshire | University of Edinburgh | Illumina solexa | Genome re-sequencing
2008076 | Mark Blaxter | University of Edinburgh | Roche 454 | De novo transcriptome
2008079 | Giovanni Widmer | Cummings School of Veterinary Medicine, USA | Illumina solexa | Genome re-sequencing
2008080 | Roy Bicknell | University of Birmingham | Illumina solexa | RNA-seq
2008081 | Ferenc Nagy | University of Edinburgh | Illumina solexa | Genome re-sequencing
2008082 | Robin Allshire | University of Edinburgh | Illumina solexa | Small RNA
2008083 | Sam Griffiths-Jones | University of Manchester | Illumina solexa | Small RNA
2009001 | Sarah Green | Forest Research (Edinburgh) | Roche 454 | De novo genome
2009002 | David Longbottom | Moredun Institute | Roche 454 | De novo genome
2009003 | Andrew Hudson | University of Edinburgh | Roche 454 | De novo transcriptome
2009004 | David Taylor | University of Edinburgh | Roche 454 | De novo transcriptome
2009005 | Ian Jackson | MRC Human Genetics Unit | Roche 454 | Targeted re-sequencing
2009006 | Joseph Hughes | University of Glasgow | Roche 454 | De novo transcriptome
2009008 | David Tollervey | University of Edinburgh | Illumina solexa | Small RNA
2009009 | Nick Gilbert | University of Edinburgh | Illumina solexa | RNAIP-seq
2009010 | Joseph Hughes | University of Glasgow | Roche 454 | De novo transcriptome
2009013 | Maud Swanson | Scottish Crop Research Institute | Roche 454 | De novo genome
2009014 | Andy Tait | University of Glasgow | Illumina solexa | Genome re-sequencing
2009015 | Andy Tait | University of Glasgow | Illumina solexa | Genome re-sequencing
2009016 | Annette MacLeod | University of Glasgow | Illumina solexa | Deep SAGE
2009017 | Andy Tait | University of Glasgow | Illumina solexa | Genome re-sequencing
2009018 | Nick Gilbert | University of Edinburgh | Illumina solexa | RNA-seq
2009023 | Jim Allan | University of Edinburgh | Illumina solexa | ChIP-seq
2009025 | Wickneswari Ratnam | Universiti Kebangsaan Malaysia | Illumina solexa | RNA-seq
2009029 | Javier Caceres | MRC Human Genetics Unit | Illumina solexa | Small RNA
2009032 | Nigel Saunders | University of Edinburgh | Illumina solexa | Targeted re-sequencing
2009034 | Ben Pickard | University of Edinburgh | Illumina solexa | Targeted re-sequencing
2009035 | David Dowling | Institute of Technology, Carlow, Ireland | Roche 454 | De novo genome
2009036 | Nick Gilbert | University of Edinburgh | Illumina solexa | RNA-seq
2009037 | Karen Stevenson | Moredun Research Institute | Illumina solexa | De novo genome
2009039 | Tom Owen-Hughes | University of Dundee | Illumina solexa | ChIP-seq
2009040 | Mark Blaxter | University of Edinburgh | Illumina solexa | Deep SAGE
2009041 | Ian Chambers | University of Edinburgh | Illumina solexa | Deep SAGE
2009043 | Jack Gilbert | Plymouth Marine Laboratory | Illumina solexa | Genome re-sequencing
2009049 | Adrian Bird | University of Edinburgh | Illumina solexa | Genome re-sequencing
2009051 | Laura Piddock | University of Birmingham | Illumina solexa | Genome re-sequencing

GenePool projects completed between 1 July 2009 and 30 June 2010 (continued)

Project number | Principal Investigator | Institution | Platform | Application
2009052 | David Shuker | University of Edinburgh | Illumina solexa | Deep SAGE
2009053 | Wickneswari Ratnam | Universiti Kebangsaan Malaysia | Illumina solexa | Small RNA
2009054 | Achim Schnaufer | University of Edinburgh | Illumina solexa | Genome re-sequencing
2009056 | Karen Stevenson | Moredun Research Institute | Roche 454 | De novo genome
2009059 | Robert Hill | MRC Human Genetics Unit | Illumina solexa | ChIP-seq
2009060 | Ross Fitzgerald | University of Edinburgh | Illumina solexa | Genome re-sequencing
2009061 | Sarah Green | Forest Research | Illumina solexa | De novo genome
2009063 | Andrew Davison | University of Glasgow | Illumina solexa | Genome re-sequencing
2009064 | Thamarai Schneiders | University of Edinburgh | Illumina solexa | Small RNA
2009070 | David Taylor | University of Edinburgh | Illumina solexa | Deep SAGE
2009070 | Mark Blaxter | University of Edinburgh | Roche 454 | De novo genome
2009071 | Chris Boyd | University of Edinburgh | Illumina solexa | Deep SAGE
2009075 | Masaru Nakamoto | University of Aberdeen | Illumina solexa | Deep SAGE
2009076 | Alasdair Nisbet | Moredun Research Institute | Illumina solexa | Deep SAGE
2009077 | Mark Blaxter | University of Edinburgh | Illumina solexa | Genome re-sequencing
2009078 | Achim Schnaufer | University of Edinburgh | Illumina solexa | Genome re-sequencing
2009080 | Michael Fontaine | Moredun Research Institute | Illumina solexa | Genome re-sequencing
2009083 | Ian Chambers | University of Edinburgh | Illumina solexa | Deep SAGE
2009087 | Jean Beggs | University of Edinburgh | Illumina solexa | Small RNA
2009088 | Martin Holmstrup | Aarhus University | Roche 454 | De novo genome
2009092 | Chris Boyd | University of Edinburgh | Illumina solexa | RNA-seq
2009100 | Jeremy Mottram | University of Glasgow | Illumina solexa | Genome re-sequencing
2009102 | Joseph Hughes | University of Glasgow | Roche 454 | De novo transcriptome
2009103 | Keith Matthews | University of Edinburgh | Illumina solexa | Genome re-sequencing
2009104 | Mark Brown | Trinity College Dublin | Roche 454 | De novo transcriptome
2009108 | Ian Johnston | University of St Andrews | Illumina solexa | Targeted re-sequencing
2009109 | David Tollervey | University of Edinburgh | Illumina solexa | Small RNA
2009115 | Anne Donaldson | University of Aberdeen | Illumina solexa | ChIP-seq
2009117 | David Tollervey | University of Edinburgh | Illumina solexa | Small RNA
2009117 | Brian Charlesworth | University of Edinburgh | Roche 454 | De novo transcriptome
2009118 | Mark Brown | Trinity College Dublin | Roche 454 | De novo transcriptome
2009119 | Paul Hunt | University of Edinburgh | Illumina solexa | Genome re-sequencing
2009120 | Simon Baxter | University of Cambridge | Illumina solexa | RAD-seq
2009123 | Angus Davidson | University of Nottingham | Illumina solexa | RAD-seq
2009124 | Jean Beggs | University of Edinburgh | Illumina solexa | Small RNA
2009126 | Gill Malin | John Innes Centre | Illumina solexa | RNA-seq
2009127 | Richard Carter | University of Edinburgh | Illumina solexa | Genome re-sequencing
2009130 | Annette MacLeod | University of Glasgow | Illumina solexa | Deep SAGE
2009134 | Robin Allshire | University of Edinburgh | Illumina solexa | ChIP-seq
2009135 | Tom Owen-Hughes | University of Dundee | Illumina solexa | ChIP-seq
2009139 | Ross Fitzgerald | University of Edinburgh | Roche 454 | De novo genome
2009141 | Tom Owen-Hughes | University of Dundee | Illumina solexa | ChIP-seq
2009146 | Mikael Björklund | University of Dundee | Illumina solexa | ChIP-seq
2009149 | Ian Denholm | Rothamsted Research | Illumina solexa | Deep SAGE
2010001 | David Tollervey | University of Edinburgh | Illumina solexa | Small RNA
2010004 | Jeff Williams | University of Dundee | Illumina solexa | Deep SAGE
2010008 | John Fazakerley | University of Edinburgh | Illumina solexa | Small RNA
2010011 | Mark Bradley | University of Edinburgh | Illumina solexa | Genome re-sequencing
2010013 | Sohail Ali | Plymouth Marine Laboratory | Illumina solexa | De novo genome
2010014 | Stephen Skill | Plymouth Marine Laboratory | Illumina solexa | De novo genome
2010015 | Stephen Skill | Plymouth Marine Laboratory | Roche 454 | De novo genome

GenePool Funding Sources

The GenePool operates on a collaborative, cost-recovery basis, and this covers the running costs of the facility. The core funding received by the GenePool serves to permit installation of large pieces of equipment and expansion of the kinds of analyses offered. We are very grateful to the following for their support for the activities of the GenePool.

School of Biological Sciences and College of Science and Engineering, University of Edinburgh

Support in the context of the MRC Hub award and for acquisition of new instrumentation

(£0.46 M)

Natural Environment Research Council

ongoing support through NERC Services and Facilities supporting NERC Biomolecular Analysis Facility access by NERC scientists; also funding in support of meetings

(circa £0.4 M annually since 2002)

Medical Research Council

MRC Sequencing Hub award to support expansion of GenePool instrumentation, development of staff skill base, and appointment of ‘spokes’ bioinformaticians in collaborating institutions

(£2.3 M)

The Darwin Trust of Edinburgh

support for purchase of sequencing instrumentation

(£0.1 M)

Biotechnology and Biological Sciences Research Council

support for purchase of sequencing instrumentation

(£0.16 M)

Scottish Universities Life Science Alliance

salary support for sequencing technologist and next generation bioinformatician

(£0.4 M)

The Wellcome Trust

support for purchase of instrumentation; support via the Wellcome Trust Centre for Cell Biology, Edinburgh, for a sequencing technologist

(circa £30k annually)

Scottish Bioinformatics Forum

support for “Next Generation Bioinformatics User Group” meetings

(£8 k)

We are grateful to both Roche and Applied Biosystems, who placed instruments in the GenePool for testing and production sequencing.

GenePool Contact Information

The GenePool
Ashworth Laboratories
The King's Buildings
The University of Edinburgh
Edinburgh EH9 3JT
Scotland

phone: +44 (0)131 651 3633
fax: +44 (0)131 651 3629
web: http://genepool.bio.ed.ac.uk

email contacts:

General Enquiries: [email protected]
NERC Biomolecular Analysis Facility Enquiries: [email protected]
Illumina SOLEXA Enquiries: [email protected]
Roche 454 Enquiries: [email protected]
Bioinformatics Enquiries: [email protected]
Administration and Invoicing Enquiries: [email protected]

GenePool Scientific Manager

Dr Karim Gharbi
+44 (0)131 650 7489
[email protected]
Room 357, Ashworth Laboratories, King’s Buildings, Edinburgh, EH9 3JT

GenePool Director

Prof. Mark Blaxter
+44 (0)131 650 6760
[email protected]
Institute of Evolutionary Biology, Ashworth Laboratories, King’s Buildings, Edinburgh, EH9 3JT