Using BioGrids for RNA-Seq on AWS and Your Laptop
TMEC 304, July 25, 2019
James Vincent
BioGrids Consortium, Harvard Medical School
biogrids.org  help@biogrids.org

Today we will .....
Install software with BioGrids
Run an RNA-Seq workflow
Replicate the above on AWS and a laptop

BioGrids on AWS
faithful laptop    AWS EC2 Instance

RNA-Seq Workflow
open a terminal
open a browser: biogrids.org/wiki/workshops

Subtle Things
• Capsule Environment
• .bashrc/.profile not changed
• binary installs

https://www.biostars.org/p/189261/: This seems to be a bug when installing fastqc using apt-get install fastqc
STARmanual.pdf: …. which creates problems for STAR compilation. One option to avoid this problem is to install gcc …….
http://github.gersteinlab.org/exceRpt/ManualInstallation: …. generally not recommended … <snip> … instructions on how to install exceRpt and its various dependencies will [one day] be listed toward the bottom of this page.
Avoid Time Sinks

Reproducible Research
$ STAR --sbapp:d

Capsule:STAR using star version 2.5.3a
Version information for: /programs/i386-mac/star

Default version: 2.5.3a
In-use version: 2.5.3a
Other available versions: none
Overrides use this shell variable: STAR_M

Self Documenting
Include in workflow:
STAR --sbapp:d
samtools --sbapp:d

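One way to include these version reports in a workflow is to append them to a log file before any analysis step runs. The sketch below assumes each tool accepts the `--sbapp:d` flag shown above; the function name `record_versions` and the log file name are hypothetical.

```shell
# Append the capsule version report of each named tool to a provenance log.
# Assumption: every tool accepts the --sbapp:d flag shown on the slide.
record_versions() {
    local log="$1"
    shift
    local tool
    for tool in "$@"; do
        {
            echo "## ${tool}"
            if command -v "$tool" >/dev/null 2>&1; then
                "$tool" --sbapp:d 2>&1
            else
                echo "not found on PATH"
            fi
            echo
        } >> "$log"
    done
}
```

Calling `record_versions versions.log STAR samtools` at the top of a workflow script would leave a record of the versions in use next to the results.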
Config File
[installer]
site = biogrid-production
key = 70rYFBTDnmCr93VUklfbf1s3M4jdyC9bFVYHew==
user = jvincent1

[packages]
star@2.5.3a = i386-mac
samtools@1.5 = i386-mac
igv@2.4.10 = i386-mac

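Because the config file is plain text, the pinned versions can be read back with standard tools. The sketch below is not part of BioGrids; it is a hypothetical helper, `list_packages`, that assumes the `[packages]` section uses the `name@version = platform` layout shown above.

```shell
# list_packages FILE: print "name version platform" for each [packages] entry.
# Assumption: entries look like "star@2.5.3a = i386-mac", as on the slide.
list_packages() {
    awk '
        /^\[/ { in_pkgs = ($0 == "[packages]"); next }  # track the current section
        in_pkgs && /=/ {
            split($0, kv, "=")        # kv[1] is name@version, kv[2] is the platform
            split(kv[1], nv, "@")     # nv[1] is the name, nv[2] is the version
            gsub(/[ \t]/, "", nv[1])
            gsub(/[ \t]/, "", nv[2])
            gsub(/[ \t]/, "", kv[2])
            print nv[1], nv[2], kv[2]
        }
    ' "$1"
}
```

Run against the file above, this would list one line per package, which is convenient for pasting into a methods section or comparing two machines.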
This Is Handy
biogrids save mysetup.txt    biogrids reactivate mysetup.txt
faithful laptop    new workstation

BioGrids is Portable
faithful laptop
laboratory workstation
HMS O2 compute cluster
BCH compute cluster

BioGrids Benefits
save time - reduce headaches
scale and share workflows
part of reproducible research

BioGrids Consortium
Compute Infrastructure
Personnel: SBGrid, BioGrids
Funding: HMS Tools and Technologies Committee

Why BioGrids?
You    Cover of Nature
compile software
compile libraries
manage dependencies
manage versions
manage paths
change versions ....
learn to use software
optimize workflow
get science done

RNA-Seq Overview
Harvard Chan Bioinformatics Core (HBC)
http://bioinformatics.sph.harvard.edu/training

RNA-Seq Overview
hbctraining.github.io/Intro-to-rnaseq-hpc-O2
Biological samples / Library prep
sequence reads
quality check
adapter/quality trimming
splice-aware mapping to genome
count reads associated with genes
statistical analysis: identify differentially expressed genes

RNA Prep

Sequencing

RNA-Seq Overview
hbctraining.github.io/Intro-to-rnaseq-hpc-O2
Workflow step: BioGrids app
Biological samples / Library prep
sequence reads
quality check: FastQC
adapter/quality trimming: (trimmomatic)
splice-aware mapping to genome: STAR
count reads associated with genes: subRead
statistical analysis: identify differentially expressed genes

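The steps from quality check through read counting can be sketched as a short shell function. This is a dry run that only prints the commands; the sample name, directory names, and thread count are hypothetical, the optional trimming step is left out, and `featureCounts` is the read-counting program in the Subread package. It assumes single-end reads and a STAR genome index that has already been built.

```shell
# Dry run: print each command instead of executing it.
# To execute for real, change the body of run() to: "$@"
run() { echo "$@"; }

# rnaseq_commands SAMPLE GENOME_DIR GTF
# Assumptions: single-end reads in SAMPLE.fq, a prebuilt STAR index in
# GENOME_DIR, and a GTF gene annotation. All file names are hypothetical.
rnaseq_commands() {
    local sample="$1" genome_dir="$2" gtf="$3"
    run mkdir -p qc
    # Quality check
    run fastqc -o qc "${sample}.fq"
    # Splice-aware mapping to the genome
    run STAR --runThreadN 4 --genomeDir "$genome_dir" \
        --readFilesIn "${sample}.fq" \
        --outSAMtype BAM SortedByCoordinate \
        --outFileNamePrefix "${sample}_"
    # Count reads associated with genes
    run featureCounts -T 4 -a "$gtf" -o "${sample}_counts.txt" \
        "${sample}_Aligned.sortedByCoord.out.bam"
}
```

Keeping the commands in one function makes it easier to run the same workflow unchanged on a laptop, a cluster, or an AWS instance.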
Mapped Reads

Check Results
IGV
1: Genomes / Load Genome from File... (chr1_MOV10.fa)
2: File / Load from file... (.gtf file)
3: File / Load from file... (.bam file)

DevOps with BioGrids
workflow: bioinformatics
software stack: BioGrids
compute resources: laptop, HMS O2, AWS

AWS Hands On
https://sbgrid.signin.aws.amazon.com/console
username: workshop21
password: Biogrids_Workshop1

AWS - Amazon Web Services

AWS ParallelCluster
aws-parallelcluster.readthedocs.io
scalable HPC cluster

help@biogrids.org
BioGrids is funded by the Harvard Medical School
Tools and Technologies Committee
Additional Resources
ENCODE data files can be found here for CalTech RNA-Seq: http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeCaltechRnaSeq/
Use this bam file: wgEncodeCaltechRnaSeqK562R1x75dAlignsRep1V2
Region of MOV10 gene: chr1:113,214,934-113,243,900
How to download whole genome:
- UCSC ftp site: hgdownload.cse.ucsc.edu
- UCSC website: http://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes/
- UCSC recommends using an ftp client for large file downloads
- chr1 is only 70M

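The MOV10 region above is written with thousands separators, which is easy to read but awkward to pass between tools. A minimal sketch of two hypothetical helpers, `normalize_region` and `region_length`, that strip the commas and compute the region size:

```shell
# normalize_region REGION: strip thousands separators from a region string.
normalize_region() {
    printf '%s\n' "$1" | tr -d ','
}

# region_length REGION: number of bases in chrom:start-end (inclusive).
region_length() {
    local coords start end
    coords="$(normalize_region "$1")"
    coords="${coords#*:}"     # drop the chromosome name
    start="${coords%-*}"
    end="${coords#*-}"
    echo $(( end - start + 1 ))
}
```

With an indexed BAM file (here the hypothetical `input.bam`), the normalized region could then be used to extract just the MOV10 reads, for example `samtools view -b input.bam "$(normalize_region 'chr1:113,214,934-113,243,900')" > mov10.bam`.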
References
TRAINING: hbctraining.github.io/Intro-to-rnaseq-hpc-O2
AWS: https://aws.amazon.com/ec2/getting-started
ENCODE: https://www.encodeproject.org
IMAGES: https://www.diagenode.com/en/categories/Library-preparation-for-RNA-seq
https://rnaseq.uoregon.edu