Doug Raiford Lesson 1

Doug Raiford Lesson 1. Biologists and Computer Scientists Note the word “Scientists” 12/12/2015Introduction2


Biologists and Computer Scientists
Note the word “Scientists”

04/21/23 Introduction 2


Wikipedia
  Computational biology encompasses bioinformatics
  Bioinformatics applies algorithms and statistical techniques to the interpretation, classification and understanding of biological datasets

NCBI
  Bioinformatics: Research, development or application of computational tools and approaches for expanding the use of biological, medical, behavioral or health data, including those to acquire, store, organize, archive, analyze or visualize such data.
  Computational Biology: The development and application of data-analytical and theoretical methods, mathematical modeling and computational simulation techniques to the study of biological, behavioral, and social systems

For the purposes of this course we are treating the terms as synonymous


It’s All About the Data
Virtually every biological experiment requires a processor and software


Human genetic material comprises about 3 billion base pairs

The sheer volume of data requires computational and storage techniques to analyze it
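To make the volume concrete, here is a rough back-of-the-envelope sketch of the storage one sequence of that length would need. The two encodings (2 bits per base and one ASCII character per base) are assumptions chosen for illustration; they are not from the slides, and real file formats add headers, quality scores, and other overhead.

```python
# Back-of-the-envelope storage estimate for a single genome sequence.
GENOME_BASE_PAIRS = 3_000_000_000  # figure quoted on the slide


def storage_bytes(base_pairs: int, bits_per_base: int) -> int:
    """Bytes needed to store a sequence at a given encoding width, rounded up."""
    return (base_pairs * bits_per_base + 7) // 8


# Assumption: the four bases A, C, G, T can be packed into 2 bits each.
packed = storage_bytes(GENOME_BASE_PAIRS, 2)
# Assumption: a plain text file uses one 8-bit character per base.
ascii_text = storage_bytes(GENOME_BASE_PAIRS, 8)

print(f"2-bit packed: {packed / 1e6:.0f} MB")
print(f"ASCII text:   {ascii_text / 1e9:.0f} GB")
```

Even the packed form is hundreds of megabytes for one sequence, before any experimental data is attached to it.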


Can now identify which genes are affected by a disease or treatment

Thousands of genes per experiment

Multiple experiments per time-point

Multiple time-points

Data growing exponentially
Thousands of complete genomes
Each genome results in thousands of experiments


Vast amounts of data
More data coming in daily
Sophisticated computational techniques required
  Clustering
  Searches
  Optimizations
  Data mining
  Pattern recognition
  Classification


A little about me
  Work
  School


Moodle is the primary page
  Weekly schedule
    ▪ When homeworks are due
    ▪ When projects are due
    ▪ Links to quizzes, projects, and homeworks
Instructor website
  Syllabus
  Slides


Bioinformatics: Sequence and Genome Analysis

Beginning Perl for Bioinformatics


3:30 to 5:00 Tuesday and Thursday
Or by appointment
Social Science 412
Phone 406-243-5605
Email [email protected]

A little about myself


Try to get assignments in on time
  One letter grade off for each day late


Component            Undergrad   Graduate
Homework             10%         8%
Quizzes              25%         21%
Exams (3 of them)    30%         25%
Projects             35%         29%
Grad Project         NA          17%

90 - 100   A
87 - 89    B+
80 - 86    B
77 - 79    C+
70 - 76    C
67 - 69    D+
60 - 66    D
00 - 59    F
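As a worked illustration of how the component weights and the grade scale combine, here is a minimal sketch. The example scores are hypothetical, and two things are assumptions rather than course policy: the weighted total is compared to the cutoffs without rounding, and the late penalty is not modeled.

```python
# Component weights in whole percent, copied from the table above.
UNDERGRAD_WEIGHTS = {"homework": 10, "quizzes": 25, "exams": 30, "projects": 35}
GRAD_WEIGHTS = {"homework": 8, "quizzes": 21, "exams": 25,
                "projects": 29, "grad_project": 17}

# Lower bound of each band in the grade scale, highest first.
GRADE_CUTOFFS = [(90, "A"), (87, "B+"), (80, "B"), (77, "C+"),
                 (70, "C"), (67, "D+"), (60, "D"), (0, "F")]


def weighted_score(scores: dict, weights: dict) -> float:
    """Combine per-component scores (0-100) using percent weights."""
    return sum(scores[name] * weight for name, weight in weights.items()) / 100


def letter_grade(score: float) -> str:
    """Map a 0-100 score to a letter (assumption: no rounding is applied)."""
    for cutoff, letter in GRADE_CUTOFFS:
        if score >= cutoff:
            return letter
    return "F"


# Hypothetical undergraduate scores, for illustration only.
example = {"homework": 95, "quizzes": 80, "exams": 85, "projects": 90}
total = weighted_score(example, UNDERGRAD_WEIGHTS)
print(total, letter_grade(total))
```

Both weight columns sum to 100, so the same function works for undergraduate and graduate students once the graduate project score is included.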


Your work in this class needs to be your own

Overly similar work (to that of your classmates or to content from the web) will be considered to be the result of copying
  First offense will result in a zero on the assignment
  Second will be referred to the Dean of Academic Affairs

Student Conduct Code
  http://life.umt.edu/vpsa/student_conduct.php


Let me know of any special needs during this first week
  Letter from Disability Services for Students (DSS)
  Religious observances
  Officially sanctioned, scheduled University extracurricular activity
  Opportunity to make up class assignments or other graded assignments


Improve the computer scientist’s understanding of biological systems and problems

Improve the biologist’s understanding of the science of computing and provide the beginnings of a CS skill-set


Four Distinct Audiences

Computer scientists
  All about the algorithms, implementations, programming languages, design, etc.

Biologists
  Mostly just want an introduction to programming

Undergrads
  High-level overview

Graduate students
  Specific tools and skills that will aid them in research

Computer Scientists      Biologists etc.
Undergrad    Grad        Undergrad    Grad


Undergraduates
  Some algorithms (even implement some)
  New languages: Perl and R
  Introduce programming concepts
  Lots of practice programming (8 projects)
  Lots of guidance from me

Graduate students
  Practice writing a grant (a draft and a final version)
  Practice writing a paper (a draft and a final version)
  Practice using several actual Bio Tools

All
  Team projects


Computer science-wise
  Not really anything new
  More of an application of existing techniques
    ▪ Dynamic programming techniques
    ▪ Hidden Markov Models
    ▪ Exploratory data analysis
    ▪ Clustering
    ▪ Multivariate analysis
    ▪ Principal components analysis
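As a taste of the first technique in that list, here is a minimal dynamic-programming sketch. The slide does not say which problem the course uses, so the choice of longest common subsequence on two short DNA strings is an assumption made for illustration, as are the function name and example sequences.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b.

    table[i][j] holds the answer for the prefixes a[:i] and b[:j], so each
    cell is built from three smaller subproblems that are already solved.
    """
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, ch_a in enumerate(a, start=1):
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Matching characters extend the best answer for both prefixes.
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # Otherwise drop one character from either string.
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[len(a)][len(b)]


# Hypothetical sequences that differ at one position.
print(lcs_length("ACCGT", "ACGGT"))
```

The table has (len(a) + 1) × (len(b) + 1) cells and each is filled once, which is the kind of existing CS technique the slide refers to being applied to biological sequences.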


Research
  Ph.D. generating publications
Employee in a company
  Drug company
  Genomics lab


Bioinformatician www.simplyhired.com


Techniques that are successful in bioinformatics are the same ones that are successful in other data-intensive fields


Hunger, need for clean water
Global warming
Disease


Genetically engineered crops
  Disease resistant
  Greater yields
Water treatment
  Genetically engineered microbes
    ▪ Sewage treatment: purification
    ▪ Clean oil spills


Plants consume CO2 and release O2
  But the carbon is released back into the atmosphere over a period of time
Genetically engineered plants could convert the carbon into a stable form


Genetically enhanced microbes convert back to fuel
  Methanococcus jannaschii
  Takes CO2 and converts it to methane


Test for increased risk of certain cancers
Personalize medicine
  Leukemia
    ▪ Genetic profile resistant to certain chemotherapy
  Increased risk of drug reactions


Many drugs bind to protein active sites

Computational techniques for predicting drug performance


Actually alter our genetic code to treat a genetic disorder

Or simply add a disembodied gene to our complement


What does it have to do with informatics?

Where do computer scientists fit in this picture?

Role of computers and computer scientists


Why biologists would attend


CS types good at the data analysis

Must understand what the data means

Don’t know what to look for—what questions to ask

Don’t speak the lingo

Haploid, Hypertonic, Hypotonic, Erythematous, Cilia, Cell membrane, Nucleus, Lytic cycle, Gene, Biotic factors, Nulliparity, Hyperosmotic, Natural selection, Fluid mosaic model, Solute, Homologous chromosome, Ribosome, Mitochondria, Diffusion, Leucocytes, Photosynthesis, Genetic variation, Organism, Plasma membrane, Cytoplasm, Wagners disease, Meiosis, Habitat, Diploid, Cell, Youpon, Concentration gradient, Ecosystem, Homeostasis, Mitosis, Osmosis, Allele, Enzyme, Autotrophic, Egestion, Mitochondrion, Gamete, Organisms, Nucleotide, Amino-acyl, Gene expression, Point mutation, Duplication event


Biologists understand the data

Don’t know how to formulate the problem in CS terms

Don’t know what magic the CS types can bring to the table

Don’t speak the lingo

Acyclic graph, Heap sort, Huffman coding, Adjacency-matrix, Admissible vertex, Abstract data type, Algorithm, All pairs shortest path, Euclidean distance, Hash, Tree, Linked list, Heap, Complexity analysis, Recursion, Dynamic programming, Graph, Hamiltonian path, Heuristic, Hidden Markov Model, Principal components analysis, Isomorphic, Simplex algorithm, Mahalanobis distance, Discrete event simulation, NP-complete, Big O, Optimization problem, Polymorphism, Polynomial time, Clustering, Classifying, Stack, Queue, Stochastic modeling, Tail recursion, Binary tree, Self organizing map, Shortest common string, Minimum spanning tree, Singular matrix, Trie, Vertex cover


Won’t be a full-fledged bioinformatician
Will be able to contribute given
  Close guidance
  Practice and continued training and guidance


Determine problem to be solved given data
Determine which tool to utilize
Manually format data for input to tool
  Might involve data retrieval if utilizing repository data
Run tool
Analyze results

Biologists perform all steps


Determine problem to be solved given data
Develop algorithmic approach
Implement algorithm (write code)
Format data for input to algorithm
  Might involve data retrieval if utilizing repository data
Run code
Analyze results

Computer Scientist

Biologist

Biologist
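The format, run, and analyze steps of that workflow can be sketched in a few lines. This is only an illustration under stated assumptions: the slides do not name a file format or an analysis, so FASTA-style input and GC content are stand-ins chosen here, and the function names and example records are hypothetical.

```python
def parse_fasta(text: str) -> dict:
    """Format step: turn FASTA-style text into {record name: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: keep the first word as the record name.
            header = line[1:].split()
            name = header[0] if header else "unnamed"
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {rec: "".join(parts) for rec, parts in records.items()}


def gc_content(seq: str) -> float:
    """Run step: fraction of bases that are G or C (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


# Hypothetical input; real data might be retrieved from a repository first.
EXAMPLE = """>seq1 hypothetical record
ACGTAC
GT
>seq2
GGGCCC
"""

sequences = parse_fasta(EXAMPLE)
results = {name: gc_content(seq) for name, seq in sequences.items()}

# Analyze step: here just a report; interpreting the numbers is the biology.
for name, value in results.items():
    print(f"{name}: GC content {value:.2f}")
```

The code covers only the middle of the workflow; deciding which question to ask of the data and what the output means are the steps where domain knowledge is needed.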


CS types
  Provide beginnings of a biology background
  Introduce some existing tools, sources of data, and analysis techniques

Biologists
  Introduce some existing tools, sources of data, and analysis techniques
  Provide some programming essentials
