Lecture 1 - Introduction
Statistics 102
Kenneth K. Lopiano
August 26, 2013
adapted from Professor Colin Rundel’s Lecture Slides
Statistics 102 (Kenneth K. Lopiano) Lecture 1 - Introduction August 26, 2013 1 / 27
Syllabus & Policies
1 Syllabus & Policies
    Syllabus Redux
2 Why (Bio)Statistics
    Scientific Method
    Historical Connections
    Selected Biology Applications
3 Other Applications
    Cell Phone Communities
    Basketball
    Politics, Journalism and Sports
Syllabus & Policies Syllabus Redux
General Info
Professor: Dr. Kenneth Lopiano - [email protected] Chemistry 223E
Teaching Assistants: Shirley Liao - [email protected]; Taylor Popisil - [email protected]
Lecture: Monday and Wednesday, 11:45 am - 1:00 pm, Old Chem 116
Lab Sections (Old Chem 101):
01L - Tuesdays 1:25 - 2:40 pm
02L - Tuesdays 3:05 - 4:20 pm
03L - Tuesdays 4:40 - 5:55 pm
Course goals & objectives
1 Recognize the importance of data collection, identify limitations in data collection methods, and determine how they affect the scope of inference.
2 Use statistical software to summarize data numerically and visually, and to perform data analysis.
3 Have a conceptual understanding of the unified nature of statistical inference.
4 Apply estimation and testing methods to analyze single variables or the relationship between two variables in order to understand natural phenomena and make data-based decisions.
5 Model numerical response variables using a single explanatory variable or multiple explanatory variables in order to investigate relationships between variables.
6 Interpret results correctly, effectively, and in context without relying on statistical jargon.
7 Critique data-based claims and evaluate data-based decisions.
8 Complete an independent research project employing what you learn in this class.
Major topics
Introduction to data: Observational studies and non-causal inference, principles of experimental design and causal inference, exploratory data analysis: description, summary and visualization.
Probability and distributions: The basics of probability and chance processes, Bayesian perspective in statistical inference, the normal distribution.
Framework for inference: Central Limit Theorem and sampling distributions
Statistical inference for numerical variables
Statistical inference for categorical variables
Simple linear regression: Bivariate correlation and causality, introduction to modeling
Additional Topics: Multiple regression, logistic regression, survival analysis
Course materials
Textbook (Required): Statistics for the Life Sciences
    Samuels, Witmer, Schaffner
    Pearson, 4th Edition, 2012
    ISBN: 978-0-321-65280-5
Calculator (Optional): Four function calculator [one that can do square roots is enough but there is no limitation on the type of calculator you can use]
Webpage
http://www.stat.duke.edu/~kkl13/courses/sta102F13/
Announcements, slides, assignments, etc. will be posted on the website.
Support
Office hours: Mondays 2:00 - 4:00 pm or by appointment.
SECC: TBA. The statistics education center has upper level statistics students available to help you. For more information and a schedule see http://stat.duke.edu/courses/resources-students.
You are highly encouraged to stop by with any questions or comments about the class, or just to say hi and introduce yourself.
Most problem sets due on Wednesday - I strongly recommend at least attempting all problems to make the most of office hours.
Grading
Homework - 15%
Labs - 10%
Quizzes - 5%
Research Project - 15%
Midterm 1 - 15%
Midterm 2 - 15%
Final - 25%
Grades will be curved at the end of the course after overall averages have been calculated.
Average of 90-100 guaranteed A-.
Average of 80-90 guaranteed B-.
Average of 70-80 guaranteed C-.
The more evidence there is that the class has mastered the material, the more generous the curve will be.
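As a rough illustration of how the weights above combine into an overall average, here is a minimal Python sketch. It is not the course's actual grading procedure: the function names, the 0-100 scale for each category, and the choice to treat the lower end of each range as inclusive (so exactly 90 maps to A-) are all assumptions made for the example, and the scores used are hypothetical. The curve itself is not modeled.

```python
# Category weights taken from the grading slide (percent of the final grade).
WEIGHTS = {
    "homework": 15,
    "labs": 10,
    "quizzes": 5,
    "research_project": 15,
    "midterm_1": 15,
    "midterm_2": 15,
    "final": 25,
}


def overall_average(scores):
    """Return the weighted average of per-category scores (assumed 0-100 each)."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the categories in WEIGHTS")
    total_weight = sum(WEIGHTS.values())  # the weights add up to 100
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / total_weight


def guaranteed_minimum(average):
    """Map an overall average to the guaranteed minimum grade, if any.

    Assumption: the lower end of each range is inclusive.
    """
    if average >= 90:
        return "A-"
    if average >= 80:
        return "B-"
    if average >= 70:
        return "C-"
    return None  # the slide states no guarantee below 70


# Hypothetical scores, for illustration only.
example = {
    "homework": 95,
    "labs": 100,
    "quizzes": 80,
    "research_project": 90,
    "midterm_1": 85,
    "midterm_2": 80,
    "final": 88,
}
average = overall_average(example)
print(average, guaranteed_minimum(average))  # 88.5 B-
```

Because the final carries the largest single weight (25%), a change of 4 points on the final moves the overall average by 1 point.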
Homework
Objective: Help you develop a more in-depth understanding of the material and help you prepare for exams and the project.
Questions from the textbook and outside sources.
Due at the beginning of class on the due date.
Show all your work to receive credit.
You are encouraged to work with others, but turn in your own work.
∼11 problem sets - lowest score will be dropped.
Excused absences do not excuse homework.
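The "lowest score will be dropped" rule can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, every problem set is assumed to be scored on the same scale and weighted equally, and the scores shown are made up.

```python
def average_with_lowest_dropped(scores):
    """Average a list of scores after removing one copy of the lowest score."""
    if len(scores) < 2:
        raise ValueError("need at least two scores to drop one")
    return (sum(scores) - min(scores)) / (len(scores) - 1)


# Hypothetical problem set scores: the 60 is dropped before averaging.
print(average_with_lowest_dropped([90, 80, 100, 60]))  # 90.0
```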
Labs
Objective: Give you hands-on experience with data analysis using statistical software and provide you with tools for the project.
http://beta.rstudio.org
A Gmail account is needed to register your RStudio account. Add your Gmail address using the quiz on Sakai.
∼11 labs - lowest lab score will be dropped.
Write-ups due the following week - most can be completed in class.
You must attend the lab you are enrolled in. If you do not attend in a given week, you are eligible for at most 50% credit on that lab.
Online quiz in the first 10-15 mins of lab (may only be taken in Old Chem 101).
Research Project
Objective: Give you independent applied research experience using real data.
Open-ended, semester-long research project.
Choose a research question, find data, analyze it, write up your results.
More details in the near future.
Exams
Midterm 1: Wednesday, October 9th
Midterm 2: Wednesday, November 20th
Final: Wednesday, December 11th 2-5pm (Cumulative)
Exam dates cannot be changed. No make-up exams will be given. If you cannot take the exams on these dates you should drop this class.
You may bring a calculator to the exams (no cell phones, iPods, etc.) and you may also bring one sheet of notes (“cheat sheet”). This sheet must be no larger than 8 1/2” × 11” and must be prepared by you (no photocopies). You may use both sides of the sheet.
I will regularly send announcements via email, so make sure to check your email daily.
While email is the quickest way to reach me outside of class, note that it is much more efficient to answer most questions in person.
Students with disabilities
Students with disabilities who believe they may need accommodations in this class are encouraged to contact the Student Disability Access Office at (919) 668-1267 as soon as possible to better ensure that such accommodations can be made.
Late Work Policy
For homework and lab write ups:
late but during class: -10 pts
after class on due date: -20 pts
next day: -30 pts
later than next day: -100 pts
For research project: -10 pts / day late
There will be no make-up quizzes.
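The deductions above can be written out as a small lookup, sketched here in Python. The status labels and function names are hypothetical, and two things are assumed rather than stated on the slide: that assignments are scored out of 100 points, and that a score cannot fall below zero.

```python
# Point deductions from the late work slide, keyed by hypothetical status labels.
HOMEWORK_LAB_PENALTY = {
    "on_time": 0,
    "late_during_class": 10,
    "after_class_on_due_date": 20,
    "next_day": 30,
    "later_than_next_day": 100,
}


def late_homework_score(raw_score, status):
    """Apply the homework / lab write-up deduction for the given status."""
    deducted = raw_score - HOMEWORK_LAB_PENALTY[status]
    return max(deducted, 0)  # assumption: scores do not go below zero


def late_project_score(raw_score, days_late):
    """Apply the research project deduction of 10 points per day late."""
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    return max(raw_score - 10 * days_late, 0)  # assumption: floor at zero


print(late_homework_score(92, "next_day"))  # 62
print(late_project_score(85, 2))            # 65
```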
Other Policies
The final exam must be taken at the stated time and you cannot pass this class if you do not take the final exam.
You must score at least 30 / 100 on the research project to pass this class.
Regrade requests must be made within one week of when the assignment is returned, and must be submitted in writing.
Academic Dishonesty
Any form of academic dishonesty will result in an immediate 0 on the given assignment and will be reported to the Office of Student Conduct. Additional penalties may also be assessed if deemed appropriate. If you have any questions about whether something is or is not allowed, ask me beforehand.
Some examples:
Use of disallowed materials (including any form of communication with classmates or looking at a classmate’s work) during exams.
Plagiarism of any kind.
Use of outside answer keys or solution manuals for the homework.
Tips for success
1 Complete the reading before each lecture, and review again at the end of each chapter.
2 Be an active participant during lectures and labs.
3 Ask questions - during class or office hours, or by email. Ask me, the TAs, and your classmates.
4 Do the problem sets - start early and make sure you attempt and understand all questions.
5 Start your project early and allow adequate time to complete the necessary components.
6 Give yourself plenty of time to prepare a good cheat sheet for exams. This requires going through the material and taking the time to review the concepts that you’re not comfortable with.
7 Do not procrastinate - don’t let a week go by with unanswered questions as it will just make the following week’s material even more difficult to follow.
Why (Bio)Statistics
Why (Bio)Statistics Scientific Method
Statistics and the Scientific Method
From Universe Today - http://www.universetoday.com/74036/what-are-the-steps-of-the-scientific-method/
Why (Bio)Statistics Historical Connections
R.A. Fisher
“I occasionally meet geneticists who ask me whether it is true that the great geneticist R.A. Fisher was also an important statistician.” - L.J. Savage (Annals of Statistics, 1976)
Source: http://www.swlearning.com/quant/kohler/stat/biographical sketches/Fisher 3.jpeg
R.A. Fisher cont.
Biology:
Heterozygote advantage
Population genetics (Modern evolutionary synthesis)
Fisherian runaway selection
...
Statistics:
Analysis of Variance
Null hypothesis
Maximum Likelihood
F distribution
Fisher’s Exact test
Fisher Information
Randomization testing
...
Why (Bio)Statistics Selected Biology Applications
Next Gen Sequencing
Data workflow
Images are analysed closer to the instrument (on the instrument control PC), then base calls are transferred to a secondary analysis server.
Real-Time Analysis (GA and PEM): images, intensities, base calls
Secondary Analysis (server): alignments, visualisation, etc.
Source: http://www.ebi.ac.uk/industry/Documents/workshop-materials/newsequence291009/Basecalling-Klaus Maisinger.pdf
Crab Abundance Estimation
Grafted Tomato Plants
Knee Osteoarthritis
Other Applications
Other Applications Cell Phone Communities
Communities in Dominican Republic
Caughlin et al. - Place-Based Attributes Predict Community Membership in a Mobile Phone Communication Network
http://dx.plos.org/10.1371/journal.pone.0056057
Other Applications Basketball
Basketball Analytics
Other Applications Politics, Journalism and Sports
The most famous statistician in the world ...
Nate Silver - Now at ESPN