Why Data Science is Something You Should Care About Presented @ South Dakota Code Camp 2012 Ryan Swanstrom @swgoof

Learn Data Science

DESCRIPTION

Big Data and Data Science are hot buzzwords right now. The buzzwords might go away, but the ideas will not. This talk explains the buzzwords and covers some of the best resources for acquiring data science skills.

Why Data Science is Something You Should Care About

Presented @ South Dakota Code Camp 2012

Ryan Swanstrom @swgoof

About Ryan Swanstrom

Find me on the web

http://twitter.com/swgoof

http://linkedin.com/in/ryanswanstrom

http://datascience101.wordpress.com/

Data Science

"[ability to] obtain, scrub, explore, model and interpret data, blending hacking, statistics, and machine learning."

definition by Hilary Mason, Chief Scientist @ Bit.ly
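
To make the five verbs in that definition concrete, here is a minimal sketch of the obtain, scrub, explore, model, interpret loop using only the Python standard library. Everything in it is an assumption made for illustration and not part of the talk: the tiny `RAW` dataset is made up, the function names are hypothetical, and a least-squares line stands in for "model".

```python
import csv
import io
import statistics

# Made-up toy data (an assumption for illustration); one row has a missing score.
RAW = """hours_studied,quiz_score
1,52
2,58
3,
4,71
5,77
"""

def obtain(text):
    """Obtain: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def scrub(rows):
    """Scrub: drop rows with missing values and convert strings to floats."""
    return [
        (float(row["hours_studied"]), float(row["quiz_score"]))
        for row in rows
        if row["hours_studied"] and row["quiz_score"]
    ]

def explore(points):
    """Explore: compute a few summary statistics."""
    xs, ys = zip(*points)
    return {"n": len(points), "mean_x": statistics.mean(xs), "mean_y": statistics.mean(ys)}

def model(points):
    """Model: fit a least-squares line y = slope * x + intercept."""
    xs, ys = zip(*points)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in points) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x

def interpret(slope):
    """Interpret: turn the fitted number into a plain-language statement."""
    return f"Each extra hour is associated with about {slope:.1f} more points."

if __name__ == "__main__":
    points = scrub(obtain(RAW))
    print(explore(points))
    slope, intercept = model(points)
    print(interpret(slope))
```

Real projects would likely swap in larger data sources and proper modeling libraries; the point here is only the shape of the workflow.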

Who is a data scientist?

http://onforb.es/WNLnRu

Big Data

Any dataset where the size or speed of incoming data causes difficulties in processing

● Volume
● Velocity
● Variety

Hadoop

"[...] a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models."

Apache Hadoop Website

● HDFS - Hadoop Distributed File System
● MapReduce
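
The "simple programming model" behind MapReduce can be sketched in a few lines. The example below is a single-process Python illustration of the map, shuffle, and reduce steps for a word count; it does not use Hadoop or its API, and the function names are hypothetical. In a real cluster, the map and reduce calls would run in parallel on many machines, with the input stored in HDFS.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)

def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())

if __name__ == "__main__":
    print(word_count(["big data is big", "data science is fun"]))
```

Because each map call looks at one document and each reduce call looks at one key, the work can be split across machines without the pieces needing to talk to each other.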

Lots of Data

18 Months: the amount of time for digital data to double
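
As a rough back-of-the-envelope check on what an 18-month doubling time implies, the sketch below computes the growth factor after a given number of months. It assumes the doubling rate stays constant, which is a simplification, and the function name is hypothetical.

```python
def growth_factor(months, doubling_period_months=18):
    """Return how many times larger the data is after `months`,
    assuming it doubles every `doubling_period_months` (constant rate)."""
    return 2 ** (months / doubling_period_months)

if __name__ == "__main__":
    for years in (3, 5, 10):
        print(f"After {years} years: about {growth_factor(years * 12):.0f}x the data")
```

Under that assumption, data grows 4x in three years and roughly 100x in ten.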

Data Products

Why Do You Care?

McKinsey Global Big Data Report

● 140k - 190k Unfilled Jobs by 2018

● 1.5M Managers & Analysts

Now That You Care, What Skills?

1. Machine Learning
2. Statistics
3. Storytelling (Communication)
4. Big Data
5. Algorithms
6. Curiosity

College and University

Pros

● Credentials
● Experts
● Familiar
● Widely Accepted
● Structured

Cons

● Expensive
● Not Individualized
● School
● Lengthy
● Inflexible
● Not Real World

Corporate Training

Pros

● Short Timeframe
● Experts
● Certificates
● Business-Savvy
● Real World
● Structured

Cons

● Expensive
● Not Individualized
● Product Focused
● Sales Pitch

MOOCs (Massive Open Online Courses)

Pros

● Free
● Experts
● Flexible

Cons

● No Credentials
● Single Course
● No Programs (Yet)

Blogs/Wikis/Other

Pros

● Free
● Very Specific
● Short
● Lots of them

Cons

● Quality?
● No Credentials
● No Structure
● Too many!

Blogs/Wikis/Other

The Problem

● What content is good?

● In what order should I cover the content?

● Where do I find new content?

● Who can help me understand?

Data Science 201 - coming soon

http://www.datascience201.com

Helping you find the best data science learning content!

Thank You