11
Programming for Data Science CSC392/CSC310

Programming for Data Science - University of Rhode Island · for AI, data structures, algorithms, visualization, big data support, and general programming statistics/mathematics for

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Programming for Data Science - University of Rhode Island · for AI, data structures, algorithms, visualization, big data support, and general programming statistics/mathematics for

Programming for Data Science

CSC392/CSC310

Page 2: Programming for Data Science - University of Rhode Island · for AI, data structures, algorithms, visualization, big data support, and general programming statistics/mathematics for

Course Details● Dr Lutz Hamel● Best way to get in touch - email:

[email protected]● Everything is online

○ Course Website○ Lecture Notes○ Gradebook (Sakai)○ Example Code

● Python!● Anaconda3● Jupyter Notebooks

Page 3: Programming for Data Science - University of Rhode Island · for AI, data structures, algorithms, visualization, big data support, and general programming statistics/mathematics for

What is Data Science?

It relies on

● computer science ○ for AI, data structures, algorithms, visualization,

big data support, and general programming● statistics/mathematics

○ for data models and inference● domain expertise

○ for asking questions and interpreting results

☞ Data science is the discipline of the extraction of knowledge from data.

Page 4: Programming for Data Science - University of Rhode Island · for AI, data structures, algorithms, visualization, big data support, and general programming statistics/mathematics for

What is Data Science?

How do we do that?

☞We build MODELS of data!

☞ Data science is the discipline of the extraction of knowledge from data.

Page 5: Programming for Data Science - University of Rhode Island · for AI, data structures, algorithms, visualization, big data support, and general programming statistics/mathematics for

Models: Play Tennis

Lots of data - very little information!

Build a model - a decision tree!

Page 6: Programming for Data Science - University of Rhode Island · for AI, data structures, algorithms, visualization, big data support, and general programming statistics/mathematics for

Models: Play Tennis

☞This model summarizes the whole table correctly!

ID3 Decision Tree

Page 7: Programming for Data Science - University of Rhode Island · for AI, data structures, algorithms, visualization, big data support, and general programming statistics/mathematics for

What is Data Science?

Where does the data come from?

☞ The data pipeline!

☞ Data science is the discipline of the extraction of knowledge from data.

Page 8: Programming for Data Science - University of Rhode Island · for AI, data structures, algorithms, visualization, big data support, and general programming statistics/mathematics for

The Data Pipeline

Page 9: Programming for Data Science - University of Rhode Island · for AI, data structures, algorithms, visualization, big data support, and general programming statistics/mathematics for

What is Data Science?

How do we preprocess our data for model building?

☞ Statistics!

● Descriptive Statistics● Missing Value Processing● Normalization

☞ Data science is the discipline of the extraction of knowledge from data.

Page 10: Programming for Data Science - University of Rhode Island · for AI, data structures, algorithms, visualization, big data support, and general programming statistics/mathematics for
Page 11: Programming for Data Science - University of Rhode Island · for AI, data structures, algorithms, visualization, big data support, and general programming statistics/mathematics for

What is Data Science?

How do we ask the right questions?

☞ Domain Expertise!

Knowledge cannot be generated in a vacuum. You need the context of a domain in order to generate new insights. E.g. bioinformatics, climate modeling, sales forecasting, etc.

☞ Data science is the discipline of the extraction of knowledge from data.