Twitter Trends

John DeNero & Aditi Muralidharan
University of California, Berkeley

John DeNero & Aditi Muralidharan University of California ... · A Hook Into Data Science • Second project (of four) in our CS 1 course, based on The Structure and Interpretation


A Hook Into Data Science

• Second project (of four) in our CS 1 course, based on The Structure and Interpretation of Computer Programs.

• Uses Python built-in data types for sequences and maps.

• Students should: process lots of real data, create a useful and attractive visualization, & understand data abstraction.

What do people tweet? Draw their feelings on a map to discover trends.

• Break each tweet into words

• Find all tweets that contain a query word

• Group those tweets by US state

• Compute the average sentiment of those tweets
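The four steps above can be sketched with Python's built-in lists and dictionaries. This is a minimal illustration, not the project's actual code: the dictionary fields `text`, `state`, and `sentiment`, the function names, and the sample tweets are all hypothetical, and each tweet is assumed to arrive with its state and sentiment already known.

```python
from collections import defaultdict
import string

def extract_words(text):
    """Break a tweet's text into lowercase words, treating non-letters as separators."""
    cleaned = ''.join(c if c in string.ascii_letters else ' ' for c in text)
    return cleaned.lower().split()

def tweets_containing(tweets, query):
    """Return the tweets whose words include the query word."""
    return [t for t in tweets if query in extract_words(t['text'])]

def group_by_state(tweets):
    """Return a dictionary mapping each state to the list of its tweets."""
    groups = defaultdict(list)
    for t in tweets:
        groups[t['state']].append(t)
    return dict(groups)

def average_sentiments(groups):
    """Return a dictionary mapping each state to the mean sentiment of its tweets."""
    return {state: sum(t['sentiment'] for t in ts) / len(ts)
            for state, ts in groups.items()}

# Hypothetical sample data; 'state' and 'sentiment' are assumed to be precomputed.
tweets = [
    {'text': 'I love the Texas summer', 'state': 'TX', 'sentiment': 0.5},
    {'text': 'Texas traffic is awful', 'state': 'TX', 'sentiment': -0.5},
    {'text': 'Visiting Texas next week!', 'state': 'CA', 'sentiment': 0.25},
    {'text': 'Foggy morning in Berkeley', 'state': 'CA', 'sentiment': 0.0},
]

matching = tweets_containing(tweets, 'texas')
averages = average_sentiments(group_by_state(matching))
print(averages)  # {'TX': 0.0, 'CA': 0.25}
```

Each stage takes and returns a plain sequence or map, so the stages can be tested separately before being chained.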

What Does America Think of Texas?

"I love the Texas summer but a high of 111 is crazy"

Sentiment scores shown for the tweet: +0.625 and -0.5

[Figure: sentiment scale from -1 to 1]
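As a worked example of turning the two scores into a single number for the tweet, the sketch below averages the scores of the words that have one. Two things here are assumptions rather than facts from the slides: that +0.625 belongs to "love" and -0.5 to "crazy", and that unscored words are simply left out of the average. The names `WORD_SENTIMENTS` and `tweet_sentiment` are hypothetical.

```python
# Assumed pairing of the slide's scores with words; the slides do not state it.
WORD_SENTIMENTS = {'love': 0.625, 'crazy': -0.5}

def tweet_sentiment(text, word_sentiments):
    """Return the mean score of the words that have a score, or None if none do."""
    scores = [word_sentiments[w] for w in text.lower().split()
              if w in word_sentiments]
    if not scores:
        return None  # no scored words, so no sentiment for this tweet
    return sum(scores) / len(scores)

score = tweet_sentiment(
    "I love the Texas summer but a high of 111 is crazy", WORD_SENTIMENTS)
print(score)  # 0.0625, i.e. (0.625 + -0.5) / 2
```

Returning `None` rather than 0 keeps tweets with no scored words distinguishable from tweets whose scores cancel out.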

Finding the Centroid of a State

• Each state is represented by a sequence of polygons.

• Each polygon is represented by a sequence of positions.

• Students need simple unit tests to solve this problem.

• (!) Some students encounter floating point approximations.
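A sketch of how the computation and its unit tests might look is given below. It uses the standard signed-area (shoelace) formula for a polygon's centroid and, for a whole state, an area-weighted average of its polygons' centroids. The weighting, the function names, and the test shapes are assumptions made for illustration; the slides do not give the formula. Positions are built with the constructor and selectors shown later in the talk.

```python
from math import isclose

def make_position(lat, lon):
    return (lat, lon)

def latitude(position):
    return position[0]

def longitude(position):
    return position[1]

def polygon_centroid(polygon):
    """Return (centroid, area) for a polygon given as a sequence of positions."""
    twice_area = 0.0
    sum_x = sum_y = 0.0
    n = len(polygon)
    for i in range(n):
        p, q = polygon[i], polygon[(i + 1) % n]  # wrap around to close the polygon
        x0, y0 = latitude(p), longitude(p)
        x1, y1 = latitude(q), longitude(q)
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        sum_x += (x0 + x1) * cross
        sum_y += (y0 + y1) * cross
    if twice_area == 0:
        # Degenerate polygon: fall back to its first vertex (a choice made for this sketch).
        return polygon[0], 0.0
    centroid = make_position(sum_x / (3 * twice_area), sum_y / (3 * twice_area))
    return centroid, abs(twice_area) / 2

def state_centroid(polygons):
    """Area-weighted average of polygon centroids; assumes the total area is nonzero."""
    total_area = 0.0
    weighted_x = weighted_y = 0.0
    for polygon in polygons:
        centroid, area = polygon_centroid(polygon)
        total_area += area
        weighted_x += latitude(centroid) * area
        weighted_y += longitude(centroid) * area
    return make_position(weighted_x / total_area, weighted_y / total_area)

# Simple unit tests on shapes whose centroids are known by inspection.
unit_square = [make_position(x, y) for x, y in [(0, 0), (1, 0), (1, 1), (0, 1)]]
big_square = [make_position(x, y) for x, y in [(2, 0), (4, 0), (4, 2), (2, 2)]]

center, area = polygon_centroid(unit_square)
assert isclose(latitude(center), 0.5) and isclose(longitude(center), 0.5)
assert isclose(area, 1.0)

state = state_centroid([unit_square, big_square])
assert isclose(latitude(state), 2.5) and isclose(longitude(state), 0.9)
```

The tests compare with `math.isclose` instead of `==` because floating point results are approximations: in Python, `0.1 + 0.2 == 0.3` is `False`, while `isclose(0.1 + 0.2, 0.3)` is `True`.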

Checking for Data Abstraction

An abstract data type is defined by its behavior, and its use should be independent of its representation.

def make_position(lat, lon):
    """Return a position..."""
    return (lat, lon)

def latitude(position):
    """Return the latitude..."""
    return position[0]

def longitude(position):
    """Return the longitude..."""
    return position[1]

Alternative representation, changing only the three return expressions:

    make_position:  return lambda x: lat if x else lon

    latitude:       return position(True)

    longitude:      return position(False)
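One plausible reading of this slide is that the check consists of swapping the tuple representation for the function representation and confirming that client code still works. The sketch below illustrates that idea under this assumption; the wrapper functions `tuple_representation` and `function_representation` and the client `midpoint_coordinates` are hypothetical names introduced here.

```python
def tuple_representation():
    """Positions as (lat, lon) tuples."""
    def make_position(lat, lon):
        return (lat, lon)
    def latitude(position):
        return position[0]
    def longitude(position):
        return position[1]
    return make_position, latitude, longitude

def function_representation():
    """Positions as functions that return lat for True and lon for False."""
    def make_position(lat, lon):
        return lambda x: lat if x else lon
    def latitude(position):
        return position(True)
    def longitude(position):
        return position(False)
    return make_position, latitude, longitude

def midpoint_coordinates(representation):
    """Client code: touches positions only through the constructor and selectors."""
    make_position, latitude, longitude = representation()
    p = make_position(37.0, -122.0)
    q = make_position(39.0, -120.0)
    mid = make_position((latitude(p) + latitude(q)) / 2,
                        (longitude(p) + longitude(q)) / 2)
    return latitude(mid), longitude(mid)

# The client gives the same answer under both representations.
assert midpoint_coordinates(tuple_representation) == (38.0, -121.0)
assert midpoint_coordinates(function_representation) == (38.0, -121.0)
```

Client code that indexes a position directly, for example `p[0]`, would raise a `TypeError` under the function representation, which is what makes the swap a useful check.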

Survey Results

• Compared to three other projects (2 games, 1 interpreter)

• Which project did you enjoy the most? (21.4% overall)
    • Female (23.9%) versus male (20.8%)
    • Started programming after 19th birthday (24.2%)
    • Taking first computer science course (19.0%)
    • Final grade of an A (14.5%), B (25.7%), or C (16.7%)

• Which project taught you the most? (7.8% overall)
    • Female (8.2%) versus male (7.8%)
    • Final grade of an A (3.2%), B (8.8%), or C (13.9%)