Twitter Data

Gilad Mishne - @gilad

Twitter Search

Info 290 - Analyzing Big Data With Twitter

UC Berkeley Information School

August 2012

About Twitter

• The fastest, simplest way to communicate
• More than 140M active users
• Majority (also) mobile; 60% outside the U.S.
• More than 400M twitter.com visitors
• More than 400M tweets/day (peak: 25K/sec)
• 1,000 employees (majority in San Francisco)
• 50% engineers

Twitter data: text

Twitter data: social graph

Credit: @isaach

Twitter data: time series

Twitter data: interest graph

Credit: @psychemedia

Combined: the pulse of the world

What we do with large data

Scale

Search

Recommendations

Ads

Anti-Spam

The Speakers

(Plus, some Twitter features)

Twitter Overview

Embedded links

Expanded Tweets

Hadoop/Pig

Replies/conversations

Trends/Streaming

Protected accounts

Support for 30+ languages

Search

Retweets, Favorites

Graphs, Recommendations, Relevance

Retweets (in the timeline)

Security, anomaly detection

Photos

Scalding

Geotagging, hashtags

Goals

• Work with real data, on real problems
• Learn what it is like to work in a place like Twitter
• Build something useful
• Have a good time!

Questions?

Follow me: @gilad