justin-moore
San Francisco
Australia
Brooklyn
repeat checkin %
The Adjustment Bureau
Explore
Engineering
Our Stack
Application Stack
• Scala/Liftweb: API Machines, WWW Machines, Batch Jobs
• Scala Application code
• Mongo/Postgres/Flat Files: Databases, Logs
Data Stack
• Amazon S3: Database Dumps, Log Files
• EMR Hadoop
• Hive/Ruby/Mahout: Analytics Dashboard, Map Reduce Jobs
• mongoexport, postgres dump, Flume
Massive Intersection Queries
select * from checkins
where user in friends
and venue in nearby
select * from similarities
where venue1 in myHistory
and venue2 in nearby
< 100ms
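The two queries above can be read as set intersections: filter a collection by membership in two sets at once. The sketch below is only an illustration of that idea in Python, not the production implementation (the slides do not describe one). The names `Checkin`, `Similarity`, `friend_checkins_nearby`, and `similar_venues_nearby` are hypothetical, and the in-memory scan makes no claim about meeting the latency target.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Checkin:
    # Hypothetical record shape; the real schema is not given in the slides.
    user: str
    venue: str


@dataclass(frozen=True)
class Similarity:
    # Hypothetical record shape for a venue-to-venue similarity score.
    venue1: str
    venue2: str
    score: float


def friend_checkins_nearby(
    checkins: Iterable[Checkin], friends: Set[str], nearby: Set[str]
) -> List[Checkin]:
    """select * from checkins where user in friends and venue in nearby"""
    # Set membership tests are O(1) on average, so the cost is one pass
    # over the checkins.
    return [c for c in checkins if c.user in friends and c.venue in nearby]


def similar_venues_nearby(
    similarities: Iterable[Similarity], my_history: Set[str], nearby: Set[str]
) -> List[Similarity]:
    """select * from similarities where venue1 in myHistory and venue2 in nearby"""
    return [
        s for s in similarities
        if s.venue1 in my_history and s.venue2 in nearby
    ]
```

For example, given check-ins by `ann` at `cafe`, `bob` at `bar`, and `cat` at `cafe`, calling `friend_checkins_nearby` with friends `{"ann", "bob"}` and nearby venues `{"cafe"}` keeps only Ann's check-in.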
Analytics
Goals
• Simple Interface
• Scales with our data
• Cheap (free)
• Supports 90% use cases
• Fast
Our Internal Dashboard
EMR
The Team
The Anatomy of a Data Team
• Analytics (Stats)
• Science (ML)
• Engineering (CS)
• Mix of all of these!
Building a Data Team
• It’s hard
• Good references:
– http://radar.oreilly.com/2011/09/building-data-science-teams.html
– http://mathbabe.org/2011/09/25/why-and-how-to-hire-a-data-scientist-for-your-business/
Join us!
foursquare is hiring
www.foursquare.com/jobs
Justin Moore, @injust