Meetup photo processing, storage, and serving - NYC Tech Talks Meetup

Photos at Meetup

processing / storage / serving

Greg Whalin (@gwhalin)

Meetup

founded in 2002 in NYC
81k+ paying groups
1.2m+ RSVPs per month
25k+ Meetups per week
6 straight quarters of profitability
team of 70 people working on MEME
we are hiring

Photos (processing and storage)

~500k photo uploads per month

processing
  3 - 5 different dimensions
  orientation (not dealing with for now)
  exif

storage
  cheap
  easy to scale
  performance not critical because ...

serving
  fast
  but CDN ok
  local cache

Processing

ImageMagick (or GraphicsMagick)
  CPU intensive

No processing in core application
  for large uploads, can hold up request
  separate service (backed by distributed cluster)
  should be able to work behind dumb round-robin load balancer

Batch for background processing
  rapid turnaround still important
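
As a rough illustration of the "3 - 5 different dimensions" step, the sketch below computes aspect-preserving target sizes and builds one ImageMagick `convert` argument list per size. The size names and pixel bounds in `SIZES`, and the output naming, are assumptions made up for this example; the talk does not give them. The commands are only printed here; a separate worker process would run them.

```python
# Sketch only: the size names and pixel bounds are hypothetical.
SIZES = {"thumb": (80, 80), "member": (180, 180), "event": (600, 450)}


def fit_within(width, height, max_w, max_h):
    """Scale (width, height) down to fit inside (max_w, max_h), keeping aspect ratio."""
    scale = min(max_w / width, max_h / height, 1.0)  # never upscale
    return max(1, round(width * scale)), max(1, round(height * scale))


def convert_commands(src, width, height):
    """Build one ImageMagick `convert` argument list per target size."""
    commands = []
    for name, (max_w, max_h) in SIZES.items():
        w, h = fit_within(width, height, max_w, max_h)
        commands.append(["convert", src, "-resize", f"{w}x{h}", f"{name}_{src}"])
    return commands


for cmd in convert_commands("photo.jpg", 3000, 2000):
    print(" ".join(cmd))
```

Building the argument list separately from running it keeps the CPU-heavy call out of the request path, which is the point of the slide.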

Storage

Requirements: cheap, scalable, redundant, easy

local RAID
  easy, but not scalable, redundant or cheap

SAN
  scalable, easy, and redundant, but not cheap

NAS
  scalable, easy, and cheap, but not super redundant

Distributed
  can be cheap, scalable, redundant, and easy

We went w/ MogileFS
  Cheap and scalable: JBOD on entire network
  Redundant: auto-replication schemes (rack and chassis aware), no SPOF
  Easy: kinda, but no POSIX interface; web api (GET/PUT)
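
To make "rack and chassis aware" replication concrete, here is a toy placement function. It is not MogileFS code and does not claim to match its replication policy; the `Device` record and `place_replicas` helper are hypothetical names for this sketch. It picks devices so that no two copies share a host, preferring distinct racks when enough exist.

```python
from collections import namedtuple

# Hypothetical device record for this sketch.
Device = namedtuple("Device", "dev_id host rack free_gb")


def place_replicas(devices, copies):
    chosen, used_hosts, used_racks = [], set(), set()
    # Emptiest devices first, so data spreads across the JBOD disks.
    candidates = sorted(devices, key=lambda d: -d.free_gb)
    # Pass 1 insists on a new rack; pass 2 relaxes to just a new host.
    for need_new_rack in (True, False):
        for dev in candidates:
            if len(chosen) == copies:
                return chosen
            if dev.host in used_hosts:
                continue
            if need_new_rack and dev.rack in used_racks:
                continue
            chosen.append(dev)
            used_hosts.add(dev.host)
            used_racks.add(dev.rack)
    if len(chosen) < copies:
        raise ValueError("not enough independent hosts for requested copies")
    return chosen


devices = [
    Device(1, "store1", "rackA", 400),
    Device(2, "store1", "rackA", 900),
    Device(3, "store2", "rackA", 700),
    Device(4, "store3", "rackB", 300),
]
print([d.dev_id for d in place_replicas(devices, 2)])  # [2, 4]
```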

Photo upload life cycle

1. User POSTs photo(s) from computer to Meetup App servers

2. Record inserted into db in a pending state

3. App server POSTs photo w/ some metadata to staging cluster (behind dumb load balancer)

4. lighttpd+fastCGI process running on each stager stores photo and meta to filesystem

5. Stager job monitors directory (using inotify) and wakes up when new photo to process

6. Stager code (Perl) processes photo, stores in MogileFS and updates DB that it is complete
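
The life cycle above can be sketched as a small simulation. Everything here is a stand-in: a dict plays the photo table, another dict plays MogileFS, a temp directory plays the stager's spool directory, and the stager is called by hand instead of being woken by inotify. The real stager is Perl; Python is used only for illustration, and the function names are hypothetical.

```python
import os
import tempfile

# Hypothetical stand-ins for the db, MogileFS, and the stager's spool directory.
db, blob_store = {}, {}
spool_dir = tempfile.mkdtemp()


def app_server_upload(photo_id, data):
    db[photo_id] = "pending"                      # step 2
    path = os.path.join(spool_dir, f"{photo_id}.jpg")
    with open(path, "wb") as fh:                  # steps 3-4: stager spools to disk
        fh.write(data)


def stager_pass():
    """One wake-up of the stager job; the real one is triggered by inotify."""
    for name in sorted(os.listdir(spool_dir)):
        photo_id = name.rsplit(".", 1)[0]
        path = os.path.join(spool_dir, name)
        with open(path, "rb") as fh:
            blob_store[photo_id] = fh.read()      # step 6: "process" and store
        os.remove(path)
        db[photo_id] = "complete"                 # step 6: mark done


app_server_upload("42", b"fake image bytes")
print(db["42"])   # pending
stager_pass()
print(db["42"])   # complete
```

Keeping the pending record in the db before the photo reaches the stager means a lost or stuck upload stays visible as a row that never reached the complete state.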

Current Setup

2 x dedicated upload app server machines
  rarely ever restarted so as to avoid interrupting long uploads
  java + tomcat
  8 proc Opteron w/ 16 GB

2 x dedicated staging/processing boxes
  lighttpd + perl daemon
  4 proc Opteron w/ 8 GB

9 x storage nodes (Mogile)
  4 proc Opteron w/ 8 GB
  JBOD (old boxes 8 x 500GB, new 16 x 1TB)
  total capacity now 86TB (72 used)

2 x db boxes for meta info (running mysql)
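
A quick check of the storage numbers on this slide, using only the figures given (decimal units assumed, as disk vendors quote them):

```python
TB = 1000  # GB per TB, decimal units (assumption)

old_box_gb = 8 * 500       # old boxes: 8 x 500GB
new_box_gb = 16 * 1 * TB   # new boxes: 16 x 1TB
total_tb, used_tb = 86, 72

print(old_box_gb / TB, new_box_gb / TB)    # 4.0 16.0 (raw TB per old / new box)
print(f"{used_tb / total_tb:.0%} full")    # 84% full
print(total_tb - used_tb, "TB free")       # 14 TB free
```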

Serving

2 serving nodes
  2 proc Opterons w/ 4 GB (old cheap boxes)
  lighttpd + fastCGI
  retrieve photo from MogileFS, and return it
  not high speed at serving so...

Akamai
  photos served w/ long TTL for cache
  even so, 20% of all requests hit our origin servers (lots of edge?) so ...

Varnish cache in front of origin servers (coming soon)

Akamai midgress
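
The idea of putting a cache in front of slow origin servers can be sketched with a tiny in-memory TTL cache. This is not Varnish or Akamai configuration; the `TTLCache` class, the `origin` function, and the one-day TTL are assumptions for illustration. It counts how many requests fall through to the origin.

```python
import time


class TTLCache:
    """Tiny in-memory cache standing in for a caching layer in front of origin."""

    def __init__(self, fetch_from_origin, ttl_seconds, clock=time.monotonic):
        self.fetch = fetch_from_origin
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}        # key -> (expires_at, body)
        self.hits = self.misses = 0

    def get(self, key):
        now = self.clock()
        entry = self.entries.get(key)
        if entry and entry[0] > now:
            self.hits += 1
            return entry[1]
        self.misses += 1                  # request falls through to origin
        body = self.fetch(key)
        self.entries[key] = (now + self.ttl, body)
        return body


origin_calls = []


def origin(key):                          # hypothetical origin fetch
    origin_calls.append(key)
    return f"bytes of {key}".encode()


cache = TTLCache(origin, ttl_seconds=86400)
for key in ["a.jpg", "b.jpg", "a.jpg", "a.jpg", "b.jpg"]:
    cache.get(key)
print(cache.hits, cache.misses)           # 3 2
```

With many independent edge caches, each one takes its own first miss per photo, which is one plausible reading of the "lots of edge?" remark; a shared cache directly in front of the origin absorbs those repeats.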

Links

http://www.danga.com/mogilefs/
http://www.imagemagick.org/script/index.php