Building a platform from open source Dustin Whittle / Yahoo! Developer Network

Building A Platform From Open Source At Yahoo


Join us for a case study on using open source tools to build a platform for enterprise web applications with symfony. The focus of this session will be on how Yahoo! built Delicious and Yahoo! Answers with symfony. Find out what worked and what didn't when building scalable web applications and how you can leverage Yahoo!'s Open Stack for your next project. We will examine the components that make up Yahoo!'s open stack: developer tools (YUI), data APIs (YQL), and the application platform (YAP).


Overview
•  Why symfony?
•  symfony vs ysymfony
•  Social Search: Yahoo! Answers and Delicious
•  Yahoo! Open Strategy
   –  What is the Yahoo! Open Stack?
   –  Application Platform + Developer Tools
•  Developer Tools
   –  YUI, Design Patterns, Tutorials
•  Data (YQL) & Social APIs
   –  YQL, Geo, Profiles, Connections, Updates, …
   –  YOS SDK for PHP
•  Building an open application with symfony and YOS
   –  OAuth / YQL / OpenSocial

Who am I?
•  Yahoo! – Application Platform + Dev Network
   –  Worked with Y! Answers, Delicious, Y! Bookmarks, Y! Widgets, Yahoo! Application Platform
•  Working with symfony since it was open sourced
   –  symfony Core Team Member
   –  Responsible for symfony at Yahoo!
   –  Commercial support + training (USA)

DEVELOPER.YAHOO.COM EXAMPLES | TUTORIALS | CODE SAMPLES

YAHOO! IS POWERED BY OPEN SOURCE TECHNOLOGIES

FREEBSD | LINUX | APACHE | PHP | MYSQL | BUGZILLA | HADOOP | SYMFONY

YAHOO! EMBRACES OPEN STANDARDS W3C | MICROFORMATS | OAUTH | OPENID | OPENSOCIAL

YAHOO! HIRES OPEN SOURCE DEVELOPERS

RASMUS LERDORF | DOUG CROCKFORD | DOUG CUTTING | CHRISTIAN HEILMANN

YAHOO! GIVES BACK TO OPEN SOURCE YUI | BROWSER PLUS | DESIGN PATTERNS | R3 | YSLOW + PERFORMANCE RULES

YAHOO! SHARES ITS DATA THROUGH OPEN APIS AND WEB SERVICES

YQL | PIPES | BOSS | CONTACTS | UPDATES | MAIL | DELICIOUS | FLICKR | UPCOMING | HOTJOBS | MAPS | FIREEAGLE | GEOLOCATION | LOCAL | TRAFFIC | WEATHER | MUSIC | ANSWERS | SHOPPING | FINANCE | TRAVEL

YAHOO! ENGAGES COMMUNITIES WITH OPEN HACK EVENTS AROUND THE WORLD

Conferences | Hack Days | HackU | Tech Talks | YDN Theater

•  Users
•  Load Balancers
•  Frontend
   –  PHP: APC, PEAR, PECL, Custom Extensions
   –  FreeBSD 4.x/6.x, Linux 2.6.x
   –  ysymfony / YUI
   –  Apache: Custom Modules
•  Backend
   –  MySQL/Oracle, Web Services, Ad API, User API

Y! needs from a frontend platform
•  Fit existing environment (RHEL/PHP5/Apache)
•  Development Cycle – How easy to develop, test, and deploy?
•  Clean separation between data, logic, and display (MVC)
•  Independent model layer to fit service oriented architecture
•  Extensible and pluggable
•  Internationalization and localization support
•  Detailed documentation and active community of support
•  Open source and ability to contribute back

Why a frontend platform?
•  Rasmus says “frameworks are not well suited for Y!”
   –  Build applications to requirements
•  Do exactly what you need: no more, no less
•  Understand that frameworks add a lot of overhead
•  Choosing functional components is a better fit
•  Whether you choose open source or build your own
   –  Everyone uses a framework
   –  If you use open source, you get maintenance for free

Why a framework at all?
•  Another software layer (ysymfony, yphp, yapache)
•  Factors out common patterns
   –  Code Layout
   –  Configuration
   –  URL Routing
   –  ORM / Data Access
   –  Authentication / Security (XSS/CSRF)
   –  Form Validation / Repopulation
   –  Internationalization / Localization
   –  Debugging and Testing utilities
•  Encourages good design
•  Abstraction > Consistency > Maintainability

The choice to use symfony
•  Philosophy
   –  Full-stack framework for building complex web applications
   –  Adopt best ideas from anywhere, using existing code if available (Mojavi, Prado, Rails, Django)
•  Design
   –  Clean separation between Model, View, and Controller
   –  Controller using modules and actions
   –  Views using templates in straight PHP with helpers
   –  Easy to reuse view modules to compose a page
      •  Layouts, Components, Partials, Slots

The choice to use symfony
•  Configurability / Flexibility
   –  Features we do not want are easily disabled
   –  Use of factories for easy customization
•  Documentation / Support Community
   –  The Definitive Guide to symfony (free online)
   –  Excellent tutorials and example applications
      •  Askeet & Jobeet
   –  Active community with wiki, mailing lists, forums, IRC channel

Why symfony for Yahoo! teams?
•  Eliminate common patterns by adding a layer on PHP
   –  Code layout/structure (MVC)
   –  Configuration
   –  Internationalization
•  ysymfony is just a toolkit
   –  Learn one set of tools
•  Shift between multiple projects
•  Consistency
   –  Long term maintainability through platform

A look at Yahoo! Answers
•  http://answers.yahoo.com
•  Yahoo! Answers is the largest collection of human knowledge on the Web with more than 135 million users and 515 million answers worldwide (Yahoo! Internal Data, March 2008).
•  Yahoo! Answers is the 2nd ranked education & reference site on the web (comScore)
•  Available in 26 markets and 12 languages

Yahoo! Answers at the beginning
•  Started as a small development team on PHP4 from a fork of Yahoo! Taiwan Knowledge+
•  Launched December 2005; by December 2006 there were 60 million users and 65 million answers
•  The code base eventually became difficult to maintain and to extend with new features
•  Large distributed development teams (US / UK)

The big picture
•  A complete platform for building web applications from frameworks
   –  PHP Framework
   –  JavaScript Framework
   –  CSS Framework
   –  UI Design Patterns + Best Practices
   –  Development Tools (logger, profiler, debugger, docs)
   –  Unit + Functional Testing Frameworks (LIME / YUI Test)
   –  Deployment Tools (rsync deployment system)

What does Yahoo! change?
•  Minor changes to fit our environment (bsd/php/apache)
   –  Most of our changes are easily implemented via factories
•  Added dimensions to configurations (ysfDimensionsPlugin)
•  Integrate R3 translation/template management (ysfR3Plugin)
   –  R3 - http://developer.yahoo.com/r3/
•  Dropped the ORM and pushed down the stack (SOA)
   –  Added a parallel API Dispatcher (ysfAPIClientPlugin)
•  Created a build and deployment solution (ysfBuildPlugin)
•  Integrate support for Y! User Interface libraries (ysfYUIPlugin)
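The idea behind a parallel API dispatcher (issuing several backend service calls at once rather than one after another) can be sketched roughly as below. This is an illustrative Python sketch only, not the code of ysfAPIClientPlugin; `dispatch_parallel` and the two stand-in service functions are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def dispatch_parallel(calls, max_workers=8):
    """Run named zero-argument callables concurrently and return {name: result}."""
    if not calls:
        return {}
    workers = min(max_workers, len(calls))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Submit every call first so they all run at the same time.
        futures = {name: pool.submit(fn) for name, fn in calls.items()}
        # Then collect results; an exception in a call is re-raised here.
        return {name: future.result() for name, future in futures.items()}


# Hypothetical stand-ins for backend web service calls.
def fetch_profile():
    time.sleep(0.05)  # simulated network latency
    return {"user": "alice"}


def fetch_questions():
    time.sleep(0.05)  # simulated network latency
    return ["q1", "q2"]


if __name__ == "__main__":
    page_data = dispatch_parallel(
        {"profile": fetch_profile, "questions": fetch_questions}
    )
    print(page_data)
```

With this shape the controller stays thin: it names the services a page needs, and the total wait is close to the slowest call rather than the sum of all calls.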

Propel or Doctrine or ???
•  No ORM for large projects
•  Propel or Doctrine for small projects
   –  Doctrine for internal projects (best supported)

•  Service Oriented Architecture
   –  Platforms as services (reusable to all)
   –  No heavy lifting, push down the stack
   –  Thin Controller/Fat Model (where model == services)
•  Java/C++/Erlang + JSON/XML

What does it mean to scale?
•  A system whose performance improves after adding hardware, proportionally to the capacity added, is said to be a scalable system.
•  High Availability + Scalability + Performance
•  Bigger dataset, more traffic, maintainable
•  Not about performance
   –  PHP is slow, but it is not your bottleneck
•  Languages do not scale, architectures do.
•  Planning to grow and planning to fail
   –  Capacity Planning
   –  Business Continuity Planning
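As a rough worked example of the definition of a scalable system, scaling efficiency can be written as throughput gain divided by capacity gain, where a value near 1.0 means performance grew in proportion to the hardware added. The function name and the numbers below are illustrative assumptions, not Yahoo! measurements.

```python
def scaling_efficiency(servers_before, throughput_before,
                       servers_after, throughput_after):
    """Return throughput gain divided by capacity gain (1.0 = linear scaling)."""
    if min(servers_before, servers_after) <= 0 or throughput_before <= 0:
        raise ValueError("server counts and baseline throughput must be positive")
    capacity_gain = servers_after / servers_before
    throughput_gain = throughput_after / throughput_before
    return throughput_gain / capacity_gain


if __name__ == "__main__":
    # Hypothetical numbers: doubling servers from 4 to 8 raises
    # throughput from 2000 to 3600 requests per second.
    print(round(scaling_efficiency(4, 2000, 8, 3600), 2))
```

An efficiency well below 1.0 usually points at a shared bottleneck (a single database, a lock, a session store) rather than at the language.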

Scaling – Planning
•  Planning hardware purchases and hosting options to have as much as you need without breaking your wallet
•  Partitioning and distributing databases to support large datasets and simultaneous transactions
•  Monitoring your applications to find and clear bottlenecks
•  Providing services APIs and using services from other providers to increase your site's reach and capabilities
•  Think Minimal, Plan to grow, Plan to fail.

Scaling – The basics in PHP
•  PHP is rarely the bottleneck
•  “Most performance comes not from the language, but from application design” - Rasmus
•  Share Nothing Architecture
   –  Independent, self-sufficient, no single point of contention
   –  No local storage = No PHP Sessions
      •  Use a database (works for distributed)
      •  Use a small signed cookie (ideal)
         –  Important data in database
         –  Individual expiration on session objects
         –  Small data items
   –  Use a distributed cache
      •  Memcache
•  Always use an opcode cache
   –  Forget about small efficiencies: premature optimization is the root of all evil.
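The "small signed cookie" approach can be sketched as follows: the server keeps no session state, and instead stores a few small values in a cookie protected by an HMAC and an expiry time. This is a minimal Python sketch under assumed names (`SECRET_KEY`, `encode_cookie`, `decode_cookie`), not the mechanism Yahoo! or symfony actually uses.

```python
import base64
import hashlib
import hmac
import json
import time

# Assumption: the same secret is deployed to every frontend server.
SECRET_KEY = b"replace-with-a-long-random-secret"


def _sign(payload):
    """Return the hex HMAC-SHA256 of the payload bytes."""
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()


def encode_cookie(data, ttl_seconds=3600, now=None):
    """Serialize a small dict plus an expiry time and append a signature."""
    issued = now if now is not None else time.time()
    body = json.dumps({"d": data, "exp": int(issued) + ttl_seconds},
                      separators=(",", ":"), sort_keys=True)
    payload = base64.urlsafe_b64encode(body.encode("utf-8"))
    return payload.decode("ascii") + "." + _sign(payload)


def decode_cookie(cookie, now=None):
    """Return the stored dict, or None if the cookie was altered or has expired."""
    payload_text, separator, signature = cookie.rpartition(".")
    if not separator:
        return None
    payload = payload_text.encode("utf-8")
    expected = _sign(payload).encode("ascii")
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(expected, signature.encode("utf-8")):
        return None
    body = json.loads(base64.urlsafe_b64decode(payload))
    current = now if now is not None else time.time()
    if body["exp"] < current:
        return None
    return body["d"]


if __name__ == "__main__":
    cookie = encode_cookie({"uid": 42, "lang": "en"})
    print(decode_cookie(cookie))
```

The cookie is signed, not encrypted, so anything sensitive still belongs in the database; the cookie only carries small identifiers that any frontend can verify without shared local storage.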

Scaling Databases – The basics
•  Master/Slave Replication
   –  First steps
   –  Helps with reads, writes are still bottleneck
•  Partitioning
   –  Segmenting data
•  Sharding (horizontal partitioning)
   –  Segmenting data onto different physical machines
   –  Make problems smaller, easier to grow
•  Offline/queue everything that degrades user experience
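A minimal sketch of sharding, assuming rows are keyed by a user id and that the shard is picked by hashing that key; the host names and the `shard_for` helper are hypothetical, and real deployments often use a lookup directory or consistent hashing instead.

```python
import hashlib

# Hypothetical database hosts, one per shard.
SHARDS = [
    "db0.example.internal",
    "db1.example.internal",
    "db2.example.internal",
    "db3.example.internal",
]


def shard_for(user_id, shards=SHARDS):
    """Map a user id to one shard deterministically via a stable hash."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


if __name__ == "__main__":
    for uid in (1, 2, 3, 1001):
        print(uid, "->", shard_for(uid))
```

Plain modulo hashing is the simplest scheme, but changing the number of shards remaps most keys, which is one reason to plan the partitioning before the dataset grows.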

Improving latency with Caching
•  Always use PHP opcode cache (APC, XCache, etc)
   –  Use for routing and i18n cache
•  Memcache (distributed cache)
   –  Use for view cache
•  Distributed invalidation can be a pain
•  sfViewCacheManager makes this easy!
•  Be intelligent about cache_keys (uri, user, state)
•  There is a fine line to caching
   –  At what point do you spend more time managing the cache than reading from it?
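One way to be deliberate about cache keys is to build them from exactly the inputs that change the rendered output: the URI, the viewing user, and any extra state. The sketch below is an assumption-laden Python illustration (the `view_cache_key` helper is hypothetical), not how sfViewCacheManager derives its keys.

```python
import hashlib


def view_cache_key(uri, user_id=None, state=None):
    """Build a deterministic cache key from the URI, the viewer, and extra state."""
    viewer = user_id if user_id is not None else "anonymous"
    parts = [uri, "user=%s" % viewer]
    # Sort state items so the same state always yields the same key.
    for name, value in sorted((state or {}).items()):
        parts.append("%s=%s" % (name, value))
    raw = "|".join(parts)
    # Hashing keeps the key short and free of spaces, which memcached requires.
    return "view:" + hashlib.sha256(raw.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print(view_cache_key("/question/index?qid=1", user_id=42, state={"lang": "en"}))
```

Leaving the user out of the key for pages that look the same to everyone lets one cached copy serve all visitors, while including it for personalized fragments avoids showing one user's view to another.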

Tweaking Performance
•  Don’t use features you do not need
   –  settings.yml / factories.yml
•  Use core_compile (aggregate classes)
•  Remove debug statements (sfOptimizerPlugin)
•  Do not use .htaccess (move to real apache config)
•  Set a minimal include path
•  Increase realpath_cache_size + realpath_cache_ttl
•  Use apc.stat=0
•  Use @routeName
•  Do not use components in loops

Do it yourself for cheap
•  Open source software = Free
   –  Apache
   –  PHP
   –  MySQL
   –  Memcache / Perlbal / MogileFS / Squid / Gearman
   –  symfony / Doctrine / Propel / Swift
   –  Nagios
•  Amazon Shared Infrastructure = Cheap
   –  EC2 Cloud Computing
   –  S3 Distributed Storage
   –  SimpleDB

Yahoo! Open Strategy

OPEN PLATFORMS + COLLABORATION OPENID | XRDS | OAUTH | PORTABLE CONTACTS | OPEN SOCIAL

The Open Web

Y! OS – The Open Stack

•  Yahoo! Developer Network
•  Developer Tools (YUI, etc)
•  Social APIs
   –  Profiles
   –  Connections
   –  Updates
•  Data APIs
•  OAuth
•  Yahoo! Query Language
•  Yahoo! Application Platform
•  OpenSocial

Y! Developer Network – YUI JavaScript
•  JavaScript Framework
   –  Utilities - YAHOO, Dom, Event, Animation, Browser History Manager, Connection Manager, Cookie, DataSource, Drag and Drop, Element, Get, ImageLoader, JSON, Resize, Selector, Loader
   –  Controls / Widgets - AutoComplete, Button, Calendar, Charts, Color Picker, DataTable, ImageCropper, Rich Text Editor, Slider, Uploader
   –  Container (Module, Overlay, Panel, Tooltip, Dialog), Layout Manager, Menu, TabView, TreeView
   –  Debug - Logger, Profiler, Test

Y! Developer Network – YUI CSS
•  CSS Foundation
   –  Reset - Neutralizes browser CSS styles
   –  Base - Applies consistent style foundation
   –  Fonts - Foundation for typography and font-sizing
   –  Grids - Thousands of wireframe layouts
•  User Interface Design Patterns Library
   –  Proven solutions to common interfaces
   –  http://developer.yahoo.com/ypatterns/
   –  Graded Browser Support / Progressive Enhancement

Documentation
•  More than 275 functional examples
   –  http://developer.yahoo.com/yui/examples/
•  YSlow + Performance Rules
   –  http://developer.yahoo.com/performance
•  YUI Blog
   –  http://yuiblog.com/
•  Mailing List @ Yahoo! Groups
   –  http://tech.groups.yahoo.com/group/ydn-javascript/

SELECT * FROM INTERNET

Before YQL
•  Thousands of Web Services that provide valuable data
•  Each requires developers to read documentation and form URLs/queries.
•  Data is isolated
•  Data needs combining, tweaking, and shaping even after it gets to the developer.

Y! Open Stack – YQL
•  SQL-Like Language
   –  Synonymous with Data access
   –  Familiar to developers
   –  Expressive enough to get the right data
•  Self Describing - show, desc table
•  Allows you to query, filter and join data across Web Services.

YQL – Open Tables

•  Delicious
•  Dopplr
•  Friendfeed
•  Github
•  New York Times
•  Shopping
•  Twitter
•  Weather
•  Wesabe
•  Whitepages
•  Zillow
•  …

Available on GitHub - http://github.com/spullara/yql-tables/

YQL - Examples

•  select * from social.connections
•  select * from delicious.feeds.popular
•  select * from flickr.photos.interestingness
•  select * from friendfeed.status
•  select * from github.checkins

YQL – JavaScript Execute
•  Allows executing JavaScript on the server side to mash up data (join, filter, etc)
•  Makes combining many web services very simple
•  Support for OAuth

YQL – API End Points
•  OAuth Endpoint
   –  http://query.yahooapis.com/v1/yql?q=...
•  Public Endpoint
   –  http://query.yahooapis.com/v1/public/yql?q=
•  YQL Console
   –  http://developer.yahoo.com/yql/console
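Calling the public endpoint amounts to URL-encoding a YQL statement into the `q` parameter. The sketch below only builds the URL and makes no network request; `build_yql_url` is a hypothetical helper, and the `format=json` parameter is an assumption that does not appear on the slides.

```python
from urllib.parse import urlencode

# Public endpoint as listed on the slide.
PUBLIC_ENDPOINT = "http://query.yahooapis.com/v1/public/yql"


def build_yql_url(query, endpoint=PUBLIC_ENDPOINT, response_format="json"):
    """Return the request URL for a YQL statement, with the query URL-encoded."""
    # Assumption: the service accepts a 'format' parameter selecting the output type.
    return endpoint + "?" + urlencode({"q": query, "format": response_format})


if __name__ == "__main__":
    print(build_yql_url("select * from delicious.feeds.popular"))
```

The OAuth endpoint takes the same `q` parameter but additionally requires a signed request, so tables such as social.connections that expose user data would go through it instead.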

Y! Open Stack – Application Platform
•  Allows developers to deploy their own web based applications on Yahoo!
•  Multiple Views: Small and Canvas
•  Social Context: the new Yahoo! Social Directory
•  OpenSocial 0.8 JavaScript APIs

Y! Open Stack – SDKs
•  PHP SDK Available
   –  Open and OAuth Applications
•  ActionScript 3 SDK Available
   –  Open Applications
•  Objective-C SDK Available
   –  Open and OAuth Applications

YOSSDK – Methods

3-Legged OAuth
•  getSessionedUser (session)
•  getOwner (session)
•  getUser (session)
•  query (session)
•  getPresence (user)
•  setPresence (user)
•  listUpdates (user)
•  listConnectionUpdates (user)
•  insertUpdate (user)
•  deleteUpdate (user)
•  loadProfile (user)
•  getConnections (user)
•  getContacts (user)
•  setSmallView (user)

2-Legged OAuth
•  setSmallView (application)
•  query (application)

YOSSDK – 2-Legged OAuth

•  Used For:

-  Public user data and open APIs

YOSSDK – 3-Legged OAuth

•  Used For:

-  Private data access

Building an Open App: CommonGround
•  Experience: Basic Web Dev Knowledge – PHP/HTML/CSS/JavaScript
•  What we are building: CommonGround
   –  Find out what you have in common with your social graph: music, movies, books, hobbies.
•  What we will use: YOSSDK, YQL, YAP

http://developer.yahoo.com/dashboard

CommonGround available on GitHub

http://github.com/dwhittle/commonground

QUESTIONS?

WANT TO JOIN YAHOO? WE ARE HIRING AND HAVE INTERNSHIPS!

DEVELOPER.YAHOO.COM EXAMPLES | TUTORIALS | CODE SAMPLES

ENJOY THE REST OF DUTCH PHP CONFERENCE 2009