Embedding automated photo-document publication on the web and in mobile devices
E. O’Brien-Strain, A. Hunter, J. Liu, Q. Lin, D. Tretter, J. Wang, X. Zhang, and P. Wu
Hewlett-Packard Labs
In which the author explains how to design large-scale cloud platforms for document processing
Outline
Motivation
Architectural Principles
Platform Implementation
Example Application – Facebook Photo-Books
Motivation
We have a wealth of auto-publishing algorithms
Want to provide them for third parties to use
By building a cloud-based platform that
  Is flexible and programmable
  Is secure and private
  Is infinitely scalable
  Has high availability and responsiveness
  Reconciles WYSIWYG and style-driven use models
Architectural Principles
REST
Capability Security Authorization
Authentication Agnostic
Scaling: Elastic, Sessionless, noSQL, Caching
Orthogonal: UI / Content Source / Artifacts
REST
“Representational State Transfer”
Architectural pattern for creating network APIs
All API calls are HTTP requests to some URL
  GET to retrieve data from a URL
  PUT to write data to a URL
  POST to perform some action on a URL
  DELETE to remove the data from the URL
Starting from the response to an initial URL, client code finds other URLs to operate on
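The verbs and the link-following idea above can be illustrated with a small in-memory sketch. Nothing here is the platform's real API: the `TinyRestServer` class, the `/document` path, and the `links` field are hypothetical names chosen only for illustration, and no network is involved.

```python
class TinyRestServer:
    """In-memory stand-in for an HTTP server (hypothetical, not the real API)."""

    def __init__(self):
        # The root resource links to a (hypothetical) document collection.
        self.resources = {"/": {"links": {"documents": "/document"}}}
        self._next_id = 1

    def request(self, method, url, body=None):
        if method == "GET":  # retrieve data from a URL
            if url in self.resources:
                return 200, self.resources[url]
            return 404, None
        if method == "PUT":  # write data to a URL
            self.resources[url] = body
            return 200, body
        if method == "POST":  # perform an action: here, create a child resource
            new_url = f"{url}/{self._next_id}"
            self._next_id += 1
            self.resources[new_url] = body
            return 201, {"url": new_url}
        if method == "DELETE":  # remove the data from the URL
            if url not in self.resources:
                return 404, None
            del self.resources[url]
            return 204, None
        return 405, None


server = TinyRestServer()
# The client knows only the initial URL and discovers the rest from responses.
_, root = server.request("GET", "/")
documents_url = root["links"]["documents"]
_, created = server.request("POST", documents_url, {"title": "Holiday book"})
doc_url = created["url"]
server.request("PUT", doc_url, {"title": "Holiday book, revised"})
status, doc = server.request("GET", doc_url)
```

The client code never hard-codes `doc_url`; it reads it from the response to the POST, which is the discovery pattern the slide describes.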
Capability Security Authorization
An example of a URL from our API:
http://foo.com/document/qY9vZObN-slqsv_RWnJB4w/content/chunks.json
Has cryptographically-secure random string
If you do not know this URL there is no way to find it
Possess URL <=> Authorized to use URL
“Moderate” level of security
  Still vulnerable to network snoopers
  Can use SSL to increase security
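A minimal sketch of how such an unguessable URL could be minted. The slides do not say how the random string is generated; using Python's `secrets` module with 16 random bytes is an assumption, chosen because it produces a 22-character URL-safe token of the same shape as the one in the example URL. The function name `new_document_url` is hypothetical.

```python
import secrets

BASE_URL = "http://foo.com"  # host taken from the example URL on the slide


def new_document_url(base_url=BASE_URL):
    """Return a capability URL: whoever holds it is authorized to use it."""
    # Assumption: 16 random bytes, encoded as 22 URL-safe base64 characters.
    token = secrets.token_urlsafe(16)
    return f"{base_url}/document/{token}/content/chunks.json"


url = new_document_url()
token = url.split("/")[4]
```

Because the token comes from a cryptographically secure source, a party that has not been given the URL has no practical way to guess it.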
Authentication Agnostic
No concept of a “User”
  Instead just store anonymous resources
Client code expected to
  keep track of users and authenticate them
  remember which resources belong to each user
Gives flexibility to client to use any authentication
  No need for complexity of “single-sign-on”
Allows us to
  avoid many security/privacy headaches
  avoid complexity and cost of using SSL
  have our data cached in Internet infrastructure
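Since the platform stores only anonymous resources, the bookkeeping of who owns what lives in the client application. The following is a rough sketch of that client-side bookkeeping; the `ClientResourceRegistry` class and the user identifiers are hypothetical, and a real client would persist this mapping in its own database.

```python
from collections import defaultdict


class ClientResourceRegistry:
    """Client-side record of which capability URLs belong to which user."""

    def __init__(self):
        self._urls_by_user = defaultdict(set)

    def remember(self, user_id, url):
        # Called after the client has authenticated user_id by its own means.
        self._urls_by_user[user_id].add(url)

    def urls_for(self, user_id):
        return sorted(self._urls_by_user.get(user_id, ()))

    def owns(self, user_id, url):
        return url in self._urls_by_user.get(user_id, ())


registry = ClientResourceRegistry()
registry.remember("alice", "http://foo.com/document/EXAMPLE-TOKEN-1/content/chunks.json")
```

The platform never sees `"alice"`; it only sees requests to URLs that the client chooses to use on her behalf.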
Scaling
Can use elastic infrastructure cloud
  rapid spin-up and spin-down of virtual servers
Sessionless
  Bank of web servers operating in parallel
  Sequence of HTTP requests sprays out arbitrarily over multiple servers
NoSQL
  Highly-distributed no-master key-value store
Caching at every level
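The sessionless idea can be sketched as follows: every server is interchangeable because all state lives in a shared key-value store, so it does not matter which server handles which request. The `KeyValueStore` and `WebServer` classes are hypothetical stand-ins (a plain dictionary plays the role of the distributed store), not the platform's implementation.

```python
import random


class KeyValueStore:
    """Stand-in for a distributed key-value store; here just a dict."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def put(self, key, value):
        self._data[key] = value


class WebServer:
    """Holds no per-client state between requests."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def add_photo(self, doc_key, photo):
        photos = self.store.get(doc_key, [])
        self.store.put(doc_key, photos + [photo])
        return self.name  # which server happened to handle the request


store = KeyValueStore()
servers = [WebServer(f"server-{i}", store) for i in range(3)]
rng = random.Random(0)
# Each request in the sequence lands on an arbitrary server.
handled_by = [rng.choice(servers).add_photo("doc-1", p)
              for p in ["a.jpg", "b.jpg", "c.jpg"]]
```

Whatever routing the load balancer picks, the document ends up with the same three photos, which is what allows servers to be spun up and down freely.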
Any Permutation of
User interface for creating documents
  Web (HTML5 or Flash)
  Mobile device
  PC application
Where content comes from
  Social network
  Photo site / document storage site
  PC folder
What kind of artifacts
  Print at home, at retail, or at PSP
  View on e-Book reader, slate, or phone
Platform Implementation
Initially Targeting Photo-Oriented Documents
Unified Model for Document + Template + Content
Content Transformation
Transactional Data
Embeddable as a Widget
Monetizable
Initial Target
Platform architected to handle a wide variety of documents
Initially handles photo-oriented documents
  Such as photo-books
Can be extended to handle more text-heavy documents
  Such as magazines
Unified “Document” Model
Document = content + “rendersheet”
A single “document” resource type for
  User documents (content + rendersheet)
  Collections of input content (just content)
  Templates (just rendersheet)
Any document can be used as template for new document
Any document can be used as source of content for new document
Look-and-feel of one document can be applied to another
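One way to picture the unified model is a single record type in which either half may be absent. This is a sketch under assumptions: the field types (a list of photo names for content, a dictionary for the rendersheet) and the helper names `new_from` and `apply_look_and_feel` are hypothetical and do not come from the platform.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Document:
    content: Optional[List[str]] = None   # e.g. photo names; None for a template
    rendersheet: Optional[dict] = None    # styling/layout; None for a collection


def new_from(template: Document, content_source: Document) -> Document:
    """Any document can act as template, any document as content source."""
    return Document(content=content_source.content,
                    rendersheet=template.rendersheet)


def apply_look_and_feel(target: Document, style_source: Document) -> Document:
    """Keep the target's content, take the other document's rendersheet."""
    return Document(content=target.content,
                    rendersheet=style_source.rendersheet)


collection = Document(content=["beach.jpg", "sunset.jpg"])   # just content
template = Document(rendersheet={"theme": "travel"})         # just rendersheet
book = new_from(template, collection)                        # user document
```

Because templates, collections, and user documents share one type, the same two operations cover all the reuse cases listed above.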
Algorithms
Auto-organizing algorithms using
  Photo quality
  Near-duplicate detection using structural similarity and time proximity
Auto-layout algorithms
  BRIC (blocked recursive image composition)
  START (structured layout for resizable background art)
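To make the near-duplicate idea concrete, here is a toy sketch that groups photos when they are both close in capture time and similar in appearance. It is not the authors' algorithm: the thresholds are made up, the grouping rule (compare each photo only with the previous one in time order) is a simplification, and the `similarity` function is supplied by the caller as a stand-in for a real structural-similarity measure.

```python
from typing import Callable, List, Tuple

Photo = Tuple[str, float]  # (name, capture time in seconds)


def group_near_duplicates(
    photos: List[Photo],
    similarity: Callable[[str, str], float],
    max_gap_s: float = 10.0,       # assumed time-proximity threshold
    min_similarity: float = 0.9,   # assumed similarity threshold
) -> List[List[Photo]]:
    groups: List[List[Photo]] = []
    for photo in sorted(photos, key=lambda p: p[1]):
        if groups:
            prev = groups[-1][-1]
            close_in_time = photo[1] - prev[1] <= max_gap_s
            if close_in_time and similarity(prev[0], photo[0]) >= min_similarity:
                groups[-1].append(photo)
                continue
        groups.append([photo])
    return groups


def toy_similarity(a: str, b: str) -> float:
    # Placeholder: treats names sharing a first letter as identical images.
    return 1.0 if a[0] == b[0] else 0.0


groups = group_near_duplicates(
    [("a1", 0.0), ("a2", 3.0), ("b1", 5.0), ("a3", 100.0)], toy_similarity)
```

Each resulting group could then be reduced to its best member using a photo-quality score, which is how the two auto-organizing ingredients on the slide would fit together.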
Transactional Data
Resources are not stored indefinitely
  Have an expiration date
Two top-level types of resources
  Documents (composed of “Chunks”)
  Artifacts
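Expiring resources can be sketched as a store that records an expiration time with each entry and refuses to return it afterwards. The `ExpiringStore` class is hypothetical, and the current time is passed in explicitly so the example is deterministic; a real service would presumably read a clock and purge in the background.

```python
class ExpiringStore:
    """Toy store in which every resource carries an expiration time."""

    def __init__(self):
        self._items = {}

    def put(self, url, value, ttl_s, now):
        self._items[url] = (value, now + ttl_s)

    def get(self, url, now):
        entry = self._items.get(url)
        if entry is None:
            return None
        value, expires_at = entry
        if now >= expires_at:
            del self._items[url]  # expired: drop it on access
            return None
        return value


resources = ExpiringStore()
resources.put("/artifact/1", "book.pdf", ttl_s=3600, now=0)
fresh = resources.get("/artifact/1", now=1800)
stale = resources.get("/artifact/1", now=3600)
```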
Embeddable as a Widget
Can be embedded in Web or mobile application
Third-party developers can write their own document-design user interface, or they can use the Flash widget that we provide
Monetizable
We include features that allow for a variety of different business models
Each client application must register with us
  API key and “shared secret” token
All client requests that create or modify resources must be signed with the secret
All resources are marked with the client application that created them
All resources have a “time to live” before they are deleted
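A sketch of what signing with a shared secret might look like. The slides do not specify the signature scheme, so HMAC-SHA256 over the method, URL, and body is an assumption, and the function names and the sample secret are hypothetical.

```python
import hashlib
import hmac


def sign_request(secret: bytes, method: str, url: str, body: bytes) -> str:
    """Compute a signature over the parts of the request that matter."""
    message = b"\n".join([method.encode(), url.encode(), body])
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_request(secret: bytes, method: str, url: str, body: bytes,
                   signature: str) -> bool:
    expected = sign_request(secret, method, url, body)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)


secret = b"example-shared-secret"  # issued at registration (hypothetical value)
sig = sign_request(secret, "PUT", "/document/abc/content/chunks.json", b"{}")
ok = verify_request(secret, "PUT", "/document/abc/content/chunks.json", b"{}", sig)
tampered = verify_request(secret, "PUT", "/document/abc/content/chunks.json",
                          b'{"x":1}', sig)
```

The server, which also knows the secret for the given API key, recomputes the signature and can then mark the created resource with the client application that signed it.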
Example Application
Facebook Application
Built by team of outside developers
Uses our UI widget for creating and viewing photobooks
Integrates nicely into Facebook site
  Leverages social connections of users
  To make application more viral
Summary
Introduction
Architectural Principles
Platform Implementation
Example Application – Facebook Photo-Books