Getting Started on Google Cloud Platform
Aaron Taylor
@ataylor0123
access any file in seconds, wherever it is.
www.meta.sc
Folders are outdated
Files are scattered
Talk Roadmap
• What problems we face at Meta
• How we are solving them using GCP
• How you can get started on GCP
Building a product
• No baggage, free to choose whatever stack we want
• Take advantage of latest technologies
• but not quite bleeding edge
Engineering Goals
• This will be a complex product, so it needs to be comprehensible to everyone on our team
• Keep the team as lean as possible
• Focus on product, not sysadmin and dev ops
Language Choices
• Go chosen as our primary language
• Python for NLP and data analysis
• enables easy experimentation, comfortable for data scientists and developers
• Java/Scala interacting with Dataflow, Apache Tika, etc.
Our Hard Problems
• User onboarding load
• Heterogeneous (changing) data sources
• Unpredictable traffic from web hooks
• Compute loads for file content analysis
• Processing streaming data
User Onboarding
• Crawl multiple cloud accounts at once
• Parallel computation
• In-process using Go
• Distributed using tasks
• App Engine task queues
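The "in-process using Go" bullet could look roughly like the sketch below: one goroutine per connected account, joined with a sync.WaitGroup. The Account type and the crawl function are hypothetical placeholders, not Meta's code; a real crawler would call each provider's API where crawl simply counts files.

```go
package main

import (
	"fmt"
	"sync"
)

// Account is a hypothetical stand-in for one connected cloud account.
type Account struct {
	Provider string
	Files    []string
}

// crawl pretends to list the files in one account and returns how many
// it found. A real crawler would call the provider's API here.
func crawl(a Account) int {
	return len(a.Files)
}

// crawlAll crawls every account concurrently, one goroutine per account,
// and returns the number of files found per provider.
func crawlAll(accounts []Account) map[string]int {
	var (
		mu     sync.Mutex // guards counts
		wg     sync.WaitGroup
		counts = make(map[string]int)
	)
	for _, a := range accounts {
		wg.Add(1)
		go func(a Account) {
			defer wg.Done()
			n := crawl(a)
			mu.Lock()
			counts[a.Provider] += n
			mu.Unlock()
		}(a)
	}
	wg.Wait() // block until every account has been crawled
	return counts
}

func main() {
	accounts := []Account{
		{Provider: "drive", Files: []string{"a.txt", "b.txt"}},
		{Provider: "dropbox", Files: []string{"c.pdf"}},
	}
	fmt.Println(crawlAll(accounts))
}
```

This only covers a single process. The "distributed using tasks" bullet would instead hand each account to a task queue so the work can spread across instances.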
Heterogeneous Data
• Remove complexity of third-party services
• Detect changes/breakages in APIs
• Distributed by nature
• Continuous Deployment
• Datastore
• BigQuery
Unpredictable Traffic
• Changes are pushed to us through web hooks
• Dropping changes generally unacceptable
• One user should not negatively impact others
• App Engine autoscaling
• Asynchronous task queues
Compute Loads
• Rich file content analysis
• Parallel computation
• App Engine Flexible Runtimes
• CPU-based autoscaling
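Within a single instance, CPU-heavy analysis is commonly spread over a fixed number of workers so the machine stays busy without being oversubscribed. The sketch below is one such worker pool; the analyze function (a word count) is a hypothetical placeholder for real content analysis, and sizing the pool with runtime.NumCPU is an assumption, not something the talk specifies.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
	"sync"
)

// analyze is a placeholder for real file content analysis.
// Here it just counts whitespace-separated words.
func analyze(content string) int {
	return len(strings.Fields(content))
}

// analyzeAll processes contents using at most `workers` goroutines and
// returns the results in input order.
func analyzeAll(contents []string, workers int) []int {
	if workers < 1 {
		workers = 1
	}
	results := make([]int, len(contents))
	jobs := make(chan int) // indices into contents

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is handled by exactly one worker, so
				// writes to results never overlap.
				results[i] = analyze(contents[i])
			}
		}()
	}
	for i := range contents {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	docs := []string{"quarterly report draft", "notes", ""}
	fmt.Println(analyzeAll(docs, runtime.NumCPU()))
}
```

CPU-based autoscaling then works at the level above this: when instances running such pools stay saturated, the platform adds more instances.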
Stream Processing
• Efficient handling of high-volume changes
• Collate events in succession, from multiple users
• Google Cloud Pub/Sub
• Google Cloud Dataflow
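As a rough illustration of collating events, the sketch below collapses a batch of change events so that only the latest event per user and file survives. This is one possible reading of "collate", not a description of Meta's pipeline; the Event type and its fields are hypothetical, and a Dataflow job would express the same idea with windowing and grouping rather than an in-memory slice.

```go
package main

import "fmt"

// Event is a hypothetical change notification.
type Event struct {
	UserID string
	FileID string
	Seq    int // increases with each change
}

// collate keeps only the latest event for each (user, file) pair,
// preserving the order in which pairs were first seen.
func collate(events []Event) []Event {
	type key struct{ user, file string }
	index := make(map[key]int) // position of each pair in out
	var out []Event
	for _, e := range events {
		k := key{e.UserID, e.FileID}
		if i, ok := index[k]; ok {
			if e.Seq > out[i].Seq {
				out[i] = e
			}
			continue
		}
		index[k] = len(out)
		out = append(out, e)
	}
	return out
}

func main() {
	events := []Event{
		{UserID: "u1", FileID: "f1", Seq: 1},
		{UserID: "u2", FileID: "f1", Seq: 2},
		{UserID: "u1", FileID: "f1", Seq: 3},
	}
	for _, e := range collate(events) {
		fmt.Println(e.UserID, e.FileID, e.Seq)
	}
}
```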
How we started off
• App Engine is our entry point
• Service Oriented Architecture
• Currently ~37 different services
• Cloud Datastore is our persistence layer
• BigQuery as a data warehouse
Documentation
• Lots of information for getting started
• Quality resources for our growing team
• Onboarding new developers without GCP experience has been a breeze
• Google is devoting lots of resources to this area
App Engine
• Don’t worry about servers
• Cache, task queues, cron, database, logging, monitoring, and more all built in
• Powerful, configurable autoscaling
• Heavy compute on App Engine Flexible Runtimes
Development Process
• Build, run, and test services locally
• Continuous deployment to a development project
• Incremental releases go to production project
• Logging and monitoring are easy to set up
Problems we faced
• Mantra of “don’t worry about scalability” didn’t take us very far
• Users have lots and lots of files
• Datastore use optimizations
• Cost issues with App Engine
• Trimming auto-scaling parameters
• Migrated heavy compute to Flexible Runtimes
Outside GCP
• Algolia
• Hosts infrastructure for our search indices
• Pusher
• realtime socket connections
• Postmark/Mailchimp
• transactional and campaign-based email
Growth of the platform
• Rapid changes and improvements taking place
• Flexible Runtimes
• Container Engine
• Dataflow
• Investing in a documentation overhaul soon
• Support is generally quite responsive
Recent Developments
• Introduction of Pub/Sub to our system for all event processing
• Experimenting with Kubernetes/Container Engine
• Dataflow stream processing jobs
• Splitting functionality into multiple projects
Quickstart Documentation for Go
How you can start off
Hello World in Go
https://cloud.google.com/appengine/docs/go/quickstart
Server
package hello

import (
	"fmt"
	"net/http"
)

func init() {
	http.HandleFunc("/", handler)
}

func handler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, "Hello, world!")
}
hello.go
Configuration
runtime: go
api_version: go1

handlers:
- url: /.*
  script: _go_app
app.yaml
Deploy
appcfg.py update .
Add a Guestbook
https://cloud.google.com/appengine/docs/go/gettingstarted/creating-guestbook
Datastore
type Greeting struct {
	Author  string
	Content string
	Date    time.Time
}

// guestbookKey returns the key used for all guestbook entries.
func guestbookKey(c appengine.Context) *datastore.Key {
	// The string "default_guestbook" here could be varied to have multiple guestbooks.
	return datastore.NewKey(c, "Guestbook", "default_guestbook", 0, nil)
}

func root(w http.ResponseWriter, r *http.Request) {
	c := appengine.NewContext(r)
	// Ancestor queries, as shown here, are strongly consistent with the High
	// Replication Datastore. Queries that span entity groups are eventually
	// consistent. If we omitted the .Ancestor from this query there would be
	// a slight chance that Greeting that had just been written would not
	// show up in a query.
	q := datastore.NewQuery("Greeting").Ancestor(guestbookKey(c)).Order("-Date").Limit(10)
	greetings := make([]Greeting, 0, 10)
	if _, err := q.GetAll(c, &greetings); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	if err := guestbookTemplate.Execute(w, greetings); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
	}
}
Templates
var guestbookTemplate = template.Must(template.New("book").Parse(`
<html>
  <head>
    <title>Go Guestbook</title>
  </head>
  <body>
    {{range .}}
      {{with .Author}}
        <p><b>{{.}}</b> wrote:</p>
      {{else}}
        <p>An anonymous person wrote:</p>
      {{end}}
      <pre>{{.Content}}</pre>
    {{end}}
    <form action="/sign" method="post">
      <div><textarea name="content" rows="3" cols="60"></textarea></div>
      <div><input type="submit" value="Sign Guestbook"></div>
    </form>
  </body>
</html>
`))
Forms
func sign(w http.ResponseWriter, r *http.Request) {
	c := appengine.NewContext(r)
	g := Greeting{
		Content: r.FormValue("content"),
		Date:    time.Now(),
	}
	if u := user.Current(c); u != nil {
		g.Author = u.String()
	}
	// We set the same parent key on every Greeting entity to ensure each Greeting
	// is in the same entity group. Queries across the single entity group
	// will be consistent. However, the write rate to a single entity group
	// should be limited to ~1/second.
	key := datastore.NewIncompleteKey(c, "Greeting", guestbookKey(c))
	_, err := datastore.Put(c, key, &g)
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	http.Redirect(w, r, "/", http.StatusFound)
}
Conclusions
• Google Cloud Platform has allowed us to build out Meta in ways that wouldn’t otherwise be feasible
• Simplicity of App Engine allows us to focus on product
• Scalability and availability are built into the platform
access any file in seconds, wherever it is.
www.meta.sc/careers
careers@meta.sc