node.js + Riak


Javascript is becoming a compelling platform - from browser to database. Node and Riak are two big players in this space.


An introduction to node.js & Riak

Francisco Treacy

May 31st, 2010 @ ams.rb Amsterdam

`whoami`

Building a web-based collaborative e-reading platform

J2EE veteran

Trumpetless wannabe trumpet player

@frank06 on the interwebs

How did we learn to program?

Let’s look back, shall we?

printf("What is your name?");
gets(string);
printf("Your name is: %s", string);

Does this sound familiar?

...it is how we learn how to program

<?php
$query = "SELECT astronaut FROM spacecraft";
$result = mysql_query($query);

while($result) { ... }
?>

So then we go build our first webapp...

(with the first language we come across)

Let’s imagine a bunch of people trying to access your website

(run that code) at the same time

It’s like hosting a party and providing only one toilet

© hopechurchokc / flickr

The cost of I/O

L1 cache: 3 cycles

L2 cache: 14 cycles

RAM: 250 cycles

Disk: 41,000,000 cycles

Network: 240,000,000 cycles
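To get a feel for these figures, here is a rough sketch that rescales them to human time by pretending one CPU cycle lasts one second. The cycle counts are the ones quoted above; the helper names (`humanScaleDays`, `timesSlowerThanRam`) are made up for illustration.

```javascript
// Cycle counts as quoted on the slide.
const costInCycles = {
  'L1 cache': 3,
  'L2 cache': 14,
  'RAM': 250,
  'Disk': 41000000,
  'Network': 240000000
};

// Hypothetical helper: pretend one cycle lasts one second
// and express the wait in days.
function humanScaleDays(cycles) {
  return cycles / (60 * 60 * 24);
}

// Hypothetical helper: how many RAM accesses fit into one access of this kind.
function timesSlowerThanRam(cycles) {
  return cycles / costInCycles['RAM'];
}

Object.keys(costInCycles).forEach(function (name) {
  const cycles = costInCycles[name];
  console.log(name + ': ' + humanScaleDays(cycles).toFixed(4) + ' days, ' +
    timesSlowerThanRam(cycles) + 'x RAM');
});
```

At that scale a RAM access takes a little over four minutes, while a single network round trip takes roughly seven and a half years.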

The cost of I/O

[Chart: access cost for L1, L2, RAM, Disk, Network]

The cost of I/O

In other words, reaching RAM is like going from here to the Red Light District.

Accessing the network is like going to the moon.

$result = mysql_query($query);

So... what is your program still doing while MySQL goes to fetch Neil Armstrong?

But we have threads! (to do other stuff while we wait)

Programming with threads

Race conditions

Coarse-grained locks – more blocking

Fine-grained locks – more complexity

Risk for deadlocks

Hard to reason about and therefore to get right

Context switching overhead

Programming with threads

A look at Apache and Nginx

http://blog.webfaction.com/a-little-holiday-present

Memory vs concurrent connections

A look at Apache and Nginx

1 thread per connection is limiting for massive concurrency

Apache uses 1 thread per connection

Nginx doesn’t use threads

it runs an event loop

small memory allocation per connection

Event loops

Especially relevant when dealing with I/O

All code runs in a single thread

No need for multithreading – no locks!
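The idea can be sketched with a toy loop: one queue, one thread, and every callback runs to completion before the next one starts. This is only an illustration. Real event loops such as the ones in Nginx or node wait on operating-system notifications, whereas here the "I/O completion" is simulated by re-enqueueing a callback, and `createLoop` is a hypothetical name.

```javascript
// A toy event loop: callbacks are queued and run one at a time.
function createLoop() {
  const queue = [];
  return {
    // Schedule a callback to run on a later turn of the loop.
    enqueue: function (callback) {
      queue.push(callback);
    },
    // Run callbacks in order until the queue is empty.
    run: function () {
      while (queue.length > 0) {
        const callback = queue.shift();
        callback();
      }
    }
  };
}

const loop = createLoop();
const log = [];

loop.enqueue(function () {
  log.push('request A: start query');
  // Instead of blocking on the query, register what to do with its result.
  loop.enqueue(function () {
    log.push('request A: got result');
  });
});
loop.enqueue(function () {
  log.push('request B: handled while A waits');
});

loop.run();
console.log(log.join('\n'));
```

Because only one callback runs at a time, the shared `log` array needs no lock, and request B is served before A's result arrives.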

Event loops

window.onload = function() {
  alert("Apollo 11 landed!")
}

Rings a bell?

Javascript!

Fortunately, Google has been developing a brand new Javascript VM for Chrome...

It’s robust, blazingly fast, and open-source:

V8

The node way

A set of bindings to Google’s V8 Javascript VM

A purely evented, non-blocking infrastructure that makes it super simple to build highly concurrent programs

Ability to handle thousands of concurrent connections with minimal overhead on a single process

The node way

I love Ruby

I can use EventMachine for async

And, after all, my Rails app doesn’t have youporn’s traffic

Why should I care?

The node way

“Libraries like eventmachine will never be truly intuitive to use, because event-driven I/O is enough of a fundamental shift that it requires deep language integration. Javascript, it turns out, is a fundamentally event-driven language because of its origins in the browser”

Adam Wiggins – Heroku

The node way

"Threads should be used by experts only"

Javascript, with its first-class function objects and closures, is perfect for event loops

It is arguably the most popular programming language

Full Javascript stack?

Full Javascript stack

$.ajax({
  url: '/api/feedme',
  success: function(data) {
    $('.result').html(data);
  }
});

Client-side

Full Javascript stack

var socket = new io.Socket('localhost');
socket.connect();
socket.send('some data');
socket.addEvent('message', function(data){
  $('.result').html(data);
});

Client-side – even better

WebSockets through Socket.IO-node

Full Javascript stack

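var http = require('http');

// Answer every request with a short HTML body, listening on port 8000
http.createServer(function (request, response) {
  response.writeHead(200, {'Content-Type': 'text/html'});
  response.end('I am back!');
}).listen(8000);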

Server-side – node.js

Full Javascript stack

Database

?

Introducing Riak

Content-agnostic key/value store

REST API embraces HTTP

Javascript Map/Reduce

Distributed, “assume that failures will happen”

Linearly scalable

Introducing Riak

€ ⇔ throughput – cost predictability

Both up and down; fewer headaches for operations and development

What is scalable anyway? (other than a buzzword)

db.save('astronauts', 'neil', {
  name: "Neil Armstrong",
  retired: true,
  daysInSpace: 8,
  missions: ['Apollo 11', 'Gemini 8']
})()

riak-js – http://github.com/frank06/riak-js

Available for node.js and jQuery

Brief Riak overview

var map = function(v, keydata, args) {
  var result = [],
      doc = Riak.mapValuesJson(v)[0]
  doc.missions.forEach(function(mission) {
    if (mission.match(new RegExp(args.mission)))
      result.push(doc)
  })
  return result;
}

Map/Reduce jobs can be written in Javascript and submitted via the HTTP interface
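As a rough sketch of what "submitted via the HTTP interface" could look like, the snippet below serializes a map function into a JSON job description. It assumes the job is POSTed as JSON to Riak's `/mapred` resource with the function source passed as a string; `buildMapJob` is a hypothetical helper, and no request is actually sent.

```javascript
// The map function from the slide. `Riak.mapValuesJson` only exists inside
// Riak's Javascript VM, so the function is serialized here, never called.
var map = function (v, keydata, args) {
  var result = [],
      doc = Riak.mapValuesJson(v)[0];
  doc.missions.forEach(function (mission) {
    if (mission.match(new RegExp(args.mission))) result.push(doc);
  });
  return result;
};

// Hypothetical helper: builds the JSON body for a one-phase map job.
// Assumption: the server accepts { inputs, query: [{ map: {...} }] }.
function buildMapJob(bucket, mapFn, arg) {
  return JSON.stringify({
    inputs: bucket,
    query: [{
      map: { language: 'javascript', source: mapFn.toString(), arg: arg }
    }]
  });
}

var body = buildMapJob('astronauts', map, { mission: 'Apollo' });
console.log(body);
```

A client library such as riak-js presumably does something along these lines before issuing the request, which is why the map function has to be self-contained: only its source text travels to the cluster.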

Brief Riak overview

db.mapReduce({
  inputs: "astronauts",
  query: [{ map: { source: map, arg: { mission: "Apollo" } } }]
})(function(results) {
  // ...
});

Bring the computation to the data – map phases are executed in parallel

Aggregation happens in one node
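The division of labour can be simulated locally. In this sketch each inner array stands for the documents held by one node; the data, the reduce step and all function names are illustrative assumptions, and the "parallel" map phases simply run one after another.

```javascript
// Sample data: one inner array per (imaginary) node.
const partitions = [
  [{ name: 'Neil Armstrong', missions: ['Apollo 11', 'Gemini 8'] }],
  [{ name: 'Yuri Gagarin', missions: ['Vostok 1'] }],
  [{ name: 'Buzz Aldrin', missions: ['Apollo 11', 'Gemini 12'] }]
];

// Map phase: runs next to the data, once per document.
function mapPhase(doc, args) {
  const pattern = new RegExp(args.mission);
  const matches = doc.missions.some(function (m) { return pattern.test(m); });
  return matches ? [doc.name] : [];
}

// Reduce phase: runs in one place over the combined map output.
function reducePhase(values) {
  return values.slice().sort();
}

function runJob(parts, args) {
  // Each partition is independent, so a cluster could process them in parallel.
  const perNode = parts.map(function (docs) {
    return docs.reduce(function (acc, doc) {
      return acc.concat(mapPhase(doc, args));
    }, []);
  });
  // Only the map results travel to the aggregating node.
  const combined = perNode.reduce(function (acc, r) { return acc.concat(r); }, []);
  return reducePhase(combined);
}

console.log(runJob(partitions, { mission: 'Apollo' }));
```

Only the matching names leave each partition, which is the point of bringing the computation to the data.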

Brief Riak overview

db.get('astronauts', 'neil', {r: 2})(
  function(neil) {
    $('.result').html(neil)
  }
)

Tunable N/R/W values to tweak CAP behaviour

Eventual consistency: Brief sacrifices of consistency in failure conditions

“Choose your own fault tolerance/performance tradeoff”
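A simplified way to reason about the tradeoff, assuming N replicas per key, R replicas that must answer a read and W replicas that must acknowledge a write: when R + W > N, every read set overlaps every write set. The function below is a back-of-the-envelope model with a hypothetical name, not part of riak-js, and it ignores details of how Riak actually handles failures.

```javascript
// n: replicas per key
// r: replicas that must answer a read
// w: replicas that must acknowledge a write
function describeQuorum(n, r, w) {
  return {
    // Overlapping read and write sets mean a read reaches at least
    // one replica that took part in the latest acknowledged write.
    readsOverlapWrites: r + w > n,
    // Replicas that may be unreachable while the operation still succeeds.
    readFaultTolerance: n - r,
    writeFaultTolerance: n - w
  };
}

console.log(describeQuorum(3, 2, 2));
console.log(describeQuorum(3, 1, 1));
```

With N = 3, choosing R = 2 and W = 2 tolerates one unreachable replica and keeps reads and writes overlapping; R = 1 and W = 1 tolerates two, at the price of possibly reading a stale value.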

Brief Riak overview

Consistency: Reads and writes reflect a globally consistent system state

Availability: System is available for reads and writes

Partition tolerance: System can handle the failure of individual parts

CAP theorem

Brief Riak overview

CAP theorem

No real-world data store can serve completely consistent data while

being 100% available and handling disconnected networks

Brief Riak overview

Internet

NGINX

Rails Rails Rails Rails Node

Collaborative platform for studying

Tens of thousands of books alongside user-generated content

Highlighting, note-taking, sharing

Web-based, use it anywhere: laptop, phone, iPad

HTML5 + Javascript + node.js + Riak (and Rails!)

Expect a beta release in October

Riak is developer- and ops-friendly: it scales down to your laptop as easily as up to a cluster – especially during exam period!

Allows us to store multimedia assets just like JSON

Lucene-like search coming soon

Node is used for Ajax calls and WebSockets

Rails for the rest (it’s convenient and mature)

Cutting-edge technologies are not bug-free

Riak still has some rough edges (some in terms of performance)

node is approaching its first stable version

Async JS code can get “boomerang-shaped”

There are caveats though


db.save(bucket, doc, content)(function(response, meta) {
  db.get(bucket, doc)(function(response2) {
    assert.equal(response2, content);
    db.remove(bucket, doc)(function() {
      db.get(bucket, doc)(null, function(r, meta3) {
        assert.equal(404, meta3.statusCode);
        db.get(bucket, other)(function() {
          // ...
        })
      });
    });
  });
});

Boomerang-shaped code

Step or flow-js

Address boomerang-shaped code

Step(
  function readSelf() {
    fs.readFile(__filename, this);
  },
  function capitalize(err, text) {
    if (err) { throw err; }
    return text.toUpperCase();
  },
  function showIt(err, newText) {
    sys.puts(newText);
  }
);
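To show how such a helper can flatten nested callbacks, here is a simplified sequencer written in the same spirit. It is a sketch, not Step's actual implementation: `sequence` is a hypothetical name, and the first step calls its callback directly instead of doing real file I/O so the example stays self-contained.

```javascript
// Runs the given functions in order. Inside each step, `this` is the
// callback (err, value) that advances to the next step.
function sequence() {
  const steps = Array.prototype.slice.call(arguments);

  function next(err, value) {
    if (steps.length === 0) {
      if (err) throw err;
      return;
    }
    const step = steps.shift();
    let result;
    try {
      result = step.call(next, err, value);
    } catch (e) {
      // A thrown error is handed to the next step as its `err` argument.
      next(e);
      return;
    }
    // A step that returns a value advances the chain synchronously.
    if (result !== undefined) next(null, result);
  }

  next();
}

const output = [];

sequence(
  function readInput() {
    // Stands in for an async API that calls back with (err, data).
    this(null, 'apollo 11');
  },
  function capitalize(err, text) {
    if (err) { throw err; }
    return text.toUpperCase();
  },
  function showIt(err, newText) {
    output.push(newText);
  }
);

console.log(output[0]);
```

Each step sits at the same indentation level, so adding a fourth or fifth operation no longer pushes the code further to the right.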

CoffeeScript is a new language inspired by Javascript and Ruby

grade: (student) ->
  if student.excellent_work
    "A+"
  else if student.okay_stuff
    if student.tried_hard then "B" else "B-"
  else
    "C"

eldest: if 24 > 21 then "Liz" else "Ike"

CoffeeScript compiles down to Javascript

var eldest, grade;
grade = function(student) {
  if (student.excellent_work) {
    return "A+";
  } else if (student.okay_stuff) {
    if (student.tried_hard) {
      return "B";
    } else {
      return "B-";
    }
  } else {
    return "C";
  }
};
eldest = 24 > 21 ? "Liz" : "Ike";

And can be run on node!

There’s no doubt about it.

Javascript as a platform is serious stuff.

Full JS stacks will become more and more popular.

Maybe it’s about time we started teaching this?

puts("Enter your spacecraft:")
gets(function(s) {
  puts("You’re flying your " + s)
})

Go fetch a beer – see you in a bit for a demo and questions!