Asynchronous I/O in NodeJS - new standard or challenges?

Asynchronous I/O in NodeJS

- Standard and challenge of web programming?

Pham Cong Dinh @ SkunkWorks
@pcdinh
BarCamp Saigon 2011

Follow us on Twitter: @teamskunkworks

Notice

It is not a comprehensive study

Things may change

No standard at the moment

Unix only

Proven control flow models

Single-threaded process model: Prefork MPM (Apache HTTPd)

Multi-threaded process model: Worker MPM (Apache HTTPd)

JVM

Emerging control flow models

Coroutines

Coroutines are computer program components that generalize subroutines to allow multiple entry points for suspending and resuming execution at certain locations.

Fiber

A lightweight thread of execution. Like threads, fibers share address space. However, fibers use co-operative multitasking while threads use pre-emptive multitasking.

Events: non-blocking I/O

Events / Non-blocking I/O

From http://bethesignal.org/

Request handling:
Process / Threads / Events

Process: A single process is forked per request. That process is blocked until the response is produced.

Threads: A process contains multiple threads. A single thread is assigned to handle each incoming request (thread-per-request model).

Events: A process handles a series of events. At any instant, a single event/request is being processed. The process does not block on I/O requests.

Request handling:
Process / Threads / Events

Threads share

Memory (shared by default)

File descriptors

Filesystem context

Signals and signal handling

Request handling:
Process / Threads / Events

Thread creation is expensive - From iobound.com

Request handling:
Process / Threads / Events

Context switching is expensive

Request handling:
Process / Threads / Events

Blocking model: Process and Threads

Request handling:
Process / Threads / Events

Non-blocking model: Events

Request handling:
Event-driven IO loop issue

poll vs. epoll

epoll: O(1), or O(n) where n is the number of active descriptors

Request handling:
Event-driven callback issues

Non-blocking IO Loop vs. Blocking callback

Event dispatching is not blocked

Callback can be blocked

Mixing asynchronous code and synchronous code can be bad
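To make these points concrete, here is a minimal sketch (not from the slides) of a timer that is asked to fire after 10 ms but is held up by synchronous work on the same thread. The helper names `busyWait` and `measureTimerDelay` are made up for illustration.

```javascript
// Hypothetical helper: spins the CPU for `ms` milliseconds,
// standing in for any synchronous (blocking) work inside a callback.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* spin */ }
}

// Schedules a 10 ms timer, then blocks the thread for `blockMs`.
// The timer callback cannot run until the blocking work returns,
// so the measured delay is at least `blockMs`.
function measureTimerDelay(blockMs, done) {
  const scheduledAt = Date.now();
  setTimeout(function () {
    done(Date.now() - scheduledAt);
  }, 10);
  busyWait(blockMs);
}

measureTimerDelay(200, function (delay) {
  console.log('timer asked for 10 ms, fired after', delay, 'ms');
});
```

The event loop itself never blocked on I/O here; the delay comes entirely from the synchronous code that ran between two events.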

Events in NodeJS

libev for event loops

libeio for asynchronous file I/O

c-ares for asynchronous DNS requests and name resolution.

evcom (by Ryan Dahl) is a stream socket library on top of libev.

Asynchronous programming model in NodeJS

First-class citizens: higher-order functions/callbacks

Most objects in NodeJS are Event Emitters (http server/client, etc.)

Most low level functions take callbacks. (posix API, DNS lookups, etc.)
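The two conventions above can be sketched with Node's standard `events` and `fs` modules. The `job` emitter, its event names, and the `seen` array are invented for this illustration; only `EventEmitter`, `on`, `emit`, and `fs.stat` are real Node APIs.

```javascript
const EventEmitter = require('events').EventEmitter;
const fs = require('fs');

// 1. An event emitter: listeners are registered with on() and
//    invoked, in registration order, each time emit() is called.
const job = new EventEmitter();
const seen = [];

job.on('progress', function (percent) {
  seen.push(percent);
});
job.on('done', function () {
  console.log('progress events received:', seen.join(', '));
});

job.emit('progress', 50);
job.emit('progress', 100);
job.emit('done');

// 2. A low-level call that takes a callback: fs.stat wraps the POSIX
//    stat() call and reports its result as (err, stats).
fs.stat(process.cwd(), function (err, stats) {
  if (err) {
    console.error('stat failed:', err.message);
    return;
  }
  console.log('cwd is a directory:', stats.isDirectory());
});
```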

Blocking code

var a = db.query('SELECT A');
console.log('result a:', a);

Non-blocking code using a callback

db.query('SELECT A', function (result) {
  console.log('result a:', result);
});
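A runnable sketch of the non-blocking form, assuming a stand-in `db` object: the `db.query` below is a hypothetical fake built on `setTimeout`, not a real database driver. It shows that the statement after `db.query` runs before the callback does.

```javascript
// Hypothetical stand-in for a database client. It "answers" after
// 10 ms so the example needs no real database.
const db = {
  query: function (sql, callback) {
    setTimeout(function () {
      callback('rows for ' + sql);
    }, 10);
  }
};

const order = [];

db.query('SELECT A', function (result) {
  order.push('callback: ' + result);
  console.log(order);
});

// This line runs first: db.query returned immediately
// without waiting for the result.
order.push('after db.query returned');
```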

Asynchronous programming model in NodeJS

Callbacks are hard

Divide the work into stages and handle each stage in a callback

Do things in a specific order.

You must keep track of what is done at a point of time

Hard to handle failures

Nested callbacks can be hard to read

Asynchronous programming model in NodeJS

Nested callbacks can be hard to read

var transferFile = function (request, response) {
  var uri = url.parse(request.url).pathname;
  var filepath = path.join(process.cwd(), uri);
  // check whether the file exists; the result arrives in the callback
  path.exists(filepath, function (exists) {
    if (!exists) {
      response.writeHead(404, {"Content-Type": "text/plain"});
      response.write("404 Not Found\n");
      response.end();
    } else {
      // read the file content; the result arrives in the callback
      fs.readFile(filepath, "binary", function (error, data) {
        if (error) {
          response.writeHead(500, {"Content-Type": "text/plain"});
          response.write(error + "\n");
        } else {
          response.writeHead(200);
          response.write(data, "binary");
        }
        response.end();
      });
    }
  });
};

Asynchronous programming model in NodeJS

Callbacks are hard to debug

function f() {
  throw new Error('foo');
}

setTimeout(f, 10000 * Math.random());
setTimeout(f, 10000 * Math.random());

From which line does the error arise?
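One way to see the problem, as a sketch: by the time a timer callback runs, the function that scheduled it has already returned, so it is no longer on the call stack. The names `captureStack`, `scheduleFromHere`, and `timerCallback` are invented for this example, and the error is caught rather than left to crash the process so its stack can be inspected.

```javascript
function f() {
  throw new Error('foo');
}

// Schedules f() from inside scheduleFromHere, then hands the
// resulting stack trace to `done`.
function captureStack(done) {
  function scheduleFromHere() {
    setTimeout(function timerCallback() {
      try {
        f();
      } catch (err) {
        done(err.stack);
      }
    }, 5);
  }
  scheduleFromHere();
}

captureStack(function (stack) {
  console.log(stack);
  // The trace lists f and timerCallback, but the scheduling
  // function has already returned and does not appear.
  console.log('mentions scheduleFromHere:',
    stack.indexOf('scheduleFromHere') !== -1);
});
```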

Asynchronous programming model in NodeJS
Flow Control Libraries

Step https://github.com/creationix/step

Flow-JS https://github.com/willconant/flow-js

Node-Promise https://github.com/kriszyp/node-promise

Asynchronous programming model in NodeJS
Flow Control Libraries: Step

Step's goal is to both remove boilerplate code and to improve readability of asynchronous code. The features are easy chaining of serial actions with optional parallel groups within each step.

Step(
  function readSelf() {
    // pass an encoding so that `text` is a string, not a Buffer
    fs.readFile(__filename, 'utf8', this);
  },
  function capitalize(err, text) {
    if (err) throw err;
    return text.toUpperCase();
  },
  function showIt(err, newText) {
    if (err) throw err;
    console.log(newText);
  }
);
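To show the idea behind serial chaining, here is a minimal sketch of a step runner. `runSteps` is a hypothetical helper written for this example, not Step's actual implementation: it only supports the "call `this` as the callback" form and does not handle synchronous return values or parallel groups.

```javascript
// Hypothetical helper: runs the given functions one after another.
// Inside each function, `this` is the callback that starts the next one.
function runSteps() {
  const steps = Array.prototype.slice.call(arguments);

  function next(err, value) {
    const step = steps.shift();
    if (!step) {
      if (err) throw err; // nobody left to handle the error
      return;
    }
    try {
      step.call(next, err, value);
    } catch (e) {
      next(e); // a thrown error becomes `err` for the following step
    }
  }

  next(null);
}

const log = [];

runSteps(
  function load() {
    const callback = this;
    // stands in for an asynchronous read
    setTimeout(function () { callback(null, 'hello'); }, 5);
  },
  function capitalize(err, text) {
    if (err) throw err;
    this(null, text.toUpperCase());
  },
  function showIt(err, newText) {
    if (err) throw err;
    log.push(newText);
    console.log(newText);
  }
);
```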

Asynchronous programming model in NodeJS
Flow Control Libraries: Flow-JS

Flow-JS provides a Javascript construct that is something like a continuation or a fiber found in other languages. Practically speaking, it can be used to eliminate so-called "pyramids" from your multi-step asynchronous logic.

dbGet('userIdOf:bobvance', function(userId) {
  dbSet('user:' + userId + ':email', '[email protected]', function() {
    dbSet('user:' + userId + ':firstName', 'Bob', function() {
      dbSet('user:' + userId + ':lastName', 'Vance', function() {
        okWeAreDone();
      });
    });
  });
});

Asynchronous programming model in NodeJS
Flow Control Libraries: Flow-JS

flow.exec(
  function() {
    dbGet('userIdOf:bobvance', this);
  },
  function(userId) {
    dbSet('user:' + userId + ':email', '[email protected]', this.MULTI());
    dbSet('user:' + userId + ':firstName', 'Bob', this.MULTI());
    dbSet('user:' + userId + ':lastName', 'Vance', this.MULTI());
  },
  function() {
    okWeAreDone();
  }
);
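The "start several operations, continue when all have called back" pattern can be sketched with a simple counter. The `parallel` helper below is hypothetical and written for this example; it is not how Flow-JS implements `MULTI()`, and it ignores error handling.

```javascript
// Hypothetical helper: starts every task at once and calls `done`
// with the results (in task order) after the last callback arrives.
function parallel(tasks, done) {
  let pending = tasks.length;
  const results = new Array(tasks.length);
  if (pending === 0) {
    done(results);
    return;
  }
  tasks.forEach(function (task, index) {
    task(function (result) {
      results[index] = result;
      pending -= 1;
      if (pending === 0) done(results);
    });
  });
}

// Fake asynchronous operation that answers after `ms` milliseconds.
function later(ms, value) {
  return function (callback) {
    setTimeout(function () { callback(value); }, ms);
  };
}

let finalResults = null;

parallel(
  [later(30, 'email'), later(10, 'firstName'), later(20, 'lastName')],
  function (results) {
    finalResults = results;
    console.log('all done:', results);
  }
);
```

Results are stored by index, so their order matches the task list even though the callbacks arrive in a different order.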

Asynchronous programming model in NodeJS
JavaScript extension: TameJS

Tame (or "TameJs") is an extension to JavaScript, written in JavaScript, that makes event programming easier to write, read, and edit when control-flow libraries are not good enough.

http://tamejs.org/

Asynchronous programming model in NodeJS
JavaScript extension: TameJS

Synchronous code

handleVisit : function(angel, buffy) {
  var match_score = getScore(angel, buffy);
  var next_match = getNextMatch(angel);
  var visit_info = recordVisitAndGetInfo(angel, buffy);
  if (match_score > 0.9 && ! visit_info.last_visit) {
    sendVisitorEmail(angel, buffy);
  }
  doSomeFinalThings(match_score, next_match, visit_info);
}

Asynchronous code

handleVisit : function(angel, buffy) {
  getScore(angel, buffy, function(match_score) {
    getNextMatch(angel, function(next_match) {
      recordVisitAndGetInfo(angel, buffy, function(visit_info) {
        if (match_score > 0.9 && ! visit_info.last_visit) {
          sendVisitorEmail(angel, buffy);
        }
        doSomeFinalThings(match_score, next_match, visit_info);
      });
    });
  });
}

Asynchronous programming model in NodeJS
JavaScript extension: TameJS

TameJS style

handleVisit : function(angel, buffy) {

  //
  // let's fire all 3 at once
  //
  await {
    getScore (angel, buffy, defer(var score));
    getNextMatch (angel, defer(var next));
    recordVisitAndGetInfo (angel, buffy, defer(var vinfo));
  }

  //
  // they've called back, and now we have our data
  //
  if (score > 0.9 && ! vinfo.last_visit) {
    sendVisitorEmail(angel, buffy);
  }
  doSomeFinalThings(score, next, vinfo);
}

Asynchronous programming model in NodeJS
JavaScript's yield

V8/NodeJS does not support yield yet, so generators are not available

The end

Q & A