RESTful APIs for Scientific Computing in Django

Shreyas Cholia - [email protected]
Annette Greiner - [email protected]
Outreach, Software and Programming Group
NERSC - LBNL
SciPy 2011, Austin, TX
Thursday, July 21, 2011

NERSC

• National Energy Research Scientific Computing Center

• DOE Office of Science HPC User Facility at Lawrence Berkeley Lab

• Provides high performance compute, data, network and information services to scientists across the world

NERSC Resources

• Multiple HPC clusters

Hopper, Franklin, Carver, Magellan, Euclid, PDSF

• HPSS archival storage system

• Global File System

Web Gateways

• Old way - SSH + command line + batch system

• People now expect web interfaces for everything

• Usability - scientific computing should be as easy as online banking

• Users don’t want generic options/tools that are not applicable to their science

• Users don’t want to deal with the backend, middleware, UNIX CLI, etc.

Motives

• Make it very easy for science teams to build web gateways to their data and computation

• We have already built several science-specific gateways and want to encapsulate the common patterns

Web Stack

• Browser + AJAX

• REST

• JSON

• Web Framework

Web Frameworks

• Python has a number of very powerful web frameworks - Django, Web2py, Pylons/Pyramid ...

• Model-View-Controller pattern

• Separation of

• data model

• routing/request processing

• templates

• DRY - Don’t Repeat Yourself

• Building blocks for common web tasks (auth, HTTP headers, themes and templates)

• Database abstraction (ORM) - masks SQL layer

Django

• Powerful MVC framework in Python

• Probably the most widely adopted and best-documented Python web framework

She turned me into a NEWT

• NEWT - NERSC Web Toolkit

• RESTful ... ish API

• access HPC resources over the web using HTTP + JSON

• Built using Django

NERSC Web Toolkit

• NEWT Service Exposes NERSC Resources as HTTP URIs

• ReST API - HTTP + verbs + JSON

• newt.js JavaScript library for frontend development

Things you can do ...

• Authenticate using NERSC credentials

• Check machine status

• Upload and download files

• Submit a compute job

• Monitor a job

• Get user account information

• Store app data

• Issue UNIX commands

[Architecture diagram: NEWT, built on Django]

• Client: web application (HTML 5/AJAX), sending HTTP requests and receiving JSON data

• Authentication: MyProxy CA

• Internal DB: session, credential, and user information

• System resources (via Globus): files, batch jobs, shell commands, status

• Persistent store (NoSQL DB): CouchDB

• Accounting information: NIM

Quick Demo

• Make request at a URL

GET https://portal-auth.nersc.gov/newt/queue/hopper/

• Get back JSON response
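
The same request/response cycle can be sketched in Python. This is a minimal illustration using only the Python 3 standard library; the function name `fetch_json` and its injectable `opener` parameter are assumptions made for this sketch and are not part of NEWT.

```python
import json
from urllib.request import Request, urlopen


def fetch_json(url, opener=urlopen):
    """GET a URL and decode the JSON body of the response.

    `opener` is injectable so the function can be exercised without
    network access; by default it is urllib's urlopen.
    """
    request = Request(url, headers={"Accept": "application/json"}, method="GET")
    with opener(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_json` on the queue resource for Hopper (`/queue/hopper/` under the NEWT base URL) would return the decoded JSON as Python lists and dicts; a real call needs NERSC access.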

Django makes it easy!

• Django already gives us most of the glue to build these kinds of frameworks

• Python has tons of special sauce to interface with different backend resources

Building an API

• Decide on the resources you want to expose

• CORE: Map URI paths to custom methods

• Plug in an authentication backend if needed

• To access the full complement of HTTP verbs you will need a REST plugin for Django - Piston, Tastypie
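
What a REST plugin adds is, at its core, dispatch on the HTTP verb. The sketch below shows that idea in plain Python, with no Django dependency. All names here (`Response`, `ResourceAdapter`, `QueueAdapter`) are hypothetical and are not the Piston, Tastypie, or NEWT API.

```python
class Response:
    """Tiny stand-in for an HTTP response (status code plus body)."""

    def __init__(self, body="", status=200):
        self.body = body
        self.status = status


class ResourceAdapter:
    """Dispatch a request to the method named after its HTTP verb."""

    allowed = ("GET", "POST", "PUT", "DELETE")

    def __call__(self, request, *args, **kwargs):
        verb = request.method.upper()
        handler = getattr(self, verb.lower(), None) if verb in self.allowed else None
        if handler is None:
            # Verb is unknown or not implemented for this resource
            return Response("Method not allowed", status=405)
        return handler(request, *args, **kwargs)


class QueueAdapter(ResourceAdapter):
    """Hypothetical resource that supports GET and POST only."""

    def get(self, request, machine):
        return Response("jobs on %s" % machine)

    def post(self, request, machine):
        return Response("submitted to %s" % machine, status=201)
```

An instance of such a class is callable, so it can be used wherever a view function is expected; verbs the subclass does not define fall through to a 405 response.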

urls.py

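from django.conf.urls.defaults import patterns, include

# Map URI paths to view callables, given as dotted-path strings
urlpatterns = patterns('',
    (r'^/?$', 'newt.home.views.apiroot'),
    (r'^status', 'newt.status.views.statusAdapter'),
    (r'^file', include('newt.file.urls')),
)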
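
To make the matching order concrete, here is a rough standard-library sketch of how such a pattern list is resolved. It only approximates Django's resolver: the `resolve` function and `ROUTES` list are illustrative names, and the sketch does not descend into the included `newt.file.urls` module.

```python
import re

# (pattern, target) pairs, tried in order; targets are the dotted names from urls.py
ROUTES = [
    (re.compile(r'^/?$'), 'newt.home.views.apiroot'),
    (re.compile(r'^status'), 'newt.status.views.statusAdapter'),
    (re.compile(r'^file'), 'newt.file.urls'),
]


def resolve(path):
    """Return the target of the first pattern that matches, or None."""
    # Django matches patterns against the path without its leading slash
    if path.startswith('/'):
        path = path[1:]
    for pattern, target in ROUTES:
        if pattern.search(path):
            return target
    return None
```

Because `^status` and `^file` have no terminator, they match any path that starts with that prefix, so `/status/hopper` resolves to the status adapter, and so would `/statusboard`.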

views.py

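from json import JSONEncoder

from django.http import HttpResponse, HttpResponseServerError

# PublicJsonResourceAdapter, Status and logger are defined elsewhere in NEWT
class StatusAdapter(PublicJsonResourceAdapter):
    def get(self, request):
        logger.debug("display status for all")
        try:
            # Query the backend status service
            status_dict = Status.get()
        except Exception:
            return HttpResponseServerError("Could not connect")
        output = JSONEncoder().encode(status_dict['machine_info'])
        # Pass the backend's HTTP status code through to the client
        return HttpResponse(output,
                            status=status_dict['httpstatus'],
                            content_type='application/json')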

Verb   Resource             Description

POST   /queue/R             Submits POST data to queue on R; returns job id

GET    /file/R/path/        Returns directory listing for /path/ on R

GET    /account/user/U      Returns user account info for U

DEL    /store/DB/DOC        Deletes object DOC in DB

Other Cool Things About Django!

• Very pluggable - easily drop in external apps

• Middleware layer - can intercept and tweak HTTP requests and responses (useful for handling cross-site headers)

• Lots of nice decorators for handling authorization, sessions, caching, etc.

• Handles users, sessions, and DB (ORM) stuff automatically
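
As an illustration of the middleware point, the following sketch adds cross-site (CORS) headers to every response using the `process_response` hook of Django's class-based middleware of that era. The class name and the header values are assumptions chosen for the example, not NEWT's actual configuration.

```python
class CrossSiteHeadersMiddleware(object):
    """Middleware sketch that adds CORS headers to every response."""

    # Hypothetical policy values; a real deployment would restrict the origin
    allow_origin = "*"
    allow_methods = "GET, POST, PUT, DELETE, OPTIONS"

    def process_response(self, request, response):
        # Django's HttpResponse supports item assignment for setting headers
        response["Access-Control-Allow-Origin"] = self.allow_origin
        response["Access-Control-Allow-Methods"] = self.allow_methods
        return response
```

The class only relies on item assignment, so it can be tried out with a plain dict standing in for the response object. Note that browsers reject a wildcard origin on requests that carry cookies, so a cookie-authenticated API would need to echo a specific allowed origin instead.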

The AJAX way!

$.newt_ajax({
    url: "/queue/hopper/",
    type: "POST",
    data: {"jobfile": filename},
    success: function(data){
        $("#output").append(data.jobid);
    }
});

This is a jQuery JavaScript function that calls the NEWT API. NEWT returns a JSON object that looks like:

{"status": "OK", "error": "", "jobid": "hop1234.id"}
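
A non-browser client would handle the same JSON reply along these lines. This is a sketch only: the helper `parse_job_response` and the exception class are hypothetical, and it assumes that any `status` other than "OK" signals failure, with the `error` field carrying the message.

```python
import json


class JobSubmissionError(Exception):
    """Raised when the reply reports a status other than "OK"."""


def parse_job_response(text):
    """Return the job id from a submission reply, or raise on error."""
    reply = json.loads(text)
    # Assumption: any status other than "OK" means the submission failed
    if reply.get("status") != "OK":
        raise JobSubmissionError(reply.get("error") or "unknown error")
    return reply["jobid"]
```

Given the sample reply above, the function returns the string `hop1234.id`.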

NERSC’s Online VASP Application

What it all means …

• Create an API that allows science groups to build custom web applications

• A simple RESTful API makes it very easy for science groups to build science-specific interfaces to data and computing

Science-As-A-Service

Questions?

https://newt.nersc.gov for examples*, tutorial etc.

(*you will need a NERSC account for most examples)

Contact:

Shreyas Cholia: [email protected]

Annette Greiner: [email protected]

The End

Code Samples

job_list template

{% for job in all_jobs %}

. . .

<tr class="completed">
  <td><a href="{% url nova.main.views.view_job job.id %}">{{ job.jobname }}</a></td>
  <td>{% if job.time_submitted %}{{ job.time_submitted|date:"M j, Y g:i A" }}{% else %}-{% endif %}</td>
  <td>Completed</td>
  <td class="buttons">
    <a href="" onclick="show_copy_dialog({{job.id}});return false;">Copy</a>
    <a href="" onclick="show_move_dialog({{job.id}}, '{{job.jobname}}');return false;">Move</a>
    <a href="" onclick="show_del_dialog({{job.id}});return false;">Delete</a>
    <a href="{% url nova.main.views.view_convergence job.id %}">Convergence</a>
  </td>
  <td class="hidden">3</td>
</tr>

{% endfor %}

newt.status.models.py

from json import JSONDecoder

import httplib2

class Status(object):
    @classmethod
    def get(cls, machine_name=None):
        base_url = _settings.STATUS_URL
        # Note: with machine_name=None the query string is literally "system=None"
        url = '%s?%s=%s' % (base_url, 'system', machine_name)
        conn = httplib2.Http()
        response, content = conn.request(url, 'GET')
        httpstatus = int(response['status'])
        # content looks like {"system":"carver", "status":"up"}
        jd = JSONDecoder().decode(content)
        od = {'httpstatus': httpstatus, 'machine_info': jd}
        return od

view_job view

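from operator import itemgetter

from django.http import HttpResponseBadRequest
from django.shortcuts import render_to_response
from django.template import RequestContext

def view_job(request, jobid, *args, **kwargs):
    j = Job.objects.get(id=jobid)
    try:
        # get_dir reads its path from the 'dir' keyword argument
        dir_info = j.get_dir(dir=j.jobdir)
    except IOError as ex:
        return HttpResponseBadRequest("File Not Found: %s" % str(ex))
    # sort by filename
    dir_info = sorted(dir_info, key=itemgetter('name'))
    return render_to_response('main/job_view.html',
                              {'job_name': j.jobname,
                               'job_id': jobid,
                               'job_jobdir': j.jobdir,
                               'dir_info': dir_info,
                               'pbs_id': j.pbsjobid,
                               'machine': j.machine},
                              context_instance=RequestContext(request))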

get_dir in the job model

def get_dir(self, *args, **kwargs):
    """
    >>> j.get_dir()
    [{listing1},{listing2},]
    """
    if 'dir' in kwargs:
        path = kwargs['dir']
    else:
        path = self.jobdir
    cookie_str = self.user.cookie
    url = '/file/%s%s' % (self.machine, path)
    response, content = util.newt_request(url, 'GET', cookie_str=cookie_str)
    if response['status'] != '200':
        raise IOError(content)
    dir_info = JSONDecoder().decode(content)
    return dir_info

newt_request in util

import urllib

import httplib2
from django.conf import settings

def newt_request(url, req_method, params=None, cookie_str=None):
    newt_base_url = getattr(settings, 'NEWT_BASE_URL')
    full_url = newt_base_url + url
    # Certificate validation is disabled here; only appropriate for testing
    conn = httplib2.Http(disable_ssl_certificate_validation=True)

    # Massage inputs
    if cookie_str:
        headers = {'Cookie': cookie_str}
    else:
        headers = None
    if isinstance(params, dict):
        # Form-encode key/value pairs
        body = urllib.urlencode(params)
    elif isinstance(params, basestring):
        # Already-encoded payload (str or unicode), sent as-is
        body = params
    else:
        body = None
    logger.debug("NEWT: %s %s" % (req_method, full_url))
    response, content = conn.request(full_url, req_method, body=body, headers=headers)
    logger.debug("NEWT response: %s" % response.status)

    return (response, content)
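
The slide code targets Python 2 (`urllib.urlencode`, `unicode`). For readers on Python 3, the input-massaging step of `newt_request` might look like the sketch below; the helper name `prepare_request` is hypothetical, and the rules simply mirror the ones in the function above.

```python
from urllib.parse import urlencode


def prepare_request(params=None, cookie_str=None):
    """Return (body, headers) following the same rules as newt_request."""
    # Send the session cookie only when one is available
    headers = {"Cookie": cookie_str} if cookie_str else None
    if isinstance(params, dict):
        body = urlencode(params)   # form-encode key/value pairs
    elif isinstance(params, str):
        body = params              # already-encoded payload, sent as-is
    else:
        body = None
    return body, headers
```

For example, a dict such as `{"jobfile": "run.pbs"}` becomes the form-encoded body `jobfile=run.pbs`, while a string payload passes through unchanged.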

job_view template

<h2>Files</h2>
{% for fileline in dir_info %}
  {% if 'd' not in fileline.perms %}
    <p><a href="{% url nova.main.views.get_file job_id fileline.name %}">{{ fileline.name }}</a></p>
  {% endif %}
{% endfor %}
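
The filtering that this template performs can also be expressed in plain Python, which makes the logic easy to test. This is a sketch under assumptions: the function name `visible_files` is hypothetical, and the `perms` strings are assumed to be UNIX-style mode strings such as `drwxr-xr-x`, which the slides do not show.

```python
from operator import itemgetter


def visible_files(dir_info):
    """Keep non-directory entries of a listing, sorted by name."""
    # Same test as the template: skip entries whose perms contain 'd'
    files = [entry for entry in dir_info if 'd' not in entry['perms']]
    return sorted(files, key=itemgetter('name'))
```

Each entry is a dict with at least the `name` and `perms` keys used by the view and the template.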
