36
Chathuri Wimalasena, Suresh Marru - Apache Airavata (Slide Contributions) Randy Abernethy - Apache Thrift

RESTLess Design with Apache Thrift: Experiences from Apache Airavata

  • Upload
    smarru

  • View
    575

  • Download
    1

Embed Size (px)

DESCRIPTION

Apache Airavata is software for providing services to manage scientific applications on a wide range of remote computing resources. Airavata can be used by both individual scientists to run scientific workflows as well as communities of scientists through Web browser interfaces. It is a challenge to bring all of Airavata’s capabilities together in the single API layer that is our prerequisite for a 1.0 release. To support our diverse use cases, we have developed a rich data model and messaging format that we need to expose to client developers using many programming languages. We do not believe this is a good match for REST style services. In this presentation, we present our use and evaluation of Apache Thrift as an interface and data model definition tool, its use internally in Airavata, and its use to deliver and distribute client development kits.

Citation preview

Page 1: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

Chathuri Wimalasena, Suresh Marru - Apache Airavata

(Slide Contributions) Randy Abernethy - Apache Thrift

Page 2: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

40%  discount  coupon  on  any  of  Manning  Publications   Coupon  code  -­‐‑  ********  

(listen  through  the  end  for  decryption  key)  

A  free  ebook:  The  Programmer’s  Guide  to  Apache  Thrift,  Manning  Publications  Co.

(listen  through  the  end  and  impress  us  to  win)    

Red Alert: FREE Stuff ….FREE as in

*  Image  Source:  hHp://blog.law.cornell.edu/voxpop/files/2012/05/VOX.Chicken.Crossing_FreeBeer.jpg

Page 3: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

Outline �  Introduction

�  Apache  Thrift   �  Apache  Airavata

� Motivation  for  Airavata  to  explore  Thrift � Airavata  API � Road  to  Airavata  1.0 � Airavata  Thrift  based  architecture � Experience  &  lessons  learned � Discussion,  Q  &  A

Page 4: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

� Modern  distributed  applications  are  rarely  composed  of  modules  wriHen  in  a  single  language

� Weaving  together  innovations  made  in  a  range  of  languages  is  a  core  competency  of  successful  enterprises

� Cross  language  communications  are  a  necessity,  not  a  luxury

Polyglotism

**  Figure  source:  The  Programmer’s  Guide  to  Apache  Thrift,  Manning  Publications  Co. *  Slide  source:  Randy  Abernethy.

Page 5: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

�  A  high  performance,  scalable  cross  language  serialization  and  RPC  framework �  What?

�  Full  RPC  Implementation  -­‐‑  Apache  Thrift  supplies  a  complete  RPC  solution:    clients,  servers,  everything  but  your  business  logic

�  Modularity  -­‐‑  Apache  Thrift  supports  plug-­‐‑in  serialization  protocols:    binary,  compact,  json,  or  build  your  own

�  Multiple  End  Points  –  Plug-­‐‑in  transports  for  network,  disk  and  memory  end  points,  making  Thrift  easy  to  integrate  with  other  communications  and  storage  solutions  like  AMQP  messaging  and  HDFS

�  Performance  -­‐‑  Apache  Thrift  is  fast  and  efficient,  solutions  for  minimal  parsing  overhead  and  minimal  size  

�  Reach  -­‐‑  Apache  Thrift  supports  a  wide  range  of  languages  and  platforms:  Linux,  OSX,  Windows,  Embedded  Systems,  Mobile,  Browser,  C++,  Go,  PHP,  Erlang,  Haskell,  Ruby,  Node.js,  C#,  Java,  C,  OCaml,  ObjectiveC,  D,  Perl,  Python,  SmallTalk,  …

�  Flexibility  -­‐‑  Apache  Thrift  supports  interface  evolution,  that  is  to  say,  CI  environments  can  roll  new  interface  features  incrementally  without  breaking  existing  infrastructure.

Apache Thrift

*  Slide  source:  Randy  Abernethy.

Page 6: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

1.  Define  the  service  interface  in  IDL 2.  Compile  the  IDL  to  generate  client/server  RPC  stubs  in  the  desired  

languages 3.  Call  the  remote  functions  as  if  they  were  local  using  the  client  stubs 4.  Connect  the  server  stubs  to  the  desired  implementation 5.  Choose  a  prebuilt  Apache  Thrift  server  to  host  your  service

thrift IDL

Page 7: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

�  Streaming  –  Communications  characterized  by  an  ongoing  flow  of  bytes  from  a  server  to  one  or  more  clients. �  Example:  An  internet  radio  broadcast  where  the  client  

receives  bytes  over  time  transmiHed  by  the  server  in  an  ongoing  sequence  of  small  packets.  

�  Messaging  –  Message  passing  involves  one  way  asynchronous,  often  queued,  communications,  producing  loosely  coupled  systems. �  Example:  Sending  an  email  message  where  you  may  get  a  

response  or  you  may  not,  and  if  you  do  get  a  response  you  don’t  know  exactly  when  you  will  get  it.

�  RPC  –  Remote  Procedure  Call  systems  allow  function  calls  to  be  made  between  processes  on  different  computers. �  Example:  An  iPhone  app  calling  a  service  on  the  Internet  

which  returns  the  weather  forecast.

Comm schemes Apache  Thrift  is  an  efficient  cross  platform  

serialization  solution  for  streaming  interfaces

Apache  Thrift  provides  a  complete  RPC  framework

**  Figure  source:  The  Programmer’s  Guide  to  Apache  Thrift,  Manning  Publications  Co. *  Slide  source:  Randy  Abernethy.

Page 8: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

�  User  Code �  client  code  calls  RPC  methods  and/or  [de]serializes  objects

�  service  handlers  implement  RPC  service  behavior

�  Generated  Code �  RPC  stubs  supply  client  side  proxies  and  server  side  processors

�  type  serialization  code  provides  serialization  for  IDL  defined  types

�  Library  Code �  servers  host  user  defined  services,  managing  connections  and  concurrency

�  protocols  perform  serialization �  transports  move  bytes  from  here  to  there

Architecture

**  Figure  source:  The  Programmer’s  Guide  to  Apache  Thrift,  Manning  Publications  Co. *  Slide  source:  Randy  Abernethy.

Page 9: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

� The  Thrift  framework  was  originally  developed  at  Facebook  and  released  as  open  source  in  2007.  The  project  became  an  Apache  Software  Foundation  incubator  project  in  2008,  after  which  four  early  versions  were  released.

�  0.9.1 released  2013-­‐‑07-­‐‑16 �  0.9.2 Planned  2014-­‐‑06-­‐‑01 �  1.0.0 Planned  2015-­‐‑01-­‐‑01

Thrift Roadmap

it  is  difficult  to  make  predictions,  particularly  about  the  future. -­‐‑-­‐‑  Mark  Twain

Open  Source Community  Developed Apache  License  Version  2.0

**  Figure  source:  The  Programmer’s  Guide  to  Apache  Thrift,  Manning  Publications  Co. *  Slide  source:  Randy  Abernethy.

Page 10: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

� Web �  thrift.apache.org �  github.com/apache/thrift

� Mail � Users:  user-­‐‑[email protected] � Developers:  dev-­‐‑[email protected]

� Chat �  #thrift

� Book � Abernethy  (2014),  The  Programmer’s  Guide  to  Apache  Thrift,  Manning  Publications  Co.      [hHp://www.manning.com/abernethy/]

Thrift Resources

Chapter  1  is  free

*  Slide  source:  Randy  Abernethy.

Page 11: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

Knowledge and Expertise

Computational Resources Scientific Instruments Algorithms and

Models Archived Data and

Metadata

Advanced Science Tools

Enabling & Democratizing Scientific Research Science Gateways:

Page 12: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

Few Noted Examples:

Page 13: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

Airavata  API

Security

Launch  &  Manage    Jobs

Save/Load  configurations  and  provenance  data

Notify  progress  of  job  or  workflow  execution

Real-­‐‑Time  Monitoring

Execute  &  Manage  Computations

Data  Ingest  &  Discovery Data  Movement Data  &  Provenance  Subsystem

Job  &  Workflow  Subsystem

Messaging  Subsystem

i Information  Subsystem

CIPRES

Neuro  Science

Ultrascan

BioVLAB

GAAMP

Param Chem

Academic  &  Commercial  

Clouds

Inter-­‐‑ national  Grids

Apache Airavata

Page 14: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

Apache Airavata

Page 15: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

Translating Science Gateway Language...

Page 16: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

Interactivity through layers

Page 17: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

Mathema'cal* Domain* Resource* Model**Refinement*

Uncertain)es+

Orchestra)on+level+Interac)ons+

Workflow**Steering*

Parametric*Sweeps*

Provenance*

Job+Level+Interac)ons+

Job*launch,*gliding*

Checkpoint/**Restart*

Asynchronous++refinements+

Applica)on+Steering+

Challenges the API

Page 18: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

Airavata API in a nutshell � Airavata  provides  a  API  to  science  gateway  developers   �  Science  gateways  =>  diverse  requirements � Airavata  API  needs  to  be  

�  Easy  to  understand  and  simple �  Generic  enough  to  support  different  gateways �  Existing  science  gateway  integration  should  be  smooth �  Flexible  for  changes  arising  with  different  requirements  of  science  gateways

Page 19: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

Earlier Airavata Architecture

(a simplified view - evolved over 10 years)

Page 20: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

Road to Airavata 1.0 �  First  Approach  -­‐‑  SOAP  /  WS

�  Pros   �  ability  to  separate  out  context  and  the  payload �  Already  proven  tools  and  techniques  (XML,  WSDL,  WS-­‐‑Security  etc) �  Clients  can  easily  generate  stubs  using  WSDLs

�  Cons   �  Heavy  weight �  Data  schema  changes  cannot  be  handled  easily �  Could  not  support  broad  range  of  clients

Page 21: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

Contd.. Road to Airavata 1.0 �  Second  Approach  -­‐‑  REST

�  Pros   �  Flexible  for  data  representation  (JSON  or  XML) �  Light  weight �  BeHer  performance

�  Cons   �  Multiple  object  layers       �  No  standard  way  to  describe  the  service  to  the  client

Page 22: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

Key issues with earlier Airavata API � Only  java  clients  were  developed

�  users  had  to  write  their  own  client  for  other  languages � API  design  was  a  hybrid  model  

�  Makes  it  more  complicated  internally � Complex  data  objects  are  overwhelming  to  work  in  REST

�  Lot  of  marshalling  /  unmarshalling  of  data  objects � Unnecessary  need  for  a  servlet  container � Overwhelming  number  of  methods  with  lots  of  overloaded  methods

Page 23: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

Contd..Road to Airavata 1.0 � Third  Approach  -­‐‑  RPC  style  tools  (Apache  Thrift,  Protocol  Buffers,  Apache  Avro)

�  Ability  to  communicate  easily  across  different  programming  languages �  Easy  way  to  generate  server  and  client  code

Page 24: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

Why Apache Thrift � Apache  project �  For  Airavata  users,  a  client  library  that  can  consume  the  API  is  more  important  than  an  open  API  like  REST

� Light  framework �  No  multiple  dependencies  and  servlet  containers.

� Easy  learning  curve  to  get  started �  If  carefully  crafted,  the  IDLs  and  framework  support  backward  and  forward  compatibility

Page 25: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

Clean  way  to  define  IDLs  with  richer  data  structures

Page 26: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

Apache Thrift Integration with Airavata

●  Airavata  API  server  and  Orchestrator  are  thrift  services  at  the  moment

●  Other  components  will  also  become  thrift  services  in  the  future

Page 27: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

Apache Thrift Integration with Airavata

Page 28: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

Experience: What we gain � Able  to  support  clients  wriHen  in  different  languages � Clean  design � Having  different  versions  of  servers

�  TSimpleServer  -­‐‑  Simple  single  threaded  server �  TThreadPoolServer  -­‐‑  Uses  Java'ʹs  built  in  ThreadPool  management �  TNonblockingServer  -­‐‑  non-­‐‑blocking  TServer  implementation �  THsHaServer  -­‐‑  extension  of  the  TNonblockingServer  to  a  Half-­‐‑Sync/Half-­‐‑Async  server

Page 29: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

Experience Contd.. What we gain

� No  need  of  marshalling  /  unmarshalling  -­‐‑  data  models  used  internally �  Server  is  very  robust  -­‐‑  able  to  handle  large  number  of  concurrent  requests.  

� Easy  to  do  modifications  to  the  models,  since  code  is  auto-­‐‑generated � Convenient  way  to  achieve  backward  compatibility

Page 30: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

Lessons Learned � Modularize  the  data  models  in  order  to  ease  the  maintenance

� No  maven  plugin  at  the  moment,  hence  you  have  to  commit  auto-­‐‑generated  code.

Page 31: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

Lessons Learned Contd..

Page 32: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

Lessons Learned Contd..

� Thrift  has  limited  support  to  handle  null  values. �  Experiment  model  is  a  complex  model  with  lot  of  other  structs. �  Enums  cannot  be  passed  as  null  over  the  wire  even  they  are  specified  as  optional  in  the  struct

� Thrift  has  limited  documentation  and  samples �  airavata  can  be  used  as  a  reference  

�  hHps://github.com/apache/airavata

Page 33: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

Future Directions

� March  towards  Airavata  1.0  with  a  stable  API � Exploring  next  generation  information  model  and  Messaging  Systems

�  Ka{a,  flume,  fluend,  Scribe,  Hedwig…. � Refactor  Registry  Implementation   � Airavata  architecture  to  support  multi-­‐‑tenanted  PAAS

�  Motivated  by  a  downstream  open  source  project  SciGaP  –  Science  Gateways  Platform  as  a  Service  

Page 34: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

How you can participate/follow

�  Join  the  party  -­‐‑  Architecture  mailing  list � BYOX   � X  =  opinions,  software,  ideas,  code,  yourself  ..

Page 35: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

Take Home �  2  e  books  on  thrift �  awareness  of  thrift  and  airavata �  lessons  learned � how  to  get  involved � We  are  under  same  umbrella  for  a  reason,  lets  party  together

�  Join  our  Architecture  List  and  help   �  Other  means??

Page 36: RESTLess Design with Apache Thrift: Experiences from Apache Airavata

Discussion, Q & A