Lightweight construction of rich scientific applications

Daniel Harężlak(1), Marek Kasztelnik(1), Maciej Pawlik(1), Bartosz Wilk(1) and Marian Bubak(1, 2)

(1) Academic Computer Centre Cyfronet AGH, Krakow, Poland
(2) Department of Computer Science, AGH Krakow, Poland

EGI Community Forum 2015, 10-13 November, Bari

Agenda

• Introduction to the PLGrid infrastructure

• Motivation behind providing support for building scientific gateways

• The current state of the art in web development and why we are behind

• How the plgapp platform can address these issues

• Architecture and features of plgapp

• Sample use case

• Conclusions and future work

PL-Grid Consortium

• Consortium creation – January 2007

• a response to requirements from Polish scientists

• due to ongoing Grid activities in Europe (EGEE, EGI_DS)

• Aim: significant extension of the amount of computing resources provided to the scientific community (start of the PL-Grid Programme)

• Development based on:

• projects funded by the European Regional Development Fund as part of the Innovative Economy Program

• close international collaboration (EGI, ….)

• previous projects (5FP, 6FP, 7FP, EDA…)

• National Network Infrastructure available: Pionier National Project

• computing resources: Top500 list

• Polish scientific communities: ~75% of highly rated Polish publications in 5 Communities

PL-Grid Consortium members: 5 High Performance Computing Polish Centres, representing Communities, coordinated by ACC Cyfronet AGH

PLGrid Core project

Competence Centre in the Field of Distributed Computing Grid Infrastructures

• Budget: total 104 949 901,16 PLN, including funding from the EC: 89 207 415,99 PLN

• Duration: 01.01.2014 – 31.11.2015

• Project Coordinator: Academic Computer Centre CYFRONET AGH

The main objective of the project is to support the development of ACC Cyfronet AGH as a specialized competence centre in the field of distributed computing infrastructures, with particular emphasis on grid technologies, cloud computing and infrastructures supporting computations on big data.

PLGrid Core project – services

• Basic infrastructure services

• Uniform access to distributed data

• PaaS Cloud for scientists

• MapReduce-type application maintenance environment

• End-user services

• Technologies and environments implementing the Open Science paradigm

• Computing environment for interactive processing of scientific data

• Platform for development and execution of large-scale applications organized in a workflow

• Automatic selection of scientific literature

• Environment supporting data farming mass computations

We are building plgapp within this task.

Why are scientific gateways built on the web?

• Tailored GUIs allowing researchers to focus on the problem and not on the underlying infrastructure (jobs, GridFTP, etc.)

• The only requirement is a web browser, on any computer (no additional software to install)

• Easy updates (simply refresh the page)

Why do we need more support for building scientific gateways?

• The DRY rule (don’t repeat yourself)

• Each time a new scientific portal is built, a server machine is instantiated to host the web server

• Domain certificates are requested for each of the new domains

• Authentication and authorization are done separately in each case

• For over 20 domains, each providing a few portal services in the PLGrid infrastructure, this becomes a significant effort

• Limited selection of available technologies

• Infrastructure client libraries are available in only a few languages (Java, Python), which limits portal developers

• Direct integration with the WMS or CREAM job management services calls for outdated standards and libraries (Axis, which is over 10 years old)

Can we be more like Web 2.0?

• There are many nice frameworks to pick from to build interactive, rich web applications as scientific portals

• They use open communication standards (e.g. REST over HTTP) allowing for easy integration

• Both server and client-side programming approaches are available (a whole application can be run in the user’s browser)

• The Web 2.0 way has to be made available for computing infrastructures

The plgapp approach to the problem

• Take care of authentication and authorization just once and reuse it

• The first step is for the PLGrid infrastructure to provide a modern authentication system such as OpenID

• Each portal built with plgapp gets its own login page by default without a single line of code

• Assigning application addresses works out-of-the-box with a valid wildcard certificate

• E.g. https://my-new-portal.app.plgrid.pl

• Nowadays a web server is just a plain file repository (HTML, JavaScript and other files), so set up just one instance and reuse it

• A web server is provided for you; there is no need to apply for a server machine

• Handle the whole application deployment process with a control panel

• For each of the portals created with plgapp two execution environments are provided (production and testing)

• Interact with the computing infrastructure through a set of JavaScript libraries

• We provide JavaScript wrappers for the basic services that manage jobs, data transfers and metadata
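
The address assignment described on this slide could be sketched roughly as follows. This is only an illustration, not plgapp's actual code: the function names and the subdomain validation rule are assumptions, and only the `app.plgrid.pl` suffix is taken from the example address above. The point it shows is that one wildcard certificate can serve every portal, since a wildcard covers exactly one DNS label below the shared domain.

```javascript
// Hypothetical helpers: the names and the validation rule are assumptions.
const APP_DOMAIN = "app.plgrid.pl"; // suffix taken from the example address

// A single DNS label: 1-63 characters, lowercase letters, digits and hyphens,
// with no leading or trailing hyphen.
function isValidSubdomain(name) {
  return /^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$/.test(name);
}

// Build the public address of a portal from its chosen name.
function applicationUrl(name) {
  if (!isValidSubdomain(name)) {
    throw new Error(`Invalid application subdomain: ${name}`);
  }
  return `https://${name}.${APP_DOMAIN}`;
}

// True when the host sits exactly one label below the shared domain,
// which is what a wildcard certificate for that domain covers.
function coveredByWildcard(host) {
  const suffix = `.${APP_DOMAIN}`;
  if (!host.endsWith(suffix)) return false;
  return isValidSubdomain(host.slice(0, -suffix.length));
}

console.log(applicationUrl("my-new-portal"));
console.log(coveredByWildcard("my-new-portal.app.plgrid.pl"));
```

Because no per-portal certificate request is involved, creating a new address reduces to validating a name.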

Architecture of plgapp – libraries

• No additional dependencies (and thus no technology lock-in) are required to communicate with the computing infrastructure

• Submitting jobs, running remote processes, and managing files and metadata are done via well-documented JavaScript APIs

• The set of JavaScript libraries is easily extendable to support new capabilities of basic services

• Security is ensured by proxying all requests through plgapp (proxy delegation works out-of-the-box)
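
To make the idea of a thin JavaScript wrapper over a proxied service concrete, here is a minimal sketch. It is not the real plgapp API: the `/proxy/jobs` path, the payload fields and the function names are all hypothetical. It only illustrates the pattern from the bullets above, in which the browser sends a same-origin request and leaves credential delegation to the platform.

```javascript
// Sketch only: NOT the real plgapp library.
// The "/proxy/jobs" path and the payload fields are hypothetical.
function buildJobRequest(job) {
  if (!job || typeof job.script !== "string" || job.script.trim() === "") {
    throw new Error("A job needs a non-empty script");
  }
  return {
    url: "/proxy/jobs", // relative, same-origin: the platform forwards it
    options: {
      method: "POST",
      credentials: "same-origin", // send the portal session cookie
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ host: job.host, script: job.script }),
    },
  };
}

// fetchImpl is injectable so the wrapper can be exercised without a network.
async function submitJob(job, fetchImpl = fetch) {
  const { url, options } = buildJobRequest(job);
  const response = await fetchImpl(url, options);
  if (!response.ok) {
    throw new Error(`Job submission failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

In this sketch the portal code never handles a certificate: it would call something like `submitJob({ host: "cluster.example.org", script: "echo hello" })` (placeholder values) and receive whatever JSON the proxied service returns.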

Architecture of plgapp – application implementation

• Each plgapp application constitutes an isolated scientific portal with two execution environments

• Production environment used by regular users

• Testing environment used to validate new functionalities and by early adopters

• Application files are developed locally and synchronized via a web form or with a Dropbox client

• Ensures a tight implement-save-run loop for better programmer productivity

• Promoting the testing environment to production is controlled via a web panel
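
The relationship between the two environments can be illustrated with a small in-memory model. This is a hypothetical sketch, not how plgapp stores applications: the function names and the use of a `Map` per environment are assumptions. It shows the behaviour the bullets describe, where synchronization only affects the testing environment and production changes only on promotion.

```javascript
// Hypothetical in-memory model; the real platform keeps files server-side.
function createApplication(name) {
  return { name, testing: new Map(), production: new Map(), log: [] };
}

// Synchronization (web form or Dropbox on the slide) only touches testing.
function syncFile(app, path, content) {
  app.testing.set(path, content);
  app.log.push(`synced ${path}`);
}

// Promotion replaces production with a snapshot of the testing environment.
function promote(app) {
  app.production = new Map(app.testing);
  app.log.push(`promoted ${app.testing.size} file(s)`);
}

const app = createApplication("my-new-portal");
syncFile(app, "index.html", "<h1>v1</h1>");
promote(app);
// A later edit is visible to early adopters only, until the next promotion.
syncFile(app, "index.html", "<h1>v2</h1>");
console.log(app.production.get("index.html"), app.testing.get("index.html"));
```

Copying the map on promotion, rather than sharing it, is what keeps regular users on the last promoted version while development continues.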

Architecture of plgapp – security

• Each portal has a customizable login page by default

• During user login, a proxy certificate is retrieved from the OpenID server and used for delegation behind the scenes

• Granting access to individual portals is accomplished at the level of the centralized PLGrid user portal

• The only requirement on the user side is to have a modern browser installed (Firefox, Chrome, etc.)

Control panels of plgapp

• Application deployment pipeline controlled entirely via a web interface

• Creating new applications, including address assignment

• Promoting testing environment to production

• Viewing application activity log

• Integration with Dropbox

• Customizing the login page

• Online documentation

MCB – one of the plgapp use cases

• MCB – nuclear reactor simulations

• Portal implemented using the plgapp platform

• The work focused only on the graphical user interfaces, with all infrastructure integration already in place

• Application prototypes were delivered and validated by domain experts on a daily basis, with a first working version available after a single day

Conclusions and future work

• Summary

• Plgapp supports the whole scientific application deployment pipeline

• Computing infrastructure work is minimized by reusing off-the-shelf components

• No more technology limitations on what can be used to build modern scientific portals

• Built on top of existing basic services, which increases the reuse factor even more

• Plgapp can be integrated into other computing infrastructures

• Future work

• Maintain the service

• Extend the set of JavaScript libraries if necessary

Further contact information

Our web page

http://dice.cyfronet.pl

The PLGrid portal

https://portal.plgrid.pl

Cyfronet’s web page

http://cyfronet.pl