Using Django for a scientific document analysis (web) application

AmCAT3

Using Django for a scientific document analysis website: Tastypie, unit tests, R, open platforms and open questions

Wouter van Atteveldt (VU Amsterdam)

AmCAT

What is AmCAT?

Design considerations

Open data and the publication cycle

Tables, Tastypie, and R

Unit tests

What is AmCAT?

Document management and analysis

Aimed at social sciences and humanities
Input: scraping, uploading
Management: projects, selections
Analyses: keyword analysis, linguistic processing (lemmatizing etc.), manual annotation

Open source, open standards, open access

Design Choices

Default Django: web site backed by a database
AmCAT: database with a web front end

Data should be accessible from outside
ORM should be usable without web site code
DB should be the final authority for authentication/authorisation

Design choices

Separate 'apps' for business, presentation
Custom authentication middleware and user management
    save() and update() with using=
    database-specific code for creating users
    We don't actually like this too much...

All data and methods (should be) exposed through web service API

Open data and Publication Cycle

Relational DB
ORM (Django)
AmCAT Navigator (web site)
REST API (web service)
SPARQL endpoint
External scripts (Python, R, ...)

Open access publication cycle

Source: DANS / AmCAT3 ((linked) data)
Analysis: R, MATLAB, ...
Publication: PDF + hyperlinks; structured data?
Links back to web service
'Data link' from site
e.g. Sweave + LaTeX

Tastypie + Datatables

Django model-based REST API
jQuery DataTables with AJAX call
The good news: it works
    Unified point of entry for tables in website and scripts
The bad news: Tastypie code horribly redundant
    (Unless we're doing it wrong!)
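
To illustrate the "unified point of entry" for scripts, here is a minimal sketch of an external Python script reading all rows from a Tastypie-style list endpoint. It relies only on Tastypie's default list format (a "meta" block with a "next" link plus an "objects" list). The base URL and resource path in the usage note are made up for illustration and are not AmCAT's actual API.

```python
"""Sketch: walk a paginated Tastypie-style list response from a script."""
import json
from urllib.parse import urljoin
from urllib.request import urlopen


def fetch_json(url):
    """Fetch one page over HTTP and decode the body as JSON."""
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def iter_objects(base_url, path, fetch=fetch_json):
    """Yield every object of a list endpoint, following meta.next links.

    `fetch` is injectable so the paging logic can be tested offline.
    """
    url = urljoin(base_url, path)
    while url:
        page = fetch(url)
        for obj in page["objects"]:
            yield obj
        # Tastypie's default paginator sets meta.next to null on the last page.
        next_path = page["meta"].get("next")
        url = urljoin(base_url, next_path) if next_path else None
```

A script might then call `list(iter_objects("http://example.org", "/api/v1/article/?format=json"))`, where both the host and the `article` resource are hypothetical. The same endpoint can feed the DataTables AJAX call on the website.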

Unit tests

Web pages tough to test well
    Move as much code as possible from presentation to business layer
    Trivial views need less testing
    Regular Python modules easy to test
Our choices:
    Put all unit tests in the 'target' module
    Put more complicated integration tests in tests/ package
    'Policy' superclass to check pylint, license of target module
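
The choices above could look roughly like the sketch below: a business-layer function with its unit test in the same module, and a 'policy' base class that adds a license check to every test case derived from it. The names `PolicyTestCase`, `has_license`, `count_words` and the license marker are assumptions for illustration, not AmCAT's actual code, and the pylint check is left out.

```python
"""Sketch: tests living next to the code, plus a 'policy' superclass."""
import types
import unittest

LICENSE_MARKER = "License:"  # assumed marker; a real project would pick its own


def has_license(module, marker=LICENSE_MARKER):
    """Return True if the module docstring mentions the license marker."""
    return marker in (module.__doc__ or "")


class PolicyTestCase(unittest.TestCase):
    """Base class: subclasses inherit a license check on their target module."""
    target = None  # in a real module: sys.modules[__name__]

    def test_policy_license(self):
        if self.target is None:
            self.skipTest("abstract policy test case")
        self.assertTrue(has_license(self.target), "missing license notice")


def count_words(text):
    """Business-layer code: plain Python, testable without a web request."""
    return len(text.split())


class TestCountWords(PolicyTestCase):
    # Stand-in for the real target module, so the sketch is self-contained.
    target = types.ModuleType("wordcount", "Word counting. License: example")

    def test_count_words(self):
        self.assertEqual(count_words("open source, open access"), 4)
```

Because `count_words` needs no request or template, the test stays a regular `unittest` case that any runner can pick up.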

Bonus slide: Plugins

Django (model)forms as interface description for plugins

Plugins callable from web site, as web service, and from CLI

Single point of entry for actions (relation with REST data modification?)
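
A rough sketch of the idea follows: one field declaration drives both a web-service call and a command-line call. To stay runnable without Django, a plain `fields` dictionary stands in for the Django (model)form, and `CountPlugin`, `clean`, `call_from_service` and `call_from_cli` are hypothetical names, not AmCAT's plugin API.

```python
"""Sketch: one interface description, several entry points."""
import argparse


class CountPlugin:
    """Hypothetical plugin; `fields` stands in for a Django form's fields."""
    fields = {"query": str, "limit": int}

    def run(self, query, limit):
        return "{} (first {} hits)".format(query, limit)


def clean(plugin, raw):
    """Validate and convert raw string input, much like a form's cleaning step."""
    missing = [name for name in plugin.fields if name not in raw]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return {name: convert(raw[name]) for name, convert in plugin.fields.items()}


def call_from_service(plugin, params):
    """Web service entry point: params is a dict of request parameters."""
    return plugin.run(**clean(plugin, params))


def call_from_cli(plugin, argv):
    """CLI entry point: the argument parser is derived from the same fields."""
    parser = argparse.ArgumentParser()
    for name in plugin.fields:
        parser.add_argument("--" + name, required=True)
    return plugin.run(**clean(plugin, vars(parser.parse_args(argv))))
```

With a real Django form, the `fields` dictionary and `clean` would be replaced by the form's own fields and validation, and the web site could render the same form as HTML.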