AmCAT3
Using Django for a scientific document analysis website: Tastypie, unit tests, R, open platforms and open questions
Wouter van Atteveldt (VU Amsterdam)
AmCAT
What is AmCAT?
Design considerations
Open data and the publication cycle
Tables, TastyPie, and R
Unit tests
What is AmCAT?
Document management and analysis
Aimed at social sciences and humanities
Input: scraping, uploading
Management: projects, selections
Analyses: keyword analysis, linguistic processing (lemmatizing etc.), manual annotation
Open source, open standards, open access
Design Choices
Default Django: web site backed by a database
AmCAT: database with a web front end
Data should be accessible from outside
ORM should be usable without web site code
DB should be final authentication/authorisation
Design choices
Separate 'apps' for business, presentation
Custom authentication middleware and user management
save() and update() with using=
Database-specific code for creating users
We don't actually like this too much...
All data and methods (should be) exposed through web service API
Open data and Publication Cycle
Relational DB
ORM (Django)
AmCAT Navigator (web site)
REST API (web service)
SPARQL endpoint
External scripts (Python, R, ...)
Open access publication cycle
Source: DANS/AmCAT3
(Linked) data
Analysis: R, MATLAB, ...
Publication: PDF + hyperlinks
Structured data?
Links back to web service
'data link' from site
e.g. Sweave + LaTeX
Tastypie + Datatables
Django model-based REST API
jQuery DataTables with AJAX call
The good news: it works
Unified point of entry for tables in website and scripts
The bad news: Tastypie code horribly redundant (unless we're doing it wrong!)
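The slides do not show how the Tastypie output is wired to the table widget, so the following is only a sketch of one possible glue layer. It assumes Tastypie's default list envelope (`meta` with `total_count`, plus `objects`) and the server-side reply keys used by DataTables 1.10 and later (`draw`, `recordsTotal`, `recordsFiltered`, `data`); older DataTables releases use different key names. The function names and the sample article fields are hypothetical.

```python
def datatables_to_tastypie_params(start, length):
    """Map DataTables paging parameters onto Tastypie's offset/limit query."""
    return {"offset": int(start), "limit": int(length), "format": "json"}


def tastypie_to_datatables(response, draw):
    """Convert a decoded Tastypie list response into a DataTables reply.

    `response` is the parsed JSON body: {"meta": {...}, "objects": [...]}.
    `draw` is the request counter that DataTables expects to be echoed back.
    """
    meta = response.get("meta", {})
    rows = response.get("objects", [])
    total = meta.get("total_count", len(rows))
    return {
        "draw": int(draw),
        "recordsTotal": total,
        # This sketch applies no search filter, so both counts are equal.
        "recordsFiltered": total,
        "data": rows,
    }


# Hypothetical payload, shaped like a Tastypie list response.
sample = {
    "meta": {"limit": 2, "offset": 0, "total_count": 5,
             "next": None, "previous": None},
    "objects": [{"id": 1, "headline": "First"},
                {"id": 2, "headline": "Second"}],
}
reply = tastypie_to_datatables(sample, draw="3")
```

Because external scripts can read the same `objects` list directly, the table on the web site and an R or Python script share one point of entry; only the thin envelope conversion is specific to the web page.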
Unit tests
Web pages tough to test well
Move as much code as possible from presentation to business layer
Trivial views need less testing
Regular Python modules easy to test
Our choices:
Put all unit tests in the 'target' module
Put more complicated integration tests in tests/ package
'Policy' superclass to check pylint, license of target module
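A minimal sketch of what these choices could look like, using only the standard library. Everything here is illustrative: `count_keyword`, `PolicyTestCase`, and `MODULE_DOC` are hypothetical names, the license check simply looks for the word "license" in a docstring, and the pylint check mentioned on the slide is left out.

```python
import unittest


def count_keyword(text, keyword):
    """Business-layer helper: count whitespace-separated tokens equal to
    `keyword`, ignoring case."""
    return text.lower().split().count(keyword.lower())


class PolicyTestCase(unittest.TestCase):
    """Hypothetical 'policy' superclass: each subclass inherits a check of
    house rules for the module under test (here: a license notice)."""

    target_doc = None  # subclasses set this to the target module's docstring

    def test_policy_license(self):
        if self.target_doc is None:
            self.skipTest("no target module configured")
        self.assertIn("license", self.target_doc.lower())


# Stand-in for the docstring of the module under test.
MODULE_DOC = "Keyword counting helpers. License: see LICENSE file."


class TestCountKeyword(PolicyTestCase):
    """Unit tests living in the same module as the code they test."""

    target_doc = MODULE_DOC

    def test_counts_case_insensitively(self):
        self.assertEqual(count_keyword("Tax cuts and tax hikes", "tax"), 2)

    def test_missing_keyword(self):
        self.assertEqual(count_keyword("no match here", "tax"), 0)
```

Saved as a module, this can be run with `python -m unittest <module name>`. Since `count_keyword` is a plain function with no view or request involved, the test needs no web client at all.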
Bonus slide: Plugins
Django (model)forms as interface description for plugins
Plugins callable from web site, as web service, and from CLI
Single point of entry for actions (relation with REST data modification?)
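The idea of one interface description serving several entry points can be sketched without Django. In this sketch a plain `fields` dictionary stands in for the Django (model)form named on the slide, so it is a swapped-in simplification and not AmCAT's actual plugin code; `Plugin`, `CountWords`, and all method names are hypothetical.

```python
import argparse


class Plugin:
    """Base class: `fields` describes the inputs once; all entry points reuse it."""

    # option name -> (type, help text); a stand-in for a Django form definition
    fields = {}

    def clean(self, raw):
        """Validate and convert raw input, roughly what a form's cleaning does."""
        missing = [name for name in self.fields if name not in raw]
        if missing:
            raise ValueError("missing fields: " + ", ".join(missing))
        return {name: typ(raw[name]) for name, (typ, _help) in self.fields.items()}

    def run(self, **options):
        raise NotImplementedError

    def call(self, raw):
        """Single point of entry, usable from a view or a web service handler."""
        return self.run(**self.clean(raw))

    def run_cli(self, argv):
        """Build a command-line parser from the same field description."""
        parser = argparse.ArgumentParser(description=self.__doc__)
        for name, (typ, help_text) in self.fields.items():
            parser.add_argument("--" + name, type=typ, required=True, help=help_text)
        return self.call(vars(parser.parse_args(argv)))


class CountWords(Plugin):
    """Hypothetical plugin: count words of at least a given length."""

    fields = {
        "text": (str, "text to analyse"),
        "min_length": (int, "minimum word length"),
    }

    def run(self, text, min_length):
        return sum(1 for word in text.split() if len(word) >= min_length)


plugin = CountWords()
from_service = plugin.call({"text": "open data open access", "min_length": "4"})
from_cli = plugin.run_cli(["--text", "open data open access", "--min_length", "5"])
```

Both calls pass through `call()`, so validation and execution are written once regardless of whether the request arrives from a web form, a web service, or the command line.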