DESCRIPTION
This presentation will be a live exchange of ideas and arguments between a representative of a start-up working on agricultural information management and discovery, and a representative of academia who recently completed his PhD and now leads a young and promising research team. The two presenters will focus on the case of a recommendation service that is going to be part of a web portal for organic agriculture researchers and educators (called Organic.Edunet), which will help users find relevant educational material and bibliography. They are currently developing this as part of an EU-funded initiative, but both would like to find a way to sustain the work further: the start-up by including it in the bundle of services that it offers to the users of its information discovery packages, and the research team by attracting more funding to further explore recommendation technologies. The start-up representative will describe his never-ending, helpless and aimless efforts to include a research activity on recommender systems within the R&D strategy of the company, for the sake of the good-old-PhD-times, and will explain why this failed. The academia representative will describe the great things that his research can do to boost the performance of recommendation services in such portals, and why this does-not-work-yet-operationally: he cannot find real usage data that can prove his amazing algorithm beyond what can be shown in offline lab experiments using datasets from other domains (like MovieLens and CiteULike). Both will explain how they started working together in order to design, experimentally test, and deploy the Organic.Edunet recommendation service, and will describe their expectations from this academic-industry collaboration. Then they will reflect on the challenges they see in such partnerships and how (if) they plan to overcome them.
Je t'aime... moi non plus
Nikos Manouselis, Agro-Know Technologies (Greece)
Christoph Trattner, Know-Center (Austria)
reporting on the opportunities, expectations and challenges of a
real academic-industrial collaboration …blah blah
the characters
Nikos
• MSc, MEng, PhD
• >140 pubs
• 1 post-doc
• 1 hybrid research position
• 5 rejected faculty applications
• Agro-Know as a research hybrid
“meaningful services around high-quality
agricultural data pools”
http://www.agroknow.gr
[Diagram: Agricultural Data Platform. Labels: Unorganized content in local and remote sites; Organized and structured content in local and remote DBs; Ingestion, Translate, Publishing, Harvesting, Blossom, Cultivation, Enrichment; Widgets, Authoring services, Data Discovery Services, Analytics services; Educational, Geographical, Bibliographic]
• Aggregate data from diverse sources
• Work with different types of data
• Prepare data for meaningful services
knowledge aggregation and sharing solutions
Christoph
Born: 24.2.1980, Graz, Austria
• Research Field: Social Computing (so basically my research is centered around Social Network Analysis, Social (Semantic) System design and Social information access)
• Education: Ph.D. (“Dr. techn.”) in Computer Science and an MSc & BSc in Telematics from Graz University of Technology
• Publications (since 2009): 5 Journals, 24 Conf. Papers, 2 Book Chapters; publications in, for example, WWW, HT, ICWE, Wikisym, SocialCom, ASONAM, etc.
• Currently, I am working as Head of the Social Semantic Research Group and Deputy Division Manager at the Know-Center in Graz, Austria
Contact:
Email: [email protected]
Web: http://christophtrattner.info
Twitter: @ctrattner
Christoph’s team
• 1 Post Doc, 5 Pre Docs (1 more will join in Sept.)
• 2 MSc students
• 1 BSc student
DI. Dieter Theiler
DI. Dominik Kowald
Mag. Peter Kraker
Mag. Sebastian Dennerlein
Dr. Elisabeth Lex
Mag. Matthias Rella
Christoph’s collaborators
Organic.Edunet
• outcome of EU project “Organic.Edunet” of eContent+ programme (2007-2010)
• based on network of >10 content providers
• portal maintained & updated by Agro-Know and an academic partner (anAP)
• evolved through EU project “Organic.Lingua” (2011-2014) in collaboration with K-C and anAP
Organic.Edunet Recommendation
• social navigation module exposed through API
– content-based recommendation using tags on resources
– user-based collaborative filtering using multi-criteria ratings
• recommendation of relevant resources within user’s profile
– well-hidden, never used
– module API developed & supported by Agro-Know
– UI & features developed & supported by anAP
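To make "content-based recommendation using tags on resources" concrete, here is a minimal sketch, assuming each resource carries a set of tags and the user's profile is also a set of tags. Jaccard overlap is used as the similarity measure purely for illustration; the slides do not say which measure the module uses, and the function names and toy catalogue are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap between two tag sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_by_tags(
    profile_tags: Set[str],
    resource_tags: Dict[str, Set[str]],
    seen: Set[str],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Rank unseen resources by tag overlap with the user's profile."""
    scored = [
        (rid, jaccard(profile_tags, tags))
        for rid, tags in resource_tags.items()
        if rid not in seen
    ]
    # Highest score first; ties broken by resource id for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    # Keep at most k resources and drop those with no tag overlap at all.
    return [(rid, score) for rid, score in scored[:k] if score > 0]


# Hypothetical toy catalogue, not Organic.Edunet data.
catalogue = {
    "r1": {"soil", "compost"},
    "r2": {"soil", "irrigation", "compost"},
    "r3": {"livestock"},
    "r4": {"compost"},
}
top = recommend_by_tags({"soil", "compost"}, catalogue, seen={"r1"})
```

With this toy data, `r2` ranks above `r4`, and `r3` is dropped because it shares no tag with the profile.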
desired: Organic.Edunet “Suggest”
• a real content discovery service suggesting resources to users
– interactions used as input to train system
– personalised vs. non-personalised version
desired: explore further
• personalising suggestions of related content when users view an item
Agro-Know’s perspective
• a service that can become a plug-and-play product
– working on top of recommendation API
– reusable in all agDiscovery services (sites, portals, apps)
• a service that works, and works well
– tested performance, correct parameters for algorithms in each context
– tested & adaptable UI, to be reused in several deployments
• a service bundle that we can sell to our clients
Nikos’ perspective
• experiment with multi-criteria recommendation
– continue work that started in PhD
– visualisation & UI challenges
– find someone to try-that-interesting-idea
• take advantage of large user base & lots of data
– Organic.Edunet dataset: ratings & tags already collected
– expand to federated data sources of social data
• keep publishing, but not keep doing research experiments
Christoph’s & KC’s perspective
• Why is this cooperation valuable for us/me?
– Typically it is not easy to get access to real user data.
• Test algorithms not only “offline” but also online
– Currently, we are just playing around with offline experiments
• Test interfaces not only in lab studies
– Currently, we are evaluating our interfaces just with expert interviews or with lab studies
• Work towards second doctoral thesis that lies in the context of recommending “things” (people, resources, annotations) in social semantic networks
the plan
bringing it all together
• major activities to take place in next 9 months
– offline experiments using existing dataset & exploring various algorithmic options [summer ’13]
– online experiments exploring various service options [autumn ’13]
– final service deployment [winter ’13-’14]
evaluation experiments (1/2)
• evaluating algorithms
– offline experiment running different algorithms over offline data that contain user preferences
– online experiment with single interface, with back-end recommendation engine interchanging between algorithm variations
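The offline experiment can be sketched as follows, assuming a simple hold-out protocol: part of each user's known preferences is hidden, each algorithm produces a top-k list, and the lists are scored with precision@k. The metric, the recommender signature, and the two stand-in algorithms are illustrative assumptions, not the evaluation protocol of the project.

```python
from typing import Callable, Dict, List, Set

# A recommender takes (user_id, k) and returns up to k item ids.
Recommender = Callable[[str, int], List[str]]


def precision_at_k(recommended: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k recommended items found in the held-out set."""
    if k <= 0:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


def evaluate(
    algorithms: Dict[str, Recommender],
    held_out: Dict[str, Set[str]],
    k: int,
) -> Dict[str, float]:
    """Mean precision@k per algorithm over all users with held-out items."""
    results = {}
    for name, recommend in algorithms.items():
        scores = [
            precision_at_k(recommend(user, k), items, k)
            for user, items in held_out.items()
        ]
        results[name] = sum(scores) / len(scores) if scores else 0.0
    return results


# Hypothetical held-out preferences and two stand-in algorithms.
held_out = {"u1": {"a", "b"}, "u2": {"c"}}
algorithms = {
    "most_popular": lambda user, k: ["d", "c", "a"][:k],
    "alphabetical": lambda user, k: ["a", "b", "c"][:k],
}
report = evaluate(algorithms, held_out, k=2)
```

Running every candidate through the same `evaluate` call keeps the comparison fair: same users, same held-out items, same cut-off.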
evaluation experiments (2/2)
• evaluating different visualisations
– simple suggested list of resources
– simple tag-cloud based faceted browsing
– cluster-based bubble interface for browsing based on themes
• evaluating data availability/coverage
– one interface with selected algorithm, with backend selector that will interchange item catalogue dataset
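One way a back end could interchange between variations (algorithms, or item catalogues) behind a single interface is deterministic bucketing: hash the user id and pick a variant, so the same user always sees the same variant. This is a sketch under that assumption; the slides do not describe the selector, and the variant names and experiment label below are hypothetical.

```python
import hashlib
from typing import Sequence


def assign_variant(
    user_id: str,
    variants: Sequence[str],
    experiment: str = "suggest-v1",
) -> str:
    """Deterministically map a user to one variant of an experiment."""
    # Salting with the experiment name reshuffles users between experiments.
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return variants[int(digest, 16) % len(variants)]


# Hypothetical variant names.
variants = ["content_based", "user_cf", "most_popular"]
first = assign_variant("user-42", variants)
second = assign_variant("user-42", variants)
```

The same function can select an item catalogue instead of an algorithm by passing catalogue names as `variants`.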
research outcomes
• conference publications to make K-C happy
– ACM RecSys’14
– ACM HT’14
• journal publication to make all happy
– ACM TIST Special Issue on Recommender System Benchmarking
the challenges
Nikos’ perspective
• productizing & selling
– bundle of services together with K-C or Agro-Know’s product?
– business & costing model?
• time
– research mentoring is a luxury for a start-up CEO
– should eventually lead to an added-value product
– creates bias in product development process (what if this idea should simply die?)
• trust: what if they are yet-another-anAP?
Christoph’s perspective
• Time: Tight timeline
– according to Giannis (our project coordinator) services should be done by Sept.
– Not much room for failure
• Data: Sparse data...
– Although the portal attracts a lot of people every day (several thousand), we currently do not have the data we need to do “real” cool personalized recommender stuff...
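"Sparse data" can be quantified as the share of empty cells in the user-item matrix. A minimal sketch, assuming interactions are available as (user, resource) pairs; the log below is a hypothetical toy example, not portal data.

```python
from typing import Iterable, Tuple


def sparsity(interactions: Iterable[Tuple[str, str]]) -> float:
    """Share of empty cells in the user-item matrix implied by the interactions."""
    pairs = set(interactions)  # repeated interactions count once
    if not pairs:
        return 1.0
    users = {user for user, _ in pairs}
    items = {item for _, item in pairs}
    return 1.0 - len(pairs) / (len(users) * len(items))


# Hypothetical log of (user, resource) pairs.
log = [("u1", "r1"), ("u1", "r2"), ("u2", "r1"), ("u3", "r3"), ("u1", "r1")]
value = sparsity(log)
```

Here 4 distinct pairs fill a 3 x 3 matrix, so the sparsity is 1 - 4/9. Note that the measure only covers users and items that appear in the log, so visitors who never interact are not counted.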
Christoph’s perspective
[Figure with panels: Organic.Edunet, CiteULike]
Christoph’s perspective
• Multilinguality:
– Currently the portal provides documents in 42 different languages... how do we handle that?
– Well, lucky us, most articles are in English, so we might handle this by providing our services only to English-speaking users?
• Speed: Although our recommendations are pretty fast (almost real-time), how do we handle network delays? Maybe it is better to set up a virtual machine?
Christoph’s perspective
• Scalability: What happens if the portal really takes off? Currently, we have almost everything in memory
– OK, we have a big server with 256 GB of RAM... and we are using Apache Mahout for some algorithms (e.g. CF), but how about the other “cool” algorithms we have developed and want to test?
Christoph’s perspective
• Sustainability & Trust: Currently, we are pretty fine with Nikos, and he likes our ideas, but what if we want to test new stuff?
– Does he allow us to change our services?
– Or even worse, he does not allow us to change anything!
confession time: why do this
no script!