
Making Mashups with Marmite

Jeff Wong, Jason I. Hong

Carnegie Mellon University

The Big Picture Problem

• Lots of content out there on the web

– But not always in a form amenable to your needs

– Ex. Easy to get a list of hotels in San Jose, not so easy to sort by distance to convention center

• Two observations:

– In many cases, all of the data and services people need already exist, but not connected together

– Unlikely that a web site can predict all possible needs

A Solution: Mashups

• Rapidly growing community of users creating “mashups” combining content from multiple web sites

– Ex. Housingmaps.com

– Ex. MySpace child predators

– Ex. Friendster locations

– Ex. Most popular videos on YouTube, Yahoo Video, …

• ProgrammableWeb.com statistics

– ~1500 mashups created since April 2005

– 356 open web-based APIs available

But Creating Mashups is Hard

• Requires lots of skill to create a mashup

– Ex. Housingmaps creator has PhD in computer science

– Ex. MySpace child predator list took months

• Requires programming expertise in many areas

– Web crawling

– Text parsing

– Pattern matching

– Databases

– HTML

Marmite: End-User Programming for Mashups

• Main idea: make it easy to create web mashups

• Use a dataflow approach connecting small operators

– Inspired by Unix pipes and Apple’s Automator

• Example:

– Get all events from Upcoming.org

– Filter out events that are too old

– Put them all onto a map

• Runs inside of a standard web browser
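
The example flow above can be pictured as a chain of small operators, in the spirit of Unix pipes. The following is a minimal Python sketch for illustration only, not Marmite's implementation; the event records, the cutoff date, and the names `fetch_events`, `drop_old_events`, and `to_map_markers` are all hypothetical.

```python
from datetime import date

# Hypothetical stand-in for a source operator; a real flow would query Upcoming.org.
def fetch_events():
    return [
        {"name": "Jazz night", "date": date(2007, 4, 20), "lat": 40.44, "lon": -79.99},
        {"name": "Old fair", "date": date(2006, 1, 5), "lat": 40.45, "lon": -79.95},
    ]

# Processor operator: keep only events on or after the cutoff date.
def drop_old_events(rows, cutoff):
    return [row for row in rows if row["date"] >= cutoff]

# Sink operator: turn rows into marker tuples that a map widget could plot.
def to_map_markers(rows):
    return [(row["lat"], row["lon"], row["name"]) for row in rows]

# Each operator consumes the previous operator's output.
markers = to_map_markers(drop_old_events(fetch_events(), cutoff=date(2007, 1, 1)))
print(markers)
```

Because each step only takes rows in and hands rows out, operators can be reordered or swapped without touching the others, which is the property the dataflow approach relies on.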

Set of Operators

Data Flow View

Data View

Using Marmite (Envisioned)

• Extract content from one or more web pages

– names, addresses, dates, phone #, URLs

• Process it in a data flow manner

– filtering out values or adding metadata

– integrating with other data sources (similar to a database join operation)

• Direct the output to a variety of sinks

– databases, map services, text files, visualizations, web pages, or source code that can be further edited
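
The "similar to a database join" step can be pictured as matching rows from two sources on a shared column. Below is a minimal sketch, assuming each source yields a list of dictionaries; the `join_rows` helper and the sample hotel data are hypothetical and not part of Marmite.

```python
def join_rows(left, right, key):
    """Inner-join two lists of row dicts on a shared column name."""
    # Index the right-hand rows by key so each left row is matched in one lookup.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = []
    for row in left:
        for match in index.get(row[key], []):
            joined.append({**row, **match})
    return joined

hotels = [
    {"name": "Hotel A", "address": "1 Main St"},
    {"name": "Hotel B", "address": "9 Oak Ave"},
]
ratings = [{"name": "Hotel A", "stars": 4}]

print(join_rows(hotels, ratings, key="name"))
```

Rows without a match on the other side are dropped here (an inner join); a tool could just as reasonably keep them with empty cells.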

Marmite

• Motivation and Examples

• Features and Design Rationale

• User Evaluation

Features and Design Rationale

• Conducted a series of quick evaluations to understand design space and potential problems

– Automator

– Lo-fi prototypes

Automator

Informal Automator Evaluation

• Had three novices try three simple web-based tasks

– Warm-up task

– Traverse a set of web pages

– Download a set of images

• Some findings:

– Some difficulties knowing how to start and what to do next

– Little feedback about state of system between operations

– Difficult to iterate due to network speed issues

Lo-Fi Prototypes

• 6 paper prototypes with 20 participants

Design Solutions

• Problem: how to start and what to do next

• Solution: Suggest next actions

– Weak data typing to find types (addresses, numbers, etc)

– Filter operators to only show relevant ones

– Suggest operators that might be applicable
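
One way to read the suggestion mechanism above: guess a loose type for each column from its values, then offer only the operators whose input type is present. The sketch below is an assumption about how such a scheme could work, not Marmite's code; the regular expressions, the `OPERATORS` catalog, and the function names are hypothetical.

```python
import re

# Loose patterns; a column matching none of them is treated as plain text.
PATTERNS = {
    "number": re.compile(r"^-?\d+(\.\d+)?$"),
    "url": re.compile(r"^https?://\S+$"),
    "address": re.compile(r"^\d+\s+\w+.*\b(St|Ave|Rd|Blvd)\b", re.IGNORECASE),
}

def guess_type(values):
    """Return the first type whose pattern matches every value, else 'text'."""
    for type_name, pattern in PATTERNS.items():
        if values and all(pattern.match(v) for v in values):
            return type_name
    return "text"

# Hypothetical operator catalog: operator name -> input type it needs.
OPERATORS = {"Geocode": "address", "Filter by value": "number", "Fetch page": "url"}

def suggest_operators(table):
    """Suggest operators whose required input type appears among the columns."""
    column_types = {guess_type(values) for values in table.values()}
    return sorted(name for name, needed in OPERATORS.items() if needed in column_types)

table = {
    "where": ["5000 Forbes Ave", "4400 Fifth Ave"],
    "price": ["120", "95.50"],
}
print(suggest_operators(table))
```

The typing is deliberately weak: a wrong guess only changes which operators are suggested, and the user can still pick any operator by hand.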

Design Solutions

• Problem: little feedback about state of system between operations

• Solution: link data flow and data view together

– Many systems take program-centric view (ex. Automator) or data-centric view (ex. spreadsheets)

– Use hybrid data flow / data view, showing an operation and its effects together

– Data view usually “spreadsheet”, other views possible too (for example, maps)

Design Solutions

• Problem: difficult to iterate due to network speeds

• Solution: cache data, let people “replay” data

– Reload, pause, play
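
The caching idea can be sketched as a wrapper that stores the rows from a slow source so that later runs of the flow replay them without another network round trip. This is a minimal illustration under that assumption; `CachedSource` and `slow_fetch` are hypothetical names, and the pause/play controls are not modeled.

```python
class CachedSource:
    """Wrap a slow fetch function so repeated runs reuse the stored rows."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._rows = None
        self.fetch_count = 0

    def rows(self):
        # Replay: only call the underlying fetch the first time.
        if self._rows is None:
            self.reload()
        return self._rows

    def reload(self):
        # Reload: discard the cache and fetch fresh data.
        self._rows = list(self._fetch())
        self.fetch_count += 1
        return self._rows

# Hypothetical slow source standing in for a web request.
def slow_fetch():
    return [{"title": "Event 1"}, {"title": "Event 2"}]

source = CachedSource(slow_fetch)
source.rows()
source.rows()    # served from the cache, no second fetch
source.reload()  # explicit refresh triggers a second fetch
print(source.fetch_count)
```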

Other Design Findings

• Screen real estate issues

– Collapsible operators, leaving a readable label

Extracting Generic Content

• Can’t have pre-defined extractor operators for every possible web site

– Need a more general way of extracting data from pages

• Developed a generic wizard UI for selecting links

– Content from that set could be extracted via other operators

– Uses Solvent (MIT), an XPath-based algorithm for finding patterns in web pages

• Finds “groups” of related web content based on how HTML is structured
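
To give a feel for grouping content "based on how HTML is structured", the sketch below groups links by the chain of tags that encloses them, so links sharing a path fall into the same group. It is a deliberately simplified stand-in written with Python's standard `html.parser`, not Solvent's algorithm; the `LinkGrouper` class and the sample page are hypothetical.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}

class LinkGrouper(HTMLParser):
    """Group link URLs by the chain of tags that encloses them."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.groups = defaultdict(list)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        self.stack.append(tag)
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.groups["/" + "/".join(self.stack)].append(href)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating sloppy markup.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

page = """
<html><body>
  <div id="nav"><a href="/home">Home</a></div>
  <ul>
    <li><a href="/listing/1">Apt 1</a></li>
    <li><a href="/listing/2">Apt 2</a></li>
  </ul>
</body></html>
"""

grouper = LinkGrouper()
grouper.feed(page)
for path, links in grouper.groups.items():
    print(path, links)
```

Here the two listing links share the path `/html/body/ul/li/a` and end up in one group, while the navigation link sits alone in another; a wizard could then let the user pick the group they care about.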

Operators

• Operators have input types

– Operator uses this to guess which columns it wants

• Operators have output types
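
A small sketch of how declared input types could drive column guessing: given a table whose columns already carry a type label, the operator picks the first column matching each type it needs. The `Operator` class, its field names, and the sample column types are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass

@dataclass
class Operator:
    name: str
    input_types: list   # types the operator needs, in order
    output_types: list  # types of the columns it adds

    def guess_columns(self, column_types):
        """Pick, for each input type, the first unused column of that type."""
        chosen = []
        for wanted in self.input_types:
            match = next(
                (col for col, typ in column_types.items()
                 if typ == wanted and col not in chosen),
                None,
            )
            if match is None:
                return None  # operator is not applicable to this table
            chosen.append(match)
        return chosen

geocode = Operator("Geocode", input_types=["address"],
                   output_types=["latitude", "longitude"])

# Assumed to come from an earlier type-guessing step.
column_types = {"title": "text", "venue": "address", "price": "number"}
print(geocode.guess_columns(column_types))
```

Declared output types serve the next operator in the same way: the columns an operator adds become candidate inputs downstream.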

Implementation

• JavaScript (for underlying code) and XML Binding Language (XBL, for UI)

• Operators currently in JavaScript

– Ideally could be scriptable in any programming language

– Currently ~15 operators

Marmite

• Motivation and Examples

• Features and Design Rationale

• User Evaluation

Evaluation

• Informal user study with 6 people

– 2 novices

– 2 people with spreadsheet experience (formulas)

– 2 people with programming experience

• Tasks (in increasing difficulty)

– Warmup task showing how to retrieve a set of addresses and how to geocode an address

– Search for and filter out events further than a week away

– Compile a list of events from two event services and plot them on a map

– Recreate the housingmaps site

Results

• Three people able to complete all tasks in ~1 hour

– First two users confused about suggested actions (automatically popped up, made manual for other 4 users)

– Novice made some progress, not able to finish all tasks

• Able to re-create housingmaps in ~15 minutes

More Results

• Biggest barrier was understanding the data flow

– Did not understand input and output concept

– Applied operators as one-off, did not realize that it was a static representation of flow

– Did not understand data flow and data view were linked

Future Directions

• Short-term

– Better screen-scraping operators

– More operators

– Better connection with web services (WSDL and REST)

– Better help for starting a data flow

• Long-term

– Intelligence analysis

– Better visualizations

– Location-based services

Conclusions

• Marmite, a tool for creating web-based mashups

– Extract content from one or more web pages

– Process it in a data flow manner

– Direct the output to a variety of sinks

• Hybrid data flow / data view

• User evaluation shows some promising results

Jeff Wong, Jason Hong, Making Mashups with Marmite: Re-purposing Web Content through End-User Programming, CHI 2007

Types of Operators

• Sources

– Add data into Marmite by querying databases, extracting information from web pages, and so on.

• Processors

– Modify, combine, or delete existing rows. Example operators include geocoding (converting street addresses to latitude and longitude) and filtering. Processor operators might add or remove columns as well.

• Sinks

– Redirect the flow of the data out of Marmite. Examples include showing data on a map, or saving it to a file or to a web page.
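
The three operator types can be illustrated with one tiny operator of each kind. This is a sketch under simplifying assumptions, not Marmite's operators: the source is hard-coded, the geocoder is a fake lookup table with made-up coordinates, and the sink writes CSV text; all names are hypothetical.

```python
import csv
import io

# Source: adds rows to the flow (hard-coded here instead of a database or web page).
def address_source():
    return [
        {"place": "Library", "address": "4400 Forbes Ave"},
        {"place": "Unknown", "address": "nowhere"},
    ]

# Hypothetical lookup table standing in for a real geocoding service.
FAKE_GEOCODER = {"4400 Forbes Ave": (40.4443, -79.9496)}

# Processor: adds latitude/longitude columns and drops rows it cannot geocode.
def geocode(rows):
    out = []
    for row in rows:
        coords = FAKE_GEOCODER.get(row["address"])
        if coords is not None:
            out.append({**row, "lat": coords[0], "lon": coords[1]})
    return out

# Sink: sends the rows out of the flow, here as CSV text.
def csv_sink(rows):
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = csv_sink(geocode(address_source()))
print(csv_text)
```

Note that the processor both adds columns (`lat`, `lon`) and removes a row, matching the description that processors may change the shape of the table in either direction.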