Automatic extraction of local events from web sites

By: Anton Risberg Alaküla, Karl Hedin Sånemyr

fileadmin.cs.lth.se/cs/Education/EDAN50/LocalEvents_ARA...

May 2013



Project Background

[Diagram: three scrapers; the address "Ole Römers väg 3G, 223 63 Lund" is mapped to the coordinates 55.7,13.2]

Project Goal

[Diagram: three scrapers; the address "Ole Römers väg 3G, 223 63 Lund" is mapped to the coordinates 55.7,13.2; new components: RDF and a SPARQL endpoint]

RDF Converter

● Plan:
  a. Dump scraper data
  b. Write a Java program to convert the dump to RDF
  c. Give RDF output to Sindice

● Result
  ○ Learned a lot about RDF
  ○ No real issues
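
The conversion step in the plan above can be illustrated with a minimal sketch. It is written in Python for brevity, although the slides state that the actual converter was a Java program, and it is not the authors' code. The base URI, the vocabulary URI, the field names and the sample title are all hypothetical; only the address and the coordinates come from the slides. The output format is N-Triples, chosen here because it needs no library.

```python
# Hypothetical base URI and vocabulary; the slides do not specify either.
BASE = "http://example.org/event/"
VOCAB = "http://example.org/vocab#"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"


def escape_literal(text):
    """Escape a string for use inside an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def event_to_ntriples(event):
    """Turn one scraped event (a dict) into a list of N-Triples lines.

    Assumes event["id"] is already safe to embed in an IRI.
    """
    subject = "<{}{}>".format(BASE, event["id"])
    lines = []
    for key in ("title", "address", "description"):
        value = event.get(key)
        if value:  # skip missing or empty fields
            lines.append(
                '{} <{}{}> "{}" .'.format(subject, VOCAB, key, escape_literal(value))
            )
    if "lat" in event and "lon" in event:
        for key in ("lat", "lon"):
            lines.append(
                '{} <{}{}> "{}"^^<{}> .'.format(
                    subject, VOCAB, key, event[key], XSD_DECIMAL
                )
            )
    return lines


# Sample record: the title is invented, the address and coordinates are from the slides.
example = {
    "id": "1",
    "title": "Project presentation",
    "address": "Ole Römers väg 3G\n223 63 Lund",
    "lat": 55.7,
    "lon": 13.2,
}
ntriples = event_to_ntriples(example)
```

Each line in `ntriples` is one subject-predicate-object statement, so the result can be written to a file line by line and handed to any consumer that accepts N-Triples.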

Scraper improvements

● Problems
  ○ Event descriptions short/missing
  ○ Geocoding not always successful (<30% success)

● Results
  ○ Dygnetrunt now parses 100%, geocodes >80%
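
The slides report parse and geocoding rates without saying how they were measured. One possible way to compute such figures is sketched below, assuming that an event counts as parsed when it has a non-empty description and as geocoded when a geocoder returns coordinates for its address. The function names, the record layout and the stub geocoder are hypothetical and do not represent the authors' tooling.

```python
def success_rates(events, geocode):
    """Return (parse_rate, geocode_rate) as percentages.

    Assumption: "parsed" means a non-empty description, and "geocoded"
    means geocode(address) returned something other than None.
    """
    total = len(events)
    if total == 0:
        return 0.0, 0.0
    parsed = sum(1 for e in events if e.get("description"))
    geocoded = sum(
        1 for e in events
        if e.get("address") and geocode(e["address"]) is not None
    )
    return 100.0 * parsed / total, 100.0 * geocoded / total


# Stub geocoder backed by a lookup table; a real one would call a geocoding service.
KNOWN = {"Ole Römers väg 3G, 223 63 Lund": (55.7, 13.2)}


def stub_geocode(address):
    return KNOWN.get(address)


# Invented sample records, used only to exercise the function.
sample = [
    {"description": "Talk", "address": "Ole Römers väg 3G, 223 63 Lund"},
    {"description": "", "address": "Unknown place"},
]
rates = success_rates(sample, stub_geocode)
```

Passing the geocoder in as a parameter keeps the measurement independent of any particular geocoding service.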

Scraper improvements

● Problems
  ○ Lund.se complete redesign

● Results
  ○ Learned some Python!
  ○ New visitlund.se scraper

SPARQL Endpoint

● Goal
  ○ Perform SPARQL queries on Sindice's servers.

● Problem
  ○ Delays @ Sindice

● Solution
  ○ Made our own SPARQL "endpoint" + "event search engine"
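
The slides do not describe how the home-made "endpoint" and search engine work. As a rough illustration of the idea only, the sketch below keeps triples in memory, matches them against a pattern in which `None` plays the role of a SPARQL variable, and builds a keyword search on top of that. All names and the sample data are hypothetical, and this is far simpler than a real SPARQL implementation.

```python
def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    ]


def search_events(triples, keyword):
    """Return the subjects whose title contains the keyword (case-insensitive)."""
    keyword = keyword.lower()
    return sorted(
        {s for s, _, o in match(triples, predicate="title") if keyword in o.lower()}
    )


# Invented sample store; only the address is taken from the slides.
store = [
    ("event/1", "title", "Jazz evening"),
    ("event/1", "address", "Ole Römers väg 3G, 223 63 Lund"),
    ("event/2", "title", "Book fair"),
]
hits = search_events(store, "jazz")
```

Here `match(store, subject="event/1")` corresponds roughly to the single triple pattern `<event/1> ?p ?o` in SPARQL, while `search_events` layers a plain substring filter over the title triples.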