FeedMe - a semantic RSS aggregator

Nikola Ljubešić, Damir Boras, Mislav Cimperšak, Marija Tkalec

Faculty of Humanities and Social Sciences, University of Zagreb

8 June 2010

Overview

1. The basic idea

2. Our system

3. Statistical analysis of collected data

4. Usage examples

Aggregating news

• collecting news from different information sources and publishing them as a single source

• manual and automated

• automated - repeated information becomes a problem, so analysis and organization are needed

Existing aggregators

• Google News

• EMM NewsExplorer

• MondoPress

• RSS (Really Simple Syndication) - family of web feed formats used to publish frequently updated works

• XML file - readable by humans and machines

• RSS is structured, while (X)HTML today still is not - data harvesting is easier through RSS
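To illustrate why a structured format is easy to harvest, here is a minimal sketch that reads the items of an RSS 2.0 document with Python's standard library. The sample feed, its URLs, and the function name `parse_rss_items` are made up for illustration and are not taken from FeedMe.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 document; the content is invented for this example.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example portal</title>
    <link>http://example.com/</link>
    <description>Illustrative feed</description>
    <item>
      <title>First story</title>
      <link>http://example.com/1</link>
      <description>Short summary of the first story.</description>
      <pubDate>Tue, 08 Jun 2010 09:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second story</title>
      <link>http://example.com/2</link>
      <description>Short summary of the second story.</description>
      <pubDate>Tue, 08 Jun 2010 09:30:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def parse_rss_items(xml_text):
    """Return one dict per <item> with the common RSS 2.0 fields."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
            "pubDate": item.findtext("pubDate", default=""),
        })
    return items


entries = parse_rss_items(SAMPLE_RSS)
for entry in entries:
    print(entry["title"], entry["link"])
```

Every field sits in a named element, so no page-specific scraping rules are needed, which is the point the slide makes about (X)HTML.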

Google Reader

• on-line RSS aggregator

• problems

  • loss of information

  • repeating information

  • unwanted information

Our idea

• collect RSS server-side - no loss of entries

• cluster RSS entries by their content - composite entries, no duplicates

• enable users to filter information - “affirm” or “negate” specific feeds
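The slides do not say how entries are clustered, so the following is only a sketch of one simple possibility: greedy single-pass clustering on the Jaccard similarity of title word sets. The threshold, the choice of titles as the content representation, and all function and feed names are assumptions made for illustration.

```python
import re


def tokens(text):
    """Lowercased word set of a text; a deliberately crude content representation."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_entries(entries, threshold=0.5):
    """Greedy single-pass clustering.

    Each entry joins the first cluster whose first member is similar enough;
    otherwise it starts a new cluster.
    """
    clusters = []  # list of lists of entries
    for entry in entries:
        entry_tokens = tokens(entry["title"])
        for cluster in clusters:
            if jaccard(entry_tokens, tokens(cluster[0]["title"])) >= threshold:
                cluster.append(entry)
                break
        else:
            # No existing cluster matched, so this entry opens a new one.
            clusters.append([entry])
    return clusters


# Hypothetical entries from three hypothetical feeds.
entries = [
    {"feed": "portal-a", "title": "Government announces new budget plan"},
    {"feed": "portal-b", "title": "New budget plan announced by government"},
    {"feed": "portal-c", "title": "Local team wins championship final"},
]
clusters = cluster_entries(entries)
print([[e["feed"] for e in c] for c in clusters])
```

Each resulting cluster plays the role of one composite entry: the same story reported by several feeds appears once, with all of its original entries attached.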

Filtering

• publish only feed entries containing n or more original feed entries

• “affirmed” feeds - publish only feed entries containing at least one original entry of all the “affirmed” feeds

• “negated” feeds - do not publish feed entries containing any of the original entries from any negated feed
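The three rules above can be sketched as a single predicate over a clustered entry. This is an illustration under stated assumptions, not FeedMe's code: the function name, parameter names, and feed names are hypothetical, and the affirmation rule is read as "every affirmed feed must contribute at least one original entry" (the slide could also be read as "at least one of the affirmed feeds").

```python
def passes_filter(cluster_feeds, min_size=1, affirmed=(), negated=()):
    """Decide whether a clustered entry should be published.

    cluster_feeds: feed names, one per original entry in the cluster.
    min_size: minimum number of original entries (the "n" of the first rule).
    affirmed: feeds that must all be represented (assumed reading of the rule).
    negated: feeds that must not be represented at all.
    """
    feeds = set(cluster_feeds)
    if len(cluster_feeds) < min_size:
        return False
    if not set(affirmed).issubset(feeds):
        return False
    if feeds.intersection(negated):
        return False
    return True


# Hypothetical cluster built from three original entries.
cluster = ["feed-a", "feed-b", "feed-c"]

results = {
    "min_size_2": passes_filter(cluster, min_size=2),
    "min_size_4": passes_filter(cluster, min_size=4),
    "affirm_a_c": passes_filter(cluster, affirmed={"feed-a", "feed-c"}),
    "affirm_a_d": passes_filter(cluster, affirmed={"feed-a", "feed-d"}),
    "negate_b": passes_filter(cluster, negated={"feed-b"}),
    "negate_d": passes_filter(cluster, negated={"feed-d"}),
}
print(results)
```

Switching to the looser reading of affirmation would only require replacing the subset test with a non-empty intersection test.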

FeedMe

• back-end - collecting RSS entries every half hour and organizing them into clusters

• front-end - web application for

  • creating groups of feeds (filtering - minimum elements, affirming, negating)

  • browsing the compiled groups

  • publishing groups as new RSS feeds
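As a rough idea of what publishing a group as a new RSS feed could involve, the sketch below serializes clustered entries as an RSS 2.0 document. How FeedMe actually represents a cluster in its output is not described in the slides; using the first entry's title and link and listing all source links in the description is an assumption, as are the function name and sample data.

```python
import xml.etree.ElementTree as ET


def build_rss(group_title, group_link, clusters):
    """Serialize clusters as an RSS 2.0 string, one <item> per cluster."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = group_title
    ET.SubElement(channel, "link").text = group_link
    ET.SubElement(channel, "description").text = "Clustered entries for " + group_title
    for cluster in clusters:
        first = cluster[0]  # assumed: the first entry represents the cluster
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = first["title"]
        ET.SubElement(item, "link").text = first["link"]
        sources = ", ".join(entry["link"] for entry in cluster)
        ET.SubElement(item, "description").text = "Sources: " + sources
    return ET.tostring(rss, encoding="unicode")


# Hypothetical clusters, each a list of original entries.
clusters = [
    [
        {"title": "Budget plan announced", "link": "http://example.com/a/1"},
        {"title": "Government presents budget", "link": "http://example.org/b/7"},
    ],
    [
        {"title": "Team wins final", "link": "http://example.net/c/3"},
    ],
]
feed_xml = build_rss("My group", "http://example.com/groups/my-group", clusters)
print(feed_xml)
```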

The collected data

• 388 RSS feeds

• 38 different portals

• collected since 2010-05-10

• more than 100,000 entries

• approx. 30,000 clusters

Distribution of documents by cluster size

[Chart not reproduced; horizontal axis: cluster size, 1 to 16]

Portals publishing on “large” events (>2)

[Bar chart not reproduced; axis range 0 to 80. Portals shown: Net.hr, Monitor.hr, Tportal.hr, Index.hr, Dnevnik.hr, Nacional.hr, Jutarnji.hr, HRT.hr, 24sata.hr, Vecernji.hr, SlobodnaDalmacija.hr, RTL.hr]

Portals publishing new stories first

[Bar chart not reproduced; axis range 0 to 200. Portals shown: Index.hr, Net.hr, Monitor.hr, Dnevnik.hr, Nacional.hr, Tportal.hr, Jutarnji.hr, Vecernji.hr, HRT.hr, SlobodnaDalmacija.hr, 24sata.hr, RTL.hr]

Portals publishing new stories first (normalized by portal size)

[Bar chart not reproduced; axis range 0 to 0.39. Portals shown: Tportal.hr, Jutarnji.hr, Net.hr, HRT.hr, Vecernji.hr, Nacional.hr, Dnevnik.hr, Monitor.hr, RTL.hr, Index.hr, 24sata.hr, SlobodnaDalmacija.hr]
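A statistic of this kind could be computed roughly as follows. The slides do not define "portal size" or how ties are handled, so this sketch assumes that portal size is the total number of entries a portal contributed and that the earliest timestamp in a cluster marks the first publisher. The data layout and names are hypothetical.

```python
from collections import Counter


def first_publisher_stats(clusters):
    """Count how often each portal published a story first.

    clusters: list of clusters, each a list of (portal, timestamp) pairs.
    Returns (raw_counts, normalized), where normalized divides the raw count
    by the portal's total number of entries (assumed meaning of portal size).
    """
    first_counts = Counter()
    portal_sizes = Counter()
    for cluster in clusters:
        for portal, _timestamp in cluster:
            portal_sizes[portal] += 1
        # The entry with the earliest timestamp is treated as the first publisher.
        first_portal, _ = min(cluster, key=lambda pair: pair[1])
        first_counts[first_portal] += 1
    normalized = {
        portal: first_counts[portal] / size
        for portal, size in portal_sizes.items()
    }
    return first_counts, normalized


# Hypothetical data: timestamps are minutes since some reference point.
clusters = [
    [("portal-a", 10), ("portal-b", 12)],
    [("portal-b", 5), ("portal-a", 7), ("portal-c", 9)],
    [("portal-a", 1)],
]
raw, norm = first_publisher_stats(clusters)
print(dict(raw))
print(norm)
```

Normalizing matters because a portal that publishes many entries will lead more clusters simply by volume, which is why the raw and normalized rankings can differ.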

Plagiarism?

[Bar chart not reproduced; axis range 0 to 0.30. Portals shown: Tportal.hr, Dnevnik.hr, Nacional.hr, Net.hr, Jutarnji.hr, Index.hr, Monitor.hr, SlobodnaDalmacija.hr, HRT.hr]

Filtering by minimum number of elements

Filtering by affirming feeds

Filtering by negating feeds

Future steps

• user-defined RSS sources

• full-text news portals

• different sources - social networks

• topic tracking

• named entity identification

• sentiment analysis and mining

Thank you! Questions?
