Online footprint @upc





Online Footprints

What the internet knows about you.


● Background

● Online Privacy

● Hyperdata

● Identity

● Investigation Points

● Possible Applications


“When you're in positions of privileged access like a systems administrator for the sort of intelligence community agencies, you're exposed to a lot more information on a broader scale than the average employee and because of that you see things that may be disturbing but over the course of a normal person's career you'd only see one or two of these instances. When you see everything you see them on a more frequent basis and you recognize that some of these things are actually abuses. And when you talk to people about them in a place like this where this is the normal state of business people tend not to take them very seriously and move on from them.”

Edward Snowden


Online Privacy

Is Privacy the right to be forgotten?

Three-quarters of the 1.8 trillion gigabytes of digital information online has been created by individual users. On top of that, an increasing amount of additional data about those users is collected by public and private companies.

Library Briefing - Library of the European Parliament - 01/03/2012

Not yet...

The Directorate-General for Internal Policies of the European Parliament published a study on Citizens' Rights and Constitutional Affairs, stating:

“The study contends that an analysis of European surveillance programmes cannot be reduced to a question of balance between data protection versus national security, but has to be framed in terms of collective freedoms and democracy.”


In an online context, the right to privacy has commonly been interpreted as a right to “information self-determination”.

Acts typically claimed to breach online privacy concern the collection of personal information without consent, the selling of personal information and the further processing of that information.


“The web is fundamentally a distributed hypermedia application”

Software Architecture: Foundations, Theory and Practice

Taylor, Medvidovic, Dashofy (2010)

The age of the “metadata”

In addition to user-generated content, “metadata” is collected and stored by public and private organisations about where, when and by whom that content was created.

Metadata is more interesting than actual information.

Enter your name, and Personas scours the web for information attempting to characterize the person - to fit them to a predetermined set of categories that an algorithmic process created from a massive corpus of data.

Personas http://personas.media.mit.edu/


Hyperdata indicates data objects linked to other data objects in other places, as hypertext indicates text linked to other text in other places. Hyperdata enables formation of a web of data, evolving from the "data on the Web" that is not inter-related (or at least, not linked).
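
To make the definition concrete, here is a minimal sketch of hyperdata as data objects that link to other data objects. Every identifier and link in `LINKS` is made up for illustration, and `reachable` is a hypothetical helper, not part of any hyperdata language. It follows links outward from one object to show how much context a single starting point can pull in.

```python
from collections import deque

# Hypothetical data objects and links; all names here are invented examples.
LINKS = {
    "profile:alice": ["employer:acme", "playlist:running"],
    "employer:acme": ["city:barcelona"],
    "playlist:running": ["track:song-1"],
    "city:barcelona": [],
    "track:song-1": [],
    "profile:bob": ["city:barcelona"],
}


def reachable(start, links):
    """Return every object reachable from `start` by following links."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for target in links.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    seen.discard(start)  # report only what the links revealed
    return seen


print(sorted(reachable("profile:alice", LINKS)))
```

In this toy data, a profile that directly names only an employer and a playlist also leads to a city and a track once the links are followed.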



It is not a buzz-word.

Hyperdata is at the core of the web nowadays.

Hyperdata means snippets of information linked to one another.

Why is it relevant to privacy?

Because links are context.

Context is semantics.

Privacy protection means knowing which are the weakest links, the ones that can reveal something about ourselves.


● Where do I work?

● Who are my friends?

● What music do I like?

● When do I exercise?

● What music do I like when I exercise?

● Where do I spend my time?

● Who do I communicate with?

● …

Identity is a puzzle

Identity is a network

An example: histogram of a Twitter user's check-in profile
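
A histogram like this can be sketched in a few lines. The timestamps in `CHECKINS` are invented placeholders rather than data from a real account, and `hourly_histogram` is a hypothetical helper that simply counts check-ins per hour of day.

```python
from collections import Counter
from datetime import datetime

# Made-up ISO timestamps standing in for a user's public check-ins.
CHECKINS = [
    "2013-10-01T07:05:00",
    "2013-10-02T07:40:00",
    "2013-10-02T13:15:00",
    "2013-10-03T07:20:00",
    "2013-10-03T19:30:00",
]


def hourly_histogram(timestamps):
    """Count check-ins per hour of day; returns a list of 24 counts."""
    hours = Counter(datetime.fromisoformat(ts).hour for ts in timestamps)
    return [hours.get(h, 0) for h in range(24)]


hist = hourly_histogram(CHECKINS)
for hour, count in enumerate(hist):
    if count:
        print(f"{hour:02d}h {'#' * count}")
```

Even with this handful of points, a recurring early-morning peak starts to suggest a routine, which is the kind of pattern a check-in profile can expose.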

Investigation points


● Hyperdata and hyperdata languages

● Graph analysis

● Information theory and statistical analysis

● Privacy attacks (e.g. neighbourhood attack, friend-in-the-middle, ...)
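
As a rough illustration of how graph analysis and privacy attacks meet, the sketch below imitates the idea behind a neighbourhood attack on a toy "anonymised" friendship graph. It is a deliberate simplification: instead of matching the full neighbourhood subgraph, it compares only a two-number signature (degree, and number of links among neighbours). The graph in `EDGES` and the helper names are assumptions made for this example.

```python
from itertools import combinations

# Hypothetical anonymised graph: names have been replaced by numeric ids.
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6)]


def adjacency(edges):
    """Build an undirected adjacency map from a list of edges."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def signature(node, adj):
    """Return (degree, number of edges among the node's neighbours)."""
    nbrs = adj[node]
    links = sum(1 for u, v in combinations(sorted(nbrs), 2) if v in adj[u])
    return (len(nbrs), links)


def candidates(known_signature, adj):
    """Nodes whose signature matches what the attacker already knows."""
    return [n for n in sorted(adj) if signature(n, adj) == known_signature]


adj = adjacency(EDGES)
# Attacker knows: the target has 3 friends, exactly 2 of whom know each other.
print(candidates((3, 1), adj))
# Attacker knows: the target has 2 friends who know each other.
print(candidates((2, 1), adj))
```

When the candidate list has a single entry, removing names did not protect that node; a longer list means the target hides among several look-alike nodes.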



Possible Applications

● Social Networks

● Web surfing habits (e.g. browser state, stored cookies)

● Mobile applications

● Sensor data

● Bitcoin graph mining

Thank you!

Please talk to me, I’d love to exchange some thoughts and doubts.

silviap@ieee.org
