Lecture 6: Social Web & Web Science (2012)

This is Lecture VI, Web Science, of the Social Web course at VU University Amsterdam. Visit the website for more information: http://semanticweb.cs.vu.nl/socialweb2012/

Lora Aroyo, The Network Institute, VU University Amsterdam (some slides adapted from presentations by Nigel Shadbolt and Noshir Contractor)

Social Web Lecture VI

How can we MINE, ANALYSE and VISUALIZE the Social Web? (II): The Web Science

Lora Aroyo

The Network Institute

VU University Amsterdam

(based on slides from Les Carr, Nigel Shadbolt)

Tuesday, March 13, 2012

The Web

the most used and one of the most transformative applications in the history of computing, e.g. how the Social Web has transformed the world's communication

approximately 10^10 people, more than 10^11 web documents

Web is NOT a Thing

• it’s not a verb, or a noun

• it’s a performance, not an object

• co-constructed with society

• activity of individuals who create interlinked content that both reflects and reinforces the interlinkedness of society and social interaction ... and a record of that performance

The Web

Great success as a technology, it’s built on significant computing infrastructure, but as an entity surprisingly unstudied

Science & Engineering

• physical science: analytic discipline to find laws that generate or explain observed phenomena

• CS is mainly synthetic: formalisms & algorithms are created to support specific desired behaviors

• Web Science: web needs to be studied & understood as a phenomenon but also to be engineered for future growth and capabilities

http://webscience.ecs.soton.ac.uk/

L.A. Carr, C.J. Pope, W. Hall, N.R. Shadbolt

Simple micro rules give rise to complex macro phenomena

• at microscale an infrastructure of artificial languages and protocols: a piece of engineering

• however, interaction of people creating, linking and consuming information generates web's behavior as emergent properties at macroscale

• properties require new analytic methods to be understood

• some properties are desirable and are to be engineered in, others are undesirable and if possible engineered out

A new way of software development

• software applications designed based on appropriate technology (algorithm, design) and with envisioned 'social' construct

• usually tested in the small, testing microscale properties

• a macrosystem evolving from people using the microsystem and interacting in often unpredicted ways, is far more interesting and must be analyzed in different ways

• also the macrosystems exhibit challenges that do not exist at microscale

Evolution of Search Engines

1: techniques designed to rank documents

2: people were gaming to influence algorithms & improve their search rank

3: adapt search technologies to defeat this influence

The Web Graph

• to understand the web, in good CS tradition, we look at the graph

• nodes are web pages (HTML)

• edges are hypertext links between nodes

• first analysis shows that in-degree and out-degree follow a power-law distribution => shown to hold for large samples

• this gave insight into the growth of the web
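As a minimal sketch of the degree analysis mentioned on this slide, the Python below counts in-degree and out-degree for a small edge list and builds the histogram one would inspect for a power law. The function name `degree_distributions` and the toy graph are assumptions made for illustration, not part of the lecture; on a real crawl the histogram would typically be plotted on log-log axes, and a graph this small cannot show a power law.

```python
from collections import Counter


def degree_distributions(edges):
    """Return (in_hist, out_hist): how many pages have each in-/out-degree."""
    in_deg = Counter()
    out_deg = Counter()
    nodes = set()
    for src, dst in edges:
        out_deg[src] += 1   # src has one more outgoing hyperlink
        in_deg[dst] += 1    # dst has one more incoming hyperlink
        nodes.update((src, dst))
    # Histogram: degree value -> number of pages with that degree.
    in_hist = Counter(in_deg[n] for n in nodes)
    out_hist = Counter(out_deg[n] for n in nodes)
    return in_hist, out_hist


# Hypothetical toy web: five pages, most of them linking to "hub".
toy_edges = [
    ("a", "hub"), ("b", "hub"), ("c", "hub"), ("d", "hub"),
    ("hub", "a"), ("a", "b"),
]
in_hist, out_hist = degree_distributions(toy_edges)
print(sorted(in_hist.items()))
print(sorted(out_hist.items()))
```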

Search Algorithms

• the Web graph also at basis of algorithms for search engines:

• HITS or PageRank assume that inserting a hyperlink symbolizes an endorsement of authority of the page linked to
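The endorsement idea can be illustrated with a simplified PageRank computed by power iteration. This is a sketch under stated assumptions, not the production algorithm of any search engine: the damping factor 0.85 is a conventional choice that does not come from the slides, pages without outgoing links spread their rank evenly, and the function name `pagerank` and the toy link structure are hypothetical.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Simplified PageRank on a dict {page: [pages it links to]}."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Each hyperlink passes on an equal share of the page's rank.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical toy web: three pages endorse "hub", which links back to "a".
toy_links = {
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub", "a"],
    "hub": ["a"],
}
ranks = pagerank(toy_links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph the page with the most incoming links ends up with the highest score, which is also why the link structure became a target for gaming.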

User State is Important

• the original Web graph is too simple, starts from quasi static HTML

• for personalization or customization different representations (of sources) may be served to different requesters, e.g. cookies

• graph-based models often do not account for this sort of user-dependent state, and are not fit for all the information behind the servers, in the Deep Web

• it’s not a simple HTTP-GET anymore (but HTTP-POST or HTTP-GET with complex URI) that is the basis for defining nodes in the graph

• URIs that carry user state are heavily used in Web applications, but are not in the model and largely unanalyzed
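To make the point about user state concrete, the sketch below separates a URI into a candidate graph node and the user-specific state it carries. The parameter names in `USER_STATE_PARAMS`, the function name `split_user_state`, and the example URI are all hypothetical; real applications encode state in many other ways (cookies, POST bodies, path segments) that this sketch does not cover.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names assumed to carry per-user state.
USER_STATE_PARAMS = {"sessionid", "user", "token"}


def split_user_state(uri):
    """Return (node_uri, state): the URI without user-state parameters,
    and a dict of the parameters that were removed."""
    parts = urlsplit(uri)
    kept, state = [], {}
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key.lower() in USER_STATE_PARAMS:
            state[key] = value
        else:
            kept.append((key, value))
    # Rebuild the URI from the remaining parameters; the fragment is dropped.
    node_uri = urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )
    return node_uri, state


node, state = split_user_state(
    "http://example.org/shop/item?id=42&sessionid=abc123"
)
print(node)
print(state)
```

Two requests that differ only in `sessionid` would map to the same node under this normalization, which is exactly the information a purely URI-based graph model either loses or treats as separate nodes.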

According to Google

each day 20-25% of searches have not been seen before, i.e. generate a new identifier thus a new node in the graph

more than 20 million new links per day, 200 per second

do they follow the same power laws & growth models?

validating such models is hard

exponential growth of content

changes in number & power of servers

increasing diversity in users

Social Web Sites

• modern websites (on the social web)

• have large script systems running in browser

• store personal information

do these systems show a similar behavior? (macro)

are they stable?

are they fair?

do they need to be regulated?

are the access restrictions, for personal information, assured?

many Social Web sites are not part of the (open) graph model

there is a need for understanding and intervening/engineering

Wikipedia

• purely mathematical (technology-based) models do not capture the whole story

• the Wikipedia structure (link labels) shows a Zipf-like distribution just like other tag-based systems

• Wikipedia is built on MediaWiki software

• but other MediaWiki-based applications did not generate such significant use

• the pure 'technological' explanation cannot explain it

• must be related to the 'social model' of how Wikipedia is organized

this is referred to as the dynamics of a 'social machine' (already in TBL’s original vision of WWW)
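The Zipf-like distribution of link labels mentioned above can be checked with a simple rank-frequency table. The following is a sketch on made-up data: the labels and their counts are hypothetical and chosen so that frequency is inversely proportional to rank, which is the idealized Zipf case; real Wikipedia data would only approximate this.

```python
from collections import Counter


def rank_frequency(labels):
    """Return [(rank, label, count)] sorted by descending count."""
    counts = Counter(labels)
    # Sort by count (highest first), breaking ties alphabetically.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [
        (rank, label, count)
        for rank, (label, count) in enumerate(ordered, start=1)
    ]


# Hypothetical link labels with counts 12, 6, 4 and 3.
toy_labels = (
    ["history"] * 12 + ["science"] * 6 + ["music"] * 4 + ["sports"] * 3
)
table = rank_frequency(toy_labels)
# Under an ideal Zipf law, rank * count stays roughly constant.
products = [rank * count for rank, _, count in table]
print(table)
print(products)
```

A roughly constant rank-times-count product is only a quick heuristic; it says nothing about why the distribution arises, which is the slide's point about needing the social model as well.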

Social Machines

• today's interactive applications are very early social machines limited by being largely isolated from one another

• more effective social machines can be expected

• social processes in society interlink, so they should also interlink on the web

• technology needed to allow user communities to construct, share & adapt social machines to get success through trial, use & refinement

Next Generation Social Machines

• what are fundamental theoretical properties of social machines, what algorithms are needed to create them?

• what underlying architectural principles are needed to effectively engineer new web components for this social software?

• how can we extend current web infrastructure with mechanisms that make the social properties of information sharing explicit and conform to relevant social-policy expectations?

• how do cultural differences affect development and use of social mechanisms?

Modeling the Social Machines

• trustworthiness, reliability or silent expectations about use of information

• privacy, copyright, legal rules

• we lack structures for formally representing & reasoning over such properties

• thus, without scalable models for these issues it is hard to help the web go in the best possible direction

Web Science is about additionality

not the union of disciplines, but intersection

Society is Diverse

different parts of society have different objectives and hence incompatible Web requirements, e.g. openness, security, transparency, privacy

Understanding the Socio-Cultural

• POWER DISTANCE: The extent to which power is distributed equally within a society and the degree that society accepts this distribution.

• UNCERTAINTY AVOIDANCE: The degree to which individuals require set boundaries and clear structures

• INDIVIDUALISM vs COLLECTIVISM: The degree to which individuals base their actions on self-interest versus the interests of the group.

• MASCULINITY vs FEMININITY: A measure of a society's goal orientation

• TIME ORIENTATION: The degree to which a society does or does not value long-term commitments and respect for tradition.

Understanding the variation

• Ecology of the Web - structure of the environment, producers and consumers

• Populations (individuals and species), traits/characteristics, heredity, genotypes and phenotypes

• Mechanisms - variation (mutation, migration, HGT, genetic drift), selection

• Outcomes - adaptation, co-evolution, competition, co-operation, speciation, extinction

but How to do the Science?

Web Science Reflections

Is the Web changing faster than our ability to observe it?

How to measure or instrument the Web?

How to identify behaviors and patterns?

How to analyze the changing structure of the Web?

Big Bang: Web Information

• the assumption of the open exchange of information is being imposed on society

• do the Web, open access, open data, and the scientific and creative commons offer a beneficial opportunity or a dangerous cul-de-sac?

Open Questions

• How is the world changing as other parts of society impose their requirements on the Web? E.g. current examples with SOPA/PIPA and ACTA: requirements for security and policing taking over the free exchange of information and unrestricted transfer of knowledge

• Are the public and open aspects of the Web a fundamental change in society’s information processes, or just a temporary glitch? E.g. are open source, open access, open science & creative commons efficient alternatives to fee-based knowledge transfer?

Open Questions

• do we take Web for granted as provider of a free and unrestricted information exchange?

• is Web Science the response to the pressure for the Web to change - to respond to the issues of security, commerce, criminality and privacy?

• What are the challenges for Web science?

• to explain how the Web impacts society?

• to predict the outcomes of proposed changes to Web infrastructure on business & society?

What can you do as a Computer Scientist?

specifically for the Social Web

Hands-on Teaser

• Q&A on Assignments

• Pitch of the Social Web Apps

image source: http://www.flickr.com/photos/bionicteaching/1375254387/
