IC05 lecture 1

IC 05 / spring semester 2008

Franck Ghitalla
TSH Department
President of WebAtlas

Form(s) and type(s) of networks (between order and randomness)

Aggregates and power law (information geography)

Dynamics (information IN and ON nets)

Structure/content correlation (information distribution on power-law topology)

Reference works in Network Sciences

Conceptual frameworks
A.-L. BARABASI - Linked: The New Science of Networks, new ed. 2005.
S. JOHNSON - Emergence: The Connected Lives of Ants, Brains, Cities, and Software, 2002.

Graph theory
D. WATTS - Six Degrees: The Science of a Connected Age, 2004.
S. STROGATZ - Sync: The Emerging Science of Spontaneous Order, 2004.
M. NEWMAN - The Structure and Dynamics of Networks, 2003.

Web mining
S. CHAKRABARTI - Mining the Web, 2002.
J. KLEINBERG - Algorithm Design, 2006.

InfoViz
B. SHNEIDERMAN - Readings in Information Visualization: Using Vision to Think, 1999.

The problem: in social networks, web graphs, or the organisation of cells (and indeed in any complex system), we have to account for the presence of two seemingly contradictory properties.

On the one hand, the network should display a large clustering coefficient, meaning that on average a person's friends are far more likely to know each other than two people chosen at random. On the other hand, it should be possible to connect two people chosen at random via a chain of only a few intermediaries. (D. Watts, p. 77, Six Degrees)
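The two properties in the quotation above can each be given a number. The following is a minimal sketch, not taken from the course: it assumes an undirected network stored as a dictionary of neighbour sets, uses the average local clustering coefficient as the clustering measure, and uses the mean shortest-path length as the "chain of intermediaries" measure. The toy network and the function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def clustering_coefficient(adj):
    """Average, over all nodes, of the share of a node's neighbour pairs that are linked."""
    scores = []
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            scores.append(0.0)  # no pair of friends to compare
            continue
        linked = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        scores.append(linked / (k * (k - 1) / 2))
    return sum(scores) / len(scores)


def average_distance(adj):
    """Mean shortest-path length over all reachable ordered pairs (one BFS per node)."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


# Hypothetical toy network: two triangles of friends joined by one bridge (c-d).
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

print(round(clustering_coefficient(adj), 3))  # 0.778
print(round(average_distance(adj), 3))        # 1.8
```

In this toy case most friends of a node know each other (high clustering), while a single bridge keeps every pair within three steps.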

1) Clusters, proximity and long distances

Most networks (not only social networks) display what we call clustering, which is really just to say that most people's friends are also to some extent friends of each other (D. Watts, p. 40, Six Degrees)

Most real networks (web, biology, social organization) are highly clustered.

In a social network, if each agent has 100 friends, and each of those has 100 friends in turn, and so on, then at 5 degrees of distance we reach about 9 billion people. In principle, mathematically, this kind of expansion explains the famous constant "six degrees" (left).

However, in social networks, or on the web for sites and pages, there is strong structural redundancy: my friends are very likely to also be friends of my friends (that is, to belong to the same social cluster or community).
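The gap between the two pictures can be made concrete. The sketch below is illustrative only and rests on two assumptions that are not in the slides: the tree-like count removes one friend per step (the person through whom we arrived), which is one plausible way to land near the figure of 9 billion, and the clustered case is modelled as a ring in which everyone knows only their 100 nearest neighbours. Function names are hypothetical.

```python
from collections import deque


def naive_reach(friends, degrees):
    """People at exactly `degrees` steps if no two people ever share a friend."""
    return friends * (friends - 1) ** (degrees - 1)


def ring_lattice(n, k):
    """Fully clustered toy world: each node is linked to its k nearest ring neighbours."""
    half = k // 2
    return {i: {(i + d) % n for d in range(-half, half + 1) if d != 0}
            for i in range(n)}


def reach_within(adj, source, degrees):
    """Number of distinct people found within `degrees` steps of `source` (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == degrees:
            continue  # do not expand beyond the allowed number of steps
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return len(dist) - 1


print(naive_reach(100, 5))                         # 9605960100
print(reach_within(ring_lattice(2000, 100), 0, 5))  # 500
```

With the same 100 friends per person, redundancy shrinks the reach at 5 degrees from billions to a few hundred in this extreme case, which is why clustering and short chains look contradictory.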

[Figures: graph of sites; graph of keywords]

The highly clustered structure of the web is easily measured through density calculations on the distribution of hypertext connectivity between sites. This is one of the principles developed by J. Kleinberg as early as 1996 for the design of the HITS algorithm (Hypertext Induced Topic Search).
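As an illustration of how link structure alone can rank pages, here is a minimal sketch of the hub/authority iteration usually associated with HITS. It is a simplified textbook version, not Kleinberg's full procedure (it skips the construction of the topic-focused subgraph), and the page names and the `hits` function are hypothetical.

```python
import math


def hits(links, iterations=50):
    """Iterate hub and authority scores; `links` maps a page to the pages it points to."""
    pages = set(links) | {q for targets in links.values() for q in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority: sum of the hub scores of the pages that link to me.
        auth = {p: 0.0 for p in pages}
        for p, targets in links.items():
            for q in targets:
                auth[q] += hub[p]
        # Hub: sum of the authority scores of the pages I link to.
        hub = {p: sum(auth[q] for q in links.get(p, ())) for p in pages}
        # Normalise both score vectors to unit length so values stay bounded.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, auth


# Hypothetical toy web: three pages pointing at two resources.
links = {"h1": {"a1", "a2"}, "h2": {"a1", "a2"}, "h3": {"a1"}}
hub, auth = hits(links)
print(max(auth, key=auth.get))  # a1
```

Pages that are pointed to by many good hubs become authorities, and pages that point to many good authorities become hubs, so dense clusters of mutually reinforcing links stand out.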

2) Randomness, universality and complex systems

Some of the properties of extremely complicated systems can be understood without knowing anything about their detailed structure or governing rules (D. Watts, p. 65, Six Degrees)

A random graph is, as the name might suggest, a network of nodes connected by links in a purely random fashion (D. Watts)

(left) Random graph (undirected and directed)

(right) Random graph (Rényi)
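A random graph of this kind is straightforward to generate. The sketch below assumes the common variant in which every possible link is created independently with the same probability p; the slides do not say which variant is pictured, and the function name and parameters are hypothetical.

```python
import random


def random_graph(n, p, directed=False, seed=None):
    """Link each pair of the n nodes with probability p, independently of all others."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(n):
            # Skip self-loops; in the undirected case visit each pair only once.
            if i == j or (not directed and j < i):
                continue
            if rng.random() < p:
                adj[i].add(j)
                if not directed:
                    adj[j].add(i)  # an undirected link is stored in both directions
    return adj


undirected = random_graph(200, 0.05, seed=1)
directed = random_graph(200, 0.05, directed=True, seed=1)
mean_degree = sum(len(nbrs) for nbrs in undirected.values()) / len(undirected)
print(mean_degree)  # expected to be close to p * (n - 1) = 9.95
```

In the directed version `adj[i]` holds only the outgoing links of node i, so a link from i to j does not imply one from j to i.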

3) Small World

All we needed to do was find a way to tune each network between complete order and complete disorder in a way that it traced through all the various intermediate stages (D. Watts, p. 86, Six Degrees)

[Figure: three networks]
Clustering degree = high, distance degree = high (ordered network)
Clustering degree = low, distance degree = low (random network)
Clustering degree = high, distance degree = low (small world)

A space of possible worlds: a parameter we can tune from 0 to 1, from randomness to order, in which, at one end of the spectrum, individuals always make new friends through their current friends and, at the other end, they never do. In the middle, there is a version of reality. (D. Watts)

The beta model: the order-randomness spectrum (D. Watts, S. Strogatz)

[Figure axes: order, from 0 to 1; randomness, from 0 to 1]

On average, for a network of 1 million nodes laid out as a regular lattice (each node having 100 close neighbours, 50 on the left and 50 on the right), adding just 5 random links cuts the average distance by 50%. That number has to be multiplied by 10 (that is, 5 x 10 = 50) to cut the average by a further 50%, and so on. The effects are therefore diminishing.
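The effect of a handful of random links can be explored on a much smaller lattice. The sketch below is an illustration under stated assumptions, not the beta model itself: it adds shortcuts on top of a ring lattice instead of rewiring existing links, and it uses 500 nodes with 6 neighbours each so that it runs in seconds, so the percentages will differ from the 1-million-node figures quoted above. All function names are hypothetical.

```python
import random
from collections import deque


def ring_lattice(n, k):
    """Ordered world: each node is linked to its k nearest ring neighbours."""
    half = k // 2
    return {i: {(i + d) % n for d in range(-half, half + 1) if d != 0}
            for i in range(n)}


def add_shortcuts(adj, count, seed=0):
    """Return a copy of `adj` with `count` extra links between random unlinked pairs."""
    rng = random.Random(seed)
    new = {u: set(vs) for u, vs in adj.items()}
    nodes = list(new)
    added = 0
    while added < count:
        u, v = rng.sample(nodes, 2)  # two distinct nodes
        if v in new[u]:
            continue  # already linked, draw again
        new[u].add(v)
        new[v].add(u)
        added += 1
    return new


def average_distance(adj):
    """Mean shortest-path length over all reachable ordered pairs (one BFS per node)."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


lattice = ring_lattice(500, 6)
for shortcuts in (0, 5, 50):
    world = add_shortcuts(lattice, shortcuts)
    print(shortcuts, round(average_distance(world), 2))
```

Because the same seed is reused, the 50-shortcut world contains the 5-shortcut world, so the printed averages can only go down; comparing the size of the two drops gives a feel for the diminishing returns described above.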

3 properties of web graphs:
a) form
b) distribution of order
c) domains (words, links, actors)
