Ethical Forum University Foundation
November 2009
How does Google work? Google as a search engine
Vincent Blondel
Louvain School of Engineering, UCL
1996: Research project, by Larry Page and Sergey Brin at Stanford University
1998: Google Inc. founded as a company. 25 million webpages indexed.
2000: One billion webpages indexed.
2004: Google goes public.
2005: One billion images.
2006: "to google" added to the Oxford English Dictionary.
1998
2009
«Don’t be evil»
Larry Page and Sergey Brin
PageRank 6
PageRank 5
Google employs a number of techniques to improve search quality including PageRank, anchor text, and proximity information.
PageRank 10
www.fnrs.be PageRank 7
PageRank 9 www.uclouvain.be
PageRank 8 www.kbr.be
www.google.be
www.fondationuniversitaire.be PageRank 6
PageRank democracy: let the links vote
Your webpage inherits a high PageRank if it is pointed to by pages that themselves have a high PageRank.
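The "links vote" idea can be sketched in a few lines of Python. This is only an illustration, not Google's actual implementation: the toy web, the page names (`hub`, `star`, `plain`, `p1` to `p3`) and the damping value of 0.85 are all assumptions made for the example. The scores it produces are fractions that sum to 1, not the 0 to 10 values shown on the slides.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank over a dict {page: [pages it links to]}.

    `damping` and `iterations` are illustrative defaults, not values
    taken from the slides.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal scores

    for _ in range(iterations):
        # Every page receives a small baseline share, independent of links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote equally among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its vote over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical toy web: "star" and "plain" each have exactly one incoming link,
# but "star" is linked from the popular "hub", "plain" from the obscure "p3".
toy_web = {
    "p1": ["hub"],
    "p2": ["hub"],
    "p3": ["plain"],
    "hub": ["star"],
    "star": ["hub"],
    "plain": ["hub"],
}

scores = pagerank(toy_web)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{page:6s} {score:.3f}")
```

In this toy web, `star` and `plain` have the same number of incoming links, yet `star` ends up with a much higher score because its single link comes from a page that is itself highly ranked.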
How is this done?
• Frequent updates (Googlebot)
• Sophisticated distributed computation
36 data centers (19 in the U.S., 12 in Europe). About 450,000 computers.
• Storage
To be googleable or not to be
• Position 1: 100%
• Position 2: 100%
• Position 3: 100%
• Position 4: 85%
• Position 5: 60%
• Position 6: 50%
• Position 7: 50%
• Position 8: 30%
• Position 9: 30%
• Position 10: 20%
Golden triangle
Google ranking robustness
• Google changes the algorithm and the ranking
• Google removes webpages from index
• Google bombing
• Buy links
Kinderstart had traffic of 10 million a month until something caused its ranking to drop. Traffic then dropped 70 percent and revenue dropped 80 percent. Kinderstart sued Google.
From: ***
Date: December 16, 2006 1:20:56 PM CST
To: ***
Subject: google

Dear Girish,
Vincent is the person I had mentioned who has ceased to exist now that his webpage is no longer reachable via Google!
Can you please give him a pointer on whom to write to?
I am copying him on this message.
Thanks,
--Kumar
--------------
P. R. Kumar
Franklin W. Woeltge Professor of Electrical and Computer Engineering, and
Research Professor, Coordinated Science Lab
From: ***
Date: December 18, 2006 6:29
To: Vincent Blondel <[email protected]>
Subject: google

Hi Prof. Blondel,
I'm Prof. Kumar's ex-student, now working at Google. He told me about the problems you were having with your website listings under Google search. It seems your site is being searched properly now - it shows up at the top of the results while searching for your name.
Best,
Girish
Google: «miserable failure»
Email from a UCL colleague (sometime in 2007).
«One day, I needed the CV of the rector of UCL to submit a project proposal. I used Google and typed "Bernard Coulie". Without paying attention, I actually entered "BernardCoulie", without a space, and landed on two links internal to UCL. They were two files listing the salaries of all members of the university. By accident, these files were poorly protected and accessible to everyone.
I alerted the IT staff. General alarm. Overnight, everything was fixed.
That left the caches, for which Google and the other search engines had to be contacted so that the information would disappear completely from the web.»