What is Webometrics?

Mike Thelwall

Statistical Cybermetrics Research Group

University of Wolverhampton, UK

Virtual Knowledge Studio (VKS)

Information Studies

1. Introduction

□ Webometrics is concerned with gathering data on and measuring aspects of the Web:
  □ web sites
  □ web pages
  □ hyperlinks
  □ web search engine results
  □ YouTube video commenter networks
  □ MySpace Friend networks

□…for very varied social science purposes

New problems: Web-based phenomena

□ Webometrics can be applied to understanding web-based phenomena:
  □ Why do web sites interlink?
  □ Which web sites interlink?
  □ What interlinking patterns exist?
  □ What topics are frequently blogged about?
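A question such as "Which web sites interlink?" can be approached by extracting hyperlinks from pages and counting those that cross site boundaries. The following is a minimal sketch under stated assumptions, not the method used by any tool named in these slides: the class `LinkExtractor`, the helper functions, and the `example` pages are all hypothetical, and a "site" is crudely equated with a URL's host name.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def site_of(url):
    """Return the lower-cased host name of a URL as a crude 'site' label."""
    return urlparse(url).netloc.lower()


def count_intersite_links(pages):
    """Count links between distinct sites.

    `pages` maps a page URL to its HTML source.
    Returns a Counter keyed by (source_site, target_site).
    """
    counts = Counter()
    for page_url, html in pages.items():
        parser = LinkExtractor()
        parser.feed(html)
        source = site_of(page_url)
        for href in parser.hrefs:
            # Resolve relative links against the page URL before comparing.
            target = site_of(urljoin(page_url, href))
            if target and target != source:
                counts[(source, target)] += 1
    return counts


# Hypothetical pages, invented for illustration only.
pages = {
    "http://www.uni-a.example/index.html":
        '<a href="/about.html">About</a> '
        '<a href="http://www.uni-b.example/">B</a>',
    "http://www.uni-b.example/links.html":
        '<a href="http://www.uni-a.example/research">A</a>'
        '<a href="http://www.uni-a.example/staff">A staff</a>',
}
link_counts = count_intersite_links(pages)
```

Internal links (here `/about.html`) are ignored, so `link_counts` holds only the site-to-site pairs. A real study would also need a crawler and a considered definition of "site", for example grouping by domain rather than by host.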

Old problems: Offline phenomena reflected online

□ Some offline phenomena have measurable online reflections:
  □ International communication
  □ Inter-university collaboration
  □ University-business collaboration
  □ The impact or spread of ideas
  □ Public opinion

2. Examples

Blog searching - blogpulse.com

Example: Identifying and tracking public science concerns in blogs

Over 100,000 blogs and other sources tracked daily via RSS feeds

Objective: to identify and track public concerns about science

E.g., “Schiavo” identified and tracked as a potential public science concern
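Tracking a term in RSS feeds amounts to counting, per day, the feed items that mention it. The sketch below illustrates that idea only; it is not the project's actual pipeline. The function name `daily_keyword_counts` and the feed in `SAMPLE_FEED` are hypothetical, and the code assumes RSS 2.0 items with `title`, `description` and `pubDate` elements.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from email.utils import parsedate_to_datetime


def daily_keyword_counts(rss_xml, keyword):
    """Count feed items per day whose title or description mentions keyword."""
    root = ET.fromstring(rss_xml)
    needle = keyword.lower()
    counts = Counter()
    for item in root.iter("item"):
        text = " ".join(
            (item.findtext(tag) or "") for tag in ("title", "description")
        ).lower()
        pub_date = item.findtext("pubDate")
        if needle in text and pub_date:
            # RSS 2.0 dates follow the RFC 822 style, e.g. "Mon, 21 Mar 2005 ...".
            day = parsedate_to_datetime(pub_date).date().isoformat()
            counts[day] += 1
    return counts


# Hypothetical feed content, invented for illustration only.
SAMPLE_FEED = """
<rss version="2.0"><channel><title>Hypothetical blog</title>
<item><title>Thoughts on the Schiavo case</title>
<description>First post</description>
<pubDate>Mon, 21 Mar 2005 09:00:00 GMT</pubDate></item>
<item><title>Science and the courts</title>
<description>More on Schiavo and medical evidence</description>
<pubDate>Mon, 21 Mar 2005 17:30:00 GMT</pubDate></item>
<item><title>Weekend gardening</title>
<description>Nothing about science</description>
<pubDate>Mon, 21 Mar 2005 20:00:00 GMT</pubDate></item>
<item><title>Schiavo again</title>
<description>Follow-up</description>
<pubDate>Tue, 22 Mar 2005 08:15:00 GMT</pubDate></item>
</channel></rss>
"""

schiavo_counts = daily_keyword_counts(SAMPLE_FEED, "Schiavo")
```

Plotting such daily counts over time, across many feeds, gives the kind of trend line that can flag a sudden rise in discussion of a topic.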

Example: The online impact of research groups (NetReAct)

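Example: Links between EU universities

[Figure: network diagram of links between EU universities. Normalised linking, smallest countries removed. Annotation: "Geopolitical connected". Countries labelled: Sweden, Finland, Norway, UK, Germany, Austria, Switzerland, Poland, Italy, Belgium, Spain, France, NL.]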

International biofuels research network

Example: MySpace age profiles

Percentage of profiles containing swearing

Group               Moderate  Strong  Very strong  Sample size
US males 16-19      10%       47%     2%           1,530
US females 16-19    11%       38%     2%           1,287
UK males 16-19      33%       33%     8%           171
UK females 16-19    18%       38%     3%           130

(Typical sample size for non-web swearing research: 20-148)
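Because the four samples differ greatly in size, the percentages above carry different amounts of uncertainty. As a rough illustration, the sketch below computes a 95% Wilson score interval for two of the reported proportions. The choice of the Wilson interval is an assumption made here, not something stated in the slides, and the inputs are the rounded percentages from the table, so the results are approximate.

```python
import math


def wilson_interval(p_hat, n, z=1.96):
    """Return the (lower, upper) Wilson score interval for a proportion.

    p_hat: observed proportion (0..1); n: sample size; z: normal quantile
    (1.96 corresponds to roughly 95% confidence).
    """
    denom = 1 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    )
    return centre - half_width, centre + half_width


# Moderate swearing, using the rounded figures from the table.
uk_males_low, uk_males_high = wilson_interval(0.33, 171)
us_males_low, us_males_high = wilson_interval(0.10, 1530)

print(f"UK males: {uk_males_low:.3f} to {uk_males_high:.3f}")
print(f"US males: {us_males_low:.3f} to {us_males_high:.3f}")
```

The interval for the smaller UK sample is noticeably wider than the one for the US sample, which is worth keeping in mind when comparing groups.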

Emphatic adverb/adjective OR adverbial booster OR premodifying intensifying negative adjective (36% of swearing)

□ and we r guna go to town again n make a ryt fuckin nyt of it again lol
□ see look i'm fucking commenting u back
□ lol and stop fucking tickleing me!!
□ Thanks for the party last night it was fucking good and you are great hosts.
□ That 50's rock and roll weekender was fucking mint!
□ Fuckin my space, my arse
□ 1/2 d ppl cudnt even speak fuckin english!
□ yeah so me and sarah broke up and everythings fucking shit

YouTube – Video poster ages

YouTube friend network

Online impact - Keywords in web pages mentioning IWRM

Data Gathering/Processing Tools

□Blogpulse.com – blog network diagrams

□LexiURL Searcher – links, web text, YouTube, Flickr, Technorati

□Issue Crawler, Google TouchGraph - links

Discussion points for online data

□ Validity – is the underlying meaning of the text/video/picture readily apparent to the researcher?
  □ Possibly not to any great degree for teenagers’ MySpace comments or very personal YouTube videos
□ Reliability – are search engines accurate/good at returning the correct results?
  □ Google blog search shows unreliability – very variable over time
  □ Researchers can triangulate across different, similar search engines or over time to test reliability
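One simple way to make the reliability check concrete is to repeat the same query on several days and compare how much each engine's reported hit counts fluctuate. The sketch below does this with the coefficient of variation. The engine names and all numbers are hypothetical, invented for illustration, and the choice of statistic is an assumption rather than a method given in the slides.

```python
from statistics import mean, pstdev


def coefficient_of_variation(counts):
    """Return standard deviation divided by mean for a series of hit counts."""
    average = mean(counts)
    if average == 0:
        raise ValueError("mean hit count is zero; variation is undefined")
    return pstdev(counts) / average


# Hypothetical hit counts for one query, recorded on five consecutive days.
hit_counts = {
    "engine_a": [1000, 1020, 980, 1010, 990],
    "engine_b": [1000, 400, 1600, 700, 1300],
}

variability = {
    name: coefficient_of_variation(counts)
    for name, counts in hit_counts.items()
}

for name, value in variability.items():
    print(f"{name}: coefficient of variation = {value:.3f}")
```

An engine whose counts swing widely from day to day for an unchanged query, like the hypothetical `engine_b`, would be a weaker basis for conclusions than one with stable counts.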

Discussion points for online data

□ Coverage – to what extent are all the phenomena of interest covered by the source (e.g., search engine) used?

□Sample bias – are certain types of people over-represented? (e.g., the more literate, the more vocal, the more politically active, youth, educated, creative types…)

Summary

□The web contains a wide variety of interesting web and “web 2.0” content posted by many different people in many different formats

□Webometric methods can give insights into this data

Books

□Thelwall, M. (2009). Introduction to webometrics: Quantitative web research for the social sciences. New York: Morgan & Claypool.

□Rogers, R. (2005). Information politics on the Web. Massachusetts: MIT Press.

□ http://lexiurl.wlv.ac.uk
□ http://webometrics.wlv.ac.uk
□ http://www.issuecrawler.net

Important considerations

□ Data accuracy
□ Data cleaning
□ Context to help interpret results
□ Report results carefully

Example: Analysis of the accuracy of search engine results

Live Search results analysis
