Sustainability: Web Site Statistics
Marieke Napier
UKOLN
University of Bath
Bath, BA2 7AY
UKOLN is supported by:
[email protected]
http://www.ukoln.ac.uk/
Sustainability Workshop, London, 16 May 2002
Web Site Statistics
This presentation will:
• Give a (very) brief overview of what Web statistics are
• Consider why we need them
• Focus on the analysis of usage data created by your Web site
• Look at what other criteria, besides Web server statistics, can be used to provide performance indication
What are Web Statistics?
• Web statistics are produced by the Web server software
• Information (such as IP address, name of resource) is recorded in log files
• It is also possible to configure your server to record more information (such as referrer details)
• The log files produced are mainly accurate
• However, interpretation of the statistics can be misleading
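As an illustration of what a log file records, here is a minimal Python sketch that reads one line in the widely used NCSA Combined Log Format (the Common Log Format plus referrer and user agent). The sample line, its IP address and its URLs are invented for the example; your own server may be configured to log different fields.

```python
import re

# Pattern for the NCSA Combined Log Format. The referrer and user agent
# fields are optional so that plain Common Log Format lines also match.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<resource>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# Invented sample entry (documentation IP address and example.org URL).
sample = (
    '192.0.2.10 - - [16/May/2002:10:15:32 +0100] '
    '"GET /index.html HTTP/1.0" 200 5120 '
    '"http://www.example.org/links.html" "Mozilla/4.0"'
)
entry = parse_log_line(sample)
print(entry["host"], entry["resource"], entry["referrer"])
```

The IP address and resource name are always present; the referrer only appears if the server has been configured to record it.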
Why do we Need Them?
• They indicate how popular your site is
• They show how successful your marketing strategy has been
• They can be used in management reports
• They can identify gaps in service provision
• They help you predict and plan for future load patterns
• They allow you to monitor performance levels
• They can be used when considering the deployment of new technologies
• They can inform and motivate contributors
• They can show who your users are
• NOF have asked for them
The HTTP Process
• A user clicks a link or enters a URL
• The browser downloads the HTML page from the remote Web server
• The HTML page is interpreted and any inline objects are also downloaded:
– Each image (occurrence of <IMG SRC="image1">)
– Background image or sound
– External JavaScript or stylesheet files etc.
• The user follows a path through the site making new requests till they leave your site
Summary
Each individual user's request for a page can produce multiple requests at the remote server and generate multiple hits.
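To make the point concrete, the following Python sketch estimates how many requests a single page view produces by counting the inline objects in the HTML. The page content is invented, and the sketch assumes that each distinct object is fetched once per page load and that nothing is already cached; real browsers and caches will behave differently.

```python
from html.parser import HTMLParser


class InlineObjectCounter(HTMLParser):
    """Collect the URLs of inline objects a browser would also request."""

    def __init__(self):
        super().__init__()
        self.objects = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower()
        if tag in ("img", "script") and attrs.get("src"):
            self.objects.append(attrs["src"])
        elif tag == "link" and rel == "stylesheet" and attrs.get("href"):
            self.objects.append(attrs["href"])
        elif tag == "body" and attrs.get("background"):
            self.objects.append(attrs["background"])


def count_requests(html):
    """One request for the page itself plus one per distinct inline object."""
    parser = InlineObjectCounter()
    parser.feed(html)
    return 1 + len(set(parser.objects))


# Invented page: one stylesheet, one script, one background, two images
# (logo.gif is used twice but assumed to be requested only once).
PAGE = """
<html><head>
<link rel="stylesheet" href="style.css">
<script src="menu.js"></script>
</head>
<body background="bg.gif">
<img src="logo.gif"><img src="photo.jpg"><img src="logo.gif">
</body></html>
"""
print(count_requests(PAGE))
```

Under these assumptions, one page view of the sample page shows up in the log as six hits.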
Viewing Web Statistics
• Server log files are available to view… but may not make a lot of sense at first look
• The Analog program (Cambridge University) was one of the first packages to provide a graphical summary of Web log files
http://www.statslab.cam.ac.uk/~sret1/stats/stats.html
Web Statistics: Terms Used
Hit
• Any information requested from a site - this includes HTML pages, pictures, forms, scripts and files downloaded
• Can be affected by redesign, robots, caching etc.
Page Views (or requests/impressions)
• The number of pages viewed
• Identified by extensions such as .htm, .html, .asp etc.
User Sessions
• Series of requests from a unique IP address within a period of time (more accurate if users are registered)
• Issues with firewalls, institutional caches etc.
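The three terms can be compared side by side with a small Python sketch. It is only an approximation under stated assumptions: the request data is invented, pages are recognised purely by the extensions listed above, and a session is taken to end after 30 minutes without a request from the same IP address (the 30-minute figure is an assumption for the example, not a value from this presentation).

```python
from datetime import datetime, timedelta

PAGE_EXTENSIONS = (".htm", ".html", ".asp")
SESSION_TIMEOUT = timedelta(minutes=30)  # assumed inactivity limit


def summarise(requests):
    """Count hits, page views and user sessions.

    requests is a list of (ip_address, timestamp, resource) tuples.
    """
    hits = len(requests)
    page_views = sum(
        1 for _, _, resource in requests
        if resource.lower().endswith(PAGE_EXTENSIONS)
    )
    sessions = 0
    last_seen = {}  # IP address -> time of its most recent request
    for ip, timestamp, _ in sorted(requests, key=lambda r: r[1]):
        previous = last_seen.get(ip)
        if previous is None or timestamp - previous > SESSION_TIMEOUT:
            sessions += 1
        last_seen[ip] = timestamp
    return {"hits": hits, "page_views": page_views, "user_sessions": sessions}


# Invented sample data using documentation IP addresses.
start = datetime(2002, 5, 16, 10, 0)
sample_requests = [
    ("192.0.2.10", start, "/index.html"),
    ("192.0.2.10", start + timedelta(seconds=1), "/logo.gif"),
    ("192.0.2.10", start + timedelta(minutes=5), "/about.htm"),
    ("192.0.2.20", start + timedelta(minutes=10), "/index.html"),
    ("192.0.2.10", start + timedelta(hours=2), "/index.html"),
]
print(summarise(sample_requests))
```

Because sessions are keyed on IP address alone, many users behind one firewall or institutional cache would be counted as a single session, which is the issue noted in the last bullet.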
Interpretation Issues
Profiling users - can we track users easily?
• You can’t tell the exact identity of your users
• Using IP addresses, domain names of visitors
• Following paths – entering and exiting the site
• Registration
Caching
• Browser caching and institutional/ISP caching
Robots
• Necessary to enable your resources to be found
• Robots generate hits
Quality??
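One rough way to see how much robots inflate the figures is to separate hits by user agent, as in the Python sketch below. The list of marker strings is illustrative and far from exhaustive, the records are invented, and a robot that does not identify itself will not be caught; this assumes the server has been configured to log the user agent.

```python
# Illustrative substrings only; real robot detection needs a maintained list.
ROBOT_MARKERS = ("bot", "crawler", "spider", "slurp")


def is_robot(user_agent):
    """Guess whether a user agent string belongs to a robot."""
    agent = user_agent.lower()
    return any(marker in agent for marker in ROBOT_MARKERS)


def split_hits(records):
    """Split (resource, user_agent) records into robot hits and other hits."""
    robot_hits = [record for record in records if is_robot(record[1])]
    other_hits = [record for record in records if not is_robot(record[1])]
    return robot_hits, other_hits


# Invented sample records.
sample_records = [
    ("/index.html", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"),
    ("/robots.txt", "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"),
    ("/index.html", "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"),
    ("/about.html", "Lynx/2.8.4rel.1 libwww-FM/2.14"),
]
robot_hits, other_hits = split_hits(sample_records)
print(len(robot_hits), "robot hits,", len(other_hits), "other hits")
```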
Log Analysis Tools
There are many tools available:
• Analog: free, easily automated. However, it has few data-mining capabilities and limited management graphs.
• WebTrends: popular desktop package. Several versions. May be expensive for reporting on multiple Web sites.
• Webalizer, WebVisit, HitList, Reportmagic etc.
• A list is available at http://uk.dir.yahoo.com/Computers_and_Internet/Software/Internet/World_Wide_Web/Servers/Log_Analysis_Tools/
Externally-Hosted Services
Two services have been used extensively by UKOLN: SiteMeter and NedStat
http://www.sitemeter.com/
http://www.nedstats.com/
• Advantages:
– No software to buy, install, configure and run, and no powerful PC needed to run it on
– No log files to manage
– Uses "cache-busting" images
– Can monitor extra features
• Disadvantages:
– Limited data-mining
– Loss of ownership of data
– Dependency on external service
– Fails to monitor text browsers
Other Performance Indicators
Links to Your Site
• Indicators that people are interested in your service (and can deliver traffic)
Search Engine Coverage
• Indicators that users can find resources on your Web site
User Feedback
• Comments, voting, etc.
Technical Indicators
• Browser support, broken links, server uptime, etc.
Links To Your Site
• Links are an indication of potential use of your Web site
• Search engines can be used to report on the numbers of links to a Web site
• LinkPopularity.com provides an interface to 3 search engines
• Monthly reports can be obtained
http://www.linkpopularity.com
Coverage By Search Engines
• Have you promoted your Web site?
• Can your Web site be accessed by search engines?
• Are you near the top of the search results?
• Search engines can report on their coverage of your Web site
• Coverage is an indication of potential use of your Web site
For information on how to ensure that your web site has been indexed see the section on Promotion of your Project Web
Technical Indicators
Broken Links
• How many links are there on your Web site (internal and external)?
• How many broken links are there?
• Use services like linkalarm.com
Server Availability
• Recording down time
• Email alerting
• Use services like InternetSeer.com
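Recorded down time can be turned into a single availability figure for a management report. The Python sketch below shows one way to do this; the reporting period and outage times are invented, and it assumes the recorded outages do not overlap one another.

```python
from datetime import datetime


def availability(period_start, period_end, outages):
    """Return the percentage of the period during which the server was up.

    outages is a list of (start, end) datetimes, assumed not to overlap.
    Outages are clipped to the reporting period before being counted.
    """
    total = (period_end - period_start).total_seconds()
    down = 0.0
    for start, end in outages:
        clipped_start = max(start, period_start)
        clipped_end = min(end, period_end)
        if clipped_end > clipped_start:
            down += (clipped_end - clipped_start).total_seconds()
    return 100.0 * (total - down) / total


# Invented example: a ten-day period with two recorded outages
# (2 hours and 24 minutes, i.e. 2.4 hours out of 240).
period_start = datetime(2002, 5, 1)
period_end = datetime(2002, 5, 11)
recorded_outages = [
    (datetime(2002, 5, 3, 2, 0), datetime(2002, 5, 3, 4, 0)),
    (datetime(2002, 5, 8, 14, 0), datetime(2002, 5, 8, 14, 24)),
]
uptime = availability(period_start, period_end, recorded_outages)
print(f"Availability: {uptime:.2f}%")
```

The same approach works for broken links: divide the number of broken links by the total number of links checked to get a percentage that can be tracked from month to month.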
Conclusions
• Web statistics can be difficult to interpret
• Analysis of Web statistics is needed for lots of reasons
• Think about the tools you will need (and the resource implications of using them)
• Besides analysis of log files there are other performance indicators which may be of use
• Analysis will also help in monitoring the performance of your Web site and planning future developments
Any Questions?
This presentation is loosely based on the Information Paper on Web Site Performance Monitoring available at:
http://www.ukoln.ac.uk/nof/support/help/papers/performance/