Sustainability: Web Site Statistics

Marieke Napier

UKOLN

University of Bath

Bath, BA2 7AY

UKOLN is supported by:

Email: m.napier@ukoln.ac.uk

URL: http://www.ukoln.ac.uk/

Sustainability Workshop, London, 16 May 2002

Web Site Statistics

This presentation will:
• Give a (very) brief overview of what Web statistics are
• Consider why we need them
• Focus on the analysis of usage data created by your Web site
• Look at what other criteria, besides Web server statistics, can be used as performance indicators

What are Web Statistics?

• Web statistics are produced by the Web server software
• Information (such as IP address, name of resource) is recorded in log files
• It is also possible to configure your server to record more information (such as referrer details)
• The log files produced are mainly accurate
• However, interpretation of the statistics can be misleading
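
As an illustration of what a log file records, the sketch below pulls the IP address, resource name and referrer out of a single log line. It assumes the widely used NCSA "combined" log layout; your server may be configured differently. The function name `parse_log_line` and the sample line are made up for this example.

```python
import re
from typing import Optional

# Assumption: the server writes the NCSA "combined" log format.
# The referrer and user-agent fields are optional, so plain "common"
# format lines are accepted too.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def parse_log_line(line: str) -> Optional[dict]:
    """Return the fields of one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # The request field normally holds "METHOD /path PROTOCOL".
    parts = entry["request"].split()
    entry["resource"] = parts[1] if len(parts) >= 2 else None
    return entry


# Made-up sample line, for illustration only.
SAMPLE = (
    '192.0.2.10 - - [16/May/2002:10:15:32 +0100] '
    '"GET /index.html HTTP/1.0" 200 4523 '
    '"http://www.example.org/links.html" "Mozilla/4.0"'
)

entry = parse_log_line(SAMPLE)
print(entry["host"], entry["resource"], entry["referrer"])
```

Lines that do not match the assumed layout return `None` rather than raising an error, so a malformed line does not stop the analysis of a whole file.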

Why do we Need Them?

• They indicate how popular your site is
• They show how successful your marketing strategy has been
• They can be used in management reports
• They can identify gaps in service provision
• They help you predict and plan for future load patterns
• They allow you to monitor performance levels
• They can be used in consideration of deployment of new technologies
• They can inform and motivate contributors
• They can show who your users are
• NOF have asked for them

The HTTP Process

• A user clicks a link or enters a URL
• The browser requests the HTML page from the remote Web server and downloads it
• The HTML page is interpreted and any inline objects are also downloaded:
  – Each image (occurrence of <IMG SRC="image1">)
  – Background image or sound
  – External JavaScript or stylesheet files etc.
• The user follows a path through the site, making new requests until they leave your site

Summary: Each individual user's request for a page can produce multiple requests at the remote server and generate multiple hits.
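
To make the summary concrete, here is a rough sketch that estimates how many hits a single page view might generate by counting the inline objects a browser would usually fetch. The class and function names and the sample page are invented for this example, and caching is ignored.

```python
from html.parser import HTMLParser


class InlineObjectCounter(HTMLParser):
    """Collect the URLs a browser would typically fetch along with a page."""

    def __init__(self):
        super().__init__()
        self.objects = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.objects.append(attrs["src"])
        elif (tag == "link" and attrs.get("href")
              and (attrs.get("rel") or "").lower() == "stylesheet"):
            self.objects.append(attrs["href"])
        elif tag == "body" and attrs.get("background"):
            self.objects.append(attrs["background"])


def estimate_hits(page_html: str) -> int:
    """One hit for the page itself plus one per distinct inline object."""
    counter = InlineObjectCounter()
    counter.feed(page_html)
    return 1 + len(set(counter.objects))


# Made-up page: a stylesheet, a script, a background and two distinct images.
PAGE = """
<html><head>
<link rel="stylesheet" href="style.css">
<script src="menu.js"></script>
</head>
<body background="bg.gif">
<IMG SRC="image1.gif"><IMG SRC="image2.gif"><IMG SRC="image1.gif">
</body></html>
"""

print(estimate_hits(PAGE))
```

The repeated image is counted once, on the assumption that the browser fetches each distinct URL only once per page load.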

Viewing Web Statistics

• Server log files are available to view, but may not make a lot of sense at first glance
• The Analog program (Cambridge University) was one of the first packages to provide a graphical summary of Web log files

http://www.statslab.cam.ac.uk/~sret1/stats/stats.html

Web Statistics: Terms Used

Hit
• Any information requested from a site: this includes HTML pages, pictures, forms, scripts and files downloaded
• Can be affected by redesign, robots, caching etc.

Page Views (or requests/impressions)
• The number of pages viewed
• Identified by extensions such as .htm, .html, .asp etc.

User Sessions
• Series of requests from a unique IP address within a period of time (more accurate if users are registered)
• Issues with firewalls, institutional caches etc.
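
The three terms can be contrasted with a small sketch that computes each of them from the same set of requests. The 30-minute session cut-off and the rule that a path ending in "/" counts as a page are assumptions made for this example, not definitions from the slide; the function names and sample data are likewise invented.

```python
from datetime import datetime, timedelta

PAGE_EXTENSIONS = (".htm", ".html", ".asp")  # extensions named on the slide
SESSION_TIMEOUT = timedelta(minutes=30)      # assumed cut-off for a session


def is_page(resource: str) -> bool:
    """Treat directory requests and known page extensions as page views."""
    path = resource.split("?", 1)[0].lower()
    return path.endswith("/") or path.endswith(PAGE_EXTENSIONS)


def summarise(requests):
    """requests: iterable of (ip_address, timestamp, resource) tuples."""
    hits = page_views = sessions = 0
    last_seen = {}  # most recent request time for each IP address
    for ip, when, resource in sorted(requests, key=lambda r: r[1]):
        hits += 1  # every request is a hit
        if is_page(resource):
            page_views += 1
        previous = last_seen.get(ip)
        # A new session starts on first sight of an IP address,
        # or after a long enough gap since its last request.
        if previous is None or when - previous > SESSION_TIMEOUT:
            sessions += 1
        last_seen[ip] = when
    return {"hits": hits, "page_views": page_views, "user_sessions": sessions}


# Made-up requests, for illustration only.
start = datetime(2002, 5, 16, 10, 0)
LOG = [
    ("192.0.2.10", start, "/index.html"),
    ("192.0.2.10", start + timedelta(seconds=1), "/logo.gif"),
    ("192.0.2.10", start + timedelta(minutes=5), "/about.htm"),
    ("192.0.2.10", start + timedelta(hours=2), "/index.html"),
    ("192.0.2.77", start + timedelta(minutes=1), "/index.html"),
]

print(summarise(LOG))
```

Because sessions are keyed on IP address alone, many users behind one firewall or institutional cache would be merged into a single session, which is the issue noted above.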

Interpretation Issues

Profiling users - can we track users easily?
• You can’t tell the exact identity of your users
• Using IP addresses, domain names of visitors
• Following paths – entering and exiting the site
• Registration

Caching
• Browser caching and institutional/ISP caching

Robots
• Necessary to enable your resources to be found
• Robots generate hits

Quality??
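
Since robots generate hits, one possible approach is to separate them from other traffic before counting. The sketch below uses a simple heuristic: a request for /robots.txt, or a user-agent string containing a typical robot keyword. The keyword list and function names are illustrative assumptions, and the user-agent field is only available if your server is configured to log it.

```python
# Illustrative keywords only; a real list would need regular maintenance.
ROBOT_MARKERS = ("bot", "crawler", "spider", "slurp")


def looks_like_robot(user_agent: str, resource: str) -> bool:
    """Heuristic: robots fetch /robots.txt or identify themselves by name."""
    agent = (user_agent or "").lower()
    return resource == "/robots.txt" or any(m in agent for m in ROBOT_MARKERS)


def split_robot_hits(entries):
    """entries: iterable of (user_agent, resource). Returns (human, robot)."""
    human, robot = [], []
    for agent, resource in entries:
        target = robot if looks_like_robot(agent, resource) else human
        target.append((agent, resource))
    return human, robot


# Made-up entries, for illustration only.
ENTRIES = [
    ("Mozilla/4.0 (compatible; MSIE 5.5)", "/index.html"),
    ("Googlebot/2.1", "/index.html"),
    ("ExampleAgent/1.0", "/robots.txt"),
    ("Lynx/2.8", "/about.html"),
]

human, robot = split_robot_hits(ENTRIES)
print(len(human), "human hits;", len(robot), "robot hits")
```

A robot that does not announce itself in the user-agent string will still be counted as a human visitor, so the result is an estimate only.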

Log Analysis Tools

There are many tools available:
• Analog: free, easily automated. However, it has few data-mining capabilities and its management graphs are limited.
• WebTrends: popular desktop package. Several versions. May be expensive for reporting on multiple Web sites.
• Webalizer, WebVisit, HitList, Reportmagic etc.
• A list is available at http://uk.dir.yahoo.com/Computers_and_Internet/Software/Internet/World_Wide_Web/Servers/Log_Analysis_Tools/

Externally-Hosted Services

Two services have been used extensively by UKOLN: SiteMeter and NedStat

• Advantages:
  – No software to buy, install, configure and run, and no powerful PC needed to run it on
  – No log files to manage
  – Uses "cache-busting" images
  – Can monitor extra features
• Disadvantages:
  – Limited data-mining
  – Loss of ownership of data
  – Dependency on external service
  – Fails to monitor text browsers

http://www.sitemeter.com/
http://www.nedstats.com/
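
The "cache-busting" idea can be sketched as follows: each page view embeds a small image whose URL contains a random value, so that browser and institutional caches cannot serve a stored copy and the request always reaches the counting service. This is a generic illustration, not a description of how SiteMeter or NedStat work; the counter address and parameter names are hypothetical.

```python
import html
import secrets
from urllib.parse import urlencode


def cache_busting_url(base_url: str, page: str) -> str:
    """Append the page name and a random token so every URL is unique."""
    query = urlencode({"page": page, "nocache": secrets.token_hex(8)})
    return f"{base_url}?{query}"


def tracking_tag(base_url: str, page: str) -> str:
    """Return an <img> tag for a 1x1 counter image with a unique URL."""
    url = cache_busting_url(base_url, page)
    # Escape "&" so the URL is valid inside an HTML attribute.
    return f'<img src="{html.escape(url)}" width="1" height="1" alt="">'


# Hypothetical counter address, for illustration only.
COUNTER = "http://stats.example.org/count.gif"

print(tracking_tag(COUNTER, "/index.html"))
```

Because the counter relies on an image being fetched, browsers that do not load images are not recorded, which matches the text-browser disadvantage listed above.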

Other Performance Indicators

Links to Your Site
• Indicators that people are interested in your service (and can deliver traffic)

Search Engines Coverage
• Indicators that users can find resources on your Web site

User Feedback
• Comments, voting, etc.

Technical Indicators
• Browser support, broken links, server-uptime, etc.

Links To Your Site

• Links are an indication of potential use of your Web site
• Search engines can be used to report on the numbers of links to a Web site
• LinkPopularity.com provides an interface to 3 search engines
• Monthly reports can be obtained

http://www.linkpopularity.com

Coverage By Search Engines

• Have you promoted your Web site?
• Can your Web site be accessed by search engines?
• Are you near the top of the search results?
• Search engines can report on their coverage of your Web site
• Coverage is an indication of potential use of your Web site

For information on how to ensure that your web site has been indexed see the section on Promotion of your Project Web

Technical Indicators

Broken Links
• How many links are there on your Web site (internal and external)?
• How many broken links are there?
• Use services like linkalarm.com

Server Availability
• Recording down time
• Email alerting
• Use services like InternetSeer.com
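
For a small site, the two broken-link questions above can also be answered with a short script. The sketch below is a minimal illustration and not how linkalarm.com works: it collects the links on one page, checks each one, and reports those that fail. The function names and sample page are invented, and the status lookup is passed in as a parameter so the example runs without network access.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError, URLError
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the <a href="..."> tags on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href and not href.startswith(("#", "mailto:")):
                self.links.append(urljoin(self.base_url, href))


def http_status(url, timeout=10):
    """Return the HTTP status for url, or None if the server is unreachable."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        return error.code
    except (URLError, OSError):
        return None


def find_broken_links(page_html, base_url, get_status=http_status):
    """Return (number of distinct links, list of (link, status) failures)."""
    collector = LinkCollector(base_url)
    collector.feed(page_html)
    links = sorted(set(collector.links))
    broken = []
    for link in links:
        status = get_status(link)
        if status is None or status >= 400:
            broken.append((link, status))
    return len(links), broken


# Made-up page and canned responses, so no network access is needed here.
PAGE = (
    '<a href="/ok.html">Home</a> <a href="missing.html">Old page</a> '
    '<a href="#top">Top</a> <a href="http://other.example.org/">Partner</a>'
)
CANNED = {
    "http://www.example.org/ok.html": 200,
    "http://www.example.org/docs/missing.html": 404,
    "http://other.example.org/": None,
}

total, broken = find_broken_links(
    PAGE, "http://www.example.org/docs/index.html", CANNED.get
)
print(total, "links,", len(broken), "broken")
```

Leaving out the last argument makes the function use `http_status`, which sends real HEAD requests. Some servers reject HEAD, so a reported failure is worth confirming by hand.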

Conclusions

• Web statistics can be difficult to interpret
• Analysis of Web statistics is needed for many reasons
• Think about the tools you will need (and the resource implications of using them)
• Besides analysis of log files there are other performance indicators which may be of use
• Analysis will also help with monitoring the performance of your Web site and planning future developments

Any Questions?

This presentation is loosely based on the Information Paper on Web Site Performance Monitoring available at:

http://www.ukoln.ac.uk/nof/support/help/papers/performance/