Promoting Good Statistical Practices Roger Stern - SSC, Reading WMO/FAO training workshop - November 2005


Promoting Good Statistical Practices

Roger Stern - SSC, Reading

WMO/FAO training workshop - November 2005


WMO/FAO Training workshop, November 2005

PROMOTING GOOD STATISTICAL PRACTICE 2

Contents

• Understanding the present situation:
  – The need for (basic) training in statistics
  – Past training in statistics
  – Developments in statistical computing
  – And in statistical analyses
• Possibilities for the future
  – Resources
    • statistical software (freely available in Africa)
    • materials to promote good statistical practices
    • training materials
• Spatial analysis
• In conclusion
  – These are exciting times - let’s look forwards not backwards

Statistical Services Centre
Start by describing the beginning, BUCS, then showing a bit of CAST, then SSC-Stat, then Instat. Then into presentation.

Training in statistics

• It is difficult to practise good statistics
  – unless we have had appropriate training
• For example seasonal forecasting
  • Uses PCA
• Spatial methods mentioned in this workshop include:
  • Kriging, and co-kriging
  • PCA and clustering
• When many staff find more basic concepts difficult
  • Percentiles and return periods – (show CAST as preview)
  • Standard errors, etc
• So they have to accept (advanced) methods in an unquestioning way
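
As an illustration of the basic concepts named on this slide, the following is a minimal sketch in Python. It is not part of the original presentation: the function names and the rainfall figures are invented for illustration. It assumes the usual definitions, namely that a return period is the reciprocal of the annual exceedance probability, and that a percentile is read from the sorted data by linear interpolation (one of several conventions in use).

```python
def return_period(exceedance_probability: float) -> float:
    """Return period in years for an event with this annual probability."""
    if not 0.0 < exceedance_probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return 1.0 / exceedance_probability


def percentile(values, percent: float) -> float:
    """Percentile by linear interpolation between the sorted values."""
    if not values:
        raise ValueError("need at least one value")
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must be between 0 and 100")
    ordered = sorted(values)
    rank = (percent / 100.0) * (len(ordered) - 1)
    lower = int(rank)                          # index at or below the rank
    upper = min(lower + 1, len(ordered) - 1)   # next index, capped at the end
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


# Invented annual rainfall totals (mm), for illustration only.
annual_rainfall_mm = [620, 710, 540, 880, 760, 690, 830, 590, 720, 800]

# The 20th percentile is the total not reached in roughly 1 year in 5,
# which corresponds to a return period of about 5 years.
low_rainfall_threshold = percentile(annual_rainfall_mm, 20)
print(low_rainfall_threshold, return_period(0.20))
```

The point of the sketch is the link between the two ideas: a "1 year in 5" statement, a 20% risk, and a 20th percentile are three ways of describing the same quantity.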


Past training in statistics

• Training for (non-statistician) users in the past has been problematical
  • consequently they fear statistics
  • and hence also statisticians
• Similarly, insufficient soft-skills training for statisticians
  • consequently they sometimes lack communication skills
  • and marketing skills
  • and are often side-lined in important development and research projects
  • just like Met staff perhaps???


Common training problems for non-statisticians

• Training is dominated by analysis
  • with little on data management
  • or on design
• A recipe-book approach is used
  • hence e.g. overuse of irrelevant significance tests
  • little understanding of principles
• Training emphasises hand computation
  • for understanding (which they don’t get!)
  • but not needed later
  • and little experience of computers for statistical work
• Presentation is too mathematical
  • not conceptual

AND often taught by someone who has little interest in the students’ main subject areas


RESULT!

• Users with near universal dislike of statistics
  • and statisticians?
• strong demand for relevant in-service training in statistics
• Most of these past weaknesses in training
  • are the same for statisticians
  • who can be too pedantic and inflexible in their advice
    – and are then feared and ignored, where possible, by potential clients
• We see later how this can now easily change
  • for both statisticians
  • and for others who need to generate and use statistics


Advances in statistical computing

• History
  • 1960’s SAS and SPSS started
    – A long way back in computer terms
• By early 1980’s
  • Statistics packages well established
  • Micro-computers appeared – too small for these packages
  • So lots of other statistics packages
    – that made the same mistakes as SAS and SPSS a generation earlier
      » it is easy to write statistical software, but difficult to write good software


Statistics packages : THEN

• In the 1990’s
  • Standard statistics packages dominant again
    – compare other types of software
    – With some additions e.g. Stata
  • All command-driven
    – So you had to learn the language (for SPSS, or SAS)
    – So people and training courses used just one package
  • Data transfer between packages was difficult
  • Training courses often confused
    – learning the package with learning statistics
    – cf. data management – learning concepts or learning Access


A big advance…..

Windows appeared

& EXCEL ruled the world

for better, for worse!


Statistics packages : NOW

• All common packages are in Windows
• Very similar interface
  – Like other Windows software
  – So very easy to learn
• And to add to Excel
  – so you can still keep your “security blanket”
• And easy to add another package
  – hence not so critical what package is used for statistics training
• Data transfer has also become easy
• Hardly need a training course
  • for the software
  • so can concentrate on training in statistics again!


Advances in statistical analysis

• The “estuary model”
  • ever-increasing unity to the methods
    – this makes training much easier
  • if we build a solid foundation
    – special methods are then seen as such


Start in 1960’s

• In the mountains there were little streams
  • Regression and
  • Analysis of variance
  • These were for normally distributed data
• In another valley
  • parameter estimation was for other distributions, like Poisson and binomial
• And leading to another valley
  • the chi-square analysis for categorical data


Then

• In the late 1960’s
  • Chi-square tests joined with other ways of looking at multidimensional contingency tables
  • to become log-linear models
• In the early 1970’s
  • log-linear models
  • joined probit analysis
  • into the general stream of generalized linear models
  • that also included ANOVA and regression
  • for normal and non-normal data


And finally for us here

• In the 1980’s
  • REML started
  • and is for data at multiple levels
• By the 1990’s it had joined the mainstream
  • and included powerful methods for spatial modelling
• So now
  • same modelling ideas used for a wide range of problems
• Making both training and analysis
  • simpler and more coherent
  • as long as the trainers know.

BUT some are still up in the mountains!


So where are we now?

• Statistical software has developed
  • and so have users’ computing skills
• Statistical methods have developed
  • and are easier to use
• And the resources to bring the two together
  • are now being made available
  • and are becoming accessible throughout
• We describe some of these resources
  • First generally
  • And then look briefly at methods for spatial modelling


Software includes:

• SSC-Stat
  • add-in for Excel to encourage good use
    – with a tutorial guide
    – and guides for good tables and good graphs
    – for example it provides boxplots
• Instat+
  • first simple statistics package for ‘Excel-lers’
    – supports good teaching of statistics
    – stepping stone to other statistics packages
    – tutorial guide, introductory guide
    – and climatic guide, now updated for Instat Version 3
    – for example for data summary or training
• Genstat
  • One of the major statistics packages (like SPSS, Systat)
    – For modern statistical modelling, like GLMs and REML
    – And good facilities for spatial modelling


Genstat

• Specially for agricultural applications
• And now with added climatic features
• Like extremes, and circular plots
• Plus a climatic guide


Resources for good statistical practice

• Good practice guides
  • Mini-guides for statistical sceptics
    – designed originally to promote good statistical practice in DFID projects
    – covering design, data management, analysis and presentation
    – a book is now available
• And so much more:
  • Participatory (QQA) stuff, important for Met services
  • Now a book is available, based on Malawi’s “starter pack”
  • Data management – where Met services can support other groups


Training resources include

• Statistical games to help teach statistics
  • Reading and BUCS
  • For example PADDY, the rice survey game
• Materials for distance learning
  • Now CAST in general
  • But can now be adapted for African needs
  • With support from the Rockefeller Foundation


Interesting ways of learning: Training software

• Statistics concepts through CAST


Interesting ways of learning: Statistical Games

• Simulating a survey based on a real crop cutting survey in Sri Lanka


And in climatology

• Providing the basic statistical skills
  • Now through a facilitated e-learning course
  • Tested in 2005, and provided from 2006
  • For staff in HQ and (hopefully) in outstation offices
    – Because decentralisation is important
• Using a specially adapted version of CAST
  • That can be provided to African Services
  • You have seen this earlier
• Also software (Instat) plus Genstat
  • Each with their special climatic guide


Spatial ideas

• More to spatial analysis than just maps
• Remember the data – when will you map?
  • Daily – many “layers”
  • Annually (e.g. date of start of the season)
  • Averages – take care of different years at different stations
• Example where map does not give the full answer
  • Southern Zambia – risky for maize
  • Suggest strategy – say farmers overall have 20% (1 year in 5) risk of replanting
  • How much seed should be stocked?
  • Map – very simple 20% everywhere – does it answer the question?
  • Need spatial correlations – why?
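
One way to explore the question of why spatial correlations matter for the seed stock is a small simulation. The sketch below is an illustration only, not the analysis used in the presentation: the number of districts, the mixing parameter `shared`, and the 95% coverage target are all assumptions made for the example. Every district has the same 20% risk of replanting whatever the value of `shared`, so the map is identical in every scenario; only the degree to which districts fail together changes.

```python
import random


def simulate_replanting(n_districts, risk, shared, n_years, seed=1):
    """Simulate how many districts need to replant in each year.

    Each district replants with probability `risk` in any year. With
    probability `shared` a district follows a season-wide outcome common to
    all districts; otherwise it has its own independent outcome. The risk
    in each district is `risk` whatever the value of `shared`.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_years):
        regional_failure = rng.random() < risk
        replanting = 0
        for _ in range(n_districts):
            if rng.random() < shared:
                failed = regional_failure        # follows the regional season
            else:
                failed = rng.random() < risk     # independent local outcome
            replanting += failed
        counts.append(replanting)
    return counts


def stock_for_coverage(counts, coverage=0.95):
    """Number of districts' worth of seed that suffices in about `coverage` of years."""
    ordered = sorted(counts)
    return ordered[int(coverage * (len(ordered) - 1))]


# Hypothetical setting: 50 districts, 20% risk everywhere, 2000 simulated years.
for shared in (0.0, 0.5, 1.0):
    counts = simulate_replanting(50, 0.20, shared, n_years=2000)
    mean_share = sum(counts) / (50 * len(counts))
    print(shared, round(mean_share, 2), stock_for_coverage(counts))
```

When districts fail independently, demand in any one year stays close to 20% of districts, so a modest stock is enough. When they fail together, most years need no seed and a bad year needs seed almost everywhere. The map of 20% risk cannot distinguish these cases, which is why the spatial correlation is needed.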


GIS and mapping

• Many problems can be mapped effectively
• Then much “spatial analysis” is descriptive statistics
  • Selection of subsets,
  • Transformations to provide new layers
  • Logical calculations
  • Etc
• This is non-controversial
  • Simple smoothing to provide contours is the same
  • As long as the spatial “averaging” e.g. splines, inverse distance is recognised as such
• But kriging, etc is moving into inferential ideas
  • And statistical packages could also be used for such operations
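
To show why inverse distance interpolation counts as spatial "averaging", here is a minimal sketch in Python. It is an illustration under stated assumptions, not code from Genstat or any GIS package: the function name, the default power of 2, and the station values are invented for the example.

```python
import math


def inverse_distance_estimate(stations, x, y, power=2.0):
    """Weighted average of station values, with weights 1 / distance**power.

    `stations` is a list of (x, y, value) tuples.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for station_x, station_y, value in stations:
        distance = math.hypot(station_x - x, station_y - y)
        if distance == 0.0:
            return float(value)  # exactly at a station: return its value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("need at least one station")
    return weighted_sum / weight_total


# Invented station coordinates and values, for illustration only.
stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0), (0.0, 2.0, 30.0)]
print(inverse_distance_estimate(stations, 1.0, 1.0))
```

The estimate is always a weighted mean of the observed values, so it can never fall outside their range and it carries no measure of uncertainty. That is the sense in which it is descriptive, whereas kriging fits a model of spatial correlation and so moves into inference.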


Spatial statistics with statistical software

• Many statistical packages, e.g. Genstat

• Provide some facilities for spatial analysis

• For example kriging

• And REML – for the future


Demonstration

• Show two examples of Genstat
  • First is a simple contour plot
    • Shows the value of a log file of commands
  • Second is an example of kriging
    • Shows more facilities in fitting and plotting
• Other facilities include
  • Co-kriging
  • REML for “proper” spatial modelling
  • Within which kriging is a special case
• More “research” and case studies are needed


In conclusion

• The time is right:
  • Statistics has changed
  • Training methods can change
  • The resources are here
• And in Africa:
  • Evidence-based decision making is (more) encouraged
  • Met Services are key organisations
  • Because climatic data are needed in so many applications

Challenge: How will you proceed??


Thank you