THE MODEL BUILDING PROCESS Dr Nick Malleson Dr Alison Heppenstall GEOG3150 Semester 2 Lecture 6

THE MODEL BUILDING PROCESS

Dr Nick Malleson

Dr Alison Heppenstall

GEOG3150 Semester 2

Lecture 6

Projects

Aim

Improve your understanding of:

the process of building an individual-level model

the value of modelling for understanding the present and predicting the future.

Develop skills in designing, constructing and running models in NetLogo

Scenario

The organisers of a local music festival are concerned with the amount of crime that takes place there.

You are asked to:

create a model of the festival

suggest ways in which they could intervene to reduce mugging

Method:

Use an agent-based model of the festival (visitors and muggers) to suggest the optimal locations for security booths

Run some experiments with your model

Produce a report that explains findings and provides policy suggestions

http://www.geog.leeds.ac.uk/courses/level3/geog3150/project/project/

Project Details

Option A – Programming

Largely a practical modelling task

“Improve” the festival model

What does “improve” mean?

To get the very best marks you need to go beyond the work that we did during the practicals.

Or create an entirely new model

Deliverables:

Improved NetLogo program

500 word technical report

Justification for changes

Outline experiments

Possible improvements / critique

Marking criteria

20% - report

60% - model functionality

20% - source code

Project Details

Option B – Report

No more development

Use the model created during the practical sessions to run experiments to explore optimal security locations

Deliverables:

1,500 word report

Include ideas / concepts presented in lectures

Evidence of substantial wider reading

Generic School marking criteria

Project Details

Option B – Report Sections

Background

A critique of agent-based modelling

Method

The model itself, outlining broadly how it works. (Use the ODD protocol?)

Pros and cons of the model

Possible improvements (other than those for part A) with justification

Experiments

Detail the experiments

Justify assumptions and explain what has been done to ensure that the experiments are fair.

Results

Results of the experiments, including tables, graphs, images as appropriate.

Conclusion and recommendations

Conclusion / summary

Recommendations for festival organisers

Evaluation

Critique the entire project

Evaluate success of meeting requirements of the festival organisers.

With hindsight, is ABM the most suitable approach?

What would you do differently next time? (with justification and references to relevant literature).

Questions about the project?

Recap: Last Week

Interactions

Global and local

Direct or mediated

Design concepts

Recap: Last Week

Modelling Behaviour

Humans are not random!

How to identify important behaviours?

KISS vs KIDS

Cognitive frameworks

Rule-based

PECS

BDI

Photo attributed to Arts Electronica (CC BY-NC-ND 2.0)

Lecture 6 – The Model Building Process

Preparation

Design and build the model

Verification

Calibration

Validation

Prediction / Explanation

Seminar: Modelling Societal Challenges

Reading

Also see the reading list (at the library)

Ngo, T. A. and L. See (2012) Calibration and Validation of Agent-Based Models of Land Cover Change. In Agent-Based Models of Geographical Systems, pp 181-197.

Edmonds, B. and R. Meyer (Eds) (2013) Simulating social complexity: a handbook. Springer. (Chapter 6: Checking Simulations and Chapter 8: Validating simulations) http://lib.leeds.ac.uk/record=b3432143

O'Sullivan, D. (2013) Spatial simulation: exploring pattern and process. Chichester: John Wiley & Sons. (Chapter 7: Modelling Uncertainty) http://lib.leeds.ac.uk/record=b3432142

What is the modelling process?

Preparation

Literature review

Data review

Model design & implementation

KISS / KIDS

Interaction & Behaviour

Complexity & Emergence

Evaluation

Verification

Calibration

Validation

Prediction / Explanation

Preparing to model

Literature review

what do we know about fully?

what do we know about in sufficient detail?

what don't we know about? (and does this matter?)

What can be simplified?

Driver aggression in a traffic model:

Complex psychological model

Model based on age and gender

Single aggression rate

Mortgages in a housing model:

Detail of mortgage rates’ variation with the economy

A time-series of data

A single rate figure

It depends on what you want from the model.

Data review

Outline the key elements of the system, and compare this with the data you need.

What data do you need, what can you do without, and what can't you do without?

NetLogo Example: Fire

Model parameter: Forest density

Data required: Real tree density

What data do you think we would need to build this simulation?

Data are needed for different purposes

Model initialisation

Data to get the model replicating reality as it runs.

Model calibration

Data to adjust parameters to replicate reality.

Model validation

Data to check the model matches reality.

Model prediction

More initialisation data.

Do you have sufficient data?

Lecture 6 – The Model Building Process

Preparation

Design and build the model

Verification

Calibration

Validation

Prediction / Explanation

Seminar: Modelling Societal Challenges

Model design

If the model is possible given the data, draw it out in detail.

Where do you need detail?

Where might you need detail later?

What processes do you want to model?

Start general and work to the specifics.

If you get the generalities flexible and right, the model will have a solid foundation for later.

Documenting Design – The ODD Protocol

ABMs can be very complicated

This makes them hard to design

And hard to describe (e.g. in journals)

ODD (Overview, Design concepts, Details)

A standard framework for describing ABMs.

Help with basic decisions

What should go into the model

What behaviours to include

Required model outputs

The ODD Protocol

Elements of the (updated) ODD protocol

Overview

1. Purpose

2. Entities, state variables, and scales

3. Process overview and scheduling

Design concepts

4. Design concepts: Basic principles, Emergence, Adaptation, Objectives, Learning, Prediction, Sensing, Interaction, Stochasticity, Collectives, Observation

Details

5. Initialization

6. Input data

7. Submodels

Source: Grimm et al. (2010)

Do some background reading! See the reading list for more information (at the library)

KISS / KIDS

What are the really important things that need to be in a model?

How complicated should it be?

Recall from lecture 5:

Keep it descriptive, stupid (KIDS)

Start descriptive, then simplify

Keep it simple, stupid (KISS)

Start simple, then add complexity

Two contrasting views of KISS and KIDS approaches:

Axelrod, R. (1997). Advancing the art of simulation in the social sciences. In Conte, R., Hegselmann, R., and Terna, P. (eds) Simulating Social Phenomena, pages 21–40. Springer-Verlag, Berlin. (Note: this book is available in the library; the author has also made a draft of the chapter available online: http://www-personal.umich.edu/~axe/research/AdvancingArtSim2005.pdf ).

Edmonds, B. and Moss, S. (2005). From KISS to KIDS: an ‘anti-simplistic’ modelling approach. In Davidsson, P., Logan, B., and Takadama, K. (eds) Multi Agent Based Simulation 2004, Lecture Notes in Artificial Intelligence, pages 130–144. Springer. Available online: http://cfpm.org/cpmrep132.html

Lecture 6 – The Model Building Process

Preparation

Design and build the model

Verification

Calibration

Validation

Prediction / Explanation

Seminar: Modelling Societal Challenges

Definitions: Verification, Calibration, Validation

1. Verification

Has the model been constructed correctly?

2. Calibration

Changing model parameters so that the results match expected data

3. Validation

Run the model again using new data (not used for calibration).

Does it still perform well?

Definitions: Verification and Validation

“Verification is the process of making sure that an implemented model matches its design. Validation is the process of making sure that an implemented model matches the real-world.”

- North & Macal, 2007, pp. 30–31

“Model verification is substantiating that the model is transformed from one form into another, as intended, with sufficient accuracy. Model verification deals with building the model right.”

“Model validation is substantiating that the model, within its domain of applicability, behaves with satisfactory accuracy consistent with the study.” “Model validation deals with building the right model.”

- Balci (1997)

Other definitions are available!

1. Verification

Is your model working as you would expect it to?

Is it replicating the processes (mechanisms) that you are interested in simulating?

Sometimes called testing the ‘inner validity’ of the model

Things to Look for:

Logical errors in translation of verbal model into computer model

Programming errors – “bugs”

One approach: Docking

Implement in another language

Axtell et al. (1996)

Verification Examples

Basic diffusion model: Heppenstall et al. (2005)

Agent behaviour without geography: Malleson et al. (2010)

Is the fire model working correctly?

What things might we check to verify that the fire model has been programmed correctly?

Are the trees burning?

Is the water burning?

Does the fire spread or stay in one place?

Does increasing the forest density have the desired impact?

What happens at 0% density? (and 100%?).
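Checks like these can be written down as automatic tests. The sketch below is not the NetLogo Fire model itself: it is a simplified Python stand-in, assuming a square grid, fire starting from every tree in the left-hand column, and spread to the four neighbouring cells. The names simulate_fire and percent_burned are made up for this illustration.

```python
import random
from collections import deque


def simulate_fire(density, size=25, seed=0):
    """Burn a random forest; return (trees, burned) grids of booleans.

    Simplified stand-in for the Fire model (an assumption, not NetLogo code).
    """
    rng = random.Random(seed)
    # Each cell holds a tree with probability density / 100.
    trees = [[rng.random() * 100 < density for _ in range(size)]
             for _ in range(size)]
    burned = [[False] * size for _ in range(size)]
    frontier = deque()
    # Ignite every tree in the left-hand column.
    for y in range(size):
        if trees[y][0]:
            burned[y][0] = True
            frontier.append((0, y))
    # Spread the fire to unburned neighbouring trees until it dies out.
    while frontier:
        x, y = frontier.popleft()
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if (0 <= nx < size and 0 <= ny < size
                    and trees[ny][nx] and not burned[ny][nx]):
                burned[ny][nx] = True
                frontier.append((nx, ny))
    return trees, burned


def percent_burned(trees, burned):
    """Percentage of the initial trees that burned."""
    total = sum(row.count(True) for row in trees)
    if total == 0:
        return 0.0
    return 100.0 * sum(row.count(True) for row in burned) / total


# Verification checks, mirroring the questions on the slide.
trees, burned = simulate_fire(density=60)
# Only cells that contain a tree may burn (empty cells never burn).
assert all(t or not b for tr, br in zip(trees, burned) for t, b in zip(tr, br))
# Extreme cases: nothing burns at 0% density, everything burns at 100%.
assert percent_burned(*simulate_fire(0)) == 0.0
assert percent_burned(*simulate_fire(100)) == 100.0
# A dense forest should burn more than a sparse one.
assert percent_burned(*simulate_fire(30)) < percent_burned(*simulate_fire(90))
```

If any assertion fails, either the code or our understanding of the design is wrong, which is exactly what verification is meant to find.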

2. Calibration

If your model doesn't replicate real situations with simple rules you'll need to calibrate your model.

Calibration is the process of fine-tuning the parameter values in your model so they fit your data

Derivation of best-fit model parameters from real-world data

How to compare the model to the data?

(More on this in lecture 9)

DEMO: Calibration: Fire

A recent fire burned 82.4% of a forest.

Find the density value that simulates the correct percentage
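The demo can be imitated outside NetLogo as a simple search. The sketch below is an illustration under stated assumptions, not the NetLogo model: it uses a simplified fire model (square grid, fire starting in the left-hand column, four-neighbour spread), averages a few runs per density, and picks the whole-number density whose mean result is closest to 82.4%. The function names, grid size and number of runs are all hypothetical choices.

```python
import random
from collections import deque

TARGET = 82.4  # observed percentage of forest burned (from the slide)


def percent_burned(density, size=41, seed=0):
    """Percentage of trees burned in one run of a simplified fire model."""
    rng = random.Random(seed)
    trees = [[rng.random() * 100 < density for _ in range(size)]
             for _ in range(size)]
    total = sum(row.count(True) for row in trees)
    if total == 0:
        return 0.0
    burned = [[False] * size for _ in range(size)]
    frontier = deque()
    for y in range(size):  # ignite trees in the left-hand column
        if trees[y][0]:
            burned[y][0] = True
            frontier.append((0, y))
    while frontier:  # spread to unburned neighbouring trees
        x, y = frontier.popleft()
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if (0 <= nx < size and 0 <= ny < size
                    and trees[ny][nx] and not burned[ny][nx]):
                burned[ny][nx] = True
                frontier.append((nx, ny))
    return 100.0 * sum(row.count(True) for row in burned) / total


def mean_burned(density, runs=5):
    """Average over several seeds, because single runs vary a lot."""
    return sum(percent_burned(density, seed=s) for s in range(runs)) / runs


def calibrate(target, candidates):
    """Return the candidate density whose mean output is closest to target."""
    return min(candidates, key=lambda d: abs(mean_burned(d) - target))


best = calibrate(TARGET, range(40, 101))
print("best density:", best, "mean % burned:", round(mean_burned(best), 1))
```

This brute-force scan is only practical because there is a single parameter. The density it reports belongs to this toy model and should not be read as the answer for the NetLogo demo.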

Behaviour/Parameter Space in NetLogo

Even a small model can have an enormous parameter space

Combinatorial explosion of possible input settings

Extremely large number of possible inputs for integer-valued parameters

Even larger number of possible inputs for real-valued parameters

Method 1:

Wander aimlessly through NetLogo input parameter space, making lots of runs, and looking for interesting relationships between inputs and outputs

Method 2:

Plan the parameter space voyage, measure results, use stats to determine if there is intelligent life out there.

Hint: Automatic parameter sweeps in NetLogo

You can use NetLogo's BehaviorSpace tool to change parameters systematically
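The idea behind a planned sweep can be sketched in a few lines of Python. This is not how BehaviorSpace is implemented; it only illustrates the principle of running every combination of parameter values several times and recording a summary. The toy_model function, its two parameters and the chosen values are hypothetical stand-ins for a real model.

```python
import itertools
import random
import statistics


def toy_model(density, spread_prob, seed):
    """Hypothetical stand-in for one model run; returns a 'burned' count."""
    rng = random.Random(seed)
    trees = sum(rng.random() * 100 < density for _ in range(200))
    return sum(rng.random() < spread_prob for _ in range(trees))


def sweep(param_grid, repetitions):
    """Run every combination of parameter values `repetitions` times."""
    names = list(param_grid)
    results = []
    for values in itertools.product(*(param_grid[n] for n in names)):
        settings = dict(zip(names, values))
        outputs = [toy_model(seed=rep, **settings)
                   for rep in range(repetitions)]
        results.append((settings, statistics.mean(outputs)))
    return results


grid = {"density": [40, 60, 80], "spread_prob": [0.25, 0.5, 0.75, 1.0]}
results = sweep(grid, repetitions=10)
for settings, mean_output in results:
    print(settings, round(mean_output, 1))
```

Three values of one parameter and four of the other already give 3 x 4 = 12 settings and, with 10 repetitions each, 120 runs. Adding parameters or finer steps multiplies this number, which is the combinatorial explosion described on the previous slide.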

Finding the optimal parameter combination

Calibration is basically a search through a multi-dimensional parameter space

Previous approach is OK, but what about:

Numerous parameters

Non-linear outcomes

Might require millions of model runs

More efficient methods for doing computer-aided calibration

More on this in lecture 9

3. Validation

Recap:

1. Verification

Does the model implementation match our intended design?

2. Calibration

Can we make the model fit real world data?

3. Validation

Does the model still work if we give it different data (not used for calibration)?

Validation

Can a model replicate processes and patterns in unknown data?

Not a binary event

A model cannot simply be classified as valid or invalid

A model can have a certain degree of validity, which is captured by various measures of fit

Different parts of the model to validate

Different methods to use

Validation

Need to decide on what you are interested in looking at.

Visual confirmation: e.g. comparing two city forms.

One-number statistic: e.g. can you replicate average price?

Spatial, temporal, or interaction match: e.g. can you model city growth block-by-block?

More on this in lecture 9.

Validation Example

Single statistical comparison

Spatial comparison of aggregate results

Validating agent behaviours

Statistic: Value

Difference in total number of crimes: 547 (13%)

R² of crimes per output area: 0.873
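The two statistics in the table can be computed from paired observed and simulated counts. The sketch below assumes R² means the squared Pearson correlation between observed and simulated crimes per area, which the slide does not state explicitly. The counts are made-up illustrative numbers, not the data behind the 547 (13%) and 0.873 figures.

```python
def total_difference(observed, simulated):
    """Absolute difference in totals, and that difference as a % of observed."""
    diff = abs(sum(simulated) - sum(observed))
    return diff, 100.0 * diff / sum(observed)


def r_squared(observed, simulated):
    """Squared Pearson correlation between two equal-length sequences.

    Assumes neither sequence is constant (otherwise the variance is zero).
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    return cov * cov / (var_o * var_s)


# Hypothetical crime counts per output area (illustration only).
observed = [12, 30, 7, 45, 22]
simulated = [10, 33, 9, 40, 25]

diff, pct = total_difference(observed, simulated)
print("difference in totals:", diff, f"({pct:.1f}%)")
print("R squared per area:", round(r_squared(observed, simulated), 3))
```

The total can match well while the spatial pattern is wrong (errors in different areas cancel out), which is why the slide reports both an aggregate and a per-area measure.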

Goals of Validation

Manson’s question: “How well does a model characterise the target system?”

Verburg et al:

“It is not particularly useful to attempt to crown a model as valid, or to condemn a model as invalid based on the validation results. It is more useful to state carefully the degree to which a model is valid. Validation should measure the performance of a model in a manner that enables the scientist to know the level of trust that one should put in the model. Useful validation should also give the modeler information necessary to improve the model.”

Testing

Verification and calibration are demonstrated via testing

More formal modelling efforts require more formal testing

Testing need not be statistical

Often testing is more qualitative:

E.g. When law and order goes down, violence (civil unrest) should go up!!

Experts not involved with the modelling effort can be an extremely valuable part of testing.

Validation: Fire

How would you validate the fire model?

Different data set?

Temporal?

Spatial?

Different topography?

Balci’s “Principles” of Verification, Validation and Testing

1 - Iterative process

Check each part as it comes together

Establish points in development to be sure you are on the right track

Build from modest goals to more ambitious ones

Levels of Testing:

Private testing (testing as you go)

Sub-model testing (different team, or different hat, testing different inputs and output variables)

Integration testing (do the parts (sub-models) go together?)

Model testing (does the whole thing work right? Validity of the model)

Acceptance testing (done by, or on behalf of, sponsor)

Balci, O., "Principles of Simulation Model Validation, Verification, and Testing", Transactions of the Society for Computer Simulation International, vol. 14, no. 1, pp. 3-12, 1997.

2 Validity is not binary

No model is completely valid as all are abstractions of a system.

Modelling is about producing something with a certain, well understood, degree of credibility.

As the degree of model credibility increases, so will the development cost… but so will its utility (but at a decreasing rate).

3 Validity for a Purpose

A model is “valid enough” for a particular purpose; no model is “valid enough” in general.

The accuracy required of the model depends on the importance of the decision being made.

“The model is sufficiently valid” with respect to the study objective.

Corollary: modelling should start with a research question. VV&T evaluates how well it answers that question.

“If you don’t know where you are going, you might end up somewhere else” Yogi Berra

4 VV&T is hard

No cook book way to do it

Requires creativity and insight (& time!)

Must understand the model and identify adequate test cases

Generally beyond the scope of peer review

Third party efforts often limited to external validation using supplied data

Possibly also verification of computer code

Examination of internal validity and testing against novel data are both difficult.

5 Validity only applies to tested conditions

Input conditions are important for outputs

E.g. a model built for rush hour may not work for other times of the day

A really robust model will perform better outside of test conditions

But how well?

Performance is better if the model is more internally valid (cause and effect are analogous to target).

Domain of Applicability: range of input conditions over which the model validity is claimed (Schlesinger et al., 1979)

6 Testing should be planned

Ideally, you want to design the test before you get the model running

Guards against developer bias

Consider: activity & sequence diagrams might be of assistance.

Document testing

Testing should be continuous (e.g. unit testing).

7 Valid parts may not make a valid whole

In a complex model: Each sub model must be validated separately

Develop test criteria -- how should it respond in various situations?

Valid submodels can make an invalid model in various ways:

Connections between the sub models may not be right (internal invalidity producing external invalidity)

Non-linear responses may amplify small errors in sub models to produce unacceptable errors in overall model.

8 Double Validation Problem

For a solid result, you need to test a valid model against valid data.

Not all data are valid

Lots of sources of error

Particularly a problem when the input data are stochastic and relational in nature

Take home message: Be as skeptical of the data as you are of the model -- test them both.

Lecture 6 – The Model Building Process

Preparation

Design and build the model

Verification

Calibration

Validation

Prediction / Explanation

Seminar: Modelling Societal Challenges

Prediction

Now you are confident that the model is reliable, it can be used to make predictions.

Feed new data, and see what it does.

If the model is deterministic, every run with the same inputs will give the same result.

If the model is stochastic (i.e. includes some randomisation), you’ll need to run it multiple times, as each run will usually give a different answer.
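A minimal sketch of what "run it multiple times" means in practice is given below. The stochastic_model function is a hypothetical toy (a count of random events), not the festival or fire model. The point is the pattern: fix the seed to make a single run repeatable, vary the seed across runs, and report a mean and spread instead of a single number.

```python
import random
import statistics


def stochastic_model(seed, steps=100):
    """Hypothetical toy model: counts events that each occur with p = 0.3."""
    rng = random.Random(seed)
    return sum(rng.random() < 0.3 for _ in range(steps))


# The same seed reproduces the same run, which is useful for debugging.
assert stochastic_model(seed=1) == stochastic_model(seed=1)

# Different seeds give different answers, so summarise many runs.
runs = [stochastic_model(seed) for seed in range(30)]
mean = statistics.mean(runs)
sd = statistics.stdev(runs)
print(f"mean = {mean:.1f}, standard deviation = {sd:.1f}, runs = {len(runs)}")
```

How many runs are enough depends on how variable the output is; 30 here is an arbitrary choice for the illustration.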

Explanation

Models aren’t just about prediction.

They can be about experimenting with ideas.

They can be about testing ideas/logic of theories.

They can be a way to hold ideas.

Summary

Preparation

Design and build the model

Verification

Calibration

Validation

Prediction / Explanation

Seminar: Modelling Societal Challenges