Experiences Using Windows Azure
to Calibrate Watershed Models
Marty Humphrey, Norm Beekwilder University of Virginia
Jon Goodall, Mehmet Ercan
University of South Carolina
Mississippi River Watershed Chesapeake Bay Watershed
Example Large Watersheds
Typical Scale of Watershed Models
Eno River near Durham, NC
Scale
• Mississippi: 1,245,000 sq mi (3,220,000 km2)
• Chesapeake: 64,000 sq mi (166,000 km2)
• Eno (our case study watershed): 66 sq mi (171 km2)
Eno to Chesapeake (~ 1,000 times)
Eno to Mississippi (~ 20,000 times)
Watershed Hydrology
SWAT
Model
Source: SWAT Theoretical Document
Challenges in Watershed Modeling
• Data Preparation
– Data exists, but files are large and require preprocessing
• Model Calibration
– Requires running the model multiple times with varying parameters
• Scale up Model to Large System (Chesapeake, Mississippi)
– Impractical using current approaches
HPC Cluster
Azure Compute Instances
Azure Compute Proxies
Early Results
(Cloud Futures, June 2011)
                   Stage-in   Compute   Total
Scientist laptop   0          55 sec    55 sec
Win2               5 sec      60 sec    65 sec
Azure (ex-large)   53 sec     32 sec    85 sec
Issues
(Cloud Futures, June 2011)
• Windows Azure only or cloudbursting?
• Data storage – where, how?
• Data sharing/reuse policy?
• Task granularity / coding?
• Task synchronization (e.g., MPI)?
The Calibration Method
• Inputs: a parameter series and settings (parameters, USGS gage ID, other settings) applied to the SWAT TxtInOut files
• Edit Input Files → Run SWAT → produces the SWAT TxtInOut results
• Get Streamflow: retrieve observed streamflow for the USGS gage ID from CUAHSI
• Calculate E: compare modeled against observed streamflow
• Analyze E values and create a new parameter series
• Is the calibration criterion satisfied? Yes → stop; No → repeat the loop
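The loop above can be sketched in a few lines. This is an illustrative sketch, not the authors' code: the SWAT run and the parameter-proposal step are passed in as stubs, and `nse` is the Nash-Sutcliffe efficiency, a common choice for the objective E in SWAT calibration.

```python
# Hypothetical sketch of the calibration loop in the flowchart above.
# run_swat and propose are stand-ins for the real SWAT execution and
# the "create new parameter series" step.

def nse(observed, modeled):
    """Nash-Sutcliffe efficiency: 1.0 is a perfect fit."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - m) ** 2 for o, m in zip(observed, modeled))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den

def calibrate(initial_params, observed_flow, run_swat, propose,
              max_iters=100, target_e=0.7):
    """run_swat(params) -> modeled flow; propose(params, e) -> new candidate."""
    params, best_params, best_e = initial_params, initial_params, float("-inf")
    for _ in range(max_iters):
        modeled = run_swat(params)        # edit TxtInOut inputs, run SWAT
        e = nse(observed_flow, modeled)   # calculate E vs. observed gage data
        if e > best_e:
            best_params, best_e = params, e
        if best_e >= target_e:            # calibration criterion satisfied?
            break
        params = propose(best_params, best_e)  # create new parameter series
    return best_params, best_e
```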
P-DDS
• Choose parameters/values
• Execute model (multiple instances in parallel)
• Choose best value
• Change candidate parameters/distributions and repeat
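One P-DDS iteration can be sketched as follows. This is an assumption-laden illustration rather than the authors' implementation: it uses the standard DDS neighborhood rule (perturb a shrinking random subset of dimensions with a Gaussian step), clamps rather than reflects at the bounds for brevity, and uses threads to stand in for the Azure worker instances.

```python
# Illustrative P-DDS step: generate several DDS-style candidates from the
# current best parameter set, evaluate them in parallel, keep the best.
import math
import random
from concurrent.futures import ThreadPoolExecutor

def dds_candidate(best, bounds, i, max_iter, r=0.2, rng=random):
    """Perturb a random subset of dimensions (DDS neighborhood rule)."""
    p_select = 1.0 - math.log(i) / math.log(max_iter) if i > 1 else 1.0
    cand = list(best)
    dims = [d for d in range(len(best)) if rng.random() < p_select] or \
           [rng.randrange(len(best))]     # always perturb at least one dim
    for d in dims:
        lo, hi = bounds[d]
        x = cand[d] + rng.gauss(0, r * (hi - lo))
        cand[d] = min(max(x, lo), hi)     # clamp to parameter bounds
    return cand

def pdds_step(best, best_e, objective, bounds, i, max_iter, n_workers=4):
    cands = [dds_candidate(best, bounds, i, max_iter) for _ in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        scores = list(pool.map(objective, cands))   # "Execute model" in parallel
    top = max(range(n_workers), key=lambda k: scores[k])  # "Choose best value"
    return (cands[top], scores[top]) if scores[top] > best_e else (best, best_e)
```

In the deck's setting, `objective` would run SWAT on a worker instance and return E; here it can be any function of the parameter vector.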
P-DDS
[Chart: P-DDS calibration run — objective (E) trace over model evaluations, with per-task timing breakdown: Stage In, Compute, Stage Out, Verification]
NW-DDS
• Choose parameters/values
• Execute model (multiple instances in parallel)
• Change candidate parameters/distributions and repeat (unlike P-DDS, no separate "choose best value" synchronization step)
NW-DDS
[Chart: NW-DDS calibration run — objective (E) trace over model evaluations, with per-task timing breakdown: Stage In, Compute, Stage Out, Verification]
Efficiency

Configuration                                            Duration                 Speedup
Desktop, SWAT internal calibration (single-threaded)     11.4 hours (41040 sec)   --
Windows Azure, Ex-large VM, 16 cores, NW-DDS             43.32 min (2599 sec)     15.78x
Windows Azure, Ex-large VM, 64 cores, NW-DDS             11.76 min (706 sec)      58.13x
Windows Azure, Ex-large VM, 256 cores, NW-DDS            5.03 min (302 sec)       135.89x
As the number of cores increases, runtime becomes dominated by [a] stragglers and [b] re-running the best parameter set to regenerate the output files
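The speedup column follows directly from the durations (speedup = desktop duration / Azure duration); a quick check reproduces the table to within rounding:

```python
# Sanity check of the speedups reported above: speedup is the
# single-threaded desktop duration divided by the Azure duration.
baseline = 41040  # seconds, desktop SWAT internal calibration
for cores, secs in [(16, 2599), (64, 706), (256, 302)]:
    print(f"{cores:3d} cores: {baseline / secs:.2f}x")
# prints 15.79x, 58.13x, 135.89x (the table reports 15.78x for 16 cores,
# presumably from unrounded timings)
```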
Re-visiting the Issues
(Cloud Futures, June 2011)
• Windows Azure only or cloudbursting?
– Only Windows Azure (with head node inside enterprise)
• Data storage – where, how?
– Only Windows Azure, so [a] “where” is largely a non-issue, and [b] “how” is as little as possible
• Data sharing/reuse policy?
– Not as much of an issue as we would like
• Task granularity / coding?
– Hmm… not exactly an issue (see next slide)
• Task synchronization (e.g., MPI)?
– Removed in an uninteresting way
Next steps: Calibration
• Additional watershed models (e.g., HSPF)
• Additional calibration algorithms
• Better control of search (e.g., stopping criteria)
– Visual results presentation and visual steering
– Trading off fast-vs.-more-exhaustive
• Better sharing / non-sharing
– Insta-virtual-cluster a la HadoopOnAzure
• How does insta-calibration change the watershed model design process? (continual calibration?)
Next steps: Data preparation
Observations on Windows Azure
• Does Windows Azure (more broadly: cloud computing) substantially change the challenges of collaboration between domains? No.
• What do we like?
– Cloudbursting mechanism (head node, VPN).
– Predictability is sufficient (and continues to improve).
• What would we like to see?
– Faster boot time for VMs.
– Non-round-up cost-model.
– Faster speed of light.
Summary
• Goals: Watershed calibration, data preparation, and large-scale modeling
• Good progress on watershed calibration; data preparation is next
• Basic issues of collaboration across domains will continue to dominate