Predicting Water Quality in Northwest Indiana Team members: Carl Summers, Zhe Wei Wang, Brian Hunter, Joseph Robertson Project Mentor: Dr. Ruijian Zhang Purdue University Calumet

Page 1

Predicting Water Quality in Northwest Indiana

Team members: Carl Summers, Zhe Wei Wang, Brian Hunter, Joseph Robertson

Project Mentor: Dr. Ruijian Zhang

Purdue University Calumet

Page 2

Purdue University Calumet Undergraduate Research

Achievements

  • Research extended to the IEEE CHC61 Web Programming Competition
  • Received funding through Purdue University Research Department to pursue See5.0 Web implementation
  • Collaborating with Indiana’s Department of Environmental Management

Page 3

Outline of Presentation:

Water Quality Prediction
  • Motivation
  • Preparing Data
  • Output of See5 decision tree

Website
  • Data Graphical Representation
  • Web Technologies
    • Flash Professional 8
    • Cascading Style Sheets
    • ASP.NET Framework 2.0

Page 4

I. Water Quality Prediction

Page 5

Traditional Mechanistic Models

  • Current mechanistic models require significant expert input to provide accurate forecasts.
  • These systems are typically used to predict trends in water quality over large regions and long time spans.
  • Improving the detail of a mechanistic model may be too difficult, costly, or time-consuming.

Page 6

Modeling Methods

[Diagram: modeling methods, with boxes labeled Artificial Intelligence, Data Mining, Bayesian Statistics, Decision Tree, See5, and Traditional Mechanistic Models]

Implement and compare Decision Trees, Bayesian Networks, and the traditional Mechanistic modeling techniques.

Page 7

See5 – A Decision Tree Tool

  • See5 generates a text file containing a rule set, which is used to classify (predict) each record in a data set into one of a discrete set of predetermined classes ({Good, Bad}, {Above, Normal, Below}, etc.).
  • It uses information gain, a measure from information theory, to determine which attributes to “split” the data on.
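To make the idea of information gain concrete, here is a minimal sketch of the textbook definition: the entropy of the class labels before a split minus the weighted entropy after it. It is not See5's implementation, which the slides do not show. The function names `entropy` and `information_gain` and the threshold of 10.0 are assumptions chosen for illustration; the sample values are the temperature and classification columns from the first six rows of the clustered data set on Page 11.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting a numeric attribute at `threshold`."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # the split separates nothing, so nothing is gained
    total = len(labels)
    remainder = (len(left) / total) * entropy(left) \
        + (len(right) / total) * entropy(right)
    return entropy(labels) - remainder


# Temperature and classification from the first six rows of the Page 11 table.
temperature = [14.9, 11.9, 9.5, 8.5, 9.3, 10.8]
labels = ["Good", "Bad", "Good", "Bad", "Bad", "Good"]

# The threshold 10.0 is an arbitrary illustrative choice.
gain = information_gain(temperature, labels, threshold=10.0)
print(f"Information gain for temperature <= 10.0: {gain:.4f}")
```

A tree builder would evaluate many candidate attributes and thresholds this way and split on the one with the highest score.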

Page 8

Data Set

  • Raw data was sparse
  • Many attributes were useless
  • Required extensive work to glean useful information
  • Not classified

Page 9

Clustering

[Diagram: Unclassified data from USGS → Clustering Process → Classified Data]

See5 requires classified input data.

Clustering is composed of two parts:

1) A function to group together similar points, and ultimately similar clusters. We refer to these functions collectively as Joining Methods.

2) A function to quantify the similarity between points or clusters. These are referred to as Similarity Metrics.
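The two parts can be sketched as a small agglomerative clustering routine. The slides do not say which joining method or similarity metric the team used, so this example assumes single linkage as the joining method and Euclidean distance as the metric (a smaller distance means more similar). The function names and the sample points are hypothetical.

```python
from math import dist  # Euclidean distance; available in Python 3.8+


def single_linkage(cluster_a, cluster_b):
    """Assumed joining method: distance between the two closest members."""
    return min(dist(p, q) for p in cluster_a for q in cluster_b)


def agglomerate(points, num_clusters):
    """Repeatedly merge the two closest clusters until num_clusters remain."""
    clusters = [[p] for p in points]  # start with every point on its own
    while len(clusters) > num_clusters:
        # Find the pair of clusters (i < j) with the smallest linkage distance.
        i, j = min(
            ((a, b) for a in range(len(clusters))
             for b in range(a + 1, len(clusters))),
            key=lambda pair: single_linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters


# Hypothetical (attribute 1, attribute 2) readings, not taken from the USGS data.
points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (5.0, 5.2), (5.1, 4.9)]
clusters = agglomerate(points, num_clusters=2)
for number, members in enumerate(clusters, start=1):
    print(f"Cluster {number}: {members}")
```

Each resulting cluster could then be given a class label such as Good or Bad, which is the classified input See5 needs.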

Page 10

Clustering

[Figure: clustering illustration; axes: Attribute 1 and Attribute 2]

Page 11

Clustered Data Set

Date        Precipitation  Suspended Sediment  Dissolved Oxygen  Flow Rate  Temperature  Classification
12/15/2006  0.34           28                  6.8               30         14.9         Good
12/22/2006  0              9                   7                 35         11.9         Bad
12/29/2006  1.6            10                  6.4               46         9.5          Good
1/5/2007    3              10                  6.4               52         8.5          Bad
1/12/2007   0.56           11                  5.9               31         9.3          Bad
1/19/2007   0              12                  8.4               43         10.8         Good
1/26/2007   0.12           20                  9.2               25         11.9         Bad
2/2/2007    0              21                  9.3               54         9.2          Bad
2/9/2007    0              20                  8.4               35         7.9          Good
2/16/2007   0.4            20                  6.4               47         8.9          Good
2/23/2007   0              17                  6.1               38         9.1          Good
3/2/2007    0.13           17                  6.2               29         11.4         Bad
3/9/2007    2.2            17                  6.7               50         11.7         Bad
3/16/2007   1.7            15                  5.5               50.1       11.9         Good
3/23/2007   0.09           18                  5.7               41         12.2         Good

Page 12

Offset Classification

Date        Precipitation  Suspended Sediment  Flow Rate  Temperature
12/15/2006  0.34           28                  30         14.9
12/22/2006  0              9                   35         11.9
12/29/2006  1.6            10                  46         9.5
1/5/2007    3              10                  52         8.5
1/12/2007   0.56           11                  31         9.3
1/19/2007   0              12                  43         10.8

Classification
Good
Bad
Good
Bad
Bad
Good
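The slide does not spell out how the offset is applied. One plausible reading, offered purely as an assumption, is that each sample's attributes are paired with the classification observed one sample later, so that the tree learns to predict ahead rather than describe the present. The function `offset_labels` and its `offset` parameter are hypothetical names.

```python
def offset_labels(rows, labels, offset=1):
    """Pair each attribute row with the label recorded `offset` samples later.

    Assumption: this is one interpretation of "offset classification";
    the slides do not confirm it.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    usable = max(0, len(rows) - offset)  # trailing rows have no future label
    return [(rows[i], labels[i + offset]) for i in range(usable)]


# Attribute rows from the Page 12 table:
# (date, precipitation, suspended sediment, flow rate, temperature).
rows = [
    ("12/15/2006", 0.34, 28, 30, 14.9),
    ("12/22/2006", 0, 9, 35, 11.9),
    ("12/29/2006", 1.6, 10, 46, 9.5),
    ("1/5/2007", 3, 10, 52, 8.5),
    ("1/12/2007", 0.56, 11, 31, 9.3),
    ("1/19/2007", 0, 12, 43, 10.8),
]
labels = ["Good", "Bad", "Good", "Bad", "Bad", "Good"]

shifted = offset_labels(rows, labels, offset=1)
for row, label in shifted:
    print(row[0], "->", label)
```

With an offset of one, the last row has no later classification to pair with and is dropped from the training set.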

Page 13

Decision Tree

Date        Precipitation  Suspended Sediment  Dissolved Oxygen  Flow Rate  Temperature  Classification
12/15/2006  0.34           28                  6.8               30         14.9         Good
12/22/2006  0              20                  7                 35         16           Bad
05/23/2007  1.6            10                  6.4               46         9.5          ???
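The last row is the kind of unclassified record a rule set is meant to label. As a rough sketch of how that could work, the example below checks a record against an ordered list of rules and falls back to a default class. The rules, thresholds, and default class are invented for illustration and are not See5 output; taking the first matching rule is also a simplification.

```python
import operator

# Comparison operators that a rule condition may use.
OPS = {"<=": operator.le, ">": operator.gt}

# Each rule: (list of (attribute, operator, threshold) conditions, predicted class).
# HYPOTHETICAL rules: thresholds are made up, not taken from See5.
HYPOTHETICAL_RULES = [
    ([("suspended_sediment", ">", 25)], "Good"),
    ([("temperature", ">", 15.0)], "Bad"),
]
DEFAULT_CLASS = "Good"  # hypothetical fallback when no rule matches


def classify(record, rules=HYPOTHETICAL_RULES, default=DEFAULT_CLASS):
    """Return the class of the first rule whose conditions all hold."""
    for conditions, predicted in rules:
        if all(OPS[op](record[attr], threshold)
               for attr, op, threshold in conditions):
            return predicted
    return default


# The unclassified 05/23/2007 record from the table above.
new_record = {
    "precipitation": 1.6,
    "suspended_sediment": 10,
    "dissolved_oxygen": 6.4,
    "flow_rate": 46,
    "temperature": 9.5,
}
prediction = classify(new_record)
print("Predicted class under the made-up rules:", prediction)
```

The printed class reflects only the invented rules and says nothing about the actual water quality on that date.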

Page 14

II. See5.0 Web Solution

Page 15

Objective

  • Share a visualization of the predictions generated by See5 with the public.
  • Provide viewers with a user interface that easily displays descriptive and complex data in a comprehensive environment.

Page 16

Methods

  • Provide a cross-platform interface by conforming to W3C standards
    • Web languages will function through various Web browsers
    • Provides consistency to define the appearance of an entire Web site
  • Take advantage of Web technologies
    • No package installation required from the user
    • Always available (per server uptime)
    • User interaction
    • Easy to deploy and manage

Page 17

Website

Page 18

Interactive Content Page

Page 19

Data Graphical Representation

  • Applying various languages to supply a fully scalable application to the user
    • Flash Professional 8 will provide rich animation and an elegant user interface
    • CSS will allow consistency of format throughout the site
    • ASP.NET 2.0 allows embedded Flash objects
      • Renders server-side code and code-behind files into plain HTML

Page 20

Flash Professional 8

  • Many users won’t be able to install arbitrary ActiveX controls or use a Java plug-in, whereas Flash is preinstalled with Windows on corporate machines; even most Linux distributions come pre-packaged with Flash
  • Flash can consume raw XML data to draw real-time graphs that make water quality easy to determine
  • Advantages of ActionScript 2.0
    • Object-oriented programming language
    • Permits vector-based objects to be manipulated quickly and easily, on the fly!
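The slides do not show the XML that the Flash front end consumed. As a sketch only, a server-side script could serialize predictions along these lines. The element and attribute names (`predictions`, `sample`, `date`, `classification`) are hypothetical, and Python is used purely for illustration, since the project itself ran on ASP.NET.

```python
import xml.etree.ElementTree as ET


def predictions_to_xml(predictions):
    """Serialize (date, classification) pairs into a small XML document.

    The schema here is an assumption made for this example.
    """
    root = ET.Element("predictions")
    for date, classification in predictions:
        ET.SubElement(root, "sample", date=date, classification=classification)
    return ET.tostring(root, encoding="unicode")


# Two rows taken from the clustered data set on Page 11.
xml_text = predictions_to_xml([("12/15/2006", "Good"), ("12/22/2006", "Bad")])
print(xml_text)
```

A client-side graphing component would then read each `sample` element and plot its classification against its date.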

Page 21

Cascading Style Sheets

  • Provides a standardized layout throughout the site
    • Modularity
    • The end result with CSS is cleaner code
  • Provides the user with a consistent interface
    • Consistent conventions throughout each page
  • CSS makes updating an easy task
    • Modifications to one style sheet can affect some or all of the pages linked to it

Page 22

ASP.NET Framework 2.0

  • Access to the .NET Framework 2.0 Class Library
  • Easy deployment, configuration, and management with IIS 6 (Windows Server 2003)
    • XML Metabase Schema provides quick deployment
    • Easy-to-use GUI management utility (inetmgr)
    • Quick to update with the latest security patches
  • Security authentication to lock out users without proper credentials to administer or view the content of the page

Page 23

Summary

  • Using clustering tools to classify data in preparation for See5
  • Using See5 to generate a rule set
  • Using the rule set to obtain predictions
  • Ultimately implementing and comparing other prediction methods
  • Providing a public website for the visualization of the predictions