ITSC/University of Alabama in Huntsville
ADaM System Architecture
Rahul Ramachandran, Sara Graves and Ken Keiser
Mathematical Challenges in Scientific Data Mining
IPAM January 14-18, 2002
Information Technology and Systems Center
University of Alabama in Huntsville
Talk Overview
Mining System Requirements
ADaM System Architecture
ADaM Plan Builder
Research directions
Mining System Requirements: When, Where and Who
WHEN
• Real Time
• On-Ingest
• On-Demand
• Repeatedly
WHERE
• User Workstation
• Data Archive Center
• Data Mining Center
WHO
• Casual Users
• Domain Experts
• Mining Experts
Data Mining
Algorithm Development and Mining (ADaM) System
The ADaM system was developed under a NASA research grant.
The system provides knowledge discovery, feature detection and content-based searching for data values as well as for metadata.
It contains over 120 different operations that can be performed on the input data stream.
Operations range from specialized, data-set-specific atmospheric science algorithms to digital image processing techniques and processing modules for automatic pattern recognition, machine perception, neural networks and genetic algorithms.
ADaM Features
Handles science data set variability
• Multiple resolution/multiple scales
• Variability of formats
• Granularity of data
• Includes spatial/temporal dimensions
Allows addition of new algorithms
Allows scientists to select and sequence different operations
ADaM Engine Architecture
[Figure: data passes from Input through Processing (Preprocessing, then Analysis) to Output. Stage labels in the figure: Data, Translated Data, Preprocessed Data, Patterns/Models, Results]
Input
• HDF, HDF-EOS, GIF, PIP-2, SSM/I Pathfinder, SSM/I TDR, SSM/I NESDIS Lvl 1B, SSM/I MSFC Brightness Temp, US Rain, Landsat, ASCII Grass, Vectors (ASCII Text), Intergraph Raster, Others...
Preprocessing
• Selection and Sampling: Subsetting, Subsampling, Select by Value, Coincidence Search
• Grid Manipulation: Grid Creation, Bin Aggregate, Bin Select, Grid Aggregate, Grid Select, Find Holes
• Image Processing: Cropping, Inversion, Thresholding
• Others...
Analysis
• Clustering: K Means, Isodata, Maximum
• Pattern Recognition: Bayes Classifier, Min. Dist. Classifier
• Image Analysis: Boundary Detection, Cooccurrence Matrix, Dilation and Erosion, Histogram Operations, Polygon Circumscript, Spatial Filtering, Texture Operations
• Genetic Algorithms
• Neural Networks
• Others...
Output
• GIF Images, HDF-EOS, HDF Raster Images, HDF SDS, Polygons (ASCII, DXF), SSM/I MSFC Brightness Temp, TIFF Images, Others...
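To make the preprocessing and analysis stages more concrete, here is a minimal Python sketch of three of the operation families named on this slide (subsetting, thresholding and a minimum-distance classifier). The function names, the class means and the grid values are all made up for illustration; ADaM's own implementations and interfaces are not shown in the slides.

```python
# Hypothetical stand-ins for operations named on the slide.
# None of these function names come from ADaM itself.

def subset(grid, row_range, col_range):
    """Selection and sampling: keep a rectangular window of a 2-D grid."""
    r0, r1 = row_range
    c0, c1 = col_range
    return [row[c0:c1] for row in grid[r0:r1]]


def threshold(grid, cutoff):
    """Image processing: map each cell to 1 if it is >= cutoff, else 0."""
    return [[1 if value >= cutoff else 0 for value in row] for row in grid]


def min_distance_classify(value, class_means):
    """Pattern recognition: return the label whose mean is closest to value."""
    return min(class_means, key=lambda label: abs(value - class_means[label]))


# Made-up brightness temperatures (kelvin), purely for illustration.
grid = [
    [210.0, 215.0, 260.0],
    [205.0, 255.0, 265.0],
    [200.0, 250.0, 270.0],
]
class_means = {"cold": 210.0, "warm": 260.0}  # assumed class centers

window = subset(grid, (0, 2), (1, 3))   # preprocessing: subsetting
mask = threshold(window, 250.0)         # preprocessing: thresholding
labels = [[min_distance_classify(v, class_means) for v in row]
          for row in window]            # analysis: classification
```

Each function takes a grid and returns a new one, which is what lets a scientist chain them in any order.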
ADaM Mining Environment
[Figure: components of the mining environment]
• Distributed Clients: Web-based, Workstation-based, Other Systems
• Common Client API
• Data Mining Server
• Mining Engine (ADaM): Input Modules, Analysis Modules, Output Modules
• Data Stores
• Knowledge Base
• Mining Results
• Analysis/Vis Tools
• Event/Relationship Search System
ADaM Architecture
ADaM Miner Engine
• Manages the processing of data through a series of specified operations
• Loads input, processing and output modules dynamically as needed at execution time
• Allows for the addition of newly developed modules without the need to rebuild the engine
• Interprets a mining plan script that provides the details about specified operations and the order in which they should be executed
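The idea of an engine that interprets a plan and runs the named modules in order can be sketched in Python as follows. The plan syntax here (one step per line, `<module> key=value ...`) and the three module names are assumptions invented for this sketch; the slides do not show ADaM's actual mining plan format.

```python
# Hypothetical plan format: one step per line, "<module> key=value ...".
# ADaM's real mining plan syntax is not shown on the slides.

REGISTRY = {}


def module(name):
    """Register a function under a module name the plan can refer to."""
    def register(func):
        REGISTRY[name] = func
        return func
    return register


@module("read_list")
def read_list(data, values):
    """Input module: parse comma-separated numbers (ignores incoming data)."""
    return [float(v) for v in values.split(",")]


@module("select_by_value")
def select_by_value(data, minimum):
    """Processing module: keep values at or above a minimum."""
    return [v for v in data if v >= float(minimum)]


@module("count")
def count(data):
    """Output module: reduce the stream to a count."""
    return len(data)


def parse_plan(text):
    """Turn plan text into an ordered list of (module name, parameters)."""
    steps = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, *pairs = line.split()
        params = dict(pair.split("=", 1) for pair in pairs)
        steps.append((name, params))
    return steps


def run_plan(text):
    """Execute each step in order, feeding each result to the next step."""
    data = None
    for name, params in parse_plan(text):
        if name not in REGISTRY:
            raise KeyError(f"unknown module: {name}")
        data = REGISTRY[name](data, **params)
    return data


PLAN = """
# input, then processing, then output
read_list values=1,5,9,12
select_by_value minimum=5
count
"""
result = run_plan(PLAN)
```

Because modules are looked up by name when a step runs, adding a new module means registering one more function. `run_plan` itself does not change.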
ADaM Miner Database
• Used to store information that includes the names, locations and related metadata for input data sets available on the server
• Includes information about users, jobs, mining results, and other related information
• Simple relational database
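A simple relational layout covering the kinds of information listed above might look like the following sketch, written with Python's standard `sqlite3` module. The table names, column names and sample rows are assumptions made for illustration; the slides do not give the actual Miner Database schema or which database product it uses.

```python
import sqlite3

# In-memory database; every table and column name below is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE datasets (
    id           INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,      -- data file name
    location     TEXT NOT NULL,      -- path on the server
    input_filter TEXT                -- reader associated with the file
);
CREATE TABLE users (
    id    INTEGER PRIMARY KEY,
    login TEXT UNIQUE NOT NULL
);
CREATE TABLE jobs (
    id         INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(id),
    dataset_id INTEGER NOT NULL REFERENCES datasets(id),
    status     TEXT NOT NULL DEFAULT 'queued'
);
CREATE TABLE results (
    id       INTEGER PRIMARY KEY,
    job_id   INTEGER NOT NULL REFERENCES jobs(id),
    location TEXT NOT NULL
);
""")

# Made-up sample rows.
conn.execute(
    "INSERT INTO datasets (name, location, input_filter) VALUES (?, ?, ?)",
    ("example.hdf", "/data/example", "HDF"),
)
conn.execute("INSERT INTO users (login) VALUES (?)", ("alice",))
conn.execute("INSERT INTO jobs (user_id, dataset_id) VALUES (1, 1)")

# Which user is mining which data set, and what state is the job in?
rows = conn.execute("""
    SELECT users.login, datasets.name, jobs.status
    FROM jobs
    JOIN users ON users.id = jobs.user_id
    JOIN datasets ON datasets.id = jobs.dataset_id
""").fetchall()
```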
ADaM Daemon and Scheduler
Scheduler
• Examines the list of jobs to be executed on the server and determines which job or jobs to execute at any given time
• Queues the requests and executes them sequentially
Daemon
• Handles all network communications with the mining system
• Is configured to listen on a specific port for any socket communications
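The scheduler's queue-and-run-sequentially behavior can be illustrated with a first-in, first-out queue. This is a minimal Python sketch under assumed names (`Scheduler`, `submit`, `run_pending`). It covers only the scheduler side, not the daemon's socket handling, and it is not ADaM's implementation.

```python
from collections import deque


class Scheduler:
    """Hypothetical FIFO scheduler: jobs run one at a time, in arrival order."""

    def __init__(self):
        self._queue = deque()
        self.completed = []  # (job_id, result) pairs, in execution order

    def submit(self, job_id, task):
        """Queue a request; task is any zero-argument callable."""
        self._queue.append((job_id, task))

    def run_pending(self):
        """Execute queued jobs sequentially until the queue is empty."""
        while self._queue:
            job_id, task = self._queue.popleft()
            self.completed.append((job_id, task()))


scheduler = Scheduler()
scheduler.submit("job-1", lambda: sum([1, 2, 3]))
scheduler.submit("job-2", lambda: max([4, 9, 2]))
scheduler.run_pending()
```

In a full system the daemon would call something like `submit` whenever a request arrives on its port, while the scheduler drains the queue independently.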
ADaM Input/Operation Filters
• Input/Output Filters are data readers and writers
• Operations are the algorithms
• Each of the operations and (input/output) filters is implemented as a shared library
• New modules may be added to the system without recompiling or relinking
• All operations/filters either produce or operate on a data collection, which provides a common format for representing scientific data
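ADaM implements its modules as shared libraries. A rough Python analogue of the same two ideas (resolving a module by name at run time, and passing a common data collection between modules) is sketched below. The `DataCollection` class, the `module:function` naming scheme and the use of `statistics.mean` as a stand-in operation are assumptions for illustration only.

```python
import importlib
from dataclasses import dataclass, field


@dataclass
class DataCollection:
    """Hypothetical common container that every filter/operation exchanges."""
    values: list
    metadata: dict = field(default_factory=dict)


def load_operation(spec):
    """Resolve 'module:function' to a callable at run time.

    This plays the role that loading a shared library plays in ADaM:
    the caller only needs the name, not a compile-time dependency.
    """
    module_name, func_name = spec.split(":", 1)
    module = importlib.import_module(module_name)
    return getattr(module, func_name)


def apply_operation(spec, collection):
    """Run a dynamically loaded operation and wrap its result again."""
    operation = load_operation(spec)
    result = operation(collection.values)
    return DataCollection([result], {**collection.metadata, "operation": spec})


# Made-up sample data; statistics.mean stands in for a mining operation.
source = DataCollection([2.0, 4.0, 9.0], {"source": "made-up sample"})
summary = apply_operation("statistics:mean", source)
```

Because every step consumes and returns a `DataCollection`, a newly added operation works with existing readers and writers without further changes.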
General Mining Steps
Select data files to be mined
“Check-In” the data files into the Miner Database
Write a “Mining Plan” consisting of a sequence of input filters and operations
Execute the Mining Plan using the engine
Check and save results
Iterate
What is Check-In?
• Process of encoding information such as the names, locations and related metadata for input data sets available on the server
• Creates a complex data hierarchy in the database
ADaM Plan Builder: Check-In
Two Modes of Operation
• General: requires only minimal information
• Advanced: requires more detailed information and allows the user to set up a structured database
Path to the data file
Data file name
Input Filter associated with the data file
Load an XML file containing existing Check-In specifications
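The slides do not show what a Check-In XML file looks like, so the following Python sketch uses an invented layout. The element names `checkin`, `datafile`, `path`, `name` and `inputFilter` are hypothetical, as are the sample values. It only illustrates how the three fields above (path, file name, input filter) could be saved to XML and loaded back.

```python
import xml.etree.ElementTree as ET


def build_checkin(path, filename, input_filter):
    """Serialize one check-in entry to XML text (hypothetical element names)."""
    root = ET.Element("checkin")
    entry = ET.SubElement(root, "datafile")
    ET.SubElement(entry, "path").text = path
    ET.SubElement(entry, "name").text = filename
    ET.SubElement(entry, "inputFilter").text = input_filter
    return ET.tostring(root, encoding="unicode")


def load_checkin(xml_text):
    """Parse check-in XML back into a list of plain dictionaries."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: child.text for child in entry}
        for entry in root.findall("datafile")
    ]


# Made-up values for illustration.
xml_text = build_checkin("/data/ssmi", "sample.hdf", "HDF")
records = load_checkin(xml_text)
```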
ADaM Plan Builder – Layout
Plan Menu allows one to:
• Select a new plan
• Load existing plan
• Check-In data
Input Menu contains the list of Input Filters one can select
Operation Menu contains the list of operations one can select
ADaM Plan Builder – Layout
Panel where Mining Plan can be viewed either as text or a tree
ADaM Plan Builder – Layout
A description of the Operation/Input Filter can be viewed in this panel
ADaM Plan Builder – Layout
All the parameters needed for the Operation are described here
ADaM Plan Builder – Layout
Sample values for the Operation’s parameters are shown in this panel
ADaM Plan Builder – Layout
Go Mine the data using the Mining Plan
Allows the user to select the operation and add it to the Mining Plan
Research Directions
Generic Data Reader for ADaM
• ESML – Earth Science Markup Language
Programmer's Guide for ADaM
Distributed Mining
Grid Mining
• Successful implementation and testing of the ADaM system on the NASA Information Power Grid
Mining Onboard the Spacecraft
• The EnVironmEnt for On-Board Processing (EVE) system
ADaM Information
Web site: datamining.itsc.uah.edu
ADaM Lite beta version download
Contact: [email protected]