Yan Huang huangyan@unt.edu 6350 Spatio-temporal Data Processing Course Overview
- Slide 1
- Yan Huang huangyan@unt.edu 6350 Spatio-temporal Data Processing
Course Overview
- Slide 2
- Basic Information Instructor: Yan Huang (huangyan at unt.edu)
Meeting place and time: M 2:30-5:20pm B157 Office hours: M
12:30-2:30pm
- Slide 3
- Basic Information TA: Sasi Koneru (SasiKoneru@my.unt.edu)
Office hours: Monday 10:00 AM to 2:00 PM, F208
- Slide 4
- Evaluation The evaluation scheme will be: class participation
10%, paper analysis and presentation 25%, project 40%, term paper
30%
- Slide 5
- Classroom policy No computers or laptops unless told so.
- Slide 6
- Paper Analysis I Collect 5 or more papers in one sub-area. Write
short summaries for 3 of them (100-200 words each). Make a 15-minute
presentation on what you learned about this topic. The presentation will
take an integrated approach where you introduce the motivation of
the three papers, give a precise problem definition, compare and
contrast the ways the 3 papers approach the problem and how they
validate their results, present conclusions, and point to some
future directions if you can identify any.
- Slide 7
- Paper Analysis II Choose and present one paper from the reading
list. Collect two questions from each group. Ask two questions
yourself. Lead group discussion. Detailed instructions are available
from: http://www.cse.unt.edu/~huangyan/6350/paperAnalysis.txt One
paper every week.
- Slide 8
- Find Related Work Need to know the key words. May need to
explore and refine during your search. Often you can find electronic
versions of the papers, especially for publications related to
computer science: authors' websites, ACM Digital Library, IEEE Xplore,
Springer Online, Google Scholar. Your school typically subscribes to
these publishers. Search from a computer with an IP address belonging
to your school.
- Slide 9
- Computer Science Bibliography Collections CiteSeer
http://citeseer.ist.psu.edu/ DBLP
http://www.informatik.uni-trier.de/~ley/db/ Google Scholar
http://scholar.google.com/ ACM Digital Library
http://portal.acm.org/dl.cfm IEEE Xplore
http://ieeexplore.ieee.org/
- Slide 10
- One Way to Find Related Papers Follow the vine to reach the
melon; follow the melon back along its vine
- Slide 11
- Term Project ACMGIS CUP 2014 Teams of up to 2 people March 03,
10-minute presentation on algorithm design and cost analysis Score
is based on the normalized grade you get from your submission.
- Slide 12
- Term Paper Two choices Term paper Survey paper
- Slide 13
- Term paper Research oriented Key components: Problem Statement,
Significance of the problem Related Work and Our Contributions
Proposed Approach Validation of listed contributions (experimental,
analytical) Conclusions and Future Work
- Slide 14
- Survey paper Key components Problem Statement, Significance of
the problem Our Contributions (usually it is the
categorization/classification of the research literature) A
classification of the papers related to the problem. Use a concept
hierarchy, figures, and diagrams if necessary. Summarize, classify,
contrast, and compare the research literature according to your
classification scheme A summary of the trend and future work of
this line of research. Conclusion.
- Slide 15
- Spatial Databases (SDBMS) Traditional (non-spatial) database
management systems provide: Persistence across failures Allows
concurrent access to data Scalability to search queries on very
large datasets which do not fit inside main memories of computers
Efficient for non-spatial queries, but not for spatial queries
Non-spatial queries: List the names of all bookstores with more than
ten thousand titles. List the names of the top ten customers, in terms of
sales, in the year 2001. Use an index to narrow down the search.
Spatial queries: List the names of all bookstores within ten miles of
Minneapolis. List all customers who live in Tennessee and its
adjoining states. List all the customers who reside within fifty
miles of the company headquarters.
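A minimal sketch of the "within ten miles of Minneapolis" query, evaluated as a plain linear scan with the haversine great-circle distance. The bookstore names and coordinates are hypothetical, and the Earth radius is an approximation; an SDBMS would answer the same query through a spatial index instead of scanning every row.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # approximate mean Earth radius (assumption)

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def within_radius(places, center, radius_miles):
    """Linear-scan range query: names of places within radius_miles of center."""
    clat, clon = center
    return [name for name, lat, lon in places
            if haversine_miles(lat, lon, clat, clon) <= radius_miles]

# Hypothetical data: (name, latitude, longitude)
BOOKSTORES = [
    ("Downtown Books", 44.9778, -93.2650),  # central Minneapolis
    ("St. Paul Pages", 44.9537, -93.0900),  # a few miles to the east
    ("Duluth Reads", 46.7867, -92.1005),    # far to the north
]
MINNEAPOLIS = (44.9778, -93.2650)

if __name__ == "__main__":
    print(within_radius(BOOKSTORES, MINNEAPOLIS, 10))
```

The scan touches every record, which is exactly the cost a spatial index is meant to avoid on large datasets.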
- Slide 16
- Value of SDBMS Examples of non-spatial data Names, phone
numbers, email addresses of people Examples of Spatial data Census
Data NASA satellites imagery - terabytes of data per day Weather
and Climate Data Rivers, Farms, ecological impact Medical Imaging
Exercise: Identify spatial and non-spatial data items in A phone
book A Product catalog
- Slide 17
- User, Application domains Many important application domains
have spatial data and queries. Some Examples follow: Army Field
Commander: Has there been any significant enemy troop movement
since last night? Insurance Risk Manager: Which homes are most
likely to be affected in the next great flood on the Mississippi?
Medical Doctor: Based on this patient's MRI, have we treated
somebody with a similar condition? Molecular Biologist: Is the
topology of the amino acid biosynthesis gene in the genome found in
any other sequence feature map in the database? Astronomer: Find
all blue galaxies within 2 arcmin of quasars. Exercise: List two
ways you have used spatial data. Which software did you use to
manipulate spatial data?
- Slide 18
- SDBMS An SDBMS is a software module that: can work with an
underlying DBMS; supports spatial data models, spatial abstract data
types (ADTs) and a query language from which these ADTs are
callable; supports spatial indexing, efficient algorithms for
processing spatial operations, and domain-specific rules for query
optimization. Example: Oracle Spatial data cartridge, ESRI SDE: can
work with Oracle DBMS; has spatial data types (e.g. polygon) and
operations (e.g. overlap) callable from the SQL3 query language; has
spatial indices, e.g. R-trees. IBM: Spatial Option. Informix: Spatial
Datablade.
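To illustrate how a spatial index narrows a search, here is a rough sketch that uses a uniform grid instead of an R-tree. This is a deliberate simplification and not how the products named above are implemented; the `GridIndex` class, its method names, and the sample points are all hypothetical.

```python
from collections import defaultdict

class GridIndex:
    """Uniform grid over 2D points: a simplified stand-in for an R-tree."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # (cx, cy) -> [(name, x, y), ...]

    def _cell(self, x, y):
        # Map a coordinate to the integer grid cell that contains it.
        return (int(x // self.cell_size), int(y // self.cell_size))

    def insert(self, name, x, y):
        self.cells[self._cell(x, y)].append((name, x, y))

    def range_query(self, xmin, ymin, xmax, ymax):
        """Names of points inside the rectangle, visiting only overlapping cells."""
        cx0, cy0 = self._cell(xmin, ymin)
        cx1, cy1 = self._cell(xmax, ymax)
        hits = []
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                for name, x, y in self.cells.get((cx, cy), ()):
                    # Cells only filter candidates; the exact test is still needed.
                    if xmin <= x <= xmax and ymin <= y <= ymax:
                        hits.append(name)
        return hits

def build_demo_index():
    index = GridIndex(cell_size=10)
    for name, x, y in [("a", 1, 1), ("b", 12, 3), ("c", 25, 25), ("d", 9.5, 9.5)]:
        index.insert(name, x, y)
    return index

if __name__ == "__main__":
    print(sorted(build_demo_index().range_query(0, 0, 10, 10)))
```

The two-step pattern (cheap filter by cell, then exact refinement) is the same idea that tree-based spatial indices apply with bounding rectangles.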
- Slide 19
- SDBMS vs. GIS GIS is software to visualize and analyze
spatial data using spatial analysis functions such as Search
Thematic search, search by region, (re-)classification Location
analysis Buffer, corridor, overlay Terrain analysis Slope/aspect,
catchment, drainage network Flow analysis Connectivity, shortest
path Distribution Change detection, proximity, nearest neighbor
Spatial analysis/Statistics Pattern, centrality, autocorrelation,
indices of similarity, topology: hole description Measurements
Distance, perimeter, shape, adjacency, direction GIS uses SDBMS to
store, search, query, share large spatial data sets
- Slide 20
- SDBMS vs. GIS SDBMS focuses on Efficient storage, querying,
sharing of large spatial datasets Provides simpler set based query
operations Example operations: search by region, overlay, nearest
neighbor, distance, adjacency, perimeter etc. Uses spatial indices
and query optimization to speedup queries over large spatial
datasets. SDBMS may be used by applications other than GIS
Astronomy, Genomics, Multimedia information systems,...
- Slide 21
- Issues in SDBMS Spatial data model Query language Query
processing File organization and indices Query optimization,
etc.
- Slide 22
- Spatio-temporal Databases Add temporal dimension Examples:
Trajectories Evolving region Moving points
- Slide 23
- Geo-stream databases Many data are generated continuously
Transaction data Network monitoring Financial application Most
recent data are commonly queried in a one-pass fashion Monitoring
Aggregation Database system provides abstractions and declarative
languages that stream processing can benefit from
- Slide 24
- Stream Application Environmental monitoring Patient monitoring
Finance Network monitoring Click-streams Transaction monitoring
Traffic analysis Moving object queries Sensor network RFID
- Slide 25
- Sample Applications Environmental monitoring Notify me when UV
is high, temperature is low Traffic monitoring Traffic jam:
aggregated speed much below speed limit on a road segment for
extended time Accident: vehicle on unintended space, e.g. highway
for longer than expected time Click-streams Find the school
districts of the houses that the user browses the most.
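The traffic-jam rule above (aggregated speed well below the speed limit on a road segment for an extended time) can be sketched as a one-pass sliding-window monitor for a single segment. The class name, the 50% threshold ratio, and the window and duration defaults are illustrative assumptions, and readings are assumed to arrive in timestamp order.

```python
from collections import deque

class JamDetector:
    """One-pass jam monitor for one road segment (illustrative parameters)."""

    def __init__(self, speed_limit, ratio=0.5, window=60.0, min_duration=120.0):
        self.threshold = ratio * speed_limit  # "much below the limit" cut-off
        self.window = window                  # seconds of readings to aggregate
        self.min_duration = min_duration      # seconds the slowdown must persist
        self.readings = deque()               # (timestamp, speed) inside the window
        self.total = 0.0                      # running sum of speeds in the window
        self.slow_since = None                # when the current slowdown began

    def observe(self, timestamp, speed):
        """Consume one reading; return True while a jam is detected."""
        self.readings.append((timestamp, speed))
        self.total += speed
        # Evict readings that have fallen out of the time window.
        while self.readings[0][0] <= timestamp - self.window:
            _, old_speed = self.readings.popleft()
            self.total -= old_speed
        mean_speed = self.total / len(self.readings)
        if mean_speed < self.threshold:
            if self.slow_since is None:
                self.slow_since = timestamp
            return timestamp - self.slow_since >= self.min_duration
        self.slow_since = None
        return False

if __name__ == "__main__":
    detector = JamDetector(speed_limit=60)
    stream = [(0, 55), (30, 58)] + [(t, 20) for t in range(60, 241, 30)]
    for t, s in stream:
        print(t, s, detector.observe(t, s))
```

Each reading is processed once and only the current window is kept in memory, which matches the one-pass style of stream queries described on the earlier slides.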
- Slide 26
- Geo-streams Current streams systems lack native spatial support
Spatial stream queries are common in traffic monitoring environment
monitoring moving object databases
- Slide 27
- Location Privacy
- Slide 28
- Route prediction Next position Next stop The entire route
Application: Mobile commerce Save energy Traffic notification
- Slide 29
- Location-based social networking Social networking with
location: Loopt, Google Latitude, Geocache Social dynamics iPhone
applications
- Slide 30
- Volunteer Geographic Information System OpenStreetMap,
Wikimapia Foursquare Trapster
- Slide 31
- Spatio-temporal Analytics "Everything is related to everything
else, but near things are more related than distant things." The
analysis of data with both spatial and temporal information The
data are spatially and/or temporally correlated
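The quoted statement is Tobler's first law of geography. One common way to quantify such spatial correlation is Moran's I, sketched below with binary adjacency weights on a one-dimensional chain of cells. The helper names and the sample values are hypothetical; real analyses would use a proper spatial weights matrix.

```python
def morans_i(values, neighbors):
    """Moran's I with binary weights; neighbors maps index -> list of indices."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    weight_sum = 0   # total weight W (each neighbor link counts 1)
    cross = 0.0      # sum over links of dev[i] * dev[j]
    for i, js in neighbors.items():
        for j in js:
            weight_sum += 1
            cross += dev[i] * dev[j]
    variance_sum = sum(d * d for d in dev)
    return (n / weight_sum) * (cross / variance_sum)

def chain_neighbors(n):
    """Adjacency for n cells in a row: each cell neighbors the cells beside it."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}

if __name__ == "__main__":
    nb = chain_neighbors(6)
    print(morans_i([1, 1, 1, 5, 5, 5], nb))  # similar values sit together
    print(morans_i([1, 5, 1, 5, 1, 5], nb))  # neighbors always differ
```

Values near +1 indicate that nearby cells are alike, values near -1 that neighbors are dissimilar, and values near 0 suggest no spatial pattern.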
- Slide 32
- Why do we need spatio-temporal analytics Analytics help us to
describe what happened in the past, understand what is happening
now, predict what will happen in the future, and make decisions.
The proliferation of sensor devices makes spatio-temporal
information a fundamental component of almost every analytical
application
- Slide 33
- Types of Spatio-Temporal Analytics Methods Visualization and
exploratory analysis Segmentation (classification and clustering)
Outlier analysis Colocation mining Dependency analysis Trend
discovery
- Slide 34
- Data Visualization and Exploratory Analysis Map querying task
Static query (one-time query using map tools available on the
interface) Dynamic query [36] (setup of event alert conditions)
Spatial constraints are expressed using the map, while temporal
constraints are expressed as linear time moments [37] Map animation
[38] Focusing, linking and arranging views [39] Map iteration [40]
Existential changes [25] Location changes Attribute Changes
- Slide 35
- Data Visualization and Exploratory Analysis: Example
- Slide 36
- Segmentation methods Classification [41] Spatial
classification: decision tree, Bayesian, ANN Temporal
classification: decision tree, Bayesian, ANN Temporal extensions to
spatial classification/ Spatial extension to temporal
classification Clustering [42] Spatial clustering: partitioning
method, hierarchical method, density based method, and grid-based
method. Temporal clustering Interactive spatio-temporal clustering:
perform clustering spatially or temporally and then test whether
the clusters exist in both dimensions (EMM Test [43]) Simultaneous
spatio-temporal clustering: space-time scan [44]
- Slide 37
- More on Spatio-Temporal Clustering
- Slide 38
- Model-based clustering [46] define a multivariate density
distribution and look for a set of fitting parameters for the
model. Distance-based method Moving object similarity search
Density-based method DBSCAN extensions, OPTICS [47] Flocks and
convoy Moving clusters [47] Applications: movement data, cellular
networks, environment data
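As a rough sketch of a density-based method extended to space and time, the code below runs a basic DBSCAN in which two points are neighbors only if they are close both spatially and temporally. It is a simplified illustration, not the algorithm of any cited paper; the function names, parameters, and sample points are assumptions.

```python
import math

NOISE = -1

def st_neighbors(points, i, eps_space, eps_time):
    """Indices of points within eps_space and eps_time of point i (includes i)."""
    x, y, t = points[i]
    return [j for j, (px, py, pt) in enumerate(points)
            if math.hypot(px - x, py - y) <= eps_space and abs(pt - t) <= eps_time]

def st_dbscan(points, eps_space, eps_time, min_pts):
    """Label each (x, y, t) point with a cluster id, or NOISE."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = st_neighbors(points, i, eps_space, eps_time)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = st_neighbors(points, j, eps_space, eps_time)
            if len(nbrs) >= min_pts:  # core point: keep expanding the cluster
                queue.extend(nbrs)
        cluster += 1
    return labels

# Hypothetical data: two groups at the same place but far apart in time.
POINTS = [(0, 0, 0), (0.5, 0, 1), (0, 0.5, 2),
          (0, 0, 100), (0.5, 0, 101), (0, 0.5, 102),
          (10, 10, 50)]

if __name__ == "__main__":
    print(st_dbscan(POINTS, eps_space=1, eps_time=5, min_pts=3))
```

With a small temporal radius the two groups form separate clusters even though they share a location; widening the temporal radius merges them, which shows why the time dimension changes the clustering result.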
- Slide 39
- Spatio-Temporal Clustering: Example
- Slide 40
- Spatio-Temporal Outlier Analysis Definition of outliers
spatial-temporal object whose thematic attribute values are
significantly different from those of other spatially and temporally
referenced objects in its spatial or/and temporal neighborhoods.
Methods [48] Clustering-based approach Distance based approach
Computational geometry based approach Spatial scan based
approach
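The outlier definition above can be illustrated with a simple distance-based check: compare each observation's attribute value with the mean over its spatio-temporal neighborhood. The fixed `max_diff` threshold stands in for a proper significance test, and the function name and sensor readings are hypothetical.

```python
import math

def st_outliers(observations, eps_space, eps_time, max_diff):
    """Indices of (x, y, t, value) observations that deviate from their neighbors."""
    flagged = []
    for i, (x, y, t, value) in enumerate(observations):
        # Neighborhood: other observations close in both space and time.
        neighbor_values = [
            v for j, (nx, ny, nt, v) in enumerate(observations)
            if j != i
            and math.hypot(nx - x, ny - y) <= eps_space
            and abs(nt - t) <= eps_time
        ]
        if not neighbor_values:
            continue  # no neighborhood, so nothing to compare against
        neighbor_mean = sum(neighbor_values) / len(neighbor_values)
        if abs(value - neighbor_mean) > max_diff:
            flagged.append(i)
    return flagged

# Hypothetical temperature readings from three sensors at two time steps.
READINGS = [
    (0, 0, 0, 20), (1, 0, 0, 21), (0, 1, 0, 19),
    (0, 0, 1, 20), (1, 0, 1, 35), (0, 1, 1, 20),
]

if __name__ == "__main__":
    print(st_outliers(READINGS, eps_space=1.5, eps_time=1, max_diff=8))
```

Only the reading of 35 is flagged, because it differs sharply from the values reported nearby in both space and time.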
- Slide 41
- Spatio-Temporal Outlier Detection: Example
- Slide 42
- Co-Location Mining Colocation mining finds subsets of Boolean
features located in spatial proximity Methods [50] Data
mining-based approach Spatial statistical approach Buffer-based
model Temporal extension: mixed-drove approach, weighted
window-based model [51]
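A small sketch of how the strength of a co-location can be measured, using the participation index as it is commonly defined in the co-location literature (for example [50]), restricted here to a pair of features. The function name, the distance threshold, and the sample coordinates are assumptions for illustration.

```python
import math

def participation_index(instances_a, instances_b, max_dist):
    """Participation index of the pair {A, B} under a distance-based neighbor relation.

    The participation ratio of a feature is the fraction of its instances that
    have at least one instance of the other feature within max_dist; the index
    is the smaller of the two ratios.
    """
    def near(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1]) <= max_dist

    a_in = sum(1 for a in instances_a if any(near(a, b) for b in instances_b))
    b_in = sum(1 for b in instances_b if any(near(a, b) for a in instances_a))
    return min(a_in / len(instances_a), b_in / len(instances_b))

# Hypothetical instance locations for two Boolean features.
FEATURE_A = [(0, 0), (5, 5), (10, 10), (20, 20)]
FEATURE_B = [(0, 1), (5, 6)]

if __name__ == "__main__":
    print(participation_index(FEATURE_A, FEATURE_B, max_dist=1.5))
```

Here every B instance has an A neighbor, but only half of the A instances have a B neighbor, so the index is limited by feature A.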
- Slide 43
- Co-Location Mining: Example
- Slide 44
- Other methods Association rule mining Spatial preprocessing is
required to discretize spatial measurements Methods [49] Bayesian
networks Hierarchical approach Trend discovery Regression Sequence
mining
- Slide 45
- List of Current Spatio-Temporal Analytics Tools Commercial ESRI
ArcGIS series Microsoft SQL Spatial +StreamInsight Other commercial
tools Open source/free software Descartes and CommonGIS MapServer
Other free tools
- Slide 46
- ESRI ArcGIS Series ArcGIS desktop and server provide most
advanced and complete toolkit Has many extensions for different
domains Can use APIs to develop extensions, web or desktop
applications for customized needs. Many other commercial tools such
as CUBE [9] are built on top of ArcGIS.
- Slide 47
- ESRI ArcGIS Desktop and Server Extensions [1] 3D Extension
(Desktop and Server) Analyze terrain data, model subsurface
features, view and analyze impact zones, determine optimum facility
placement, share 3D views, create a 3D virtual city. Geostatistical
Extension (Desktop and Server) Visualize, model, and predict
spatial relationships. Link data, graphs, and maps dynamically.
Perform deterministic and geostatistical interpolation. Evaluate
models and predictions probabilistically
- Slide 48
- ESRI ArcGIS Desktop and Server Extensions Network Extension
(Desktop and Server) Dynamically model realistic network conditions
and solve vehicle routing problems Multipoint optimized routing,
time-sensitive, turn-by-turn driving directions, allocation of
service areas, determining the fastest fixed route to the closest
facility Schematics Extension (Desktop and Server) Rapid checking
of network connectivity Automatically generate schematics
- Slide 49
- ESRI ArcGIS Desktop and Server Extensions Spatial extension
(Desktop and Server) Comprehensive, raster-based spatial modeling
and analysis. Survey Extension (Desktop) Capture, edit, and
leverage land records using proven survey methodologies Tracking
Extension (Desktop) Create time series visualizations so you can
analyze information relative to time and location
- Slide 50
- ESRI Domain-Specific Solutions ESRI Business Analyst Online
Web-based solution that combines GIS technology with extensive
demographic, consumer spending, and business data for the entire
United States to deliver on- demand, boardroom-ready reports and
maps Perform drive-time analysis Analyze trade areas Evaluate sites
Identify most profitable customers and reach customers
- Slide 51
- ESRI Domain-Specific Solutions ArcGIS Community Analyst
Web-based solution that provides GIS capabilities to analyze data
in a geographic context as granular as congressional district,
block groups, census tracts, or ZIP Codes. ArcLogistics Create
optimized routes and schedules based on multiple factors such as
customer needs, business rules, vehicle traits, and street
restrictions. Esri Situational Awareness Provides a geospatial
framework for immediate and long-term situational awareness needs.
Includes a powerful data fusion and analysis engine; a set of fully
customizable clients for data visualization and analysis; and
locally hosted, prerendered data.
- Slide 52
- Microsoft SQL Library + StreamInsight [2] Combines SQL Server
spatial library with stream processing engine Integrating SQL
library within StreamInsight engine Focuses on data stream event
processing workflow GIS support relies on SQL Server (limited), and
therefore needs extensive customization for applications
- Slide 53
- Other Commercial tools by category Complete GIS Suite (similar
to ArcGIS) Cadcorp SIS [8], Geomedia [17], IDRISI [18], Mapinfo
[19] Spatio-temporal analysis STIS [23] Network (traffic) analysis
tools ACCESSION GIS [3], AltaMap Suite [4], CUBE[9], DYNAMEQ[15],
EMME[14] Terrain analysis ANUDEM [5] CAD applications AutoCAD
Civil3D [6], Emergency and hazard modeling and analysis CadnaA
[10], Calpuff View [11],Caris [12],CATS [13],Floodworks [16]
Specialized analysis ClusterSeer and BoundarySeer [7] (cluster and
boundary analysis), Mathematica [20] Mathematics toolkit Matlab
Plus toolbox [21], SPSS [22]
- Slide 54
- Descartes and CommonGIS [24] An interactive Java-based GIS tool
for visualization and exploratory analysis. Functionalities: Map and
graph visualization (choropleth maps, scatter plots), basic queries
(distance, difference), dynamic queries. Open source and
customizable, but lacks advanced GIS analytics functionalities
- Slide 55
- MapServer [31] Open source GIS data rendering engine
Functionalities Advanced cartographic output Cross platform and
APIs for all popular scripting languages Support many formats OGC
standard compliant Not a full GIS suite
- Slide 56
- Other Free/Open Source tools by category Complete GIS Suite
(similar to ArcGIS) GRASS [28] Spatio-temporal analysis Map
comparison kit [30], STAR [34] Terrain analysis Landserf [29]
Exploratory data analysis GeoDA [26] Database extension PostGIS
[32] Specialized analysis GAM/K [25] (Clustering), GRASP [27]
(Regression) Mathematics toolkit R Spatial [33]
- Slide 57
- Spatio-temporal analytics is becoming a fundamental component
of business analytics The future Big data (bigger due to
spatio-temporal dimension) Real time (not only historical
spatio-temporal data, but also streaming data that requires
optimization at all levels)
- Slide 58
- References (I)
[1] http://www.esri.com/software/arcgis/index.html
[2] http://msdn.microsoft.com/en-us/library/ee362541.aspx
[3] http://www.citilabs.com/accession.html
[4] http://www.geomicro.com/
[5] http://fennerschool.anu.edu.au/publications/software/anudem.php
[6] http://usa.autodesk.com/adsk/servlet/pc/index?siteID=123112&id=8777380
[7] http://www.terraseer.com/products_boundaryseer.php
[8] http://www.cadcorp.com/products_geographical_information_systems/index.htm
[9] http://www.citilabs.com/cube_base.html
[10] http://www.datakustik.com/en/products/cadnaa/
[11] http://www.weblakes.com/calpuff/calpuff_overview.html
[12] http://www.caris.com
[13] http://www.saic.com/products/security/cats/
[14] http://www.inro.ca/en/products/emme/
[15] http://www.inro.ca/en/products/dynameq/
[16] http://www.wallingfordsoftware.com/uk/products/floodworks/
[17] http://www.intergraph.com/sgi/default.aspx
[18] http://www.clarklabs.org/
[19] http://www.pbinsight.com/welcome/mapinfo/
[20] http://www.wolfram.com/products/mathematica/newin7/content/IntegratedGeodesyAndGIS
[21] http://www.mathworks.com/
[22] http://www.spss.com/
[23] http://www.terraseer.com/products_stis.php
- Slide 59
- References (II)
[24] http://www.esds.ac.uk/international/support/user_guides/gisoverview.asp
[25] http://www.ccg.leeds.ac.uk/software/gam/
[26] http://geodacenter.asu.edu/software
[27] http://www.unine.ch/CSCF/grasp/
[28] http://grass.fbk.eu/
[29] http://www.landserf.org
[30] http://www.riks.nl/products/Map_Comparison_Kit
[31] http://mapserver.gis.umn.edu/
[32] http://postgis.refractions.net/
[33] http://cran.r-project.org/web/views/Spatial.html
[34] http://regionalanalysislab.org/index.php/Main/STARS
[35] P. Compieta, S. Di Martino, M. Bertolotto, F. Ferrucci, and T. Kechadi. 2007. Exploratory spatio-temporal data mining and visualization. J. Vis. Lang. Comput. 18, 3 (June 2007), 255-279.
[36] C. Ahlberg, C. Williamson, B. Shneiderman, Dynamic queries for information exploration: an implementation and evaluation, in: Proceedings ACM CHI '92, ACM Press, New York, 1992, pp. 619-626.
[37] M. Harrower, A.M. MacEachren, A.L. Griffin, Developing a geographic visualization tool to support earth science learning, Cartography and Geographic Information Science 27 (4) (2000) 279-293.
[38] W.L. Hibbard, B.E. Paul, D.A. Santek, C.R. Dyer, A.L. Battaiola, M.-F. Voidrot-Martinez, Interactive visualization of earth and space science computations, Computer 27 (7) (1994) 65-72.
[39] A. Buja, J.A. McDonald, J. Michalak, W. Stuetzle, Interactive data visualization using focusing and linking, in: Proceedings IEEE Visualization '91, IEEE Computer Society Press, Washington, 1991, pp. 156-163.
[40] D. Stojanovic, S. Djordjevic-Kajan, A. Mitrovic, Z. Stojanovic, Cartographic visualization and animation of the dynamic geographic processes and phenomena, in: Proceedings of 19th International Cartographic Conference, Ottawa, Canada, Vol. 1, 1999, pp. 739-746.
- Slide 60
- References (III)
[41] Kumar, M.; Bhatt, G.; Beeson, P.; Duffy, C. Automated Detection and Spatio-Temporal Classification of Channel Reaches in Semi-arid Southwestern US Using ASTER. American Geophysical Union, 2006 Joint Assembly.
[42] Tim E. Carpenter, Methods to investigate spatial and temporal clustering in veterinary epidemiology, Preventive Veterinary Medicine, Volume 48, Issue 4, 29 March 2001, Pages 303-320.
[43] Fosgate, G.T., Carpenter, T.E., Case, J.T., Chomel, B.B., 2000. Time-spatial clustering of human cases of brucellosis: California, 1973-1992. In: Proceedings of the Ninth International Society on Veterinary Epidemiology and Economics, Breckenridge, CO.
[44] McKenzie, J.S., Pfeiffer, D.U., Morris, R.S., 2000. Spatial and temporal patterns of vector-borne tuberculosis infection in beef breeding cattle in New Zealand. In: Proceedings of the Ninth International Society on Veterinary Epidemiology and Economics, Breckenridge, CO.
[45] Chudova D, Gaffney S, Mjolsness E, Smyth P (2003) Translation-invariant mixture models for curve clustering. In: KDD '03: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, New York, NY, USA, pp 79-88.
[46] Ankerst M, Breunig MM, Kriegel HP, Sander J (1999) OPTICS: ordering points to identify the clustering structure. SIGMOD Rec 28(2):49-60.
[47] Kalnis P, Mamoulis N, Bakiras S (2005) On discovering moving clusters in spatio-temporal data. Advances in Spatial and Temporal Databases pp 364-381.
[48] Birant, D.; Kut, A. Spatio-temporal outlier detection in large databases. 28th International Conference on Information Technology Interfaces, 2006.
[49] Jeremy Mennis, Jun Wei Liu. Mining Association Rules in Spatio-Temporal Data: An Analysis of Urban Socioeconomic and Land Cover Change. http://onlinelibrary.wiley.com/doi/10.1111/j.1467-9671.2005.00202.x/abstract
[50] Y. Huang, S. Shekhar, and H. Xiong, Discovering colocation patterns from spatial datasets: A general approach, IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 12, pp. 1472-1485, 2004.
[51] Feng Qian; Liang Yin; Qinming He; Jiangfeng He. Mining spatio-temporal co-location patterns with weighted sliding window. IEEE International Conference on Intelligent Computing and Intelligent Systems, 2009. ICIS 2009.