Towards the new open source GIS platform AEGIS

Roberto GIACHETTA a, István LASZLÓ b, and Csaba Levente BÁLINT a

a Eötvös Loránd University (ELTE), Faculty of Informatics, Budapest, Hungary
b Institute of Cartography, Geodesy and Remote Sensing (FÖMI), Budapest, Hungary

Abstract. In recent years, geographical information systems have undergone spectacular development. Besides traditional applications, new areas have been opened by the spread of navigation systems and the publication of geoinformation via the Internet. These areas need efficient data handling because the spatial and descriptive data of objects change over time.

This article presents the AEGIS framework, a spatio-temporal data management system currently being developed at the Eötvös Loránd University, Faculty of Informatics. The framework introduces a data model that aims to represent raster and vector data uniformly, and a new indexing structure based on the MV3R-tree and the B-tree to monitor changes of spatial and descriptive data over time. To demonstrate the usage of this model, a simple agent-based traffic simulation has been developed, which is also presented in the article.

Keywords. Geospatial information systems, spatio-temporal data, indexing structures, agent-based traffic simulation

Introduction

Geographical information systems (GIS) have undergone spectacular development in recent years. Besides the traditional areas of GIS applications, rapid development has taken place in the world of navigation systems as well. Google Maps and Google Earth, together with their Application Programming Interfaces, are common tools in the global handling of spatial data. The world of open source software has also evolved considerably. There is a rising need for professionals whose practice covers both information technology and geography. This paradigm shift has to be taken into account by both professionals and academics.

At the Eötvös Loránd University, Faculty of Informatics (ELTE IK), the informal association called Creative University GIS Workshop (TEAM) deals with several related research topics, e.g. the Intelligent Raster image Interpretation System (IRIS), the University Digital Map Library (EDIT), the Virtual Globes Museum (VGM) and segment-based analysis of remote sensing images.

In education and research, an important collaboration takes place with the Institute of Geodesy, Cartography and Remote Sensing (FÖMI). This governmental institution is responsible for the research, development and application of remote sensing in Hungary, mainly in the areas of agriculture and environmental protection.

Continuous improvement of image analysis methods has always been an important part of the operational projects in which FÖMI works together with ELTE IK. The other side of this connection is education: the course Remote Sensing Image Analysis is taught by scientists of FÖMI. The main stream of joint research is segment-based image analysis. As a result of many years’ research, a fully segment-based classification method has been developed, including clustering and final classification working on segments [1]. An alternative approach using segment-based classification has been introduced in three operational applications of FÖMI [2].

Based upon the experience gained in research and education, especially lab seminars, the plan of a standalone multi-platform geographic framework called AEGIS has been outlined (see footnote 1). This framework will serve as the future platform of GIS education and research at ELTE IK. It is currently in development: the data model is nearly complete, but most features are only outlined and will be implemented in the near future. The system is intended to become open source under the GPL license once the first functional version is released. The following sections concentrate on the current status of development and on functions that have already been implemented.

The rest of the paper is arranged as follows. Section 1 outlines the concept of our system. Section 2 explains the details of data modeling in AEGIS. Section 3 describes the first application using this architecture, an agent-based traffic simulation. Section 4 concludes the paper.

1. The concept of the AEGIS system

The purpose of AEGIS is to develop a multi-platform, open source geographic information system with a client-server architecture for spatio-temporal data management. It is designed for broad functionality and efficiency, and to allow students to skip the learning curve and concentrate on their main tasks in lab projects and theses without the need to build auxiliary functionality from scratch. The system is not intended to be a competitor to current GIS editor solutions, but rather a prototype and sandbox for experimental research.

The implementation is carried out using the Microsoft .NET Framework. This is due to the wide possibilities and simple usage of this development platform, the strong .NET education carried out at ELTE, and the previous success of .NET based GIS software products like DotSpatial [3] and SharpMap [4].

The main features of AEGIS can be summarized as follows.

• Multiple client environments on several platforms. The main client runs on desktop computers, while limited access is available to users through browser, mobile and tablet interfaces.

• Project based storage of both raster and vector data. The data model is based upon ISO and OGC standards and incorporates revision control to follow changes during editing. Data are stored in a central database using a pyramid architecture with vector data simplification and raster resolution reduction.

• Service based, encrypted online communication between subsystems. Changes made to the database are automatically visible to all clients.

1 See the project page http://mapw.elte.hu/aegis.

• Import and export of data in any standard GIS format. Supported file formats include ESRI Shapefile, ERDAS Machine Independent Format and raw satellite image data (“generic binary”). Online import enables access to OGC web services and map providers (OpenStreetMap, Google Maps etc.). Data can also be shared online by providing OGC web service channels.

• Extensible processing library. Operations for raster and vector data can be added on-the-fly to provide new possibilities for data analysis, process modeling and simulation, and to produce spatial and temporal statistics. These plugins can be implemented in .NET using the framework’s API. Resource demanding operations can be carried out in a computational cloud. A simple to use scripting language is provided to enable batch processing of data.

• Comprehensive user rights control with task scheduling and activity management. All actions performed on server data are logged and can be reverted using revision control. Logging also enables the automated revision and auditing of performed tasks.

Most of these features are currently in the planning phase, so many aspects may change during development, but the main goals have been set. The system architecture, described in the following subsection, is also in the planning phase.

1.1. System architecture

The AEGIS system is made up of four main components, as seen in Figure 1. These components are the following.

• Thick Client: A fully functional desktop GIS browser and editor application with support for local file system and web access of spatial data. It features a three-dimensional graphics view (implemented in XNA) with editing, analysis and simulation possibilities. Operations can be performed using the local machine or the processing services.

• Thin Client: A simplified application for browser, mobile and tablet platforms built in Silverlight with reduced operating possibilities. This client has a separate interface for each supported platform, all with the same functionality. Operations are performed using the processing services. This client can access only data stored on the server.

• Processing Services: Realize the computational cloud used for distributed operation execution on both the server and the client side. The server side distributes operations and data among the running nodes, which execute them and return the results to the server. Nodes can be installed separately from the clients.

• Server Services: The server provides connection between client machines using encrypted channels through WCF (Windows Communication Foundation) and also provides OGC web services.

1.2. Data management

The system features centralized storage of spatio-temporal data in both raster and vector format, with interaction possibilities for multiple databases. The primary database backend is the MongoDB document-oriented database management system, which enables the schema-free storage of hierarchical data [5] and provides faster editing than SQL based databases [6]. Support is planned for PostGIS and other databases as well.

Figure 1. AEGIS system components

Spatial references are stored in one preferred coordinate system. Raster and vector data may be imported into the system from several supported formats, and data are reprojected upon import if necessary. However, due to the possible data loss caused by raster reprojection, the original images are also stored in the database and are used as the source for any later reprojection.

Spatial data is built up in a multi-resolution pyramid structure. Raster images are gradually reduced to several lower resolutions, whilst vector objects are generalized using proven methods [7,8]. This enables more dynamic access to maps (also in low-bandwidth conditions) without the need for on-the-fly image resampling. Using this feature, different access levels can be assigned to different levels of the pyramid. Read operations may be performed at any level of the pyramid, while write operations need permission at the lowest level (full resolution data).
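The raster side of such a pyramid can be sketched in a few lines. The following Python fragment is only an illustration under stated assumptions: the names downsample and build_pyramid are hypothetical, the halving by 2x2 block averaging is one common choice and not necessarily the reduction used in AEGIS, and vector generalization is not covered.

```python
from typing import List

Raster = List[List[float]]  # hypothetical in-memory raster: rows of pixel values


def downsample(raster: Raster) -> Raster:
    """Halve the resolution by averaging each 2x2 block.

    Blocks at the right and bottom edge average only the pixels that exist.
    """
    rows, cols = len(raster), len(raster[0])
    reduced = []
    for r in range(0, rows, 2):
        reduced_row = []
        for c in range(0, cols, 2):
            block = [raster[rr][cc]
                     for rr in range(r, min(r + 2, rows))
                     for cc in range(c, min(c + 2, cols))]
            reduced_row.append(sum(block) / len(block))
        reduced.append(reduced_row)
    return reduced


def build_pyramid(raster: Raster, min_size: int = 1) -> List[Raster]:
    """Return all pyramid levels; level 0 is the full resolution raster."""
    levels = [raster]
    while len(levels[-1]) > min_size and len(levels[-1][0]) > min_size:
        levels.append(downsample(levels[-1]))
    return levels


pyramid = build_pyramid([[1, 2, 3, 4],
                         [5, 6, 7, 8],
                         [9, 10, 11, 12],
                         [13, 14, 15, 16]])
print([len(level) for level in pyramid])
```

Because every level is computed once and stored, a client can be served a coarse level directly instead of resampling the full resolution image on request.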

The data model features both two- and three-dimensional spatial objects (object refers to both vector features and remotely sensed images) with time intervals. The primary purpose of the time variable is to store data validity and to track spatial changes of objects. Versioning is also applied to data to enable rollback of any modification. Objects are grouped into layers, which define dimensional and reference parameters. Since this complex data structuring is not supported at database level, the data access layer of the system is responsible for properly transferring stored data to revision controlled entities and for indexing data for fast retrieval. The data modeling environment is presented in Section 2.

The database is separated into two main parts, as seen in Figure 2.

Figure 2. Data management

• Published data contains finalized objects that can be accessed via external channels, including web services. These objects do not contain all editing information and changes cannot be revoked, but it is possible to switch between any published versions of a spatial object.

• Project data contains spatial objects that have not yet been finalized and are still under editing. These data are gathered under spatial projects that maintain all editing and version changes to enable rollback of modifications. Project data also holds editing tasks (known as assignments), which can be grouped into workflows, and users or user groups can be assigned to these tasks.

2. Unified modeling and indexing of spatio-temporal data

Modeling and indexing of spatio-temporal objects has been a frequent topic among researchers since the 1990s, and many data models have been introduced so far [9,10]. However, neither common solutions nor standards have emerged.

In our approach the base of the data model is the Open Geospatial Consortium (OGC) Simple Features Specification (SFS), which defines an ISO based modeling environment for spatial vector features [11]. In this solution, the central item is the Geometry class, which has several specialized versions (including collections) in the object inheritance taxonomy. Geometry defines the interface for spatial properties and operations among any spatial objects. The model focuses on two- and three-dimensional vector data, without any temporal references. In our implementation of this standard, several interfaces have been introduced beside the class structure to enable flexible and effective implementation of vector features (see Figure 3). Furthermore, collections contain indexing structures for fast retrieval of objects, as described in Section 2.2. All Geometry objects may contain multiple resolutions of spatial data and any number of descriptive data.

Vector and raster data need to be handled in the same manner so that operations work uniformly on them. In current geospatial systems the representation and operations of raster and vector data are usually independent of each other. In our system, we include raster data as an extension to the SFS. Therefore all spatial operations previously defined on vector objects (e.g. intersection, translation, projection etc.) can also be performed on raster data (for example, intersecting an orthophoto with a land parcel polygon). Also, restrictions can be applied to any operation to prevent raster functions (e.g. intensity transformations, image filtering) from working on vector data.

The data model is enhanced in two steps. First, temporal properties are added to all geometries, and extensions are made to model raster datasets and special vector objects. This extended data model can be seen in Figure 3. In the second step, indexing structures are implemented to access data faster, and also to provide temporally variable information not stored at data level.

2.1. Extending simple features with complex objects

Our current model concentrates only on two-dimensional spatial objects, but this will shortly be extended to 3D.

All geometries have been extended with the possibility of storing the time interval in which they are considered to be valid. Generally, items in a collection do not need to exist in the same time interval, so the interval of a collection is the closure of the intervals of all items, but restrictions can be made to force temporal equality among all items of the collection. GeometryCollection has also been extended with location and time based querying abilities made possible by the inner indexing structure. Besides storing reference system properties, geometries store descriptive data (metadata) as well. Metadata stores properties not related to space and time, including user access, copyright information, sensor data (in case of remotely sensed images) etc.
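The relation between item intervals and the collection interval can be illustrated as follows. This is a minimal sketch, assuming that "closure" means the smallest single interval that contains every item interval; TimeInterval, collection_interval and temporally_equal are hypothetical names, not part of the AEGIS API.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class TimeInterval:
    start: float  # start of validity (assumed inclusive)
    end: float    # end of validity (assumed exclusive)


def collection_interval(items: Iterable[TimeInterval]) -> TimeInterval:
    """Smallest interval covering the validity intervals of all items."""
    intervals: List[TimeInterval] = list(items)
    if not intervals:
        raise ValueError("an empty collection has no validity interval")
    return TimeInterval(min(i.start for i in intervals),
                        max(i.end for i in intervals))


def temporally_equal(items: Iterable[TimeInterval]) -> bool:
    """Check the optional restriction that all items share one interval."""
    intervals = list(items)
    return all(i == intervals[0] for i in intervals)


items = [TimeInterval(2, 5), TimeInterval(0, 3), TimeInterval(8, 9)]
print(collection_interval(items))
print(temporally_equal(items))
```

Note that the covering interval may include gaps in which no item is valid (here between 5 and 8), which is why time based queries still have to consult the individual items.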

Several new vector types are introduced as geometry descendants. The Rectangle is introduced mainly to support simple operations, like bounding box queries, but it also provides an intermediate geometry for the representation of raster images. The GeometryNetwork and its descendant collections are introduced to store topology related information on data contained in one collection. For example, the LineNetwork class stores points connected with Line objects in a graph structure based upon [12]. It is mainly aimed at storing the structure of road networks. The IGeometryNetwork interface provides operations to query neighboring geometries. Also, SimpleGeometryCollection and its descendant classes were added. These classes do not contain any indexing options; therefore they only serve as simple collections of elements.

Figure 3. Data model based on the Simple Features Specification (new classes are displayed in red)

Figure 4. The MV3R-tree extended with metadata variability

Concerning raster data, the Image, ImageCollection, ImageBand and ImageBandCollection classes are introduced. ImageBand is a descendant of Rectangle and contains one band of a raster image, while Image is a collection of all bands of an image. These classes are significantly extended with operations and properties related to raster imagery and image metadata. Images can be stored in several radiometric resolutions (from 8 bit to 64 bit for every band), and a mask is applied to all images to indicate actual image pixels. Collection classes serve as accumulators of data with common attributes, e.g. several images from a path of a satellite. Spatial operations can be executed between raster and vector data, with the result being raster data. For example, intersecting a Polygon with an ImageBand generates an ImageBand in which pixels outside the polygon are left blank.
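The polygon and image band intersection mentioned above can be sketched as a masking operation. The fragment below is an illustration only: it assumes a band stored as a list of rows in pixel coordinates, tests each pixel center with the standard ray casting (even-odd) rule, and uses None for blank pixels. The names point_in_polygon and clip_band are hypothetical and do not reflect the AEGIS class interfaces.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]


def point_in_polygon(x: float, y: float, polygon: Sequence[Point]) -> bool:
    """Even-odd ray casting test for a simple polygon given as vertex list."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def clip_band(band: List[List[float]],
              polygon: Sequence[Point],
              blank: Optional[float] = None) -> List[List[Optional[float]]]:
    """Keep pixels whose center lies inside the polygon, blank the rest."""
    return [[value if point_in_polygon(col + 0.5, row + 0.5, polygon) else blank
             for col, value in enumerate(line)]
            for row, line in enumerate(band)]


band = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(clip_band(band, square))
```

The result has the same extent as the input band, which matches the description that the outcome of a raster and vector operation is again raster data.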

2.2. Indexing data with temporal variability

To ensure fast queries on data, all collections (except the simple ones) store indexing structures based on Multi Version 3D R-trees (MV3R-trees). This structure was chosen because of its good performance on interval queries [13], but future research plans include testing several available indexing structures or developing new ones for our purposes.

The MV3R-tree is a combination of the MVR-tree and the 3D R-tree. It stores multiple R-trees with different time stamps, each having a spatial bounding box, and uses multiple heuristics to enhance the performance of tree updates. To extend the use of the index, an auxiliary tree containing metadata variables has been added at leaf level. These variables contain descriptive information that changes in time and is not contained at data level; therefore it is only reachable through collections. With this extension, not only spatial changes of objects can be monitored, but also changes of non-spatial information. This indexing structure can be seen in Figure 4.

The metadata variables are organized in a B-tree data structure based on time intervals. Leaves can contain any number of metadata variables beside the geometry object, and all leaf pointers refer to the same object. Variables consist of (key, value, modification type) triplets. These triplets define metadata variability for any descriptive property of a geometry. Temporal or geometric properties cannot be altered this way; this is handled at the MV3R-tree level. The key refers to the property name, while the modification type defines how the value is applied, e.g. override, add, multiply, etc. When a geometry is queried, the variable properties are gathered from the structure and are taken into account during the evaluation of the object. Section 3 demonstrates an example of such metadata variable usage.
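How such triplets could be applied during a query is sketched below. This is not the AEGIS implementation: a plain list scanned linearly stands in for the B-tree, the half-open interval convention is an assumption, and MetadataVariable and resolve are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MetadataVariable:
    start: float       # start of the interval in which the variable applies
    end: float         # end of the interval (assumed exclusive)
    key: str           # name of the descriptive property
    value: float
    modification: str  # "override", "add" or "multiply"


def resolve(base: Dict[str, float],
            variables: List[MetadataVariable],
            t: float) -> Dict[str, float]:
    """Return the metadata as seen at time t, leaving the base data untouched."""
    result = dict(base)
    for var in variables:
        if not (var.start <= t < var.end):
            continue
        current = result.get(var.key, 0.0)
        if var.modification == "override":
            result[var.key] = var.value
        elif var.modification == "add":
            result[var.key] = current + var.value
        elif var.modification == "multiply":
            result[var.key] = current * var.value
        else:
            raise ValueError(f"unknown modification type: {var.modification}")
    return result


base = {"travel_time": 60.0}
variables = [
    MetadataVariable(7, 9, "travel_time", 1.5, "multiply"),
    MetadataVariable(22, 24, "travel_time", 30.0, "override"),
]
print(resolve(base, variables, 8))
print(resolve(base, variables, 12))
```

The stored geometry metadata is never modified; the variability is layered on top of it at query time, which is the property the traffic simulation in Section 3 relies on.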

The data access layer enables the storage of entire collections within the database, so the metadata variables can also be stored and retrieved later, when the collection is accessed.

3. An application of spatio-temporal modeling: agent-based traffic simulation

The first application based on the AEGIS core architecture is a simplified agent-based traffic simulation model of the city of Budapest. The goal of this simulation is to calculate and display traffic information, mainly congestion levels, for every hour of the day. Due to the early status of development, only the data model of AEGIS was used in this project; separate operations and a separate display environment are built on top of it.

Figure 5 illustrates the user interface of this application, showing the map of Budapest. In Figure 5a agents are shown in yellow and special buildings in different colors. Figure 5b displays the estimated congestion levels (yellow for light congestion, red for heavy congestion) at 7 o’clock in the morning.

In this simulation, independent agents have several target addresses to drive to during the day. The two main targets are the workplace and home. Multiple random targets (like shopping centers or restaurants) can also occur. Agents travel to all locations in their own car. Locations are built up using metadata of building objects in the Budapest map. The agents in this simulation are primitive; they use simple random-based algorithms to make decisions based on statistics. For example, an agent may have working hours from 8 o’clock in the morning to 6 o’clock in the evening, and can go to multiple stores or shopping centers after work.

At the start of the simulation, agents plan their routes and drive according to the plan. If an agent starts work at 8 and driving to the workplace takes 30 minutes, the agent departs at 7:30. However, road traffic is constantly monitored, and road travel times can vary depending on traffic density. If an agent rates its arrival time at the destination as unacceptable (e.g. it is 30 minutes late), it replans its routes with the updated travel times the next day. All agents are in possession of the entire map and accurate travel times. Simulations show that with a constant agent count, about 92% of these routes stabilize within 80 days.
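The departure and replanning rule from this paragraph amounts to simple clock arithmetic. The sketch below works in minutes since midnight; the function names are hypothetical, and treating a delay of 30 minutes or more as unacceptable is an assumption taken from the example in the text, not a documented threshold.

```python
def departure_minute(due_minute: int, travel_minutes: int) -> int:
    """Latest departure that still reaches the target on time."""
    return due_minute - travel_minutes


def needs_replan(due_minute: int, arrival_minute: int, tolerance: int = 30) -> bool:
    """True if the agent arrived late by at least `tolerance` minutes."""
    return arrival_minute - due_minute >= tolerance


def as_clock(minute: int) -> str:
    """Format minutes since midnight as h:mm."""
    return f"{minute // 60}:{minute % 60:02d}"


work_start = 8 * 60
print(as_clock(departure_minute(work_start, 30)))
print(needs_replan(work_start, arrival_minute=8 * 60 + 30))
```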

Agents plan their routes using the A* algorithm working on the LineNetwork representation of the map. The routing algorithm calculates the journey based on road travel time, which is available as geometry metadata and is multiplied by a metadata variable. This multiplied value is aggregated during the routing calculation, so different measures are taken into account for every hour. The algorithm follows the current time at the routing position; results are accurate to the hour. Travel times are constantly monitored during the simulation, and metadata variables are updated for every hour. This can be seen in Figure 6.
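A time-dependent A* search of this kind can be sketched as follows. The code is an illustration under assumptions, not the AEGIS routing code: the graph is a plain adjacency dictionary instead of a LineNetwork, each edge carries a base travel time in minutes plus a hypothetical per-hour multiplier table, and the default heuristic is zero, in which case the search degenerates to Dijkstra's algorithm. A caller may pass any admissible heuristic, i.e. a lower bound on the remaining travel time.

```python
import heapq
from typing import Callable, Dict, List, Tuple

# Hypothetical edge record: (neighbor, base travel minutes, {hour: multiplier})
Edge = Tuple[str, float, Dict[int, float]]
Graph = Dict[str, List[Edge]]


def edge_minutes(base: float, multipliers: Dict[int, float], clock: float) -> float:
    """Travel time of an edge when entered at `clock` (minutes since midnight)."""
    hour = int(clock // 60) % 24
    return base * multipliers.get(hour, 1.0)


def plan_route(graph: Graph, start: str, goal: str, depart: float,
               heuristic: Callable[[str], float] = lambda node: 0.0
               ) -> Tuple[float, List[str]]:
    """Return (arrival time, path), or (inf, []) if the goal is unreachable."""
    best = {start: depart}  # earliest known arrival time per node
    queue = [(depart + heuristic(start), depart, start, [start])]
    while queue:
        _, clock, node, path = heapq.heappop(queue)
        if node == goal:
            return clock, path
        if clock > best.get(node, float("inf")):
            continue  # outdated queue entry
        for neighbor, base, multipliers in graph.get(node, []):
            # The multiplier is looked up for the time at the current position.
            arrival = clock + edge_minutes(base, multipliers, clock)
            if arrival < best.get(neighbor, float("inf")):
                best[neighbor] = arrival
                heapq.heappush(queue, (arrival + heuristic(neighbor),
                                       arrival, neighbor, path + [neighbor]))
    return float("inf"), []


graph: Graph = {
    "A": [("B", 10.0, {7: 3.0}), ("D", 20.0, {})],
    "B": [("C", 10.0, {})],
    "D": [("C", 15.0, {})],
}
print(plan_route(graph, "A", "C", depart=7 * 60))  # congested hour on A-B
print(plan_route(graph, "A", "C", depart=9 * 60))  # free flow
```

In this toy network the route through D is chosen at 7:00 because the A-B edge is three times slower in that hour, while the route through B wins at 9:00. The search is exact only if departing later never leads to an earlier arrival; with hourly step multipliers this can be violated near hour boundaries, which is consistent with the statement that results are accurate to the hour.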

Using this approach, there was no need to copy Line objects to store different travel speeds for every hour. This method has also proven to be more effective than using some other storage for this variability, since every line segment has different values for every hour. However, it was not necessary to split the temporal variability for all line segments and for every hour, since several consecutive hours have the same speed values (therefore intervals can be merged).
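Merging consecutive hours that share a value is a run-length grouping. A minimal sketch, assuming one value per hour in a list and half-open (start hour, end hour, value) intervals; merge_hours is a hypothetical helper name.

```python
from typing import List, Sequence, Tuple


def merge_hours(values: Sequence[float]) -> List[Tuple[int, int, float]]:
    """Collapse per-hour values into (start_hour, end_hour, value) intervals."""
    intervals: List[Tuple[int, int, float]] = []
    for hour, value in enumerate(values):
        if intervals and intervals[-1][2] == value:
            start, _, _ = intervals[-1]
            intervals[-1] = (start, hour + 1, value)  # extend the open interval
        else:
            intervals.append((hour, hour + 1, value))
    return intervals


print(merge_hours([1.0, 1.0, 1.5, 1.5, 1.5, 1.0]))
```

Each resulting interval would correspond to one metadata variable, so a road with only a morning and an evening peak needs a handful of variables rather than 24.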

In this application, all Line objects have the same time interval, so the temporal properties of the MV3R-tree were not used. Later in development, we allowed the blocking of any road section during any hour of the day. This was accomplished in two ways. In the first, the metadata variable of travel time was increased to infinity for the given hours. This practically did not alter the indexing of the MV3R-tree, and did not cause any change in the performance of the simulation. In the second, the Line was split into two different objects with limited time intervals. This caused an update of the MV3R-tree, but without the need to change metadata variables. During simulation no measurable change in performance was seen. However, further testing is needed to measure the performance change when a large number of road sections change during simulation.

(a) Running agents

(b) Traffic congestion measure

Figure 5. Visualization of the agent based traffic simulation

To test the efficiency of the model, a separate representation has also been implemented, where no temporal variability is used, but multiple instances of Line objects were created and temporal changes were resolved by the MV3R-tree. This solution resulted in overwhelming memory usage in exchange for slightly improved routing times, so it was abandoned.

Figure 6. Routing with metadata variables in the LineNetwork indexing structure

4. Conclusion and future work

In the previous sections the concept of the AEGIS geospatial framework and the goals of the authors’ research have been introduced. Although development is still at an early stage and most aims are still in planning, some results have already been achieved.

The system is based on the spatio-temporal data model described in Section 2, which uses complex data structures and MV3R-tree based indexing with temporal variability to enable more flexible management and maintenance of data. The authors’ first application, the agent-based traffic simulation, has shown that this model is justified. However, more research is needed to measure its performance and its competitiveness with other solutions.

Further research includes testing other indexing structures, both in combination with temporal variability and without it, and measuring their performance using the agent-based simulation. Further applications of the AEGIS framework are planned. Also, in the long term, we plan to examine possibilities for enhancing MongoDB spatial support with a low-level implementation of revision control, the OpenGIS SFS and the indexing of spatio-temporal data.

Acknowledgements

Research projects presented in this article are supported by the European Union and co-financed by the European Social Fund (grant agreement no. TÁMOP 4.2.1./B-09/1/KMR-2010-0003).

References

[1] László, I., Dezso, B., Fekete, I., Pröhle, T. (2009). A Fully Segment-based Method for the Classification of Satellite Images. In: Kátai I. (ed.): Annales Univ. Sci. Budapest, Sectio Computatorica, Vol. 30: 157-174.

[2] László, I., Ócsai, K., Gera, D., Giachetta, R., Fekete, I. (2011). Object-based Image Analysis of Pasture with Trees and Red Mud Spill. Lecture at: 31st EARSeL Symposium, Prague, Czech Republic. http://www.conferences.earsel.org/abstract/show/2482.

[3] DotSpatial - Open Source Geospatial Framework in .NET. http://dotspatial.codeplex.com.

[4] SharpMap - Geospatial Application Framework for the CLR. http://sharpmap.codeplex.com.

[5] Padhy, R. P., Patra, M. R., Satapathy, S. C. (2011). RDBMS to NoSQL: Reviewing Some Next-Generation Non-Relational Database’s. In: Padhy, R. P. (ed.): International Journal of Advanced Engineering Sciences and Technology, Vol. 11 (1): 15-30.

[6] Giachetta, R., Máriás, Zs. (2010). Performance Evaluation of Storing Inhomogeneous Descriptive Data of Digital Maps. Lecture at: Conference of PhD students in Computer Science, Szeged, Hungary. http://www.inf.u-szeged.hu/ cscs/pdf/CSCS2010-proceedings.pdf.

[7] Weibel, R. (1997). Generalization of spatial data. In: Goos, G., Hartmanis, J., van Leeuwen, J. (eds.): Lecture Notes in Computer Science, Vol. 1340: 99-152.

[8] Paul Hardy, D. L. (2005). GIS-Based Generalization and Multiple Representation of Spatial Data. In: Kremers, H. (ed.): Proceedings of the International CODATA Symposium on Generalization of Information, 175-190.

[9] Mokbel, M. F., Ghanem, T. M., Aref, W. G. (2003). Spatio-Temporal Access Methods. In: David, B., Lomet, D. B. (eds.): IEEE Data Engineering Bulletin, Vol. 26: 40-49.

[10] Abraham, T., Roddick, J. F. (1999). Survey of Spatio-Temporal Databases. In: Bergougnoux, P., Shekhar, S., Frank, A. U. (eds.): GeoInformatica, Vol. 3: 61-99.

[11] Herring, J. R. (ed.): OpenGIS Implementation Standard for Geographic Information: Simple Feature Access - Common Architecture. http://www.opengeospatial.org/standards/sfa.

[12] George, B., Shekhar, S. (2008). Time-Aggregated Graphs for Modeling Spatio-temporal Networks. In: Spaccapietra, S., Delcambre, L. (eds.): Journal on Data Semantics, Vol. 11: 191-212.

[13] Tao, Y., Papadias, D. (2001). The MV3R-Tree: A Spatio-Temporal Access Method for Timestamp and Interval Queries. In: Apers, P. M. G., Atzeni, P., Ceri, S., Paraboschi, S., Ramamohanarao, K., Snodgrass, R. T. (eds.): Proceedings of 27th International Conference on Very Large Data Bases, 431-440.