Geographic Information Systems:Models, Toolboxes and Systems for Geographic Information

Models, Toolboxes and Systems for Geographic Information

GISs differ substantially with respect to their specific functionality, which makes a comparison of the different systems quite difficult. We restrict our evaluation to the core functionality of a GIS related to manipulating vector data. Moreover, we stress the following three criteria in our comparison:

Data Model: The spatial data model offered by the system is very important to a user since it provides the geometric data types and the operations.

Spatial indexing: Spatial index structures are important for efficiently supporting the most important spatial queries. It is therefore important what kind of index structures are actually available, particularly for applications that deal with very large databases.

Spatial Join: Since spatial joins are the most important operation for combining different maps, system performance depends on the efficiency of the underlying algorithm.

These criteria are among the most important ones for spatial query processing in a GIS [24].

Though we already have restricted our considerations to specific aspects, we limit our comparison to a few important systems and libraries. We actually start our discussion with introducing two common standardized data models that are implemented by many commercial systems. Thereafter, we will discuss a few commercial systems that are used in the context of GIS in industry. Next, we present a few prototype systems that mainly serve as research platforms.

Standardized Data Models

The most important standard for GIS [51] is published by the Open GIS Consortium. It provides an object-oriented vector model and basic geometric data types. The actual implementations of commercial vendors like Oracle are closely related to the Open GIS standard. All of the geometric data types are subclasses of the class Geometry that provides an at- tribute that specifies the spatial reference system. One of the methods of Geometry delivers the so-called envelope of an object that is called MBR in our terminology. The data model distinguishes between atomic geometric types like points, curves and surfaces and the corresponding collection types. The most complex atomic type is a polygonal surface that consists of an exterior polygonal ring and a set of internal polygonal rings where each of them represents a hole in the surface. Certain assertions have to be obeyed for such a polygonal surface to be in a consistent state.

The topological relationships of two spatial objects are expressed by using the nine- intersection model [17]. This model distinguishes between the exterior, interior and bound- ary of an object. Spatial predicates like overlaps are then defined by specifying which of the assertions has to be satisfied. In addition to predicates, the Open GIS specification defines different constructive methods like the one for computing the convex hull of a spatial object. Another important function allows to compute the buffer object which contains those points that are within distance ε of a given object. Moreover, there are methods for computing the intersection (and other set operations) on two objects.

The OpenGIS standard has also largely influenced other standards for geographic data like the standard for storing, retrieving and processing spatial data using SQL [33] and the standard for the Geography Markup Language (GML) that is based on XML. The recently published version of the GML standard [52] additionally provides functionality to support three-dimensional objects and spatio-temporal applications.

Commercial Systems

In this subsection, we give a brief overview on the geographic query processing features of database systems, geographic information systems and data structures libraries.

Oracle

Among the three big database vendors, Oracle offers the richest support for the management of spatial data. The data model of Oracle [53] is similar to the simple feature model of the OpenGIS Consortium. Oracle additionally offers curves where the arcs might be circular. The spatial functionality is implemented on top of Oracle’s DBMS and therefore, it is not fully integrated. This is most notably when SQL is used for specifying spatial queries where the declarative flavor of SQL is in contrast to the imperative procedural calls of the spatial functionality.

The processing of spatial queries is performed by using the filter-refinement approach. Moreover, intermediate filters might be employed where a kernel approximation of the object is used. This kind of processing is applied to spatial selection queries and to spatial joins.

There are R-trees and quadtrees in Oracle for indexing spatial data. In contrast to R- trees, (linear) quadtrees are based on a grid decomposition of the data space into tiles, each of them keep the list of intersecting objects. The linear quadtree is implemented within Oracle’s B+-tree. In case of fixed indexing, the tiles are all of the same size. Oracle provides a function to enable users to determine a good setting of the tile size. In the case of hybrid indexing, tiles may vary in size. This is accomplished by locally increasing the grid resolution if the number of tiles is still below a given threshold. A comparison of Oracle’s spatial index-structures [35] shows that the query performance of the R-tree is superior compared to the quadtree.

SpatialWare

SpatialWare [44] provides a set of functions that allow to manage spatial data within Mi- crosoft SQL Server. The implementation is based on the extensibility features of SQL Server. Again, the spatial data types are similar to the simple features of OpenGIS.

The query processing functionality consists of spatial selection queries as well as spatial joins. Most notable is the fact that SpatialWare provides R-trees for spatial indexing.

LEDA and CGAL

LEDA [45] and CGAL [11] are C++-libraries (see Chapter 41 for more on LEDA) that offer a rich collection of data structures and algorithms. Among the more advanced structures are spatial data types suitable for being used for the implementation of GIS. Most interesting to GIS is LEDA’s and CGAL’s ability to compute the geometry exactly by using a so-called rational kernel, i.e., spatial data types whose coordinates are rational numbers. LEDA provides the most important two-dimensional data types like points, iso-oriented rectangles, polygons and planar subdivisions. Moreover, LEDA provides efficient implementations of important geometric algorithms like convex hull, triangulations and line intersection. AlgoComs, a companion product of LEDA, also provides a richer functionality for polygons that is closely related to a map overlay. In contrast to LEDA, the focus of CGAL is limited to computational geometry algorithms where CGAL’s functionality is generally richer in comparison to LEDA. CGAL contains kd-trees for indexing multidimensional point data and supports incremental nearest neighbor queries.

Both, LEDA and CGAL, do not support external algorithms and index-structures and therefore, they only partly cover the functionality required for a GIS. There has been an extension of LEDA, called LEDA-SM [12], that supports the most important external data structures.

JTS Topology Suite

The JTS Topology Suite (JTS) [76] is a Java class library providing fundamental geometric functions according to the geometry model defined by the OpenGIS Consortium [51]. Hence, it provides the basic spatial data types like polygonal surfaces and spatial predicates and operations like buffer and convex hull. The library also supports a user-definable precision model and contains code for robust geometric computation. There are also a few classes available for indexing MBRs (envelopes). The one structure is the MX-CIF quadtree [67] that is a specialized quadtree for organizing a dynamic set of rectangles. The other structure is a static R-tree that is created by using a bulk-loading technique [39]. Currently, there is no support for managing data on disk efficiently. JTS is published under an open source licensing agreement (the GNU LGPL).

Research Prototypes

SAND

The SAND System [18] gives the full query processing power of a spatial data base system and additionally, contains a browser for displaying the results of a spatial query. It provides the common folklore of spatial data types like point, rectangle, polygon and polygonal surface (termed polyregion). SAND offers three kinds of query predicates that refer to topological, metric and distance predicates, respectively. A user might ask for the sequence of objects within a given region ranked according to their distance to a given query point. SAND is very powerful with respect to its indexing. SAND offers the PMR-quadtree [67] as well as the R-tree [26]. Both of these spatial index-structures support ranking queries by controlling the traversal of the index through a priority queue. Note that SAND delivers the answers of a query as an iterator where the next answer is produced on demand. SAND offers a rich source of spatial joins that are based on the principle of synchronized traversal of spatial indexes. A special feature of SAND is its query optimizer as well as its script

language that serves as a glue between the different system components.

XXL

XXL (eXtensible and fleXible Library) [10] is not a system, but a pure Java library that does not support spatial data types, but points and rectangles. XXL provides powerful support for various kinds of (spatial) indexes. There are different kinds of implementations of R-trees as well as B-trees that might be combined with space-filling curves (e.g. z-order and Hilbert order). The concept of containers is introduced in XXL to provide an abstract view on external memory. Implementations of containers exist that are based on main memory, files and raw disks. XXL offers a rich source of different kinds of spatial joins like [14, 56] that are based on using space-filling curves and the sort-merge paradigm. XXL is equipped with an object-relational algebra of query operators and a query optimizer that is able to rewrite Java programs. Query operators are iterators that deliver the answers of a query on demand, one by one. XXL is available under an open source licensing agreement (GNU LGPL).

Dedale

The Dedale System [63] is unique in the sense that its underlying data model is based on constraints [64]. Rather than using a boundary representation, a set of constraints describes spatial objects of arbitrary dimensions. This also allows the usage of constrained-based languages for expressing spatial queries.

The latest version of Dedale is implemented on BASIS [21] that consists of the R*-tree as its spatial index structure and different spatial join algorithms that are able to exploit spatial indexes [22].

Related posts:

Leave a comment

Your email address will not be published. Required fields are marked *