Spatial databases

A spatial RDBMS is an RDBMS that can process spatial data. Popular RDBMSs, such as Oracle, offer their own spatial features or add-ons for this purpose.

Since each DBMS has a different architecture, it is difficult to show how one operates in a simple diagram. But we can at least explain the concept of a spatial DBMS through the following diagram.

 

A spatial RDBMS lets you use standard SQL data types, such as int and varchar, alongside spatial data types, such as Point, LineString and Polygon, and run geometric calculations on them, such as the distance or the relationship between shapes.

An RDBMS builds its indexes with the B-Tree family or a hash function (see CUBRID Query Tuning Techniques for more explanation), which essentially compare column values by order or by equality. In other words, only one-dimensional data can be indexed. Since spatial data types are two- or three-dimensional:

  1. The R-Tree family or Quad Trees can be used in a spatial RDBMS to process such data;
  2. Or the two- or three-dimensional data must be transformed into one-dimensional data, so that a B-Tree can be used.
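
One common way to do the transformation in the second option is a Z-order (Morton) curve, which interleaves the bits of the two coordinates into a single integer. The sketch below is only an illustration of that idea; the function names, the 16-bit grid resolution and the longitude/latitude ranges are assumptions made for the example, not something a particular DBMS is known to use.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)       # bits of x go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)   # bits of y go to odd positions
    return key


def morton_key(lon: float, lat: float, bits: int = 16) -> int:
    """Quantize a lon/lat pair onto a grid and return its one-dimensional key.

    Assumes lon is in [-180, 180] and lat is in [-90, 90].
    """
    cells = (1 << bits) - 1
    x = int((lon + 180.0) / 360.0 * cells)
    y = int((lat + 90.0) / 180.0 * cells)
    return interleave_bits(x, y, bits)


# The resulting integers can be stored in an ordinary ordered (B-Tree) index.
keys = sorted(morton_key(lon, lat) for lon, lat in [(127.0, 37.5), (126.9, 37.4), (-73.9, 40.7)])
```

Points that are close together often, but not always, end up with nearby keys, so a range scan on the key is normally followed by an exact check on the real coordinates.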

Extending an existing RDBMS to process spatial data brings several benefits. First, even geospatial tasks frequently involve basic data types, such as numbers or characters. Second, there is no burden of additional training, since SQL is already a proven way to store and query data.

An RDBMS is not the only kind of database management system, and likewise a spatial RDBMS is not the only kind of spatial database management system. Many other systems, such as the document-oriented database MongoDB and search engines such as Lucene or Solr, provide spatial data processing features. However, these solutions offer fewer features and do not provide high-precision calculations. To understand what high-precision calculation means, we will take a closer look at the features a spatial DBMS provides.

OpenGIS

OpenGIS is a standard for processing spatial data. It is maintained by the OGC (Open Geospatial Consortium), a consortium made up of 416 governmental organizations, research centers and companies from all over the world (as of 2011). OpenGIS (Open Geodata Interoperability Specification) is a registered trademark of the OGC, and is a standard for geospatial data processing (this document does not differentiate between spatial and geospatial).

Of the many standards in OpenGIS, the one needed to understand a spatial DBMS is Simple Feature.

Simple Feature

As mentioned above, Simple Feature is the standard needed to process geospatial data. It covers the Geometry Object Model (spatial data types), spatial operations and coordinate systems (two- and three-dimensional). The Geometry Object Model consists of figures such as Point, LineString and Polygon.

The Geometry Object Model (spatial data type) can exist not only in two dimensions but in three dimensions as well. The space Simple Feature deals with is Euclidean. Therefore, spatial operations such as intersects and touches all belong to Euclidean geometry. In the Simple Feature specification, the Geometry Object Model is treated like a class in an object-oriented language, and it is described in actual UML. The following is a Geometry Object Model class diagram, which summarizes the contents of the Simple Feature specification. Point, LineString and Polygon all inherit from Geometry. The ‘query’ and ‘analysis’ blocks are spatial operations.

+ dimension() : Integer
+ coordinateDimension() : Integer
+ spatialDimension() : Integer
+ geometryType() : String
+ SRID() : Integer
+ envelope() : Geometry
#query
+ equals(another : Geometry) : Boolean
+ disjoint(another : Geometry) : Boolean
+ intersects(another : Geometry) : Boolean
+ touches(another : Geometry) : Boolean
+ crosses(another : Geometry) : Boolean
+ within(another : Geometry) : Boolean
+ contains(another : Geometry) : Boolean
...
#analysis
+ distance(another : Geometry) : Distance
+ buffer(distance : Distance) : Geometry
+ convexHull() : Geometry
...
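
To make the inheritance relationship more concrete, here is a minimal sketch of how a few members of the listing above might look as classes. It is not the specification's definition: the Python names (geometry_type instead of geometryType), the coordinate representation, and the choice to return the envelope as a pair of corner coordinates rather than as a Geometry are all simplifications assumed for the example.

```python
from abc import ABC, abstractmethod
from typing import List, Tuple

Coord = Tuple[float, float]


class Geometry(ABC):
    """Hypothetical base class; only a few members of the model are sketched."""

    def __init__(self, coords: List[Coord], srid: int = 0):
        self.coords = coords
        self.srid = srid  # spatial reference ID, 0 meaning "unspecified" here

    @abstractmethod
    def dimension(self) -> int:
        """0 for points, 1 for lines, 2 for surfaces."""

    def geometry_type(self) -> str:
        return type(self).__name__

    def envelope(self) -> Tuple[Coord, Coord]:
        """Minimum bounding rectangle as (lower-left, upper-right) corners."""
        xs = [x for x, _ in self.coords]
        ys = [y for _, y in self.coords]
        return (min(xs), min(ys)), (max(xs), max(ys))


class Point(Geometry):
    def __init__(self, x: float, y: float, srid: int = 0):
        super().__init__([(x, y)], srid)

    def dimension(self) -> int:
        return 0


class LineString(Geometry):
    def dimension(self) -> int:
        return 1


class Polygon(Geometry):
    """Exterior ring only; holes are left out of this sketch."""

    def dimension(self) -> int:
        return 2
```

For example, Polygon([(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)]).envelope() yields the corners (0, 0) and (4, 3), which is the kind of bounding rectangle an R-Tree would index.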

The standards document uses a formula to describe the operation.

The Within operation shown above can be expressed with the following formula, where I() is the interior function and E() is the exterior function.

(a ∩ b = a) ∧ (I(a) ∩ E(b) = ∅)
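
Reading the formula as (a ∩ b = a) ∧ (I(a) ∩ E(b) = ∅), it can be transcribed almost literally if geometries are modelled as finite sets of grid cells. This discrete model, and the names GridGeometry, rectangle and within, are assumptions made purely for illustration; a real spatial DBMS works on continuous coordinates.

```python
from typing import FrozenSet, NamedTuple, Tuple

Cell = Tuple[int, int]


class GridGeometry(NamedTuple):
    """A toy geometry: its interior and boundary as sets of grid cells."""
    interior: FrozenSet[Cell]
    boundary: FrozenSet[Cell]

    def points(self) -> FrozenSet[Cell]:
        return self.interior | self.boundary


def rectangle(x0: int, y0: int, x1: int, y1: int) -> GridGeometry:
    """Axis-aligned rectangle covering cells x0..x1 by y0..y1 (inclusive)."""
    cells = {(x, y) for x in range(x0, x1 + 1) for y in range(y0, y1 + 1)}
    boundary = {(x, y) for (x, y) in cells if x in (x0, x1) or y in (y0, y1)}
    return GridGeometry(frozenset(cells - boundary), frozenset(boundary))


def within(a: GridGeometry, b: GridGeometry) -> bool:
    a_pts, b_pts = a.points(), b.points()
    # First clause: a ∩ b = a
    intersection_is_a = (a_pts & b_pts) == a_pts
    # Second clause: I(a) ∩ E(b) = ∅, where E(b) is every cell not in b,
    # so I(a) ∩ E(b) is the part of a's interior lying outside b.
    interior_stays_inside = len(a.interior - b_pts) == 0
    return intersection_is_a and interior_stays_inside
```

With these definitions, within(rectangle(2, 2, 4, 4), rectangle(0, 0, 6, 6)) is true, while swapping the arguments gives false. In this finite model the second clause already follows from the first; it is kept only to mirror the formula.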

Operations such as buffer() are used frequently. buffer() takes a geometry and a distance, and returns a geometry whose outline surrounds the original at that distance. The buffer() of a Point is therefore a circle. If the centerline of a road is represented as a LineString, buffer() can be used to obtain the shape of the road with its width taken into consideration. Buildings that adjoin a certain road can then be identified using touches().
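
The following sketch illustrates the two buffer cases mentioned above using plain coordinate tuples. It is an approximation under stated assumptions, not how a library such as JTS or GEOS implements buffer(): the circle is approximated by a ring of straight segments, and for a LineString the code only answers whether a point falls inside the buffer rather than constructing the buffer polygon. All function names are hypothetical.

```python
import math
from typing import List, Tuple

Coord = Tuple[float, float]


def buffer_point(center: Coord, distance: float, segments: int = 32) -> List[Coord]:
    """Approximate the buffer of a point as a closed ring with `segments` edges."""
    cx, cy = center
    ring = [
        (cx + distance * math.cos(2 * math.pi * i / segments),
         cy + distance * math.sin(2 * math.pi * i / segments))
        for i in range(segments)
    ]
    ring.append(ring[0])  # repeat the first vertex to close the ring
    return ring


def distance_to_segment(p: Coord, a: Coord, b: Coord) -> float:
    """Shortest Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its end points.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def in_linestring_buffer(p: Coord, line: List[Coord], distance: float) -> bool:
    """True if p lies within `distance` of any segment of the line."""
    return any(distance_to_segment(p, a, b) <= distance
               for a, b in zip(line, line[1:]))
```

For a road whose centerline is [(0, 0), (10, 0)] and whose half-width is 4, in_linestring_buffer((5, 3), road, 4) is true and in_linestring_buffer((5, 5), road, 4) is false.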

The Simple Feature described so far is the “Simple Feature Common Architecture”. The other Simple Feature standards are Simple Feature CORBA, Simple Feature OLE/COM and Simple Feature SQL.

Naturally, a spatial RDBMS is closely related to Simple Feature SQL. Simple Feature SQL includes the Simple Feature Common Architecture and, based on ANSI SQL 92, defines what a spatial RDBMS must provide. It specifies how to represent the Geometry Object Model as DBMS data types, how to expose spatial operations as SQL functions, and so on. It also specifies the basic tables a spatial DBMS must have. SPATIAL_REF_SYS, a table that holds the Spatial Reference ID, is a good example.
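
As a rough picture of what SPATIAL_REF_SYS holds, the sketch below creates a table with the four columns the Simple Feature SQL standard defines for it (SRID, AUTH_NAME, AUTH_SRID, SRTEXT) and registers one coordinate system. SQLite from the Python standard library is used only as a stand-in so the example is self-contained; it is not a spatial DBMS. The helper names are hypothetical, and the SRTEXT value is an abbreviated well-known text for WGS 84, not the full authoritative definition.

```python
import sqlite3
from typing import Optional, Tuple

# Abbreviated WKT for WGS 84, shortened for illustration.
WGS84_WKT = (
    'GEOGCS["WGS 84",DATUM["WGS_1984",'
    'SPHEROID["WGS 84",6378137,298.257223563]],'
    'PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433]]'
)


def create_spatial_ref_sys(conn: sqlite3.Connection) -> None:
    """Create the metadata table and register one spatial reference system."""
    conn.execute(
        """
        CREATE TABLE SPATIAL_REF_SYS (
            SRID      INTEGER NOT NULL PRIMARY KEY,  -- ID referenced by geometries
            AUTH_NAME VARCHAR(256),                  -- authority, e.g. EPSG
            AUTH_SRID INTEGER,                       -- ID assigned by that authority
            SRTEXT    VARCHAR(2048)                  -- WKT of the coordinate system
        )
        """
    )
    conn.execute(
        "INSERT INTO SPATIAL_REF_SYS VALUES (?, ?, ?, ?)",
        (4326, "EPSG", 4326, WGS84_WKT),
    )


def lookup_authority(conn: sqlite3.Connection, srid: int) -> Optional[Tuple[str, int]]:
    """Return (AUTH_NAME, AUTH_SRID) for an SRID, or None if it is unknown."""
    return conn.execute(
        "SELECT AUTH_NAME, AUTH_SRID FROM SPATIAL_REF_SYS WHERE SRID = ?",
        (srid,),
    ).fetchone()
```

A geometry column then only needs to carry the integer SRID; the DBMS can look up the full coordinate system definition in this table when it has to interpret or transform coordinates.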

Well-known open source libraries that implement the Simple Feature specification are JTS Topology Suite, which is written in Java, and GEOS, a C++ port of JTS. GEOS is used in PostGIS (an add-on package that lets PostgreSQL process spatial data).