HTML5 Geolocation

Chapter 5. Saving Geographic Information

Many applications do more than display a geolocation on a map once it has been acquired. In many cases, the location is saved for later use—possibly displaying a history of where a user has been, or showing where many users are at any given time. In these cases, the browser application will need to collect the geolocation of the device and then send that information to a server for further processing. Most of this backend processing is beyond the scope of this book, but most likely a web server will be used with a server-side language like PHP, Python, C# or VB.NET, Java, etc. The language used does not really matter, but how the information is saved does matter.
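As a rough sketch of the server side of that exchange, the following Python assumes the browser posts a JSON document that mirrors the Position object of the W3C Geolocation API (a coords object plus a timestamp in milliseconds). The function name parse_position and the payload shape are my own assumptions for illustration; your application may transmit the values differently.

```python
import json
import math

# Attributes the Geolocation API allows to be null when the device cannot supply them
OPTIONAL_KEYS = ("altitude", "altitudeAccuracy", "heading", "speed")


def parse_position(payload):
    """Flatten a JSON-encoded position (assumed shape) into a dict ready for storage."""
    data = json.loads(payload)
    coords = data["coords"]

    # latitude, longitude, and accuracy are always supplied by the API
    record = {
        "latitude": float(coords["latitude"]),
        "longitude": float(coords["longitude"]),
        "accuracy": float(coords["accuracy"]),
        "timestamp": int(data["timestamp"]),  # milliseconds since the epoch
    }

    # Treat missing, null, and NaN values alike: store them as None
    for key in OPTIONAL_KEYS:
        value = coords.get(key)
        if value is None or (isinstance(value, float) and math.isnan(value)):
            record[key] = None
        else:
            record[key] = float(value)

    # Reject coordinates that cannot be valid WGS 84 values
    if not -90.0 <= record["latitude"] <= 90.0:
        raise ValueError("latitude out of range")
    if not -180.0 <= record["longitude"] <= 180.0:
        raise ValueError("longitude out of range")
    return record
```

Whatever language you use on the server, the same idea applies: validate the values once, normalize the missing ones, and only then hand the record to whichever storage format you have chosen.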

Note

For more information on server-side scripting languages, check out some of these titles to get you started: PHP and MySQL Web Development, 4th Edition by Luke Welling and Laura Thomson (Addison-Wesley Professional), Programming Python, Fourth Edition by Mark Lutz (O’Reilly Media), Head First Java, Second Edition by Kathy Sierra and Bert Bates (O’Reilly Media), and Beginning ASP.NET 4: in C# and VB by Imar Spaanjaars (Wrox).

Since I want to concentrate more on what to do with the geographic information once it has been collected by the browser than how to manipulate it on the server, I will talk specifications more than implementation in the sections to come. There are many ways that geolocation information can be saved for later use: text files, CSV files, XML files, JSON files, KML files, Shapefiles, geodatabases, relational databases, etc. How you decide to save your geometry is going to depend on several factors, including GIS environment, operating systems, and budget.

For example, if you have a limited budget, then going a more open-source route with your GIS needs might be in order. In this case, using KML and Google Maps might be the right direction. If you are in an enterprise environment, however, then you are more likely to be using ArcGIS Desktop and other Esri products. In this type of environment, a more robust Oracle database might be in order. Knowing that there are many solutions to a problem is important knowledge that I hope you take advantage of in your own projects.

I will focus on only three of the ways data can be saved (KML, shapefiles, and relational databases) because they are all popular ways of saving geographic information. If none of these proves to be a good method for your needs, the discussion should at least help you learn how to store your data in a different format.

KML

Keyhole Markup Language (KML) is an XML file format designed to hold geographic information that is to be visualized on Internet-based maps and browsers, such as Google Maps and Google Earth. It was originally created by Keyhole, Inc., which was purchased by Google in 2004. Google submitted the KML 2.2 specification to the Open Geospatial Consortium (OGC) to ensure that KML remained an open standard. It became an official OGC standard on April 14, 2008.

Note

Often you will see a KMZ file extension. This is a zipped file that contains a compressed version of one or more KML files and their associated icon and image files.
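Because a KMZ file is an ordinary ZIP archive, it can be produced with standard tools. The following is a minimal sketch using Python's zipfile module; the function name build_kmz is hypothetical, and naming the main entry doc.kml follows the common convention rather than anything stated in this chapter.

```python
import io
import zipfile


def build_kmz(kml_text):
    """Return the bytes of a KMZ archive containing a single KML document."""
    buffer = io.BytesIO()
    # ZIP_DEFLATED compresses the entry; the default would only store it
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # doc.kml is the conventional name for the main KML file (assumption)
        archive.writestr("doc.kml", kml_text)
    return buffer.getvalue()
```

Icons or images referenced by the KML would be added to the same archive with further writestr() calls.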

KML has many uses for geospatial information, one of which is holding point data—which I am sure you have figured out by now is the focus of geolocation. In KML, a point is held within a <Placemark> container. This container holds a name, description, and Point geometry, at a minimum. A Simple Point Placemark is shown here:

<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Placemark>
    <name>Simple placemark</name>
    <description>This is an example of a simple placemark.</description>
    <Point>
      <coordinates>-90.185278,38.624722</coordinates>
    </Point>
  </Placemark>
</kml>

There are basically three types of Point Placemark that can be created:

Simple
Floating
Extruded

A Simple Point Placemark will always be attached to the ground, meaning it will always be displayed at the height of the underlying terrain. A Floating Point Placemark has a specific height at which it is defined to be above the ground height. An Extruded Point Placemark is similar to the Floating Point Placemark in that it is at a specific height above the ground, but it is tethered to the ground by a customizable tail. All three of these types are controlled by the data inside the <Point> element of the <Placemark>.

The following illustrates the syntax of a Point Placemark, showing the child elements that would be associated with geolocation. For a full list of the elements that can be added as children of a Placemark, see the KML Reference, Placemark at http://code.google.com/apis/kml/documentation/kmlreference.html#placemark:

<Placemark id="ID">
  <name>...</name>                     <!-- string -->
  <description>...</description>       <!-- string -->
  <TimeStamp>
    <when>...</when>                   <!-- kml:dateTime -->
  </TimeStamp>
  <ExtendedData>...</ExtendedData>     <!-- custom -->
  <Point id="ID">
    <extrude>...</extrude>             <!-- boolean -->
    <altitudeMode>...</altitudeMode>   
            <!-- clampToGround, relativeToGround, or absolute -->
    <coordinates>...</coordinates>     <!-- long,lat[,alt] -->
  </Point>
</Placemark>

Looking at the <Point> element, you will see that it has three children: <extrude>, <altitudeMode>, and <coordinates>. The <coordinates> element is required by all three types of Point Placemark and contains a longitude and latitude (in that order) measured in decimal degrees referenced to WGS 84, and an optional altitude measured in meters. How the altitude is interpreted depends on <altitudeMode>: it is ignored for clampToGround (the default), measured above the terrain for relativeToGround, and measured above sea level for absolute. When an <altitudeMode> of relativeToGround or absolute is added to the <Point> element, the Point Placemark becomes Floating or Extruded. Which of the two it is depends on whether the <extrude> element is set to true with a value of 1.
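The rule for telling the three Point Placemark types apart can be sketched in a few lines of Python. This is only an illustration of the logic described above; the function name classify_point is hypothetical, and it assumes the <Point> element has already been parsed with the standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes namespaced tags in the form "{namespace}local-name"
KML_NS = "{http://www.opengis.net/kml/2.2}"


def classify_point(point):
    """Return 'Simple', 'Floating', or 'Extruded' for a parsed <Point> element."""
    # A missing <altitudeMode> is treated as clampToGround
    mode = point.findtext(KML_NS + "altitudeMode", default="clampToGround").strip()
    # A missing <extrude> is treated as false
    extrude = point.findtext(KML_NS + "extrude", default="0").strip()

    if mode == "clampToGround":
        return "Simple"
    # XML booleans may be written as 1 or true
    return "Extruded" if extrude in ("1", "true") else "Floating"
```

For example, a <Point> with an <altitudeMode> of relativeToGround and an <extrude> of 0 would be reported as Floating.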

Example 5-1 shows a KML file with several points in it, along with all of the information that can be gathered by the W3C Geolocation API.

Example 5-1. Sample KML file with geolocation information

<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark id="pt_000000">
      <name>Point 000000</name>
      <description>This is the first point collected.</description>
      <TimeStamp><when>2011-04-06T23:24:12+06:00</when></TimeStamp>
      <ExtendedData>
        <Data name="accuracy"><value>20</value></Data>
        <Data name="altitudeAccuracy"><value>100</value></Data>
        <Data name="heading"><value>NaN</value></Data>
        <Data name="speed"><value>0</value></Data>
      </ExtendedData>
      <Point>
        <extrude>0</extrude>
        <altitudeMode>relativeToGround</altitudeMode>
        <coordinates>-90.185278,38.624722,212</coordinates>
      </Point>
    </Placemark>
    <Placemark id="pt_000001">
      <name>Point 000001</name>
      <description>This is the second point collected.</description>
      <TimeStamp><when>2011-04-07T00:15:37+06:00</when></TimeStamp>
      <ExtendedData>
        <Data name="accuracy"><value>10</value></Data>
        <Data name="altitudeAccuracy"><value>10</value></Data>
        <Data name="heading"><value>37</value></Data>
        <Data name="speed"><value>15.6464</value></Data>
      </ExtendedData>
      <Point>
        <extrude>0</extrude>
        <altitudeMode>relativeToGround</altitudeMode>
        <coordinates>-89.788221,38.4233,18</coordinates>
      </Point>
    </Placemark>
    <Placemark id="pt_000002">
      <name>Point 000002</name>
      <description>This is the third point collected.</description>
      <TimeStamp><when>2011-04-07T11:49:03+06:00</when></TimeStamp>
      <ExtendedData>
        <Data name="accuracy"><value>60</value></Data>
        <Data name="altitudeAccuracy"><value>80</value></Data>
        <Data name="heading"><value>147</value></Data>
        <Data name="speed"><value>31.2928</value></Data>
      </ExtendedData>
      <Point>
        <extrude>0</extrude>
        <altitudeMode>relativeToGround</altitudeMode>
        <coordinates>-90.123129,37.992331,25</coordinates>
      </Point>
    </Placemark>
  </Document>
</kml>

Although the latitude, longitude, altitude, and timestamp can be included natively, the rest of the geolocation information—accuracy, altitudeAccuracy, heading, and speed—needs to be added in the <ExtendedData> element and defined there for use. There are three ways this data can be added; see the KML Reference, ExtendedData at http://code.google.com/apis/kml/documentation/kmlreference.html#extendeddata for more information on these methods. I chose the data pair method so that the values would be shown in Google Earth, but one of the other methods might better suit your application needs.
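To give an idea of how the data pairs can be read back on the server, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The function name read_extended_data is hypothetical, and the sketch assumes every Placemark carries an id attribute, as the ones in Example 5-1 do.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace map used in the search paths below
NS = {"kml": "http://www.opengis.net/kml/2.2"}


def read_extended_data(kml_text):
    """Map each Placemark id to a dict of its <Data> name/value pairs."""
    root = ET.fromstring(kml_text)
    result = {}
    for placemark in root.iter("{%s}Placemark" % NS["kml"]):
        values = {}
        for data in placemark.findall("kml:ExtendedData/kml:Data", NS):
            # Values are returned as text; converting them is left to the caller
            values[data.get("name")] = data.findtext(
                "kml:value", default="", namespaces=NS)
        result[placemark.get("id")] = values
    return result
```

Note that everything comes back as a string, so a value such as NaN for heading still has to be interpreted by the calling code.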

Because KML is basically text in a file, creating it, reading from it, or writing to it on the server side of an application is fairly straightforward regardless of the technology being used. Because it is XML, converting the data in a KML file to a different format is not difficult either. Working with KML is easy, which makes it a good choice for storing geolocation data.
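As one possible illustration of how little code this takes, the sketch below builds a single-Placemark KML document with Python's standard xml.etree.ElementTree module. The function name placemark_kml and its parameters are hypothetical, and the choice of relativeToGround simply follows Example 5-1.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def _tag(local_name):
    """Qualify a tag name with the KML namespace."""
    return "{%s}%s" % (KML_NS, local_name)


def placemark_kml(name, lon, lat, alt, when, extra):
    """Return a KML document (as text) holding one Point Placemark."""
    # Serialize KML as the default namespace instead of a generated prefix
    ET.register_namespace("", KML_NS)

    kml = ET.Element(_tag("kml"))
    placemark = ET.SubElement(kml, _tag("Placemark"))
    ET.SubElement(placemark, _tag("name")).text = name

    stamp = ET.SubElement(placemark, _tag("TimeStamp"))
    ET.SubElement(stamp, _tag("when")).text = when

    # Values KML has no native element for go into ExtendedData pairs
    extended = ET.SubElement(placemark, _tag("ExtendedData"))
    for key, value in extra.items():
        data = ET.SubElement(extended, _tag("Data"), name=key)
        ET.SubElement(data, _tag("value")).text = str(value)

    point = ET.SubElement(placemark, _tag("Point"))
    ET.SubElement(point, _tag("altitudeMode")).text = "relativeToGround"
    # KML coordinate order is longitude,latitude,altitude
    ET.SubElement(point, _tag("coordinates")).text = "%s,%s,%s" % (lon, lat, alt)

    return ET.tostring(kml, encoding="unicode")
```

Building the document through an XML library rather than by string concatenation means that special characters in names and descriptions are escaped for you.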

Shapefiles

A shapefile is a data format designed for holding geographical vector data like points and polygons along with associated attribute data. It was developed and is maintained by Esri. It was specifically designed as a spatial data format for use with Esri’s ArcGIS Desktop product, though it works with other software as well, including AutoCAD Map, MapInfo, GeoMedia, and GRASS.

There are tools available to convert shapefiles to other formats and vice versa, making this a flexible format for holding geolocation information. By holding the point data in a shapefile, it can easily be converted to another format when needed. Some conversion programs are SHP2KML, shp2CAD, and SHP2MIF. The reverse of these programs can also easily be located with a quick web search.

Though it is called a shapefile, the format is actually a set of files that work together to produce the necessary working data. At least three files are needed for a shapefile, as shown in Table 5-1.

Table 5-1. Files associated with a shapefile^[a]

Extension	Description	Required
.shp	Stores the feature geometry.	yes
.shx	Stores the index of the feature geometry.	yes
.dbf	Stores the attribute information of the feature in a dBASE table.	yes
.sbn/.sbx	Stores the spatial index of the features.	no
.fbn/.fbx	Stores the spatial index of the features that are read-only.	no
.ain/.aih	Stores the attribute index of the active fields in a table or a theme’s attribute table.	no
.atx	Stores the attribute index for the dBASE table.	no
.ixs	Stores the geocoding index for read/write shapefiles.	no
.mxs	Stores the geocoding index for read/write shapefiles (ODB format).	no
.prj	Stores the coordinate system information.	no
.xml	Stores metadata for the feature.	no
.cpg	Stores the codepage for identifying the character set to be used by the shapefile.	no
^[a]ArcGIS Resource Center, Desktop 10, Shapefile file extensions. http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#/Shapefile_file_extensions/005600000003000000/.
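Since a shapefile is unusable when any of its required files is missing, a web application may want to check for them before trying to read or append data. The following is a small sketch of such a check; the function name missing_shapefile_parts is hypothetical, and it assumes the extensions are written in lowercase on disk.

```python
from pathlib import Path

# The three files marked as required in Table 5-1
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")


def missing_shapefile_parts(base_path):
    """Return the required extensions that do not exist for base_path.

    base_path is the shapefile name without an extension,
    for example 'shapefiles/geolocation'.
    """
    base = Path(base_path)
    # Append the extension to the full name so names containing dots stay intact
    return [ext for ext in REQUIRED_EXTENSIONS
            if not base.with_name(base.name + ext).exists()]
```

An empty list means all three required files are present.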

To be useful to a web application, there needs to be a way to programmatically manipulate a shapefile and perform all necessary operations on it (create, read, write, and so on). For instance, the Shapefile C Library lets developers write C programs that read, write, and to some extent update shapefiles. A more useful scripting library for web applications is the Python Shapefile Library.

Python Shapefile Library

The Python Shapefile Library (PSL) was written by Joel Lawhead. It provides read and write capabilities for shapefiles using the Python scripting language. It is designed to be as extensible as it can be when creating a shapefile while still having some validation to ensure a proper file is produced. Take a look at Example 5-2.

Example 5-2. Creating a Shapefile with the Python Shapefile Library

# Include the Python Shapefile Library
import shapefile as sf

# Name of the shapefile to create
filename = 'shapefiles/geolocation'

# Create a /point/ shapefile, and turn on autoBalance
sf_w = sf.Writer(sf.POINT)
sf_w.autoBalance = 1

# Add the points
sf_w.point(-90.185278, 38.624722, 212)
sf_w.point(-89.788221, 38.4233, 18)
sf_w.point(-90.123129, 37.992331, 25)

# Create attribute information
sf_w.field('Name', 'C', 20)
sf_w.field('Description', 'C', 80)
# Stored as text: a dBASE date ('D') field holds only YYYYMMDD, with no time
sf_w.field('Timestamp', 'C', 25)
sf_w.field('Accuracy', 'N', 4, 0)
sf_w.field('AltitudeAccuracy', 'N', 4, 0)
sf_w.field('Heading', 'N', 9, 6)
sf_w.field('Speed', 'N', 7, 4)

# Add attribute information
sf_w.record('Point 000000', 'This is the first point collected.', \
    '2011-04-06T23:24:12+06:00', 20, 100, None, 0)
sf_w.record('Point 000001', 'This is the second point collected.', \
    '2011-04-07T00:15:37+06:00', 10, 10, 37, 15.6464)
sf_w.record('Point 000002', 'This is the third point collected.', \
    '2011-04-07T11:49:03+06:00', 60, 80, 147, 31.2928)

# Save the file
sf_w.save(filename)

# Create a projection file describing WGS 84
# (adjacent string literals are joined, so no stray whitespace ends up in the WKT)
epsg = ('GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,'
        '298.257223563]],PRIMEM["Greenwich",0],UNIT["degree",'
        '0.0174532925199433]]')
with open("%s.prj" % filename, 'w') as prj:
    prj.write(epsg)

The first line of code imports PSL into the working script. After specifying that the shapefile will be a POINT type using the Writer object, the property autoBalance is set to true. This ensures that when a point or a record is added with the script, its counterpart is also added (every point has a record and every record has a point). Next, the points are added with the point() method. The point() method takes an x value (the longitude), a y value (the latitude), and an optional altitude and measure. In Example 5-2, the longitude, latitude, and altitude of each point are passed in, in that order.

Before attribute records can be added to the shapefile, the attributes must be defined using the field() method. The field() method takes a field name, field type, field length, and (for numbers) decimal length. Once defined, the records, one for each point, are created using the record() method. After the records have been added, the shapefile is saved and the three required files (.shp, .shx, and .dbf) are created. Additionally, Example 5-2 creates a .prj file for a more complete shapefile instance.
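One detail to keep in mind when choosing field types is that a dBASE date field (type D) holds only a calendar date in YYYYMMDD form, so a full timestamp has to be stored as text or split into separate date and time fields. The sketch below shows one way to do the split in Python; the function name split_timestamp is hypothetical, and it relies on datetime.fromisoformat(), which requires Python 3.7 or later.

```python
from datetime import datetime


def split_timestamp(iso_text):
    """Split an ISO 8601 timestamp into a dBASE-style date and a time string."""
    moment = datetime.fromisoformat(iso_text)
    date_part = moment.strftime("%Y%m%d")       # fits a 'D' field
    time_part = moment.strftime("%H:%M:%S%z")   # fits a short character field
    return date_part, time_part
```

The date part could then go into a D field and the time part into a C field of 13 characters.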

In most cases, the shapefile holding the geolocation information will already be created when the application needs to add another record. The following code shows a small script that can edit an existing shapefile and add another point to it:

import shapefile as sf
filename = 'shapefiles/geolocation'
sf_e = sf.Editor(shapefile = filename + '.shp')
sf_e.point(-102.125532, 34.223411, 40)
sf_e.record('Point 000004', 'This is an appended point. ', \
'2011-04-10T01:52:22+06:00', 20, 30, 118, 17.21)
sf_e.save(filename)

The code for editing an existing shapefile and adding a point is simple. Updating existing points is a little more complicated, however. The Editor object takes care of inserting and deleting records in the shapefile. The specific record number needs to be obtained by first reading the shapefile and locating the record (also something PSL can do). Then the record should be deleted using the delete() method, and a new record with the corrected information should be added to the shapefile.

The Python Shapefile Library is fairly easy to use, even if you do not know a lot of Python to start with. The only downside to this library is that it does not have the best documentation available. Otherwise, it is a great way to manipulate shapefiles within a web application.

Databases

A database is an organized collection of data that is built so that the data can be stored, manipulated, and retrieved in an easy manner. The typical databases used for geographic information are relational database management systems (RDBMS), though it is also possible to store the data in object database management systems (ODBMS). Note that for the rest of this chapter, when I refer to a database, I am referring to an RDBMS. Some examples of common RDBMSs are dBASE, Microsoft SQL Server, MySQL, Oracle, PostgreSQL, and Sybase.

Spatial databases are built so that the spatial data and attributes coexist in the same database. MySQL, DB2, Oracle, and Microsoft SQL Server (starting with 2008) all can store spatial information natively in their tables. In some cases, however, additional software is placed on top of the RDBMS in order to facilitate geographic functionality (especially querying) within the database. ArcSDE, OracleSpatial, and PostGIS are examples of software that is used on top of the databases themselves to handle geographic data. OracleSpatial is built specifically for Oracle and PostGIS is built specifically for PostgreSQL, while ArcSDE works with four commercial databases. MySQL has geographic functionality built directly into it and does not require additional software.

SDE

ArcSDE, or simply SDE (Spatial Database Engine), is an Esri product for storing and managing geographic data with other business data within a relational database. It is designed to run with the commercial databases IBM DB2, Informix, Microsoft SQL Server, and Oracle, as well as the open source database PostgreSQL. Starting with ArcGIS 9.2, Esri stopped selling ArcSDE as a stand-alone product and began bundling it with their ArcGIS Desktop and ArcGIS Server products. The latest release of the software at the time of this writing is 10.0. ArcSDE supports various standards, including OGC simple features, the International Organization for Standardization (ISO) spatial types, the OracleSpatial format, the PostGIS format, and the Microsoft spatial format.

PostGIS

PostGIS adds spatial functionality to the PostgreSQL relational database. It was developed by Refractions Research as an open source project and is released under the GNU General Public License. The first stable version (1.0) of the software was released in 2005. The current version of the software (as of this writing) is 1.5.2. PostGIS, which acts like ArcSDE or OracleSpatial, follows the OGC Simple Feature specification, though it has not been certified compliant by the OGC.

The following should give you some idea of PostGIS functionality using SQL:

SELECT loc.the_geom
FROM
geolocations loc INNER JOIN 
(SELECT the_geom
FROM
 (SELECT the_geom, ST_Area(the_geom) AS area
  FROM parks) p
WHERE
 area > 10000) park ON ST_Intersects(loc.the_geom, park.the_geom)

This query finds all geolocations that are located within city parks with an area greater than 10,000 square feet. ST_Area() returns its result in the square units of the geometry's coordinate system, so this assumes the parks are stored in a coordinate system measured in feet. To do this, the query first pulls the parks polygons and calculates their areas using the PostGIS function ST_Area(). It then keeps the parks with an area larger than 10,000 square feet. Finally, it finds the geolocations located within those parks using the ST_Intersects() function. The result of this query is the geometry associated with each geolocation found within a city park with an area greater than 10,000 square feet.
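To make the logic of the query more concrete, here is a plain Python sketch of the same three steps for simple planar polygons. It is not how PostGIS implements ST_Area() or ST_Intersects(); it swaps in the shoelace formula and a ray-casting test, ignores holes and points lying exactly on a boundary, and all function names are hypothetical.

```python
def polygon_area(ring):
    """Planar area by the shoelace formula; ring is a list of (x, y) vertices."""
    total = 0.0
    for i in range(len(ring)):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % len(ring)]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def point_in_polygon(point, ring):
    """Ray casting: an odd number of edge crossings means the point is inside."""
    x, y = point
    inside = False
    for i in range(len(ring)):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % len(ring)]
        # Only edges that straddle the horizontal line through the point count
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def locations_in_large_parks(locations, parks, min_area=10000):
    """Keep the locations that fall inside any park larger than min_area."""
    large_parks = [ring for ring in parks if polygon_area(ring) > min_area]
    return [pt for pt in locations
            if any(point_in_polygon(pt, ring) for ring in large_parks)]
```

A spatial database does the same job with spatial indexes and robust geometry handling, which is why pushing this kind of question into SQL is usually the better choice.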

MySQL

MySQL is the world’s most popular open source database, in use by some of the most heavily visited websites like Google, Wikipedia, YouTube, and Facebook. Instead of requiring additional software on top of itself, MySQL implements a subset of the OGC SQL with Geometry Types specification directly into its database. It has taken MySQL several releases since it first introduced spatial capabilities to get to the place where other spatial databases like PostGIS and OracleSpatial currently are.

The OGC naming conventions were not implemented in MySQL until version 5.6. Unfortunately, as of the time of this writing, the current generally available community release of MySQL is 5.5.11, which has differences in naming. For example, MySQL 5.6 would utilize the exact same SQL statement as the example in PostGIS. The MySQL 5.5 version of this code would look like the following:

SELECT loc.the_geom
FROM
geolocations loc INNER JOIN 
(SELECT the_geom
FROM
 (SELECT the_geom, Area(the_geom) AS area
  FROM parks) p
WHERE
 area > 10000) park ON Intersects(loc.the_geom, park.the_geom)

As you can see, the two versions are very similar, and anyone with some SQL and spatial database experience could figure out MySQL’s version of things. Once MySQL 5.6 becomes generally available, MySQL will have caught up with its competitors, making it a very attractive solution for spatial data management considering its popularity as a relational database.

To conclude the discussion on spatial data management with relational databases, Example 5-3 creates the structure our geolocations would need to match the KML and Python Shapefile Library examples.

Example 5-3. Creating a geolocation database in MySQL

CREATE DATABASE geolocations;

USE geolocations;

CREATE TABLE positions (
pos_id           INT            NOT NULL AUTO_INCREMENT PRIMARY KEY,
the_geom         POINT          NOT NULL,
altitude         DECIMAL(8, 2)  NOT NULL,
accuracy         DECIMAL(4, 0)  NOT NULL,
altitudeAccuracy DECIMAL(4, 0)  NULL,
heading          DECIMAL(9, 6)  NULL,
speed            DECIMAL(7, 4)  NULL,
timestamp        DATETIME       NOT NULL,
name             VARCHAR(20)    NOT NULL,
description      VARCHAR(80)    NULL
);

This example creates a new database called geolocations, and then creates a table called positions that holds all of the attribute data that can be collected from the W3C Geolocation API. The SQL script to insert a new position record into our database would look like this:

INSERT INTO positions (
the_geom, 
altitude, 
accuracy, 
altitudeAccuracy, 
heading, 
speed, 
timestamp, 
name,
description
) VALUES (
GeomFromText('POINT(-89.788221 38.4233)'),
18,
10,
10,
37,
15.6464,
'2011-04-07 00:15:37',
'Point 000001', 
'This is the second point collected.'
);

This SQL statement adds a point with the OGC Well-Known Text (WKT) format using the GeomFromText() function. This SQL would be executed from a server-side script with values passed to it from the client after a location had been retrieved. The table creation and insertion would be almost the same in any relational database.
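As a sketch of what that server-side step might look like, the Python below prepares a parameterized version of the INSERT statement and the values to bind to it. It stops short of executing anything. The function name insert_parameters, the flattened position dict, and the %s placeholder style (the one used by common Python MySQL drivers) are assumptions for illustration.

```python
from datetime import datetime, timezone

# Placeholders keep client-supplied values out of the SQL text itself
INSERT_SQL = (
    "INSERT INTO positions "
    "(the_geom, altitude, accuracy, altitudeAccuracy, heading, speed, "
    "timestamp, name, description) "
    "VALUES (GeomFromText(%s), %s, %s, %s, %s, %s, %s, %s, %s)"
)


def insert_parameters(position, name, description=None):
    """Build the parameter tuple for INSERT_SQL from a flattened position dict."""
    # WKT puts the x value (longitude) first, then the y value (latitude)
    wkt = "POINT(%s %s)" % (float(position["longitude"]),
                            float(position["latitude"]))
    # The Geolocation API reports time in milliseconds since the epoch;
    # this sketch stores it as a UTC DATETIME string
    moment = datetime.fromtimestamp(position["timestamp"] / 1000.0,
                                    tz=timezone.utc)
    return (
        wkt,
        position.get("altitude"),
        position["accuracy"],
        position.get("altitudeAccuracy"),
        position.get("heading"),
        position.get("speed"),
        moment.strftime("%Y-%m-%d %H:%M:%S"),
        name,
        description,
    )
```

With a database driver, the statement and tuple would be passed together to the cursor's execute() method, which binds the values safely instead of pasting them into the SQL string.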
