Quite often data isn’t in a format that makes it readily available for both mapping software and people to read. Many data formats store the geographic data in a binary format. This format is normally readable only by computers and is designed for software to use. Some spatial data formats are in a simple text format, which is easier to explore.
One example of a text-based format is Geography Markup Language (GML). It uses a text syntax to encode coordinates into a text file. GML can then be read or edited by hand with nothing more than a common text editor. Creating GML from scratch isn’t very pleasant. Fortunately, another OGR utility, ogr2ogr, can convert OGR-supported data formats into and out of other formats, including GML.
GML has three different versions: GML1, GML2, and GML3. There
are many subversions as well. The differences between versions can be
a problem because certain features may not transfer directly from one
version to another. The tools introduced in this chapter use ogr2ogr,
which outputs GML2.
GML was designed to be suitable for data interoperability, allowing the exchange of spatial data using a common format. This has opened up the possibilities for various web services that operate using different software yet can communicate using this standard format. See Chapter 12 to learn more about this example.
GML’s downside is its size. Text encodes numbers and structure far
less compactly than a binary format does, so GML files tend to be much
larger than the same data in other formats. It isn’t ideal to store all
your data as GML, but it’s perfect for sending data to someone who may
not be able to support other formats. Because GML is text, it
compresses very well using tools such as gzip and WinZip.
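To get a feel for how well GML-style text compresses, here is a minimal Python sketch. The feature template is modeled on the airports output shown in this chapter, but the generated names and coordinates are synthetic assumptions, so the ratio it prints is only indicative and will differ for real data.

```python
import gzip

# One GML-style feature. The element names follow the airports example;
# the values filled in below are synthetic and purely illustrative.
TEMPLATE = (
    '<gml:featureMember><airports fid="{fid}">'
    "<NAME>Airport {fid}</NAME>"
    "<ogr:geometryProperty><gml:Point>"
    "<gml:coordinates>{x},{y}</gml:coordinates>"
    "</gml:Point></ogr:geometryProperty>"
    "</airports></gml:featureMember>\n"
)


def sample_gml(count):
    """Return `count` synthetic features as UTF-8 encoded bytes."""
    rows = [
        TEMPLATE.format(fid=i, x=434634 + 61 * i, y=5228719 + 63 * i)
        for i in range(count)
    ]
    return "".join(rows).encode("utf-8")


raw = sample_gml(1000)
packed = gzip.compress(raw)

print(f"raw: {len(raw)} bytes, gzipped: {len(packed)} bytes")
print(f"compressed to {len(packed) / len(raw):.1%} of the original size")
```

The repeated tag names are what make the text so compressible; the same repetition is what makes uncompressed GML large in the first place.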
Extensible Markup Language (XML) is a popular standard for general data exchange, especially through the Internet. XML-based data files (such as GML) typically have an associated schema document. Schema documents aren’t discussed here, but you should know that they exist. They describe how the data in the XML data file is structured: which attributes it has, what version of XML is being used, what type of information each attribute holds, etc. Schema documents have a filename suffix of .xsd. Therefore, a GML file called airports.gml also has an associated airports.xsd file describing the contents of airports.gml.
Example
7-1 shows how to convert the airports data from shapefile
format to GML using the ogr2ogr
conversion utility.
> ogr2ogr -f "GML" airports.gml data/airports.shp airports
> more airports.gml
<?xml version="1.0" encoding="utf-8" ?>
<ogr:FeatureCollection
     xmlns:xsi="http://www.w3c.org/2001/XMLSchema-instance"
     xsi:schemaLocation=". airports.xsd"
     xmlns:ogr="http://ogr.maptools.org/"
     xmlns:gml="http://www.opengis.net/gml">
  <gml:boundedBy>
    <gml:Box>
      <gml:coord><gml:X>434634</gml:X><gml:Y>5228719</gml:Y></gml:coord>
      <gml:coord><gml:X>496393</gml:X><gml:Y>5291930</gml:Y></gml:coord>
    </gml:Box>
  </gml:boundedBy>
  <gml:featureMember>
    <airports fid="0">
      <NAME>Bigfork Municipal Airport</NAME>
      <LAT> 47.7789</LAT>
      <LON> -93.6500</LON>
      <ELEVATION> 1343.0000</ELEVATION>
      <QUADNAME>Effie</QUADNAME>
      <ogr:geometryProperty><gml:Point><gml:coordinates>451306,5291930</gml:coordinates></gml:Point></ogr:geometryProperty>
    </airports>
  </gml:featureMember>
  <gml:featureMember>
    <airports fid="1">
      <NAME>Bolduc Seaplane Base</NAME>
      <LAT> 47.5975</LAT>
      <LON> -93.4106</LON>
      <ELEVATION> 1325.0000</ELEVATION>
      <QUADNAME>Balsam Lake</QUADNAME>
      <ogr:geometryProperty><gml:Point><gml:coordinates>469137,5271647</gml:coordinates></gml:Point></ogr:geometryProperty>
This converts the airports shapefile into GML format, which can be read (by either humans or programs) with little effort. GML data can be changed or appended to and then reconverted, or used as is if your application supports the GML format. For example, if you want to change the location of an airport, simply edit the file using a text editor such as Notepad (on Windows) or vim (on Unix). Word processors such as Word or WordPad can also be used, but you must explicitly save the file as plain text; otherwise, the data is saved in a binary format made for the word processor.
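As an illustration of how little effort a program needs to read this kind of file, the following Python sketch lists each airport's name and coordinates using only the standard library. The embedded sample is a trimmed, hand-closed copy of the output in Example 7-1 (the XML declaration, bounding box, and most attributes are left out), and the function name read_airports is a hypothetical choice, not part of OGR.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as they appear in the ogr2ogr output above.
NS = {
    "gml": "http://www.opengis.net/gml",
    "ogr": "http://ogr.maptools.org/",
}

# Trimmed copy of Example 7-1, closed by hand so that it parses.
SAMPLE = """<ogr:FeatureCollection
     xmlns:ogr="http://ogr.maptools.org/"
     xmlns:gml="http://www.opengis.net/gml">
  <gml:featureMember>
    <airports fid="0">
      <NAME>Bigfork Municipal Airport</NAME>
      <ogr:geometryProperty><gml:Point><gml:coordinates>451306,5291930</gml:coordinates></gml:Point></ogr:geometryProperty>
    </airports>
  </gml:featureMember>
  <gml:featureMember>
    <airports fid="1">
      <NAME>Bolduc Seaplane Base</NAME>
      <ogr:geometryProperty><gml:Point><gml:coordinates>469137,5271647</gml:coordinates></gml:Point></ogr:geometryProperty>
    </airports>
  </gml:featureMember>
</ogr:FeatureCollection>"""


def read_airports(gml_text):
    """Yield (name, x, y) for every point feature in the GML text."""
    root = ET.fromstring(gml_text)
    for member in root.findall("gml:featureMember", NS):
        feature = member[0]  # the <airports> element
        name = feature.findtext("NAME")
        coords = feature.findtext(".//gml:coordinates", namespaces=NS)
        x, y = (float(value) for value in coords.split(","))
        yield name, x, y


for name, x, y in read_airports(SAMPLE):
    print(name, x, y)
```

To try it on a real file, read airports.gml into a string (or bytes) and pass that to read_airports instead of SAMPLE.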
In Example 7-1, I
demonstrate the more command,
which lists the contents of the file to your screen, one page at a
time. This prevents the output from scrolling off the top of the
screen so quickly that it can’t be read. You should be aware that
more can be found on various
versions of Windows, but on other operating systems, it may have
been replaced by the less
program. Both programs can be used interchangeably here. You request
another page of output by pressing the space bar. To view only the
next line, press the Enter key.
When editing the airports.gml file, you can see that it is organized
into hierarchical sections. Each airport feature has a section
starting with <gml:featureMember> and ending with
</gml:featureMember>. The last element listed in each section is
<ogr:geometryProperty>, which contains all the geometry information.
To move an airport, simply change the coordinates entered between the
<gml:coordinates> tags. For example:
<gml:coordinates>444049,5277360</gml:coordinates>
can be changed to:
<gml:coordinates>450000,5260000</gml:coordinates>
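The same edit can be scripted instead of typed. The sketch below is one possible approach using Python's standard library; the function name move_airport and the one-feature sample are assumptions made for illustration, and the target coordinates are the ones used in the text above. Note that it leaves any <gml:boundedBy> extent untouched, just as a manual edit would.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"
OGR_NS = "http://ogr.maptools.org/"

# Keep the familiar gml:/ogr: prefixes when the tree is written back out.
ET.register_namespace("gml", GML_NS)
ET.register_namespace("ogr", OGR_NS)

# Hand-trimmed, single-feature sample in the style of airports.gml.
SAMPLE = """<ogr:FeatureCollection
     xmlns:ogr="http://ogr.maptools.org/"
     xmlns:gml="http://www.opengis.net/gml">
  <gml:featureMember>
    <airports fid="0">
      <NAME>Bigfork Municipal Airport</NAME>
      <ogr:geometryProperty><gml:Point><gml:coordinates>451306,5291930</gml:coordinates></gml:Point></ogr:geometryProperty>
    </airports>
  </gml:featureMember>
</ogr:FeatureCollection>"""


def move_airport(gml_text, name, x, y):
    """Return new GML text with the named airport moved to (x, y)."""
    root = ET.fromstring(gml_text)
    moved = 0
    for feature in root.iter("airports"):
        if feature.findtext("NAME") != name:
            continue
        # Rewrite the text between the <gml:coordinates> tags.
        for coords in feature.iter(f"{{{GML_NS}}}coordinates"):
            coords.text = f"{x},{y}"
            moved += 1
    if moved == 0:
        raise ValueError(f"no airport named {name!r} found")
    return ET.tostring(root, encoding="unicode")


edited = move_airport(SAMPLE, "Bigfork Municipal Airport", 450000, 5260000)
print(edited)
```

Writing the returned string to a new .gml file gives you something ogr2ogr can convert like any other GML input.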
ogrinfo can be used again to
view information about the GML file, or ogr2ogr can be used to convert the data back
into the shapefile format someone else may be expecting. This is shown
in the following example:
> ogr2ogr -f "ESRI Shapefile" airport_gml.shp airports.gml airports
This takes the GML file and converts it back to shapefile
format. Now ogrinfo can be run on
it to compare the results with the earlier tests.
Sometimes you get a set of data for a
project but are interested in only a small portion of it. For example,
if I’m making a map of my local municipality, I might want only the
location of the nearest airport and not all 12 that are available from
my government data source. Reducing the set of features is possible
using ogr2ogr, and the syntax that
selects the feature(s) of interest is almost identical to that of ogrinfo.
It is possible to use ogr2ogr
as a feature extraction tool. For example, if you want to make a map
of the Bigfork airport, you can use ogr2ogr and the -where option to create a new shapefile that
only includes the Bigfork Municipal Airport, as shown in the following
example:
> ogr2ogr -f "ESRI Shapefile" bigfork data/airports.shp -where "name='Bigfork Municipal Airport'"
The first parameter tells ogr2ogr to create a shapefile. The command
then creates a folder called bigfork and creates the airports shapefile within it. As seen in
Example 7-2, when you run
ogrinfo against this dataset, you
can see there is now only one feature, Bigfork Municipal Airport.
> ogrinfo bigfork airports
INFO: Open of 'bigfork'
using driver 'ESRI Shapefile' successful.
Layer name: airports
Geometry: Point
Feature Count: 1
Extent: (451306.000000, 5291930.000000) - (451306.000000, 5291930.000000)
Layer SRS WKT:
(unknown)
NAME: String (64.0)
LAT: Real (12.4)
LON: Real (12.4)
ELEVATION: Real (12.4)
QUADNAME: String (32.0)
OGRFeature(airports):0
NAME (String) = Bigfork Municipal Airport
LAT (Real) = 47.7789
LON (Real) = -93.6500
ELEVATION (Real) = 1343.0000
QUADNAME (String) = Effie
POINT (451306 5291930)
The power of ogr2ogr
is its ability to handle several vector data formats. To get a list of
possible formats, run ogr2ogr
without any parameters, as shown in Example 7-3. You will see the
help information displayed as well as the -f format name options.
-f format_name: output file format name, possible values are:
-f "ESRI Shapefile"
-f "TIGER"
-f "S57"
-f "MapInfo File"
-f "DGN"
-f "Memory"
-f "GML"
-f "PostgreSQL"
Creating other formats is as easy as changing the -f option to the needed format, then
entering in an appropriate output dataset name. The -where clause can remain as is.
The formats supported by ogr2ogr depend on which were built into
ogr2ogr when it was compiled. If
a format you need isn’t shown in the list, you may need to find
another build of ogr2ogr from
somewhere else: try the mailing list, for example. There are also
some formats ogr2ogr can’t write
to because the OGR library supports them for reading only. For the
list of formats, including those which OGR can both read and
write/create, see http://www.gdal.org/ogr/ogr_formats.html.
Converting to other file-based formats is particularly easy. The following example shows how to create a GML file:
> ogr2ogr -f "GML" bigfork_airport.gml data/airports.shp -where "name='Bigfork Municipal Airport'"
This example shows how to convert to DGN format:
> ogr2ogr -f "DGN" -select "" bigfork_airport.dgn data/airports.shp -where "name='Bigfork Municipal Airport'"
DGN format can’t store attribute fields. If you convert to DGN without the -select option, you’ll see an error message:
ERROR 6: CreateField() not supported by this layer.
The file is created, but it doesn’t include any attributes. One
workaround to this problem is to use the -select option. With other formats, the
-select option allows you to
specify what attributes you want to convert into the destination
layer. Because ogr2ogr can’t
convert any attributes to the DGN file, you select no fields by
providing an empty value, such as two double quotes as shown in the
DGN example.
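The file-based conversions in this section differ only in the -f value, the output name, and the optional -where and -select settings. As a rough sketch of how they could be scripted, the Python helper below assembles the argument list for each one. The helper name build_ogr2ogr_args is hypothetical; it only builds and prints the commands, so nothing here depends on ogr2ogr being installed. Passing arguments as a list also sidesteps shell quoting problems with the -where clause and the empty -select value.

```python
import shlex


def build_ogr2ogr_args(fmt, dst, src, where=None, select=None):
    """Assemble an ogr2ogr argument list in the order used in this section."""
    args = ["ogr2ogr", "-f", fmt]
    if select is not None:
        # An empty string means "copy no attribute fields" (the DGN case).
        args += ["-select", select]
    args += [dst, src]
    if where is not None:
        args += ["-where", where]
    return args


WHERE = "name='Bigfork Municipal Airport'"
SOURCE = "data/airports.shp"

# (format name, output dataset, -select value or None)
jobs = [
    ("ESRI Shapefile", "bigfork", None),
    ("GML", "bigfork_airport.gml", None),
    ("DGN", "bigfork_airport.dgn", ""),
]

for fmt, dst, select in jobs:
    args = build_ogr2ogr_args(fmt, dst, SOURCE, where=WHERE, select=select)
    # shlex.join (Python 3.8+) shows the command as you would type it.
    print(shlex.join(args))
    # To actually run it (requires ogr2ogr on your PATH):
    #   import subprocess; subprocess.run(args, check=True)
```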
ogr2ogr can also
convert data to database formats, such as PostgreSQL (using the
PostGIS spatial extension if it is available). The syntax for the
command is more complicated than simple file-based formats because
there are certain parameters that must be used to connect to the
destination database. These options aren’t discussed in detail here
but are covered in Chapter
13.
This example shows how to convert from a shapefile to a PostgreSQL database:
> ogr2ogr -f "PostgreSQL" "PG:dbname=myairports host=myhost.com user=pgusername" data/airports.shp -where "name='Bigfork Municipal Airport'"
The command in this example connects to the myairports database on a server called
myhost.com using the PostgreSQL
database user pgusername. It then
creates a table called airports.
This is the same name as the input shapefile, which is the default.
Querying the database using the PostgreSQL query tool psql shows that the conversion was
successful, as in Example
7-4.
psql> select * from airports;
-[ RECORD 1 ]+--------------------------------------------------
ogc_fid      | 1
wkb_geometry | SRID=-1;POINT(451306 5291930)
name         | Bigfork Municipal Airport
lat          | 47.7789
lon          | -93.6500
elevation    | 1343.0000
quadname     | Effie
Other tools can convert shapefiles to
PostgreSQL/PostGIS databases. shp2pgsql is a command-line program that exports a shapefile
into SQL commands. These commands can be saved into a text file or
sent directly to the database for execution. This utility comes with
PostGIS.
The Quantum GIS (QGIS) desktop GIS also has a shapefile-to-PostgreSQL/PostGIS conversion tool, available as a plug-in called SPIT. See http://www.qgis.org for more information about QGIS.
More detailed PostgreSQL usage is covered in Chapter 13, where an example problem shows how to load data into a spatial database and how to extract data from it.