There are many ways to put data into a database. One
method is to manually type SQL commands with psql to insert data into a table. You can also
use programs that convert data into an SQL script that can then be
loaded, or you can use a tool that exports data directly to the
database.
PostGIS comes with command-line tools called pgsql2shp and shp2pgsql. These can be used to convert from
a PostGIS table to a shapefile and back.
The tool shp2pgsql converts
the shapes to a text stream of SQL commands. Therefore, you need to
pipe it to a text file for later loading or to the psql command to use it immediately. For
example, use one of these methods:
>shp2pgsql countyp020.shp countyp020>mycounties.sql>psql -d project1 -f mycounties.sql
or use this one to load the data immediately to the database:
> shp2pgsql mycounties.shp mycounties | psql -d project1By default, the shp2pgsql
command puts the geometry data into a field called the_geom, whereas the default for the
ogr2ogr command (shown next) puts the geometry data into a
field called wkb_geometry. These
defaults can be overridden and changed to something more meaningful.
To acquire the countyp020 data, see
the next section.
ogr2ogr is an
excellent program for putting spatial data into a database. It’s part
of the GDAL/OGR toolkit included in FWTools and introduced in Chapters 3 and 7. This command-line tool takes any
OGR supported data layer and exports it into the database.
This example uses some data taken from the U.S. National Atlas site at http://nationalatlas.gov/atlasftp.html. Specifically, it uses the County Boundaries data at http://edcftp.cr.usgs.gov/pub/data/nationalatlas/countyp020.tar.gz.
The file is a gzip‘d tar file. On Linux, you can expand this file
using the command-line program tar:
> tar -xzvf countyp020.tar.gzThis creates a shapefile named countyp020. On Windows, most Zip programs (e.g., WinZip) can decompress this file for you.
The ogrinfo command then
provides a summary of how many features are in the file:
> ogrinfo countyp020.shp -al -so
...
Feature Count: 6138
...It is a good idea to have a feel for how much data will be converted before running the next step. The more features there are, the more time it will take to load the data to the database.
To translate this shapefile to the project1 database, use the ogr2ogr command:
> ogr2ogr -f "PostgreSQL" PG:dbname=project1 countyp020.shpThis is one of the simplest examples of using ogr2ogr with PostgreSQL/PostGIS. The first
parameter is the target data format. In this case, it is a PostgreSQL database. The prefix PG: provides more detail about the target
data source. Here, only the database name is supplied: dbname=project1.
If the database is on a different computer or requires more user access privileges, further details will be required. Multiple database parameters can be included, and you can quote them like this:
.."PG:dbname=project1 host=mypc user=tyler"..The source data filename is countyp020.shp, a shapefile data layer. The conversion process may take a minute or two to run.
As shown in Example
13-1, check to see that it was loaded successfully by going
into the psql interpreter again.
You can list the tables that are available and do a simple query to
see if all the features came across.
>psql project1Welcome to psql 7.4.2, the PostgreSQL interactive terminal. Type: \copyright for distribution terms \h for help with SQL commands \? for help on internal slash commands \g or terminate with semicolon to execute query \q to quit project1=#\dtList of relations Schema | Name | Type | Owner --------+------------------+-------+------- public | countyp020 | table | tyler public | geometry_columns | table | tyler public | spatial_ref_sys | table | tyler (3 rows) project1=#SELECT count(*) FROM countyp020;count ------- 6138 (1 row)
This reports the number of features that are in the table. Each
geographic feature, in this case, counties, has a record in the
countyp020 table.
You can specify the name you want the table to have when it is
created using the -nln parameter
followed by the name. In the previous example, it used the name of
the original data layer as the name of the output table. If you want
the output table to have a different name (counties) use the ogr2ogr command like this (all one
line):
> ogr2ogr -f "PostgreSQL" PG:dbname=project1 countyp020.shp -nln countiesThis table has more than geographic data. In the psql interpreter, the columns of data in the
table can be listed using the \d
parameter followed by the name of the table, as shown in Example 13-2.
# \d countyp020
Table "public.countyp020"
Column | Type
--------------+---------------
ogc_fid | integer
wkb_geometry | geometry
area | numeric(9,3)
perimeter | numeric(9,3)
countyp020 | numeric(9,0)
state | character(2)
county | character(50)
fips | character(5)
state_fips | character(2)
square_mil | numeric(19,3)
Check constraints:
"$1" CHECK (srid(wkb_geometry) = -1)
"$2" CHECK (geometrytype(wkb_geometry) = 'POLYGON'::text OR wkb_geometry IS NULL)Each column is listed and shows the datatype that each column
can hold. PostgreSQL databases can handle all these types without
PostGIS, except for the geometry data. Only one column in the table
contains geometry data: the wkb_geometry column has geometry listed as its type:
wkb_geometry | geometry
You can have multiple columns holding geometry data in your
table. You can even store different types of geometries (points,
lines, or polygons) in the same table by adding more fields of the
type geometry.
The ogr2ogr utility
automatically names the geometry
column wkb_geometry (which stands
for well-known binary geometry, another OGC specification), but it can
be called anything you like. The other columns hold standard database
information— numeric or character/text data.
Both the shp2pgsql and
ogr2ogr methods do some metadata
tracking by inserting records into the table called geometry_columns. This table tracks which
columns in a table include geometric features. Some applications
depend on finding entries in this table, and others don’t. For
example, MapServer doesn’t require this, but the ArcMap-PostGIS
connector (discussed at the end of this chapter) depends on it. It is
good practice to keep this table up to date. Here’s a sample entry in
the table:
-[ RECORD 1 ]-----+------------- f_table_catalog | f_table_schema | public f_table_name | countyp020 f_geometry_column | wkb_geometry coord_dimension | 3 srid | -1 type | POLYGON attrelid | 949399 varattnum | 2 stats |
This is the entry for the countyp020 table, as shown by f_table_name = countyp020. The name of the column holding
the geometric data is stored in the f_geometry_columnfield. PostGIS can hold 3D
data. coord_dimension can be set to
2 (2D) or 3 (3D). The SRID field refers to the spatial reference
system. In this example, it is set to srid =
-1, meaning none is specific. The
field type = POLYGON says that the
geometry column holds polygon features. Any application that accesses
PostGIS data can inspect the geometry_columns table instead of looking
for information from each and every table.