Now that we have a spatial database, let's see how to access it from Python. Using psycopg2 to access a spatial database from Python is quite straightforward. For example, the following code shows how to connect to the database and issue a simple query:
import psycopg2
connection = psycopg2.connect(database="...", user="...",
password="...")
cursor = connection.cursor()
cursor.execute("SELECT id,name FROM cities WHERE pop>100000")
for row in cursor:
print(row[0],row[1])The psycopg2.connect() statement opens up a connection to the database using the database name, user, and password you set up when you created and configured the database. Once you have a database connection, you then create a Cursor object against which you can execute queries. You can then retrieve the matching data, as shown in this example.
Let's use psycopg2 to store the World Borders Dataset into a spatial database table and then perform some simple queries against that data. Place a copy of the World Borders Dataset into a suitable directory, and create a new Python program called postgis_test.py inside the same directory. Enter the following into your program:
import psycopg2
from osgeo import ogr
connection = psycopg2.connect(database="<dbname>", user="<user>",
password="<password>")
cursor = connection.cursor()Don't forget to replace the <dbname>, <user>, and <password> values with the name of the database, the user account, and the password you set up earlier.
So far, we have simply opened a connection to the database. Let's create a table to hold the contents of the World Borders Dataset. To do this, add the following to the end of your program:
cursor.execute("DROP TABLE IF EXISTS borders")
cursor.execute("CREATE TABLE borders (" +
"id SERIAL PRIMARY KEY," +
"name VARCHAR NOT NULL," +
"iso_code VARCHAR NOT NULL," +
"outline GEOGRAPHY)")
cursor.execute("CREATE INDEX border_index ON borders " +
"USING GIST(outline)")
connection.commit()As you can see, we delete the database table if it exists already so that we can rerun our program without it failing. We then create a new table named borders with four fields: an id, a name, and an iso_code, all of which are standard database fields, and a spatial geography field named outline. Because we're using a geography field, we can use this field to store spatial data that uses unprojected lat/long coordinates.
The third statement creates a spatial index on the outline. In PostGIS, we use the GIST index type to define a spatial index.
Finally, because Postgres is a transactional database, we have to commit the changes we have made using the connection.commit() statement.
Now that we've defined our database table, let's add some data into it. Using the techniques we learned earlier, we'll read through the contents of the World Borders Dataset shapefile. Here is the relevant code:
shapefile = ogr.Open("TM_WORLD_BORDERS-0.3/TM_WORLD_BORDERS-0.3.shp")
layer = shapefile.GetLayer(0)
for i in range(layer.GetFeatureCount()):
feature = layer.GetFeature(i)
name = feature.GetField("NAME")
iso_code = feature.GetField("ISO3")
geometry = feature.GetGeometryRef()
wkt = geometry.ExportToWkt()All of this should be quite straightforward. Our next task is to store this information into the database. To do this, we use the INSERT command. Add the following code to your program, inside the for loop:
cursor.execute("INSERT INTO borders (name, iso_code, outline) " +
"VALUES (%s, %s, ST_GeogFromText(%s))",
(name, iso_code, wkt))Notice that psycopg2 automatically converts standard Python data types such as numbers, strings, and date/time values to the appropriate format for inserting into the database. Following the Python DB-API standard, %s is used as a placeholder to represent a value, and that value is taken from the list supplied as the second parameter to the execute() function. In other words, the first %s is replaced with the value of the name variable, the second with the value of the iso_code variable, and so on.
Because psycopg2 doesn't know about geometry data values, we have to convert the geometry into a WKT-format string and then use the ST_GeogFromText() function to convert that string back into a PostGIS geography object.
Now that we have imported all the data, we need to commit the changes we have made to the database. To do this, add the following statement to the end of your program (outside the for loop):
connection.commit()
If you run this program, it will take about 30 seconds to import all the data into the database, but nothing else will happen. To prove that it worked, let's perform a simple spatial query against the imported data—in this case, we want to find all countries that are within 500 kilometers of Zurich, in Switzerland. Let's start by defining the latitude and longitude for Zurich and the desired search radius in meters. Add the following to the end of your program:
start_long = 8.542 start_lat = 47.377 radius = 500000
We can now perform our spatial query using the ST_DWithin() query function, like this:
cursor.execute("SELECT name FROM borders WHERE ST_DWithin(" +
"ST_MakePoint(%s, %s), outline, %s)",
(start_long, start_lat, radius))
for row in cursor:
print(row[0])The ST_DWithin() function finds all records within the borders table that have an outline within radius meters of the given lat/long value. Notice that we use the ST_MakePoint() function to convert the latitude and longitude value to a Point geometry, allowing us to compare the outline against the given point.
Running this program will import all the data and show us the list of countries that are within 500 kilometers of Zurich:
Luxembourg Monaco San Marino Austria Czech Republic France Germany Croatia Italy Liechtenstein Belgium Netherlands Slovenia Switzerland
While there is a lot more we could do, this program should show you how to use PostGIS to create a spatial database, insert data into it, and query against that data, all done using Python code.
Because PostGIS is an extension to PostgreSQL and you use psycopg2 to access it, there are three separate sets of documentation you will need to refer to:
psycopg2 documentation: http://initd.org/psycopg/docsOf these, the PostGIS manual is probably going to be the most useful, and you will also need to refer to the psycopg2 documentation to find out the details of using PostGIS from Python. You will probably also need to refer to the PostgreSQL manual to learn the non-spatial aspects of using PostGIS, though be aware that this manual is huge and extremely complex, reflecting the complexity of PostgreSQL itself.
PostGIS supports a number of advanced features which you may find useful:
A overlaps B, A contains B, and A is to the left of B. These comparison operators make use of spatial indexes to identify matching features extremely quickly.PostGIS has a well-deserved reputation for being a geospatial powerhouse. While it is not the only freely available spatial database, it is easily the most powerful and useful, and we will be using it extensively throughout this book.