Table of Contents for
Python Geospatial Development - Third Edition

Version ebook / Retour

Cover image for bash Cookbook, 2nd Edition Python Geospatial Development - Third Edition by Erik Westra Published by Packt Publishing, 2016
  1. Cover
  2. Table of Contents
  3. Python Geospatial Development Third Edition
  4. Python Geospatial Development Third Edition
  5. Credits
  6. About the Author
  7. About the Reviewer
  8. www.PacktPub.com
  9. Preface
  10. What you need for this book
  11. Who this book is for
  12. Conventions
  13. Reader feedback
  14. Customer support
  15. 1. Geospatial Development Using Python
  16. Geospatial development
  17. Applications of geospatial development
  18. Recent developments
  19. Summary
  20. 2. GIS
  21. GIS data formats
  22. Working with GIS data manually
  23. Summary
  24. 3. Python Libraries for Geospatial Development
  25. Dealing with projections
  26. Analyzing and manipulating Geospatial data
  27. Visualizing geospatial data
  28. Summary
  29. 4. Sources of Geospatial Data
  30. Sources of geospatial data in raster format
  31. Sources of other types of geospatial data
  32. Choosing your geospatial data source
  33. Summary
  34. 5. Working with Geospatial Data in Python
  35. Working with geospatial data
  36. Changing datums and projections
  37. Performing geospatial calculations
  38. Converting and standardizing units of geometry and distance
  39. Exercises
  40. Summary
  41. 6. Spatial Databases
  42. Spatial indexes
  43. Introducing PostGIS
  44. Setting up a database
  45. Using PostGIS
  46. Recommended best practices
  47. Summary
  48. 7. Using Python and Mapnik to Generate Maps
  49. Creating an example map
  50. Mapnik concepts
  51. Summary
  52. 8. Working with Spatial Data
  53. Designing and building the database
  54. Downloading and importing the data
  55. Implementing the DISTAL application
  56. Using DISTAL
  57. Summary
  58. 9. Improving the DISTAL Application
  59. Dealing with the scale problem
  60. Performance
  61. Summary
  62. 10. Tools for Web-based Geospatial Development
  63. A closer look at three specific tools and techniques
  64. Summary
  65. 11. Putting It All Together – a Complete Mapping System
  66. Designing the ShapeEditor
  67. Prerequisites
  68. Setting up the database
  69. Setting up the ShapeEditor project
  70. Defining the ShapeEditor's applications
  71. Creating the shared application
  72. Defining the data models
  73. Playing with the admin system
  74. Summary
  75. 12. ShapeEditor – Importing and Exporting Shapefiles
  76. Importing shapefiles
  77. Exporting shapefiles
  78. Summary
  79. 13. ShapeEditor – Selecting and Editing Features
  80. Editing features
  81. Adding features
  82. Deleting features
  83. Deleting shapefiles
  84. Using the ShapeEditor
  85. Further improvements and enhancements
  86. Summary
  87. Index

Working with geospatial data

In this section, we will look at some examples of tasks you might want to perform that involve using geospatial data in both vector and raster format.

Task – calculate the bounding box for each country in the world

In this slightly contrived example, we will make use of a shapefile to calculate the minimum and maximum latitude/longitude values for each country in the world. This "bounding box" can be used, among other things, to generate a map centered on a particular country. For example, the bounding box for Turkey would look like this:

Task – calculate the bounding box for each country in the world

Start by downloading the World Borders Dataset from http://thematicmapping.org/downloads/world_borders.php. Make sure you download the file named TM_WORLD_BORDERS-0.3.zip rather than the simplified TM_WORLD_BORDERS_SIMPL-0.3.zip file. Decompress the .zip archive and place the various files that make up the shapefile (the .dbf, .prj, .shp, and .shx files) together in a suitable directory.

Next, we need to create a Python program that can read the borders for each country. Fortunately, using OGR to read through the contents of a shapefile is simple:

from osgeo import ogr

shapefile = ogr.Open("TM_WORLD_BORDERS-0.3.shp")
layer = shapefile.GetLayer(0)

for i in range(layer.GetFeatureCount()):
    feature = layer.GetFeature(i)

Note

This complete program, along with all the other examples in this chapter, can be downloaded as part of the sample code for this chapter.

The feature will consist of a geometry defining the outline of the country, along with a set of attributes providing information about that country. According to the Readme.txt file, the attributes in this shapefile include the ISO-3166 three-letter code for the country (in a field called ISO3) as well as the name for the country (in a field called NAME). This allows us to obtain the country code and name for the feature:

countryCode = feature.GetField("ISO3")
countryName = feature.GetField("NAME")

We can also obtain the country's border polygon using the GetGeometryRef() method:

geometry = feature.GetGeometryRef()

To solve this task, we will need to calculate the bounding box for this geometry. OGR uses the term envelope to refer to the geometry's bounding box, so we can get the bounding box in the following way:

minLong,maxLong,minLat,maxLat = geometry.GetEnvelope()

Let's put all this together into a complete working program:

# calcBoundingBoxes.py

from osgeo import ogr

shapefile = ogr.Open("TM_WORLD_BORDERS-0.3.shp")
layer = shapefile.GetLayer(0)

countries = [] # List of (code, name, minLat, maxLat,
               # minLong, maxLong) tuples.

for i in range(layer.GetFeatureCount()):
    feature = layer.GetFeature(i)
    countryCode = feature.GetField("ISO3")
    countryName = feature.GetField("NAME")
    geometry = feature.GetGeometryRef()
    minLong,maxLong,minLat,maxLat = geometry.GetEnvelope()

    countries.append((countryName, countryCode,
                      minLat, maxLat, minLong, maxLong))

countries.sort()

for name,code,minLat,maxLat,minLong,maxLong in countries:
    print("{} ({}) lat={:.4f}..{:.4f},long={:.4f}..{:.4f}"
        .format(name,code,minLat,maxLat,minLong,maxLong))

Note

If you aren't storing the TM_WORLD_BORDERS-0.3.shp shapefile in the same directory as the script itself, you will need to add the directory where the shapefile is stored to your ogr.Open() call.

Running this program produces the following output:

% python calcBoundingBoxes.py
Afghanistan (AFG) lat=29.4061..38.4721, long=60.5042..74.9157
Albania (ALB) lat=39.6447..42.6619, long=19.2825..21.0542
Algeria (DZA) lat=18.9764..37.0914, long=-8.6672..11.9865
...

Task – calculate the border between Thailand and Myanmar

In this recipe, we will make use of the World Borders Dataset to obtain polygons defining the borders of Thailand and Myanmar. We will then transfer these polygons into Shapely and use Shapely's capabilities to calculate the common border between these two countries, saving the result into another shapefile.

If you haven't already done so, download the World Borders Dataset from the Thematic Mapping web site, http://thematicmapping.org/downloads/world_borders.php.

The World Borders Dataset conveniently includes the ISO-3166 two-character country code for each feature, so we can identify the features corresponding to Thailand and Myanmar as we read through the shapefile. We can then extract the outline of each country, converting the outlines into Shapely geometry objects. Here is the relevant code, which forms the start of our calcCommonBorders.py program:

# calcCommonBorders.py

import os, os.path, shutil
from osgeo import ogr, osr
import shapely.wkt

shapefile = ogr.Open("TM_WORLD_BORDERS-0.3.shp")
layer = shapefile.GetLayer(0)

thailand = None
myanmar = None

for i in range(layer.GetFeatureCount()):
    feature = layer.GetFeature(i)
    if feature.GetField("ISO2") == "TH":
        geometry = feature.GetGeometryRef()
        thailand = shapely.wkt.loads(geometry.ExportToWkt())
    elif feature.GetField("ISO2") == "MM":
        geometry = feature.GetGeometryRef()
        myanmar = shapely.wkt.loads(geometry.ExportToWkt())

Tip

Once again, this code assumes that you have placed the TM_WORLD_BORDERS-0.3.shp shapefile in the same directory as the Python script. If you've placed it into a different directory, you'll need to adjust the ogr.Open() statement to match it.

Now that we have the country outlines in Shapely, we can use Shapely's computational geometry capabilities to calculate the common border between these two countries:

commonBorder = thailand.intersection(myanmar)

The result will be a LineString (or a MultiLineString if the border is broken up into more than one part). Let's now save this common border into a new shapefile:

if os.path.exists("common-border"):
    shutil.rmtree("common-border")
os.mkdir("common-border")

spatialReference = osr.SpatialReference()
spatialReference.SetWellKnownGeogCS('WGS84')

driver = ogr.GetDriverByName("ESRI Shapefile")
dstPath = os.path.join("common-border", "border.shp")
dstFile = driver.CreateDataSource(dstPath)
dstLayer = dstFile.CreateLayer("layer", spatialReference)

wkt = shapely.wkt.dumps(commonBorder)

feature = ogr.Feature(dstLayer.GetLayerDefn())
feature.SetGeometry(ogr.CreateGeometryFromWkt(wkt))
dstLayer.CreateFeature(feature)

If we were to display the results on a map, we could clearly see the common border between these two countries:

Task – calculate the border between Thailand and Myanmar

Note

We will learn how to draw maps like this in Chapter 7, Using Python and Mapnik to Generate Maps.

The contents of the common-border/border.shp shapefile is represented by the heavy line along the countries' common border.

Note that we will use this shapefile later in this chapter to calculate the length of the Thailand-Myanmar border, so make sure you generate and keep a copy of this shapefile.

Task – analyze elevations using a digital elevation map

A digital elevation map (DEM) is a raster geospatial data format where each pixel value (or cell) represents the height of a point on the earth's surface. We encountered DEM files in the previous chapter, where we saw two examples of data sources that supply this type of information: the National Elevation Dataset covering the United States, and GLOBE which provides DEM files covering the entire Earth.

Because a DEM file contains elevation data, it can be interesting to analyze the elevation values for a given area. For example, we could draw a histogram showing how much of a country's area is at a certain elevation. Let's take some DEM data from the GLOBE dataset and calculate an elevation histogram using that data.

To keep things simple, we will choose a small country surrounded by ocean: New Zealand.

Note

We're using a small country so that we don't have too much data to work with. We're also using a country surrounded by ocean so that we can check all the points within a bounding box rather than having to use a polygon to exclude points outside of the country's boundaries. This makes our task much easier.

To download the DEM data, go to the GLOBE web site (http://www.ngdc.noaa.gov/mgg/topo/globe.html) and click on the Get Data Online hyperlink. We're going to use the data already calculated for this area of the world, so click on the Any or all 16 "tiles" hyperlink. New Zealand is in tile L, so click on the hyperlink for this tile to download it.

The file you download will be called l10g.zip (or l10g.gz if you chose to download the tile in GZIP format). If you decompress it, you will end up with a single file called l10g, containing the raw elevation data.

By itself, this file isn't very useful—it needs to be georeferenced onto the earth's surface so that you can match up each elevation value with its position on the earth. To do this, you need to download the associated header file. Suitable header files for the premade tiles can be found at http://www.ngdc.noaa.gov/mgg/topo/elev/ esri/hdr. Download the file named l10g.hdr and place it into the same directory as the l10g file you downloaded earlier. You can then read the DEM file using GDAL, just like we did earlier:

from osgeo import gdal
dataset = gdal.Open("l10g")

You probably noticed when you downloaded the l10g tile that it covers much more than just New Zealand—all of Australia is included, as well as Malaysia, Papua New Guinea, and several other east-Asian countries. To work with the elevation data for just New Zealand, we will need to identify the relevant portion of the raster DEM, that is, the range of x and y coordinates that cover New Zealand. We will start by looking at a map and identifying the minimum and maximum latitude/longitude values that enclose all of New Zealand, but no other country:

Task – analyze elevations using a digital elevation map

Rounded to the nearest degree, we get a longitude/latitude bounding box of (165, -48)...(179, -33). This is the area we want to scan to cover all of New Zealand.

There is, however, a problem: the raster data consists of pixels or "cells" identified by x and y coordinates, not longitude and latitude values. We have to convert from longitudes and latitudes to x and y coordinates. To do this, we need to make use of GDAL's coordinate transformation features.

We'll start by obtaining our dataset's coordinate transformation:

t = dataset.GetGeoTransform()

Using this transformation, we can convert an (x, y) coordinate to its corresponding latitude and longitude value. In this case, however, we want to do the opposite—we want to take a latitude and longitude and calculate the associated x and y coordinates. To do this, we have to invert the transformation. Once again, GDAL will do this for us:

success,tInverse = gdal.InvGeoTransform(t)
if not success:
    print("Failed!")
    sys.exit(1)

Tip

There are some cases where a transformation can't be inverted. This is why gdal.InvGeoTransform() returns a success flag as well as the inverted transformation. With this particular set of DEM data, however, the transformation should always be invertible, so this is unlikely to happen.

Now that we have the inverse transformation, it is possible to convert from a latitude and longitude to an x and y coordinate, like this:

x,y = gdal.ApplyGeoTransform(tInverse, longitude, latitude)

Using this, we can finally identify the minimum and maximum (x, y) coordinates that cover the area we are interested in:

x1,y1 = gdal.ApplyGeoTransform(tInverse, minLong, minLat)
x2,y2 = gdal.ApplyGeoTransform(tInverse, maxLong, maxLat)

minX = int(min(x1, x2))
maxX = int(max(x1, x2))
minY = int(min(y1, y2))
maxY = int(max(y1, y2))

Now that we know the x and y coordinates for the portion of the DEM that we're interested in, we can use GDAL to read in the individual height values. We start by obtaining the raster band that contains the DEM data:

band = dataset.GetRasterBand(1)

Note

GDAL band numbers start from 1. There is only one raster band in the DEM data we're using.

We next want to read the elevation values from our raster band. If you remember, there are two ways you can use GDAL to read raster data: as raw binary data, which you then extract using the struct module, or using the NumPy library to automatically read data into an array.

Let's assume that you don't have NumPy installed; we'll use the struct method instead. Using the technique we covered in Chapter 3, Python Libraries for Geospatial Development, we can use the ReadRaster() method to read the raw binary data and then the struct.unpack() function to convert the binary data to a list of elevation values. Here is the relevant code:

fmt = "<" + ("h" * width)
scanline = band.ReadRaster(X, y, width, height,
                           width, height, gdalconst.GDT_Int16)
values = struct.unpack(fmt, scanline)

Putting all this together, we can use GDAL to open the raster data file and read all the elevation values within the bounding box surrounding New Zealand. We'll then count how often each elevation value occurs and print out a simple histogram using the elevation values. Here is the resulting code:

# histogram.py

import sys, struct
from osgeo import gdal
from osgeo import gdalconst

minLat  = -48
maxLat  = -33
minLong = 165
maxLong = 179

dataset = gdal.Open("l10g")
band = dataset.GetRasterBand(1)

t = dataset.GetGeoTransform()
success,tInverse = gdal.InvGeoTransform(t)
if not success:
    print("Failed!")
    sys.exit(1)

x1,y1 = gdal.ApplyGeoTransform(tInverse, minLong, minLat)
x2,y2 = gdal.ApplyGeoTransform(tInverse, maxLong, maxLat)

minX = int(min(x1, x2))
maxX = int(max(x1, x2))
minY = int(min(y1, y2))
maxY = int(max(y1, y2))

width = (maxX - minX) + 1
fmt = "<" + ("h" * width)

histogram = {} # Maps elevation to number of occurrences.

for y in range(minY, maxY+1):
    scanline = band.ReadRaster(minX, y, width, 1,
                               width, 1,
                               gdalconst.GDT_Int16)
    values = struct.unpack(fmt, scanline)

    for value in values:
        try:
            histogram[value] += 1
        except KeyError:
            histogram[value] = 1

for height in sorted(histogram.keys()):
    print(height, histogram[height])

Tip

Don't forget to add a directory path to the gdal.Open() statement if you placed the l10g file in a different directory.

If you run this, you will see a list of heights (in meters) and how many cells have that height value:

-500 2607581
1 6641
2 909
3 1628
...
3097 1
3119 2
3173 1

This reveals one final problem: there is a large number of cells with an elevation value of -500. What is going on here? Clearly, -500 is not a valid height value. The GLOBE documentation provides the answer:

"Every tile contains values of -500 for oceans, with no values between -500 and the minimum value for land noted here."

So all those points with a value of -500 represent cells that are over the ocean. If you remember from our description of the GDAL library in Chapter 3, Python Libraries for Geospatial Development, GDAL includes a no data value for exactly this purpose. Indeed, the gdal.RasterBand object has a GetNoDataValue() method that will tell us what this value is. This allows us to add the following highlighted line to our program to exclude the cells without any data:

for value in values:
    if value != band.GetNoDataValue():
        try:
            histogram[value] += 1
        except KeyError:
            histogram[value] = 1

This finally gives us a histogram of the heights across New Zealand. You could create a graph using this data if you wished. For example, the following chart shows the total number of pixels at or below a given height:

Task – analyze elevations using a digital elevation map