The examples in this chapter use spatial data from the OpenStreetMap (OSM) project, a Wikipedia for maps. It's a convenient data source because it's free for everyone to use and download, and it has global coverage (although the quality may vary). The OSM project itself only serves a full dump (called planet.osm), which is huge and hard to process. Luckily, third-party services offering smaller extracts exist.
At the time of writing, there were three notable services:
- Mapzen (https://mapzen.com/data/metro-extracts/) offers ready-made extracts of metropolitan areas around the world for anyone, and custom extracts for registered users.
- Geofabrik (http://download.geofabrik.de/) offers continental and country extracts. No sign-in is required.
- BBBike (http://extract.bbbike.org/) offers custom extracts of medium-sized areas (up to 24 million square kilometers or 768 MB of data). No sign-in is required, but a valid e-mail address is. As the extracts are generated on demand, it takes a couple of minutes before an extract is ready and its unique download URL is provided.
For the purpose of this chapter's examples, let's pick a city or county-sized extract (so the BBBike and Mapzen services are the best fit) in PBF format. This is an OSM-specific exchange format.
After downloading, the file can be imported using the ogr2ogr command-line tool:
ogr2ogr -t_srs EPSG:32633 -f PostgreSQL "PG:dbname=mastering_postgis host=localhost user=osm password=osm" planet_17.894_49.888_ef55391f.osm.pbf
Replace the database credentials and PBF file name with yours, and change the EPSG code to one for a projection appropriate for your area of interest.
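If a UTM projection suits your area of interest, the EPSG code can be derived from the extract's approximate center coordinates. The following is a minimal sketch under that assumption, not a definitive recipe: the function name utm_epsg is hypothetical, the coordinates and credentials are the placeholder values from the example above, and the special UTM zones around Norway and Svalbard are ignored. The script only prints the resulting ogr2ogr command so that it can be reviewed before running.

```shell
#!/bin/sh
# Sketch: derive a WGS 84 / UTM EPSG code from a longitude/latitude pair.
# Assumption: a UTM projection is appropriate for the area of interest.
# utm_epsg is an illustrative helper name, not part of GDAL.
utm_epsg() {
    # $1 = longitude, $2 = latitude, both in decimal degrees
    awk -v lon="$1" -v lat="$2" 'BEGIN {
        zone = int((lon + 180) / 6) + 1     # UTM zones are 6 degrees wide
        if (zone > 60) zone = 60            # longitude of exactly 180
        base = (lat >= 0) ? 32600 : 32700   # northern or southern hemisphere
        print base + zone
    }'
}

# Placeholder values: replace with your own file, database, and coordinates.
PBF_FILE="planet_17.894_49.888_ef55391f.osm.pbf"
PG_CONN="PG:dbname=mastering_postgis host=localhost user=osm password=osm"
EPSG="EPSG:$(utm_epsg 17.894 49.888)"

# Print the command for review instead of executing it.
echo ogr2ogr -t_srs "$EPSG" -f PostgreSQL "\"$PG_CONN\"" "$PBF_FILE"
```

For the coordinates embedded in the example file name (longitude 17.894, latitude 49.888), the helper yields zone 33 in the northern hemisphere, which corresponds to the EPSG:32633 code used in the command above.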
This is convenient, as ogr2ogr is widely used for spatial data conversion, but it is not particularly efficient in terms of either processing power or the disk space required. For larger extracts (country or continental), or even a full dump, another OSM import tool such as osm2pgsql or Imposm is required. This is, however, outside the scope of this book.
The import tool creates one table per geometry type: points, lines, MultiPolygons, and MultiLineStrings. The columns of each table correspond to the most commonly used OSM tags.
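A quick way to confirm that the import succeeded is to count the rows in each table. The sketch below assumes the tables are named after the geometry types listed above (points, lines, multipolygons, multilinestrings); check the actual names in your database, as they may differ. The function name count_sql is hypothetical. The script only prints the SQL statements; the commented line shows how they might be piped to psql.

```shell
#!/bin/sh
# Assumption: the imported tables carry these names; adjust if yours differ.
TABLES="points lines multipolygons multilinestrings"

# Emit one row-count query per table (count_sql is an illustrative name).
count_sql() {
    for t in $TABLES; do
        printf "SELECT '%s' AS table_name, count(*) AS row_count FROM %s;\n" "$t" "$t"
    done
}

# Print the statements for review.
count_sql

# To run them against the database (assumes psql is installed and the
# placeholder credentials from the import command apply):
# count_sql | psql -h localhost -U osm -d mastering_postgis
```

An empty table is not necessarily an error: a small extract may simply contain no features of that geometry type.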