Table of Contents for
QGIS: Becoming a GIS Power User


QGIS: Becoming a GIS Power User by Alexander Bruy, published by Packt Publishing, 2017
  1. Cover
  2. Table of Contents
  3. QGIS: Becoming a GIS Power User
  4. QGIS: Becoming a GIS Power User
  5. QGIS: Becoming a GIS Power User
  6. Credits
  7. Preface
  8. What you need for this learning path
  9. Who this learning path is for
  10. Reader feedback
  11. Customer support
  12. 1. Module 1
  13. 1. Getting Started with QGIS
  14. Running QGIS for the first time
  15. Introducing the QGIS user interface
  16. Finding help and reporting issues
  17. Summary
  18. 2. Viewing Spatial Data
  19. Dealing with coordinate reference systems
  20. Loading raster files
  21. Loading data from databases
  22. Loading data from OGC web services
  23. Styling raster layers
  24. Styling vector layers
  25. Loading background maps
  26. Dealing with project files
  27. Summary
  28. 3. Data Creation and Editing
  29. Working with feature selection tools
  30. Editing vector geometries
  31. Using measuring tools
  32. Editing attributes
  33. Reprojecting and converting vector and raster data
  34. Joining tabular data
  35. Using temporary scratch layers
  36. Checking for topological errors and fixing them
  37. Adding data to spatial databases
  38. Summary
  39. 4. Spatial Analysis
  40. Combining raster and vector data
  41. Vector and raster analysis with Processing
  42. Leveraging the power of spatial databases
  43. Summary
  44. 5. Creating Great Maps
  45. Labeling
  46. Designing print maps
  47. Presenting your maps online
  48. Summary
  49. 6. Extending QGIS with Python
  50. Getting to know the Python Console
  51. Creating custom geoprocessing scripts using Python
  52. Developing your first plugin
  53. Summary
  54. 2. Module 2
  55. 1. Exploring Places – from Concept to Interface
  56. Acquiring data for geospatial applications
  57. Visualizing GIS data
  58. The basemap
  59. Summary
  60. 2. Identifying the Best Places
  61. Raster analysis
  62. Publishing the results as a web application
  63. Summary
  64. 3. Discovering Physical Relationships
  65. Spatial join for a performant operational layer interaction
  66. The CartoDB platform
  67. Leaflet and an external API: CartoDB SQL
  68. Summary
  69. 4. Finding the Best Way to Get There
  70. OpenStreetMap data for topology
  71. Database importing and topological relationships
  72. Creating the travel time isochron polygons
  73. Generating the shortest paths for all students
  74. Web applications – creating safe corridors
  75. Summary
  76. 5. Demonstrating Change
  77. TopoJSON
  78. The D3 data visualization library
  79. Summary
  80. 6. Estimating Unknown Values
  81. Interpolated model values
  82. A dynamic web application – OpenLayers AJAX with Python and SpatiaLite
  83. Summary
  84. 7. Mapping for Enterprises and Communities
  85. The cartographic rendering of geospatial data – MBTiles and UTFGrid
  86. Interacting with Mapbox services
  87. Putting it all together
  88. Going further – local MBTiles hosting with TileStream
  89. Summary
  90. 3. Module 3
  91. 1. Data Input and Output
  92. Finding geospatial data on your computer
  93. Describing data sources
  94. Importing data from text files
  95. Importing KML/KMZ files
  96. Importing DXF/DWG files
  97. Opening a NetCDF file
  98. Saving a vector layer
  99. Saving a raster layer
  100. Reprojecting a layer
  101. Batch format conversion
  102. Batch reprojection
  103. Loading vector layers into SpatiaLite
  104. Loading vector layers into PostGIS
  105. 2. Data Management
  106. Joining layer data
  107. Cleaning up the attribute table
  108. Configuring relations
  109. Joining tables in databases
  110. Creating views in SpatiaLite
  111. Creating views in PostGIS
  112. Creating spatial indexes
  113. Georeferencing rasters
  114. Georeferencing vector layers
  115. Creating raster overviews (pyramids)
  116. Building virtual rasters (catalogs)
  117. 3. Common Data Preprocessing Steps
  118. Converting points to lines to polygons and back – QGIS
  119. Converting points to lines to polygons and back – SpatiaLite
  120. Converting points to lines to polygons and back – PostGIS
  121. Cropping rasters
  122. Clipping vectors
  123. Extracting vectors
  124. Converting rasters to vectors
  125. Converting vectors to rasters
  126. Building DateTime strings
  127. Geotagging photos
  128. 4. Data Exploration
  129. Listing unique values in a column
  130. Exploring numeric value distribution in a column
  131. Exploring spatiotemporal vector data using Time Manager
  132. Creating animations using Time Manager
  133. Designing time-dependent styles
  134. Loading BaseMaps with the QuickMapServices plugin
  135. Loading BaseMaps with the OpenLayers plugin
  136. Viewing geotagged photos
  137. 5. Classic Vector Analysis
  138. Selecting optimum sites
  139. Dasymetric mapping
  140. Calculating regional statistics
  141. Estimating density heatmaps
  142. Estimating values based on samples
  143. 6. Network Analysis
  144. Creating a simple routing network
  145. Calculating the shortest paths using the Road graph plugin
  146. Routing with one-way streets in the Road graph plugin
  147. Calculating the shortest paths with the QGIS network analysis library
  148. Routing point sequences
  149. Automating multiple route computation using batch processing
  150. Matching points to the nearest line
  151. Creating a routing network for pgRouting
  152. Visualizing the pgRouting results in QGIS
  153. Using the pgRoutingLayer plugin for convenience
  154. Getting network data from the OSM
  155. 7. Raster Analysis I
  156. Using the raster calculator
  157. Preparing elevation data
  158. Calculating a slope
  159. Calculating a hillshade layer
  160. Analyzing hydrology
  161. Calculating a topographic index
  162. Automating analysis tasks using the graphical modeler
  163. 8. Raster Analysis II
  164. Calculating NDVI
  165. Handling null values
  166. Setting extents with masks
  167. Sampling a raster layer
  168. Visualizing multispectral layers
  169. Modifying and reclassifying values in raster layers
  170. Performing supervised classification of raster layers
  171. 9. QGIS and the Web
  172. Using web services
  173. Using WFS and WFS-T
  174. Searching CSW
  175. Using WMS and WMS Tiles
  176. Using WCS
  177. Using GDAL
  178. Serving web maps with the QGIS server
  179. Scale-dependent rendering
  180. Hooking up web clients
  181. Managing GeoServer from QGIS
  182. 10. Cartography Tips
  183. Using Rule Based Rendering
  184. Handling transparencies
  185. Understanding the feature and layer blending modes
  186. Saving and loading styles
  187. Configuring data-defined labels
  188. Creating custom SVG graphics
  189. Making pretty graticules in any projection
  190. Making useful graticules in printed maps
  191. Creating a map series using Atlas
  192. 11. Extending QGIS
  193. Defining custom projections
  194. Working near the dateline
  195. Working offline
  196. Using the QspatiaLite plugin
  197. Adding plugins with Python dependencies
  198. Using the Python console
  199. Writing Processing algorithms
  200. Writing QGIS plugins
  201. Using external tools
  202. 12. Up and Coming
  203. Preparing LiDAR data
  204. Opening File Geodatabases with the OpenFileGDB driver
  205. Using Geopackages
  206. The PostGIS Topology Editor plugin
  207. The Topology Checker plugin
  208. GRASS Topology tools
  209. Hunting for bugs
  210. Reporting bugs
  211. Bibliography
  212. Index

Chapter 5. Demonstrating Change

In this chapter, we will explore visualization and analytical techniques for examining the relationships between place and time, and between places themselves.

The data derived from temporal and spatial relationships is useful in learning more about the geographic objects that we are studying—from hydrological features to population units. This is particularly true if the data is not directly available for the geographic object of interest: either for a particular variable, for a particular time, or at all.

In this example, we will look at the demographic data from the US Census applied to the State House Districts, for election purposes. Elected officials often want to understand how the neighborhoods in their jurisdictions are changing demographically. Are their constituents becoming younger or more affluent? Is unemployment rising? Demographic factors can be used to predict the issues that will be of interest to potential voters and thus may be used for promotional purposes by the campaigns.

In this chapter, we will cover the following topics:

  • Using spatial relationships to leverage data
  • Preparing data relationships for static production
  • Vector simplification
  • Using TopoJSON for vector data size reduction and performance
  • Data visualization with the D3 library and its API
  • Animated time series maps

Leveraging spatial relationships

So far, we've looked at the methods of analysis that take advantage of the continuity of the gridded raster data or of the geometric formality of the topological network data.

For ordinary vector data, we need a more abstract method of analysis: establishing formal relationships based on the spatial arrangement of geometric objects.

For most of this section, we will gather and prepare the data in ways that will be familiar. When we get to preparing the boundary data, deriving data for the State House Districts from the census tracts, we will be covering new territory: using spatial relationships to construct the data for a given geographic unit.

Gathering the data

First, we will gather data from the sections of the US Census website. Though this workflow will be particularly useful for those working with the US demographic data, it will also be instructive for those dealing with any kind of data linked to geographic boundaries.

To begin with, obtain the boundary data with a unique identifier. After doing this, obtain the tabular data with the same unique identifier and then join on the identifier.
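As a concrete (if simplified) illustration of joining on the identifier, the same idea can be expressed in a few lines of Python. This is only a sketch; the file and column names below are hypothetical placeholders, not files from this exercise.

    # Sketch: join a tabular dataset to boundary attributes on a shared
    # identifier (GEOID). File and column names are hypothetical.
    import csv

    population = {}
    with open('population_by_tract.csv', newline='') as f:
        for row in csv.DictReader(f):
            population[row['GEOID']] = row['TOTAL_POP']

    with open('tract_attributes.csv', newline='') as src, \
         open('tract_joined.csv', 'w', newline='') as out:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(out, fieldnames=reader.fieldnames + ['TOTAL_POP'])
        writer.writeheader()
        for row in reader:
            row['TOTAL_POP'] = population.get(row['GEOID'], '')
            writer.writerow(row)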

Boundaries

Download the 2014 TIGER/Line Census Tracts and State Legislative Districts from the US Census at https://www.census.gov/geo/maps-data/data/tiger-line.html.

  1. Select 2014 from the tabs displayed; this should be the default year.
  2. Click on the Download accordion heading and click on Web interface.
  3. Under Select a layer type, select Census Tracts and click on submit; under Census Tract, select Pennsylvania and click on Download.
  4. Use the back arrow if necessary to select State Legislative Districts, and click on submit; select Pennsylvania for State Legislative Districts - Lower Chamber (current) and click on Download.
  5. Move both the downloaded archives to c5/data/original and extract them.

Tip

We've only downloaded a single boundary dataset for this exercise. Since the boundaries are not consistent every year, you would want to download and work further with each separate annual boundary file in an actual project.
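If you prefer to script this step, the two archives can also be fetched and extracted with a few lines of Python. This is only a sketch; the URLs follow the Census TIGER/Line download layout for 2014 and should be treated as assumptions to verify before use.

    # Sketch: download and extract the two 2014 TIGER/Line archives for
    # Pennsylvania (FIPS 42). The URLs are assumptions; verify them first.
    import io
    import urllib.request
    import zipfile

    urls = [
        'https://www2.census.gov/geo/tiger/TIGER2014/TRACT/tl_2014_42_tract.zip',
        'https://www2.census.gov/geo/tiger/TIGER2014/SLDL/tl_2014_42_sldl.zip',
    ]
    for url in urls:
        name = url.rsplit('/', 1)[-1][:-len('.zip')]
        with urllib.request.urlopen(url) as resp:
            archive = zipfile.ZipFile(io.BytesIO(resp.read()))
        archive.extractall('c5/data/original/' + name)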

Tabular data from American FactFinder

Many different demographic datasets are available on the American FactFinder site. These complement the TIGER/Line data mentioned before with the attribute data for the TIGER/Line geographic boundaries. The main trick is to select the matching geographic boundary level and extent between the attribute and the geographic boundary data. Perform the following steps:

  1. Go to the US Census American FactFinder site at http://factfinder.census.gov.
  2. Click on the ADVANCED SEARCH tab.
  3. In the topic or table name input, enter White and select B02008: WHITE ALONE OR IN COMBINATION WITH ONE OR MORE RACES in the suggested options. Then, click on GO.
  4. From the sidebar, in the Select a geographic type: dropdown in the Geographies section, select Census Tract - 140.
  5. Under select a state, select Pennsylvania; under Select a county, select Philadelphia; and under Select one or more geographic areas and click Add to Your Selections:, select All Census Tracts within Philadelphia County, Pennsylvania. Then, click on ADD TO YOUR SELECTIONS.
  6. From the sidebar, go to the Topics section. Here, in the Select Topics to add to 'Your Selections' under Year, click on each year available from 2009 to 2013, adding each to Your Selections to be then downloaded.
  7. Check each of the five datasets offered under the Search Results tab. All checked datasets are added to the selection to be downloaded, as shown in the following screenshot:
    [Screenshot: Tabular data from American FactFinder]
  8. Now, remove B02008: WHITE ALONE OR IN COMBINATION WITH ONE OR MORE RACES from the search filter showing selections in the upper-left corner of the page.
  9. Enter total into the topic or table name field, selecting B01003: TOTAL POPULATION from the suggested datasets, and then click on GO.
  10. Select the five 2009 to 2013 total population 5-year estimates and then click on GO.
    [Screenshot: Tabular data from American FactFinder]
  11. Click on Download to download these 10 datasets, as shown in the preceding screenshot.
  12. Once you see the Your file is complete message, click on DOWNLOAD again to download the files. These will download as an aff_download.zip archive.
  13. Move this archive to c5/data/original and then extract it.

Preparing and exporting the data

First, we will cover the steps for tabular data preparation and exporting, which are fairly similar to those we've done before. Next, we will cover the steps for preparing the boundary data, which will be more novel. We need to prepare this data based on the spatial relationships between layers, which requires SpatiaLite, since this cannot easily be done with the out-of-the-box or plugin functionality in QGIS.

The tabular data

Our tabular data is of the census tract white population. We only need to have the parseable latitude and longitude fields in this data for plotting later and, therefore, can leave it in this generic tabular format.

Combining it yearly

To combine this yearly data, we can join each table to the tract layer on a common GEOID field in QGIS (a scripted equivalent is sketched after these steps). Perform the following steps:

  1. Open QGIS and import the boundary shapefiles (the tracts and state house boundaries) and data tables (all the census tract years downloaded). Each boundary shapefile will be in its extracted directory with the .shp extension. Data tables will be named something similar to x_with_ann.csv. You need to do this the same way you did earlier, which was through Add Vector Layer under the Layer menu. Here is a list of all the files to add:
    • tl_2014_42_tract.shp
    • tl_2014_42_sldl.shp
    • ACS_09_5YR_B01003_with_ann.csv
    • ACS_10_5YR_B01003_with_ann.csv
    • ACS_11_5YR_B01003_with_ann.csv
    • ACS_12_5YR_B01003_with_ann.csv
    • ACS_13_5YR_B01003_with_ann.csv
  2. Select the tract boundaries shapefile, tl_2014_42_tract, from the Layers panel.
  3. Navigate to Layer | Properties.
  4. For each white population data table (ending in x_B02008_with_ann), perform the following steps:
    1. On the Joins tab, click on the green plus sign (+) to add a join.
    2. Select a data table as the Join layer.
    3. Select GEO.id2 as the Join field.
    4. Select GEOID as the Target field.
    [Screenshot: Combining it yearly]
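For reference, here is a minimal scripted equivalent of these joins for the QGIS Python console. It is a sketch that assumes QGIS 3.x and that the tract layer and CSV tables are already loaded; the layer names are illustrative and must match what you loaded.

    # Sketch: add the yearly table joins programmatically (QGIS 3.x assumed).
    from qgis.core import QgsProject, QgsVectorLayerJoinInfo

    project = QgsProject.instance()
    tracts = project.mapLayersByName('tl_2014_42_tract')[0]

    for table_name in ['ACS_10_5YR_B02008_with_ann',
                       'ACS_11_5YR_B02008_with_ann',
                       'ACS_12_5YR_B02008_with_ann',
                       'ACS_13_5YR_B02008_with_ann']:
        table = project.mapLayersByName(table_name)[0]
        join = QgsVectorLayerJoinInfo()
        join.setJoinLayer(table)
        join.setJoinFieldName('GEO.id2')   # identifier in the CSV table
        join.setTargetFieldName('GEOID')   # identifier in the tract layer
        tracts.addJoin(join)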

After joining all the tables, you will find many rows in the attribute table containing null values. If you sort by one of the more recent years' fields, you will find that the number of populated rows matches the number of census tracts in Philadelphia. However, the 2009 table has many rows that could not be populated because of changes to the unique identifier used in the 2014 boundary data. For this reason, we will exclude the year 2009 from our analysis. Remove the 2009 data table from the joined tables so that it does not cause issues later.

Now, export the joined layer as a new DBF database file, which we need to do to be able to make some final changes:

  1. Ensure that only the rows with populated data columns are selected in the tracts layer's attribute table (you can select them by sorting the attribute table on one of the joined fields, for example).
  2. Select the tracts layer from the Layers panel.
  3. Navigate to Layer | Save as and fill in the following parameters:
    • Format: DBF File
    • Save only the selected features
    • Add the saved file to the map
    • Save as: c5/data/output/whites.dbf
    • Leave the other options as they are by default

Updating and removing fields

QGIS allows us to calculate the coordinates for the geographic features and populate an attribute field with them. On the layer for the new DBF, calculate the latitude and longitude fields in the expected format and eliminate the unnecessary fields by performing the following steps:

  1. Open the Attribute table for the whites DBF layer and click on the Open Field Calculator button.
  2. Calculate a new lon field, filling in the following parameters:
    • Output field name: lon.
    • Output field type: Decimal number (real).
    • Output field width: 10.
    • Precision: 7.
    • Expression: "INTPLON". You can choose this from the Fields and Values sections in the tree under the Functions panel.
    [Screenshot: Updating and removing fields]
  3. Repeat these steps with latitude, making a lat field from INTPTLAT.
  4. Create the following fields using the field calculator with the expression on the right:
    • Output field name: name; Output field type: Text; Output field width: 50; Expression: NAMELSAD
    • Output field name: Jan-11; Output field type: Whole number (integer); Expression: "ACS_11_5_2" - "ACS_10_5_2"
    • Output field name: Jan-12; Output field type: Whole number (integer); Expression: "ACS_12_5_2" - "ACS_11_5_2"
    • Output field name: Jan-13; Output field type: Whole number (integer); Expression: "ACS_13_5_2" - "ACS_12_5_2"
  5. Remove all the old fields (except name, Jan-11, Jan-12, Jan-13, lat, and lon). This will remove all the unnecessary identification fields and those with a margin of error from the table.
  6. Toggle the editing mode and save when prompted.
    [Screenshot: Updating and removing fields]

Finally, export the modified table as a new CSV data table, from which we will create our map visualization. Perform the following steps:

  1. Select the whites DBF layer from the Layers panel.
  2. Navigate to Layer | Save as and fill in the following parameters:
    • Format: Comma Separated Value [CSV]
    • Save as: c5/data/output/whites.csv
    • Leave the other options as they were by default
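A quick way to confirm the export worked is to read the new CSV back from any Python prompt. This is a minimal sketch, assuming the output path used above:

    # Sketch: inspect the exported table (path assumes the export above).
    import csv

    with open('c5/data/output/whites.csv', newline='') as f:
        reader = csv.DictReader(f)
        print(reader.fieldnames)   # expect name, Jan-11, Jan-12, Jan-13, lat, lon
        print(next(reader))        # the first exported row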

The boundary data

Although we have the boundary data for the census tracts, we are only interested in visualizing the State House Districts in our application; our stakeholders want to see change for these districts. However, as we do not have population data by race for these boundary units, let alone yearly population data, we need to leverage the spatial relationship between the State House Districts and the tracts to derive this information. This is a useful workflow whenever you have data at a different level than the geographic unit you wish to visualize or query.

Calculating the average white population change in each census tract

Now, we will construct a field that contains the average yearly change in the white population between 2010 and 2013. Perform the following steps:

  1. As mentioned previously, join the total population tables (ending in B01003_with_ann) to the joined tract layer, tl_2014_42_tract, using GEO.id2 from the new total population tables and GEOID from the tract layer, respectively. Do not join the 2009 table, because we discovered many null values in the join fields for the white-only version of this data.
  2. As before, select the 384 rows in the attribute table that have populated join columns. Save only the selected rows as a new shapefile named tract_change and add it to the map.
  3. Open the Attribute table and then open Field Calculator.
    • Create a new field.
    • Output field name: avg_change.
    • Output field type: Decimal number (real).
    • Output field width: 4, Precision: 2.
    • The following expression takes each year's difference from the previous year, divided by the previous year, to find the fractional change. The three yearly changes are then summed, divided by three to find the average over the three years, and finally multiplied by 100 to express the result as a percentage (a worked example follows these steps):
      ((("ACS_11_5_2" - "ACS_10_5_2")/ "ACS_10_5_2" )+
        (("ACS_12_5_2" - "ACS_11_5_2")/ "ACS_11_5_2" )+
        (("ACS_13_5_2" - "ACS_12_5_2")/ "ACS_12_5_2" ))/3 * 100
  4. After this, click on OK.
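To make the expression concrete, here is a worked example for a single tract. The population values are made up purely for illustration.

    # Worked example of the avg_change expression (hypothetical values).
    p10, p11, p12, p13 = 4000, 4100, 4180, 4300   # 2010-2013 populations

    avg_change = (((p11 - p10) / p10) +
                  ((p12 - p11) / p11) +
                  ((p13 - p12) / p12)) / 3 * 100
    print(round(avg_change, 2))   # approximately 2.44 (percent per year)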

The spatial join in SpatiaLite

Now that we have a value for the average change in the white population by tract, let's attach it to our unit of interest, the State House Districts. We will do this with a spatial join, specifically by joining all the records that intersect a House District's bounds to that House District. As more than one tract will intersect each State House District, we'll need to aggregate the attribute data from the intersected tracts to match the single district that the tracts are joined to.

We will use SpatiaLite for this. Like PostGIS for Postgres, SpatiaLite is the spatial extension for SQLite. It is file-based; rather than requiring a server process listening for connections, the database is stored in a single file to which client programs connect directly. SpatiaLite also ships with QGIS out of the box, making it very easy to start using. As with PostGIS, SpatiaLite comes with a rich set of spatial relationship functions, making it a good choice when the existing plugins do not support the relationship we are trying to model.

Tip

SpatiaLite is usually not chosen as a database for live websites because of some limitations related to multiuser transactions—which is why CartoDB uses Postgres as its backend database.
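Because a SpatiaLite database is just a file, you can also open it outside QGIS with Python's built-in sqlite3 module. This is a minimal sketch, assuming the SpatiaLite extension is installed locally and using the database we create in the next section; the extension module name (mod_spatialite here) varies by platform.

    # Sketch: open a SpatiaLite database from plain Python (outside QGIS).
    import sqlite3

    conn = sqlite3.connect('c5/data/output/district_join.sqlite')
    conn.enable_load_extension(True)
    conn.load_extension('mod_spatialite')   # platform-dependent module name
    cur = conn.cursor()
    # Only needed once, on a brand-new file that QGIS has not initialized:
    # cur.execute('SELECT InitSpatialMetadata(1)')
    print(cur.execute('SELECT spatialite_version()').fetchone())
    conn.close()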

Creating a SpatiaLite database

To do this, perform the following steps:

  1. Create a new SpatiaLite database.
  2. Navigate to Layer | Create Layer | New Spatialite Layer.
  3. Using the ellipsis (...) button, browse to and create a database at c5/data/output/district_join.sqlite.
  4. After clicking on Save, you will be notified that a new database has been registered. You have now created a new SpatiaLite database. You can now close this dialog.
    [Screenshot: Creating a SpatiaLite database]
Importing layers to SpatiaLite

To import layers to SpatiaLite, you can perform the following steps:

  1. Navigate to Database | DB Manager | DB Manager.
  2. Click on the refresh button. The new database should now be visible under the SpatiaLite section of the tree.
  3. Navigate to Table | Import layer/file and select tract_change (we will repeat this shortly for tl_2014_42_sldl).
  4. Click on Update options.
  5. Select Create single-part geometries instead of multi-part.
  6. Select Create spatial index.
  7. Click on OK to finish importing the table to the database (you may need to hit the refresh button again for the table to be shown as imported).
    [Screenshot: Importing layers to SpatiaLite]

Now, repeat these steps with the House Districts layer (tl_2014_42_sldl), but deselect Create single-part geometries instead of multi-part; with this file, the single-part option seems to cause an error, perhaps because part of a multi-part feature cannot stand on its own under SpatiaLite's data requirements.

Querying and loading the SpatiaLite layer from the DB Manager

Next, we use the DB Manager to query the SpatiaLite database, adding the results to the QGIS layers panel.

We will use the MbrIntersects function here, which provides a performance advantage over a regular Intersects test because it only checks whether the extents (bounding boxes) intersect. In this example, we are dealing with a few features of limited complexity, and the query is not run dynamically during a web request, so this shortcut does not provide a major advantage; we use it here to demonstrate its usefulness for more complicated datasets.
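To see how the bounding-box test differs from an exact test, you could run both from Python against the database we just populated. This is a sketch only; it assumes the table names imported above and a locally available mod_spatialite extension.

    # Sketch: compare the bounding-box test with the exact intersection test.
    import sqlite3

    conn = sqlite3.connect('c5/data/output/district_join.sqlite')
    conn.enable_load_extension(True)
    conn.load_extension('mod_spatialite')

    for func in ('MbrIntersects', 'Intersects'):
        sql = ('SELECT count(*) FROM tl_2014_42_sldl AS t1, tract_change AS t2 '
               'WHERE {}(t1.geom, t2.geom) = 1'.format(func))
        print(func, conn.execute(sql).fetchone()[0])
    # MbrIntersects typically returns somewhat more pairs, since extents can
    # overlap where the geometries themselves do not.
    conn.close()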

  1. If it isn't already open, open DB Manager.
  2. Navigate to Database | SQL window.
    [Screenshot: Querying and loading the SpatiaLite layer from the DB Manager]
    • Enter the following SQL query in the SQL query input. It selects fields from the tract_change and tl_2014_42_sldl (State Legislative District) tables where they overlap, and computes an aggregate (average) of the change across the census tracts intersecting each State Legislative District:
      SELECT t1.pk, t1.namelsad, t1.geom, avg(t2.avg_change)*1.0 as avg_change
      FROM   tl_2014_42_sldl AS t1, tract_change AS t2
      WHERE MbrIntersects(t1.geom, t2.geom) = 1
      GROUP BY t1.pk;
    [Screenshot: Querying and loading the SpatiaLite layer from the DB Manager]
  3. Then, click on Load now!.
  4. You will be prompted to select a value for the Column with unique integer values field. For this, select pk.
  5. You will also be prompted to select a value for the Geometry column field; for this, select geom.

The symbolized result of the spatial join, showing the average white population change from 2010 to 2013 for the census tracts intersecting each State House District, will look something like the following image:

[Screenshot: Querying and loading the SpatiaLite layer from the DB Manager]