Table of Contents for
Mastering PostGIS


Mastering PostGIS by Tomasz Nycz, published by Packt Publishing, 2017
  1. Mastering PostGIS
  2. Title Page
  3. Copyright
  4. Credits
  5. About the Authors
  6. About the Reviewers
  7. www.PacktPub.com
  8. Customer Feedback
  9. Table of Contents
  10. Preface
  11. What this book covers
  12. What you need for this book
  13. Who this book is for
  14. Conventions
  15. Reader feedback
  16. Customer support
  17. Downloading the example code
  18. Downloading the color images of this book
  19. Errata
  20. Piracy
  21. Questions
  22. Importing Spatial Data
  23. Obtaining test data
  24. Setting up the database
  25. Importing flat data
  26. Importing data using psql
  27. Importing data interactively
  28. Importing data non-interactively
  29. Importing data using pgAdmin
  30. Extracting spatial information from flat data
  31. Importing shape files using shp2pgsql
  32. shp2pgsql in cmd
  33. The shp2pgsql GUI version
  34. Importing vector data using ogr2ogr
  35. Importing GML
  36. Importing MIF and TAB
  37. Importing KML
  38. ogr2ogr GUI (Windows only)
  39. Importing data using GIS clients
  40. Exporting a shapefile to PostGIS using QGIS and SPIT
  41. Exporting shapefile to PostGIS using QGIS and DbManager
  42. Exporting spatial data to PostGIS from Manifold GIS
  43. Importing OpenStreetMap data
  44. Connecting to external data sources with foreign data wrappers
  45. Connecting to SQL Server Spatial
  46. Connecting to WFS service
  47. Loading rasters using raster2pgsql
  48. Importing a single raster
  49. Importing multiple rasters
  50. Importing data with pg_restore
  51. Summary
  52. Spatial Data Analysis
  53. Composing and decomposing geometries
  54. Creating points
  55. Extracting coordinates from points
  56. Composing and decomposing Multi-geometries
  57. Multi-geometry decomposition
  58. Composing and decomposing LineStrings
  59. LineString composition
  60. LineString decomposition
  61. Composing and decomposing polygons
  62. Polygon composition
  63. Polygon decomposition
  64. Spatial measurement
  65. General warning - mind the SRID!
  66. Measuring distances between two geometries
  67. Measuring the length, area, and perimeter of geometries
  68. Line length
  69. Polygon perimeter
  70. Polygon area
  71. Geometry bounding boxes
  72. Accessing bounding boxes
  73. Creating bounding boxes
  74. Using bounding boxes in spatial queries
  75. Geometry simplification
  76. Geometry validation
  77. Simplicity and validity
  78. Testing for simplicity and validity
  79. Checking for validity
  80. Repairing geometry errors
  81. Validity constraint
  82. Intersecting geometries
  83. Nearest feature queries
  84. Summary
  85. Data Processing - Vector Ops
  86. Primer - obtaining and importing OpenStreetMap data
  87. Merging geometries
  88. Merging polygons
  89. Merging MultiLineStrings
  90. Slicing geometries
  91. Splitting a polygon by LineString
  92. Splitting a LineString with another LineString
  93. Extracting a section of LineString
  94. Buffering and offsetting geometries
  95. Offsetting features
  96. Creating convex and concave hulls
  97. Computing centroids, points-on-surface, and points-on-line
  98. Reprojecting geometries
  99. Spatial relationships
  100. Touching
  101. Crossing
  102. Overlapping
  103. Containing
  104. Radius queries
  105. Summary
  106. Data Processing - Raster Ops
  107. Preparing data
  108. Processing and analysis
  109. Analytic and statistical functions
  110. Vector to raster conversion
  111. Raster to vector conversion
  112. Spatial relationship
  113. Metadata
  114. Summary
  115. Exporting Spatial Data
  116. Exporting data using \COPY in psql
  117. Exporting data in psql interactively
  118. Exporting data in psql non-interactively
  119. Exporting data in PgAdmin
  120. Exporting vector data using pgsql2shp
  121. pgsql2shp command line
  122. pgsql2shp gui
  123. Exporting vector data using ogr2ogr
  124. Exporting KML revisited
  125. Exporting SHP
  126. Exporting MapInfo TAB and MIF
  127. Exporting to SQL Server
  128. ogr2ogr GUI
  129. Exporting data using GIS clients
  130. Exporting data using QGIS
  131. Exporting data using Manifold
  132. Outputting rasters using GDAL
  133. Outputting raster using psql
  134. Exporting data using the PostgreSQL backup functionality
  135. Summary
  136. ETL Using Node.js
  137. Setting up Node.js
  138. Making a simple Node.js hello world in the command line
  139. Making a simple HTTP server
  140. Handshaking with a database using Node.js PgSQL client
  141. Retrieving and processing JSON data
  142. Importing shapefiles revisited
  143. Consuming JSON data
  144. Geocoding address data
  145. Consuming WFS data
  146. Summary
  147. PostGIS – Creating Simple WebGIS Applications
  148. ExtJS says Hello World
  149. Configuring GeoServer web services
  150. Importing test data
  151. Outputting vector data as WMS services in GeoServer
  152. Outputting raster data as WMS services in GeoServer
  153. Outputting vector data as WFS services
  154. Making use of PgRaster in a simple WMS GetMap handler
  155. Consuming WMS
  156. Consuming WMS in ol3
  157. Consuming WMS in Leaflet
  158. Enabling CORS in Jetty
  159. Consuming WFS in ol3
  160. Outputting and consuming GeoJSON
  161. Consuming GeoJSON in ol3
  162. Consuming GeoJSON in Leaflet
  163. Outputting and consuming TopoJSON
  164. Consuming TopoJSON in ol3
  165. Consuming TopoJSON in Leaflet
  166. Implementing a simple CRUD application that demonstrates vector editing via web interfaces
  167. WebGIS CRUD server in Node.js
  168. WebGIS CRUD client
  169. Layer manager
  170. Drawing tools
  171. Analysis tools - buffering
  172. Summary
  173. PostGIS Topology
  174. The conceptual model
  175. The data
  176. Installation
  177. Creating an empty topology
  178. Importing Simple Feature data into topology
  179. Checking the validity of input geometries
  180. Creating a TopoGeometry column and a topology layer
  181. Populating a TopoGeometry column from an existing geometry
  182. Inspecting and validating a topology
  183. Topology validation
  184. Accessing the topology data
  185. Querying topological elements by a point
  186. Locating nodes
  187. Locating edges
  188. Locating faces
  189. Topology editing
  190. Adding new elements
  191. Creating TopoGeometries
  192. Splitting and merging features
  193. Splitting features
  194. Merging features
  195. Updating edge geometry
  196. Topology-aware simplification
  197. Importing sample data
  198. Topology output
  199. GML output
  200. TopoJSON output
  201. Summary
  202. pgRouting
  203. Installing the pgRouting extension
  204. Importing routing data
  205. Importing shapefiles
  206. Importing OSM data using osm2pgrouting
  207. pgRouting algorithms
  208. All pairs shortest path
  209. Shortest path
  210. Shortest path Dijkstra
  211. A-Star (A*)
  212. K-Dijkstra
  213. K-Shortest path
  214. Turn restrictions shortest path (TRSP)
  215. Driving distance
  216. Traveling sales person
  217. Handling one-way edges
  218. Consuming pgRouting functionality in a web app
  219. Summary

Consuming JSON data

With our corporate database ready, we can focus on the problem at hand. Let's define the steps we need to take in order to complete the task:

  1. Obtain the weather forecast.
  2. Process the weather data and put it into the database.
  3. Assign the weather forecasts to the administrative boundaries.
  4. List the administrative units that meet a hypothetical alert watch.

Our company has decided to use a weather data provider called OpenWeatherMap. The data can be accessed via an API, and quite a lot of information is available with free accounts. Data can also be obtained in bulk, although this requires a paid subscription. You are not required to use a commercial account, of course; we will use the sample data that is provided free of charge, so potential users can familiarize themselves with the output produced by the service.

In this example, we will work with a weather forecast with a 3-hour interval and a 4-day timespan. The service provides fresh data every 3 hours, so it is easy to imagine how impractical it would be to process this data manually.

An example dataset can be obtained from http://bulk.openweathermap.org/sample/hourly_14.json.gz.

The source code for this example is available in the chapter's resources in the code/06_processing_json directory.

Let's install some node modules used by this example:

npm install pg --save
npm install line-by-line --save
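The snippets that follow also reference a handful of requires and shared constants defined at the top of the script. A minimal sketch of that setup is shown below; the connection details and directory settings are placeholders you will need to adjust, and the exact layout in the chapter's source code may differ slightly:

//modules used throughout the script
const path = require('path');
const pg = require('pg');
const lineReader = require('line-by-line');

//connection settings - placeholder values, adjust to your own environment
const dbCredentials = {
    host: 'localhost',
    port: 5432,
    database: 'mastering_postgis',
    user: 'postgres',
    password: 'postgres'
};

//schema and table names used by the import
const schemaName = 'weather_alerts';
const tblWeatherPoints = 'weather_points';
const tblWeatherForecasts = 'weather_forecasts';
const tblAdm = 'gminy';

//download settings (downloadDir is an assumption - use any writable directory)
const downloadUrl = 'http://bulk.openweathermap.org/sample/hourly_14.json.gz';
const downloadDir = __dirname;
const fileName = 'hourly_14.json.gz';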

We saw how to download and unzip an archive in a previous example, so these steps are skipped as there is no point in repeating them here (the source code does include the download and unzip code, though).

Our data comes as gzip, so the unzipping logic is a bit different from what we have seen already; Node's zlib module is used instead.
If you happen to experience problems downloading the data, you should find it under the data/06_processing_json directory.
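For reference, a minimal gunzip helper built around Node's zlib streams could look like the following; the gunzipFile name matches the function used in the promise chain at the end of this example, but treat this as a sketch rather than the exact code shipped with the chapter:

const fs = require('fs');
const zlib = require('zlib');

/**
 * unzips a .gz archive and resolves with the path of the unzipped file
 */
const gunzipFile = function(gzPath){
    return new Promise((resolve, reject) => {
        //the output file name is simply the input without the .gz extension
        const outPath = gzPath.replace(/\.gz$/, '');
        console.log(`Unzipping '${gzPath}'...`);

        const input = fs.createReadStream(gzPath);
        const output = fs.createWriteStream(outPath);

        input.on('error', err => reject(err.message));
        output.on('error', err => reject(err.message));
        output.on('finish', () => {
            console.log('Unzipped!');
            resolve(outPath);
        });

        //pipe the compressed input through a gunzip transform into the output file
        input.pipe(zlib.createGunzip()).pipe(output);
    });
};

Resolving with the output path lets the next step in the promise chain (readJson) pick up the unzipped file directly.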

Once we have downloaded and unzipped the data, we should have a look at what's inside. Basically, each line is a JSON string with the following data (based on the actual content of the JSON):

{
  city: {
    coord: {
      lat: 27.716667,
      lon: 85.316666
    },
    country: "NP",
    id: 1283240,
    name: "Kathmandu"
  },
  data: [
    ...
  ],
  time: 1411447617
}

The data property contains the array of weather forecasts we are after. The weather forecast object looks like this:

{
  clouds: {...},
  dt: 1411441200,
  dt_txt: '2014-09-23 03:00:00',
  main: {...},
  rain: {...},
  sys: {...},
  weather: {...},
  wind: {
    deg: 84.0077,
    speed: 0.71
  }
}

In our scenario, we will focus on the wind speed, which is why the other properties in the preceding code are not detailed. The wind speed is expressed in m/s. We want to alert our customers whenever the forecasted wind speed reaches level 6 on the Beaufort scale (10.8 m/s or more).

We already mentioned that each line of the file is a valid JSON string. This means we can read the data line by line, without having to load the whole file into memory.

Let's read the data for Poland first:

/**
 * reads the weather forecast JSON line by line
 * (lineReader comes from the line-by-line module; progressIndicator is a
 * small console progress helper defined elsewhere in the script)
 */
const readJson = function(jsonFile){
    return new Promise((resolve, reject) => {
        console.log(`Reading JSON data from ${jsonFile}...`);

        let recCount = 0;
        let data = [];

        //use the line reader to read the data
        let lr = new lineReader(jsonFile);

        lr.on('error', function (err) {
            reject(err.message);
        });

        lr.on('line', function (line) {

            recCount++;

            //we're spinning through over 10k recs, so updating
            //progress every 100 seems a good choice
            if(recCount % 100 === 0){
                progressIndicator.next();
            }

            //parse string to json
            let json = JSON.parse(line);

            //and extract only the records for Poland
            if(json.city.country === 'PL'){
                data.push(json);
            }
        });

        lr.on('end', function () {
            console.warn(`Extracted ${data.length} records out of ${recCount}.`);
            progressIndicator.reset();
            resolve(data);
        });
    });
};
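The progressIndicator used above is just a small console progress helper defined elsewhere in the script. The chapter's source ships its own implementation; a trivial stand-in, purely for illustration, could be as simple as:

//a very simple stand-in for the progress indicator used above;
//the actual helper in the chapter's source may look different
const progressIndicator = {
    next: () => process.stdout.write('.'),   //print a dot every N records
    reset: () => process.stdout.write('\n')  //end the progress line
};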

At this stage, we have the records prefiltered, so we're ready to load them into the database. We will load the data into two tables: one will hold the forecast points (essentially the cities we obtained the forecasts for), and the other will hold the actual forecasts per city:

/**
 * loads the weather forecast data into the database
 */
const loadData = function(data){
    return new Promise((resolve, reject) => {
        console.log('Loading data to database...');

        let client = new pg.Client(dbCredentials);

        client.connect((err) => {
            if(err){
                reject(err.message);
                return;
            }

            //prepare queries - this array will hold all the database work so we
            //can execute it in one go; the items are promises returned by the
            //executeNonQuery helper
            let queries = [];

            //table setup SQL - drop (so we're clean) and (re)create
            let tableSetup = executeNonQuery(client, `
                DROP TABLE IF EXISTS ${schemaName}.${tblWeatherPoints};
                DROP TABLE IF EXISTS ${schemaName}.${tblWeatherForecasts};
                CREATE TABLE ${schemaName}.${tblWeatherPoints} (id serial NOT NULL, station_id numeric, name character varying, geom geometry);
                CREATE TABLE ${schemaName}.${tblWeatherForecasts} (id serial NOT NULL, station_id numeric, dt numeric, dt_txt character varying(19), wind_speed numeric);
            `);

            queries.push(tableSetup);

            //data preparation - query promises with params applied to the executed SQL commands
            for(let d of data){
                //weather forecast point
                queries.push(
                    executeNonQuery(
                        client,
                        `INSERT INTO ${schemaName}.${tblWeatherPoints} (station_id, name, geom)
                         VALUES ($1, $2, ST_Transform(ST_SetSRID(ST_Point($3, $4), 4326), 2180))`,
                        [d.city.id, d.city.name, d.city.coord.lon, d.city.coord.lat]
                    )
                );

                //weather forecasts
                let forecasts = [];
                let params = [];
                let pCnt = 0;
                for(let f of d.data){
                    forecasts.push(`SELECT $${++pCnt}::numeric, $${++pCnt}::numeric, $${++pCnt}, $${++pCnt}::numeric`);
                    params.push(d.city.id, f.dt, f.dt_txt, (f.wind || {}).speed || null);
                }

                queries.push(
                    executeNonQuery(
                        client,
                        `INSERT INTO ${schemaName}.${tblWeatherForecasts} (station_id, dt, dt_txt, wind_speed)
                         ${forecasts.join(' UNION ALL ')}`,
                        params
                    )
                );
            }

            //finally execute all the prepared queries and wait for all of them to finish
            Promise.all(queries)
                .then(() => {
                    client.end();
                    resolve();
                })
                .catch(err => {
                    try{
                        client.end();
                    }
                    catch(e){}
                    reject(typeof err === 'string' ? err : err.message);
                });
        });
    });
};
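The executeNonQuery helper used above simply wraps client.query in a promise, so that all the prepared statements can be awaited with Promise.all. A minimal sketch of such a helper is shown below; the chapter's own version may differ in details such as error handling:

/**
 * a minimal sketch of the executeNonQuery helper - it wraps client.query
 * in a promise; treat this as an assumption, not the chapter's exact code
 */
const executeNonQuery = function(client, sql, params){
    return new Promise((resolve, reject) => {
        client.query(sql, params || [], (err, result) => {
            if(err){
                reject(err.message);
            }
            else {
                resolve(result);
            }
        });
    });
};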

If you load the data in QGIS at this stage, the imported datasets should look like the following screenshot:

Our final step is generating the actual wind alerts. We'll do a bit more PostGIS work and use Node.js to execute our query. Most of the wind speed forecasts we downloaded are rather moderate; however, there are some records with a wind speed greater than 10.8 m/s, and this will be our cutoff value (wind over 10.8 m/s falls into level 6 of the Beaufort scale and means a strong breeze; this is when handling an umbrella becomes a challenge).

So let's think for a moment about what we have to do:

  • For each administrative unit, we need to assign the nearest weather station
  • We have to filter out stations with wind speed forecasts that fall into the Beaufort 6 category
  • We need to select the affected administrative units

We'll initially code the query in pure SQL, as it will be much easier to digest than the same code expressed as a string in Node.js.

First, let's get a list of weather station IDs where the wind speed is forecasted to exceed our cutoff point:

select
    distinct on (station_id)
    station_id,
    dt,
    dt_txt,
    wind_speed
from
    weather_alerts.weather_forecasts
where
    wind_speed > 10.8
order by
    station_id, dt;

The preceding query selects the weather forecasts with wind speeds greater than the mentioned 10.8 m/s and orders them by station and timestamp. Thanks to that, distinct on picks a single row per station ID - the earliest forecast that exceeds the threshold.

Now, let's find out the nearest weather station for each administrative unit:

select
    distinct on (adm_id)
    g.jpt_kod_je as adm_id,
    p.station_id,
    ST_Distance(g.geom, p.geom) as distance
from
    weather_alerts.gminy g,
    weather_alerts.weather_points p
where
    ST_DWithin(g.geom, p.geom, 200000)
order by
    adm_id, distance;

We use ST_Distance to calculate the distance between administrative units and weather stations, and then order the dataset by distance so that distinct on keeps the closest station per unit. Such a query gets slow quickly as the amount of data grows, so ST_DWithin is used to discard weather stations that are farther than 200 km from an administrative unit (obviously 200 km is far too large a range for generating sensible weather alerts, but the idea remains the same and it lets us work with the test data).
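If the query is still slow on larger datasets, it is worth making sure both geometry columns carry spatial (GiST) indexes, so ST_DWithin can use an index scan; for the tables used in this chapter that would be something along these lines (skip the statements if the indexes already exist, for example when the administrative units were imported with shp2pgsql -I):

-- GiST indexes speed up ST_DWithin / ST_Distance lookups considerably
CREATE INDEX gminy_geom_gist ON weather_alerts.gminy USING gist (geom);
CREATE INDEX weather_points_geom_gist ON weather_alerts.weather_points USING gist (geom);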

Finally, we need to join both queries in order to get a list of the affected administrative units:

select
    f.*,
    adm.*
from
    (
        select
            distinct on (station_id)
            station_id,
            dt,
            dt_txt,
            wind_speed
        from
            weather_alerts.weather_forecasts
        where
            wind_speed > 10.8
        order by
            station_id, dt
    ) as f
    left join (
        select
            distinct on (adm_id)
            g.jpt_kod_je as adm_id,
            g.jpt_nazwa_ as adm_name,
            p.station_id,
            p.name as station_name,
            ST_Distance(g.geom, p.geom) as distance
        from
            weather_alerts.gminy g,
            weather_alerts.weather_points p
        where
            ST_DWithin(g.geom, p.geom, 200000)
        order by
            adm_id, distance
    ) as adm
    on adm.station_id in (select distinct f.station_id);

Once our SQL is operational, we need the final piece of code, and then we should be good to go:

/**
 * generates the wind alerts
 */
const generateAlerts = function(){
    return new Promise((resolve, reject) => {
        console.log('Generating alerts...');

        let client = new pg.Client(dbCredentials);

        client.connect((err) => {
            if(err){
                reject(err.message);
                return;
            }

            let query = `
                select
                    f.*,
                    adm.*
                from
                    (
                        select
                            distinct on (station_id)
                            station_id,
                            dt,
                            dt_txt,
                            wind_speed
                        from
                            ${schemaName}.${tblWeatherForecasts}
                        where
                            wind_speed > 10.8
                        order by
                            station_id, dt
                    ) as f
                    left join (
                        select
                            distinct on (adm_id)
                            g.jpt_kod_je as adm_id,
                            g.jpt_nazwa_ as adm_name,
                            p.station_id,
                            p.name as station_name,
                            ST_Distance(g.geom, p.geom) as distance
                        from
                            ${schemaName}.${tblAdm} g,
                            ${schemaName}.${tblWeatherPoints} p
                        where
                            ST_DWithin(g.geom, p.geom, 200000)
                        order by
                            adm_id, distance
                    ) as adm
                    on adm.station_id in (select distinct f.station_id);`;

            client.query(query, (err, result) => {
                if(err){
                    client.end();
                    reject(err.message);
                }
                else {
                    client.end();
                    console.log(`Wind alerts generated for ${result.rows.length} administrative units!`);
                    if(result.rows.length > 0){
                        let r = result.rows[0];
                        console.log(`The first one is: ${JSON.stringify(r)}`);
                    }
                    resolve();
                }
            });
        });
    });
};

Let's assemble the calls to the methods we have written and execute them:

//chain all the stuff together
download(downloadUrl, path.join(downloadDir, fileName))
    .then(gunzipFile)
    .then(readJson)
    .then(loadData)
    .then(generateAlerts)
    .catch(err => console.log(`oops, an error has occurred: ${err}`));

The output should be similar to the following:

Downloading http://bulk.openweathermap.org/sample/hourly_14.json.gz to F:\mastering_postgis\chapter07\hourly_14.json.gz...
File downloaded!
Unzipping 'F:\mastering_postgis\chapter07\hourly_14.json.gz'...
Unzipped!
Reading JSON data from F:\mastering_postgis\chapter07\hourly_14.json...
Extracted 50 records out of 12176.
Loading data to database...
Generating alerts...
Wind alerts generated for 92 administrative units!
The first one is: {"station_id":"3081368","dt":"1411441200","dt_txt":"2014-09-23 03:00:00","wind_speed":"10.87","adm_id":"0204022","adm_name":"Jemielno","station_name":"Wroclaw","distance": 53714.3452816274}

We have managed to transform a JSON weather forecast into a dataset with alerts for administrative units. The next steps could be exposing weather alerts via a web service, or perhaps sending out e-mails, or even SMSs.