Mastering PostGIS

Let's imagine that we have a potential customer database built upon yellow pages, with the customer locations expressed as addresses. I guess that using yellow pages may not be the best idea these days, but it makes a good starting point for this example.

We need to send our sales representatives to these addresses in order to establish relationships with the new customers, but first we should assign the customers to proper sales regions. Having seen St_Intersects in action already, searching for points in polygons seems like a trivial task. We need to have point geoms for this though, and we'll soon see how to go from addresses to geometry using some simple Node.js code.

Let's prepare our hypothetical new customers database first. We will reuse some data we have seen already, namely the Ordnance Survey GB address points. We imported this data in Chapter 1, Importing Spatial Data, and I assume you have not deleted the dataset already - it should be in data_import.osgb_addresses.

The records seem to be spread quite nicely, so we will simply select 100 records where we have a meaningful name and also where the building number is known:

--schema prepare / cleanup 
create schema if not exists etl_geocoding; 
drop table if exists etl_geocoding.customers; 

--customers table 
create table etl_geocoding.customers (
   id serial NOT NULL,
   name varchar,
   street varchar,
   street_no varchar,
   postcode varchar,
   town varchar,
   lon numeric,
   lat numeric,
   geom geometry,
   geocoded boolean
);

--get some hypothethical customers
insert into etl_geocoding.customers (
   name,
   street,
   street_no,
   postcode,
   town,
   geocoded
)

select organisation_name, thoroughfare, building_number, postcode, post_town, false 
from data_import.osgb_addresses 
where organisation_name != '' and building_number != '' 
limit 100;

Having prepared our customer database, we can now define the steps we need to take to end up with geocoded addresses:

Extract the non-geocoded records from the database
Use an external geocoding API in order to obtain the locations
Pump the data back to the database

Our geocoding API will be Google Maps Geocoding API. It has its own node module, so we will be able to focus on the task without having to bother with assembling a valid URL to call the API using http GET. You will find more information on the Google Maps' node module at https://github.com/googlemaps/google-maps-services-js.

In order to use Google services, one has to generate an API key. API keys are freely available and can be created via a Google Account at https://developers.google.com/console.

Once our geocoding node module has been created, we will need to install some external packages:

npm install pg --save
npm install @google/maps --save

Our first step is to extract the customer records that have not yet been geocoded:

/** 
 * reads non-geocoded customer records 
 */ 
const readCustomers = function(){ 
    return new Promise((resolve, reject) => { 
        console.log('Extracting customer record...'); 

        let client = new pg.Client(dbCredentials); 

        client.connect((err) => { 
            if(err){ 
                reject(err.message); 
                return; 
            } 

            client.query(`SELECT * FROM 
            ${customersSchema}.${customersTable} WHERE geocoded = false 
            LIMIT 10;`, function(err, result){ 
                if(err){ 
                    try { 
                        client.end(); 
                    } catch(e){} 
                    reject(err.message); 
                    return; 
                } 

                client.end(); 

                console.log('Done!'); 
                resolve(result.rows); 
            }); 
        }); 
    }); 
}

You may have noticed that I am reading only ten records at a time. This is because it lets me debug the code without having to reset the geocoding status in the database. Also, there are some usage limits with the free Google Maps account, so I am avoiding too many API calls.

Once we have the customer records at hand, we can geocode them. However, we should probably stop for a second and familiarize ourselves with the geocoder API output so that we can properly extract the information later. The following is an output example for our first address in the database: 30, GUILDHALL SHOPPING CENTRE, EX4 3HJ, EXETER:

{ 
    "results" : [ 
    { 
        "address_components":[
            {"long_name":"30-32","short_name":"30-32",
            "types":["street_number"]}, 
            {"long_name":"Guildhall Shopping Centre",
            "short_name":"Guildhall Shopping Centre",
            "types":["route"]}, 
            {"long_name":"Exeter","short_name":"Exeter","types":
            ["locality","political"]}, 
            {"long_name":"Exeter","short_name":"Exeter","types":
            ["postal_town"]}, 
            {"long_name":"Devon","short_name":"Devon","types": 
            ["administrative_area_level_2","political"]}, 
            {"long_name":"England","short_name":"England","types":
            ["administrative_area_level_1","political"]}, 
            {"long_name":"United Kingdom","short_name":"GB","types":
            ["country","political"]}, 
            {"long_name":"EX4 3HH","short_name":"EX4 3HH","types":
            ["postal_code"]} 
        ], 
        "formatted_address":"30-32 Guildhall Shopping Centre, Exeter 
        EX4 3HH, UK", 
        "geometry":{ 
            "location":{ 
                "lat":50.7235944, 
                "lng":-3.5333662 
            }, 
            "location_type":"ROOFTOP", 
            "viewport":{ 
            "northeast": 
            {"lat":50.7249433802915,"lng":-3.532017219708498}, 
            "southwest": 
            {"lat":50.7222454197085,"lng":-3.534715180291502} 
            } 
        }, 
        "partial_match":true, 
        "place_id":"ChIJy3ZkNDqkbUgR1WXtac_0ClE", 
        "types":["street_address"] 
    }, 
    "status" : "OK" 
}

Once we know what the geocoder data looks like, we can easily code the geocoding procedure:

/** 
 * generates a geocoding call 
 */ 
const generateGeocodingCall = function(gMapsClient, customer){ 
    return new Promise((resolve, reject) => { 
         let address = `${customer.street_no} ${customer.street},  
         ${customer.postcode}, ${customer.town}`; 

         gMapsClient.geocode({ 
            address: address 
        }, (err, response) => { 
            if(err){ 
                reject(err.message); 
                return; 
            } 

            if(response.json.error_message){ 
                console.log(response.json.status,   
                response.json.error_message); 
                reject(err); 
                return; 
            } 

            //update customer 
            let geocoded = response.json.results[0]; 
            if(geocoded){ 
                customer.geocoded = true; 
                customer.lon = geocoded.geometry.location.lng; 
                customer.lat = geocoded.geometry.location.lat; 
            } 

            resolve(); 
        }); 
    }); 
}

In order to make our geocoding call work for us, we need to call it for the retrieved records. Let's do it this way:

/** 
 * geocodes specified customer addresses 
 */ 
const geocodeAddresses = function(customers){ 
    return new Promise((resolve, reject) => { 
        console.log('Geocoding addresses...'); 

        let gMapsClient = require('@google/maps').createClient({ 
            key: gMapsApiKey 
        }); 

        //prepare geocoding calls 
        let geocodingCalls = [];  
        for(let c of customers){ 
            geocodingCalls.push( 
                generateGeocodingCall(gMapsClient, c) 
            ); 
        } 

        //and execute them 
        Promise.all(geocodingCalls) 
            .then(()=>resolve(customers)) 
            .catch((err) => reject(err)); 
    }); 
}

At this stage, we should have our customer records geocoded so we can save them back to the database. As you may expect, this is rather straightforward:

/** 
 * saves geocoded customers back to the database 
 */ 
const saveCustomers = function(customers){ 
    return new Promise((resolve, reject) => { 

        console.log('Saving geocoded customer records...'); 

        let client = new pg.Client(dbCredentials); 

        client.connect((err) => { 
            if(err){ 
                reject(err.message); 
                return; 
            } 

            const updateSQLs = []; 
            var pCounter = 0; 

            for(let c of customers){ 
                updateSQLs.push(executeNonQuery(client, `UPDATE 
                ${customersSchema}.${customersTable} SET 
                lon=$1,lat=$2,geocoded=true WHERE id=$3;`, [c.lon, 
                c.lat, c.id])); 
            } 

            Promise.all(updateSQLs) 
                .then(() => { 
                    client.end(); 
                    resolve(); 
                }) 
                .catch((err)=>{ 
                    try{ 
                        client.end(); 
                    } 
                    catch(e){} 
                    reject(err); 
                }); 
        }); 
    }); 
}

Finally, let's call our methods in a sequence and watch the magic happen:

//chain all the stuff together 
readCustomers() 
    .then(geocodeAddresses) 
    .then(saveCustomers) 
    .catch(err => console.log(`uups, an error has occured: ${err}`));

Table of Contents for
Mastering PostGIS

Geocoding address data

Table of Contents for Mastering PostGIS

Table of Contents for
Mastering PostGIS