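We'll use some geocoding services to get answers to the following questions:
What are the latitude and longitude of a given street address? This is forward geocoding.
Which street addresses are close to a given latitude and longitude? This is reverse geocoding.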
There are, of course, many more questions we could ask. We might want to know a route to navigate between two addresses. We might want to know what public transportation choices we have to get from one place to another. For now, we'll limit ourselves to these two essential geocoding questions.
There are many geocoding services available on the World Wide Web (WWW). There are a number of terms related to geocoding, including geomarketing, geotargeting, geolocation, and geotagging. They're all essentially similar; they describe location-based information. It can take a fair amount of espionage to track down a service with the features we want.
The following link gives a list of services:
http://geoservices.tamu.edu/Services/Geocode/OtherGeocoders/
This list is far from definitive. Some of the services listed here don't work very well. Some large companies aren't listed; for example, MapQuest appears to be missing. See http://mapquest.com for more information.
Most geocoding services want to track usage. For large batches of requests, they want to be paid for the services they offer. Consequently, they issue credentials (a key) that must be part of every request. The procedure to get a key varies from service to service.
We'll look closely at the services offered by Google. They offer a limited service without the overhead of requesting credentials. Instead of asking us to get a key, they'll throttle our requests if we make too much use of their service.
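One way to live with throttling is to pause and try again. The following is only a sketch: it assumes the service reports throttling by setting the status field of its JSON reply to OVER_QUERY_LIMIT, and the names fetch_with_backoff and fetch are hypothetical helpers invented for this illustration, not part of any library.

```python
import time

def fetch_with_backoff(fetch, retries=3, delay=1.0, sleep=time.sleep):
    """Call fetch() again, with growing pauses, while we are being throttled.

    fetch is any zero-argument callable that returns the parsed JSON
    dictionary. Assumption: throttling shows up as
    status == "OVER_QUERY_LIMIT".
    """
    response = fetch()
    for attempt in range(retries):
        if response.get("status") != "OVER_QUERY_LIMIT":
            break
        sleep(delay * 2 ** attempt)  # pause 1s, then 2s, then 4s
        response = fetch()
    return response

# Offline demonstration: the first canned reply is throttled, the second is not.
replies = iter([{"status": "OVER_QUERY_LIMIT"}, {"status": "OK"}])
waits = []
print(fetch_with_backoff(lambda: next(replies), sleep=waits.append))
print(waits)
```

Passing sleep as a parameter lets us try the logic without actually waiting; in real use, we would leave it at its default of time.sleep.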
The forward geocoding service from address to latitude and longitude can be accessed via Python's urllib.request module. For a quick review, see the Using a REST API in Python section of Chapter 2, Acquiring Intelligence Data. This is usually a three-step process:
1. Define the parts of the URL. It helps to separate the static portions from the dynamic query portion. We need to use the urllib.parse.urlencode() function to encode the query string.
2. Open the URL using a with statement context. This will send the request and get the response. The JSON document must be parsed in this with context.
3. Process the object that was received. This is done outside the with context.
Here's what it looks like:
import urllib.request
import urllib.parse
import json
# 1. Build the URL.
form = {
    "address": "333 waterside drive, norfolk, va, 23510",
    "sensor": "false",
    #"key": Provide the API Key here if you're registered,
}
query = urllib.parse.urlencode(form, safe=",")
scheme_netloc_path = "https://maps.googleapis.com/maps/api/geocode/json"
print(scheme_netloc_path+"?"+query)
# 2. Send the request; get the response.
with urllib.request.urlopen(scheme_netloc_path+"?"+query) as geocode:
    print(geocode.info())
    response = json.loads(geocode.read().decode("UTF-8"))
# 3. Process the response object.
print(response)
We have created a dictionary with the two required fields: address and sensor. If you want to sign up with Google for additional support and higher-volume requests, you can get an API key. It will become a third field in the request dictionary. We used a # comment to include a reminder about the use of the key item.
An HTML web page form is essentially this kind of dictionary with names and values. When the browser makes a request, the form is encoded before it is transmitted to the web server. Our Python program does this using urllib.parse.urlencode() to encode the form data into something that a web server can use.
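To see the encoding in isolation, here is a small sketch that needs no network connection. It encodes the same form and then decodes it again with urllib.parse.parse_qs(), which is roughly what a web server does on its side. The sample values are the ones used in this section.

```python
import urllib.parse

form = {
    "address": "333 waterside drive, norfolk, va, 23510",
    "sensor": "false",
}

# Spaces become "+"; safe="," leaves the commas as they are.
query = urllib.parse.urlencode(form, safe=",")
print(query)

# Decoding recovers the original names and values (each value in a list).
decoded = urllib.parse.parse_qs(query)
print(decoded["address"])
```

On Python 3.7 and later, the fields appear in the query in the order in which they were placed in the dictionary.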
A complete URL has a scheme, location, path, and an optional query. The scheme, location, and path tend to remain fixed. We assembled a complete URL from the fixed portions and the dynamic query content, printed it, and also used it as an argument to the urllib.request.urlopen() function.
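As an illustration of these parts, the following sketch takes a complete URL apart with urllib.parse.urlsplit() and puts it back together with urllib.parse.urlunsplit(). The URL is the one built in this section; nothing is sent over the network.

```python
import urllib.parse

url = ("https://maps.googleapis.com/maps/api/geocode/json"
       "?address=333+waterside+drive,+norfolk,+va,+23510&sensor=false")

parts = urllib.parse.urlsplit(url)
print(parts.scheme)   # the fixed scheme
print(parts.netloc)   # the fixed network location
print(parts.path)     # the fixed path
print(parts.query)    # the dynamic query

# Reassembling the parts gives back the URL we started with.
rebuilt = urllib.parse.urlunsplit(parts)
print(rebuilt == url)
```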
In the with statement, we created a processing context. This will send the request and read the response. Inside the with context, we printed the headers to confirm that the request worked. More importantly, we loaded the JSON response, which will create a Python object. We saved that object in the response variable.
After creating the Python object, we can release the resources tied up in making the geocoding request. Leaving the indented block of the with statement assures that all the resources are released and the file-like response is closed.
After the with context, we can work with the response. In this case, we merely print the object. Later, we'll do more with the response.
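If we expect to geocode more than one address, the three steps could be wrapped in a function. This is a sketch rather than a finished tool: the function name geocode and its opener parameter are our own inventions. The opener defaults to urllib.request.urlopen(), and we can swap in a stand-in to try the function without touching the network.

```python
import io
import json
import urllib.parse
import urllib.request

SCHEME_NETLOC_PATH = "https://maps.googleapis.com/maps/api/geocode/json"

def geocode(address, opener=urllib.request.urlopen):
    """Return the parsed geocoding response for a street address."""
    # 1. Build the URL.
    query = urllib.parse.urlencode(
        {"address": address, "sensor": "false"}, safe=",")
    # 2. Send the request; get the response.
    with opener(SCHEME_NETLOC_PATH + "?" + query) as reply:
        response = json.loads(reply.read().decode("UTF-8"))
    # 3. The with block has released the connection; return the object.
    return response

# Offline check: a hypothetical stand-in that returns a canned reply.
def fake_opener(url):
    return io.BytesIO(b'{"results": [], "status": "ZERO_RESULTS"}')

print(geocode("nowhere in particular", opener=fake_opener))
```

The io.BytesIO object works as a stand-in because, like the real response, it can be used in a with statement and has a read() method.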
We'll see three things, as shown in the following snippet: the URL that we built, the headers from the HTTP response, and finally the geocoding output, which is the JSON document after it has been parsed into a Python dictionary:
https://maps.googleapis.com/maps/api/geocode/json?sensor=false&address=333+waterside+drive,+norfolk,+va,+23510
Content-Type: application/json; charset=UTF-8
Date: Sun, 13 Jul 2014 11:49:48 GMT
Expires: Mon, 14 Jul 2014 11:49:48 GMT
Cache-Control: public, max-age=86400
Vary: Accept-Language
Access-Control-Allow-Origin: *
Server: mafe
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Alternate-Protocol: 443:quic
Connection: close
{'results': [{'address_components': [{'long_name': '333',
                                      'short_name': '333',
                                      'types': ['street_number']},
                                     {'long_name': 'Waterside Festival Marketplace',
                                      'short_name': 'Waterside Festival Marketplace',
                                      'types': ['establishment']},
                                     {'long_name': 'Waterside Drive',
                                      'short_name': 'Waterside Dr',
                                      'types': ['route']},
                                     {'long_name': 'Norfolk',
                                      'short_name': 'Norfolk',
                                      'types': ['locality', 'political']},
                                     {'long_name': 'Virginia',
                                      'short_name': 'VA',
                                      'types': ['administrative_area_level_1',
                                                'political']},
                                     {'long_name': 'United States',
                                      'short_name': 'US',
                                      'types': ['country', 'political']},
                                     {'long_name': '23510',
                                      'short_name': '23510',
                                      'types': ['postal_code']}],
              'formatted_address': '333 Waterside Drive, Waterside Festival Marketplace, Norfolk, VA 23510, USA',
              'geometry': {'location': {'lat': 36.844305,
                                        'lng': -76.29111999999999},
                           'location_type': 'ROOFTOP',
                           'viewport': {'northeast': {'lat': 36.84565398029149,
                                                      'lng': -76.28977101970848},
                                        'southwest': {'lat': 36.8429560197085,
                                                      'lng': -76.29246898029149}}},
              'types': ['street_address']}],
 'status': 'OK'}
{'lng': -76.29111999999999, 'lat': 36.844305}
The JSON document can be loaded using the json module. This will create a dictionary with two keys: results and status. In our example, we loaded the dictionary into a variable named response. The value of response['results'] is a list of dictionaries. Since we only requested one address, we only expect one element in this list. Most of what we want, then, is in response['results'][0].
When we examine that structure, we find a subdictionary with four keys. Of those, the 'geometry' key has the geocoding latitude and longitude information.
We can extend this script to access the location details using the following code:
print(response['results'][0]['geometry']['location'])
This provides us with a small dictionary that looks like this:
{'lat': 36.844305, 'lng': -76.29111999999999}
This is what we wanted to know about the street address.
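Digging straight into response['results'][0] will raise an exception if the list is empty. A slightly more careful version might check first. This is a sketch; the function name location_of is hypothetical, and the ZERO_RESULTS status used in the second call is an assumption about what the service reports when it finds nothing.

```python
def location_of(response):
    """Return (lat, lng) from the first result, or None if there isn't one."""
    if response.get("status") != "OK" or not response.get("results"):
        return None
    location = response["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]

# A trimmed-down copy of the response shown in this section.
sample = {
    "results": [
        {"geometry": {"location": {"lat": 36.844305,
                                   "lng": -76.29111999999999}}},
    ],
    "status": "OK",
}
print(location_of(sample))
print(location_of({"results": [], "status": "ZERO_RESULTS"}))
```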
Also, as a purely technical note on the Python language, we included # comments to show the important steps in our algorithm. A comment starts with # and goes to the end of the line. In this example, the comments are on lines by themselves. In general, they can also be placed at the end of any line of code.
Specifically, we called this out with a comment:
form = {
    "address": "333 waterside drive, norfolk, va, 23510",
    "sensor": "false",
    #"key": Provide the API Key here if you're registered,
}
The form dictionary has two keys. A third key can be added by removing the # comment indicator and filling in the API key that Google has supplied.
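Instead of editing a comment, we could add the key only when we have one. The following sketch uses a hypothetical helper named build_form; the key value shown is a placeholder, not a real credential.

```python
def build_form(address, key=None):
    """Build the request form, adding the key field only if a key is given."""
    form = {"address": address, "sensor": "false"}
    if key is not None:
        form["key"] = key
    return form

print(build_form("333 waterside drive, norfolk, va, 23510"))
print(build_form("333 waterside drive, norfolk, va, 23510",
                 key="PLACEHOLDER-KEY"))
```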
The reverse geocoding service locates nearby addresses from a latitude and longitude position. This kind of query involves a certain amount of inherent ambiguity. A point that's midway between two large buildings, for example, could be associated with either or both buildings. Also, we might be interested in different levels of detail: rather than a street address, we may only wish to know the state or country for a particular position.
Here's what this web service request looks like:
import urllib.request
import urllib.parse
import json
# 1. Build the URL.
form = {
    "latlng": "36.844305,-76.29112",
    "sensor": "false",
    #"key": Provide the API Key here if you're registered,
}
query = urllib.parse.urlencode(form, safe=",")
scheme_netloc_path = "https://maps.googleapis.com/maps/api/geocode/json"
print(scheme_netloc_path+"?"+query)
# 2. Send the request; get the response
with urllib.request.urlopen(scheme_netloc_path+"?"+query) as geocode:
    print(geocode.info())
    response = json.loads(geocode.read().decode("UTF-8"))
# 3. Process the response object.
for alt in response['results']:
    print(alt['types'], alt['formatted_address'])
The form has two required fields: latlng and sensor.
Signing up with Google for additional support and higher-volume requests requires an API key. It would become a third field in the request form; we have left a # comment in the code as a reminder.
We encoded the form data and assigned it to the query variable. The safe="," parameter assures us that the "," characters in the latitude-longitude pair will be preserved instead of being rewritten into a %2C escape code.
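The effect of the safe parameter is easy to check without making a request. This sketch encodes the same form twice, once with the default settings and once with safe=",":

```python
import urllib.parse

form = {
    "latlng": "36.844305,-76.29112",
    "sensor": "false",
}

escaped = urllib.parse.urlencode(form)              # comma becomes %2C
preserved = urllib.parse.urlencode(form, safe=",")  # comma is kept

print(escaped)
print(preserved)
```

Both forms decode to the same value; keeping the comma simply makes the URL easier to read when we print it.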
We assembled a complete URL from the fixed portions (the scheme, net location, and path) plus the dynamic query content, which is encoded from the form data.
In the with statement, we created a processing context to send the request and read the response. Inside the with context, we displayed the headers and loaded the resulting JSON document, creating a Python object. Once we have the Python object, we can exit the processing context and release the resources.
The response is a Python dictionary. There are two keys: 'results' and 'status'. The value of response['results'] is a list of dictionaries. There are a number of alternative addresses in the results list. Each result is a dictionary with two interesting keys: the 'types' key, which shows the type of address, and the 'formatted_address' key, which is a well-formatted address close to the given location.
The output looks like this:
['street_address'] 333 Waterside Drive, Waterside Festival Marketplace, Norfolk, VA 23510, USA
['postal_code'] Norfolk, VA 23510, USA
['locality', 'political'] Norfolk, VA, USA
['administrative_area_level_1', 'political'] Virginia, USA
['country', 'political'] United States
Taken together, the alternatives show a hierarchy of nested political containers for the address: postal code, locality, state (called administrative_area_level_1), and country.
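If we only care about one level of this hierarchy, we could search the alternatives by type. The following is a sketch: the function name address_by_type is hypothetical, and the sample list is a trimmed-down copy of the results shown in this section, keeping only the two keys we use.

```python
def address_by_type(results, wanted):
    """Return the first formatted address whose types include wanted."""
    for alt in results:
        if wanted in alt["types"]:
            return alt["formatted_address"]
    return None

results = [
    {"types": ["street_address"],
     "formatted_address": "333 Waterside Drive, Waterside Festival "
                          "Marketplace, Norfolk, VA 23510, USA"},
    {"types": ["postal_code"],
     "formatted_address": "Norfolk, VA 23510, USA"},
    {"types": ["locality", "political"],
     "formatted_address": "Norfolk, VA, USA"},
    {"types": ["administrative_area_level_1", "political"],
     "formatted_address": "Virginia, USA"},
    {"types": ["country", "political"],
     "formatted_address": "United States"},
]

print(address_by_type(results, "administrative_area_level_1"))
print(address_by_type(results, "country"))
```

When no alternative has the requested type, the function returns None, so the caller can decide how to handle a missing level.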