In this chapter, we will use interpolation methods to estimate the unknown values at one location based on the known values at other locations.
Interpolation is a technique for estimating unknown values based entirely on their geographic relationship with values at known locations. Although space can, in principle, be measured with infinite precision, data measurement is always limited by the data collector's finite resources. Interpolation and other more sophisticated spatial estimation techniques are useful for estimating the values at locations that have not been measured.
In this chapter, you will learn how to interpolate the values in weather station data, which will be scored and used in a model of vulnerability to a particular agricultural condition: mildew. We have subset the weather data to a single month of the year during which vulnerability has historically been high. An end user could use this application to ground truth the model, that is, to match high or low predicted vulnerability with the presence or absence of mildew. If the model were extended historically or to near real time, the application could be used to see trends in vulnerability over time or to indicate that a grower needs to take action to prevent mildew. The parameters, including precipitation, relative humidity, and temperature, are those used in real models that predict the vulnerability of fields and crops to mildew.
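To make the idea of estimating a value from nearby known values concrete, here is a minimal sketch of inverse distance weighting (IDW), one common interpolation method. IDW is chosen here only for illustration and is not necessarily the method used later in the chapter. The function name, the station coordinates, and the values are all hypothetical.

```python
import math


def idw_estimate(known, target, power=2.0):
    """Estimate the value at `target` from (x, y, value) samples.

    Each sample is weighted by 1 / distance**power, so nearer samples
    influence the estimate more than distant ones.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in known:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            # The target coincides with a sample: use the measured value.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one known sample is required")
    return weighted_sum / weight_total


# Hypothetical stations as (x, y, measured value); not taken from the chapter data.
stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint_estimate = idw_estimate(stations, (1.0, 0.0))
```

Because the target in this example is equally distant from both stations, the estimate is simply their mean. Raising `power` makes the nearest station dominate the result.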
In this chapter, we will cover the following topics:
Often, the data to be used in a highly interactive, dynamic web application is stored in an existing enterprise database. Although these are not the usual spatial databases, they contain coordinate locations, which can be easily leveraged in a spatial application.
The following section is provided as an illustration only; database installation and setup are needlessly time-consuming for a short demonstration of their use.
If you do wish to install and set up MySQL, you can download it from http://dev.mysql.com/downloads/. MySQL Community Server is freely available under the open source GPL license. You will want to install MySQL Workbench and MySQL Utilities, which are also available at this location, for interaction with your new MySQL Community Server instance. You can then restore the database used in this demonstration using the Data Import/Restore command with the provided backup file (c6/original/packt.sql) from MySQL Workbench.
To connect to and add data from your MySQL database to your QGIS project, you need to do the following (again, as this is for demonstration only, it does not require database installation and setup):

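Enter the following connection parameters, which were set up when the .sql backup of the packt schema was restored, as shown in the following screenshot:
Name: packt
Host: localhost
Database: packt
Port: 3306
Username: packt
Password: packt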
Add the following tables: fields, precipitation, relative_humidity, and temperature.
The layers (actually just the data tables) from the MySQL database will now appear in the Layers panel of your QGIS project.
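The same connection can also be described as a data source string rather than through the dialog. The following is a minimal sketch that assembles a string in the form documented for GDAL/OGR's MySQL driver (`MYSQL:dbname,option=value,...`). The helper name is hypothetical, and whether your GDAL build includes the MySQL driver is an assumption you would need to check.

```python
def mysql_ogr_source(database, host, port, user, password, tables=()):
    """Build an OGR-style MySQL data source string (hypothetical helper)."""
    parts = [
        "MYSQL:" + database,
        "host=" + host,
        "port=%d" % port,
        "user=" + user,
        "password=" + password,
    ]
    if tables:
        # OGR separates multiple table names with semicolons.
        parts.append("tables=" + ";".join(tables))
    return ",".join(parts)


# Values taken from the connection parameters used in this demonstration.
source = mysql_ogr_source(
    "packt", "localhost", 3306, "packt", "packt",
    tables=("fields", "precipitation", "relative_humidity", "temperature"),
)
```

A string like this could then be handed to an OGR-aware tool; it is shown here only to make the individual connection parameters explicit.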
Of the four tables we added to our project, only the fields layer (table) has latitude and longitude fields. We want this table to be recognized by QGIS as geospatial data and these coordinate pairs to be plotted in QGIS. Perform the following steps:
Save the fields table as a .csv file. This file is included in the data under c6/data/output/fields.csv.
Now, to import the CSV with the coordinate fields that are recognized as geospatial data and to plot the locations, perform the following steps:

You will receive a notification that, because no coordinate system was detected in this file, WGS 1984 was assigned. This is the correct coordinate system in our case, so no further intervention is necessary. After you dismiss this message, you will see the fields locations plotted on your map. If you don't, right-click on the new layer and select Zoom to Layer.
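What the import does conceptually is turn each row's coordinate pair into a point geometry. A minimal sketch of that step outside QGIS, using only the standard library, might look like the following. The column names `latitude` and `longitude`, the sample rows, and the function name are assumptions for illustration; check them against the actual header of fields.csv.

```python
import csv
import io

# Hypothetical sample rows; the real fields.csv may have different columns and values.
SAMPLE_CSV = """field_id,latitude,longitude
1,39.75,-75.55
2,39.10,-75.45
"""


def read_points(csv_file, lat_field="latitude", lon_field="longitude"):
    """Convert CSV rows into GeoJSON-style point features (WGS 84 assumed)."""
    features = []
    for row in csv.DictReader(csv_file):
        lat = float(row.pop(lat_field))
        lon = float(row.pop(lon_field))
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            # Remaining columns are kept as text attributes.
            "properties": dict(row),
        })
    return features


points = read_points(io.StringIO(SAMPLE_CSV))
```

Note the coordinate order: longitude is the x value and latitude is the y value, which is the same distinction the import dialog asks you to make.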
Note that this new layer is not reflected in a new file on the filesystem but is only stored with this QGIS project. This would be a good time to save your project.
Finally, join the other tables (precipitation, relative_humidity, and temperature) to the newly plotted layer (fields), one at a time, using the field_id field from each table. For a refresher on how to do this, refer to the Table join section of Chapter 1, Exploring Places – from Concept to Interface. To export each layer as a separate shapefile, right-click on each (precipitation, relative_humidity, and temperature), click on Save as, enter the path where you want to save it, and then save it.
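The join described above matches rows on a shared key and copies the attributes across. A minimal sketch of that logic in plain Python follows; the function name, the `precip_mm` column, and the sample values are hypothetical, and only `field_id` comes from the text.

```python
def join_by_key(features, records, key="field_id"):
    """Attach attributes from `records` to `features` where `key` matches."""
    # If several records share a key, this sketch keeps the last one.
    lookup = {record[key]: record for record in records}
    joined = []
    for feature in features:
        merged = dict(feature)
        match = lookup.get(feature[key], {})
        for name, value in match.items():
            if name != key:
                merged[name] = value
        joined.append(merged)
    return joined


# Hypothetical rows standing in for the fields layer and one measurement table.
fields = [
    {"field_id": 1, "latitude": 39.75, "longitude": -75.55},
    {"field_id": 2, "latitude": 39.10, "longitude": -75.45},
]
precipitation = [{"field_id": 1, "precip_mm": 81.0}]

joined = join_by_key(fields, precipitation)
```

Features without a matching record are kept unchanged, which is why a join is repeated once per measurement table.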
Newer versions of QGIS support layer/table relations, which would allow us to model the one-to-many relationship between our locations and an abstract measurement class that would include all the parameters. However, table relations are useful only for a preliminary exploration of the relationships between layer objects and tables; they are not recognized by any processing functions. Perform the following steps to explore the one-to-many layer/table relationships:

Use the key field (field_id), which references the layer, to relate the tables. The name field can be filled arbitrarily, as shown in the following screenshot:
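Unlike a join, a relation keeps every child row and groups the rows under their parent. The following minimal sketch shows that one-to-many grouping on the referencing key; the function name, the `month` and `temp_c` columns, and the sample values are hypothetical.

```python
from collections import defaultdict


def group_children(child_rows, key="field_id"):
    """Group child rows under the parent key they reference."""
    children = defaultdict(list)
    for row in child_rows:
        children[row[key]].append(row)
    return dict(children)


# Hypothetical measurement rows; several can reference the same field.
temperature = [
    {"field_id": 1, "month": 6, "temp_c": 21.5},
    {"field_id": 1, "month": 7, "temp_c": 24.0},
    {"field_id": 2, "month": 6, "temp_c": 20.0},
]

by_field = group_children(temperature)
```

Looking up `by_field[1]` returns both measurements for that location, which is the kind of browsing a relation supports.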

Network Common Data Form (NetCDF) is a standard and powerful format for environmental data, such as meteorological data. NetCDF's strong suit is holding multidimensional data. With its abstract concept of dimension, NetCDF can handle the dimensions of latitude, longitude, and time in the same way that it handles other scales, which are often physical, continuous, and ordinal, such as air pressure levels.
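One way to picture the abstract dimension concept is as a set of named axes over a single block of values. The sketch below assumes a variable laid out as (time, lat, lon) with the last dimension varying fastest, and computes where a given cell sits in the flattened data. The dimension sizes and the function name are hypothetical, and this illustrates only the indexing idea, not the NetCDF file format itself.

```python
def flat_index(shape, indices):
    """Return the row-major offset of `indices` within an array of `shape`."""
    if len(shape) != len(indices):
        raise ValueError("one index is required per dimension")
    offset = 0
    for size, index in zip(shape, indices):
        if not 0 <= index < size:
            raise IndexError("index %d out of range for size %d" % (index, size))
        offset = offset * size + index
    return offset


# Hypothetical dimension sizes; any further axis (such as pressure level)
# would be handled in exactly the same way.
dimensions = {"time": 2, "lat": 3, "lon": 4}
shape = tuple(dimensions.values())

# A flat list stands in for the stored values of one variable.
values = list(range(2 * 3 * 4))
cell = values[flat_index(shape, (1, 2, 3))]
```

Adding another dimension only lengthens `shape`; the indexing logic does not change, which is the sense in which all dimensions are treated alike.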
For this project, we used the monthly global gridded high-resolution station (land) data for air temperature and precipitation from 1901 to 2010, which the University of Delaware maintains as part of a collaboration with NOAA. You can download further data from this source at http://www.esrl.noaa.gov/psd/data/gridded/data.UDel_AirT_Precip.html.
While there is a plugin available, NetCDF can be viewed directly in QGIS, in GDAL via the command line, and in the QGIS Python Console. Perform the following steps:
Browse to c6/data/original/air.mon.mean.v301.nc and add this layer.
Note the valid range of values, which is given by air_valid_range. You can see this information highlighted in the following image. Although QGIS's classifier will calculate the range for you, it is often thrown off by a numeric nodata value, which will typically skew the range toward the lower end.
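The effect of a numeric nodata value on a computed range is easy to reproduce. In this minimal sketch, the nodata value of -9999.0, the cell values, and the function name are all hypothetical; the actual nodata value and valid range should be read from the file's metadata.

```python
def value_range(values, nodata=None, valid_range=None):
    """Return (min, max) after dropping nodata and out-of-range values."""
    kept = []
    for value in values:
        if nodata is not None and value == nodata:
            continue
        if valid_range is not None and not (valid_range[0] <= value <= valid_range[1]):
            continue
        kept.append(value)
    if not kept:
        raise ValueError("no valid values to summarize")
    return min(kept), max(kept)


NODATA = -9999.0  # hypothetical placeholder for missing cells
cells = [-9999.0, -12.5, 3.0, 27.4, -9999.0]

naive = (min(cells), max(cells))            # nodata drags the minimum down
masked = value_range(cells, nodata=NODATA)  # range of the real measurements
```

Supplying the documented valid range through `valid_range` has the same effect when the nodata value itself is not known.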


To make the gridded NetCDF data accessible to certain models, databases, and web interaction, you could write a workflow program similar to the following, to be run after sampling the gridded values and attaching them to the points for each time period.