In this recipe, we will use the earthquake dataset included in the source code for Chapter 3, Working with Vector Data – The Basics, as the input geometries for the function. We also need to define the number of clusters, k, that the function will output; in this example, k will be 10. You can experiment with this value and see the different cluster arrangements the function produces; since the same geometries are split among more clusters, the greater the value of k, the fewer geometries each cluster will contain on average.
If you have not previously imported the earthquake data into the Chapter 3, Working with Vector Data – The Basics, schema, refer to the Getting ready section of the GIS analysis with spatial joins recipe.
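To see how the value of k affects cluster sizes, a minimal sketch such as the following counts the earthquakes assigned to each cluster. It assumes that the geometry column of chp03.earthquake is named the_geom, which may differ in your import; the aliases cid, n_quakes, and clustered are purely illustrative.

```sql
-- Sketch: count how many earthquakes fall into each of the k = 10 clusters.
-- Assumption: the geometry column of chp03.earthquake is named the_geom.
SELECT cid, count(*) AS n_quakes
FROM (
    -- ST_ClusterKMeans is a window function: it returns one cluster ID per input row.
    SELECT ST_ClusterKMeans(the_geom, 10) OVER () AS cid
    FROM chp03.earthquake
) AS clustered
GROUP BY cid
ORDER BY cid;
```

Changing the second argument of ST_ClusterKMeans and rerunning the query shows how the counts per cluster shift as k grows or shrinks.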
Once the chp03.earthquake table is in place, we will need two more tables in the chp04 schema. The first will hold the centroid geometry of each cluster together with the cluster ID that the ST_ClusterKMeans function assigns. The second will hold the minimum bounding circle geometry of each cluster. To create them, run the following SQL commands:
CREATE TABLE chp04.earthq_cent (
    cid integer PRIMARY KEY,
    the_geom geometry('POINT',4326)
);
CREATE TABLE chp04.earthq_circ (
    cid integer PRIMARY KEY,
    the_geom geometry('POLYGON',4326)
);
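As a rough sketch of how these two tables could later be filled, the statements below compute the cluster IDs once, then derive a centroid and a minimum bounding circle per cluster. This is one possible approach under stated assumptions, not necessarily the one the recipe follows: it assumes that the geometry column of chp03.earthquake is named the_geom and uses SRID 4326, and the temporary table name earthq_clustered is hypothetical.

```sql
-- Sketch only. Assumptions: chp03.earthquake has a geometry column named
-- the_geom in SRID 4326; earthq_clustered is a hypothetical helper table.

-- Compute the cluster ID once so that both inserts use the same assignment.
CREATE TEMPORARY TABLE earthq_clustered AS
SELECT ST_ClusterKMeans(the_geom, 10) OVER () AS cid, the_geom
FROM chp03.earthquake;

-- One centroid per cluster: collect the cluster's points, then take the centroid.
INSERT INTO chp04.earthq_cent (cid, the_geom)
SELECT cid, ST_Centroid(ST_Collect(the_geom))
FROM earthq_clustered
GROUP BY cid;

-- One minimum bounding circle per cluster.
INSERT INTO chp04.earthq_circ (cid, the_geom)
SELECT cid, ST_MinimumBoundingCircle(ST_Collect(the_geom))
FROM earthq_clustered
GROUP BY cid;
```

Note that a cluster containing a single point would have a degenerate bounding circle, which might not satisfy the POLYGON type constraint on chp04.earthq_circ; with k set to 10 on a reasonably large dataset this is unlikely, but it is worth keeping in mind when experimenting with larger values of k.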