Cloud storage of geospatial data has become a common part of many GIS architectures. Whether it is used as a backup to an on-premises solution, replaces an on-premises solution, or is combined with a local solution to provide internet support for an intranet-based system, the cloud is a big part of the future of GIS.
With ArcGIS Online, CARTO, MapBox, and now MapD, the options for a cloud data store that supports geospatial data are more numerous than ever. Each offers a visualization component and a different type of data storage, and each will integrate with your data and software in different ways.
ArcGIS Online, while also offering stand-alone options (that is, direct data upload), integrates with ArcGIS Enterprise (formerly ArcGIS Server) to consume enterprise Representational State Transfer (REST) web services that are stored on a local geodatabase. ArcGIS Online is built on top of Amazon Web Services (AWS), and all of the server architecture is hidden from users. Enterprise integration requires a high level of licensing (cost), which includes a number of cloud tokens (that is, credits); storage and analysis within the cloud account itself can consume many of those tokens.
CARTO offers cloud PostGIS storage, allowing geospatial data files to be uploaded. With the release of the Python package CARTOframes (covered in Chapter 14, Cloud Geodatabase Analysis and Visualization), cloud datasets can be uploaded and updated using scripting. Using Python, a CARTO account can become a part of an enterprise solution that maintains up-to-date datasets while allowing them to be quickly deployed as custom web maps using the builder application. CARTO offers two tiers of paid accounts, which have different levels of storage.
MapBox is focused on map tools for creating custom basemaps for mobile apps, but it also offers cloud storage of datasets and map creation tools such as MapBox GL, the JavaScript library for maps built on the Web Graphics Library (WebGL). With the new MapBox GL-Jupyter module, the data can be accessed using Python.
MapD, while offering similar solutions to those mentioned, is different in a number of respects. It has an open source version of the database (MapD Core Community Edition), which can be used locally or in the cloud, and an enterprise version for large customers. While MapD Core has a relational database schema and uses SQL for queries like a traditional RDBMS, it uses GPUs to accelerate those queries. MapD Core can be cloud-deployed on AWS, Google Cloud Platform, and Microsoft Azure. MapD can be installed on servers without GPUs as well, though this reduces its effective speed gains over other geodatabases.
All of the geodatabases support Jupyter Notebook environments for data queries, but MapD integrates them into the SQL EDITOR within the Immerse visualization platform. When data is uploaded with pymapd, MapD uses Apache Arrow; it also supports INSERT statements, and data can be loaded using the Immerse data importer (including SHPs, GeoJSONs, and CSVs).
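As a rough illustration of the two scripted loading paths just described, the following minimal sketch wraps a pymapd connection in a few small helpers. It assumes the pymapd package is installed and a MapD server is reachable. The helper names (connect_mapd, load_dataframe, sql_literal, insert_row) are hypothetical, as are any table names and credentials passed to them. Only connect, load_table_arrow, and execute are pymapd calls. The literal formatting is a simplification for illustration, not a substitute for proper parameter handling.

```python
def connect_mapd(user, password, host, dbname):
    """Open a connection to a MapD server (all arguments are placeholders)."""
    # Imported lazily so the helpers below can be exercised without pymapd.
    from pymapd import connect
    return connect(user=user, password=password, host=host, dbname=dbname)


def load_dataframe(con, table_name, df):
    """Bulk-load a pandas DataFrame into a table using the Arrow path."""
    # load_table_arrow sends the frame in Arrow columnar format.
    con.load_table_arrow(table_name, df, preserve_index=False)


def sql_literal(value):
    """Format a Python value as a SQL literal (illustrative only)."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return repr(value)
    if isinstance(value, str):
        # Double any single quotes so the string stays a single literal.
        return "'" + value.replace("'", "''") + "'"
    raise TypeError("unsupported value type: {!r}".format(type(value)))


def insert_row(con, table_name, values):
    """Insert one row with a plain INSERT statement and return the SQL."""
    sql = "INSERT INTO {} VALUES ({})".format(
        table_name, ", ".join(sql_literal(v) for v in values)
    )
    con.execute(sql)
    return sql
```

In a notebook, the connection returned by connect_mapd would be passed to load_dataframe for bulk loads and to insert_row for occasional single records. The Arrow path is generally the better fit for large DataFrames, while INSERT statements suit small, incremental updates.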