(Geo)spatial data sets
In which I complain about paying a nominal fee for giant rocket robots that scan the earth from space
March 1, 2021 — December 29, 2024
Satellite images, geological tomography, climate and data records, miscellaneous useful data points about our globe
1 Maps and satellite photos
Here is a review of satellite image sources. I have only checked out a handful of these. If you just want eye candy, NASA Visible Earth is a good one. I’m fond of Landsat maps; many can be found through Earth Explorer. All these resources blur into one after a while, with similarly confusing interfaces, unexpected UI glitches, and apparently random pricing structures revealed only belatedly.
openEO develops an open API to connect R, Python, JavaScript and other clients to big Earth observation cloud back-ends in a simple and unified way.
Earth Observation data are becoming too large to be downloaded locally for analysis. Also, the way they are organised (as tiles, or granules: files containing the imagery for a small part of the Earth and a single observation date) makes it unnecessarily complicated to analyse them. The solution to this is to store these data in the cloud, on compute back-ends, process them there, and browse the results or download resulting figures or numbers. But how do we do that?
openEO develops an open application programming interface (API) that connects clients like R, Python and JavaScript to big Earth observation cloud back-ends in a simple and unified way.
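To make the "process it there, download only the result" idea concrete, here is a minimal sketch of the kind of JSON process graph an openEO client sends to a back-end. It uses only the standard library and never contacts a server. The process names (`load_collection`, `reduce_dimension`, `mean`, `save_result`) come from the openEO process catalogue as I understand it; the helper `build_process_graph`, the collection id `SENTINEL2_L2A`, the temporal dimension name `t`, and the bounding box are illustrative assumptions that will differ between back-ends.

```python
import json


def build_process_graph(collection_id, bbox, date_range):
    """Build an openEO-style process graph as a plain dict (hypothetical helper).

    bbox is (west, south, east, north) in degrees; date_range is (start, end).
    """
    west, south, east, north = bbox
    return {
        # Step 1: select a data cube on the back-end; nothing is downloaded.
        "load": {
            "process_id": "load_collection",
            "arguments": {
                "id": collection_id,  # assumption: back-end specific id
                "spatial_extent": {
                    "west": west, "south": south, "east": east, "north": north,
                },
                "temporal_extent": list(date_range),
            },
        },
        # Step 2: collapse the time axis to a per-pixel mean, still server-side.
        "mean_over_time": {
            "process_id": "reduce_dimension",
            "arguments": {
                "data": {"from_node": "load"},
                "dimension": "t",  # assumption: name of the temporal dimension
                "reducer": {
                    "process_graph": {
                        "mean": {
                            "process_id": "mean",
                            "arguments": {"data": {"from_parameter": "data"}},
                            "result": True,
                        }
                    }
                },
            },
        },
        # Step 3: only this (much smaller) result would be downloaded.
        "save": {
            "process_id": "save_result",
            "arguments": {
                "data": {"from_node": "mean_over_time"},
                "format": "GTiff",
            },
            "result": True,
        },
    }


graph = build_process_graph(
    "SENTINEL2_L2A",
    bbox=(149.0, -35.5, 149.3, -35.2),
    date_range=("2021-01-01", "2021-02-01"),
)
print(json.dumps(graph, indent=2))
```

In practice the R, Python, or JavaScript client libraries build this structure for you; the point of the sketch is only that the request is a small description of the computation, while the tiles themselves stay on the back-end.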
Google Earth Engine (earthengine.google.com/) provides lots of imagery with an eye to discoverability and UX.
The public data archive includes more than thirty years of historical imagery and scientific datasets, updated and expanded daily. It contains over twenty petabytes of geospatial data instantly available for analysis.
See also Australia-specific stuff.
1.1 Eye candy
The special subcategory of geospatial data that looks pretty.
Hurricane as seen from space photo – Free Space Image on Unsplash
Aerial photograph of islands photo – Free Nature Image on Unsplash
Aerial photo of coastline photo – Free Nature Image on Unsplash
Top view photography of brown islands photo – Free Nature Image on Unsplash
Aerial Glacier Photographs — see also GitHub - jonkeegan/nagap-aerial-glacier-photographs at beautifulpublicdata.com
unsplash
Masterclasses in turning geographic datasets into eye candy:
2 Weather/climate
CHIRPS: Rainfall Estimates from Rain Gauge and Satellite Observations
Pangeo is an umbrella organisation providing many geospatial data tools, including a catalogue of hydrological, oceanographic, and similar datasets.
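from intake import open_catalog

# Open the Pangeo master catalogue (needs the intake package and network access)
cat = open_catalog("https://raw.githubusercontent.com/pangeo-data/pangeo-datastore/master/intake-catalogs/master.yaml")
# List the names of the top-level entries in the catalogue
list(cat)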
Open Data Cube is a whole Python library for working with satellite images and other large-scale raster data.
The Extreme Weather Dataset (Racah et al. 2017) includes, for each year, a (1460, 16, 768, 1152) array containing:
- 1460 example images (4 per day, 365 days in the year)
- 16 channels in each image corresponding to various weather-related quantities
- each channel is 768 x 1152, corresponding to roughly one measurement per 25 km x 25 km grid cell on earth
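A quick back-of-envelope check of what that array shape implies for storage and grid spacing. This is a sketch under stated assumptions: values are stored as 4-byte floats (the source does not say), the grid covers the whole globe, and the Earth is a sphere of radius 6371 km.

```python
import math

# Shape quoted for one year of data: (time steps, channels, rows, columns).
shape = (1460, 16, 768, 1152)
n_time, n_channels, n_rows, n_cols = shape

# Total number of stored values in one yearly array.
n_values = math.prod(shape)

# Assumption: each value is a 4-byte float (float32).
BYTES_PER_VALUE = 4
size_gib = n_values * BYTES_PER_VALUE / 2**30

# Assumption: the grid tiles a spherical Earth of radius 6371 km.
EARTH_RADIUS_KM = 6371.0
earth_area_km2 = 4 * math.pi * EARTH_RADIUS_KM**2

# Average surface area represented by one grid cell, and its side length
# if that area were a square.
mean_cell_area_km2 = earth_area_km2 / (n_rows * n_cols)
mean_cell_side_km = math.sqrt(mean_cell_area_km2)

print(f"values per year:  {n_values:,}")
print(f"size per year:    {size_gib:.1f} GiB")
print(f"mean cell area:   {mean_cell_area_km2:.0f} km^2")
print(f"mean cell side:   {mean_cell_side_km:.1f} km")
```

Under these assumptions one year comes to roughly 77 GiB, and the average cell is about 24 km on a side, so the "25" in the description is best read as a grid spacing in kilometres rather than a cell area.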
3 Biota
Especially remote sensing of biodiversity (Guo et al. 2023; Harwood et al. 2021; Mokany et al. 2022, 2022; Williams et al. 2021).
4 Data assimilation
- BG - Assimilation of multiple datasets results in large differences in regional- to global-scale NEE and GPP budgets simulated by a terrestrial biosphere model
- La Thuile Synthesis Dataset - FLUXNET
- FLUXNET2015 Dataset - FLUXNET
- SODA: Simple Ocean Data Assimilation | Climate Data Guide
- Land Data Assimilation System | LDAS
- GES DISC Search: datasets associated with Hydrology
- Dataset - Catalog
- Mapping Extreme Events from Space | GIS for Science
5 Incoming
“This webpage provides an interactive and searchable catalogue of public benchmark datasets for earth observation with the aim to support researchers in the fields of geoscience, remote sensing, and ML.”
Foursquare Open Source Places: A new foundational dataset for the geospatial community | Foursquare
Unfortunately, in geospatial, location, and mapping software the data layer remains largely the provenance of large scale proprietary systems. The walled garden nature of the data layer greatly hampers the industry’s ability to go from strict specialization to generalized adoption, and it is in the general adoption layer that the real value to customers exists.
In an effort to change that dynamic, we are announcing today the general availability of a foundational open data set, Foursquare Open Source Places (“FSQ OS Places”). This base layer of 100mm+ global places of interest (“POI”) includes 22 core attributes (see schema here) that will be updated monthly and available for commercial use under the Apache 2.0 license framework.