EarthSciData.CEDSFileSet — Type

CEDS (Community Emissions Data System) gridded emissions data.

Data is from the CEDS-CMIP release, providing global monthly anthropogenic emissions at 0.5° × 0.5° resolution. Each file contains one species and has dimensions (time, sector, lat, lon) with units kg m⁻² s⁻¹.

Reference: Hoesly et al., 2018, https://doi.org/10.5194/gmd-11-369-2018

species specifies which emission species to load (e.g., "SO2"). sectors is an optional vector of sector indices (0-7) to include; nothing means sum all sectors (default).
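
As a rough illustration of the sectors option, the sketch below sums one time slice over either all sectors or a chosen subset. The array layout (lon, lat, sector), the helper name sum_sectors, and the shift from 0-based sector indices to Julia's 1-based indexing are assumptions made for this example, not the package's implementation.

```julia
# Hypothetical stand-in for one time slice of a CEDS file, laid out as (lon, lat, sector).
nlon, nlat, nsector = 4, 3, 8
emis = ones(Float32, nlon, nlat, nsector)

# Sum over the chosen sectors; `nothing` means all sectors.
function sum_sectors(emis::AbstractArray{T,3}, sectors) where {T}
    # Sector indices in the documentation are 0-based; Julia arrays are 1-based.
    idx = sectors === nothing ? (1:size(emis, 3)) : sectors .+ 1
    return dropdims(sum(emis[:, :, idx]; dims = 3); dims = 3)
end

total = sum_sectors(emis, nothing)            # all eight sectors
energy_transport = sum_sectors(emis, [1, 3])  # Energy and Transportation only
```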

source
EarthSciData.DataFrequencyInfo — Type

Information about the temporal frequency of archived data.

  • start: Beginning of the time series.

  • frequency: Interval between each record.

  • centerpoints: Time representing the temporal center of each record.
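
To make the three fields concrete, here is a minimal sketch that derives center points for monthly records using only the Dates standard library. The variable names mirror the fields above, but the calculation is an assumption about how such values could be computed, not the package's code.

```julia
using Dates

start = DateTime(2016, 1, 1)   # beginning of the time series
frequency = Month(1)           # interval between records

# Record boundaries for the first three months.
edges = [start + i * frequency for i in 0:3]

# Center of each record: halfway between consecutive boundaries.
centerpoints = [
    edges[i] + Millisecond(Dates.value(edges[i + 1] - edges[i]) ÷ 2) for i in 1:3
]
```

Because months differ in length, the center points are not evenly spaced even though the nominal frequency is constant.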

source
EarthSciData.DataSetInterpolator — Type

DataSetInterpolators are used to interpolate data from a FileSet to represent a given time and location. Data is loaded (and downloaded) lazily, so the first time you use an interpolator for a given dataset and time period it may take a while to load. Each time step is downloaded and loaded as it is needed during the simulation and cached on the hard drive at the path specified by the $EARTHSCIDATADIR environment variable, or in a scratch directory if that environment variable has not been specified. The interpolator also caches in memory the data records for the times immediately before and after the current time step.

varname is the name of the variable to interpolate. default_time is the time to use when initializing the interpolator. spatial_ref is the spatial reference system that the simulation will be using. stream specifies whether the data should be streamed in as needed or loaded all at once.
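
Keeping the records just before and after the current time in memory makes it possible to interpolate between two bracketing records. The sketch below shows plain linear interpolation in time; the function name interp_in_time and the choice of linear weighting are assumptions for illustration, and the package may use a different scheme.

```julia
using Dates

# Linearly blend two records that bracket time `t` (t0 <= t <= t1).
function interp_in_time(t::DateTime, t0::DateTime, t1::DateTime, rec0, rec1)
    w = (t - t0) / (t1 - t0)   # fraction of the way from t0 to t1
    return (1 - w) .* rec0 .+ w .* rec1
end

rec_before = [1.0 2.0; 3.0 4.0]   # hypothetical record at 00:00
rec_after = [3.0 4.0; 5.0 6.0]    # hypothetical record at 06:00
midpoint = interp_in_time(
    DateTime(2016, 1, 1, 3), DateTime(2016, 1, 1, 0), DateTime(2016, 1, 1, 6),
    rec_before, rec_after,
)
```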

source
EarthSciData.FileSet — Type

An interface for types describing a dataset, potentially comprised of multiple files.

To satisfy this interface, a type must implement the following methods:

  • mirror(::FileSet) (Return the base URL or path for the dataset)
  • relpath(::FileSet, ::DateTime, [varname]) (varname is optional, for per-variable file datasets)
  • url(::FileSet, ::DateTime, [varname])
  • localpath(::FileSet, t::DateTime, [varname])
  • DataFrequencyInfo(::FileSet)::DataFrequencyInfo
  • loadmetadata(::FileSet, varname)::MetaData
  • loadslice!(cache::AbstractArray, ::FileSet, ::DateTime, varname)
  • varnames(::FileSet)
  • get_geometry(::FileSet, ::MetaData) (Returns the geometry of the data)
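
The path-related part of this interface can be pictured with a small self-contained sketch. Everything here is hypothetical: the module FileSetSketch, its local FileSet stand-in (not EarthSciData.FileSet), the DailyFileSet type, and the one-file-per-day naming scheme are invented for illustration, and only four of the listed methods are shown.

```julia
module FileSetSketch
using Dates

abstract type FileSet end   # local stand-in, not EarthSciData.FileSet

# Hypothetical dataset with one NetCDF file per day.
struct DailyFileSet <: FileSet
    mirror::String      # base URL of the remote archive
    cache_dir::String   # local directory used for caching
end

mirror(fs::DailyFileSet) = fs.mirror

# Path relative to the mirror root, always with "/" separators.
relpath(fs::DailyFileSet, t::DateTime) =
    string(Dates.format(t, "yyyy"), "/", Dates.format(t, "yyyymmdd"), ".nc")

url(fs::DailyFileSet, t::DateTime) = string(mirror(fs), "/", relpath(fs, t))

localpath(fs::DailyFileSet, t::DateTime) =
    joinpath(fs.cache_dir, split(relpath(fs, t), "/")...)
end

fs = FileSetSketch.DailyFileSet("https://example.org/data", "cache")
remote = FileSetSketch.url(fs, DateTime(2016, 5, 1))
```

Deriving both url and localpath from relpath keeps the remote layout and the local cache layout in step.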
source
EarthSciData.GEOSFPFileSet — Type

GEOS-FP data as archived for use with GEOS-Chem classic.

Domain options (as of 2022-01-30):

  • 4x5
  • 0.125x0.15625_AS
  • 0.125x0.15625_EU
  • 0.125x0.15625_NA
  • 0.25x0.3125
  • 0.25x0.3125_AF
  • 0.25x0.3125_AS
  • 0.25x0.3125_CH
  • 0.25x0.3125_EU
  • 0.25x0.3125_ME
  • 0.25x0.3125_NA
  • 0.25x0.3125_OC
  • 0.25x0.3125_RU
  • 0.25x0.3125_SA
  • 0.5x0.625
  • 0.5x0.625_AS
  • 0.5x0.625_CH
  • 0.5x0.625_EU
  • 0.5x0.625_NA
  • 2x2.5
  • C180
  • C720
  • NATIVE
  • c720

Possible filetypes are:

  • :A1
  • :A3cld
  • :A3dyn
  • :A3mstC
  • :A3mstE
  • :I3

See http://geoschemdata.wustl.edu/ExtData/ for current options.

source
EarthSciData.GridSpec — Type

A lightweight specification of a model grid, providing an alternative to EarthSciMLBase.DomainInfo for use cases that don't require ModelingToolkit integration.

  • coords: Coordinate ranges for each spatial dimension, ordered (x, y, [z]).

  • spatial_ref: The spatial reference system, e.g. "+proj=longlat +datum=WGS84 +no_defs".

source
EarthSciData.MetaData — Type

Information about a data array.

  • coords: The locations associated with each data point in the array.

  • unit_str: Physical units of the data, e.g. m s⁻¹.

  • description: Description of the data.

  • dimnames: Dimensions of the data, e.g. (lat, lon, layer).

  • varsize: Dimension sizes of the data, e.g. (180, 360, 30).

  • native_sr: The spatial reference system of the data, e.g. "+proj=longlat +datum=WGS84 +no_defs" for lat-lon data.

  • xdim: The index number of the x-dimension (e.g. longitude)

  • ydim: The index number of the y-dimension (e.g. latitude)

  • zdim: The index number of the z-dimension (e.g. vertical level)

  • staggering: Grid staggering for each dimension. (true=edge-aligned, false=center-aligned)

source
EarthSciData.NetCDFOutputter — Type

Create an EarthSciMLBase.Operator to write simulation output to a NetCDF file.

  • filepath::String: The path of the NetCDF file to write to

  • file::Any: The netcdf dataset

  • vars::Any: The netcdf variables corresponding to the state variables

  • tvar::Any: The netcdf variable for time

  • h::Int64: Current time index for writing

  • time_interval::AbstractFloat: Simulation time interval (in seconds) at which to write to disk

  • extra_vars::AbstractVector: Extra observed variables to write to disk

  • extra_var_fs::AbstractVector: Functions to get the extra vars

  • grid::Any: Spatial grid specification

  • dtype::Any: Data type of the output

  • tref::Any: Reference time for the simulation, used to convert to Unix time

source
EarthSciData.OpenAQFileSet — Type

A FileSet for OpenAQ air quality monitoring data.

Data is sourced from the OpenAQ AWS S3 archive (gzip-compressed CSV files). Station locations within the model domain are discovered via the OpenAQ API and cached locally. Measurement data is downloaded from S3 (no API key needed for data download).

Point observations from monitoring stations are mapped to model grid cells by averaging all stations that fall within each cell. Grid cells with no stations receive a configurable fill value (default NaN).

The station_filter function can be used to filter stations, e.g. by name or ID. It should accept an OpenAQStation and return true to include the station.
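
The cell-averaging step described above can be sketched in a few lines. The function name bin_stations, its argument layout, and the use of searchsortedlast on the cell edges are assumptions for this example rather than the package's implementation.

```julia
# Average point observations into grid cells defined by sorted edge vectors.
function bin_stations(lons, lats, vals, lon_edges, lat_edges; fill_value = NaN)
    nx, ny = length(lon_edges) - 1, length(lat_edges) - 1
    total = zeros(nx, ny)
    n = zeros(Int, nx, ny)
    for (lon, lat, v) in zip(lons, lats, vals)
        i = searchsortedlast(lon_edges, lon)   # index of the cell containing lon
        j = searchsortedlast(lat_edges, lat)
        (1 <= i <= nx && 1 <= j <= ny) || continue   # skip stations outside the grid
        total[i, j] += v
        n[i, j] += 1
    end
    # Cells without stations receive the fill value.
    return [n[i, j] > 0 ? total[i, j] / n[i, j] : fill_value for i in 1:nx, j in 1:ny]
end

grid = bin_stations(
    [0.2, 0.8, 5.0], [0.5, 0.5, 0.5], [10.0, 20.0, 99.0],
    [0.0, 1.0, 2.0], [0.0, 1.0],
)
```

In this example the first cell holds the mean of two stations, the second cell has none and is filled with NaN, and the third station falls outside the grid and is ignored.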

source
EarthSciData.OpenAQFileSet — Method
OpenAQFileSet(
    parameter,
    starttime,
    endtime,
    bbox;
    grid_lon_edges,
    grid_lat_edges,
    api_key,
    station_filter,
    fill_value
)

Construct an OpenAQFileSet for the given parameter and time range.

parameter is the OpenAQ parameter name (e.g. "pm25", "o3", "no2").

bbox is a named tuple (lon_min, lat_min, lon_max, lat_max) in degrees specifying the bounding box for station discovery.

api_key is the OpenAQ API key. Defaults to ENV["OPENAQ_API_KEY"].

station_filter is an optional function f(::OpenAQStation) -> Bool to filter stations. Note: station coordinates (lon, lat) are in radians when the filter is applied.

fill_value is used for grid cells with no stations (default NaN).
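
Because station coordinates are in radians when the filter runs, a filter written in degrees needs a conversion. The sketch below uses a hypothetical StationSketch struct with the same field names as OpenAQStation (the field types are assumed) and a hypothetical helper north_of.

```julia
# Hypothetical stand-in with the same field names as OpenAQStation.
struct StationSketch
    id::Int
    name::String
    lon::Float64   # radians
    lat::Float64   # radians
end

# Build a filter that keeps stations at or north of a latitude given in degrees.
north_of(lat_deg) = station -> station.lat >= deg2rad(lat_deg)

stations = [
    StationSketch(1, "A", deg2rad(-105.0), deg2rad(39.7)),
    StationSketch(2, "B", deg2rad(-93.3), deg2rad(45.0)),
]
kept = filter(north_of(40.0), stations)
```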

source
EarthSciData.OpenAQStation — Type

Information about an OpenAQ monitoring station.

  • id: OpenAQ location ID

  • name: Station name

  • lon: Longitude in radians

  • lat: Latitude in radians

source
EarthSciData.TemporalCache — Type

Cache for time-varying interpolation state within a DataSetInterpolator.

  • data: The actual data array, with the last dimension being time.

  • interp_cache: Buffer used for interpolation.

  • load_cache: Buffer that data is read into from file (separate from data for async loading).

  • itp: The interpolation object.

  • times: Timestamps corresponding to each time index in data.

  • currenttime: The current time that the interpolator has been loaded for.

  • loadtask: Async task for loading the next time step.

  • initialized: Whether the cache has been initialized with real data.

source
Base.close — Method

Close resources associated with a DataSetInterpolator, including the underlying FileSet.

source
Base.close — Method

Close resources associated with a FileSet. Default is a no-op. Concrete subtypes with open file handles should override this.

source
EarthSciData.CEDS — Method
CEDS(
    domaininfo;
    species,
    sectors,
    mirror,
    version,
    data_version,
    name,
    stream
)

A data loader for CEDS (Community Emissions Data System) global gridded anthropogenic emissions.

CEDS provides monthly emissions at 0.5° × 0.5° resolution from 1750 to 2023 in units of kg m⁻² s⁻¹, with 8 anthropogenic sectors.

Reference: Hoesly et al., 2018, https://doi.org/10.5194/gmd-11-369-2018

Available species: BC, CH4, CO, CO2, N2O, NH3, NMVOC, NOx, OC, SO2.

Sectors (0-7): 0: Agriculture; 1: Energy; 2: Industrial; 3: Transportation; 4: Residential, Commercial, Other; 5: Solvents production and application; 6: Waste; 7: International Shipping.

Keyword Arguments

  • species: Vector of species to load. Default is all: ["BC", "CH4", "CO", "CO2", "N2O", "NH3", "NMVOC", "NOx", "OC", "SO2"].
  • sectors: Vector of sector indices (0-7) to include, or nothing for all (default).
  • mirror: Base URL for data download. Default is the ORNL ESGF THREDDS server.
  • version: CEDS source version. Default is "CEDS-CMIP-2025-04-18".
  • data_version: Data version string. Default is "v20250421".
  • name: System name. Default is :CEDS.
  • stream: Whether to stream data on demand. Default is true.
source
EarthSciData.GEOSFP — Method
GEOSFP(domain, domaininfo; name, stream)

A data loader for GEOS-FP data as archived for use with GEOS-Chem classic.

Domain options (as of 2022-01-30):

  • 4x5
  • 0.125x0.15625_AS
  • 0.125x0.15625_EU
  • 0.125x0.15625_NA
  • 0.25x0.3125
  • 0.25x0.3125_AF
  • 0.25x0.3125_AS
  • 0.25x0.3125_CH
  • 0.25x0.3125_EU
  • 0.25x0.3125_ME
  • 0.25x0.3125_NA
  • 0.25x0.3125_OC
  • 0.25x0.3125_RU
  • 0.25x0.3125_SA
  • 0.5x0.625
  • 0.5x0.625_AS
  • 0.5x0.625_CH
  • 0.5x0.625_EU
  • 0.5x0.625_NA
  • 2x2.5
  • C180
  • C720
  • NATIVE
  • c720

The native data type for this dataset is Float32.

stream specifies whether the data should be streamed in as needed or loaded all at once.

See http://geoschemdata.wustl.edu/ExtData/ for current data domain options.

source
EarthSciData.NEI2016MonthlyEmis — Method
NEI2016MonthlyEmis(sector, domaininfo; scale, name, stream)

A data loader for CMAQ-formatted monthly US National Emissions Inventory data for year 2016, available from: https://gaftp.epa.gov/Air/emismod/2016/v1/gridded/monthly_netCDF/. The emissions here are monthly averages, so there is no information about diurnal variation etc.

The emissions are returned as mixing ratios in units of kg/kg/s by converting from the native flux density (kg/m²/s) using:

mixing_ratio = flux / (g0_100 * delp_dry_surface)

where g0_100 ≈ 10.197 kg/m² and delp_dry_surface is the dry pressure thickness (physically in hPa, but treated here as unitless), which varies spatially across the domain.
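
The conversion can be checked with a short worked example. The sketch assumes that g0_100 is 100 divided by standard gravity (9.80665 m/s²), which reproduces the quoted value of about 10.197; the flux and pressure-thickness numbers are hypothetical inputs chosen only to show the arithmetic.

```julia
# Assumption: g0_100 = 100 / g0 with g0 = 9.80665 m/s², giving about 10.197.
g0_100 = 100 / 9.80665

# Convert a surface flux (kg/m²/s) to a mixing-ratio tendency (kg/kg/s).
flux_to_mixing_ratio(flux, delp_dry_surface) = flux / (g0_100 * delp_dry_surface)

flux = 1.0e-9             # hypothetical emission flux, kg/m²/s
delp_dry_surface = 15.0   # hypothetical dry pressure thickness (hPa, unitless here)
mixing_ratio = flux_to_mixing_ratio(flux, delp_dry_surface)
```

A thicker surface layer (larger delp_dry_surface) dilutes the same flux into more air and so yields a smaller mixing-ratio tendency.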

scale is a scaling factor to apply to the emissions data. The default value is 1.0.

stream specifies whether the data should be streamed in as needed or loaded all at once.

Conservative regridding (via ConservativeRegridding.jl) is used by default to map emissions from the native NEI Lambert Conformal Conic grid to the simulation domain grid, preserving total emissions mass.
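
To show what mass preservation means in practice, here is a one-dimensional sketch of conservative regridding by overlap weighting. It is a simplified stand-in written for this example and does not use or reproduce ConservativeRegridding.jl, which operates on two-dimensional cell polygons.

```julia
# Regrid cell-mean flux densities from source cells to destination cells in 1-D,
# weighting each source cell by its overlap with the destination cell.
function conservative_regrid_1d(src_edges, src_vals, dst_edges)
    dst = zeros(length(dst_edges) - 1)
    for j in eachindex(dst)
        width = dst_edges[j + 1] - dst_edges[j]
        for i in eachindex(src_vals)
            overlap = min(src_edges[i + 1], dst_edges[j + 1]) - max(src_edges[i], dst_edges[j])
            overlap > 0 && (dst[j] += src_vals[i] * overlap / width)
        end
    end
    return dst
end

src_edges = collect(0.0:1.0:4.0)
src_vals = [1.0, 2.0, 3.0, 4.0]
dst_edges = [0.0, 2.0, 4.0]
dst_vals = conservative_regrid_1d(src_edges, src_vals, dst_edges)

# Total mass is flux density times cell width, summed over cells.
src_mass = sum(src_vals .* diff(src_edges))
dst_mass = sum(dst_vals .* diff(dst_edges))
```

When source and destination grids cover the same extent, src_mass and dst_mass agree, which is the property that plain point interpolation does not guarantee.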

source
EarthSciData.OpenAQ — Method
OpenAQ(
    parameter,
    domaininfo;
    api_key,
    station_filter,
    fill_value,
    name,
    stream
)

Create a ModelingToolkit System that provides interpolated OpenAQ air quality observations for the given parameter.

parameter is the OpenAQ parameter name (e.g. "pm25", "o3", "no2").

domaininfo is a DomainInfo specifying the model domain and time span.

api_key defaults to ENV["OPENAQ_API_KEY"].

station_filter is a function f(::OpenAQStation) -> Bool to filter stations.

fill_value is used for grid cells with no stations (default NaN).

stream specifies whether data should be streamed or loaded all at once.

source
EarthSciData._load_station_day — Method

Load and cache parsed rows for a station's daily CSV file, filtered to the FileSet's parameter. Returns a vector of (datetime_utc, value) tuples. The cache is scoped to the OpenAQFileSet instance and protected by a lock.

source
EarthSciData._parse_openaq_datetime — Method

Parse an OpenAQ datetime string, converting timezone offsets to UTC. Supports formats: "2024-01-15T12:00:00+00:00", "2024-01-15T12:00:00Z", "2024-01-15T12:00:00".
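
One way to handle the three formats with only the Dates standard library is sketched below. The function name parse_to_utc and the regular expression are assumptions for this example, as is the choice to treat strings without an offset as already being in UTC.

```julia
using Dates

function parse_to_utc(s::AbstractString)
    m = match(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(Z|[+-]\d{2}:\d{2})?$", s)
    m === nothing && throw(ArgumentError("unrecognized datetime string: $s"))
    wall_time = DateTime(String(m.captures[1]))   # ISO 8601 without the offset
    tz = m.captures[2]
    # No offset or "Z": treat the wall-clock time as UTC (assumption).
    (tz === nothing || tz == "Z") && return wall_time
    sgn = tz[1] == '-' ? -1 : 1
    offset_minutes = sgn * (60 * parse(Int, tz[2:3]) + parse(Int, tz[5:6]))
    # Local time = UTC + offset, so UTC = local time - offset.
    return wall_time - Minute(offset_minutes)
end

utc_from_offset = parse_to_utc("2024-01-15T12:00:00+05:30")
```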

source
EarthSciData._read_station_hour — Method

Read measurements for a single station for a single hour. Returns (total, count) for computing an average. Uses cached daily data to avoid repeated decompression.

source
EarthSciData.create_interp_equation — Method
create_interp_equation(
    itp,
    filename,
    t,
    t_ref,
    coords;
    wrapper_f
)

Create an equation that interpolates the given dataset at the given time and location. filename is an identifier for the dataset, and t is the time variable. wrapper_f can specify a function to wrap the interpolated value, for example eq -> eq / 2 to divide the interpolated value by 2.

source
EarthSciData.data2vecormat — Method

Convert an N-D array to a 2-D matrix in which the horizontal dimensions form the rows and the vertical dimension forms the columns. If the input is 2-D, it is converted to a vector.
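
The reshaping can be illustrated with Base functions alone. This sketch assumes a three-dimensional input laid out as (x, y, z) with the vertical dimension last; the helper name to_vec_or_mat is hypothetical and the real function may locate the vertical dimension differently.

```julia
# Assumption: the vertical dimension is the last of three, i.e. layout (x, y, z).
to_vec_or_mat(A::AbstractArray{T,3}) where {T} = reshape(A, :, size(A, 3))
to_vec_or_mat(A::AbstractMatrix) = vec(A)

A3 = reshape(collect(1:24), 2, 3, 4)   # 2 x 3 horizontal cells, 4 vertical levels
M = to_vec_or_mat(A3)                  # 6 rows (horizontal), 4 columns (vertical)
v = to_vec_or_mat(A3[:, :, 1])         # a 2-D input becomes a vector
```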

source
EarthSciData.dayofweek_itp_CO — Method
dayofweek_itp_CO(t, lon)

Day of week interpolation function that returns the scale factor for a given time. Returns different emission scaling factors based on the day of week.

source
EarthSciData.delp_dry_surface_itp — Method
delp_dry_surface_itp(lon, lat)

Interpolate the delp_dry_surface field at a given longitude and latitude. Returns the dry pressure thickness value in Pa.

source
EarthSciData.discover_stations — Method
discover_stations(parameter, bbox, api_key, station_filter)

Discover OpenAQ stations within the given bounding box for the specified parameter. Uses the OpenAQ v3 API with pagination. Results are cached locally. bbox values are in degrees.

source
EarthSciData.diurnal_itp — Method
diurnal_itp(t, lon)

Diurnal interpolation function that returns the scale factor for a given time. Returns different emission scaling factors based on the hour of day.

source
EarthSciData.interp — Method
interp(itp, t, locs)

Return the value of the given variable from the given dataset at the given time and location.

source
EarthSciData.knots2range — Function

Convert a vector of evenly spaced grid points to a range. The reltol parameter specifies the relative tolerance for the grid spacing, which is needed to accommodate slightly irregular spacing such as the differing number of days in each month.
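
A minimal sketch of the idea follows. The name knots_to_range, the default tolerance, and the use of the mean spacing as the range step are assumptions for this example and may differ from what knots2range does.

```julia
# Convert nearly evenly spaced knots to a range, rejecting spacings that
# deviate from the mean step by more than `reltol` (relative).
function knots_to_range(knots::AbstractVector{<:Real}; reltol = 0.05)
    steps = diff(knots)
    step = sum(steps) / length(steps)
    all(s -> abs(s - step) <= reltol * abs(step), steps) ||
        throw(ArgumentError("grid spacing varies by more than reltol"))
    return range(first(knots); step = step, length = length(knots))
end

# Day offsets at the start of four consecutive months (31, 28, and 31 days long).
monthly = knots_to_range([0.0, 31.0, 59.0, 90.0]; reltol = 0.1)
```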

source
EarthSciData.loadslice! — Method
loadslice!(data, fs, t, varname)

Load a time slice from the CEDS dataset, summing across sectors (or filtering by selected sectors). The result is a 2D (lat, lon) array in kg m⁻² s⁻¹.

source
EarthSciData.loadslice! — Method
loadslice!(data, fs, t, varname)

Load the data in place for the given variable name at the given time.

source
EarthSciData.loadslice! — Method
loadslice!(data, fs, t, varname)

Load the NEI data for the given variable name at the given time. This loads data in kg/s/m^2 units on the NEI source grid for regridding.

source
EarthSciData.loadslice! — Method
loadslice!(data, fs, t, varname)

Load OpenAQ measurement data for the given time into data. Data is binned onto the grid by averaging all stations within each cell.

source
EarthSciData.localpath — Function
localpath(fs, t)
localpath(fs, t, varname)

Return the local path for the file for the given DateTime. An optional varname can be provided for datasets with per-variable files.

source
EarthSciData.maybedownload — Function
maybedownload(fs, t)
maybedownload(fs, t, varname)

Check if the specified file exists locally. If not, download it. An optional varname can be provided for datasets with per-variable files.
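
The check-then-download pattern can be sketched with the Downloads standard library. The helper name maybe_download, its (url, path) signature, and the temporary ".part" file are assumptions for this example; the package's function takes a FileSet and a time instead.

```julia
using Downloads

# Return `path`, downloading from `url` first only if the file is not already present.
function maybe_download(url::AbstractString, path::AbstractString)
    isfile(path) && return path        # already cached: no network access
    mkpath(dirname(path))
    tmp = path * ".part"               # download to a temporary name first
    Downloads.download(url, tmp)
    mv(tmp, path; force = true)        # move into place only after success
    return path
end
```

Downloading to a temporary name and moving it afterwards avoids leaving a truncated file in the cache if the transfer is interrupted.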

source
EarthSciData.partialderivatives_δPδlev_geosfp — Method
partialderivatives_δPδlev_geosfp(geosfp; default_lev)

Return a function to calculate coefficients to multiply the δ(u)/δ(lev) partial derivative operator by to convert a variable named u from δ(u)/δ(lev) to δ(u)/δ(P), i.e. from vertical level number to pressure in hPa. The return format is coordinate_index => conversion_factor.

source
EarthSciData.regridder — Method

Create a regridding function for the given file set, metadata, and domain. If any dimensions are staggered, use interpolation; otherwise, use conservative regridding. extrapolate_type specifies the extrapolation method for interpolation; it is only used when interpolation is selected.

source
EarthSciData.relpath — Method
relpath(fs, t)

File path on the server relative to the mirror root; also used for local caching. The t parameter determines which 50-year chunk file to select.

source
EarthSciData.relpath — Method

Default 3-argument relpath falls back to the 2-argument version, ignoring varname. Subtypes with per-variable files should override this.

source
EarthSciData.relpath — Method
relpath(fs, t)

File path on the server relative to the host root; also path on local disk relative to ENV["EARTHSCIDATADIR"] (or a scratch directory if that environment variable is not set).
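
The relationship between the relative path and the local cache location can be sketched as follows. The helper names cache_root and cached_path are hypothetical, and falling back to a fresh temporary directory is an assumption standing in for the scratch directory mentioned above, which the package may manage differently.

```julia
# Assumption: use a temporary directory when EARTHSCIDATADIR is not set.
cache_root() = haskey(ENV, "EARTHSCIDATADIR") ? ENV["EARTHSCIDATADIR"] : mktempdir()

# Join a server-relative path (with "/" separators) onto the cache root.
cached_path(rel::AbstractString) = joinpath(cache_root(), split(rel, "/")...)
```

For example, with EARTHSCIDATADIR set to "data", cached_path("a/b.nc") resolves to the file b.nc inside data/a.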

source
EarthSciData.relpath — Method
relpath(fs, t)

File path on the server relative to the host root; also path on local disk relative to ENV["EARTHSCIDATADIR"].

source
EarthSciData.url — Function
url(fs, t)
url(fs, t, varname)

Return the URL for the file for the given DateTime. An optional varname can be provided for datasets with per-variable files.

source
EarthSciData.verify_fileset_interface — Method
verify_fileset_interface(_)

Check that type T implements all required FileSet interface methods. Throws an error listing any missing methods. This is an opt-in check intended for use when implementing a new FileSet subtype.

Example

struct MyFileSet <: EarthSciData.FileSet ... end
# After defining all methods:
EarthSciData.verify_fileset_interface(MyFileSet)
source