Data Quality

AtlasCore applies quality controls at every stage of the data pipeline — from source extraction through to API response.

Quality at ingestion

Every data extraction creates a snapshot with quality metrics:

Metric                                       What it measures
record_count_raw                             Total records extracted from the source
feature_count_upserted                       Records successfully normalised and stored
added_count / changed_count / removed_count  Delta from the previous snapshot
quality_score                                Aggregate quality score for the snapshot
profile_compatibility_status                 Whether the source schema matches the expected normalisation profile
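
As an illustration of the delta metrics, the sketch below counts added, changed, and removed records between two snapshots. It is not AtlasCore's implementation: the function name snapshot_delta and the layout of a snapshot as a mapping from record identifier to record content are assumptions made for this example.

```python
from typing import Any, Mapping


def snapshot_delta(
    previous: Mapping[str, Any], current: Mapping[str, Any]
) -> dict[str, int]:
    """Count records added, changed, and removed between two snapshots.

    Both arguments map a record identifier to that record's normalised
    content. This keyed layout is an assumption made for illustration.
    """
    prev_ids, curr_ids = set(previous), set(current)
    # A record is "changed" when it exists in both snapshots with different content.
    changed = [rid for rid in prev_ids & curr_ids if previous[rid] != current[rid]]
    return {
        "added_count": len(curr_ids - prev_ids),
        "changed_count": len(changed),
        "removed_count": len(prev_ids - curr_ids),
    }


# Placeholder records, not real source data.
previous = {"f1": {"value": 1.0}, "f2": {"value": 2.0}, "f3": {"value": 3.0}}
current = {"f2": {"value": 2.0}, "f3": {"value": 3.5}, "f4": {"value": 4.0}}
delta = snapshot_delta(previous, current)
print(delta)
```

Here f4 is new, f3 has different content, and f1 has disappeared, so each count is 1.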

Data quality grades

Emission factor values carry a data quality grade from the source metadata:

Grade  Meaning
A      Directly measured or sourced from authoritative primary data
B      Calculated from primary data with standard methodology
C      Estimated or derived from secondary sources

Grades are preserved from the NGA workbook and surfaced in API responses.

Pedigree scoring

For emission factors, AtlasCore derives a pedigree score based on five dimensions:

Dimension      What it assesses
Reliability    How the data was collected
Completeness   Coverage of the reported population
Temporal       How current the data is
Geographical   Relevance to the Australian context
Technological  Relevance to the specific technology or process

Each dimension is scored 1–5 (1 = best). The mean score determines an overall indicator (high, medium, low).
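
The scoring rule can be sketched as follows. The cut-offs that map the mean to an indicator are not given here, so the values 2.0 and 3.5 below are hypothetical, as is the function name pedigree_indicator. The sketch also assumes that a lower mean maps to "high", since 1 is the best score.

```python
DIMENSIONS = (
    "reliability",
    "completeness",
    "temporal",
    "geographical",
    "technological",
)


def pedigree_indicator(
    scores: dict[str, int],
    high_max: float = 2.0,    # hypothetical cut-off, not from the documentation
    medium_max: float = 3.5,  # hypothetical cut-off, not from the documentation
) -> tuple[float, str]:
    """Return the mean of the five dimension scores and an overall indicator."""
    missing = set(DIMENSIONS) - set(scores)
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    for name in DIMENSIONS:
        if not 1 <= scores[name] <= 5:
            raise ValueError(f"{name} must be between 1 and 5")
    mean_score = sum(scores[name] for name in DIMENSIONS) / len(DIMENSIONS)
    # Assumption: 1 is best, so a lower mean yields a better indicator.
    if mean_score <= high_max:
        return mean_score, "high"
    if mean_score <= medium_max:
        return mean_score, "medium"
    return mean_score, "low"


example = {
    "reliability": 1,
    "completeness": 2,
    "temporal": 1,
    "geographical": 2,
    "technological": 2,
}
mean_score, indicator = pedigree_indicator(example)
print(mean_score, indicator)
```

With the example scores the mean is 8 / 5 = 1.6, which falls under the assumed "high" cut-off.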

Freshness monitoring

  • NGA factors: Updated annually (July). AtlasCore tracks the active edition and alerts on new publications.
  • AEMO CDEII: Updated daily. Grid intensity time series are refreshed on ingest.
  • CER NGER: Updated annually. Corporate emissions are versioned by reporting year.
  • BOM SILO: Updated daily on S3. Climate grid data is refreshed on ingest.
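
One way a consumer might reason about these cadences is a simple staleness check, sketched below. This is an assumption about how freshness could be evaluated, not a description of AtlasCore's monitoring. The source keys, the 12-hour grace period, and the is_stale function are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Expected refresh intervals, taken from the cadences listed above.
# The dictionary keys are hypothetical identifiers chosen for this sketch.
EXPECTED_INTERVAL = {
    "nga_factors": timedelta(days=365),
    "aemo_cdeii": timedelta(days=1),
    "cer_nger": timedelta(days=365),
    "bom_silo": timedelta(days=1),
}


def is_stale(
    source: str,
    last_refreshed: datetime,
    now: datetime,
    grace: timedelta = timedelta(hours=12),  # hypothetical tolerance
) -> bool:
    """Return True when a source has gone longer than expected without a refresh."""
    return now - last_refreshed > EXPECTED_INTERVAL[source] + grace


now = datetime(2024, 7, 10, tzinfo=timezone.utc)
two_days_ago = now - timedelta(days=2)
print(is_stale("aemo_cdeii", two_days_ago, now))   # daily source, two days old
print(is_stale("nga_factors", two_days_ago, now))  # annual source, two days old
```

A daily source that was last refreshed two days ago is reported as stale, while an annual source of the same age is not.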

Evidence integrity

Every API response includes:

  • evidence_hash — SHA-256 hash of canonical inputs for deterministic verification
  • Factor set metadata — edition slug, publication date, GWP methodology
  • Amendment history — corrections and restatements tracked with diff counts
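
A deterministic hash of this kind can be sketched by serialising the inputs in a canonical form before hashing. AtlasCore's actual canonicalisation rules are not specified here; the sketch assumes JSON with sorted keys and compact separators, and the input field names are placeholders.

```python
import hashlib
import json


def evidence_hash(inputs: dict) -> str:
    """Return a SHA-256 hex digest of a canonical JSON encoding of the inputs.

    Assumption: sorted keys and compact separators make the encoding
    independent of key order and whitespace.
    """
    canonical = json.dumps(
        inputs, sort_keys=True, separators=(",", ":"), ensure_ascii=True
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Placeholder inputs; the field names are hypothetical.
a = {"factor_id": "example-factor", "quantity": 120.5, "unit": "kWh"}
b = {"unit": "kWh", "quantity": 120.5, "factor_id": "example-factor"}

print(evidence_hash(a) == evidence_hash(b))  # same inputs, different key order
```

Because the keys are sorted before hashing, two requests with the same inputs produce the same digest regardless of key order, which lets a client recompute and compare the hash.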

Completeness guarantees

AtlasCore does not fabricate or interpolate missing data:

  • Missing grid regions return 404, not estimated values
  • Low-confidence entity resolution matches are flagged, not suppressed
  • Scope 3 coverage gaps are explicitly documented, not padded
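
The first two guarantees can be illustrated with a small sketch. The function names, response shapes, region codes, intensity values, and the 0.8 confidence threshold are all hypothetical and chosen only for this example.

```python
def grid_intensity_response(region: str, intensities: dict[str, float]) -> tuple[int, dict]:
    """Return (status, body); a missing region yields 404 instead of an estimate."""
    if region not in intensities:
        return 404, {"error": f"no grid intensity data for region {region!r}"}
    return 200, {"region": region, "intensity": intensities[region]}


def annotate_match(entity: str, score: float, threshold: float = 0.8) -> dict:
    """Keep every entity match, flagging low-confidence ones instead of dropping them."""
    return {"entity": entity, "score": score, "low_confidence": score < threshold}


# Placeholder values, not real grid data.
intensities = {"REGION_A": 0.5, "REGION_B": 0.7}

print(grid_intensity_response("REGION_A", intensities))
print(grid_intensity_response("REGION_Z", intensities))
print(annotate_match("Example Pty Ltd", 0.62))
```

The unknown region produces a 404 with no intensity field, and the weak match is still returned but carries a low_confidence flag.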