Data Provenance
Every AtlasCore output is traceable to its government source. This is not optional — it is a core design principle driven by the regulatory requirements of AASB S2 reporting and ISSA 5000 assurance.
Why provenance matters
Sustainability consultants and auditors need to answer: Where did this number come from?
AtlasCore answers this at multiple levels:
- Per-value: every emission factor includes source table, sheet, row number, and data quality grade
- Per-response: every API response includes an evidence_hash for integrity verification
- Per-report: every disclosure bundle includes a full five-file evidence trail
Five-file evidence bundles
Every report generates five files regardless of the requested output format:
| File | Purpose |
|---|---|
| report.json | Structured disclosure data |
| report.md | Human-readable Markdown report |
| provenance.json | Data lineage: which sources, snapshots, and versions were used |
| checksums.json | SHA-256 hashes of all bundle files |
| report.pdf / report.xlsx | Formatted output for distribution |
This means even if you only requested a PDF, the full provenance chain is preserved alongside it.
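As an illustration of how a reviewer might use checksums.json, the sketch below recomputes the SHA-256 digest of each bundle file and compares it with the recorded value. It assumes checksums.json is a flat object mapping each file name to a bare hex digest; that layout, and the function names `sha256_file` and `verify_bundle`, are hypothetical and not taken from the AtlasCore documentation.

```python
import hashlib
import json
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_bundle(bundle_dir: Path) -> dict[str, bool]:
    """Compare each file listed in checksums.json with its recomputed digest.

    Assumption: checksums.json is a flat {"file name": "hex digest"} object.
    The real file layout may differ.
    """
    manifest = bundle_dir / "checksums.json"
    expected = json.loads(manifest.read_text(encoding="utf-8"))
    results = {}
    for name, want in expected.items():
        target = bundle_dir / name
        # A listed file that is missing counts as a failed check.
        results[name] = target.is_file() and sha256_file(target) == want.lower()
    return results
```

Calling `verify_bundle(Path("path/to/bundle"))` returns one boolean per listed file, so a single `False` pinpoints which file was altered or is missing.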
Evidence hashing
Every API response that returns data includes:
- evidence_hash in the response body: SHA-256 hash of the canonical inputs
- X-AtlasCore-Evidence-Hash response header: the same hash, for programmatic access
If you call the same endpoint with the same inputs and the underlying data hasn't changed, you get the same evidence_hash. This provides deterministic verification: auditors can confirm that the data they reviewed matches the data in the disclosure.
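The idea of hashing canonical inputs can be sketched as follows. The canonical form AtlasCore uses is not described here, so this example assumes JSON with sorted keys, compact separators, and UTF-8 encoding. The `sha256:` prefix follows the format shown in the provenance example; the function names `canonical_hash` and `hashes_match` are hypothetical.

```python
import hashlib
import json


def canonical_hash(inputs: dict) -> str:
    """Hash inputs after serializing them to a canonical JSON form.

    Assumed canonicalization: sorted keys, compact separators, UTF-8.
    AtlasCore's actual scheme may differ.
    """
    canonical = json.dumps(
        inputs, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def hashes_match(body: dict, headers: dict, reviewed_hash: str) -> bool:
    """Check that body and header hashes agree with the hash an auditor recorded."""
    body_hash = body.get("evidence_hash")
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    header_hash = lowered.get("x-atlascore-evidence-hash")
    return body_hash is not None and body_hash == header_hash == reviewed_hash


# Key order does not affect the hash because keys are sorted first.
assert canonical_hash({"b": 1, "a": 2}) == canonical_hash({"a": 2, "b": 1})
```

Sorting keys before hashing is what makes the result independent of how the caller happened to order the inputs, which is the property a deterministic evidence hash relies on.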
Version stability
AtlasCore maintains version stability through:
- Snapshot versioning: every data extraction creates a numbered snapshot. The latest policy always uses the most recent successful snapshot.
- Factor set editions: NGA factors are versioned by edition (e.g. au_nga_2024 for the 2023-24 workbook). Superseded editions are preserved, not deleted.
- Amendment tracking: when a factor set is re-ingested (correction or restatement), the amendment is recorded with type and diff counts.
- Deterministic outputs: the same persisted inputs always produce the same outputs. No live API calls are made at query time.
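The latest policy described in the list above can be pictured with a small sketch. The `Snapshot` record and its fields are hypothetical stand-ins chosen for illustration; AtlasCore's internal snapshot model is not documented here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Snapshot:
    number: int       # hypothetical: increasing snapshot number
    succeeded: bool   # hypothetical: whether the extraction completed


def resolve_latest(snapshots: list[Snapshot]) -> Optional[Snapshot]:
    """Return the highest-numbered successful snapshot, or None if there is none."""
    successful = [snap for snap in snapshots if snap.succeeded]
    return max(successful, key=lambda snap: snap.number, default=None)


history = [Snapshot(1, True), Snapshot(2, True), Snapshot(3, False)]
# Snapshot 3 failed, so the policy falls back to snapshot 2.
assert resolve_latest(history) == Snapshot(2, True)
```

Skipping failed snapshots means a broken extraction never changes what queries return, which fits the deterministic-output goal.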
Provenance in API responses
Emission factor responses include structured provenance:
{
"source_document_title": "Australian National Greenhouse Accounts Factors",
"source_document_url": "https://www.dcceew.gov.au/...",
"source_table": "Table 1",
"source_sheet": "Table 1",
"source_row_number": 4,
"data_quality_grade": "A",
"evidence_hash": "sha256:..."
}
Company climate profiles include data_as_of timestamps and evidence hashes that trace back to the specific emission, grid intensity, and climate data snapshots used.
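A client consuming emission factor responses might check that the provenance fields are populated before relying on a value. The sketch below uses the field names from the example response. Treating every one of them as required is an assumption, and the helper names `missing_provenance` and `cite` are hypothetical.

```python
REQUIRED_FIELDS = (
    "source_document_title",
    "source_document_url",
    "source_table",
    "source_sheet",
    "source_row_number",
    "data_quality_grade",
    "evidence_hash",
)


def missing_provenance(record: dict) -> list[str]:
    """List provenance fields that are absent or empty in a factor record."""
    return [name for name in REQUIRED_FIELDS if record.get(name) in (None, "")]


def cite(record: dict) -> str:
    """Build a one-line, human-readable citation from the provenance fields."""
    return (
        f'{record["source_document_title"]}, {record["source_table"]} '
        f'(sheet "{record["source_sheet"]}", row {record["source_row_number"]}), '
        f'quality grade {record["data_quality_grade"]}'
    )


example = {
    "source_document_title": "Australian National Greenhouse Accounts Factors",
    "source_document_url": "https://www.dcceew.gov.au/...",
    "source_table": "Table 1",
    "source_sheet": "Table 1",
    "source_row_number": 4,
    "data_quality_grade": "A",
    "evidence_hash": "sha256:...",
}
assert missing_provenance(example) == []
print(cite(example))
```

A record that fails the check can be flagged for follow-up instead of flowing silently into a disclosure.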