FieldSpace, Deterministic Autonomy
SAFETY CASE EVIDENCE · UL 4600 + ISO 21448 ALIGNED · BIT-REPRODUCIBLE

The deterministic evidence layer for AV safety cases.

FieldSpace populates the behavioral acceptance criteria in the industry-standard safety case framework with bit-reproducible per-event evidence. Built for the safety teams writing the documents Waymo, Zoox, Nuro, and Wayve publish today, and for the insurers underwriting their fleets.

64 / 0
nuPlan scenarios completed / runner failures
0.9660
route progress ratio (leads IDMPlanner 0.9471)
100%
precision on curated Waymo close-interaction
10,000
consecutive replays, 0 diffs (bit-reproducible)
PUBLIC BENCHMARK EVIDENCE

Reproducible evidence the safety case can cite.

The current benchmark package shows FieldSpace running the official 64-scenario nuPlan closed-loop simulation, observing real openpilot drive logs and Waymo Open Motion scenarios, and producing per-event evidence traces that an operator's safety team, an auditor, or an insurer can re-run identically.

0.30-0.56 ms
openpilot replay CPU latency
98.9%
clear rate on comma replay
0
nuPlan 64-scenario runner failures
0.0303 s
FieldSpace first-smoke trajectory compute
WHAT OPERATORS ARE WRITING BY HAND

Safety cases are authored. Every claim needs evidence.

AV operators publish formal safety cases for regulators, insurers, and the public. Waymo's Safety Case Approach, Zoox's VSSA, Nuro's safety reports, and Wayve's safety framework all share the same problem. Evidence has to be reproducible by auditors, per-event, and across an evolving operating domain.

AUTHORING COST

The evidence layer is written by hand.

Internal safety teams spend significant headcount producing per-claim evidence packs to back the GSN structure of the safety case.

REPRODUCIBILITY GAP

Evidence must replay identically for outside reviewers.

An auditor, an insurer, and a regulator should all be able to re-run the same evidence and get the same output. Sampled neural runs cannot guarantee that.

COVERAGE PRESSURE

Every ODD extension reopens the case.

New conditions, new geographies, and new maneuvers all force fresh evidence. Hand-authoring scales linearly with that surface area.

THE FIELDSPACE ANSWER

A deterministic methodology that maps into the framework you already use.

FieldSpace is an independent evaluative methodology. It populates the behavioral acceptance criteria with bit-reproducible per-event evidence. Same input, same output, every time, for the safety team, the auditor, the regulator, and the insurer.
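One way an outside reviewer could check the "same input, same output" property is to hash each replay's output and compare digests. The sketch below is illustrative only: `trace_digest`, `count_replay_diffs`, and `toy_observer` are hypothetical names, and the toy observer is a stand-in that uses integer arithmetic, not the FieldSpace engine.

```python
import hashlib
import json


def trace_digest(trace):
    """SHA-256 over a canonical JSON encoding of one evidence trace."""
    # sort_keys and fixed separators make the byte encoding independent of dict order.
    canonical = json.dumps(trace, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def count_replay_diffs(replay_fn, scenario, runs):
    """Re-run replay_fn on the same input; count runs whose digest differs from the first."""
    reference = trace_digest(replay_fn(scenario))
    return sum(
        1 for _ in range(runs - 1)
        if trace_digest(replay_fn(scenario)) != reference
    )


def toy_observer(scenario):
    """Hypothetical stand-in observer: integer millimetres only, so no float drift."""
    gap_mm = scenario["gap_mm"]
    per_frame_mm = scenario["closing_mm_s"] // 20  # assumes 20 frames per second
    frames = []
    for i in range(scenario["frames"]):
        gap_mm -= per_frame_mm
        frames.append({"frame": i, "gap_mm": gap_mm, "alert": gap_mm < 5000})
    return {"scenario": scenario["id"], "frames": frames}


scenario = {"id": "demo", "gap_mm": 30000, "closing_mm_s": 8000, "frames": 70}
diffs = count_replay_diffs(toy_observer, scenario, runs=100)
print("replay diffs:", diffs)
```

Comparing digests rather than full traces keeps the check cheap enough to repeat many times, and a reviewer only needs the input file and the reference digest to confirm a match.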

WHERE FIELDSPACE PLUGS IN

Mapped to the industry-standard acceptance criteria framework.

The Favaro et al. 2023 paper "Building a Credible Case for Safety" defines a five-dimensional acceptance criteria framework for behavioral hazards, and explicitly invites third-party evaluative methodologies to map into it. FieldSpace populates all five dimensions.

AC DIMENSION | FIELDSPACE EVIDENCE | REPRODUCIBILITY ARTIFACT
Severity Potential | TTC-bound, risk-field magnitude, predicted delta-V | nuPlan TTC 0.9219
Conflict Role (Initiator / Responder) | Identified per-event from active repulsive field | Per-frame JSON trace
Behavioral Capability: Regulatory | Speed-limit + drivable-area compliance | nuPlan 1.000 / 1.000
Behavioral Capability: Conflict Avoidance | Route progress while avoiding conflict initiation | nuPlan 0.9660
Behavioral Capability: Collision Avoidance | No-at-fault collision, early warning lead-time | nuPlan 0.9766 / 13-15 s lead
Functionality Status | CPU-only engine survives degraded-compute states | nfs-modulus crate
Level of Aggregation: Event | Per-frame trace across openpilot + Waymo logs | 60,019 + 4,550 frames
Level of Aggregation: Aggregate | Rate-based scores across nuPlan + Waymo + openpilot | 64 scenarios, 50 scenarios, 10 clips
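To make the Severity Potential row concrete, the sketch below computes a constant-velocity time-to-collision (TTC) for a lead vehicle and the share of frames that stay at or above a bound. It is a simplified illustration, not the nuPlan metric or the FieldSpace implementation; the function names, the sample frames, and the 2.0 s bound are assumptions.

```python
import math


def time_to_collision(gap_m, ego_speed_mps, lead_speed_mps):
    """Constant-velocity TTC in seconds for a lead vehicle in the same lane."""
    if gap_m <= 0.0:
        return 0.0  # already in contact
    closing_mps = ego_speed_mps - lead_speed_mps
    if closing_mps <= 0.0:
        return math.inf  # gap is constant or opening, so no collision is predicted
    return gap_m / closing_mps


def ttc_within_bound_rate(frames, bound_s):
    """Fraction of frames whose TTC is at or above bound_s."""
    if not frames:
        raise ValueError("frames must not be empty")
    ok = sum(1 for gap, ego, lead in frames if time_to_collision(gap, ego, lead) >= bound_s)
    return ok / len(frames)


# Hypothetical frames: (gap in metres, ego speed in m/s, lead speed in m/s).
frames = [
    (20.0, 15.0, 10.0),  # TTC 4.0 s
    (5.0, 15.0, 10.0),   # TTC 1.0 s
    (30.0, 10.0, 12.0),  # gap opening, TTC is infinite
    (10.0, 20.0, 10.0),  # TTC 1.0 s
]
print(ttc_within_bound_rate(frames, bound_s=2.0))
```

A rate-based score such as the one in the table aggregates a per-frame check like this over every scenario, which is why the per-frame trace and the aggregate score can be audited against each other.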
CCA PILLAR · BOTTOM-UP

Credibility of Evidence

Bit-reproducible via exact number theory. Same input, same output, for the operator, the auditor, NHTSA, and the insurer. The strongest possible answer to the bottom-up credibility pillar in Section 4 of the Favaro et al. 2023 paper.

CCA PILLAR · TOP-DOWN

Credibility of Arguments

A different epistemic basis than the ADS being assessed. FieldSpace evidence does not share training data, weights, or sampling assumptions with the system under review, which is what an independent methodology is supposed to provide.

CCA PILLAR · IMPLEMENTATION

Implementation Credibility

Zero runner failures across 192 official nuPlan simulations. CPU-only and training-free, which reduces implementation attack surface and simplifies auditor review.

Source documents: Favaro et al. 2023 "Building a Credible Case for Safety" (arXiv:2306.01917); Webb et al. 2020 "Waymo's Safety Methodologies and Safety Readiness Determinations" (arXiv:2011.00054); Waymo Safety Case Approach White Paper (2020); UL 4600:2022; ISO 21448:2022 (SOTIF); ISO/AWI TS 5083.

HOW IT WORKS

One deterministic observer path. Scene state to evidence.

Five stages, all auditable, all deterministic. Every observer output is a function of its inputs, with no hidden training state or inference variance.

STAGE 01

Perception

Camera + YOLO-class detector + Kalman tracker → object tracks with velocity.

STAGE 02

HD Map

Lanelet2 / OSM map, Frenet projection, route planning with lane-change cost.
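As a rough illustration of the Frenet projection step, the sketch below projects a point onto a polyline centerline and returns arc length `s` and signed lateral offset `d`. It assumes the centerline is a plain list of (x, y) points; it does not use the Lanelet2 API, and `frenet_project` is a hypothetical name.

```python
import math


def frenet_project(path, point):
    """Project point onto a polyline; return (s, d) with d positive to the left of travel."""
    px, py = point
    best = None  # (distance, s, d) of the closest segment found so far
    s_start = 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        dx, dy = x1 - x0, y1 - y0
        seg_len = math.hypot(dx, dy)
        if seg_len == 0.0:
            continue  # skip repeated points
        # Parameter of the closest point on this segment, clamped to its ends.
        t = ((px - x0) * dx + (py - y0) * dy) / (seg_len * seg_len)
        t = max(0.0, min(1.0, t))
        cx, cy = x0 + t * dx, y0 + t * dy
        dist = math.hypot(px - cx, py - cy)
        # The 2D cross product is positive when the point lies left of the segment direction.
        cross = dx * (py - y0) - dy * (px - x0)
        d = math.copysign(dist, cross) if dist > 0.0 else 0.0
        if best is None or dist < best[0]:
            best = (dist, s_start + t * seg_len, d)
        s_start += seg_len
    if best is None:
        raise ValueError("path needs at least one non-degenerate segment")
    return best[1], best[2]


centerline = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
print(frenet_project(centerline, (4.0, 1.5)))
print(frenet_project(centerline, (11.0, 5.0)))
```

Working in (s, d) lets later stages reason about progress along the route and lateral offset from the lane center separately.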

STAGE 03

Prediction

1.5 s motion horizon. Map-aware lane-following, CV/CTRV kinematic fallback.
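The kinematic fallback can be pictured with the standard constant-velocity (CV) and constant-turn-rate-and-velocity (CTRV) motion models. The sketch below rolls a single track forward over a 1.5 s horizon; the 0.1 s step and the function name are assumptions, and the map-aware lane-following predictor is not shown.

```python
import math


def predict_ctrv(x, y, heading, speed, yaw_rate, horizon_s=1.5, dt=0.1):
    """Return [(t, x, y, heading)] under CTRV, falling back to CV when yaw_rate is near zero."""
    steps = round(horizon_s / dt)
    out = []
    for k in range(1, steps + 1):
        t = k * dt
        if abs(yaw_rate) < 1e-6:
            # CV: straight-line motion along the current heading.
            px = x + speed * t * math.cos(heading)
            py = y + speed * t * math.sin(heading)
            ph = heading
        else:
            # CTRV: motion along a circular arc of radius speed / yaw_rate.
            ph = heading + yaw_rate * t
            px = x + speed / yaw_rate * (math.sin(ph) - math.sin(heading))
            py = y + speed / yaw_rate * (math.cos(heading) - math.cos(ph))
        out.append((t, px, py, ph))
    return out


straight = predict_ctrv(0.0, 0.0, 0.0, 10.0, 0.0)
turning = predict_ctrv(0.0, 0.0, 0.0, 10.0, math.pi / 3)
print(straight[-1])
print(turning[-1])
```

Both models are closed-form, so the predicted positions depend only on the input track state, which fits the stated goal of outputs that are a function of their inputs.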

STAGE 04

PDE Field

Continuity + velocity + potential PDEs on 256×64 grid. 0.2 ms solve.
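The page does not give the PDE formulation, so the following is only a generic sketch of one ingredient: a potential field obtained by Jacobi relaxation of Laplace's equation on a grid, with obstacle cells held at a high potential. The boundary conditions, iteration count, and function name are assumptions, and the pure-Python loop is for readability rather than speed.

```python
def solve_potential(nx, ny, obstacles, iterations=200):
    """Jacobi relaxation of Laplace's equation; obstacle cells held at 1.0, border at 0.0."""
    fixed = set(obstacles)
    phi = [[1.0 if (i, j) in fixed else 0.0 for j in range(ny)] for i in range(nx)]
    for _ in range(iterations):
        nxt = [row[:] for row in phi]
        for i in range(1, nx - 1):
            for j in range(1, ny - 1):
                if (i, j) in fixed:
                    continue  # obstacle cells keep their fixed potential
                # Each free interior cell becomes the mean of its four neighbours.
                nxt[i][j] = 0.25 * (
                    phi[i - 1][j] + phi[i + 1][j] + phi[i][j - 1] + phi[i][j + 1]
                )
        phi = nxt
    return phi


# Small demo grid; the 256x64 grid named above works the same way but runs slowly in pure Python.
field = solve_potential(16, 8, obstacles=[(8, 4)])
print(round(field[7][4], 4), round(field[4][4], 4))
```

A fixed iteration count and a fixed update order mean two runs on the same grid produce identical values, which is the property a replayable risk field needs.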

STAGE 05

Evidence

Go / slow / stop output, risk trace, and optional benchmark trajectory candidate.

SAFETY OBSERVER, ALWAYS ON

A repeatable fallback trace, not a black-box alert.

When perception drops, route context breaks, or collision risk rises, FieldSpace records the active trigger, risk state, and recommended fallback phase for engineering review.

PerceptionLoss · OddViolation · OffMap · ImminentCollision · OperatorRequest
// MRM (minimal risk maneuver) state machine
Idle
↓ engage(trigger)
Decelerate
↓ speed ≤ target
HoldLane
↓ shoulder available
DriftToShoulder
↓ on shoulder | stopped
Parked ✓
// timeout → Failed → hard brake
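The diagram above can be read as a small deterministic state machine. The sketch below is one possible encoding, not the FieldSpace source: the class and method names, the numeric thresholds, and the reading of "on shoulder | stopped" as a logical OR are assumptions.

```python
from enum import Enum, auto


class Phase(Enum):
    IDLE = auto()
    DECELERATE = auto()
    HOLD_LANE = auto()
    DRIFT_TO_SHOULDER = auto()
    PARKED = auto()
    FAILED = auto()


class Trigger(Enum):
    PERCEPTION_LOSS = auto()
    ODD_VIOLATION = auto()
    OFF_MAP = auto()
    IMMINENT_COLLISION = auto()
    OPERATOR_REQUEST = auto()


class MrmStateMachine:
    def __init__(self, target_speed_mps, timeout_s):
        self.target_speed_mps = target_speed_mps
        self.timeout_s = timeout_s  # assumed: one timeout for the whole maneuver
        self.phase = Phase.IDLE
        self.trigger = None
        self.elapsed_s = 0.0
        self.log = []  # (elapsed_s, phase name) pairs for engineering review

    def _enter(self, phase):
        self.phase = phase
        self.log.append((self.elapsed_s, phase.name))

    def engage(self, trigger):
        """Idle -> Decelerate; the active trigger is recorded for the trace."""
        if self.phase is Phase.IDLE:
            self.trigger = trigger
            self.elapsed_s = 0.0
            self._enter(Phase.DECELERATE)

    def step(self, dt_s, speed_mps, shoulder_available, on_shoulder):
        """Advance one tick and return the resulting phase."""
        if self.phase in (Phase.IDLE, Phase.PARKED, Phase.FAILED):
            return self.phase
        self.elapsed_s += dt_s
        if self.elapsed_s > self.timeout_s:
            self._enter(Phase.FAILED)  # the caller is expected to apply the hard brake
        elif self.phase is Phase.DECELERATE and speed_mps <= self.target_speed_mps:
            self._enter(Phase.HOLD_LANE)
        elif self.phase is Phase.HOLD_LANE and shoulder_available:
            self._enter(Phase.DRIFT_TO_SHOULDER)
        elif self.phase is Phase.DRIFT_TO_SHOULDER and (on_shoulder or speed_mps == 0.0):
            self._enter(Phase.PARKED)
        return self.phase


sm = MrmStateMachine(target_speed_mps=5.0, timeout_s=30.0)
sm.engage(Trigger.PERCEPTION_LOSS)
sm.step(0.1, 12.0, False, False)
sm.step(0.1, 4.5, False, False)
sm.step(0.1, 4.5, True, False)
sm.step(0.1, 2.0, True, True)
print(sm.phase.name, [name for _, name in sm.log])
```

Because each transition depends only on the current phase and the tick inputs, replaying the same input sequence reproduces the same phase log.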
PRODUCT ROADMAP

Safety Case Generator today. Counterfactual replay next. Insurer bridge after.

The first deliverable is a per-claim evidence pack for the operator's safety case. The same engine extends to counterfactual replay on incidents and actuarial-grade input for fleet underwriting.

V1 · AVAILABLE TODAY

Safety Case Generator

Per-claim deterministic evidence pack for the operator's published safety case.

  • Populates five behavioral AC dimensions
  • GSN-structured per-claim evidence pack
  • Reproducibility-pinned benchmark traces
  • UL 4600 + ISO 21448 aligned output
  • Auditor and insurer re-run on same input
  • CPU-only, no GPU, no training step
V2 · NEXT GATE

Counterfactual Replay

Drop in an incident log, get a deterministic should-have-happened trace tied to the affected safety claim.

  • Sensor or sim trace ingestion
  • Per-claim impact analysis from one incident
  • Causal-chain mapping into Section 2.2 of the framework
  • Safety-case revision delta on output
  • Defensible answer for post-incident review
V3 · CHANNEL PATH

Insurer Bridge

Quantitative ODD coverage and expected-loss inputs for the underwriters pricing AV fleets.

  • Quantitative ODD coverage metrics
  • Per-mile expected-loss distributions
  • Premium curve sensitivity to ODD restrictions
  • Reinsurance-ready evidence package
  • Same engine, different output shape

V1 lands as a paid validation engagement. V2 follows once an operator's incident-review process is connected. V3 is the insurer-channel multiplier that turns one underwriter win into every fleet they price.

STANDARDS ALIGNMENT

Built for the standards conversation customers already have.

FieldSpace is not claiming vehicle-level certification. We are organizing the observer, replay, and evidence package around the frameworks OEM and Tier-1 safety teams use to review ADAS and autonomy systems.

ISO 26262

Functional safety readiness

Supplier safety plan, SEooC assumptions, traceability, verification evidence, and tool-confidence path for validation use cases.

ISO 21448 / SOTIF

Triggering-condition evidence

Replayable edge cases, false-positive / false-negative review, ODD assumptions, and residual-risk documentation.

ISO 3450x

Scenario-based validation

Scenario taxonomy, ODD tags, source dataset, trigger type, review status, and pass/fail metrics for replay studies.

UL 4600 / ISO/AWI TS 5083

Safety-case structure

GSN-style argument skeletons and evidence registers that can become inputs to an OEM-owned safety case.

ISO/SAE 21434 + ISO 24089

Cybersecurity and updates

Threat analysis, SBOM, vulnerability handling, release integrity, and update-impact planning for software delivery.

SOC 2 / ISO 27001 / TISAX

Customer data readiness

Security-control mapping for hosted replay, partner log handling, access review, retention, and supplier-quality review.

Current status: alignment and gap-assessment preparation. FieldSpace does not claim ISO certification, SOC 2, TISAX, UNECE approval, or vehicle-level compliance. Formal scope depends on assessor review and OEM integration context.

MEASURED, NOT MARKETED

Safety Suite v1 results.

Five safety-critical scenarios run in CARLA with synthetic ground truth. Every scenario: earlier hazard detection, zero false positives, all braking margins met. Real-world replay against 182 k frames of comma openpilot drive logs follows below.

SCENARIO | BASELINE LEAD | FIELDSPACE LEAD | GAIN
Falling Debris | -0.50 s | +0.20 s | +0.70 s
Sudden Cut-In | -0.10 s | +0.20 s | +0.30 s
Occluded Pedestrian | -0.30 s | +0.20 s | +0.50 s
Stopped Vehicle | -0.40 s | +0.20 s | +0.60 s
Sliding Cargo | -0.80 s | +0.20 s | +1.00 s
+0.62s
Mean lead-time improvement
0
False positives across all scenarios
1.7 ms
Mean processing latency (p99: 2.6 ms)
4,777
PDE solver FPS (256×64, Numba)
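The per-scenario gains and the mean lead-time improvement in the table above can be re-derived from the two lead columns. The sketch below does the arithmetic in integer hundredths of a second so the result is exact; the variable names are illustrative.

```python
# Lead times from the Safety Suite v1 table, in hundredths of a second.
leads_cs = {
    "Falling Debris": (-50, 20),
    "Sudden Cut-In": (-10, 20),
    "Occluded Pedestrian": (-30, 20),
    "Stopped Vehicle": (-40, 20),
    "Sliding Cargo": (-80, 20),
}

# Gain is the FieldSpace lead minus the baseline lead.
gains_cs = {name: fieldspace - baseline for name, (baseline, fieldspace) in leads_cs.items()}
mean_gain_cs = sum(gains_cs.values()) // len(gains_cs)  # 310 / 5 divides evenly

for name, gain in gains_cs.items():
    print(f"{name}: +{gain / 100:.2f} s")
print(f"mean: +{mean_gain_cs / 100:.2f} s")
```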
REAL-WORLD · COMMA OPENPILOT REPLAY

182,505 frames of public drive data.

Frame-for-frame replay against comma.ai's openpilot CI route bucket: real cars, real roads, real radar and vision. FieldSpace emitted one warning event and zero false criticals across 31 segments of real driving. 85% fewer spurious alerts than the prior observer.

182 k
frames replayed
−85%
false-positive rate
993×
faster than real-time
reproducibility/tier3_replay_hazardous_provided_vy.json
WHO BUYS THIS

Built for the entities carrying the safety liability.

Operators of autonomous fleets carry end-state liability and publish safety cases. Insurers underwrite those fleets and need third-party validation. Both are budget owners with a real problem the safety case generator solves.

PRIMARY ICP

Autonomous fleet operators

Robotaxi and autonomous delivery operators are already publishing formal safety cases for regulators and the public. FieldSpace is the evidence layer underneath.

Waymo · Zoox · Nuro · Wayve · Mobileye Drive · Aurora · Kodiak · Gatik · Serve · Pony.ai
CHANNEL ICP

Insurance and reinsurance underwriters

AV insurance teams need actuarial-grade safety evidence to price fleet risk. Winning one underwriter creates a downstream requirement at every fleet they price.

Munich Re Mobility · Swiss Re Corporate Solutions · Liberty Mutual AV · Travelers AV · Tokio Marine
SECONDARY PATHS · ON REQUEST
Defense autonomy programs
Deterministic evidence for unmanned-system safety boards. Ground robotics, loyal-wingman, autonomous logistics. Engaged on a per-program basis.
OEM and Tier-1 validation partners
Available as an independent methodology inside an existing OEM or Tier-1 safety case program. Engaged on request, not the primary GTM.
Closed-site operators
Yards, ports, campuses, and other geofenced operations. Useful for early-stage methodology proof-out where the ODD is narrow.
INTEGRATION

Three engagement patterns.

Start with a paid evidence pack against the operator's existing safety case. Extend into counterfactual replay on incidents. Connect into the insurer channel when the case is defensible enough to underwrite.

PATTERN A · EVIDENCE PACK

Safety Case Generator

Operator ships an ODD specification and stack summary. FieldSpace returns a per-claim evidence pack mapped to the behavioral acceptance criteria, with reproducibility-pinned artifacts for every claim.

Per-claim evidence · GSN-structured · Auditor-replayable
PATTERN B · INCIDENT REPLAY

Counterfactual Replay

Operator submits an incident log. FieldSpace returns the deterministic should-have-happened trace, the affected safety claim, and a structured delta for the next safety case revision.

Sensor or sim trace · Per-incident pricing · Causal-chain map
PATTERN C · INSURER CHANNEL

Underwriting Bridge

Same engine, different output shape. Quantitative ODD coverage and expected-loss distributions for the underwriter pricing the fleet. Currently in scoping with mobility-focused reinsurance teams.

Actuarial inputs · Per-mile loss curves · Reinsurance-ready
NVIDIA Inception Program Member
CARLA + Isaac Sim evaluation work

Your safety case needs
a deterministic evidence layer.

Bit-reproducible. Mapped to the acceptance criteria framework you already use. Replayable by your auditors, regulators, and insurers. Bring the safety case. We populate the evidence.