FieldSpace, Deterministic Autonomy
PUBLIC BENCHMARK EVIDENCE · ISO-ALIGNED ROADMAP · AUDITABLE

Safety evidence for neural ADAS that engineers can audit.

FieldSpace is a deterministic safety observer and validation layer for autonomy. It runs beside existing ADAS stacks, produces repeatable go / slow / stop evidence, and does not require a fleet-scale neural training loop to start generating useful safety signals.

60,019
white-paper openpilot frames
50
Waymo scenarios observed
64
nuPlan scenarios completed
17x
faster than PlanCNN on the first smoke scenario
NEW WHITE PAPER

Deterministic safety evidence for neural ADAS systems.

The current benchmark package shows FieldSpace running on public driving logs, Waymo observer scenarios, official nuPlan closed-loop simulations, and the first shared nuPlan neural-baseline smoke scenario against UrbanDriver and PlanCNN.

0.30-0.56 ms
openpilot replay CPU latency
98.9%
clear rate on comma replay
0
nuPlan 64-scenario runner failures
0.0303 s
FieldSpace first-smoke trajectory compute
THE ASSURANCE GAP

Neural-only validation is hard to audit.

Learned driving models can be powerful, but they leave safety teams with a hard question: why did this scene pass, fail, or change behavior after a model update? FieldSpace adds a deterministic layer around that question.

REPLAY BURDEN

Every edge case needs a repeatable record.

Safety teams need to replay the same scene and inspect the same intermediate signals, not only trust a model score.

EXPLANATION GAP

A good action still needs a reason.

A neural stack may choose the right maneuver, but reviewers still need traceable risk, constraints, and timing behind the decision.

COVERAGE GAP

Edge-case coverage is expensive to argue.

Fleet data, labeling, retraining, and scenario curation are costly. A deterministic observer gives teams another measurement path.

THE FIELDSPACE ANSWER

Add deterministic evidence where neural systems are hardest to inspect.

FieldSpace converts scene state into an auditable risk field, then emits repeatable go / slow / stop evidence. It can run beside an existing ADAS stack before anyone asks it to control a vehicle.

HOW IT WORKS

One deterministic observer path. Scene state to evidence.

Five stages, all auditable, all deterministic. Every observer output is a function of its inputs, with no hidden training state or inference variance.

STAGE 01

Perception

Camera + YOLO-class detector + Kalman tracker → object tracks with velocity.

STAGE 02

HD Map

Lanelet2 / OSM map, Frenet projection, route planning with lane-change cost.

STAGE 03

Prediction

1.5 s motion horizon. Map-aware lane-following, CV/CTRV kinematic fallback.

STAGE 04

PDE Field

Continuity + velocity + potential PDEs on 256×64 grid. 0.2 ms solve.

STAGE 05

Evidence

Go / slow / stop output, risk trace, and optional benchmark trajectory candidate.
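The evidence stage above can be sketched as a pure function of the risk field. This is a minimal illustration only: the threshold values and record fields are hypothetical, and the real grid semantics belong to FieldSpace internals. The point is the determinism — the same field always yields the same evidence record, so a replayed scene produces an identical audit trace.

```python
import numpy as np

# Hypothetical thresholds for illustration; production values are calibrated per ODD.
SLOW_RISK = 0.4
STOP_RISK = 0.8

def evidence_from_field(risk_field: np.ndarray) -> dict:
    """Map a 256x64 risk field to a repeatable go / slow / stop decision."""
    peak = float(risk_field.max())
    if peak >= STOP_RISK:
        decision = "stop"
    elif peak >= SLOW_RISK:
        decision = "slow"
    else:
        decision = "go"
    # Record the cell that drove the decision, for the audit trace.
    y, x = np.unravel_index(int(risk_field.argmax()), risk_field.shape)
    return {"decision": decision, "peak_risk": peak, "peak_cell": (int(y), int(x))}

demo = np.zeros((256, 64))
demo[120, 30] = 0.9  # injected hazard for the example
print(evidence_from_field(demo))  # decision: "stop", with the driving cell recorded
```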

SAFETY OBSERVER, ALWAYS ON

A repeatable fallback trace, not a black-box alert.

When perception drops, route context breaks, or collision risk rises, FieldSpace records the active trigger, risk state, and recommended fallback phase for engineering review.

Triggers: PerceptionLoss · OddViolation · OffMap · ImminentCollision · OperatorRequest
// MRM state machine
Idle
↓ engage(trigger)
Decelerate
↓ speed ≤ target
HoldLane
↓ shoulder available
DriftToShoulder
↓ on shoulder | stopped
Parked ✓
// timeout → Failed → hard brake
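The fallback chain above can be written as a small transition table. This sketch uses the trigger and state names from the diagram but hypothetical condition arguments; the production state machine also carries timing, speed targets, and the timeout-to-Failed hard-brake path in full.

```python
from enum import Enum, auto

class MrmState(Enum):
    IDLE = auto()
    DECELERATE = auto()
    HOLD_LANE = auto()
    DRIFT_TO_SHOULDER = auto()
    PARKED = auto()
    FAILED = auto()  # timeout path → hard brake

# Triggers that engage the MRM, as listed above.
TRIGGERS = {"PerceptionLoss", "OddViolation", "OffMap",
            "ImminentCollision", "OperatorRequest"}

def step(state, *, trigger=None, speed=0.0, target=0.0,
         shoulder_available=False, stopped=False, timed_out=False):
    """One deterministic transition of the MRM chain (illustrative)."""
    if timed_out:
        return MrmState.FAILED
    if state is MrmState.IDLE and trigger in TRIGGERS:
        return MrmState.DECELERATE
    if state is MrmState.DECELERATE and speed <= target:
        return MrmState.HOLD_LANE
    if state is MrmState.HOLD_LANE and shoulder_available:
        return MrmState.DRIFT_TO_SHOULDER
    if state is MrmState.DRIFT_TO_SHOULDER and stopped:
        return MrmState.PARKED
    return state  # no transition condition met; hold current phase

s = step(MrmState.IDLE, trigger="PerceptionLoss")  # → DECELERATE
s = step(s, speed=3.0, target=5.0)                 # → HOLD_LANE
```

Because every transition is a function of explicit inputs, the same replayed log always walks the same phase sequence, which is what makes the fallback trace reviewable.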
CAPABILITY MAP

What we can evaluate today. What comes next.

The first wedge is not replacing an OEM stack. It is running a deterministic safety layer against public or partner-selected scenarios and turning edge cases into reviewable evidence.

AVAILABLE TODAY

Shadow-mode safety observer

Pilot-ready for log replay, public benchmarks, and narrow edge-case validation.

  • Deterministic risk-field evaluation
  • Replayable go / slow / stop outputs
  • Public comma.ai, Waymo, and nuPlan evidence
  • Audit traces for warning and disagreement cases
  • No neural policy training required
  • CPU-friendly benchmark runtime
NEXT EVIDENCE GATE

Partner-selected edge cases

Focused replay on scenarios that matter to the customer.

  • Cut-in and hard-braking scenes
  • Pedestrian and intersection conflicts
  • False-positive-sensitive highway replay
  • Scenario-by-scenario audit packets
  • Scaled nuPlan neural-baseline comparison
PRODUCTION PATH

OEM safety-layer integration

Integrator-led path from shadow evidence to production review.

  • ASIL decomposition review
  • SOTIF evidence mapping
  • Interface hardening with partner stack
  • Source review under NDA
  • Pilot-to-production commercial model

The near-term commercial value is practical: run FieldSpace beside the current stack, measure disagreements, and inspect the safety evidence before changing production behavior.

STANDARDS ALIGNMENT

Built for the standards conversation customers already have.

FieldSpace is not claiming vehicle-level certification. We are organizing the observer, replay, and evidence package around the frameworks OEM and Tier-1 safety teams use to review ADAS and autonomy systems.

ISO 26262

Functional safety readiness

Supplier safety plan, SEooC assumptions, traceability, verification evidence, and tool-confidence path for validation use cases.

ISO 21448 / SOTIF

Triggering-condition evidence

Replayable edge cases, false-positive / false-negative review, ODD assumptions, and residual-risk documentation.

ISO 3450x

Scenario-based validation

Scenario taxonomy, ODD tags, source dataset, trigger type, review status, and pass/fail metrics for replay studies.

UL 4600 / ISO 5083

Safety-case structure

GSN-style argument skeletons and evidence registers that can become inputs to an OEM-owned safety case.

ISO/SAE 21434 + ISO 24089

Cybersecurity and updates

Threat analysis, SBOM, vulnerability handling, release integrity, and update-impact planning for software delivery.

SOC 2 / ISO 27001 / TISAX

Customer data readiness

Security-control mapping for hosted replay, partner log handling, access review, retention, and supplier-quality review.

Current status: alignment and gap-assessment preparation. FieldSpace does not claim ISO certification, SOC 2, TISAX, UNECE approval, or vehicle-level compliance. Formal scope depends on assessor review and OEM integration context.

MEASURED, NOT MARKETED

Safety Suite v1 results.

Five safety-critical scenarios run in CARLA with synthetic ground truth. Every scenario: earlier hazard detection, zero false positives, all braking margins met. Real-world replay against 182 k frames of comma openpilot drive logs below.

SCENARIO              BASELINE LEAD   FIELDSPACE LEAD   GAIN
Falling Debris        -0.50 s         +0.20 s           +0.70 s
Sudden Cut-In         -0.10 s         +0.20 s           +0.30 s
Occluded Pedestrian   -0.30 s         +0.20 s           +0.50 s
Stopped Vehicle       -0.40 s         +0.20 s           +0.60 s
Sliding Cargo         -0.80 s         +0.20 s           +1.00 s
+0.62s
Mean lead-time improvement
0
False positives across all scenarios
1.7 ms
Mean processing latency (p99: 2.6 ms)
4,777
PDE solver FPS (256×64, Numba)
REAL-WORLD · COMMA OPENPILOT REPLAY

182,505 frames of public drive data.

Frame-for-frame replay against comma.ai's openpilot CI route bucket — real cars, real roads, real radar and vision. FieldSpace emitted one warning event and zero false criticals across 31 segments of real driving. 85% fewer spurious alerts than the prior observer.

182 k
frames replayed
−85%
false positives vs. prior observer
993×
faster than real-time
reproducibility/tier3_replay_hazardous_provided_vy.json
Full methodology →
GO-TO-MARKET

Three deployment paths where the economics already work.

WEDGE

Closed-campus validation

Warehouse yards, ports, airport ground ops. Start with replay and shadow-mode review before any control authority.

DHL · FedEx · port authorities · airport ground services
DUAL-USE

Defense autonomy review

Deterministic, auditable safety behavior is useful for unmanned systems where review boards need traceable evidence.

DoD · primes · tactical autonomy programs
CAPITAL-LIGHT

Tier-1 / OEM validation

Run as an independent observer against customer logs and public benchmarks, then expand only where the evidence justifies it.

Passenger OEMs · commercial truck OEMs
INTEGRATION

Two ways to deploy.

Start in parallel with zero control authority. For bounded research programs, the same core can also produce trajectory candidates for closed-loop benchmark evaluation.

PATTERN A · SHADOW

Safety Observer

Runs alongside your existing stack. Zero control authority. Ingests fused objects, outputs hazard alerts with full explainability traces.

Parallel evaluation · Shadow mode · Lowest risk
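Pattern A amounts to a thin comparison harness beside the host stack. The sketch below assumes hypothetical decision strings and trace payloads rather than any real FieldSpace interface; it only shows the shadow-mode contract: record disagreements with their traces, never intervene.

```python
from dataclasses import dataclass, field

@dataclass
class ShadowLog:
    """Collects frames where the observer disagrees with the host stack."""
    disagreements: list = field(default_factory=list)

    def compare(self, frame_id, stack_decision, observer_decision, trace):
        # Zero control authority: we only record, never override the stack.
        if stack_decision != observer_decision:
            self.disagreements.append({
                "frame": frame_id,
                "stack": stack_decision,
                "observer": observer_decision,
                "trace": trace,  # e.g. peak risk, active trigger, timing
            })

log = ShadowLog()
log.compare(1042, "go", "slow", {"peak_risk": 0.47})
log.compare(1043, "go", "go", {"peak_risk": 0.12})
print(len(log.disagreements))  # 1
```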
PATTERN B · BENCHMARK PLANNER

Bounded closed-loop evaluation

Planner wrapper for simulation and controlled evaluation. Useful for nuPlan-style comparison, failure analysis, and deciding which semantics to add next.

Closed-loop simulation · Route progress · Failure analysis
NVIDIA Inception Program Member
CARLA + Isaac Sim evaluation work

Neural ADAS needs
an audit layer.

Deterministic, replayable, and built for technical review. Bring the scenes that matter; we will turn them into evidence your safety team can inspect.