Decision Intel
Voluntary standard · v1.0

The Recognition-Rigor Framework

A first-party, voluntary standard for human-AI strategic reasoning. Kahneman’s debiasing tradition and Klein’s Recognition-Primed Decision tradition, run in one pipeline and arbitrated into a single artifact. Any product can self-assess against the rubric below.


Why a voluntary standard

Regulators are already asking

The EU AI Act (Article 12 · record-keeping; Article 14 · human oversight; Article 15 · accuracy) and Basel III ICAAP both require documented reasoning on consequential decisions. R²F is one operational answer to "what does good look like?"

Both research traditions, not one

Kahneman (Nobel laureate, 2002) and Klein converged in their 2009 paper "Conditions for Intuitive Expertise." The rigor tradition says expert intuition is biased; the recognition tradition says it is load-bearing. Real strategic reasoning needs both — arbitrated, not picked.

Usage beats certification

No third-party body certifies strategy products today. The path to a stable category is consistent usage of a shared rubric. R²F is published so others can self-assess and opt in to the registry.

The six tenets

Three rigor tenets on Kahneman’s side, three recognition tenets on Klein’s. Arbitrated by an explicit meta-judge stage — the glue that makes R²F different from running two pipelines in parallel.

143 cases in the reference corpus
Kahneman · rigor · TENET 1

Bias scan

Every memo is scanned against a published, stable taxonomy of cognitive biases.

An R²F-aligned system publishes its bias taxonomy with stable IDs. Scans are reproducible — the same memo produces the same flags on re-run with the same model version.

Decision Intel pipeline stage: Bias Detective
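A minimal sketch of what a published taxonomy with stable IDs and a reproducible scan could look like. The entries, IDs, and trigger phrases below are invented for illustration; they are not Decision Intel's published taxonomy:

```typescript
// Illustrative bias taxonomy with stable IDs. Entries and trigger
// phrases are placeholders, not the real published taxonomy.
interface BiasEntry {
  id: string;          // stable across versions, e.g. "BIAS-001"
  name: string;
  triggers: string[];  // deterministic surface cues
}

const TAXONOMY: BiasEntry[] = [
  { id: 'BIAS-001', name: 'Overconfidence', triggers: ['certainly', 'guaranteed'] },
  { id: 'BIAS-002', name: 'Sunk cost', triggers: ['already invested', 'too far in'] },
];

// A pure function of (memo, taxonomy): the same memo produces the
// same flags on every re-run, which is the reproducibility property
// the tenet demands.
function scanMemo(memo: string, taxonomy: BiasEntry[]): string[] {
  const text = memo.toLowerCase();
  return taxonomy
    .filter(entry => entry.triggers.some(t => text.includes(t)))
    .map(entry => entry.id);
}
```

A real scanner would use a model rather than keyword matching, but the contract is the same: stable IDs in, deterministic flags out.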
Kahneman · rigor · TENET 2

Noise audit

The system measures internal inconsistency, not only bias.

Kahneman (Noise, 2021) separates noise from bias. An R²F-aligned system produces a noise score for every memo — within-memo contradictions, tone variance, and reasoning-chain divergence.

Decision Intel pipeline stage: Noise Judge
Kahneman · rigor · TENET 3

Base-rate pull

Every claim is anchored to a reference class.

The system surfaces the base-rate for the category of decision (e.g. cross-border acquisitions, market entry, enterprise software pivots) so the memo is read against prior precedent, not in isolation.

Decision Intel pipeline stage: Ensemble Sampling
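A toy sketch of base-rate pull: look up the prior for the memo's reference class and surface it next to the memo's own estimate. The classes and rates below are invented placeholders, not figures from the reference corpus:

```typescript
// Illustrative base-rate table keyed by reference class. The rates
// here are made up for the sketch.
const BASE_RATES: Record<string, number> = {
  'cross-border-acquisition': 0.35,
  'market-entry': 0.45,
  'enterprise-software-pivot': 0.30,
};

// Surface the prior alongside the memo's own estimate so the reader
// sees the gap, rather than silently overriding either number.
function anchorClaim(referenceClass: string, memoEstimate: number) {
  const baseRate = BASE_RATES[referenceClass];
  return { referenceClass, baseRate, memoEstimate, gap: memoEstimate - baseRate };
}
```

Reporting the gap rather than replacing the estimate is deliberate: the tenet asks that claims be read against precedent, not mechanically corrected toward it.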
Klein · recognition · TENET 4

Pattern match to playbooks

Expert intuition is surfaced, not overruled.

Klein (Sources of Power, 1998) — experts recognise patterns. An R²F-aligned system produces a list of the historical playbooks the memo most resembles, with confidence and analogues.

Decision Intel pipeline stage: Recognition-Primed Decision
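Playbook matching can be sketched as similarity search: score the memo's feature vector against each historical playbook and rank by confidence. The vectors and playbook names below are hypothetical stand-ins for the reference corpus:

```typescript
// Cosine similarity between two feature vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] ** 2;
    normB += b[i] ** 2;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank historical playbooks by resemblance to the memo, with a
// confidence score attached to each analogue.
function matchPlaybooks(memo: number[], playbooks: { name: string; vec: number[] }[]) {
  return playbooks
    .map(p => ({ name: p.name, confidence: cosine(memo, p.vec) }))
    .sort((x, y) => y.confidence - x.confidence);
}
```

Note the Klein-faithful design choice: the output is a ranked list with confidences, so expert intuition is surfaced as evidence rather than overruled by a single verdict.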
Klein · recognition · TENET 5

Forgotten questions

The system names what the memo does not ask.

Strategic failures are disproportionately caused by un-asked questions. An R²F-aligned system flags the high-value questions the memo sidestepped and the ones the CEO or board will most likely raise.

Decision Intel pipeline stage: Forgotten Questions
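One naive mechanisation of this tenet is a checklist diff: compare the questions a memo actually poses against a published list of high-value questions. The checklist entries below are illustrative, not the product's real question set:

```typescript
// Illustrative checklist of high-value questions a strategy memo
// should confront. Entries are placeholders for the sketch.
const CHECKLIST: string[] = [
  'What is the base rate for this class of decision?',
  'What would make us walk away?',
  'Who loses if this succeeds?',
];

// Return the checklist questions the memo never asked.
function forgottenQuestions(memoQuestions: string[]): string[] {
  const asked = new Set(memoQuestions.map(q => q.toLowerCase()));
  return CHECKLIST.filter(q => !asked.has(q.toLowerCase()));
}
```

A production system would match questions semantically rather than verbatim, but the output shape is the point: the absence of a question becomes a first-class, listable finding.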
Klein · recognition · TENET 6

Pre-mortem

Every memo is told from a future in which it failed.

Gary Klein’s pre-mortem technique: assume the decision went wrong, then reason backward to why. An R²F-aligned system produces at least one pre-mortem per memo, anchored in the specific bias signature flagged.

Decision Intel pipeline stage: Pre-mortem

Self-assessment rubric

Ten yes/no questions. Score your own product, not ours. A score of 4+ earns Bronze; 7+ earns Silver; 10 earns Gold. This is a self-assessment — no external audit body certifies the result, and the standard is published so it can be disputed in public.

  1. Does the product publish its bias taxonomy with stable IDs?
  2. Are bias flags reproducible across runs on the same memo?
  3. Is a separate noise score produced for each memo?
  4. Is every strategic claim anchored to an explicit reference class?
  5. Does the system surface pattern analogues from a published corpus?
  6. Are "forgotten questions" flagged as a first-class output?
  7. Is a pre-mortem produced for every memo, not just on request?
  8. Is the pipeline’s reasoning arbitrated by an explicit meta-judge stage?
  9. Does every output trace back to a hashed, tamper-evident record?
  10. Does the system treat expert intuition as amplified input, not noise to suppress?
Bronze · R²F-aligned
4 of 10 rubric criteria in place. Minimum bar to claim R²F alignment — bias + noise + one recognition signal.
Silver · R²F-integrated
7 of 10 rubric criteria in place. Both rigor and recognition pipelines are running, including a named meta-judge.
Gold · R²F-arbitrated
All 10 rubric criteria in place. Full Kahneman × Klein synthesis with a signed, reproducible audit record.
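The tier thresholds above reduce to a trivial mapping from yes-count to tier, which can be sketched directly:

```typescript
// Map a yes-count on the ten rubric questions to an R²F tier:
// 10 → Gold, 7–9 → Silver, 4–6 → Bronze, below 4 → not aligned.
type Tier = 'Gold' | 'Silver' | 'Bronze' | 'Not aligned';

function r2fTier(yesCount: number): Tier {
  if (!Number.isInteger(yesCount) || yesCount < 0 || yesCount > 10) {
    throw new RangeError('yesCount must be an integer from 0 to 10');
  }
  if (yesCount === 10) return 'Gold';
  if (yesCount >= 7) return 'Silver';
  if (yesCount >= 4) return 'Bronze';
  return 'Not aligned';
}
```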
R²F DETECTOR ATLAS · 10 OF 10 SHIPPED

Every detector. Every paper. Every line of code.

Each entry below is a procurement-grade signal anchored in a specific academic paper. The 10-paper sprint completed 2026-05-07 with the Calibrated Rejection lock; click any detector to inspect its mechanism, implementation file, and live product surfaces.

Calibration baseline · investor-diligence answer

Brier 0.258 ± 0.012 across 143 historical corporate decisions.

Procurement-stage diligence asks: "show me your outcome calibration." The published R²F methodology is run retrospectively over 143 historical corporate decisions where the outcome is known — Brier-fair (evidence dimension neutralised, so predictions never peek at ground truth). The proper scoring rule does the rest. The 95% CI comes from a 10,000-iteration deterministic bootstrap with a pinned seed; the reproducibility recipe sits below.
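The Brier score itself is just the mean squared difference between probabilistic predictions and binary outcomes. A minimal sketch (the sample data is illustrative, not the 143-case corpus):

```typescript
// Proper scoring rule: mean of (prediction − outcome)² over cases.
// 0 is perfect; an uninformative constant 0.5 scores 0.25.
function meanBrier(predictions: number[], outcomes: (0 | 1)[]): number {
  if (predictions.length !== outcomes.length) {
    throw new Error('predictions and outcomes must align');
  }
  const total = predictions.reduce(
    (sum, p, i) => sum + (p - outcomes[i]) ** 2,
    0,
  );
  return total / predictions.length;
}
```

Because squaring punishes confident misses hardest, the rule rewards honest probabilities: a system cannot improve its score by overstating certainty.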

Methodology version progression
v2.0.0 · DEPRECATED
Legacy

Pre-validity-weighted DQI. Surfaced on audits run before 2026-04-30.

v2.0.0-seed · SEED
Platform seed baseline

The number above. Computed retrospectively over the 143-case library; evidence dimension neutralised so predictions don't peek at ground truth.

v2.1.0 · LIVE
Live audits

Validity-aware weight shift per Kahneman & Klein 2009 first condition. Active on every new audit since 2026-04-30.

per-org · FUTURE
Customer outcomes

When a customer org accumulates closed outcomes via Outcome Gate enforcement, per-org Brier replaces the seed baseline on every DPR they generate.

Where the seed baseline lands · Tetlock-anchored scale
0.130 · Tetlock superforecasters
0.230 · CIA analysts (calibrated)
0.250 · Coin-flip benchmark
0.258 · Decision Intel · platform seed
0.350 · Motivated amateur
Seed, not forecast

These cases were audited retrospectively. The platform did NOT predict them ahead of time. The Brier number reads: “if the methodology had been applied at decision time using only pre-decision signal, this is how it would have calibrated against ground truth.”

Per-org Brier supersedes

When a customer org accumulates closed outcomes via Outcome Gate enforcement, per-org calibration replaces the seed baseline on every DPR they generate. Until then, the seed is the contractual answer.

Reproducibility

n = 143 historical corporate decisions · mean Brier 0.258 ± 0.012 (95% CI, 10,000-iteration bootstrap, seed 17039507) · methodology v2.0.0-seed · computed 2026-04-30.

Reproducibility recipe
GET /api/intelligence/calibration-baseline
// Reproduce the Brier baseline locally
import { computePlatformCalibrationBaseline } from '@/lib/learning/platform-baseline';

const baseline = computePlatformCalibrationBaseline();
// → {
//     n: 143,
//     meanBrier: 0.258,
//     brierCi95: { lower: 0.245, upper: 0.270 },
//     bootstrapIterations: 10000,
//     bootstrapSeed: 17039507,
//     methodologyVersion: '2.0.0-seed',
//   }
The function reads `ALL_CASES` from the case-study library, runs `computeBrierFairPredictedDqi` (the no-evidence-peeking variant of the DQI formula) over each case, and bootstraps the mean with a seeded mulberry32 PRNG. Same seed → same number across machines and dates.
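The determinism claim can be demonstrated in isolation, assuming mulberry32 refers to the well-known 32-bit seeded PRNG of that name. The data below is illustrative, not the 143-case corpus:

```typescript
// mulberry32: a standard tiny seeded PRNG yielding floats in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Percentile bootstrap of the mean with a pinned seed: same seed and
// same data produce a bit-identical CI on any machine, on any date.
function bootstrapCi95(data: number[], iterations: number, seed: number) {
  const rand = mulberry32(seed);
  const means: number[] = [];
  for (let i = 0; i < iterations; i++) {
    let sum = 0;
    for (let j = 0; j < data.length; j++) {
      sum += data[Math.floor(rand() * data.length)];  // resample with replacement
    }
    means.push(sum / data.length);
  }
  means.sort((x, y) => x - y);
  return {
    lower: means[Math.floor(iterations * 0.025)],
    upper: means[Math.floor(iterations * 0.975)],
  };
}
```

Pinning the seed trades a sliver of statistical purity for auditability: anyone disputing the CI can recompute it exactly rather than arguing about sampling luck.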
Canonical citation

Kahneman, D. & Klein, G. (2009). Conditions for Intuitive Expertise: A failure to disagree. American Psychologist, 64(6), 515–526. The framing paper where the two traditions converged. R²F is the operationalisation.

Opt in to the registry
R²F Standard · v1.0 · Published 2026-04-23