The starting probabilities for this market were calculated using logistic CDF
models fitted to anchor points calibrated against Anthropic’s own RSP assessment
history, cross-lab safety-framework patterns, and METR capability-trend data. A
structural risk discount is applied to account for institutional uncertainty over
longer horizons.
Events
This market tracks two milestones from Anthropic’s Responsible Scaling Policy (RSP):
- AI R&D-4: Anthropic publicly reports that a model meets the AI R&D-4
capability threshold. Under RSP v2.1 (Mar 2025), this is defined as “the ability
to fully automate the work of an entry-level, remote-only Researcher at
Anthropic.” Under RSP v3.0 (Feb 2026), the equivalent threshold is a model that
could “compress two years of 2018–2024 AI progress into a single year” (~1000x
effective compute scaleup). As of February 2026, Anthropic explicitly states that
Claude Opus 4.6 does not cross this threshold, though ruling it out is getting
harder.
- CBRN-4: Anthropic publicly reports that a model meets the CBRN-4
capability threshold. Under RSP v2.1, this is “the ability to substantially
uplift CBRN development capabilities of moderately resourced state programs (with
relevant expert teams), such as by novel weapons design, substantially accelerating
existing processes, or dramatic reduction in technical barriers.” Under RSP
v3.0, the equivalent is a model that can “significantly help threat actors
(e.g., moderately resourced expert-backed teams) create/obtain and deploy chemical
and/or biological weapons with potential for catastrophic damages far beyond those of
past catastrophes such as COVID-19.” This is a much higher bar than the CBRN-3
threshold (which targets individuals with basic STEM backgrounds and triggered ASL-3
protections in May 2025). ASL-4 safeguards have not yet been defined.
Anthropic RSP assessment history
Anthropic has been unusually transparent about threshold assessments, publishing
regular RSP updates and risk reports. This timeline directly informs the
calibration:
| Date | Event |
| --- | --- |
| Sep 2023 | RSP launched with ASL-1 through ASL-4+. All current models assessed as ASL-2. |
| Oct 2024 | Major RSP update (v2). All models still ASL-2. |
| Mar 2025 | RSP v2.1: AI R&D thresholds disaggregated into AI R&D-4 and AI R&D-5; CBRN threshold detail added. |
| May 2025 | ASL-3 activated for Claude Opus 4, driven by CBRN-3 concerns (steadily increasing performance on Virology Capabilities Test, stronger virus acquisition task performance). Anthropic could not clearly rule out ASL-3 risks. ASL-4 explicitly ruled out for Opus 4. |
| Feb 2026 | Claude Opus 4.6 assessed: does not cross AI R&D-4, but “confidently ruling out this threshold is becoming increasingly difficult.” RSP v3.0 published with major rewrite: rigid numbered thresholds replaced by capability-to-mitigation mapping tables. Sabotage risk report published for Opus 4.6. |
Key implications: The Feb 2026 assessment directly constrains near-term AI R&D-4
probability, because the latest flagship model was explicitly assessed and found to
fall short. The next realistic opportunity is a future model generation, likely late
2026 at the earliest. For CBRN-4, the gap between CBRN-3 (triggered May 2025) and
CBRN-4 is very large: CBRN-3 targets non-experts while CBRN-4 requires uplifting
state-backed expert teams.
METR capability trend
METR’s March 2025 analysis found that the frontier-agent task horizon has been
doubling roughly every 7 months. Current systems achieve a ~1-hour task
horizon at 50% reliability, with <10% success on tasks over ~4 hours. The AI R&D-4
threshold (“fully automate entry-level researcher”) likely requires reliable
multi-day autonomous performance, suggesting this crossing is still some years away even
with steep improvement trends.
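
The arithmetic behind this paragraph can be sketched as a simple extrapolation. The snippet below is a rough illustration, not part of the market's model: it assumes the ~7-month doubling continues unchanged from a ~1-hour horizon, and the target horizons (8 and 40 hours) are hypothetical stand-ins for "multi-day" work that do not come from the source.

```python
import math

DOUBLING_MONTHS = 7.0         # doubling time cited from METR's analysis
CURRENT_HORIZON_HOURS = 1.0   # ~1-hour horizon at 50% reliability, as cited above


def months_until_horizon(target_hours,
                         current_hours=CURRENT_HORIZON_HOURS,
                         doubling_months=DOUBLING_MONTHS):
    """Months for the 50%-reliability horizon to reach target_hours,
    assuming the doubling trend holds exactly (a strong assumption)."""
    if target_hours <= current_hours:
        return 0.0
    doublings = math.log2(target_hours / current_hours)
    return doublings * doubling_months


# Hypothetical targets chosen only for illustration.
for label, hours in [("8-hour working day", 8), ("40-hour working week", 40)]:
    months = months_until_horizon(hours)
    print(f"{label}: about {months:.1f} months ({months / 12:.1f} years)")
```

These figures describe the horizon at 50% reliability only. The "reliable" performance the threshold appears to require would sit further out on the same trend, which is consistent with the "some years away" reading above.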
Cross-lab reference points
- OpenAI: “High” AI R&D threshold (mid-career ML
research engineer) not yet reached. o3 reportedly “medium.” GPT-4o rated
“low” in bio/cyber.
- Google DeepMind: No CCL-1 threshold crossed through Gemini 3.1 Pro
(Feb 2026). CBRN and Cyber alert thresholds reached but CCLs confirmed not met.
The cross-lab pattern is consistent: intermediate capability thresholds are being
approached or crossed, but the most severe levels remain unmet at all labs.
Methodology
For each event, we set anchor points (month, cumulative probability) balancing:
- Anthropic’s own assessments: the Feb 2026 “does not
cross AI R&D-4” statement and the May 2025 “ASL-4 ruled out”
for CBRN directly constrain near-term probabilities
- METR task-horizon trend (~7-month doubling)
- Cross-lab threshold history: no lab has publicly reported a
top-level crossing
- RSP framework churn: three major rewrites in 2.5 years increase
structural uncertainty
A logistic CDF is fitted to each event’s anchors via least-squares. A
structural / institutional risk discount is then applied multiplicatively,
using a constant hazard rate of ~1.3%/year for the risk that the “framework
becomes unresolvable” (RSP restructuring, threshold redefinition, reporting changes),
calibrated so that cumulative structural risk reaches ~7% at the 5.5-year horizon.
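
A minimal sketch of this two-step procedure follows, under stated assumptions. The anchor values are hypothetical (the source does not list them), the fit is done by ordinary least squares in logit space (the source says only "least-squares", so the actual loss may differ), and the discount is read as multiplying the fitted CDF by the survival probability `exp(-h * t)`. All function names are illustrative.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def logistic_cdf(t, mu, s):
    """Logistic CDF with location mu and scale s (both in months)."""
    return 1.0 / (1.0 + math.exp(-(t - mu) / s))


def fit_logistic_cdf(anchors):
    """Fit (mu, s) to (month, cumulative probability) anchors.

    Assumption: ordinary least squares on logit(p) = a*t + b,
    which gives s = 1/a and mu = -b/a in closed form.
    """
    ts = [t for t, _ in anchors]
    ys = [logit(p) for _, p in anchors]
    n = len(anchors)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in ts)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    a = sxy / sxx
    b = y_mean - a * t_mean
    return -b / a, 1.0 / a


HAZARD_PER_YEAR = 0.013  # ~1.3%/year, as stated in the text


def structural_survival(months, hazard_per_year=HAZARD_PER_YEAR):
    """Probability the framework is still resolvable after `months`."""
    return math.exp(-hazard_per_year * months / 12.0)


def discounted_probability(months, mu, s):
    """Fitted CDF multiplied by the structural survival factor."""
    return logistic_cdf(months, mu, s) * structural_survival(months)


# Hypothetical anchors: (months from market start, cumulative probability).
anchors = [(12, 0.05), (36, 0.30), (66, 0.60)]
mu, s = fit_logistic_cdf(anchors)
for m in (12, 36, 66):
    print(m, round(logistic_cdf(m, mu, s), 3),
          round(discounted_probability(m, mu, s), 3))

# Cumulative structural risk at the 5.5-year (66-month) horizon.
print(round(1.0 - structural_survival(66), 3))
```

With a 1.3%/year hazard, the cumulative structural risk at 5.5 years is `1 - exp(-0.013 * 5.5)`, roughly 6.9%, which matches the ~7% calibration stated above.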
Key assumptions
- Anthropic continues publishing capability assessments (RSP updates, Risk Reports,
or equivalent)
- The RSP threshold concepts remain substantially similar even as specific definitions
evolve (the structural discount accounts for the risk that they do not)
- AI R&D-4 progresses faster than CBRN-4 (stronger commercial incentive, more
training signal, and closer to current capability levels)
- CBRN-4 requires a much larger capability jump from CBRN-3 than most other
threshold gaps
Sources