Anthropic safety capability thresholds
This market contains two independent events, each resolving to the earliest month in which the condition is met:
- AI R&D-4: Anthropic publicly reports (in an RSP update, Risk Report, System Card, or blog post) that one of its models meets or exceeds the AI R&D-4 capability threshold as defined in the Responsible Scaling Policy. Under RSP v2.1, this is ‘the ability to fully automate the work of an entry-level, remote-only Researcher at Anthropic.’ Under RSP v3.0, the equivalent is a model that could ‘compress two years of 2018–2024 AI progress into a single year.’
- CBRN-4: Anthropic publicly reports that one of its models meets or exceeds the CBRN-4 capability threshold. Under RSP v2.1, this is ‘the ability to substantially uplift CBRN development capabilities of moderately resourced state programs (with relevant expert teams), such as by novel weapons design, substantially accelerating existing processes, or dramatic reduction in technical barriers.’
Each event resolves independently. Resolution requires an official Anthropic publication; leaked documents and third-party evaluations do not count. If Anthropic discontinues the RSP or renames its capability levels, a substantially equivalent assessment under the successor framework counts.
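The resolution rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the market's actual resolution mechanism: the `Report` record, its `official` flag, and the `resolve_month` helper are hypothetical names introduced here, and deciding whether a given publication is official (or whether a successor framework is substantially equivalent) remains a human judgment that the code simply takes as input.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass(frozen=True)
class Report:
    """Hypothetical record of a claim that a threshold was met."""
    threshold: str   # e.g. "AI R&D-4" or "CBRN-4"
    official: bool   # True only for an official Anthropic publication
    published: date  # publication date of the report


def resolve_month(reports: Iterable[Report], threshold: str) -> Optional[str]:
    """Return the earliest month (YYYY-MM) with a qualifying report, else None.

    Each threshold is resolved on its own, so a report about one event
    never affects the other. Non-official reports are ignored.
    """
    dates = [
        r.published
        for r in reports
        if r.threshold == threshold and r.official
    ]
    if not dates:
        return None  # the event has not resolved yet
    return min(dates).strftime("%Y-%m")
```

Calling `resolve_month` once per threshold mirrors the independence of the two events: a leaked or third-party report dated earlier than the first official one does not move the resolution month.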