Beaver Framework

Leaderboard

Risky Distribution Ratio (RDR) per dataset — the percentage of test instances where a model's estimated risk of violating a behavioral constraint exceeds 10%. Lower RDR is better. Results use the Frontier Verifier. Click any dataset column header to explore full results for that dataset.

Model	Org	Params	Overall

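RDR = Risky Distribution Ratio = unsatisfied / count, where unsatisfied is the number of test instances whose estimated risk exceeds 10% and count is the total number of test instances. Lower is better. Color: ■ <1% ■ 1–10% ■ ≥10%. All results use the Frontier Verifier; see methodology. To submit your model, open a pull request on GitHub.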
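As a rough illustration of the unsatisfied / count ratio, the sketch below computes RDR from a list of per-instance risk estimates. The function name, the input format, and the strict `>` comparison at the 10% threshold are assumptions made for this example; the framework's own data layout may differ.

```python
from typing import Sequence

RISK_THRESHOLD = 0.10  # the 10% cut-off stated in the leaderboard description


def risky_distribution_ratio(
    risks: Sequence[float], threshold: float = RISK_THRESHOLD
) -> float:
    """Fraction of test instances whose estimated violation risk exceeds threshold.

    `risks` holds one estimated risk per test instance (hypothetical input format).
    """
    if not risks:
        raise ValueError("need at least one test instance")
    # An instance counts as unsatisfied when its risk is strictly above the threshold.
    unsatisfied = sum(1 for r in risks if r > threshold)
    return unsatisfied / len(risks)


# Made-up risk estimates for five test instances.
risks = [0.02, 0.15, 0.10, 0.40, 0.00]
rdr = risky_distribution_ratio(risks)
print(f"RDR = {rdr:.0%}")
```

Here two of the five instances exceed the threshold, so the ratio is 2/5.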

About Beaver Framework

Beaver is an open framework for rigorous LLM evaluation against behavioral constraints, built around deterministic probability bounds rather than sampling estimates.

Constraints & Tasks

Define what "correct" behavior means as a binary predicate on model outputs. Beaver evaluates models against constraints spanning security, privacy, toxicity, stereotyping, and more — each paired with a curated prompt dataset.
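To make "binary predicate on model outputs" concrete, here is a minimal sketch of a privacy-style constraint. The predicate `no_email_leak` and its regular expression are hypothetical and written only for illustration; they are not taken from Beaver's constraint suite.

```python
import re

# Simplified e-mail pattern; an assumption for this sketch, not a full RFC 5322 matcher.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def no_email_leak(output: str) -> bool:
    """Return True when the model output contains no e-mail address."""
    return EMAIL_RE.search(output) is None


print(no_email_leak("Contact support via the help page."))
print(no_email_leak("Reach me at alice@example.com"))
```

A constraint in this style maps every output to satisfied or violated, which is what lets an evaluator talk about the probability that a model's output satisfies it.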

Sound Probability Bounds

Rather than sampling estimates, Beaver computes a certified interval [P_LB, P_UB] guaranteed to contain the true constraint-satisfaction probability. Bounds tighten monotonically and are valid at any point during evaluation.

Two Verifiers

The Frontier Verifier exploits prefix-closure (once a prefix violates the constraint, every continuation of it does too) to prune the output space early, achieving much tighter bounds per forward pass. The Sampling Verifier provides a simpler baseline that works for any constraint structure.
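The idea of pruning with a prefix-closed constraint while maintaining an interval [P_LB, P_UB] can be sketched on a toy model. Everything below is an assumption made for illustration: `toy_next_token_probs` stands in for an LLM forward pass, the constraint is "the output never contains the token `bad`", and the best-first expansion order is one plausible choice, not necessarily the Frontier Verifier's actual strategy.

```python
import heapq
from typing import Dict, List, Tuple

EOS = "<eos>"
Prefix = Tuple[str, ...]


def toy_next_token_probs(prefix: Prefix) -> Dict[str, float]:
    """Hypothetical stand-in for one forward pass of a language model."""
    if len(prefix) >= 3:
        return {EOS: 1.0}  # force termination after three tokens
    return {"ok": 0.6, "bad": 0.1, EOS: 0.3}


def satisfies(prefix: Prefix) -> bool:
    """Prefix-closed toy constraint: a violating prefix can never be repaired."""
    return "bad" not in prefix


def frontier_bounds(max_passes: int) -> List[Tuple[float, float]]:
    """Return the (P_LB, P_UB) interval after each forward pass."""
    p_lb = 0.0      # mass of completed sequences that satisfy the constraint
    violated = 0.0  # mass of pruned prefixes that already violate it
    frontier: List[Tuple[float, Prefix]] = [(-1.0, ())]  # max-heap via negated mass
    history: List[Tuple[float, float]] = []
    for _ in range(max_passes):
        if not frontier:
            break
        neg_mass, prefix = heapq.heappop(frontier)  # expand the heaviest prefix
        for token, q in toy_next_token_probs(prefix).items():
            mass = -neg_mass * q
            extended = prefix + (token,)
            if not satisfies(extended):
                violated += mass  # prune: every continuation also violates
            elif token == EOS:
                p_lb += mass      # finished and satisfying
            else:
                heapq.heappush(frontier, (-mass, extended))
        history.append((p_lb, 1.0 - violated))
    return history


for lo, hi in frontier_bounds(max_passes=10):
    print(f"[{lo:.3f}, {hi:.3f}]")
```

In this toy setting the lower bound only grows and the upper bound only shrinks, so the interval is valid after every pass, and it collapses to a single value (about 0.804) once the frontier is exhausted.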
