AI Capabilities

Capabilities Dashboard

Track how leading model capabilities evolve, see who is pushing the frontier forward, and compare trajectories across reasoning, coding, math, and agent benchmarks.

Last sync: May 1, 2026

New benchmarks: 47

New models: 133

Frontier leader: GPT-5.4 Pro (web)

What changed

Summary of the latest active Epoch update now powering the dashboard.

Mar 5, 2026
GPQA Diamond
GPT-5.4 2026.03 05 (Xhigh)
Visible score: 0.9

Mar 5, 2026
FrontierMath Tiers 1-3
GPT-5.4 Pro 2026.03 05 (Xhigh)
Visible score: 0.5

Mar 5, 2026
WeirdML (v2)
GPT-5.4 2026.03 05 None
Visible score: 0.6

Mar 5, 2026
SWE-bench Verified
GPT-5.4 2026.03 05 (High)
Visible score: 0.8