We synthesize answers from every leading AI model on the hardest problems in physics. Decomposing claims. Surfacing convergence and disagreement. Showing our work.
A single hard physics problem — open or partially open — attacked by the synthesis protocol in real time. Five frontier models generate independent answers. Claims are decomposed into atomic, verifiable units. Convergence and divergence are made visible. The adversarial layer stress-tests every derivation.
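The convergence step can be sketched in a few lines. This is a minimal illustration, not the protocol itself: the function names, the 0.8 threshold, and the model labels are hypothetical, and it assumes claims have already been decomposed and normalized so that identical claims compare equal as strings.

```python
from collections import defaultdict

def tally_claims(claims_by_model):
    """Map each claim to the sorted list of models that assert it."""
    support = defaultdict(set)
    for model, claims in claims_by_model.items():
        for claim in claims:
            support[claim].add(model)
    return {claim: sorted(models) for claim, models in support.items()}

def split_convergence(claims_by_model, threshold=0.8):
    """Split claims into convergent and divergent sets.

    A claim counts as convergent when the fraction of models asserting it
    is at least `threshold` (an assumed cutoff, not taken from the source).
    """
    n_models = len(claims_by_model)
    support = tally_claims(claims_by_model)
    convergent, divergent = {}, {}
    for claim, models in support.items():
        bucket = convergent if len(models) / n_models >= threshold else divergent
        bucket[claim] = models
    return convergent, divergent

# Hypothetical input: five models, claims already reduced to atomic strings.
example = {
    "m1": {"E_n scales as n^2"},
    "m2": {"E_n scales as n^2", "ground state is degenerate"},
    "m3": {"E_n scales as n^2"},
    "m4": {"E_n scales as n^2"},
    "m5": {"E_n scales as n^2", "ground state is degenerate"},
}

convergent, divergent = split_convergence(example)
print("convergent:", convergent)
print("divergent:", divergent)
```

Exact string matching is the weakest assumption here; deciding whether two differently worded claims are the same claim is the hard part and is left out of this sketch.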
This is not a benchmark. It is a live, evolving demonstration of what AI can and cannot do when confronted with problems that matter.
Quarterly synthesis report evaluating every frontier and leading open-weight model on graduate-level physics across six domains: mechanics, electromagnetism, quantum mechanics, thermodynamics, relativity, and statistical mechanics. Closed-problem validation with ground truth. Free to read.
12–15 frontier and open-weight models. 30 closed problems across 6 physics domains. Synthesis protocol v0 compared against majority vote. Which models reason about physics, and which merely pattern-match?
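For reference, the majority-vote baseline might look like the sketch below. It covers only the baseline side of the comparison, since synthesis protocol v0 is not specified here. The problem IDs, answers, and ground truth are made-up placeholders, and answers are assumed to be normalized so that equal answers are equal strings.

```python
from collections import Counter

def majority_vote(answers):
    """Return the most common answer; ties go to the answer seen first."""
    return Counter(answers).most_common(1)[0][0]

def accuracy(predictions, ground_truth):
    """Fraction of problems whose predicted answer matches the ground truth."""
    correct = sum(predictions[pid] == ground_truth[pid] for pid in ground_truth)
    return correct / len(ground_truth)

# Placeholder data: final answers from three hypothetical models per problem.
answers_by_problem = {
    "p1": ["2.0 J", "2.0 J", "1.5 J"],
    "p2": ["0", "1", "1"],
}
ground_truth = {"p1": "2.0 J", "p2": "0"}

votes = {pid: majority_vote(ans) for pid, ans in answers_by_problem.items()}
print(votes)
print(accuracy(votes, ground_truth))
```

In this toy data the vote is wrong on `p2` because the majority agrees on an incorrect answer, which is the failure mode a claim-level synthesis would need to beat.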