Research · March 19, 2026 · 7 min read

We rebuilt the browser-agent leaderboards

A fairer, reproducible way to compare browser agents on real tasks.

Marco T.
Engineering

Benchmarks age badly. We rebuilt our browser-agent leaderboards around reproducible tasks and a fixed harness, so the numbers mean the same thing every time you read them.

What we measure

  • Task success rate on a fixed set of real sites
  • Median steps to completion
  • Wall-clock time per task
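
As a rough illustration of how the three numbers above could be derived from raw run records, here is a minimal sketch. It is not the leaderboard's actual harness code: the `TaskRun` record and the `summarize` function are hypothetical names, and it assumes that median steps is taken over successful runs only and that wall-clock time is averaged over all runs.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class TaskRun:
    """One agent attempt at one task (hypothetical record shape)."""
    task_id: str
    succeeded: bool
    steps: int
    wall_clock_s: float


def summarize(runs):
    """Aggregate a list of TaskRun records into the three headline metrics."""
    if not runs:
        raise ValueError("no runs to summarize")

    # Assumption: steps to completion only makes sense for runs that completed.
    completed = [r for r in runs if r.succeeded]

    return {
        "success_rate": len(completed) / len(runs),
        "median_steps": median(r.steps for r in completed) if completed else None,
        # Assumption: time per task is a mean over every run, failed or not.
        "mean_wall_clock_s": sum(r.wall_clock_s for r in runs) / len(runs),
    }
```

Calling `summarize` on four runs where three succeed with 4, 8, and 6 steps gives a success rate of 0.75 and a median of 6 steps. Whether failed runs should count toward timing is a design choice; including them keeps an agent from looking fast simply by giving up early.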

Open by default

The harness and tasks are open so anyone can rerun them. A leaderboard you cannot reproduce is just a screenshot.
