An (actually useful) framework for evaluating AI code review tools
Benchmarks have always promised objectivity. Reduce a complex system to a score, compare competitors on equal footing, and let the numbers speak for themselves.
But in practice, benchmarks rarely measure “quality” in the abstract. They measure whate...