006·326
benchmark saturation
/ˈbɛntʃmɑːrk ˌsætʃəˈreɪʃən/ - n.
1. [colloq.] Everyone got an A, so the test resigned.
Working definition
2. The point at which models score near-perfectly on a benchmark, so it no longer distinguishes their real capability.
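As a rough illustration of sense 2, the check below flags a benchmark as saturated when every model's score sits close to the maximum. The function name `is_saturated`, the `ceiling` parameter, and the 0.95 cutoff are assumptions made for this sketch, not part of the definition; a real cutoff would depend on the benchmark's noise and label error rate.

```python
def is_saturated(scores, ceiling=1.0, near=0.95):
    """Return True if every score is within `near` of the ceiling.

    `scores` are per-model results on one benchmark, on the same scale
    as `ceiling`. The 0.95 default is an illustrative assumption.
    """
    if not scores:
        raise ValueError("need at least one score")
    # If even the weakest model is near the top, the benchmark has
    # little room left to separate models from one another.
    return min(scores) >= near * ceiling


print(is_saturated([0.97, 0.98, 0.99]))  # all models near the ceiling
print(is_saturated([0.60, 0.98]))        # scores still spread out
```

Checking the minimum rather than the mean is a deliberate choice here: a benchmark still distinguishes models as long as at least one of them falls well short of the ceiling.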