17 points | by chelm 12 hours ago
1 comment
A clearly LLM-written piece about how frontier models are struggling to get past 76% accuracy on their benchmarks (they call it a "wall") in OCR tasks, that is, feeding a model a picture of a document and asking it to extract the text.
The benchmark site is here: https://www.idp-leaderboard.org/
They say some specialist models get better results on their benchmarks (Nanonets OCR-3 at 85.9%).