Can We Trust AI Benchmarks? A Review of Current Issues in AI Evaluation

Article URL: https://arxiv.org/abs/2502.06559
Comments URL: https://news.ycombinator.com/item?id=43057968
Points: 8
# Comments: 0