Hugging Face Open LLM Leaderboard
Description: Technical benchmark leaderboard for open-source LLMs with 6 standardized tests
Website: https://huggingface.co/open-llm-leaderboard
The Hugging Face Open LLM Leaderboard is the central destination for objective benchmarks of open-source language models. It uses standardized tests to separate real progress from marketing.
Evaluation Methodology
6 core benchmarks, run via EleutherAI's Language Model Evaluation Harness:
- IFEval: Instruction-following with strict format requirements
- BBH: 23 difficult tasks (arithmetic, reasoning, language understanding)
- MATH Lvl 5: Level-5 (hardest tier) problems from high school math competitions
- GPQA: Graduate-level questions (biology, physics, chemistry)
- MuSR: Multistep soft reasoning
- MMLU-PRO: Refined, harder version of MMLU with ten answer choices per question
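The leaderboard combines these per-benchmark results into a single average score used for ranking. A minimal sketch of that aggregation with hypothetical scores (the real leaderboard also normalizes each score against its random-guessing baseline before averaging, which this sketch omits):

```python
from statistics import mean

# Hypothetical per-benchmark scores (0-100) for one model; not real results.
scores = {
    "IFEval": 72.4,
    "BBH": 45.1,
    "MATH Lvl 5": 18.9,
    "GPQA": 29.3,
    "MuSR": 38.0,
    "MMLU-PRO": 41.2,
}

# Rank models by the unweighted mean across all six benchmarks.
average = mean(scores.values())
print(f"Average score: {average:.2f}")
```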
Features
- Filterable: By model size, license, architecture
- Detailed metrics: Full benchmark results per model
- Community-driven: 17-person team, 1,696+ followers
- Transparency: All evaluation datasets publicly available
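The filters above amount to a simple query over model records. A sketch of that idea, using invented records and field names (these do not reflect the leaderboard's actual schema):

```python
# Hypothetical model records; the fields mirror the leaderboard's filters
# (size, license, architecture) but the data is invented for illustration.
models = [
    {"name": "model-a", "params_b": 7, "license": "apache-2.0",
     "architecture": "llama", "average": 42.1},
    {"name": "model-b", "params_b": 70, "license": "custom",
     "architecture": "llama", "average": 55.7},
    {"name": "model-c", "params_b": 8, "license": "apache-2.0",
     "architecture": "mistral", "average": 44.9},
]

def filter_models(models, max_params_b=None, license=None):
    """Return models matching the size and license filters,
    sorted by average benchmark score (best first)."""
    hits = [
        m for m in models
        if (max_params_b is None or m["params_b"] <= max_params_b)
        and (license is None or m["license"] == license)
    ]
    return sorted(hits, key=lambda m: m["average"], reverse=True)

# Example: best permissively licensed models under 10B parameters.
for m in filter_models(models, max_params_b=10, license="apache-2.0"):
    print(m["name"], m["average"])
```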
Use Case
Ideal for developers who want an objective comparison of open-source models when choosing one for a specific use case.