Hugging Face Open LLM Leaderboard
Description: Technical benchmark leaderboard for open-source LLMs with 6 standardized tests
Website: https://huggingface.co/open-llm-leaderboard
The Hugging Face Open LLM Leaderboard is the central destination for objective benchmarks of open-source language models. It uses standardized tests to separate real progress from marketing.
Evaluation Methodology
6 core benchmarks, run via EleutherAI's Language Model Evaluation Harness:
- IFEval: Instruction-following with strict format requirements
- BBH: 23 difficult tasks (arithmetic, reasoning, language understanding)
- MATH Lvl 5: Level-5 (hardest tier) problems from high school math competitions
- GPQA: Graduate-level questions (biology, physics, chemistry)
- MuSR: Multistep soft reasoning
- MMLU-PRO: Refined, harder version of MMLU with ten answer choices per question
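The leaderboard combines these per-benchmark results into a single average score used for ranking. A minimal sketch of that aggregation with hypothetical scores (the real leaderboard also normalizes each score against its random-guessing baseline before averaging, which this sketch omits):

```python
from statistics import mean

# Hypothetical per-benchmark scores (0-100) for one model; not real results.
scores = {
    "IFEval": 72.4,
    "BBH": 45.1,
    "MATH Lvl 5": 18.9,
    "GPQA": 29.3,
    "MuSR": 38.0,
    "MMLU-PRO": 41.2,
}

# Rank models by the unweighted mean across all six benchmarks.
average = mean(scores.values())
print(f"Average score: {average:.2f}")
```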
Features
- Filterable: By model size, license, architecture
- Detailed metrics: Full benchmark results per model
- Community-driven: 17-person team, 1,696+ followers
- Transparency: All evaluation datasets publicly available
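The filters above amount to a simple query over model records. A sketch of that idea, using invented records and field names (these do not reflect the leaderboard's actual schema):

```python
# Hypothetical model records; the fields mirror the leaderboard's filters
# (size, license, architecture) but the data is invented for illustration.
models = [
    {"name": "model-a", "params_b": 7, "license": "apache-2.0",
     "architecture": "llama", "average": 42.1},
    {"name": "model-b", "params_b": 70, "license": "custom",
     "architecture": "llama", "average": 55.7},
    {"name": "model-c", "params_b": 8, "license": "apache-2.0",
     "architecture": "mistral", "average": 44.9},
]

def filter_models(models, max_params_b=None, license=None):
    """Return models matching the size and license filters,
    sorted by average benchmark score (best first)."""
    hits = [
        m for m in models
        if (max_params_b is None or m["params_b"] <= max_params_b)
        and (license is None or m["license"] == license)
    ]
    return sorted(hits, key=lambda m: m["average"], reverse=True)

# Example: best permissively licensed models under 10B parameters.
for m in filter_models(models, max_params_b=10, license="apache-2.0"):
    print(m["name"], m["average"])
```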
Use Case
Ideal for developers who want an objective comparison of open-source models when choosing one for a specific use case.