Llama 4
Description: Meta's fourth-generation open-weight LLM family with native multimodal capabilities
Website: https://llama.meta.com
Meta released Llama 4 on April 5, 2025, as the fourth generation of the Llama family. It is the first Llama generation built on a Mixture-of-Experts (MoE) architecture, in which only a subset of the parameters is active for each token, and the first designed with native multimodality.
Model Variants
- Llama 4 Scout: 17B active parameters (16 experts); described by Meta as the best multimodal model in its class; fits on a single H100 GPU with Int4 quantization; supports a 10M-token context window, the industry's longest at release
- Llama 4 Maverick: 17B active parameters (128 experts); reported by Meta to outperform GPT-4o and Gemini 2.0 Flash on many benchmarks
- Llama 4 Behemoth Preview: 288B active parameters (16 experts); previewed while still in training; reported by Meta to outperform GPT-4.5, Claude Sonnet 3.7 and Gemini 2.0 Pro on several STEM benchmarks
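The single-GPU claim for Scout can be sanity-checked with rough arithmetic. The sketch below assumes Scout's published total of about 109B parameters (a figure from Meta's announcement, not from the list above) and an 80 GB H100, and counts weight memory only; the KV cache and activations need additional room.

```python
# Rough weight-memory estimate. The 109B total is Meta's published figure
# for Scout and is an assumption here, not stated in the list above.
H100_MEMORY_GB = 80  # HBM capacity of a single 80 GB H100


def weight_memory_gb(total_params_billion: float, bits_per_param: int) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return total_params_billion * bits_per_param / 8


scout_int4 = weight_memory_gb(109, 4)    # 4-bit quantized weights
scout_bf16 = weight_memory_gb(109, 16)   # unquantized 16-bit weights

print(f"Int4: {scout_int4} GB, BF16: {scout_bf16} GB, H100: {H100_MEMORY_GB} GB")
```

Under these assumptions the Int4 weights fit on one card while the 16-bit weights do not, which is why the quantization qualifier matters. Note that all experts must be resident in memory even though only 17B parameters are active per token.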
Features
- Native multimodal: accepts text and image input and produces text output; pretraining data also included video
- Mixture-of-Experts architecture
- Long context windows (up to 10M tokens on Scout)
- Open weights under the Llama 4 Community License, which carries some usage restrictions
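To illustrate why a Mixture-of-Experts model has far fewer active parameters than total parameters, here is a minimal sketch of generic top-k expert routing. It is not Meta's implementation: the function names, the scalar toy experts, and the choice of `top_k` are all hypothetical, and Llama 4's actual routing details may differ.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_logits, top_k=1):
    """Return [(expert_index, weight)] for the top_k highest-scoring experts."""
    probs = softmax(router_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)  # renormalize over the chosen experts
    return [(i, probs[i] / norm) for i in chosen]


def moe_layer(x, experts, router_logits, top_k=1):
    """Weighted sum of the selected experts' outputs; other experts never run."""
    return sum(w * experts[i](x) for i, w in route(router_logits, top_k))


# 16 toy "experts", each just scaling its input by a different factor.
experts = [lambda x, s=s: s * x for s in range(16)]

# Router scores that strongly favour expert 3 for this token.
logits = [0.0] * 16
logits[3] = 10.0
output = moe_layer(2.0, experts, logits, top_k=1)
```

Only the selected experts are evaluated for a given token, so compute per token scales with the active parameters rather than with the full expert count.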
Download
The Scout and Maverick weights are available on llama.com and Hugging Face. They can be run locally with llama.cpp, Ollama, or LM Studio.
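As one example of local use, the sketch below sends a prompt to a locally running Ollama server over its REST API, using only the Python standard library. It assumes Ollama is listening on its default port and that a Llama 4 model has already been pulled; the tag `llama4:scout` is an assumption, so substitute whatever tag `ollama list` reports on your machine.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL_TAG = "llama4:scout"  # assumed tag; check `ollama list` for the installed name


def build_request(prompt: str, model: str = MODEL_TAG) -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = MODEL_TAG) -> str:
    """Send the prompt and return the generated text (requires a running server)."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]


# Usage, once the server is running and the model is pulled:
#   print(generate("Summarize Mixture-of-Experts in one sentence."))
```

Setting `stream` to false makes the server return a single JSON object instead of a sequence of chunks, which keeps the client code short.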