Kimi K2.5


Description: 1-trillion-parameter MoE model with visual capabilities and Agent Swarm

Website: https://www.kimi.com

Kimi K2.5 is a state-of-the-art multimodal AI model from Moonshot AI built on a Mixture-of-Experts (MoE) architecture. With 1 trillion total parameters (32B activated per token) and a 256K-token context window, it offers native visual capabilities, code generation, and parallel agent execution.

Technical Specifications

Special Capabilities

Native Multimodality: Pre-trained on mixed visual and text data for true cross-modal understanding. Processes text, images and videos seamlessly.

Visual Coding: Generates production-ready frontend code directly from text, image, and video inputs. Supports interactive layouts and animations (see the request sketch after this list).

Agent Swarm (Beta): Coordinates up to 100 parallel sub-agents executing up to 1,500 tool calls simultaneously. Reduces execution time for complex tasks by up to 4.5x.

Multiple Modes: Available in Instant, Thinking, Agent, and Agent Swarm (Beta) modes for different use cases.
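
As a rough illustration of the Visual Coding capability, the sketch below sends an image and a text prompt to an OpenAI-compatible chat endpoint. The base URL, model identifier, and multimodal message format are assumptions rather than confirmed API details; consult Moonshot AI's platform documentation for the authoritative values.

```python
# Hedged sketch of a visual-coding request against an OpenAI-compatible
# chat endpoint. Base URL, model id, and message schema are assumptions.
import base64
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_MOONSHOT_API_KEY",        # placeholder key
    base_url="https://api.moonshot.ai/v1",  # assumed endpoint
)

# Encode a UI mockup image as a base64 data URL.
with open("mockup.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="kimi-k2.5",  # hypothetical model id
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            {"type": "text",
             "text": "Generate a responsive HTML/CSS page matching this mockup."},
        ],
    }],
)
print(response.choices[0].message.content)
```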

Performance Benchmarks

Local Execution

Kimi K2.5 can be run locally with inference engines such as vLLM, SGLang, and KTransformers; it requires transformers ≥ 4.57.1. Native INT4 quantization reduces the memory footprint, making deployment on more modest hardware more practical.
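
Below is a minimal offline-inference sketch with vLLM. It assumes the weights are published under a Hugging Face repository id along the lines of moonshotai/Kimi-K2.5 (a placeholder, not a confirmed name) and that sufficient GPU memory is available; the full MoE checkpoint is far too large for a single consumer GPU.

```python
# Minimal local-inference sketch using vLLM's offline API.
# The repository id below is a placeholder; check Hugging Face for the
# exact name and hardware requirements before running.
from vllm import LLM, SamplingParams

llm = LLM(
    model="moonshotai/Kimi-K2.5",  # hypothetical repo id
    trust_remote_code=True,        # load custom model code from the repo
    tensor_parallel_size=8,        # adjust to the available GPUs
)

params = SamplingParams(temperature=0.6, max_tokens=512)
outputs = llm.generate(
    ["Explain Mixture-of-Experts routing in one paragraph."], params
)
print(outputs[0].outputs[0].text)
```

For serving instead of offline inference, vLLM's vllm serve CLI exposes the same model behind an OpenAI-compatible HTTP endpoint.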

Availability

Available via kimi.com as a cloud service and as an open-source model on Hugging Face.
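
For local use of the open weights, a download sketch with huggingface_hub might look as follows; the repository id is again a placeholder.

```python
# Download sketch using huggingface_hub. The repo id is a placeholder,
# and the full checkpoint is on the order of hundreds of gigabytes even
# in INT4 form, so check available disk space first.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="moonshotai/Kimi-K2.5",  # hypothetical repo id
    local_dir="./kimi-k2.5",
)
print(f"Model files downloaded to {local_dir}")
```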