llama.cpp
Description: C++ implementation of LLaMA for efficient local inference
Website: https://github.com/ggerganov/llama.cpp
llama.cpp is a C/C++ implementation of inference for Meta's LLaMA family of models. It enables running large language models locally with minimal dependencies.
Features
- Fast CPU inference
- Support for various integer quantization formats (e.g. 4-bit and 8-bit)
- Cross-platform compatibility
- Minimal memory footprint
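The memory-footprint benefit of quantization is easy to see with back-of-the-envelope arithmetic: on-disk size is roughly parameter count times bits per weight. The sketch below illustrates this for a hypothetical 7-billion-parameter model; real quantized files are somewhat larger because each block of weights also stores scale factors and metadata.

```shell
# Rough on-disk size of a 7B-parameter model at different bit-widths.
# Illustrative only: actual quantized files carry per-block scales,
# so real sizes are somewhat larger than these lower bounds.
params=7000000000
for bits in 16 8 4; do
  echo "${bits}-bit: $(( params * bits / 8 / 1000000 )) MB"
done
```

At 4 bits per weight, a model that needs roughly 14 GB in 16-bit form shrinks to about 3.5 GB, which is what makes CPU-only local inference practical on ordinary hardware.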
Installation
Clone or download the latest source from GitHub and compile it for your platform.
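A typical build sequence looks like the following. This is a sketch of one common approach using CMake; exact build targets and binary names vary between versions of the project.

```shell
# Clone the repository and build with CMake (one common approach;
# exact options and output paths may differ between versions).
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release

# After building, inference can typically be run with something like
# (binary name varies by version; older releases used ./main):
#   ./build/bin/llama-cli -m path/to/model.gguf -p "Hello" -n 64
```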