LFM2.5-1.2B-Thinking is a lightweight, reasoning-focused model optimized for agentic tasks, data extraction, and RAG, while still running comfortably on edge devices. It supports long context (up to 32K tokens) and is designed to provide higher-quality “thinking” responses from a small 1.2B-parameter model.
Free lightweight reasoning
Edge-device thinking tasks
Budget data extraction or RAG
32,768 tokens
32,768 tokens
$0 per 1M tokens
$0 per 1M tokens
$15 per 1K calls
$0.19 per 1K calls