LFM2-24B-A2B is the largest model in the LFM2 family of hybrid architectures designed for efficient on-device deployment. Built as a 24B parameter Mixture-of-Experts model with only 2B active parameters per token, it delivers high-quality generation while maintaining low inference costs. The model fits within 32 GB of RAM, making it practical to run on consumer laptops and desktops without sacrificing capability.
- On-device, high-quality generation
- Efficient MoE inference on consumer hardware
- Low-cost edge deployment
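To illustrate why a 24B-parameter MoE model activates only ~2B parameters per token, here is a minimal sketch of top-k expert routing. The expert count, k, and shapes below are illustrative assumptions, not the actual LFM2-24B-A2B configuration: a learned router scores every expert for each token, and only the top-k experts run.

```python
import numpy as np

def topk_route(router_logits: np.ndarray, k: int) -> np.ndarray:
    """Return the indices of the k highest-scoring experts per token.

    In a sparse MoE layer, only these k experts are evaluated for the
    token, so the active parameter count stays a small fraction of the
    model's total parameters.
    """
    return np.argsort(router_logits, axis=-1)[:, -k:]

# Toy example: 4 tokens routed over 32 experts, 2 active per token
# (hypothetical numbers for illustration only).
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 32))
chosen = topk_route(logits, k=2)
print(chosen.shape)  # (4, 2): each token activates only 2 of 32 experts
```

Because the experts not selected by the router are skipped entirely, inference cost scales with the active parameters rather than the full 24B, which is what makes consumer-hardware deployment practical.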
Context length: 32,768 tokens