Liquid AI
Try LFM • Documentation • LEAP

LFM2.5-1.2B-Base

LFM2.5 is a new family of hybrid models designed for on-device deployment. It builds on the LFM2 architecture with extended pre-training and reinforcement learning.

Find more information about LFM2.5 in our blog post.

πŸ—’οΈ Model Details

| Model | Parameters | Description |
|---|---|---|
| LFM2.5-1.2B-Base | 1.2B | Pre-trained base model for fine-tuning |
| LFM2.5-1.2B-Instruct | 1.2B | General-purpose instruction-tuned model |
| LFM2.5-1.2B-JP | 1.2B | Japanese-optimized chat model |
| LFM2.5-VL-1.6B | 1.6B | Vision-language model with fast inference |
| LFM2.5-Audio-1.5B | 1.5B | Audio-language model for speech and text I/O |

LFM2.5-1.2B-Base is the pre-trained text-only checkpoint, used to create all the LFM2.5-1.2B variants. It has the following features:

  • Number of parameters: 1.17B
  • Number of layers: 16 (10 double-gated LIV convolution blocks + 6 GQA blocks)
  • Training budget: 28T tokens
  • Context length: 32,768 tokens
  • Vocabulary size: 65,536
  • Languages: English, Arabic, Chinese, French, German, Japanese, Korean, Spanish
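
If you want to verify these figures locally, the loaded config exposes most of them. The sketch below is illustrative; the attribute names (num_hidden_layers, vocab_size, max_position_embeddings) are standard Transformers config fields and are assumed to apply to this checkpoint.

from transformers import AutoConfig, AutoTokenizer

model_id = "LiquidAI/LFM2.5-1.2B-Base"
config = AutoConfig.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Standard Transformers config fields; assumed to be present on this model's config.
print("layers:", config.num_hidden_layers)                # expected: 16
print("vocab size:", config.vocab_size)                    # expected: 65,536
print("context length:", config.max_position_embeddings)   # expected: 32,768
print("tokenizer size:", len(tokenizer))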

The checkpoint is distributed in the following formats:

| Model | Description |
|---|---|
| LFM2.5-1.2B-Base | Original model checkpoint in native format. Best for fine-tuning or inference with Transformers and vLLM. |
| LFM2.5-1.2B-Base-GGUF | Quantized format for llama.cpp and compatible tools. Optimized for CPU inference and local deployment with reduced memory usage. |
| LFM2.5-1.2B-Base-ONNX | ONNX Runtime format for cross-platform deployment. Enables hardware-accelerated inference across diverse environments (cloud, edge, mobile). |
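
For local CPU inference with the GGUF weights, a minimal sketch using the llama-cpp-python bindings is shown below; the quantization filename pattern is an assumption about what the GGUF repository contains, so check its file list and adjust.

# pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="LiquidAI/LFM2.5-1.2B-Base-GGUF",
    filename="*Q4_K_M.gguf",  # assumed quantization; pick one that exists in the repo
    n_ctx=4096,
)

# Plain text completion with the base model.
out = llm("What is C. elegans?", max_tokens=256, temperature=0.3)
print(out["choices"][0]["text"])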

This pre-trained checkpoint is recommended only for tasks that require heavy fine-tuning, such as language-specific (e.g., Japanese) or domain-specific (e.g., medical) assistants, training on proprietary data, or experimenting with novel post-training approaches.

πŸƒ Inference

LFM2.5 is supported by many inference frameworks. See the Inference documentation for the full list.

| Name | Description | Docs | Notebook |
|---|---|---|---|
| Transformers | Simple inference with direct access to model internals. | Link | Colab link |
| vLLM | High-throughput production deployments with GPU. | Link | Colab link |
| llama.cpp | Cross-platform inference with CPU offloading. | Link | Colab link |

Here's a quick start example with transformers:

from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

model_id = "LiquidAI/LFM2.5-1.2B-Base"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    dtype="bfloat16",
#   attn_implementation="flash_attention_2" <- uncomment on compatible GPU
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

prompt = "What is C. elegans?"

input_ids = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    return_tensors="pt",
    tokenize=True,
).to(model.device)

output = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.3,
    min_p=0.15,
    repetition_penalty=1.05,
    max_new_tokens=512,
    streamer=streamer,
)
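
For higher-throughput serving, here is a minimal vLLM sketch along the same lines; it assumes a recent vLLM build with LFM2.5 support and reuses the sampling settings from the example above.

# pip install vllm
from vllm import LLM, SamplingParams

llm = LLM(model="LiquidAI/LFM2.5-1.2B-Base", dtype="bfloat16")
params = SamplingParams(
    temperature=0.3,
    min_p=0.15,
    repetition_penalty=1.05,
    max_tokens=512,
)

# Plain-text prompt; the base checkpoint works as a standard completion model.
outputs = llm.generate(["What is C. elegans?"], params)
print(outputs[0].outputs[0].text)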

🔧 Fine-tuning

We recommend fine-tuning LFM2.5 for your specific use case to achieve the best results.

| Name | Description | Docs | Notebook |
|---|---|---|---|
| SFT (Unsloth) | Supervised Fine-Tuning with LoRA using Unsloth. | Link | Colab link |
| SFT (TRL) | Supervised Fine-Tuning with LoRA using TRL. | Link | Colab link |
| DPO (TRL) | Direct Preference Optimization with LoRA using TRL. | Link | Colab link |
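
As a starting point, here is a minimal LoRA SFT sketch with TRL and PEFT; the dataset is a placeholder and the hyperparameters are illustrative, not the settings used in the linked notebooks.

# pip install trl peft datasets
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

model_id = "LiquidAI/LFM2.5-1.2B-Base"

# Placeholder dataset: any chat-formatted dataset with a "messages" column works.
dataset = load_dataset("HuggingFaceTB/smoltalk", "everyday-conversations", split="train")

peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)

trainer = SFTTrainer(
    model=model_id,
    train_dataset=dataset,
    peft_config=peft_config,
    args=SFTConfig(
        output_dir="lfm2.5-1.2b-sft-lora",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        learning_rate=2e-4,
        num_train_epochs=1,
        bf16=True,
    ),
)
trainer.train()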

Contact

For enterprise solutions and edge deployment, contact sales@liquid.ai.

Citation

@article{liquidai2025lfm2,
  title={LFM2 Technical Report},
  author={Liquid AI},
  journal={arXiv preprint arXiv:2511.23404},
  year={2025}
}