# Liquid-Thinking-Preview - GGUF
This repository contains GGUF quantized versions of the Liquid-Thinking model, suitable for fast CPU-based inference.
These files were generated from the full-parameter fine-tuned model using the Unsloth library. For full details on the training process, dataset, and hyperparameters, please refer to the main model card.
## Quantization Methods
The following quantizations are provided:
- `q4_k_m`: A 4-bit quantization with K-quants, offering a good balance between model size, speed, and quality. Recommended for general use.
- `q8_0`: An 8-bit quantization that preserves higher quality at the cost of a larger file size. Use this if you have sufficient RAM and need maximum fidelity.
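For reference, quantizations like these are typically produced from a full-precision GGUF export using llama.cpp's quantize tool. The sketch below is illustrative only: the source directory `./Liquid-Thinking`, the F16 file name, and the binary names (`convert_hf_to_gguf.py` and `./quantize` vs. `./llama-quantize`) all depend on your checkout and llama.cpp version.

```bash
# Convert the Hugging Face checkpoint to a full-precision GGUF first.
# (The converter script name varies by llama.cpp version.)
python convert_hf_to_gguf.py ./Liquid-Thinking --outfile Liquid-Thinking-GGUF.F16.gguf --outtype f16

# Then quantize to the variants shipped in this repository.
./quantize Liquid-Thinking-GGUF.F16.gguf Liquid-Thinking-GGUF.Q4_K_M.gguf Q4_K_M
./quantize Liquid-Thinking-GGUF.F16.gguf Liquid-Thinking-GGUF.Q8_0.gguf Q8_0
```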
## How to Use with llama.cpp
You can easily run these GGUF models using llama.cpp.
1. Clone and build llama.cpp:

   ```bash
   git clone https://github.com/ggerganov/llama.cpp.git
   cd llama.cpp
   make
   ```

2. Download a GGUF file: Choose a quantization and download it from the "Files and versions" tab of this repository. For example, to download the Q4_K_M version:

   ```bash
   huggingface-cli download kreasof-ai/Liquid-Thinking-Preview-GGUF Liquid-Thinking-GGUF.Q4_K_M.gguf --local-dir .
   # Or with wget:
   # wget https://huggingface.co/kreasof-ai/Liquid-Thinking-Preview-GGUF/resolve/main/Liquid-Thinking-GGUF.Q4_K_M.gguf
   ```

3. Run inference: This model uses the ChatML prompt template. You can use the `-i` flag for an interactive session with the correct formatting:

   ```bash
   ./main -m Liquid-Thinking-GGUF.Q4_K_M.gguf \
     --color \
     -n 2048 \
     -i \
     --reverse-prompt 'user:' \
     --in-prefix ' ' \
     --in-suffix 'assistant:' \
     -c 8192 \
     --temp 0.3 \
     --min-p 0.15 \
     --repeat-penalty 1.05
   ```

   Alternatively, for a single prompt, use the `--chatml` flag, which is designed for this template structure:

   ```bash
   ./main -m Liquid-Thinking-GGUF.Q4_K_M.gguf --chatml -p "user: Explain the step-by-step process of photosynthesis in simple terms.\nassistant:"
   ```
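If you prefer an HTTP API over the interactive CLI, llama.cpp also ships a server binary. A minimal sketch, assuming the binary is named `./server` (newer builds call it `./llama-server`); the port and prompt below are illustrative:

```bash
# Start an OpenAI-compatible server with the same context size as above.
./server -m Liquid-Thinking-GGUF.Q4_K_M.gguf -c 8192 --port 8080

# In another shell, query the chat endpoint; the server applies the
# model's chat template to the messages for you.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Explain photosynthesis in simple terms."}], "temperature": 0.3}'
```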
## Use with other tools (LM Studio, Ollama, etc.)
You can also use these GGUF files with any GUI-based tool that supports them:
- LM Studio: Search for this model from the Hugging Face integration within the app.
- Ollama: Create a `Modelfile` to import and run the model locally (see the sketch after this list).
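As a concrete example of the Ollama route, a minimal `Modelfile` might look like the following. The local model name `liquid-thinking` and the parameter values are illustrative assumptions; the template mirrors the ChatML format shown in the Prompt Template section below.

```bash
# Write a Modelfile that points at the downloaded GGUF and encodes
# the ChatML prompt format this model expects.
cat > Modelfile <<'EOF'
FROM ./Liquid-Thinking-GGUF.Q4_K_M.gguf
TEMPLATE """<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
PARAMETER temperature 0.3
PARAMETER num_ctx 8192
EOF

# Import the model under a local name, then chat with it.
ollama create liquid-thinking -f Modelfile
ollama run liquid-thinking "Explain photosynthesis in simple terms."
```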
## Prompt Template
It is crucial to use the correct prompt template for the best results. This model expects the ChatML format:
```
<|im_start|>user
Your question or instruction here.<|im_end|>
<|im_start|>assistant
```
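If you pass a prompt manually (for example via `-p` in llama.cpp), the special tokens need to be spelled out. A minimal sketch, using the `-e` flag so the `\n` escapes are processed; the question text is just an example:

```bash
./main -m Liquid-Thinking-GGUF.Q4_K_M.gguf -e \
  -p "<|im_start|>user\nSummarize the water cycle in two sentences.<|im_end|>\n<|im_start|>assistant\n" \
  -n 512 --temp 0.3
```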