Magistral-Animus-V12.0-GGUF

Wings_of_Fire

Send me your support to help me feed the data beast! I'm also taking commissions for universe-specific models.

Support on Ko-fi

Important: Reasoning Format & Backend Setup

This model uses a special reasoning format. There are two methods to enable it: the official format designed by MistralAI, and a legacy format that works due to the base model's pre-training. The correct method depends on your backend software (e.g., llama.cpp, Kobold.cpp).


Official Format: [THINK] (Recommended for llama.cpp)

This is the official instruction format from MistralAI and is the recommended method. It is confirmed to work with backends like llama.cpp (with specific flags) and mistral-common.

  • Llama.cpp Prerequisite: Launch llama.cpp with the --special and --jinja arguments enabled.
  • Instruction Format: The model uses [THINK] and [/THINK] tags.
  • Activation (2 steps):
    1. Set your prefill sequence (in your frontend like SillyTavern) to start with [THINK].
    2. You must also include the keyword /think anywhere in your system prompt to activate the reasoning module.

Recommended System Prompt for Official Format

Add the following to your system prompt to guide the model's output structure:

```
First draft your thinking process (inner monologue) until you arrive at a response. You must use the /think keyword. Format your response using Markdown, and use LaTeX for any mathematical equations. Write both your thoughts and the response in the same language as the input. Your thinking process must follow the template below:
[THINK]
Your thoughts or/and draft, like working through an exercise on scratch paper. Be as casual and as long as you want until you are confident to generate the response. Use the same language as the input.
[/THINK]
Here, provide a self-contained response.
```
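If you are calling llama.cpp directly rather than through a frontend, the sketch below shows one way to send the system prompt above (with the /think keyword) to llama-server's OpenAI-compatible endpoint. The port, model path, and user message are placeholders, and the [THINK] prefill from step 1 is a frontend setting (e.g., your SillyTavern prefill sequence) that is not reproduced here:

```python
# Minimal sketch of the official [THINK] setup over llama.cpp's OpenAI-compatible API.
# Assumes the server was launched with the flags noted above, e.g.:
#   llama-server -m <path-to-gguf> --special --jinja
# The address below is the llama-server default; adjust to your setup.
import requests

SYSTEM_PROMPT = (
    "First draft your thinking process (inner monologue) until you arrive at a response. "
    "You must use the /think keyword. Format your response using Markdown, and use LaTeX "
    "for any mathematical equations. Write both your thoughts and the response in the same "
    "language as the input. Your thinking process must follow the template below:\n"
    "[THINK]\n"
    "Your thoughts or/and draft, like working through an exercise on scratch paper. Be as "
    "casual and as long as you want until you are confident to generate the response. Use "
    "the same language as the input.\n"
    "[/THINK]\n"
    "Here, provide a self-contained response."
)

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",  # assumption: default llama-server address
    json={
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": "Describe the view from Jade Mountain Academy at dawn."},
        ],
        "max_tokens": 1024,
    },
    timeout=300,
)
print(resp.json()["choices"][0]["message"]["content"])
```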

SillyTavern Quick Setup

For a complete SillyTavern configuration, you can download and import this JSON file:

Download SillyTavern JSON

Legacy Format: <think> (For Kobold.cpp & TabbyAPI)

This format is not official but is highly effective with backends like Kobold.cpp and TabbyAPI. It works because the model's predecessor was trained on these angle-bracket tags, and the model inherits this behavior.

  • Instruction Format: Wrap the model's reasoning in <think> and </think> tags.
  • Activation: In your frontend, set your prefill sequence to start with <think>.

See the GitHub Issue for technical details
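For the legacy route, here is a comparably minimal sketch against Kobold.cpp's KoboldAI-compatible /api/v1/generate endpoint on its default port. The prompt construction is deliberately simplified (a frontend such as SillyTavern would apply the full Mistral instruct template); the only essential detail is that the assistant turn is prefilled with <think>:

```python
# Rough sketch of the legacy <think> prefill against Kobold.cpp's KoboldAI-compatible API.
# Assumes Kobold.cpp is running on its default port (5001). Prompt construction is
# simplified here; in practice your frontend builds the full instruct-formatted prompt.
import requests

prompt = (
    "You are roleplaying as a dragonet student at Jade Mountain Academy.\n"
    "User: Describe the view from the mountain at dawn.\n"
    "Assistant: <think>"  # the prefill: the model continues its reasoning from here
)

resp = requests.post(
    "http://localhost:5001/api/v1/generate",  # assumption: default Kobold.cpp address
    json={
        "prompt": prompt,
        "max_length": 512,
    },
    timeout=300,
)
print(resp.json()["results"][0]["text"])
```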

Quantized Models

The quantized model files are available for download. Click the buttons below to view the files.

Download EXL2 Files Download EXL3 Files

Character Card & Lore Book

For the best roleplaying experience, it is highly recommended to use the provided character card and lore book. These files help guide the model's persona and provide rich, in-universe context.

Download Files

Sampler Presets

For a seamless setup in SillyTavern, you can download the pre-configured JSON file linked in the "SillyTavern Quick Setup" section above.

For those who don't use SillyTavern, the recommended sampler settings are:

  • Temp: 1.0
  • Min P: 0.02
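If you are sending requests by hand instead of using a preset, these map onto ordinary sampling fields. A small sketch, assuming your backend accepts temperature and min_p in the request payload (llama.cpp and Kobold.cpp style APIs generally do; check your backend's documentation if a parameter is rejected):

```python
# Recommended sampler settings from this card, expressed as request-payload fields.
SAMPLER_SETTINGS = {
    "temperature": 1.0,  # Temp: 1.0
    "min_p": 0.02,       # Min P: 0.02
}

# Example: merging them into a chat-completions style request body.
payload = {
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 512,
    **SAMPLER_SETTINGS,
}
```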

Roleplay Format Guide

For the best results, use this structured format. This helps the AI clearly distinguish between actions, inner thoughts, and dialogue.

Actions / Descriptions
*He walked across the room and stared out the window.*
Inner Thoughts
*-I wonder what she's thinking.-*
Dialogue
Alex (Curious): "What do you see out there?"

Standard novel-style formatting is also understood, but this structured format is preferred for clarity.

Model Description

This is Version 12.0, an experimental model in the Animus series built with a new focus on reasoning. V12.0 is a direct fine-tune of Darkhn/Magistral-Small-2509-Text-Only (a text-only modification of `mistralai/Magistral-Small-2509`).

V12.0's strength comes from a novel dataset designed to teach the model the why behind the lore, not just the what. The training data is a mix of:

  • A 3,000-example Q&A dataset: This data is framed as an in-character study session, like a student at Jade Mountain Academy learning about the history, relationships, and politics of Pyrrhia's tribes. This provides a deep, contextual understanding of the universe.
  • A 3,000-example uncensored roleplay dataset: The same high-quality, mature roleplay scenarios used in previous versions, ensuring the model maintains its engaging and dynamic narrative capabilities.
  • 900 roleplay reasoning examples: These new examples are designed to teach the model how to "think" through its responses using a special format, improving coherence and logical flow.

The result is a model with exceptionally strong prose and a deep grasp of in-universe lore, making for a highly immersive and accurate roleplaying experience.

Note: for roleplay, the model closely follows the system prompt and the first message. If the first assistant message is short, the following messages will also be short.

Training Details

V12.0 Training Process

V12.0 marks a shift from model merging to a focused, direct fine-tuning approach. This allows for greater control over the final model's characteristics.

  • Base Model: Darkhn/Magistral-Small-2509-Text-Only
  • Hardware: 1x NVIDIA H200
  • Training Time: 4 hours
  • Epochs: 1
  • LoRA Rank: 128
  • Context Size: 8192
  • Scheduler: Cosine
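For anyone wanting to approximate a comparable run, the sketch below reconstructs these settings with peft and transformers. Only the values listed above (rank 128, 1 epoch, cosine scheduler, 8192 context) come from this card; every other hyperparameter (alpha, dropout, learning rate, batch sizes, target modules) is an assumption, not the actual recipe:

```python
# Hedged reconstruction of a comparable LoRA fine-tune, assuming peft + transformers.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=128,                      # LoRA rank from the card
    lora_alpha=128,             # assumption: alpha not stated
    lora_dropout=0.05,          # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption: typical attention projections
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="animus-v12-lora",
    num_train_epochs=1,              # from the card
    lr_scheduler_type="cosine",      # from the card
    learning_rate=2e-5,              # assumption: not stated
    per_device_train_batch_size=1,   # assumption
    gradient_accumulation_steps=8,   # assumption
    bf16=True,                       # assumption: typical on an H200
)
# Training sequences would be truncated/packed to the 8192-token context noted above.
```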

Feature Update: Removal of DM Choices

A key feature in previous test versions—the presentation of multiple-choice actions (e.g., A, B, C) to guide the user—has been removed.

While a promising concept, this feature needs further refinement to ensure it enhances, rather than restricts, the roleplaying experience. It may be reintroduced in a more polished form in a future release. For now, the model returns to a more traditional, open-ended prose format.

Training Dataset

The V12.0 dataset consists of 6,900 high-quality examples, a combination of three distinct types:

  • In-Character Q&A (3,000 examples): This new dataset simulates a student at Jade Mountain Academy studying the world's lore. It's composed of roleplay-style questions and answers covering tribe history, family dynamics, and political relationships. This method builds a foundational, interconnected understanding of the lore.
  • Uncensored Roleplay (3,000 examples): This is the same mature, canon-centric dataset refined for previous versions. It explores pivotal "what-if" scenarios from the books using only canon characters, ensuring the model can handle complex and dramatic narratives.
  • Roleplay Reasoning (900 examples): This new dataset tunes the model's reasoning step for roleplay, teaching it to think through a scene before generating prose.

All three datasets underwent a rigorous cleaning process to remove formatting artifacts, such as **scene transitions**, resulting in a cleaner and more natural narrative style.

Intended Use & Limitations

  • Intended Use: The primary purpose of this model is creative writing and roleplaying within the Wings of Fire universe. However, user feedback indicates it is also highly effective for general-purpose roleplaying.
  • Limitations & Quirks:
    • Performance on tasks outside of its training domain (general knowledge, coding, etc.) is not guaranteed and will likely be poor.
    • Versatility: Although it is tuned specifically for Wings of Fire, users have reported it is very capable of normal roleplay with other settings and characters.
    • The model may "hallucinate" or generate plausible but non-canonical information, especially when pushed outside the established "what-if" scenarios.
    • Content: The training data includes mature and darker themes from the Wings of Fire series, such as conflict, character death, and moral ambiguity. The model is capable of generating content reflecting these themes. As always, it is up to the user what they do with it.
    • Formatting: Training data was cleaned to remove narrative artifacts like **scene transitions**. The model should now produce cleaner prose.
    • Safety: This model has not undergone additional safety alignment beyond what was included in its base model. Standard responsible AI practices should be followed.

Acknowledgements

  • Credit to Mistral AI for the powerful Magistral architecture.
  • Credit to Google for the Gemini Pro model, used in dataset generation.