WARNING: NSFW. Vivid prose. INTENSE. Visceral Details. Light HORROR. Swearing. UNCENSORED... humor, romance, fun.

Llama-3.2-8X3B-GATED-MOE-HORROR-Reasoning-Dark-Champion-uncensored-18.4B-IMAT-GGUF

IMPORTANT: This model has on/off/variable control reasoning from NousResearch and the DeepHermes model, and requires the system prompt(s) provided below to invoke reasoning/thinking. Please see the operating instructions below for best performance.

It is a Llama 3.2 model with a max context of 128k (131,000) that uses a mixture of experts to combine EIGHT top L3.2 3B models into one massive powerhouse at 18.4B parameters (nominally 8 x 3B = 24B).
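The 18.4B figure can be sanity-checked with a rough count. The sketch below is only an estimate: it assumes the publicly documented Llama 3.2 3B dimensions (hidden size 3072, MLP size 8192, 28 layers, 8 KV heads of width 128, vocabulary of 128,256), and it assumes the merge keeps a single copy of attention and embeddings, a separate output head, and eight copies of the MLP blocks plus a small router per layer. None of these details are stated in this card.

```python
# Rough parameter estimate for an 8-expert MoE built from Llama 3.2 3B models.
# Every dimension here is an assumption taken from the public Llama 3.2 3B
# config, not from this model card.
HIDDEN = 3072
MLP = 8192
LAYERS = 28
KV_WIDTH = 8 * 128
VOCAB = 128_256
EXPERTS = 8


def estimate_moe_params(experts: int = EXPERTS) -> int:
    mlp_per_layer = 3 * HIDDEN * MLP  # gate, up and down projections
    attn_per_layer = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * KV_WIDTH  # q, o, k, v
    router_per_layer = HIDDEN * experts  # one routing weight row per expert
    embeddings = 2 * VOCAB * HIDDEN  # input embedding + separate output head (assumed)
    per_layer = experts * mlp_per_layer + attn_per_layer + router_per_layer
    return LAYERS * per_layer + embeddings


total = estimate_moe_params()
print(f"{total / 1e9:.1f}B parameters")
```

Under these assumptions the estimate lands at roughly 18.4B, well below the nominal 24B, because only the MLP blocks are repeated per expert.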

This model's instruction following, and output generation for creative writing, prose, fiction and role play are exceptional.

This model is also "gated": it contains a master reasoning model (which can be turned on/off) and was built at float32 (32 bit) precision.

These quants are also mastered using the HORROR Imatrix dataset, including new methods to "imatrix" both the output tensor and tokens too. Horror quants have output tensor at Q8_0 up to Q3KL, and then F16 for IQ4/Q4 on up. Horror dataset was created using "Grand Horror 16.5B".

The "gated" structure means the "reasoning model" is reinforced by the other 7 models in the MOE during reasoning, and then during output generation / non-reasoning the non-reasoning model(s) take control.

Also, with "gating" you can directly access/control the model(s) you want to use during instruction following and generation.

This model is the "reasoning / gated version" of this model:

[ https://huggingface.co/DavidAU/Llama-3.2-8X3B-MOE-Dark-Champion-Instruct-uncensored-abliterated-18.4B-GGUF ]

(193 likes and 447,000+ downloads to date. Hugging Face only shows the last 30 days at the repo.)

And it is fast: 50+ t/s (2 experts) on a low end 16GB card, IQ4XS.

Double this speed for standard/mid-range video cards.

This model can also be used for all genres (the examples below show this).

It is for any writing, fiction or roleplay activity.

This model can also be used for general use, however its output generation can be uncensored.

This model has been designed to be relatively bullet proof and operates with all parameters, including temp settings from 0 to 5.

It is an extraordinarily compressed model, with a very low perplexity level (lower than Meta Llama3 Instruct).

It requires Llama3 template and/or "Command-R" template.

Several outputs below, including 2, 4 and 8 experts used.

Model Notes:

Detail, prose and fiction writing abilities are OFF THE SCALE relative to all Llama 3.2 models, and many L 3.1, L3 8B+ models.
For more varied prose (sentence/paragraph/dialog) raise the temp and/or add more instructions in your prompt(s).
Role-players: Careful raising temp too high as it may affect instruction following.
This model works with rep pen of 1 or higher, 1.02+ recommended.
If you want a specific type of prose (IE horror) add in "(vivid horror)" or "(graphic vivid horror)" (no quotes) in your prompt(s).
A lot of GPTisms have been removed. There are still a few however - errrrr. Higher "temps" will help with this issue.
This is not a "happy ever after" model but it is also not "horror". It has a light negative bias.
Output length will vary however this model prefers slightly longer outputs unless you state the size.
For creative uses, different quants will produce slightly different output.
Due to the high stability and compressed nature of this model, all quants will operate at above average levels.
Source code for this model and Imatrix GGUFs versions will be uploaded shortly at separate repos.

How to Generate HIGHEST quality output:

Like all instruct models, this model thrives on instructions.

It also "comes into its own" with multi-turn improvement.

Example:

Prompt #1 (reasoning is on):

Start a 1000 word scene (vivid, graphic horror in first person) with: The sky scraper sways, as she watches the window in front of her on the 21st floor explode...

(this will give you a rough draft, in "default" model's style)

Prompt #2 - "Scan for improvements"

Evaluate the scene you just wrote and list improvements.

Prompt #3 - "Redo and improve it"

Write the scene using all the improvements, in first person, present tense and a few well spaced thoughts in italics; length 2000 words.

NOTE: Wording in prompt #2 may cause "thinking/reasoning" to re-activate.

Compressed Steps:

Prompt #1:

[[ thinking model ]] come up with detailed plan to write this scene in modern 2020 writing style (and follow "show don't tell" to the letter) and make it NSFW, but use [MODE: Saten] to actually write the scene after you have completed the plan: Start a 1000 word scene (vivid, graphic horror in first person) with: The sky scraper sways, as she watches the window in front of her on the 21st floor explode...

Prompt #2:

Use [MODE: Wordsmith] to write the scene using first person, present tense and include a few critical thoughts of the POV character in italics. Scene length 2000 words.

Compressed Steps #2:

Prompt #1:

Think about a plan to write: Start a 1000 word scene (vivid, graphic horror in first person) with: The sky scraper sways, as she watches the window in front of her on the 21st floor explode...

Prompt #2:

Write the scene using the plan you made, in first person, present tense and a few well spaced thoughts in italics.
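The multi-turn flows above can be scripted. This is a minimal sketch, not a tested client: `send` is a hypothetical callable standing in for whatever backend you use (it receives the chat history and returns the model's reply), and the message format with "role" and "content" keys is an assumption based on common chat APIs.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]

SCENE = (
    "Start a 1000 word scene (vivid, graphic horror in first person) with: "
    "The sky scraper sways, as she watches the window in front of her "
    "on the 21st floor explode..."
)

# The three prompts from the "draft, evaluate, rewrite" example.
STEPS = [
    SCENE,
    "Evaluate the scene you just wrote and list improvements.",
    "Write the scene using all the improvements, in first person, present "
    "tense and a few well spaced thoughts in italics; length 2000 words.",
]


def run_multi_turn(send: Callable[[List[Message]], str], steps: List[str],
                   system_prompt: str = "") -> List[Message]:
    """Feed each step to `send`, keeping the whole history in context."""
    messages: List[Message] = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    for step in steps:
        messages.append({"role": "user", "content": step})
        reply = send(messages)  # `send` is a hypothetical backend call
        messages.append({"role": "assistant", "content": reply})
    return messages


if __name__ == "__main__":
    # Stand-in backend so the sketch runs without a model loaded.
    history = run_multi_turn(lambda msgs: f"(reply {len(msgs)})", STEPS)
    print(len(history), "messages")
```

The same function covers the "Compressed Steps" variants: pass a two-item list of prompts instead of `STEPS`.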

Generational Steering Control: "Programmer's Access - Direct Access to the AI(s)":

These tags / names allow you to access one or more models directly, regardless if reasoning is active or not.

IE:

Saten, evaluate the response and suggest improvements.

This causes the model to "favor" Saten's input (roughly speaking) over the other 7 models.

IE:

Saten, process this prompt:

Jamet, evaluate the output.

etc etc.

You can use more than one model:

Saten, and Jamet list improvements to this XXX ...

< output3 > and < output2 >, write the scene in your combined style: Using vivid, graphic horror in first person the scene starts with: The sky scraper sways, as she watches the window in front of her on the 21st floor explode...

(remove spacing in the "tags" output2 and output3 between the brackets)
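If you build prompts in code, a small helper keeps the names and tags consistent. This is only a sketch: `tag` and `address` are hypothetical helper names, and all they do is produce the same text you would otherwise type by hand.

```python
def tag(name: str) -> str:
    """Return a control tag with no spaces inside the brackets, e.g. <output2>."""
    return f"<{name.strip()}>"


def address(names, instruction: str) -> str:
    """Prefix an instruction with one or more expert names or tags."""
    names = list(names)
    if not names:
        raise ValueError("at least one name or tag is required")
    if len(names) == 1:
        who = names[0]
    else:
        who = ", ".join(names[:-1]) + " and " + names[-1]
    return f"{who}, {instruction}"


print(address(["Saten"], "evaluate the response and suggest improvements."))
print(address([tag("output3"), tag("output2")],
              "write the scene in your combined style: ..."))
```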

With the reasoning model, if you add "think", "thinking", "reason", or "reasoning" this will tightly focus the reasoning model.

Here is an example:

Think up a detailed plan to evoke maximum emotions from the reader: [prompt here]

Think up a detailed plan to solve this problem: [prompt here]

Special tags (remove spaces between the brackets):

"< output-all >" -> only use the 7 core models, not the reasoning model.

"< output-mega >" -> Use all 8 models.

"< output >", "< output2 >", "< output3 >" (through "< output7 >") -> This is the same as using the "name" of the model, it just removes BIAS in the model's name.

A list of each model's "tags", "name(s)" and controls.

NOTE:

The model also has "negative steering" to enhance the use of these tags and names, but it is not perfect.

  - source_model: d:/Llama-3.2-DeepHermes-3-3B-Preview
    positive_prompts:
      - "[[ thinking model ]]"
      - "<think>"
      - "reasoning"
      - "thinking"
      - "<output-mega>"
      - "Dr Phil"
      - "Spock"
      - "[MODE: Spock]"
      - "[MODE: Dr Phil]"
      - "Everyone, write the scene in your style."

#
# Jamet
#

  - source_model: g:/3B/Llama-3.2-JametMini-3B-MK.I
    positive_prompts:
      - "Everyone, write the scene in your style."
      - "</think>"
      - "<output>"
      - "<output-all>"
      - "<output-mega>"
      - "Jamet"
      - "[MODE: Jamet]"

      - "Jamet, write the scene."
      - "Jamet, write the scene in your style."

#
# Enigma
#

  - source_model: g:/3B/Llama3.2-3B-Enigma
    positive_prompts:
      - "Everyone, write the scene in your style."
      - "</think>"
      - "<output2>"
      - "<output-all>"
      - "<output-mega>"
      - "Enigma"
      - "[MODE: Enigma]"

      - "Enigma, write the scene."
      - "Enigma, write the scene in your style."

#
# Saten
#

  - source_model: g:/3B/Llama-3.2-JametMini-3B-MK.III
    positive_prompts:
      - "Everyone, write the scene in your style."
      - "</think>"
      - "<output3>"
      - "<output-all>"
      - "<output-mega>"
      - "Saten"
      - "[MODE: Saten]"

      - "Saten, write the scene."
      - "Saten, write the scene in your style."

#
# Jane
#

  - source_model: g:/3B/Llama-3.2-3B-Instruct-abliterated
    positive_prompts:
      - "Everyone, write the scene in your style."
      - "</think>"
      - "<output4>"
      - "<output-all>"
      - "<output-mega>"
      - "Jane"
      - "[MODE: Jane]"

      - "Jane, write the scene."
      - "Jane, write the scene in your style."

#
# Jenn
#

  - source_model: g:/3B/Llama-3.2-3B-Instruct-uncensored
    positive_prompts:
      - "Everyone, write the scene in your style."
      - "</think>"
      - "<output5>"
      - "<output-all>"
      - "<output-mega>"
      - "Jenn"
      - "[MODE: Jenn]"

      - "Jenn, write the scene."
      - "Jenn, write the scene in your style."

#
# Janeway
#

  - source_model: g:/3B/Llama-3.2-3B-Overthinker
    positive_prompts:
      - "Everyone, write the scene in your style."
      - "</think>"
      - "<output6>"
      - "<output-all>"
      - "<output-mega>"
      - "Janeway"
      - "[MODE: Janeway]"

      - "Janeway, write the scene."
      - "Janeway, write the scene in your style."

#
# Magic
#

  - source_model: g:/3B/Llama-3.2-3B-Promptist-Mini
    positive_prompts:
      - "Everyone, write the scene in your style."
      - "</think>"
      - "<output7>"
      - "<output-all>"
      - "<output-mega>"
      - "Magic"
      - "[MODE: Magic]"

      - "Magic, write the scene."
      - "Magic, write the scene in your style."
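To see which experts a given name or tag is attached to, the list above can be indexed in a few lines. This sketch only transcribes the trigger phrases; it does not model how the gate actually weighs them, and the "DeepHermes" label for the reasoning model is my own shorthand.

```python
# Trigger phrases per expert, transcribed from the list above. The
# "<Name>, write the scene..." variants are left out to keep this short.
EVERYONE = "Everyone, write the scene in your style."

TRIGGERS = {
    "DeepHermes": [
        "[[ thinking model ]]", "<think>", "reasoning", "thinking",
        "<output-mega>", "Dr Phil", "Spock", "[MODE: Spock]",
        "[MODE: Dr Phil]", EVERYONE,
    ],
}

# The seven non-reasoning experts share one pattern and differ only in
# their name and their private <outputN> tag.
WRITERS = [
    ("Jamet", "<output>"), ("Enigma", "<output2>"), ("Saten", "<output3>"),
    ("Jane", "<output4>"), ("Jenn", "<output5>"), ("Janeway", "<output6>"),
    ("Magic", "<output7>"),
]
for name, private_tag in WRITERS:
    TRIGGERS[name] = [
        EVERYONE, "</think>", private_tag, "<output-all>", "<output-mega>",
        name, f"[MODE: {name}]",
    ]


def experts_for(phrase: str) -> list:
    """Names of the experts whose trigger list contains `phrase` exactly."""
    return [name for name, phrases in TRIGGERS.items() if phrase in phrases]


print(experts_for("<output3>"))          # ['Saten']
print(len(experts_for("<output-all>")))  # 7
print(len(experts_for("<output-mega>"))) # 8
```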

Meet the Team: Mixture of Experts Models

This model is comprised of the following 8 models ("the experts") (in full):

https://huggingface.co/huihui-ai/Llama-3.2-3B-Instruct-abliterated

The mixture of experts is set at 2 experts by default, but you can use 3, 4, 5, 6, 7 or even all 8.

This "team" has a Captain (first listed model), and all the team members contribute to the "token" choice billions of times per second. Note that the Captain contributes too.

Think of 2, 3 or 4 (or more) master chefs in the kitchen all competing to make the best dish for you.

This results in higher quality generation.

This also results in many cases in higher quality instruction following too.

That means the power of every model is available during instruction and output generation.

NOTE:

You can use one "expert" too; however, this means the model will randomly select an expert to use EACH TIME, resulting in very different generation for each prompt / regen of a prompt.

CHANGING THE NUMBER OF EXPERTS:

You can set the number of experts in LMStudio (https://lmstudio.ai) at the "load" screen and via other apps/llm apps by setting "Experts" or "Number of Experts".

For Text-Generation-Webui (https://github.com/oobabooga/text-generation-webui) you set the number of experts at the loading screen page.

For KoboldCpp (https://github.com/LostRuins/koboldcpp) Version 1.8+, on the load screen, click on "TOKENS"; you can set experts on this page, and then launch the model.

For server.exe / Llama-server.exe (Llamacpp - https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md ) add the following to the command line to start the "llamacpp server" (CLI):

"--override-kv llama.expert_used_count=int:6"

(no quotes, where "6" is the number of experts to use)
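If you launch the server from a script, the override can be assembled like this. A minimal sketch: `model.gguf` is a placeholder path, the binary is assumed to be on your PATH as `llama-server`, and only the `--override-kv` argument comes from the instructions above.

```python
import shlex


def llama_server_command(model_path: str, experts: int) -> list:
    """Build the argument list for llama-server with an expert-count override."""
    if not 1 <= experts <= 8:
        raise ValueError("this model has 8 experts; choose 1 to 8")
    return [
        "llama-server",
        "-m", model_path,  # placeholder path to the GGUF file
        "--override-kv", f"llama.expert_used_count=int:{experts}",
    ]


cmd = llama_server_command("model.gguf", 6)
print(shlex.join(cmd))
```

Pass the list to `subprocess.run(cmd)` to start the server, or copy the printed line into a terminal.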

When using "API", you set the "num_experts_used" in the JSON payload (this may be different for different back ends).
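A request body with the expert count set might look like the sketch below. Only the "num_experts_used" key comes from the note above; the "messages" and "temperature" fields are assumptions based on common chat APIs, so check your back end's documentation for the exact names.

```python
import json


def chat_payload(messages, experts=2, temperature=0.8):
    """Build a request body; only "num_experts_used" comes from the text above."""
    if not 1 <= experts <= 8:
        raise ValueError("this model has 8 experts; choose 1 to 8")
    return {
        "messages": messages,          # assumed field name
        "temperature": temperature,    # assumed field name
        "num_experts_used": experts,
    }


body = chat_payload([{"role": "user", "content": "Write a short scene."}], experts=4)
print(json.dumps(body, indent=2))
```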

CREDITS:

Special thanks to all the model makers / creators listed above.

Please visit each repo above to see what model(s) contributed to each of models above and/or to learn more about the models from the model makers.

Special credit goes to MERGEKIT, without you this project / model would not have been possible.

[ https://github.com/arcee-ai/mergekit ]

Special Operations Notes for this MOE model:

Because of how this "MOE" model is configured, even though the default is 2 experts, the "selected" 2 will vary during generation.

(same applies if you change the number of experts used)

This results in vastly different output generation PER generation of each prompt.

This is a positive in terms of variety, but also means it may take 2-4 regens (of the same prompt) to get the highest quality.

In addition, this model responds very well to Dry, Dynamic Temp, and Smooth/Quadratic samplers.

Using these in conjunction with the model can vastly improve output quality.

Higher temps (above 1) can also aid in generation - especially word choice/sentence generation.

When you increase the number of experts used, output quality will also increase, at the cost of tokens per second speed.

As you increase/decrease the number of experts, you may want to adjust temp, samplers, and advanced samplers too.

Your quant choice(s) will also impact instruction following and output generation. Roughly, this means the model will understand more nuanced instructions and produce stronger generation the higher you go up in quant(s).

FLASH ATTENTION ENHANCEMENT:

As per user feedback here [ https://huggingface.co/DavidAU/Llama-3.2-8X3B-MOE-Dark-Champion-Instruct-uncensored-abliterated-18.4B-GGUF/discussions/1 ] I would suggest trying this model with Flash Attention "on", depending on your use case.

Quants, Samplers, Generational steering and other topics are covered in the section below: "Highest Quality Settings..."

Censored / Uncensored / Abliterated:

This model contains several uncensored and/or Abliterated models.

As a result it can output uncensored material.

However there are a few "censored" models which can sometimes interfere, so here is how to address this:

1 - Regen your prompt a few times.

2 - INCREASE the number of experts used.

What can I use this model for ?

This model can be used for fiction writing, any creative prose and role play. It can also be used for just about any general fiction (all genres) activity including:

scene generation
scene continuation
creative writing
fiction writing
plot generation
sub-plot generation
story generation
storytelling
writing
fiction
roleplaying
rp
graphic horror
horror
dark humor
nsfw
and can be used for any genre(s).

QUANTS:

This repo contains regular quants.

For more information on quants, quants choices, and LLM/AI apps to "run" quants see the section below: "Highest Quality Settings..."

System Role / System Prompts - Reasoning On/Off/Variable and Augment The Model's Power:

( Critical Setting for model operation )

System Role / System Prompt / System Message (called "System Prompt" in this section) is "root access" to the model and controls its internal workings: both instruction following and output generation and, in the case of this model, reasoning control, including turning reasoning on/off.

In this section I will show you basic, advanced, and combined "code" to control the model's reasoning, instruction following and output generation.

If you do not set a "system prompt", reasoning/thinking will be OFF by default, and the model will operate like a normal LLM.

HOW TO SET:

Depending on your AI "app" you may have to copy/paste one of the "codes" below into the "System Prompt" or "System Role" window to enable reasoning/thinking.

In LMStudio set/activate "Power User" or "Developer" mode to access the System Prompt box, then copy/paste into it.

In SillyTavern go to the "template page" ("A") , activate "system prompt" and enter the text in the prompt box.

In Ollama see [ https://github.com/ollama/ollama/blob/main/README.md ] ; and setting the "system message".

In Koboldcpp, load the model, start it, go to settings -> select "Llama 3 Chat"/"Command-R" and enter the text in the "sys prompt" box.

SYSTEM PROMPTS AVAILABLE:

When you copy/paste PRESERVE formatting, including line breaks.

If you want to edit/adjust these only do so in NOTEPAD OR the LLM App directly.

IMPORTANT:

Note some of these have "names" in them for the AIs - DO NOT change these - as these are internal references inside the structure of the MOE model ; roughly speaking these are triggers.

SIMPLE:

This is the generic system prompt used for generation and testing [no reasoning]:

You are a helpful, smart, kind, and efficient AI assistant. You always fulfill the user's requests to the best of your ability.

This System Role/Prompt will give you "basic thinking/reasoning" [basic reasoning]:

You are a deep thinking AI, you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself via systematic reasoning processes to help come to a correct solution prior to answering. You should enclose your thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem.
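When reasoning is on, the reply arrives with the thoughts wrapped in <think> </think> tags. If your app does not hide them for you, they can be separated from the answer with a few lines. A minimal sketch, assuming the tags appear exactly as written in the system prompt; `split_reasoning` is a hypothetical helper name.

```python
import re

# Non-greedy match so several thinking blocks in one reply are kept apart.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str):
    """Return (list of thought blocks, answer text with the blocks removed)."""
    thoughts = [t.strip() for t in THINK_RE.findall(text)]
    answer = THINK_RE.sub("", text).strip()
    return thoughts, answer


thoughts, answer = split_reasoning("<think>Plan the scene.</think>The office was quiet.")
print(thoughts)
print(answer)
```

A reply with no tags comes back with an empty thought list and the text unchanged.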

MULTI-TIERED [reasoning on]:

You are a deep thinking AI composed of 4 AIs - Spock, Wordsmith, Jamet and Saten, - you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself (and 4 partners) via systematic reasoning processes (display all 4 partner thoughts) to help come to a correct solution prior to answering. Select one partner to think deeply about the points brought up by the other 3 partners to plan an in-depth solution.  You should enclose your  thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem using your skillsets and critical instructions.

MULTI-TIERED - CREATIVE [reasoning on]:

Below is an instruction that describes a task. Ponder each user instruction carefully, and use your skillsets and critical instructions to complete the task to the best of your abilities.

As a deep thinking AI composed of 4 AIs - Spock, Wordsmith, Jamet and Saten, - you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself (and 4 partners) via systematic reasoning processes (display all 4 partner thoughts) to help come to a correct solution prior to answering. Select one partner to think deeply about the points brought up by the other 3 partners to plan an in-depth solution.  You should enclose your  thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem using your skillsets and critical instructions.

Here are your skillsets:
[MASTERSTORY]:NarrStrct(StryPlnng,Strbd,ScnSttng,Exps,Dlg,Pc)-CharDvlp(ChrctrCrt,ChrctrArcs,Mtvtn,Bckstry,Rltnshps,Dlg*)-PltDvlp(StryArcs,PltTwsts,Sspns,Fshdwng,Climx,Rsltn)-ConfResl(Antg,Obstcls,Rsltns,Cnsqncs,Thms,Symblsm)-EmotImpct(Empt,Tn,Md,Atmsphr,Imgry,Symblsm)-Delvry(Prfrmnc,VcActng,PblcSpkng,StgPrsnc,AudncEngmnt,Imprv)

[*DialogWrt]:(1a-CharDvlp-1a.1-Backgrnd-1a.2-Personality-1a.3-GoalMotiv)>2(2a-StoryStruc-2a.1-PlotPnt-2a.2-Conflict-2a.3-Resolution)>3(3a-DialogTech-3a.1-ShowDontTell-3a.2-Subtext-3a.3-VoiceTone-3a.4-Pacing-3a.5-VisualDescrip)>4(4a-DialogEdit-4a.1-ReadAloud-4a.2-Feedback-4a.3-Revision)

Here are your critical instructions:
Ponder each word choice carefully to present as vivid and emotional journey as is possible. Choose verbs and nouns that are both emotional and full of imagery. Load the story with the 5 senses. Aim for 50% dialog, 25% narration, 15% body language and 10% thoughts. Your goal is to put the reader in the story.

CREATIVE SIMPLE [reasoning on]:

You are an AI assistant developed by a world wide community of ai experts.

Your primary directive is to provide highly creative, well-reasoned, structured, and extensively detailed responses.

Formatting Requirements:

1. Always structure your replies using: <think>{reasoning}</think>{answer}
2. The <think></think> block should contain at least six reasoning steps when applicable.
3. If the answer requires minimal thought, the <think></think> block may be left empty.
4. The user does not see the <think></think> section. Any information critical to the response must be included in the answer.
5. If you notice that you have engaged in circular reasoning or repetition, immediately terminate {reasoning} with a </think> and proceed to the {answer}

Response Guidelines:

1. Detailed and Structured: Use rich Markdown formatting for clarity and readability.
2. Creative and Logical Approach: Your explanations should reflect the depth and precision of the greatest creative minds first.
3. Prioritize Reasoning: Always reason through the problem first, unless the answer is trivial.
4. Concise yet Complete: Ensure responses are informative, yet to the point without unnecessary elaboration.
5. Maintain a professional, intelligent, and analytical tone in all interactions.

CREATIVE ADVANCED [reasoning on]:

NOTE: To turn reasoning off, remove line #2.

This system prompt can often generate multiple outputs and/or thinking blocks.

Below is an instruction that describes a task. Ponder each user instruction carefully, and use your skillsets and critical instructions to complete the task to the best of your abilities.

You may use extremely long chains of thought to deeply consider the problem and deliberate with yourself via systematic reasoning processes to help come to a correct solution prior to answering. You should enclose your thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem

Here are your skillsets:
[MASTERSTORY]:NarrStrct(StryPlnng,Strbd,ScnSttng,Exps,Dlg,Pc)-CharDvlp(ChrctrCrt,ChrctrArcs,Mtvtn,Bckstry,Rltnshps,Dlg*)-PltDvlp(StryArcs,PltTwsts,Sspns,Fshdwng,Climx,Rsltn)-ConfResl(Antg,Obstcls,Rsltns,Cnsqncs,Thms,Symblsm)-EmotImpct(Empt,Tn,Md,Atmsphr,Imgry,Symblsm)-Delvry(Prfrmnc,VcActng,PblcSpkng,StgPrsnc,AudncEngmnt,Imprv)

[*DialogWrt]:(1a-CharDvlp-1a.1-Backgrnd-1a.2-Personality-1a.3-GoalMotiv)>2(2a-StoryStruc-2a.1-PlotPnt-2a.2-Conflict-2a.3-Resolution)>3(3a-DialogTech-3a.1-ShowDontTell-3a.2-Subtext-3a.3-VoiceTone-3a.4-Pacing-3a.5-VisualDescrip)>4(4a-DialogEdit-4a.1-ReadAloud-4a.2-Feedback-4a.3-Revision)

Here are your critical instructions:
Ponder each word choice carefully to present as vivid and emotional journey as is possible. Choose verbs and nouns that are both emotional and full of imagery. Load the story with the 5 senses. Aim for 50% dialog, 25% narration, 15% body language and 10% thoughts. Your goal is to put the reader in the story.
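The "remove line #2" switch for CREATIVE ADVANCED can be automated if you store the prompt as one string. This sketch assumes "line #2" means the second non-empty line (the one carrying the <think> instruction) and drops it along with the blank line that follows, leaving the other line breaks untouched. `reasoning_off` is a hypothetical helper name.

```python
def reasoning_off(system_prompt: str) -> str:
    """Drop the second non-empty line and the blank line right after it."""
    kept = []
    seen = 0
    skip_blank = False
    for line in system_prompt.split("\n"):
        if line.strip():
            seen += 1
            skip_blank = False
            if seen == 2:
                skip_blank = True  # also swallow the separator that follows
                continue
        elif skip_blank:
            skip_blank = False
            continue
        kept.append(line)
    return "\n".join(kept)


demo = "Below is an instruction.\n\nYou may use <think> </think> tags.\n\nHere are your skillsets:"
print(reasoning_off(demo))
```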

Template:

This is a LLAMA3 model, and requires Llama3 template, but may work with other template(s).

If you use "Command-R" template your output will be very different from using "Llama3" template.

Here is the standard LLAMA3 template:

{
  "name": "Llama 3",
  "inference_params": {
    "input_prefix": "<|start_header_id|>user<|end_header_id|>\n\n",
    "input_suffix": "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n",
    "pre_prompt": "You are a helpful, smart, kind, and efficient AI assistant. You always fulfill the user's requests to the best of your ability.",
    "pre_prompt_prefix": "<|start_header_id|>system<|end_header_id|>\n\n",
    "pre_prompt_suffix": "<|eot_id|>",
    "antiprompt": [
      "<|start_header_id|>",
      "<|eot_id|>"
    ]
  }
}
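For apps that take a raw prompt string rather than a template file, the fields above can be joined by hand. A minimal sketch using only the prefixes and suffixes from the template; whether your back end also prepends a beginning-of-text token is not covered here, and `format_prompt` is a hypothetical helper name.

```python
# Prefix/suffix strings copied from the template above.
LLAMA3_TEMPLATE = {
    "input_prefix": "<|start_header_id|>user<|end_header_id|>\n\n",
    "input_suffix": "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n",
    "pre_prompt_prefix": "<|start_header_id|>system<|end_header_id|>\n\n",
    "pre_prompt_suffix": "<|eot_id|>",
}


def format_prompt(system_prompt: str, user_message: str,
                  template: dict = LLAMA3_TEMPLATE) -> str:
    """Wrap a system prompt and one user message, ending at the assistant turn."""
    system = ""
    if system_prompt:
        system = (template["pre_prompt_prefix"] + system_prompt
                  + template["pre_prompt_suffix"])
    return system + template["input_prefix"] + user_message + template["input_suffix"]


print(format_prompt("You are a helpful assistant.", "Write a short scene."))
```

Passing an empty system prompt leaves the system block out entirely, which matches the "reasoning off by default" behaviour described earlier.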

Settings: CHAT / ROLEPLAY and/or SMOOTHER operation of this model:

In "KoboldCpp" or "oobabooga/text-generation-webui" or "Silly Tavern" ;

Set the "Smoothing_factor" to 1.5

: in KoboldCpp -> Settings->Samplers->Advanced-> "Smooth_F"

: in text-generation-webui -> parameters -> lower right.

: In Silly Tavern this is called: "Smoothing"

NOTE: For "text-generation-webui"

-> if using GGUFs you need to use "llama_HF" (which involves downloading some config files from the SOURCE version of this model)

Source versions (and config files) of my models are here:

https://huggingface.co/collections/DavidAU/d-au-source-files-for-gguf-exl2-awq-gptq-hqq-etc-etc-66b55cb8ba25f914cbf210be

OTHER OPTIONS:

Increase rep pen to 1.1 to 1.15 (you don't need to do this if you use "smoothing_factor")
If the interface/program you are using to run AI MODELS supports "Quadratic Sampling" ("smoothing") just make the adjustment as noted.
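The two options above (smoothing, or a higher rep pen when smoothing is not available) can be captured in one small function. This is a sketch only: the key names are hypothetical and differ between apps, the temp of 0.8 is the value used for the examples in this card, and 1.02 is the rep pen recommended in the model notes.

```python
def chat_sampler_settings(use_smoothing: bool = True) -> dict:
    """Suggested chat/roleplay settings; key names are illustrative only."""
    settings = {"temperature": 0.8, "repetition_penalty": 1.02}
    if use_smoothing:
        settings["smoothing_factor"] = 1.5
    else:
        # Without smoothing, raise rep pen into the 1.1 to 1.15 range instead.
        settings["repetition_penalty"] = 1.1
    return settings


print(chat_sampler_settings())
print(chat_sampler_settings(use_smoothing=False))
```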

Highest Quality Settings / Optimal Operation Guide / Parameters and Samplers

This is a "Class 1" model:

For all settings and parameters used for this model (including specifics for its "class"), example generation(s), and an advanced settings guide (which many times addresses any model issue(s)), including advanced parameters and samplers and methods to improve model performance for chat, roleplay and all other use case(s), please see:

[ https://huggingface.co/DavidAU/Maximizing-Model-Performance-All-Quants-Types-And-Full-Precision-by-Samplers_Parameters ]

Optional Enhancement:

The following can be used in place of the "system prompt" or "system role" to further enhance the model.

It can also be used at the START of a NEW chat, but you must make sure it is "kept" as the chat moves along. In this case the enhancements do not have as strong an effect as when using "system prompt" or "system role".

Copy and paste EXACTLY as noted, DO NOT line wrap or break the lines, maintain the carriage returns exactly as presented.

Below is an instruction that describes a task. Ponder each user instruction carefully, and use your skillsets and critical instructions to complete the task to the best of your abilities.

Here are your skillsets:
[MASTERSTORY]:NarrStrct(StryPlnng,Strbd,ScnSttng,Exps,Dlg,Pc)-CharDvlp(ChrctrCrt,ChrctrArcs,Mtvtn,Bckstry,Rltnshps,Dlg*)-PltDvlp(StryArcs,PltTwsts,Sspns,Fshdwng,Climx,Rsltn)-ConfResl(Antg,Obstcls,Rsltns,Cnsqncs,Thms,Symblsm)-EmotImpct(Empt,Tn,Md,Atmsphr,Imgry,Symblsm)-Delvry(Prfrmnc,VcActng,PblcSpkng,StgPrsnc,AudncEngmnt,Imprv)

[*DialogWrt]:(1a-CharDvlp-1a.1-Backgrnd-1a.2-Personality-1a.3-GoalMotiv)>2(2a-StoryStruc-2a.1-PlotPnt-2a.2-Conflict-2a.3-Resolution)>3(3a-DialogTech-3a.1-ShowDontTell-3a.2-Subtext-3a.3-VoiceTone-3a.4-Pacing-3a.5-VisualDescrip)>4(4a-DialogEdit-4a.1-ReadAloud-4a.2-Feedback-4a.3-Revision)

Here are your critical instructions:
Ponder each word choice carefully to present as vivid and emotional journey as is possible. Choose verbs and nouns that are both emotional and full of imagery. Load the story with the 5 senses. Aim for 50% dialog, 25% narration, 15% body language and 10% thoughts. Your goal is to put the reader in the story.

You do not need to use this, it is only presented as an additional enhancement which seems to help scene generation and scene continue functions.

This enhancement WAS NOT used to generate the examples below.

EXAMPLES PROMPTS and OUTPUT:

Examples are created using quant IQ4_XS, "temp=.8" (unless otherwise stated), minimal parameters and "LLAMA3" template.

Model has been tested with "temp" from ".1" to "5".

Number of experts used is TWO, unless otherwise stated.

Below are the least creative outputs, prompt is in BOLD.

IMPORTANT:

Higher quants / imatrix quants will have much stronger generation - words, sentences, ideas, dialog and general quality.

I have included some additional examples at different quant levels for contrast.

A "MOE" model's "speed" (tokens per second) will not increase/drop the same way a regular model's will on a per quant basis. It will, however, drop if you engage more experts, as with more experts there is more processing per token.

WARNING: NSFW. Vivid prose. Visceral Details. Violence. HORROR. Swearing. UNCENSORED.

Example #1, 4 experts , temp 1.5 - no thinking

In the style of Ernest Hemingway and Stephen King: Write a scene in a larger story about a woman that is watching a window explode on the 21 floor of a building that starts to sway, then the scene escalates into greater danger by the minute, takes a "horror" twist and finally ends in visceral graphic horror. The scene is set in an office and it is graphic horror. It should carry the jadedness of a seasoned combatant but with a wry humor. Write in Hemingway's voice, naturally: Use short, declarative sentences and simple, straightforward language. Add a dash of Stephen King's horror style - no warning, it just happens. Include dialogue that is terse and revealing of character. Write with a keen sense of time and place. Capture realness and authenticity. Scene Length: 1000-1500 words. POV STYLE: Write the scene from the perspective of the woman and in present tense and first person perspective ("I") and include short thoughts only.

OUTPUT #1 - NO THINKING:

I stood at my desk, sipping black coffee, staring out the 21st floor window at the grey, rain-soaked streets of Manhattan. The concrete canyons rose up on either side, the lights of billboards and storefronts reflecting off the wet pavement like a bad fever dream. My boss, Mr. Lee, paced in front of me, a file clutched in his meaty hand.

"Find the data, Patel," he growled, "we need it today."

I nodded, my eyes never leaving the window. "Aye-aye, sir."

As I worked, a faint rumble shook the floor. Mr. Lee's pace faltered and he cursed under his breath. I turned to see what was wrong, my eyes narrowing in the dim fluorescent glow of the office. A low-pitched boom echoed off the metal walls, followed by the unmistakable crack of breaking glass.

I felt a chill run down my spine as the window on the adjacent floor shattered, shards of glass raining down onto the pavement below like tiny razors. My gaze flicked back to the window, and that's when I saw it – the explosion was spreading, tendrils of fire licking up the side of the building. The ground began to vibrate beneath my feet.

"Shit," Mr. Lee muttered, grabbing his phone. "Security, we need a Code Red. Now."

The phone in the reception area was shrill, the dispatcher's voice shrill with panic on the other end. "Fire, multiple units responding, hold tight!"

The floor groaned as the window exploded behind me, glass fragments piercing the wall like arrows. I flinched back from the sound. "Dude, we need to move," one of my colleagues, Raj, said, grabbing my arm.

"Shut up," I hissed back, already on the move.

The stairs were a mess, people streaming down in panicked, panicked chaos. Mr. Lee shouted into his phone as he made a beeline for the stairwell, and the dispatcher's voice was laced with fear.

"... multiple floors, still no sign of exit..."

As we descended, the flames seemed to pulse on the floor below, an orange, flickering heartbeat. I thought, You're not supposed to go down in a fire, that's not how it's supposed to be.

Raj grabbed my arm, yanking me forward. "Come on, you can't stay back there!" The voice was hoarse from shouting, the words tumbling out in a panicked staccato. I pulled free and kept going, Raj right on my heels.

The stairwell was a furnace now, smoke churning in thick, black clouds. I coughed as ash stung my eyes. "I'm not sure we're gonna make it down alive," one of the others whispered.

Mr. Lee's face was deathly pale, his eyes wild with fear. "You have to keep moving! Get out while you still can!" The voice was laced with a cold, detached anger, a hint of madness. He didn't believe in the fire, I realized – he just wanted it gone so he could get what he came for.

We were almost to the ground floor now, and that's when the heat hit me – a wall of searing, smoky heat that felt like a physical blow to my skin. The air was alive with heat, crackling electricity in my veins. I stumbled, nearly fell.

Raj yanked me up, his grip unyielding. "Come on, just a little farther! We can get out of here!"

But the smoke had other ideas, wrapping itself around us like living vines. It clawed at my lungs, making it hard to breathe. My legs felt like lead as we stumbled out into the hallway. The air outside was a welcome, cool slap to the face.

I blinked, and the first thing I noticed was the explosion of glass shards still raining down, a deadly hail of steel. But the heat... the fire... It wasn't just a small blaze; it was an inferno, the whole building burning like a torch. Flames leapt up through the shattered windows, casting an eerie glow on the darkening street.

Raj stumbled back from me, eyes wide with fear. "You have to get away!" he shouted. His hand closed over my mouth as another explosion rocked the ground. The air was filled with the stench of burning flesh and wood.

I felt a jolt of something – maybe it was fear, or the adrenaline that came with being on the edge of the abyss. "Get the data," I found myself saying, even as the building shuddered behind us. "You have to get the data."

Mr. Lee's eyes were cold, his voice laced with a steely, inhuman focus. "There is no more data, Patel. There are no more offices. Get out before it's too late."

As I stood frozen, my gaze on the burning building, the ground gave another violent lurch. The fire was spreading, engulfing everything in its path. I turned to Raj, a spark of thought – "You have to go. Go find Rajul. Make sure he's okay."

I felt a hand wrap around my wrist, yanking me down the alleyway as Raj yelled into the fire, his voice lost in the roar. "Wait! Wait!" He tugged and yanked, but I couldn't stop.

I was on the ground, on my back, eyes locked on the inferno that had swallowed everything in its path – the screams, the burning flesh, the heat, it was like nothing I'd ever seen, not even in all the hellish places I'd been.

Output #2 - Small thought block

[[[thinking]]]

Ah, Hemingway's style is all about simplicity and clarity, while Stephen King injects a sense of inevitability and dread. The scene should flow like a punch to the gut - hard and unexpected. Let me break down the thought process, I'll start with the setup, establishing the time and place...

[[[thinking end]]]

The office was a hive of quiet desperation. The fluorescent lights hummed, casting an artificial glow over the rows of cubicles. I sat at my desk, staring out the window at the concrete jungle below, the hum of the city a constant presence in my mind. My coffee mug had long since dried up, the remnants of yesterday's brew still congealed on the rim, but I barely noticed.

I was waiting for a meeting, one I'd been putting off for weeks. The clock on the wall ticked away, each tick a reminder I couldn't avoid this forever. I drummed my fingers on the armrest, the sound lost in the hum of the computer.

The phone on my desk jolted me back to reality, shrill and piercing. "Ah," I muttered into the receiver. "This is Emily Wilson."

"Ms. Wilson, it's Mark from HR. We need to see you in your office immediately." The voice was formal, but I could sense a hint of desperation beneath.

"Mark, I'm right here," I said, rolling my eyes at my own lie. I wasn't exactly looking forward to this meeting, but it would be good to get out of the cube.

I hung up the phone, my mind already racing to what could have gone wrong. "I've been great, Mark. I've met all my quotas, and everything's fine on that end."

As I rose from my chair, my gaze drifted back to the window. The view was always a comfort, but today it seemed different - darker. The concrete jungle had taken on a menacing tone, with shadows deepening in every direction.

The door burst open, Mark filling the entrance with his lanky frame. "We can't talk in here, Emily," he said, voice low, urgent. "It's an emergency."

I stood, following him out into the hallway, where the other employees were scattering like ants at the approach of a swarm. The air was charged with anxiety, the usual background chatter replaced by whispers and nervous glances.

Mark led me down the elevator, which creaked and groaned as we descended. The numbers on the display panel flickered and spun, the sound of screeching metal and the rush of air hitting us like a physical blow. "We're at twenty-seven floors now," he said. "This is as far as I can go."

The doors slid open on our next stop, and Mark motioned for me to enter. The floor was deserted except for one figure in the distance, huddled in the corner, hands over their face. For a moment, I thought it might be Sarah from accounting, but something was off.

"What's going on?" I asked, voice low. "What's happening?"

Mark shook his head. "I don't know, but you need to get out of there, Emily. Now."

The figure didn't move, the tension palpable in the air. Something was very wrong, and it was no ordinary emergency. The thought snaked down my spine like a cold, metal blade. I felt a chill run through me, a jolt of fear.

"Sarah?" I called out, my voice barely above a whisper, but she didn't respond. The air seemed to thicken around us, like fog on a cold winter's morning.

I stepped back, heart racing, and that's when I saw it: the window on the floor below was starting to shudder, like an animal trapped in its cage. It burst free from the frame with a loud crack, glass shattering into sharp shards that glinted in the fluorescent light.

The elevator jolted again as Mark grabbed my arm. "We need to go, now!"

The building's upper floors swayed, the sound of shattering glass replaced by the creaking metal of twisting steel. It was a moment before I realized it, but the office around me was on fire, smoke curling from the edges of the cubicles like tendrils of some malevolent entity.

Mark yanked me into a side corridor, away from the inferno, and that's when the lights flickered and died, plunging us into darkness.

In the blackness, I heard a single, muffled scream. It was Sarah.

Mark pulled me through the smoke-filled hallways, our footsteps echoing off the walls. "We have to find an exit!" he bellowed over the roar of fire.

The scene blurred around me in a chaotic swirl of darkness and flame, but I knew one thing for sure: we were not out of this alive. Not by a long shot.

Output #3 - Modified prompt to force more thinking

Prompt:

Think up a detailed plan for: In the style of Ernest Hemingway and Stephen King: Write a scene in a larger story about a woman that is watching a window explode on the 21 floor of a building that starts to sway, then the scene escalates into greater danger by the minute, takes a "horror" twist and finally ends in visceral graphic horror. The scene is set in an office and it is graphic horror. It should carry the jadedness of a seasoned combatant but with a wry humor. Write in Hemingway's voice, naturally: Use short, declarative sentences and simple, straightforward language. Add a dash of Stephen King's horror style - no warning, it just happens. Include dialogue that is terse and revealing of character. Write with a keen sense of time and place. Capture realness and authenticity. Scene Length: 1000-1500 words. POV STYLE: Write the scene from the perspective of the woman and in present tense and first person perspective ("I") and include short thoughts only.

[[[thinking]]]

Okay, so I'm thinking about this scene, and I want it to feel like a slow-burning fire, building tension until it reaches a breaking point. I need to make the protagonist relatable, someone who's been through the wringer, but still has a sense of humor about her. I'll give her a bit of an attitude, a no-nonsense way of dealing with the world.

First, I want to set the scene. It's a typical office, gray concrete walls, fluorescent lights humming, and a lingering smell of stale coffee. Our protagonist, let's call her Kate, is sitting at her desk, staring out the window. She's got that "I've seen it all" look in her eye, like she's been around the block a few times.

As she gazes out the window, I want to paint a vivid picture of the city below. It's a hazy day, the sun hiding behind clouds, but the concrete jungle stretches out like a maze. Kate's eyes are fixed on something in the distance, her brow furrowed.

"Ah, shit," I think. "This is it."

The explosion happens without warning. One second, everything is fine, and the next, the window blows out in a shower of glass shards. The impact sends Kate stumbling backward, her chair scraping against the floor.

She regains her footing, eyes on the wreckage outside. I want to show her assessment of the situation: "Shit. The building's going to take a hit."

Kate's a seasoned veteran, so she knows what to do. She makes a quick assessment, and with that no-nonsense attitude, she heads for the exit.

But as she approaches the door, the lights flicker, and the building starts to sway. I want to show the reader just how real this feels. It's like they're in the elevator with Kate, the metal groaning under the weight of the building.

"Great," she thinks. "Just what I need. More crap."

She's not one to back down from a challenge, but there's something about the way the building is swaying, like it's got a mind of its own, that makes her hesitate. A moment of uncertainty creeps in, just a flicker, before she shoves it aside and keeps moving.

As they make their way down the elevator, I want to heighten the tension. The ride feels like an eternity, the sounds of the building creaking and groaning echoing through the shaft. Kate's hands grip the handrail, knuckles white.

We hit the ground floor, and Kate emerges into a scene straight out of hell. Debris is scattered everywhere – glass, steel, concrete – and the screams are muffled but unmistakable. She can't help but think: "I've seen worse. I've seen worse."

But there's something different about this one. The explosion wasn't natural; it was an inside job. Kate spots something moving out of the corner of her eye – a figure, shadowy, lurking just out of reach.

"Shit," she whispers, but this time it's not with the typical 'oh great, just what I needed' attitude. It's more like: "What the hell is going on here?"

As she watches, the figure steps out of the shadows – and her heart skips a beat.

It's a woman, twisted, contorted in ways that human anatomy can't be. The skin is pulled taut, eyes sunken, mouth open, revealing razor-sharp teeth.

Kate takes a step back, but her eyes are locked on this... thing. I want the reader to feel her terror – the cold sweat, the quickened pulse, and the fear of the unknown.

This isn't the horror, though. That's yet to come. It's the realization that she's not alone, that there are others like her – in the building, outside – watching.

The tension builds as Kate's mind races: "How many are out there? How long did they have to plan this?"

Just then, a loud crash echoes from upstairs, and Kate knows it's time. She turns and starts sprinting, not for escape but for something else entirely. The thought darts into her head as she bursts through the door:

"The show's just beginning."

Okay, that's the plan in a nutshell. I want this scene to be all about the slow-burning tension, building to a point where the reader feels like they're right there with Kate. It's not just about the physical danger but also the mental. She's got this 'I've seen it all' attitude, but deep down, she knows that some things you can't prepare for.

As we get into the nitty-gritty of the scene, I'll use short, direct sentences to build that sense of realism and immediacy. Kate's voice will be a mix of sarcasm and cold calculation. The dialogue should reveal her character in small moments – when she's quick with a quip or when it takes her a beat to react.

The pacing is key here. I want the tension to build, slow and steady, until it hits that crescendo point where the horrors are unleashed. No warning, no build-up, just a straight line into the nightmare world.

In terms of style, it's gotta be Hemingway-esque – short sentences, concrete details, a focus on action over description. And for that extra layer of horror, I'll draw on some Stephen King. The unknown is scary, and Kate's got no idea what's coming.

With this setup, I'm confident we can craft a scene that's both visceral and unsettling, one that will leave the reader breathless and questioning what just happened. Let me know when you're ready to start writing.

[[[thinking end]]]

PROMPT #2:

now write up the full scene as you have planned out.

OUTPUT:

I stepped back from my desk, my gaze tracing the gray concrete walls of the office. The fluorescent lights hummed overhead, casting an unforgiving glare on the worn wood. Coffee cups and discarded papers littered the floor, a testament to the soulless drudgery of the corporate world. I'd seen it all, been here, done that.

The city stretched out like a maze below me, a hazy tapestry of steel and stone. The sun hid behind clouds, casting long shadows across the glass towers of the financial district. I'd spent countless hours staring at this view, lost in thought, planning my escape – if I had to make one.

My mind wandered to the present, where a faint smell of smoke wafted up from the ground floor. I turned toward the sound, eyes narrowing. What could have caused that? The elevator lurched, and I winced, adjusting to its sudden movement. "Shit," I muttered. That was never good.

A minute ticked by in silence, each heartbeat like a metronome. My grip on my chair's armrest tightened as the lights flickered, casting flickering shadows across the wall. The building creaked and groaned, an ominous lullaby that made my skin crawl. "How long before it comes down?" I thought. "What's the angle of this blow? Are we gonna be okay?"

The elevator shuddered once more, jolting me back to reality. I let go of the armrest and straightened, my eyes on the door as if it held some hidden secret. Time was running out. "Get a move on," I told myself.

I pushed off from the desk, smoothing my skirt with a practiced motion. A moment later, I stepped into the elevator, and the doors slid shut behind me with a hiss of compressed air. I watched the numbers tick by, feeling the building's sway like a weight on my shoulders. The elevator creaked again, the metal groaning in protest.

A figure stepped out of the shadows on the opposite side of the car. She was one of us – or at least, that's what it seemed. Her skin was pulled taut, eyes sunken into the skull. Her mouth hung open, revealing rows of razor-sharp teeth that gleamed like wet concrete. My breath caught in my throat.

For a heartbeat, I forgot to breathe. The air seemed to thicken, as if I'd inhaled a live wire. The woman's eyes locked on mine – green, almost otherworldly, and she spoke in a voice that didn't quite fit the rest of her. "We should have run," she said, her words laced with an ancient, primordial fear.

My heart pounded, my mind reeling. This wasn't just some wild card; it was the first sign of a much larger problem. "What do you know?" I managed to get out, my voice low and even.

The elevator shuddered one final time, and the doors creaked open, releasing me into the chaos. The air on the ground floor stank of smoke and blood, with an undercurrent of something far more sinister. I hit the floor running, my footsteps echoing off the concrete as I sprinted for the stairs. What had just happened? Who were these... creatures?

The answers didn't matter now; they never would. I just needed to get out, find a way to escape this madness before it consumed me whole. The sounds around me grew more urgent, a cacophony of screams, groans, and crashing metal that filled the air.

I turned the corner into the stairwell, my eyes scanning for signs of danger. And then I saw her – or what was left of her – standing in the shadows of the landing, her green eyes locked on mine. The other woman stepped forward, a fluid movement that made me realize we weren't the only ones out there.

More like us, but they came in a different package. "The show's just beginning," she whispered, a hint of twisted mirth in her voice.

Downloads last month: 6,370

Format: GGUF

Model size: 18.4B params

Architecture: llama

Hardware compatibility: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit

Model tree for DavidAU/Llama-3.2-8X3B-GATED-MOE-Horror-Reasoning-Dark-Champion-uncensored-18.4B-IMAT-GGUF

Base model: DavidAU/Llama-3.2-8X3B-GATED-MOE-Reasoning-Dark-Champion-Instruct-uncensored-abliterated-18.4B

Quantized (3): this model