This is simply a patched version of the original Qwen-QwQ-32B model. (https://huggingface.co/Qwen/QwQ-32B)

Changed Functionality:

  • This version of the model will "remember thinking" as the conversation progresses. That is to say, all text contained between the <think> tags will be preserved in the context window and be utilized during all future response generations.

Pros: It can remember its previous thinking processes, thus potentially being better at unpacking complex concepts over a series of back-and-forth prompts.

Cons: The context window will fill much faster. And coherence will likely drop-off sooner.

Download GGUF:

Downloads last month
9
GGUF
Model size
33B params
Architecture
qwen2
Hardware compatibility
Log In to view the estimation

8-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Model tree for theouterspaced/Qwen_QwQ-32B-Q8_0_remember-thinking.gguf

Base model

Qwen/Qwen2.5-32B
Finetuned
Qwen/QwQ-32B
Quantized
(170)
this model