This is simply a patched version of the original Qwen-QwQ-32B model. (https://huggingface.co/Qwen/QwQ-32B)
Changed Functionality:
- This version of the model will "remember thinking" as the conversation progresses. That is to say, all text contained between the <think> tags will be preserved in the context window and be utilized during all future response generations.
Pros: It can remember its previous thinking processes, thus potentially being better at unpacking complex concepts over a series of back-and-forth prompts.
Cons: The context window will fill much faster. And coherence will likely drop-off sooner.
Download GGUF:
- Downloads last month
- 9
Hardware compatibility
Log In
to view the estimation
8-bit
Inference Providers
NEW
This model isn't deployed by any Inference Provider.
๐
Ask for provider support