Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -32,7 +32,7 @@ Introducing **Qwen3-Coder-REAP-25B-A3B**, a **memory-efficient compressed varian
 
  This model was created using **REAP (Router-weighted Expert Activation Pruning)**, a novel expert pruning method that selectively removes redundant experts while preserving the router's independent control over remaining experts. Key features include:
 
- - **Near-Lossless Performance**: Maintains almost identical accuracy on code generation, agentic coding, and function calling tasks compared to the full 25B model
+ - **Near-Lossless Performance**: Maintains almost identical accuracy on code generation, agentic coding, and function calling tasks compared to the full 30B model
  - **20% Memory Reduction**: Compressed from 30B to 25B parameters, significantly lowering deployment costs and memory requirements
  - **Preserved Capabilities**: Retains all core functionalities including code generation, agentic workflows, repository-scale understanding, and function calling
  - **Drop-in Compatibility**: Works with vanilla vLLM - no source modifications or custom patches required