minor typo
#4 by owao - opened
README.md CHANGED
@@ -32,7 +32,7 @@ Introducing **Qwen3-Coder-REAP-25B-A3B**, a **memory-efficient compressed varian
 
 This model was created using **REAP (Router-weighted Expert Activation Pruning)**, a novel expert pruning method that selectively removes redundant experts while preserving the router's independent control over remaining experts. Key features include:
 
-- **Near-Lossless Performance**: Maintains almost identical accuracy on code generation, agentic coding, and function calling tasks compared to the full
+- **Near-Lossless Performance**: Maintains almost identical accuracy on code generation, agentic coding, and function calling tasks compared to the full 30B model
 - **20% Memory Reduction**: Compressed from 30B to 25B parameters, significantly lowering deployment costs and memory requirements
 - **Preserved Capabilities**: Retains all core functionalities including code generation, agentic workflows, repository-scale understanding, and function calling
 - **Drop-in Compatibility**: Works with vanilla vLLM - no source modifications or custom patches required