File size: 3,421 Bytes
090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 090a489 fa91ac5 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 |
---
license: mit
datasets:
- stack-dedup-v1.2
tags:
- code
language:
- code
programming_language:
- Python
- Bengali
model-index:
- name: sheikh-coder-v1-3b
results:
- task:
name: Code Completion
type: code-completion
dataset:
name: "Stack Dedup v1.2 + Bengali Tech Content"
type: custom
metrics:
- name: Accuracy
type: accuracy
value: 0.85
verified: false
- name: Cultural Context Score
type: custom
value: 0.90
verified: false
---
# SheikhCoder v1.3b π
A culturally-aware code completion model built on top of Microsoft's Phi-2, fine-tuned with Bengali tech content and MDX-based cultural intelligence.
## Model Description
SheikhCoder is a specialized code completion model that combines the efficiency of Phi-2 with cultural awareness, particularly for Bengali developers. It supports both English and Bengali inputs, and provides contextually appropriate code suggestions.
### Key Features
- π§ 2.7B parameters (Phi-2 base)
- π 2048 token context window
- π¨ MDX-native cultural intelligence
- π Bengali language support
- β‘ 4-bit quantization support
- π Optimized for VS Code/Codespaces
### Use Cases
1. Code Completion with Cultural Context
2. Technical Documentation in Bengali
3. Culturally-Aware Code Comments
4. MDX-Based Documentation Generation
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
# Load the model
model = AutoModelForCausalLM.from_pretrained("likhonsheikh/sheikh-coder-v1-3b", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("likhonsheikh/sheikh-coder-v1-3b")
# Example usage
code = """
def calculate_zakat(amount):
# Calculate Islamic Zakat (2.5% of wealth)
"""
inputs = tokenizer(code, return_tensors="pt")
outputs = model.generate(**inputs, max_length=200)
print(tokenizer.decode(outputs[0]))
```
## Model Details
- **Base Model**: Microsoft Phi-2
- **Training Data**: Stack Dedup v1.2 + Bengali Tech Content
- **Parameters**: 2.7B
- **Context Length**: 2048 tokens
- **License**: MIT (following Phi-2)
- **Limitations**: See section below
## Performance and Limitations
- Best suited for code completion and documentation tasks
- May require fine-tuning for specific domains
- Bengali support is primarily for comments and documentation
- Resource requirements:
- RAM: 8GB minimum
- GPU: Optional, but recommended for faster inference
- Disk: ~5GB
## Benchmarks
```
Code Completion (Python):
- Accuracy: 85%
- Cultural Context Score: 90%
- Response Time: <100ms
Documentation Generation:
- BLEU Score: 0.75
- Cultural Relevance: 0.85
```
## Installation
```bash
# With pip
pip install torch transformers
# Optional: for 4-bit quantization
pip install bitsandbytes
```
## Contributing
We welcome contributions! Please check our contribution guidelines and feel free to submit pull requests.
## Citation
```bibtex
@software{sheikh_coder_2025,
author = {Likhon Sheikh},
title = {SheikhCoder: A Culturally-Aware Code Completion Model},
year = {2025},
publisher = {HuggingFace},
url = {https://huggingface.co/likhonsheikh/sheikh-coder-v1-3b}
}
```
## License
This model is released under the MIT License, following the licensing of its base model, Phi-2.
## Contact
- GitHub: [@likhonsheikh](https://github.com/likhonsheikh)
- HuggingFace: [@likhonsheikh](https://huggingface.co/likhonsheikh) |