QomSSLab/CaseTypeClassifier-fa
QomSSLab/CaseTypeClassifier-fa is a Persian legal text classifier that predicts whether a court ruling (رأی) belongs to a civil (حقوقی) or criminal (کیفری) category.
The model is designed for use in Iranian legal NLP pipelines, document organization, and downstream analysis of judicial data.
💡 Use Cases
- Automatic classification of Persian court rulings into civil or criminal categories.
- Preprocessing step for legal analytics and document retrieval systems.
- Assisting legal researchers and developers in structuring Persian legal corpora.
🧠 Model Details
- Language: Persian (Farsi)
- Task: Text Classification
- Classes: civil (حقوقی), criminal (کیفری)
- Pipeline Tag: text-classification
📦 Example Usage
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Load the tokenizer and the classification model from the Hugging Face Hub
model_name = "QomSSLab/CaseTypeClassifier-fa"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Wrap both in a text-classification pipeline
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

# English: "In this case the defendant was convicted of theft of public property."
text = "در این پرونده متهم به سرقت اموال عمومی محکوم شده است."
result = classifier(text)
print(result)
Example Output:
[
{'label': 'کیفری', 'score': 0.9969141483306885}
]
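The pipeline returns the Persian label string, as the output above shows. When organizing a corpus it can help to map those labels to the English class names and count rulings per class. The following is a minimal sketch of that step, not part of the model's own API: the helper `tally_case_types` and the `LABEL_EN` mapping are hypothetical names, the `حقوقی` label is assumed from the class list (only `کیفری` appears in the example output), and the sample scores are illustrative values rather than real model output.

```python
from collections import Counter

# Assumed mapping from the model's Persian labels to English class names.
LABEL_EN = {"حقوقی": "civil", "کیفری": "criminal"}


def tally_case_types(results):
    """Count predictions per English class name.

    `results` is a list of dicts shaped like the pipeline output,
    e.g. [{'label': 'کیفری', 'score': 0.99}, ...].
    Labels missing from LABEL_EN are counted under 'unknown'.
    """
    return Counter(LABEL_EN.get(r["label"], "unknown") for r in results)


# Illustrative predictions (made-up scores, same shape as pipeline output).
sample = [
    {"label": "کیفری", "score": 0.9969},
    {"label": "حقوقی", "score": 0.9812},
    {"label": "کیفری", "score": 0.9540},
]
counts = tally_case_types(sample)
print(dict(counts))
```

In practice `sample` would be replaced by the list returned from calling `classifier` on a list of ruling texts.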
📊 Evaluation
The model was trained and evaluated on a balanced dataset of Persian court rulings.
The table reports the final training loss together with the loss and classification metrics measured on the validation set.
| Metric | Value |
|---|---|
| Training Loss | 0.0358 |
| Validation Loss | 0.033996 |
| Accuracy | 0.9951 |
| F1 Score | 0.9951 |
| Precision | 0.9951 |
| Recall | 0.9951 |
✅ Final Performance: The model achieved 99.51% accuracy and an F1 score of 0.9951 on the validation set.
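For readers who want to reproduce this kind of report on their own labeled rulings, the sketch below computes accuracy, precision, recall, and F1 from gold and predicted labels in plain Python. It is an illustration under stated assumptions: the card does not say which averaging method produced the values in the table, so macro averaging over the two classes is assumed here, the function name `binary_metrics` is hypothetical, and the toy labels are not the model's evaluation data.

```python
def binary_metrics(y_true, y_pred, labels=("حقوقی", "کیفری")):
    """Return accuracy plus macro-averaged precision, recall, and F1.

    Macro averaging is an assumption; the model card does not state
    how its reported metrics were averaged.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy example: four rulings, one civil ruling misclassified as criminal.
y_true = ["حقوقی", "حقوقی", "کیفری", "کیفری"]
y_pred = ["حقوقی", "کیفری", "کیفری", "کیفری"]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

On a balanced, near-perfectly classified set the four values converge, which is consistent with the identical figures in the table, although that alone does not confirm the averaging method used.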
Limitations
- Performance may degrade on highly abbreviated or informal texts.
- Designed primarily for Iranian legal language; may not generalize to non-Iranian legal contexts.
- Does not classify subtypes (e.g., family, property, or financial cases).
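Because performance may degrade on informal or out-of-domain text, one common precaution is to route low-confidence predictions to manual review instead of accepting them automatically. The sketch below illustrates the idea; the helper `split_by_confidence` is a hypothetical name, the 0.9 threshold is an arbitrary illustrative value that would need tuning on real data, and the sample predictions are made up rather than produced by the model.

```python
def split_by_confidence(texts, predictions, threshold=0.9):
    """Split (text, label) pairs into accepted and flagged lists.

    `predictions` has one dict per text, shaped like the pipeline
    output: {'label': ..., 'score': ...}. A prediction is flagged for
    manual review when its score is below `threshold`.
    """
    if len(texts) != len(predictions):
        raise ValueError("texts and predictions must have the same length")

    accepted, flagged = [], []
    for text, pred in zip(texts, predictions):
        target = flagged if pred["score"] < threshold else accepted
        target.append((text, pred["label"]))
    return accepted, flagged


# Illustrative inputs: placeholder texts and made-up scores.
texts = ["ruling A", "ruling B"]
predictions = [
    {"label": "کیفری", "score": 0.99},
    {"label": "حقوقی", "score": 0.62},
]
accepted, flagged = split_by_confidence(texts, predictions)
print(len(accepted), "accepted;", len(flagged), "flagged for review")
```

Note that a softmax score is not a calibrated probability, so a high score on unfamiliar text does not guarantee a correct label.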