---
license: mit
language:
- en
base_model:
- microsoft/deberta-v3-base
pipeline_tag: text-classification
library_name: transformers
tags:
- text-classification
- deberta
- it-support
- ticket-classification
---

# 🤖 DeBERTa-v3-base for Employee IT Support Ticket Classification

[![Model](https://img.shields.io/badge/Model-DeBERTa--v3--base-blue)](https://huggingface.co/microsoft/deberta-v3-base)
[![Transformers](https://img.shields.io/badge/Transformers-🤗-yellow.svg)](https://huggingface.co/docs/transformers)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)

## 📖 Model Overview

This model is a fine-tuned version of **microsoft/deberta-v3-base** for classifying **employee IT support tickets** into **11 categories**. It was trained in two stages:

1. **Domain Adaptation**: fine-tuned on ~12k general customer support tickets.
2. **Task Adaptation**: fine-tuned on 2,500 synthetic employee IT tickets.

The model automates **helpdesk ticket routing** by predicting the correct support category for each ticket.

---

## 🗂️ Labels

The model predicts one of the following categories:

- `Network`
- `Software`
- `Account`
- `Training`
- `Security`
- `Licensing`
- `Communication`
- `RemoteWork`
- `Hardware`
- `Infrastructure`
- `Performance`

---

## 🎯 Intended Uses

- **Automated Ticket Routing**: assign new tickets to the right IT team.
- **Helpdesk Analytics**: analyze ticket trends.
- **Chatbots**: suggest relevant answers or knowledge-base articles.

⚠️ **Limitations**:

- The synthetic training data may not capture all company-specific jargon.
- Validation accuracy is near-perfect, but real-world accuracy is expected to be around **85–95%**.

---

## 💻 Usage

```python
from transformers import pipeline

# Load the fine-tuned classifier
classifier = pipeline("text-classification", model="your-username/deberta-it-support")

subject = "VPN connection dropping"
description = "My VPN disconnects every 15 minutes, preventing access to remote servers."
text_input = f"[SUBJECT] {subject} [TEXT] {description}"

result = classifier(text_input)
print(result)
# [{'label': 'RemoteWork', 'score': 0.98}]
```

## 🏋️ Training Data

| Stage   | Dataset                           | Size    | Purpose           |
|---------|-----------------------------------|---------|-------------------|
| Stage 1 | Customer Support Tickets (public) | ~12,000 | Domain Adaptation |
| Stage 2 | Synthetic Employee IT Tickets     | 2,500   | Task Adaptation   |

## Hyperparameters

| Hyperparameter          | Stage 1 | Stage 2 |
|-------------------------|---------|---------|
| Learning Rate           | 2e-5    | 5e-6    |
| Epochs                  | 3       | 5       |
| Batch Size (per device) | 8       | 8       |
| Gradient Accumulation   | 4       | 4       |
| Optimizer               | AdamW   | AdamW   |
| Precision               | FP16    | FP16    |

## 📊 Evaluation

The final model achieved **99.4% accuracy** on the validation split of the synthetic dataset. The best checkpoint was saved using the `load_best_model_at_end` strategy, based on validation loss. As noted in the limitations, real-world performance will likely be slightly lower but is still expected to be high.

---

This model was fine-tuned by [Pulastya/Pulastya0]. The base model, microsoft/deberta-v3-base, is provided under the MIT license.
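The usage example above builds the model input as a single `[SUBJECT] … [TEXT] …` string. To keep every caller consistent with that format, it can be factored into a small helper. This is a minimal sketch; the `format_ticket` name is illustrative and not part of the released model.

```python
def format_ticket(subject: str, description: str) -> str:
    """Build the single-string input the classifier expects:
    subject and free-text description joined by the [SUBJECT]/[TEXT]
    markers used in the usage example."""
    return f"[SUBJECT] {subject.strip()} [TEXT] {description.strip()}"


text_input = format_ticket(
    "VPN connection dropping",
    "My VPN disconnects every 15 minutes, preventing access to remote servers.",
)
print(text_input)
# [SUBJECT] VPN connection dropping [TEXT] My VPN disconnects every 15 minutes, preventing access to remote servers.
```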
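Since real-world accuracy is expected to be below the validation figure, automated routing can fall back to human triage when the classifier's top score is low. The sketch below is an assumption-laden illustration: the `route_ticket` helper, the 0.80 threshold, and the `ManualTriage` queue name are all our own, not part of this model card.

```python
FALLBACK_QUEUE = "ManualTriage"  # hypothetical queue name for human review


def route_ticket(prediction: dict, threshold: float = 0.80) -> str:
    """Return the predicted category when the classifier is confident
    enough, otherwise escalate to the fallback queue.

    `prediction` has the same shape as one pipeline result,
    e.g. {'label': 'RemoteWork', 'score': 0.98}.
    """
    if prediction["score"] >= threshold:
        return prediction["label"]
    return FALLBACK_QUEUE


print(route_ticket({"label": "RemoteWork", "score": 0.98}))  # RemoteWork
print(route_ticket({"label": "Hardware", "score": 0.41}))    # ManualTriage
```

The threshold should be tuned on real tickets: a higher value routes fewer tickets automatically but with fewer misroutes.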