yangdingcheok committed on
Commit
ede5327
·
verified ·
1 Parent(s): a26f37f

Upload 3 files

Files changed (3)
  1. README.md +250 -7
  2. app.py +260 -0
  3. requirements.txt +64 -0
README.md CHANGED
@@ -1,12 +1,255 @@
  ---
- title: Language Detection
- emoji: 📚
- colorFrom: green
- colorTo: yellow
  sdk: gradio
- sdk_version: 5.33.0
  app_file: app.py
- pinned: false
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
  ---
+ title: Language Detection App
+ emoji: 🌍
+ colorFrom: indigo
+ colorTo: blue
  sdk: gradio
+ python_version: 3.9
  app_file: app.py
+ license: mit
  ---

+ # 🌍 Language Detection App
+
+ A language detection application with a Gradio front end and a modular backend that provides multiple transformer-based models, organized by architecture and training dataset.
+
+ ## ✨ Features
+
+ - **Clean Gradio Interface**: Simple, intuitive web interface for language detection
+ - **Multiple Model Architectures**: Choose between XLM-RoBERTa (Model A) and BERT (Model B) architectures
+ - **Multiple Training Datasets**: Models trained on standard (Dataset A) and enhanced (Dataset B) datasets
+ - **Centralized Configuration**: All model configurations and settings in one place
+ - **Modular Backend**: Easy-to-extend architecture for plugging in your own ML models
+ - **Real-time Detection**: Instant language detection with confidence scores
+ - **Multiple Predictions**: Shows the top 5 language predictions with confidence levels
+ - **100+ Languages**: Support for major world languages (varies by model)
+ - **Example Texts**: Pre-loaded examples in various languages for testing
+ - **Model Switching**: Seamlessly switch between different models
+ - **Extensible**: Abstract base class for implementing custom models
+
+ ## 🚀 Quick Start
+
+ ### 1. Setup Environment
+
+ ```bash
+ # Create virtual environment
+ python -m venv venv
+
+ # Activate environment
+ # On macOS/Linux:
+ source venv/bin/activate
+ # On Windows:
+ venv\Scripts\activate
+
+ # Install dependencies
+ pip install -r requirements.txt
+ ```
+
+ ### 2. Test the Backend
+
+ ```bash
+ # Run tests to verify everything works
+ python test_app.py
+
+ # Test specific model combinations
+ python test_model_a_dataset_a.py
+ python test_model_b_dataset_b.py
+ ```
+
+ ### 3. Launch the App
+
+ ```bash
+ # Start the Gradio app
+ python app.py
+ ```
+
+ The app will be available at `http://localhost:7860`.
+
+ ## 🧩 Model Architecture
+
+ The system is organized around two dimensions:
+
+ ### 🏗️ Model Architectures
+ - **Model A**: XLM-RoBERTa-based architectures with excellent cross-lingual transfer capabilities
+ - **Model B**: BERT-based architectures for efficient, fast processing
+
+ ### 📊 Training Datasets
+ - **Dataset A**: Standard multilingual language detection dataset with broad language coverage
+ - **Dataset B**: Enhanced/specialized language detection dataset focused on ultra-high accuracy
+
+ ### 🤖 Available Model Combinations
+
+ 1. **Model A Dataset A** - XLM-RoBERTa + Standard Dataset ✅
+    - **Architecture**: XLM-RoBERTa (Model A)
+    - **Training**: Dataset A (standard multilingual)
+    - **Accuracy**: 97.9%
+    - **Size**: 278M parameters
+    - **Languages**: 100+ languages
+    - **Strengths**: Balanced performance, robust cross-lingual capabilities, comprehensive language coverage
+    - **Use Cases**: General-purpose language detection, multilingual content processing
+
+ 2. **Model B Dataset A** - BERT + Standard Dataset ✅
+    - **Architecture**: BERT (Model B)
+    - **Training**: Dataset A (standard multilingual)
+    - **Accuracy**: 96.17%
+    - **Size**: 178M parameters
+    - **Languages**: 100+ languages
+    - **Strengths**: Fast inference, broad language support, efficient processing
+    - **Use Cases**: High-throughput detection, real-time applications, resource-constrained environments
+
+ 3. **Model A Dataset B** - XLM-RoBERTa + Enhanced Dataset ✅
+    - **Architecture**: XLM-RoBERTa (Model A)
+    - **Training**: Dataset B (enhanced/specialized)
+    - **Accuracy**: 99.72%
+    - **Size**: 278M parameters
+    - **Training Loss**: 0.0176
+    - **Languages**: 20 carefully selected languages
+    - **Strengths**: Exceptional accuracy, focused language support, state-of-the-art results
+    - **Use Cases**: Research applications, high-precision detection, critical accuracy requirements
+
+ 4. **Model B Dataset B** - BERT + Enhanced Dataset ✅
+    - **Architecture**: BERT (Model B)
+    - **Training**: Dataset B (enhanced/specialized)
+    - **Accuracy**: 99.85%
+    - **Size**: 178M parameters
+    - **Training Loss**: 0.0125
+    - **Languages**: 20 carefully selected languages
+    - **Strengths**: Highest accuracy, ultra-low training loss, precision-optimized
+    - **Use Cases**: Maximum precision applications, research requiring highest accuracy
+
+ ### 🏗️ Core Components
+
+ - **`BaseLanguageModel`**: Abstract interface that all models must implement (sketched below)
+ - **`ModelRegistry`**: Manages model registration and creation with centralized configuration
+ - **`LanguageDetector`**: Main orchestrator for language detection
+ - **`model_config.py`**: Centralized configuration for all models and language mappings
+
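+ As a rough sketch, the abstract interface might look like the following (the real definition lives in `backend/models/base_model.py`; the method names come from the example in the next section, while the docstrings and exact signatures are assumptions):
+
+ ```python
+ # Hypothetical sketch of backend/models/base_model.py
+ from abc import ABC, abstractmethod
+ from typing import Any, Dict, List
+
+ class BaseLanguageModel(ABC):
+     @abstractmethod
+     def predict(self, text: str) -> Dict[str, Any]:
+         """Return the prediction (language, confidence, ...) for a text."""
+
+     @abstractmethod
+     def get_supported_languages(self) -> List[str]:
+         """Return the language codes this model supports."""
+
+     @abstractmethod
+     def get_model_info(self) -> Dict[str, Any]:
+         """Return model metadata (name, accuracy, size, ...)."""
+ ```
+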
+ ### 🔧 Adding New Models
+
+ To add a new model combination, simply:
+
+ 1. Create a new file in `backend/models/` (e.g., `model_c_dataset_a.py`)
+ 2. Inherit from `BaseLanguageModel`
+ 3. Implement the required methods
+ 4. Add configuration to `model_config.py`
+ 5. Register it in `ModelRegistry`
+
+ Example:
+ ```python
+ # backend/models/model_c_dataset_a.py
+ from typing import Any, Dict, List
+
+ from .base_model import BaseLanguageModel
+ from .model_config import get_model_config
+
+ class ModelCDatasetA(BaseLanguageModel):
+     def __init__(self):
+         self.model_key = "model-c-dataset-a"
+         self.config = get_model_config(self.model_key)
+         # Initialize your model
+
+     def predict(self, text: str) -> Dict[str, Any]:
+         # Implement prediction logic
+         pass
+
+     def get_supported_languages(self) -> List[str]:
+         # Return supported language codes
+         pass
+
+     def get_model_info(self) -> Dict[str, Any]:
+         # Return model metadata from config
+         pass
+ ```
+
+ Then add configuration in `model_config.py` and register in `language_detector.py`.
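+
+ The shape of that configuration is not shown in this commit; a plausible `model_config.py` entry might look like this (field names inferred from what `app.py` reads via `_format_model_info`; everything else is an assumption):
+
+ ```python
+ # Hypothetical entry in backend/models/model_config.py
+ MODEL_CONFIGS = {
+     "model-c-dataset-a": {
+         "name": "Model C Dataset A",
+         "architecture": "Model C",  # placeholder -- assumption
+         "dataset": "Dataset A (standard multilingual)",
+         "accuracy": "N/A",
+         "model_size": "N/A",
+         "languages_supported": "100+",
+     },
+ }
+
+ def get_model_config(model_key: str) -> dict:
+     return MODEL_CONFIGS[model_key]
+ ```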
+
+ ## 🧪 Testing
+
+ The project includes comprehensive test suites (a minimal example follows the list):
+
+ - **`test_app.py`**: General app functionality tests
+ - **`test_model_a_dataset_a.py`**: Tests for XLM-RoBERTa + standard dataset
+ - **`test_model_b_dataset_b.py`**: Tests for BERT + enhanced dataset (highest accuracy)
+ - **Model comparison tests**: Automated testing across all model combinations
+ - **Model switching tests**: Verify seamless model switching
+
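+ A minimal smoke test in the same style might look like this (the result keys match what `app.py` reads from the detector; the expected values are assumptions):
+
+ ```python
+ # Minimal smoke test sketch (not part of this commit)
+ from backend.language_detector import LanguageDetector
+
+ def test_basic_detection():
+     detector = LanguageDetector(model_key="model-a-dataset-a")
+     result = detector.detect_language("Bonjour, comment allez-vous?")
+     assert result["language_code"] == "fr"
+     assert 0.0 <= result["confidence"] <= 1.0
+     assert len(result["top_predictions"]) <= 5
+
+ if __name__ == "__main__":
+     test_basic_detection()
+     print("Smoke test passed.")
+ ```
+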
+ ## 🌐 Supported Languages
+
+ The models support different language sets based on their training:
+
+ - **Model A/B + Dataset A**: 100+ languages including major European, Asian, African, and other world languages, based on the CC-100 dataset
+ - **Model A/B + Dataset B**: 20 carefully selected high-performance languages (Arabic, Bulgarian, German, Greek, English, Spanish, French, Hindi, Italian, Japanese, Dutch, Polish, Portuguese, Russian, Swahili, Thai, Turkish, Urdu, Vietnamese, Chinese); their ISO codes are listed below
+
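+ For reference, the standard ISO 639-1 codes for the 20 Dataset B languages (whether the models emit exactly these labels is an assumption):
+
+ ```python
+ # ISO 639-1 codes for the 20 Dataset B languages
+ DATASET_B_LANGUAGES = {
+     "ar": "Arabic",     "bg": "Bulgarian",  "de": "German",     "el": "Greek",
+     "en": "English",    "es": "Spanish",    "fr": "French",     "hi": "Hindi",
+     "it": "Italian",    "ja": "Japanese",   "nl": "Dutch",      "pl": "Polish",
+     "pt": "Portuguese", "ru": "Russian",    "sw": "Swahili",    "th": "Thai",
+     "tr": "Turkish",    "ur": "Urdu",       "vi": "Vietnamese", "zh": "Chinese",
+ }
+ ```
+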
+ ## 📊 Model Comparison
+
+ | Feature | Model A Dataset A | Model B Dataset A | Model A Dataset B | Model B Dataset B |
+ |---------|-------------------|-------------------|-------------------|-------------------|
+ | **Architecture** | XLM-RoBERTa | BERT | XLM-RoBERTa | BERT |
+ | **Dataset** | Standard | Standard | Enhanced | Enhanced |
+ | **Accuracy** | 97.9% | 96.17% | 99.72% | **99.85%** 🏆 |
+ | **Model Size** | 278M | 178M | 278M | 178M |
+ | **Languages** | 100+ | 100+ | 20 (curated) | 20 (curated) |
+ | **Training Loss** | N/A | N/A | 0.0176 | **0.0125** |
+ | **Speed** | Moderate | **Fast** | Moderate | **Fast** |
+ | **Memory Usage** | Higher | **Lower** | Higher | **Lower** |
+ | **Best For** | Balanced performance | Speed & broad coverage | Ultra-high accuracy | **Maximum precision** |
+
+ ### 🎯 Model Selection Guide
+
+ - **🏆 Model B Dataset B**: Choose for maximum accuracy on 20 core languages (99.85%)
+ - **🔬 Model A Dataset B**: Choose for ultra-high accuracy on 20 core languages (99.72%)
+ - **⚖️ Model A Dataset A**: Choose for balanced performance and comprehensive language coverage (97.9%)
+ - **⚡ Model B Dataset A**: Choose for fast inference and broad language coverage (96.17%)
+
+ ## 🔧 Configuration
+
+ You can configure models using the centralized configuration system:
+
+ ```python
+ # Default model selection
+ detector = LanguageDetector(model_key="model-a-dataset-a")  # Balanced XLM-RoBERTa
+ detector = LanguageDetector(model_key="model-b-dataset-a")  # Fast BERT
+ detector = LanguageDetector(model_key="model-a-dataset-b")  # Ultra-high accuracy XLM-RoBERTa
+ detector = LanguageDetector(model_key="model-b-dataset-b")  # Maximum precision BERT
+
+ # All configurations are centralized in backend/models/model_config.py
+ ```
+
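+ Once a detector is constructed, detection is a single call. This sketch uses the result fields that `app.py` reads (`language`, `language_code`, `confidence`, `top_predictions`); the printed values are illustrative:
+
+ ```python
+ detector = LanguageDetector(model_key="model-b-dataset-b")
+ result = detector.detect_language("Bonjour, comment allez-vous?")
+
+ print(result["language"])       # e.g. "French"
+ print(result["language_code"])  # e.g. "fr"
+ print(result["confidence"])     # e.g. 0.9987
+ for pred in result["top_predictions"]:
+     print(pred["language"], pred["language_code"], pred["confidence"])
+ ```
+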
+ ## 📁 Project Structure
+
+ ```
+ language-detection/
+ ├── backend/
+ │   ├── models/
+ │   │   ├── model_config.py        # Centralized configuration
+ │   │   ├── base_model.py          # Abstract base class
+ │   │   ├── model_a_dataset_a.py   # XLM-RoBERTa + Standard
+ │   │   ├── model_b_dataset_a.py   # BERT + Standard
+ │   │   ├── model_a_dataset_b.py   # XLM-RoBERTa + Enhanced
+ │   │   ├── model_b_dataset_b.py   # BERT + Enhanced
+ │   │   └── __init__.py
+ │   └── language_detector.py       # Main orchestrator
+ ├── tests/
+ ├── app.py                         # Gradio interface
+ └── README.md
+ ```
+
+ ## 🤝 Contributing
+
+ 1. Fork the repository
+ 2. Create your feature branch (`git checkout -b feature/new-model-combination`)
+ 3. Implement your model following the `BaseLanguageModel` interface
+ 4. Add configuration to `model_config.py`
+ 5. Add tests for your implementation
+ 6. Commit your changes (`git commit -m 'Add new model combination'`)
+ 7. Push to the branch (`git push origin feature/new-model-combination`)
+ 8. Open a Pull Request
+
+ ## 📝 License
+
+ This project is open source and available under the MIT License.
+
+ ## 🙏 Acknowledgments
+
+ - **Hugging Face** for the transformers library and model hosting platform
+ - **Model providers** for the fine-tuned language detection models used in this project
+ - **Gradio** for the excellent web interface framework
+ - **Open source community** for the foundational technologies that make this project possible
app.py ADDED
@@ -0,0 +1,260 @@
+ import gradio as gr
+ from backend.language_detector import LanguageDetector
+
+ def main():
+     # Initialize the language detector with default model (Model A Dataset A)
+     detector = LanguageDetector()
+
+     # Create Gradio interface
+     with gr.Blocks(title="Language Detection App", theme=gr.themes.Soft()) as app:
+         gr.Markdown("# 🌍 Language Detection App")
+         gr.Markdown("Select a model and enter text below to detect its language with confidence scores.")
+
+         # Model Selection Section with visual styling
+         with gr.Group():
+             gr.Markdown(
+                 "<div style='text-align: center; padding: 16px 0 8px 0; margin-bottom: 16px; font-size: 18px; font-weight: 600; border-bottom: 2px solid; background: linear-gradient(90deg, transparent, rgba(99, 102, 241, 0.1), transparent); border-radius: 8px 8px 0 0;'>🤖 Model Selection</div>"
+             )
+
+             # Get available models
+             available_models = detector.get_available_models()
+             model_choices = []
+             model_info_map = {}
+
+             for key, info in available_models.items():
+                 if info["status"] == "available":
+                     model_choices.append((info["display_name"], key))
+                 else:
+                     model_choices.append((f"{info['display_name']} (Coming Soon)", key))
+                 model_info_map[key] = info
+
+             model_selector = gr.Dropdown(
+                 choices=model_choices,
+                 value="model-a-dataset-a",  # Default to Model A Dataset A
+                 label="Choose Language Detection Model",
+                 interactive=True
+             )
+
+             # Model Information Display
+             model_info_display = gr.Markdown(
+                 value=_format_model_info(detector.get_current_model_info()),
+                 label="Model Information"
+             )
+
+         # Add visual separator
+         gr.Markdown(
+             "<div style='margin: 24px 0; border-top: 3px solid rgba(99, 102, 241, 0.2); background: linear-gradient(90deg, transparent, rgba(99, 102, 241, 0.05), transparent); height: 2px;'></div>"
+         )
+
+         # Analysis Section
+         with gr.Group():
+             gr.Markdown(
+                 "<div style='text-align: center; padding: 16px 0 8px 0; margin-bottom: 16px; font-size: 18px; font-weight: 600; border-bottom: 2px solid; background: linear-gradient(90deg, transparent, rgba(34, 197, 94, 0.1), transparent); border-radius: 8px 8px 0 0;'>🔍 Language Analysis</div>"
+             )
+
+             with gr.Row():
+                 with gr.Column(scale=2):
+                     # Input section
+                     text_input = gr.Textbox(
+                         label="Text to Analyze",
+                         placeholder="Enter text here to detect its language...",
+                         lines=5,
+                         max_lines=10
+                     )
+
+                     detect_btn = gr.Button("🔍 Detect Language", variant="primary", size="lg")
+
+                     # Example texts
+                     gr.Examples(
+                         examples=[
+                             ["Hello, how are you today?"],
+                             ["Bonjour, comment allez-vous?"],
+                             ["Hola, ¿cómo estás?"],
+                             ["Guten Tag, wie geht es Ihnen?"],
+                             ["こんにちは、元気ですか？"],
+                             ["Привет, как дела?"],
+                             ["Ciao, come stai?"],
+                             ["Olá, como você está?"],
+                             ["你好，你好吗？"],
+                             ["안녕하세요, 어떻게 지내세요?"]
+                         ],
+                         inputs=text_input,
+                         label="Try these examples:"
+                     )
+
+                 with gr.Column(scale=2):
+                     # Output section
+                     with gr.Group():
+                         gr.Markdown(
+                             "<div style='text-align: center; padding: 16px 0 8px 0; margin-bottom: 12px; font-size: 18px; font-weight: 600; border-bottom: 2px solid; background: linear-gradient(90deg, transparent, rgba(168, 85, 247, 0.1), transparent); border-radius: 8px 8px 0 0;'>📊 Detection Results</div>"
+                         )
+
+                         detected_language = gr.Textbox(
+                             label="Detected Language",
+                             interactive=False
+                         )
+
+                         confidence_score = gr.Number(
+                             label="Confidence Score",
+                             interactive=False,
+                             precision=4
+                         )
+
+                         language_code = gr.Textbox(
+                             label="Language Code (ISO 639-1)",
+                             interactive=False
+                         )
+
+                         # Top predictions table
+                         top_predictions = gr.Dataframe(
+                             headers=["Language", "Code", "Confidence"],
+                             label="Top 5 Predictions",
+                             interactive=False,
+                             wrap=True
+                         )
+
+         # Status/Info section
+         with gr.Row():
+             status_text = gr.Textbox(
+                 label="Status",
+                 interactive=False,
+                 visible=False
+             )
+
+         # Event handlers
+         def detect_language_wrapper(text, selected_model):
+             if not text.strip():
+                 return (
+                     "No text provided",
+                     0.0,
+                     "",
+                     [],
+                     gr.update(value="Please enter some text to analyze.", visible=True)
+                 )
+
+             try:
+                 # Switch model if needed
+                 if detector.current_model_key != selected_model:
+                     try:
+                         detector.switch_model(selected_model)
+                     except NotImplementedError:
+                         return (
+                             "Model unavailable",
+                             0.0,
+                             "",
+                             [],
+                             gr.update(value="This model is not yet implemented. Please select an available model.", visible=True)
+                         )
+                     except Exception as e:
+                         return (
+                             "Model error",
+                             0.0,
+                             "",
+                             [],
+                             gr.update(value=f"Error loading model: {str(e)}", visible=True)
+                         )
+
+                 result = detector.detect_language(text)
+
+                 # Extract main prediction
+                 main_lang = result['language']
+                 main_confidence = result['confidence']
+                 main_code = result['language_code']
+
+                 # Format top predictions for table
+                 predictions_table = [
+                     [pred['language'], pred['language_code'], f"{pred['confidence']:.4f}"]
+                     for pred in result['top_predictions']
+                 ]
+
+                 model_info = result.get('metadata', {}).get('model_info', {})
+                 model_name = model_info.get('name', 'Unknown Model')
+
+                 return (
+                     main_lang,
+                     main_confidence,
+                     main_code,
+                     predictions_table,
+                     gr.update(value=f"✅ Analysis Complete\n\nInput Text: {text[:100]}{'...' if len(text) > 100 else ''}\n\nDetected Language: {main_lang} ({main_code})\nConfidence: {main_confidence:.2%}\n\nModel: {model_name}", visible=True)
+                 )
+
+             except Exception as e:
+                 return (
+                     "Error occurred",
+                     0.0,
+                     "",
+                     [],
+                     gr.update(value=f"Error: {str(e)}", visible=True)
+                 )
+
+         def update_model_info(selected_model):
+             """Update model information display when model selection changes."""
+             try:
+                 if detector.current_model_key != selected_model:
+                     detector.switch_model(selected_model)
+                 model_info = detector.get_current_model_info()
+                 return _format_model_info(model_info)
+             except NotImplementedError:
+                 return "**This model is not yet implemented.** Please select an available model."
+             except Exception as e:
+                 return f"**Error loading model information:** {str(e)}"
+
+         # Connect the button to the detection function
+         detect_btn.click(
+             fn=detect_language_wrapper,
+             inputs=[text_input, model_selector],
+             outputs=[detected_language, confidence_score, language_code, top_predictions, status_text]
+         )
+
+         # Also trigger on Enter key in text input
+         text_input.submit(
+             fn=detect_language_wrapper,
+             inputs=[text_input, model_selector],
+             outputs=[detected_language, confidence_score, language_code, top_predictions, status_text]
+         )
+
+         # Update model info when selection changes
+         model_selector.change(
+             fn=update_model_info,
+             inputs=[model_selector],
+             outputs=[model_info_display]
+         )
+
+     return app
+
+
+ def _format_model_info(model_info):
+     """Format model information for display."""
+     if not model_info:
+         return "No model information available."
+
+     formatted_info = f"""
+ **{model_info.get('name', 'Unknown Model')}**
+
+ {model_info.get('description', 'No description available.')}
+
+ **📊 Performance:**
+ - Accuracy: {model_info.get('accuracy', 'N/A')}
+ - Model Size: {model_info.get('model_size', 'N/A')}
+
+ **🏗️ Architecture:**
+ - Model Architecture: {model_info.get('architecture', 'N/A')}
+ - Base Model: {model_info.get('base_model', 'N/A')}
+ - Training Dataset: {model_info.get('dataset', 'N/A')}
+
+ **🌐 Languages:** {model_info.get('languages_supported', 'N/A')}
+
+ **⚙️ Training Details:** {model_info.get('training_details', 'N/A')}
+
+ **💡 Use Cases:** {model_info.get('use_cases', 'N/A')}
+
+ **✅ Strengths:** {model_info.get('strengths', 'N/A')}
+
+ **⚠️ Limitations:** {model_info.get('limitations', 'N/A')}
+ """
+     return formatted_info
+
+
+ if __name__ == "__main__":
+     app = main()
+     app.launch()
requirements.txt ADDED
@@ -0,0 +1,64 @@
+ aiofiles==24.1.0
+ annotated-types==0.7.0
+ anyio==4.9.0
+ audioop-lts==0.2.1
+ certifi==2025.4.26
+ charset-normalizer==3.4.2
+ click==8.1.8
+ fastapi==0.115.12
+ ffmpy==0.5.0
+ filelock==3.18.0
+ fsspec==2025.5.1
+ gradio==5.31.0
+ gradio_client==1.10.1
+ groovy==0.1.2
+ h11==0.16.0
+ hf-xet==1.1.2
+ httpcore==1.0.9
+ httpx==0.28.1
+ huggingface-hub==0.32.0
+ idna==3.10
+ Jinja2==3.1.6
+ markdown-it-py==3.0.0
+ MarkupSafe==3.0.2
+ mdurl==0.1.2
+ mpmath==1.3.0
+ networkx==3.4.2
+ numpy==2.2.6
+ orjson==3.10.18
+ packaging==25.0
+ pandas==2.2.3
+ pillow==11.2.1
+ pydantic==2.11.5
+ pydantic_core==2.33.2
+ pydub==0.25.1
+ Pygments==2.19.1
+ python-dateutil==2.9.0.post0
+ python-multipart==0.0.20
+ pytz==2025.2
+ PyYAML==6.0.2
+ regex==2024.11.6
+ requests==2.32.3
+ rich==14.0.0
+ ruff==0.11.11
+ safehttpx==0.1.6
+ safetensors==0.5.3
+ semantic-version==2.10.0
+ setuptools==80.8.0
+ shellingham==1.5.4
+ six==1.17.0
+ sniffio==1.3.1
+ starlette==0.46.2
+ sympy==1.14.0
+ tokenizers==0.21.1
+ tomlkit==0.13.2
+ torch==2.7.0
+ tqdm==4.67.1
+ transformers==4.52.3
+ typer==0.15.4
+ typing-inspection==0.4.1
+ typing_extensions==4.13.2
+ tzdata==2025.2
+ urllib3==2.4.0
+ uvicorn==0.34.2
+ websockets==15.0.1