GoConqurer committed
Commit 760f6ef · 1 Parent(s): 67e2508

🔧 Update OCR model in handlers.py and clean up README.md

✅ Changes:
- Updated OCR model from "microsoft/Florence-2-base" to "microsoft/Florence-2-large" for improved performance.
- Removed outdated architecture and deployment sections from README.md for clarity and conciseness.

🚀 This enhances the application's capabilities and streamlines documentation.
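
The checkpoint swap described above is a one-string change. As a minimal sketch only, and not something this commit does, the checkpoint name could be resolved from an environment variable so that switching between the base and large models needs no code edit. The variable name `TEXTLENS_OCR_MODEL` and the helper `resolve_model_name` are hypothetical; only the two checkpoint names come from the commit message.

```python
import os

# Checkpoint names taken from the commit message.
FLORENCE_BASE = "microsoft/Florence-2-base"
FLORENCE_LARGE = "microsoft/Florence-2-large"


def resolve_model_name(env=None):
    """Return the checkpoint to load, preferring an optional override.

    TEXTLENS_OCR_MODEL is an assumed variable name used for illustration;
    it is not defined anywhere in the project.
    """
    env = os.environ if env is None else env
    # Fall back to the large checkpoint, matching the new default in this commit.
    return env.get("TEXTLENS_OCR_MODEL", FLORENCE_LARGE)
```

The result could then be passed as `model_name` to `OCRProcessor`, which would keep the base checkpoint available on machines where the large one does not fit.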

Files changed (2)
  1. README.md +0 -241
  2. ui/handlers.py +1 -1
README.md CHANGED
@@ -14,7 +14,6 @@ license: mit
 
  [![Deploy to HuggingFace](https://img.shields.io/badge/🤗-Deploy%20to%20Spaces-blue)](https://huggingface.co/spaces/GoConqurer/textlens-ocr)
  [![GitHub](https://img.shields.io/badge/GitHub-Repository-green)](https://github.com/KumarAmrit30/textlens-ocr)
- [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
  [![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
 
  A state-of-the-art Vision-Language Model (VLM) based OCR application that extracts text from images using Microsoft Florence-2 with intelligent fallback systems and enterprise-grade zero downtime deployment.
@@ -56,29 +55,6 @@ A state-of-the-art Vision-Language Model (VLM) based OCR application that extrac
  - **Graceful Shutdown**: Signal handling for clean application restarts
  - **Production Ready**: Scalable architecture with automated deployment
 
- ## 🏗️ Architecture
-
- ```
- textlens-ocr/
- ├── 📱 Frontend (Gradio UI)
- │   ├── ui/interface.py          # Main interface components
- │   ├── ui/handlers.py           # Event handlers & logic
- │   └── ui/styles.py             # CSS styling & themes
- ├── 🧠 AI Models
- │   └── models/ocr_processor.py  # OCR engine with fallbacks
- ├── 🔧 Utilities
- │   └── utils/image_utils.py     # Image preprocessing
- ├── 🚀 Deployment
- │   ├── .github/workflows/       # CI/CD pipelines
- │   ├── scripts/deploy.py        # Manual deployment tools
- │   └── deployment.config.yml    # Deployment configuration
- ├── 📚 Documentation
- │   ├── README.md                # Main documentation
- │   └── DEPLOYMENT.md            # Deployment guide
- └── ⚙️ Configuration
-     ├── app.py                   # Main application entry
-     └── requirements.txt         # Dependencies
- ```
 
  ## 🚀 Quick Start
 
@@ -172,41 +148,7 @@ INTERFACE_WIDTH = "100%"
  | `TRANSFORMERS_CACHE` | Model cache path | `~/.cache/huggingface` |
  | `CUDA_VISIBLE_DEVICES` | GPU selection | All available |
 
- ## 🚀 Deployment
-
- ### 🤗 HuggingFace Spaces (Recommended)
-
- **Automatic Deployment:**
-
- 1. Fork this repository
- 2. Push to `main`/`master` branch
- 3. GitHub Actions automatically deploys to HuggingFace Spaces
- 4. Access your deployed app at: `https://huggingface.co/spaces/USERNAME/textlens-ocr`
-
- **Manual Deployment:**
-
- 1. Go to [GitHub Actions](https://github.com/KumarAmrit30/textlens-ocr/actions)
- 2. Select "Deploy to HuggingFace Spaces"
- 3. Click "Run workflow"
- 4. Choose deployment type:
-    - **Direct**: Quick deployment to production
-    - **Blue-Green**: Zero downtime with staging validation
-
- ### 🔄 Zero Downtime Deployment
-
- Our enterprise-grade deployment system ensures **zero downtime** for users:
 
- **Features:**
-
- - 🔵 **Blue-Green Deployment**: Test in staging before production
- - 🏥 **Health Monitoring**: Automatic health checks with retry logic
- - 🔄 **Graceful Shutdown**: Clean application restarts
- - 📊 **Real-time Monitoring**: Deployment status tracking
-
- **Health Endpoints:**
-
- - `GET /health` - Application health status
- - `GET /ready` - Application readiness check
 
  **Deployment Flow:**
 
@@ -220,170 +162,6 @@ graph LR
  F --> G[Complete ✅]
  ```
 
- ### 🐳 Docker Deployment
-
- ```dockerfile
- FROM python:3.9-slim
-
- WORKDIR /app
- COPY requirements.txt .
- RUN pip install -r requirements.txt
-
- COPY . .
- EXPOSE 7860
-
- CMD ["python", "app.py"]
- ```
-
- Build and run:
-
- ```bash
- docker build -t textlens-ocr .
- docker run -p 7860:7860 textlens-ocr
- ```
-
- ### ☁️ Cloud Platforms
-
- | Platform | Status | Guide |
- | ---------------------- | ------------- | ------------------------------------------------------------------- |
- | **HuggingFace Spaces** | ✅ Ready | [Deploy Now](https://huggingface.co/spaces/GoConqurer/textlens-ocr) |
- | **Google Colab** | ✅ Compatible | Open in Colab |
- | **AWS/GCP/Azure** | 🔧 Docker | Use Docker deployment |
- | **Heroku** | ⚠️ Limited | GPU not available |
-
- ## 🧪 Testing & Development
-
- ### 🔍 Running Tests
-
- ```bash
- # Basic functionality test
- python -c "
- from models.ocr_processor import OCRProcessor
- ocr = OCRProcessor()
- print(f'✅ Model loaded: {ocr.get_model_info()}')
- "
-
- # Test with sample image
- python -c "
- from PIL import Image
- from models.ocr_processor import OCRProcessor
- import requests
-
- # Download test image
- img_url = 'https://via.placeholder.com/300x100/000000/FFFFFF?text=Hello+World'
- image = Image.open(requests.get(img_url, stream=True).raw)
-
- # Test OCR
- ocr = OCRProcessor()
- result = ocr.extract_text(image)
- print(f'✅ OCR Result: {result}')
- "
- ```
-
- ### 🛠️ Development Tools
-
- ```bash
- # Install development dependencies
- pip install -r requirements.txt
-
- # Format code
- black . --line-length 88
-
- # Type checking
- mypy models/ utils/ ui/
-
- # Lint code
- flake8 --max-line-length 88
- ```
-
- ## 📚 API Reference
-
- ### OCRProcessor Class
-
- ```python
- from models.ocr_processor import OCRProcessor
-
- # Initialize processor
- ocr = OCRProcessor(
-     model_name="microsoft/Florence-2-base",  # Model selection
-     device=None,                             # Auto-detect device
-     torch_dtype=None                         # Auto-select dtype
- )
-
- # Extract text from image
- text = ocr.extract_text(image)
- # Returns: str
-
- # Extract text with bounding boxes
- result = ocr.extract_text_with_regions(image)
- # Returns: dict with text and regions
-
- # Get model information
- info = ocr.get_model_info()
- # Returns: dict with model details
-
- # Cleanup resources
- ocr.cleanup()
- ```
-
- ### Health Check API
-
- ```bash
- # Check application health
- curl https://huggingface.co/spaces/GoConqurer/textlens-ocr/health
-
- # Response:
- {
-   "status": "healthy",
-   "timestamp": 1640995200,
-   "version": "1.0.0",
-   "environment": "production"
- }
-
- # Check readiness
- curl https://huggingface.co/spaces/GoConqurer/textlens-ocr/ready
-
- # Response:
- {
-   "status": "ready",
-   "timestamp": 1640995200
- }
- ```
-
- ## 🚨 Troubleshooting
-
- ### Common Issues
-
- | Issue | Symptoms | Solution |
- | ----------------------- | ------------------------ | --------------------------------------- |
- | **Model Loading Error** | ImportError, CUDA errors | Check GPU drivers, install CUDA toolkit |
- | **Memory Error** | Out of memory | Reduce batch size, use CPU inference |
- | **SSL Certificate** | SSL errors on macOS | Run certificate update command |
- | **Permission Error** | File access denied | Check file permissions, run as admin |
-
- ### Debug Commands
-
- ```bash
- # Check CUDA availability
- python -c "import torch; print(f'CUDA: {torch.cuda.is_available()}')"
-
- # Check transformers version
- python -c "import transformers; print(f'Transformers: {transformers.__version__}')"
-
- # Test health endpoint locally
- curl http://localhost:7860/health
-
- # View application logs
- tail -f textlens.log
- ```
-
- ### Getting Help
-
- 1. 📋 **Check existing issues**: [GitHub Issues](https://github.com/KumarAmrit30/textlens-ocr/issues)
- 2. 🆕 **Create new issue**: Provide error details and environment info
- 3. 💬 **Join discussion**: [GitHub Discussions](https://github.com/KumarAmrit30/textlens-ocr/discussions)
- 4. 📧 **Contact**: Create an issue for direct support
-
  ## 🤝 Contributing
 
  We welcome contributions! Here's how to get started:
@@ -462,25 +240,6 @@ Special thanks to:
  | **API** | ✅ Stable | v1.0.0 |
  | **Documentation** | ✅ Complete | v1.0.0 |
 
- ### 🎯 Roadmap
-
- - [ ] **Multi-language UI** support
- - [ ] **Batch processing** for multiple images
- - [ ] **API rate limiting** and authentication
- - [ ] **Custom model** fine-tuning support
- - [ ] **Mobile app** development
- - [ ] **Cloud storage** integration
-
- ## 📞 Support & Community
-
- ### 🔗 Links
-
- - **🏠 Homepage**: [GitHub Repository](https://github.com/KumarAmrit30/textlens-ocr)
- - **🚀 Live Demo**: [HuggingFace Spaces](https://huggingface.co/spaces/GoConqurer/textlens-ocr)
- - **📋 Issues**: [Report Bugs](https://github.com/KumarAmrit30/textlens-ocr/issues)
- - **💬 Discussions**: [GitHub Discussions](https://github.com/KumarAmrit30/textlens-ocr/discussions)
- - **📖 Documentation**: [Deployment Guide](DEPLOYMENT.md)
-
  ### 📊 Stats
 
  ![GitHub stars](https://img.shields.io/github/stars/KumarAmrit30/textlens-ocr?style=social)
 
ui/handlers.py CHANGED
@@ -16,7 +16,7 @@ def initialize_ocr_processor():
      global ocr_processor
      try:
          logger.info("Initializing OCR processor...")
-         ocr_processor = OCRProcessor(model_name="microsoft/Florence-2-base")
+         ocr_processor = OCRProcessor(model_name="microsoft/Florence-2-large")
          return True
      except Exception as e:
          logger.error(f"Failed to initialize OCR processor: {str(e)}")
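
The hunk above hard-codes the larger checkpoint, so a host that cannot load it would fail initialization outright. The following is a rough sketch, not the project's code, of trying a list of checkpoints in order and keeping the first one that loads. `initialize_with_fallback` and its `factory` parameter are hypothetical names; in the application, `factory` would presumably be the `OCRProcessor` class, assuming it accepts a `model_name` keyword as shown in the diff.

```python
import logging

logger = logging.getLogger(__name__)


def initialize_with_fallback(factory, model_names):
    """Try each checkpoint in order and return (processor, name).

    `factory` is any callable accepting a `model_name` keyword argument.
    Returns (None, None) if every candidate fails to initialize.
    """
    for name in model_names:
        try:
            logger.info("Initializing OCR processor with %s...", name)
            return factory(model_name=name), name
        except Exception as e:  # mirrors the broad handler in the diff above
            logger.error("Failed to initialize %s: %s", name, e)
    return None, None
```

Called with `["microsoft/Florence-2-large", "microsoft/Florence-2-base"]`, this would prefer the new default and drop back to the previous one only if loading raises.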