Commit 760f6ef
Parent(s): 67e2508
Update OCR model in handlers.py and clean up README.md
Changes:
- Updated OCR model from "microsoft/Florence-2-base" to "microsoft/Florence-2-large" for improved performance.
- Removed outdated architecture and deployment sections from README.md for clarity and conciseness.
This enhances the application's capabilities and streamlines documentation.
- README.md +0 -241
- ui/handlers.py +1 -1
README.md
CHANGED
@@ -14,7 +14,6 @@ license: mit
 
 [](https://huggingface.co/spaces/GoConqurer/textlens-ocr)
 [](https://github.com/KumarAmrit30/textlens-ocr)
-[](LICENSE)
 [](https://www.python.org/downloads/)
 
 A state-of-the-art Vision-Language Model (VLM) based OCR application that extracts text from images using Microsoft Florence-2 with intelligent fallback systems and enterprise-grade zero downtime deployment.
@@ -56,29 +55,6 @@ A state-of-the-art Vision-Language Model (VLM) based OCR application that extrac
 - **Graceful Shutdown**: Signal handling for clean application restarts
 - **Production Ready**: Scalable architecture with automated deployment
 
-## Architecture
-
-```
-textlens-ocr/
-├── Frontend (Gradio UI)
-│   ├── ui/interface.py # Main interface components
-│   ├── ui/handlers.py # Event handlers & logic
-│   └── ui/styles.py # CSS styling & themes
-├── AI Models
-│   └── models/ocr_processor.py # OCR engine with fallbacks
-├── Utilities
-│   └── utils/image_utils.py # Image preprocessing
-├── Deployment
-│   ├── .github/workflows/ # CI/CD pipelines
-│   ├── scripts/deploy.py # Manual deployment tools
-│   └── deployment.config.yml # Deployment configuration
-├── Documentation
-│   ├── README.md # Main documentation
-│   └── DEPLOYMENT.md # Deployment guide
-├── Configuration
-├── app.py # Main application entry
-└── requirements.txt # Dependencies
-```
 
 ## Quick Start
 
@@ -172,41 +148,7 @@ INTERFACE_WIDTH = "100%"
 | `TRANSFORMERS_CACHE` | Model cache path | `~/.cache/huggingface` |
 | `CUDA_VISIBLE_DEVICES` | GPU selection | All available |
 
-## Deployment
-
-### HuggingFace Spaces (Recommended)
-
-**Automatic Deployment:**
-
-1. Fork this repository
-2. Push to `main`/`master` branch
-3. GitHub Actions automatically deploys to HuggingFace Spaces
-4. Access your deployed app at: `https://huggingface.co/spaces/USERNAME/textlens-ocr`
-
-**Manual Deployment:**
-
-1. Go to [GitHub Actions](https://github.com/KumarAmrit30/textlens-ocr/actions)
-2. Select "Deploy to HuggingFace Spaces"
-3. Click "Run workflow"
-4. Choose deployment type:
-   - **Direct**: Quick deployment to production
-   - **Blue-Green**: Zero downtime with staging validation
-
-### Zero Downtime Deployment
-
-Our enterprise-grade deployment system ensures **zero downtime** for users:
 
-**Features:**
-
-- **Blue-Green Deployment**: Test in staging before production
-- **Health Monitoring**: Automatic health checks with retry logic
-- **Graceful Shutdown**: Clean application restarts
-- **Real-time Monitoring**: Deployment status tracking
-
-**Health Endpoints:**
-
-- `GET /health` - Application health status
-- `GET /ready` - Application readiness check
 
 **Deployment Flow:**
 
@@ -220,170 +162,6 @@ graph LR
 F --> G[Complete ✅]
 ```
 
-### Docker Deployment
-
-```dockerfile
-FROM python:3.9-slim
-
-WORKDIR /app
-COPY requirements.txt .
-RUN pip install -r requirements.txt
-
-COPY . .
-EXPOSE 7860
-
-CMD ["python", "app.py"]
-```
-
-Build and run:
-
-```bash
-docker build -t textlens-ocr .
-docker run -p 7860:7860 textlens-ocr
-```
-
-### Cloud Platforms
-
-| Platform | Status | Guide |
-| ---------------------- | ------------- | ------------------------------------------------------------------- |
-| **HuggingFace Spaces** | ✅ Ready | [Deploy Now](https://huggingface.co/spaces/GoConqurer/textlens-ocr) |
-| **Google Colab** | ✅ Compatible | Open in Colab |
-| **AWS/GCP/Azure** | Docker | Use Docker deployment |
-| **Heroku** | ⚠️ Limited | GPU not available |
-
-## Testing & Development
-
-### Running Tests
-
-```bash
-# Basic functionality test
-python -c "
-from models.ocr_processor import OCRProcessor
-ocr = OCRProcessor()
-print(f'✅ Model loaded: {ocr.get_model_info()}')
-"
-
-# Test with sample image
-python -c "
-from PIL import Image
-from models.ocr_processor import OCRProcessor
-import requests
-
-# Download test image
-img_url = 'https://via.placeholder.com/300x100/000000/FFFFFF?text=Hello+World'
-image = Image.open(requests.get(img_url, stream=True).raw)
-
-# Test OCR
-ocr = OCRProcessor()
-result = ocr.extract_text(image)
-print(f'✅ OCR Result: {result}')
-"
-```
-
-### Development Tools
-
-```bash
-# Install development dependencies
-pip install -r requirements.txt
-
-# Format code
-black . --line-length 88
-
-# Type checking
-mypy models/ utils/ ui/
-
-# Lint code
-flake8 --max-line-length 88
-```
-
-## API Reference
-
-### OCRProcessor Class
-
-```python
-from models.ocr_processor import OCRProcessor
-
-# Initialize processor
-ocr = OCRProcessor(
-    model_name="microsoft/Florence-2-base", # Model selection
-    device=None, # Auto-detect device
-    torch_dtype=None # Auto-select dtype
-)
-
-# Extract text from image
-text = ocr.extract_text(image)
-# Returns: str
-
-# Extract text with bounding boxes
-result = ocr.extract_text_with_regions(image)
-# Returns: dict with text and regions
-
-# Get model information
-info = ocr.get_model_info()
-# Returns: dict with model details
-
-# Cleanup resources
-ocr.cleanup()
-```
-
-### Health Check API
-
-```bash
-# Check application health
-curl https://huggingface.co/spaces/GoConqurer/textlens-ocr/health
-
-# Response:
-{
-  "status": "healthy",
-  "timestamp": 1640995200,
-  "version": "1.0.0",
-  "environment": "production"
-}
-
-# Check readiness
-curl https://huggingface.co/spaces/GoConqurer/textlens-ocr/ready
-
-# Response:
-{
-  "status": "ready",
-  "timestamp": 1640995200
-}
-```
-
-## Troubleshooting
-
-### Common Issues
-
-| Issue | Symptoms | Solution |
-| ----------------------- | ------------------------ | --------------------------------------- |
-| **Model Loading Error** | ImportError, CUDA errors | Check GPU drivers, install CUDA toolkit |
-| **Memory Error** | Out of memory | Reduce batch size, use CPU inference |
-| **SSL Certificate** | SSL errors on macOS | Run certificate update command |
-| **Permission Error** | File access denied | Check file permissions, run as admin |
-
-### Debug Commands
-
-```bash
-# Check CUDA availability
-python -c "import torch; print(f'CUDA: {torch.cuda.is_available()}')"
-
-# Check transformers version
-python -c "import transformers; print(f'Transformers: {transformers.__version__}')"
-
-# Test health endpoint locally
-curl http://localhost:7860/health
-
-# View application logs
-tail -f textlens.log
-```
-
-### Getting Help
-
-1. **Check existing issues**: [GitHub Issues](https://github.com/KumarAmrit30/textlens-ocr/issues)
-2. **Create new issue**: Provide error details and environment info
-3. **Join discussion**: [GitHub Discussions](https://github.com/KumarAmrit30/textlens-ocr/discussions)
-4. **Contact**: Create an issue for direct support
-
 ## Contributing
 
 We welcome contributions! Here's how to get started:
@@ -462,25 +240,6 @@ Special thanks to:
 | **API** | ✅ Stable | v1.0.0 |
 | **Documentation** | ✅ Complete | v1.0.0 |
 
-### Roadmap
-
-- [ ] **Multi-language UI** support
-- [ ] **Batch processing** for multiple images
-- [ ] **API rate limiting** and authentication
-- [ ] **Custom model** fine-tuning support
-- [ ] **Mobile app** development
-- [ ] **Cloud storage** integration
-
-## Support & Community
-
-### Links
-
-- **Homepage**: [GitHub Repository](https://github.com/KumarAmrit30/textlens-ocr)
-- **Live Demo**: [HuggingFace Spaces](https://huggingface.co/spaces/GoConqurer/textlens-ocr)
-- **Issues**: [Report Bugs](https://github.com/KumarAmrit30/textlens-ocr/issues)
-- **Discussions**: [GitHub Discussions](https://github.com/KumarAmrit30/textlens-ocr/discussions)
-- **Documentation**: [Deployment Guide](DEPLOYMENT.md)
-
 ### Stats
 
ui/handlers.py
CHANGED
@@ -16,7 +16,7 @@ def initialize_ocr_processor():
     global ocr_processor
     try:
         logger.info("Initializing OCR processor...")
-        ocr_processor = OCRProcessor(model_name="microsoft/Florence-2-
+        ocr_processor = OCRProcessor(model_name="microsoft/Florence-2-large")
         return True
     except Exception as e:
         logger.error(f"Failed to initialize OCR processor: {str(e)}")