---
title: TextLens - AI-Powered OCR
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: mit
---
# TextLens - AI-Powered OCR
[Hugging Face Space](https://huggingface.co/spaces/GoConqurer/textlens-ocr) • [GitHub repository](https://github.com/KumarAmrit30/textlens-ocr) • [Python downloads](https://www.python.org/downloads/)

A vision-language model (VLM) based OCR application that extracts text from images using Microsoft Florence-2, with automatic fallback to EasyOCR and zero-downtime deployment.
## Live Demo
**Try it now:** [https://huggingface.co/spaces/GoConqurer/textlens-ocr](https://huggingface.co/spaces/GoConqurer/textlens-ocr)
## Key Features
### Advanced AI-Powered OCR
- **Microsoft Florence-2 VLM**: State-of-the-art vision-language model for text extraction
- **Intelligent Fallback System**: Automatic fallback to EasyOCR if the primary model fails
- **Multi-Model Support**: Florence-2-base and Florence-2-large variants
- **Real-time Processing**: Text extraction starts as soon as an image is uploaded
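
The fallback behaviour can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `extract_with_fallback` and the two stub engines are hypothetical names, and the stubs stand in for Florence-2 and EasyOCR rather than calling them.

```python
from typing import Callable, Sequence, Tuple

# Each engine is a (name, callable) pair; the callable maps image bytes to text.
Engine = Tuple[str, Callable[[bytes], str]]


def extract_with_fallback(image: bytes, engines: Sequence[Engine]) -> Tuple[str, str]:
    """Try each engine in order; return (engine_name, text) from the first success."""
    errors = []
    for name, run in engines:
        try:
            text = run(image)
        except Exception as exc:  # a real app would catch narrower exception types
            errors.append(f"{name}: {exc}")
            continue
        return name, text
    raise RuntimeError("all OCR engines failed: " + "; ".join(errors))


# Stand-in engines for illustration only (hypothetical, no real model calls).
def failing_primary(image: bytes) -> str:
    raise RuntimeError("model not loaded")


def stub_fallback(image: bytes) -> str:
    return "hello"


engine_used, text = extract_with_fallback(
    b"fake-image-bytes",
    [("florence-2", failing_primary), ("easyocr", stub_fallback)],
)
print(engine_used, text)
```

Returning the engine name alongside the text lets the UI or logs show which model produced the result.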
### Modern User Experience
- **Clean UI**: Professional Gradio interface with intuitive design
- **Multiple Input Methods**: Upload files, use webcam, or paste from clipboard
- **Copy-to-Clipboard**: One-click text copying functionality
- **Responsive Design**: Works seamlessly on desktop and mobile devices
- **Dark/Light Theme**: Automatic theme adaptation
### Performance & Reliability
- **GPU Acceleration**: Supports CUDA, MPS (Apple Silicon), and CPU inference
- **Smart Device Detection**: Automatically uses the best available hardware
- **Error Resilience**: Robust error handling with graceful degradation
- **Memory Optimization**: Efficient model loading and cleanup
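
One plausible way to implement device detection is to prefer CUDA, then MPS, then CPU. The sketch below assumes that priority order and uses hypothetical helper names (`pick_device`, `detect_device`). The only PyTorch calls are `torch.cuda.is_available()` and `torch.backends.mps.is_available()`, and the code falls back to CPU when PyTorch is not installed.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return a device string, assuming the priority CUDA > MPS > CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Query PyTorch for available backends; default to CPU if PyTorch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # Older PyTorch builds may not expose the MPS backend at all.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps_backend is not None and mps_backend.is_available())
    return pick_device(torch.cuda.is_available(), mps_ok)


print(detect_device())
```

Keeping the priority logic in a pure function makes it easy to test without any GPU present.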
### Enterprise Features
- **Zero Downtime Deployment**: Blue-green deployment with health checks
- **Health Monitoring**: Built-in `/health` and `/ready` endpoints
- **Graceful Shutdown**: Signal handling for clean application restarts
- **Production Ready**: Scalable architecture with automated deployment
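
To make the `/health` and `/ready` endpoints and the graceful shutdown concrete, here is a minimal standard-library sketch. It is an assumption about how such probes could work, not the application's implementation (a Gradio app would more likely expose them through its own web server). The response payloads and the names `ProbeHandler`, `model_ready`, `start_probe_server`, and `install_shutdown_handler` are all hypothetical.

```python
import json
import signal
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Set once the model has finished loading; /ready answers 503 until then.
model_ready = threading.Event()


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            # Liveness: the process is up and able to answer requests.
            self._send(200, {"status": "ok"})
        elif self.path == "/ready":
            # Readiness: only report ready once the model is loaded.
            if model_ready.is_set():
                self._send(200, {"status": "ready"})
            else:
                self._send(503, {"status": "loading"})
        else:
            self._send(404, {"error": "not found"})

    def _send(self, code, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep probe traffic out of the logs


def start_probe_server(port: int = 0) -> ThreadingHTTPServer:
    """Serve probes on a background thread; port 0 lets the OS pick a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), ProbeHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def install_shutdown_handler(server: ThreadingHTTPServer) -> None:
    """Stop the server cleanly on SIGTERM/SIGINT. Must be called from the main thread."""
    def handle(signum, frame):
        server.shutdown()      # ask serve_forever() to return
        server.server_close()  # release the listening socket

    signal.signal(signal.SIGTERM, handle)
    signal.signal(signal.SIGINT, handle)
```

A deployment script would call `start_probe_server()`, load the model, then call `model_ready.set()` so that traffic is only routed to the instance once `/ready` returns 200.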
## Quick Start
### Online (Recommended)
**Instant access** - No installation required:
[**Launch TextLens**](https://huggingface.co/spaces/GoConqurer/textlens-ocr)
### Local Development
1. **Clone Repository**
   ```bash
   git clone https://github.com/KumarAmrit30/textlens-ocr.git
   cd textlens-ocr
   ```
2. **Set Up Environment**
   ```bash
   python -m venv textlens_env
   source textlens_env/bin/activate  # Windows: textlens_env\Scripts\activate
   pip install -r requirements.txt
   ```
3. **Launch Application**
   ```bash
   python app.py
   ```
   Then open `http://localhost:7860` in your browser.
### Quick Test
```bash
# Verify installation
python -c "from models.ocr_processor import OCRProcessor; print('TextLens ready!')"
```
## Model Performance
| Model                | Size  | Speed  | Accuracy  | Best For               |
| -------------------- | ----- | ------ | --------- | ---------------------- |
| **Florence-2-base**  | 270M  | Fast   | High      | General OCR, Real-time |
| **Florence-2-large** | 770M  | Medium | Very High | High accuracy needs    |
| **EasyOCR**          | ~100M | Medium | Good      | Fallback, Multilingual |
## Supported Use Cases
| Category         | Examples                        | Performance (out of 5) |
| ---------------- | ------------------------------- | ---------------------- |
| **Documents**    | PDFs, Scanned papers, Forms     | 5/5                    |
| **Receipts**     | Shopping receipts, Invoices     | 4/5                    |
| **Screenshots**  | App interfaces, Error messages  | 5/5                    |
| **Vehicle**      | License plates, VIN numbers     | 4/5                    |
| **Books**        | Printed text, Handwritten notes | 4/5                    |
| **Multilingual** | Multiple languages              | 3/5                    |
## Configuration
### Model Selection
```python
from models.ocr_processor import OCRProcessor

# Fast inference (recommended)
ocr = OCRProcessor(model_name="microsoft/Florence-2-base")

# Maximum accuracy
ocr = OCRProcessor(model_name="microsoft/Florence-2-large")
```
### UI Customization
Modify `ui/styles.py` to customize appearance:
```python
# Change color scheme
PRIMARY_COLOR = "#1f77b4"
SECONDARY_COLOR = "#ff7f0e"

# Update layout
INTERFACE_WIDTH = "100%"
```
### Environment Variables
| Variable               | Description           | Default                |
| ---------------------- | --------------------- | ---------------------- |
| `SPACE_ID`             | Hugging Face Space ID | Auto-detected          |
| `DEPLOYMENT_STAGE`     | Deployment stage      | `production`           |
| `TRANSFORMERS_CACHE`   | Model cache path      | `~/.cache/huggingface` |
| `CUDA_VISIBLE_DEVICES` | GPU selection         | All available          |
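
As an illustration of how these variables and their defaults might be read at startup, the sketch below collects them into a small settings object. `Settings` and `load_settings` are hypothetical names, not part of the project; only the variable names and defaults come from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    space_id: Optional[str]              # None when not running inside a Space
    deployment_stage: str
    transformers_cache: str
    cuda_visible_devices: Optional[str]  # None means all GPUs are visible


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, applying the defaults from the table."""
    return Settings(
        space_id=env.get("SPACE_ID"),
        deployment_stage=env.get("DEPLOYMENT_STAGE", "production"),
        transformers_cache=env.get(
            "TRANSFORMERS_CACHE", os.path.expanduser("~/.cache/huggingface")
        ),
        cuda_visible_devices=env.get("CUDA_VISIBLE_DEVICES"),
    )


settings = load_settings()
print(settings.deployment_stage)
```

Passing the environment as a mapping keeps the function easy to test with a plain dictionary.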

**Deployment Flow:**
```mermaid
graph LR
    A[Code Push] --> B[Validate]
    B --> C[Deploy Staging]
    C --> D[Health Check]
    D --> E[Deploy Production]
    E --> F[Verify]
    F --> G[Complete]
```
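
The health-check gate between staging and production in the flow above could look roughly like this. It is a sketch under stated assumptions: `wait_until_healthy` and `promote_if_healthy` are hypothetical helpers, and the probe and promote callables are placeholders for whatever the real pipeline uses (for example, an HTTP request to `/health` and a deployment command).

```python
import time
from typing import Callable


def wait_until_healthy(
    probe: Callable[[], bool],
    attempts: int = 5,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe() up to `attempts` times, pausing between failed tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:  # no need to wait after the final attempt
            sleep(delay_s)
    return False


def promote_if_healthy(probe: Callable[[], bool], promote: Callable[[], None], **kwargs) -> bool:
    """Run promote() only if the staging probe passes; report whether it ran."""
    if wait_until_healthy(probe, **kwargs):
        promote()
        return True
    return False


# Example with stand-in callables: staging becomes healthy on the third check.
checks = iter([False, False, True])
promoted = promote_if_healthy(
    probe=lambda: next(checks),
    promote=lambda: print("promoting staging to production"),
    delay_s=0.0,
)
print(promoted)
```

If the probe never passes, production is left untouched, which is what keeps the deployment zero-downtime.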
## Contributing
We welcome contributions! Here's how to get started:
### Development Setup
1. **Fork & Clone**
   ```bash
   git clone https://github.com/YOUR_USERNAME/textlens-ocr.git
   cd textlens-ocr
   ```
2. **Create Branch**
   ```bash
   git checkout -b feature/your-feature-name
   ```
3. **Make Changes**
   - Add new features or fix bugs
   - Update tests and documentation
   - Follow code style guidelines
4. **Test Changes**
   ```bash
   python -m pytest tests/
   python -c "from models.ocr_processor import OCRProcessor; OCRProcessor()"
   ```
5. **Submit PR**
   ```bash
   git add .
   git commit -m "feat: add your feature description"
   git push origin feature/your-feature-name
   ```
   Then open a pull request from your branch against the upstream repository.
### Contribution Guidelines
- **Code Style**: Follow PEP 8 and format code with Black
- **Documentation**: Update README and docstrings
- **Tests**: Add tests for new functionality
- **Commits**: Use conventional commit messages
- **Issues**: Link PRs to relevant issues
## License
This project is licensed under the **MIT License** - see the [LICENSE](LICENSE) file for details.
### Third-Party Licenses
- **Microsoft Florence-2**: [MIT License](https://github.com/microsoft/Florence)
- **Hugging Face Transformers**: [Apache License 2.0](https://github.com/huggingface/transformers)
- **Gradio**: [Apache License 2.0](https://github.com/gradio-app/gradio)
- **EasyOCR**: [Apache License 2.0](https://github.com/JaidedAI/EasyOCR)
## Acknowledgments
Special thanks to:
- **Microsoft Research** for the Florence-2 vision-language model
- **Hugging Face** for the Transformers library and the Spaces platform
- **Gradio Team** for the web interface framework
- **JaidedAI** for EasyOCR, which provides the fallback capability
- **Open Source Community** for continuous support and contributions
## Project Status
| Component         | Status     | Version |
| ----------------- | ---------- | ------- |
| **Core OCR**      | Stable     | v1.0.0  |
| **Web UI**        | Stable     | v1.0.0  |
| **Deployment**    | Production | v1.0.0  |
| **API**           | Stable     | v1.0.0  |
| **Documentation** | Complete   | v1.0.0  |

---

<div align="center">

**Made with ❤️ for the AI community**

[Star this repo](https://github.com/KumarAmrit30/textlens-ocr) • [Try the demo](https://huggingface.co/spaces/GoConqurer/textlens-ocr) • [Read docs](DEPLOYMENT.md)

</div>