metadata

title: TextLens - AI-Powered OCR
emoji: 🔍
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: mit

🔍 TextLens - AI-Powered OCR

A state-of-the-art Vision-Language Model (VLM) based OCR application that extracts text from images using Microsoft Florence-2 with intelligent fallback systems and enterprise-grade zero downtime deployment.

🚀 Live Demo

🔗 Try it now: https://huggingface.co/spaces/GoConqurer/textlens-ocr

✨ Key Features

🤖 Advanced AI-Powered OCR

Microsoft Florence-2 VLM: State-of-the-art vision-language model for text extraction
Intelligent Fallback System: Automatic fallback to EasyOCR if primary model fails
Multi-Model Support: Florence-2-base and Florence-2-large variants
Real-time Processing: Instant text extraction on image upload

🎨 Modern User Experience

Clean UI: Professional Gradio interface with intuitive design
Multiple Input Methods: Upload files, use webcam, or paste from clipboard
Copy-to-Clipboard: One-click text copying functionality
Responsive Design: Works seamlessly on desktop and mobile devices
Dark/Light Theme: Automatic theme adaptation

⚡ Performance & Reliability

GPU Acceleration: Supports CUDA, MPS (Apple Silicon), and CPU inference
Smart Device Detection: Automatically uses best available hardware
Error Resilience: Robust error handling with graceful degradation
Memory Optimization: Efficient model loading and cleanup

🛡️ Enterprise Features

Zero Downtime Deployment: Blue-green deployment with health checks
Health Monitoring: Built-in /health and /ready endpoints
Graceful Shutdown: Signal handling for clean application restarts
Production Ready: Scalable architecture with automated deployment

🚀 Quick Start

🌐 Online (Recommended)

Instant access - No installation required: 👉 Launch TextLens

💻 Local Development

Clone Repository

git clone https://github.com/KumarAmrit30/textlens-ocr.git
cd textlens-ocr

Setup Environment

python -m venv textlens_env
source textlens_env/bin/activate  # Windows: textlens_env\Scripts\activate
pip install -r requirements.txt

Launch Application
```
python app.py
```
🌐 Open: http://localhost:7860

🧪 Quick Test

# Verify installation
python -c "from models.ocr_processor import OCRProcessor; print('✅ TextLens ready!')"

📊 Model Performance

Model	Size	Speed	Accuracy	Best For
Florence-2-base	270M	⚡ Fast	📈 High	General OCR, Real-time
Florence-2-large	770M	🐌 Medium	📊 Very High	High accuracy needs
EasyOCR	~100M	🚀 Medium	📋 Good	Fallback, Multilingual

🎯 Supported Use Cases

Category	Examples	Performance
📄 Documents	PDFs, Scanned papers, Forms	⭐⭐⭐⭐⭐
🧾 Receipts	Shopping receipts, Invoices	⭐⭐⭐⭐
📱 Screenshots	App interfaces, Error messages	⭐⭐⭐⭐⭐
🚗 Vehicle	License plates, VIN numbers	⭐⭐⭐⭐
📚 Books	Printed text, Handwritten notes	⭐⭐⭐⭐
🌐 Multilingual	Multiple languages	⭐⭐⭐

🔧 Configuration

🎛️ Model Selection

from models.ocr_processor import OCRProcessor

# Fast inference (recommended)
ocr = OCRProcessor(model_name="microsoft/Florence-2-base")

# Maximum accuracy
ocr = OCRProcessor(model_name="microsoft/Florence-2-large")

🎨 UI Customization

Modify ui/styles.py to customize appearance:

# Change color scheme
PRIMARY_COLOR = "#1f77b4"
SECONDARY_COLOR = "#ff7f0e"

# Update layout
INTERFACE_WIDTH = "100%"

⚙️ Environment Variables

Variable	Description	Default
`SPACE_ID`	HuggingFace Space ID	Auto-detected
`DEPLOYMENT_STAGE`	deployment stage	`production`
`TRANSFORMERS_CACHE`	Model cache path	`~/.cache/huggingface`
`CUDA_VISIBLE_DEVICES`	GPU selection	All available

Deployment Flow:

graph LR
    A[Code Push] --> B[Validate]
    B --> C[Deploy Staging]
    C --> D[Health Check]
    D --> E[Deploy Production]
    E --> F[Verify]
    F --> G[Complete ✅]

🤝 Contributing

We welcome contributions! Here's how to get started:

🔧 Development Setup

Fork & Clone

git clone https://github.com/YOUR_USERNAME/textlens-ocr.git
cd textlens-ocr

Create Branch

git checkout -b feature/your-feature-name

Make Changes
- Add new features or fix bugs
- Update tests and documentation
- Follow code style guidelines

Test Changes

python -m pytest tests/
python -c "from models.ocr_processor import OCRProcessor; OCRProcessor()"

Submit PR

git add .
git commit -m "feat: add your feature description"
git push origin feature/your-feature-name

📝 Contribution Guidelines

Code Style: Follow PEP 8, use Black formatter
Documentation: Update README and docstrings
Tests: Add tests for new functionality
Commits: Use conventional commit messages
Issues: Link PRs to relevant issues

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Third-Party Licenses

Microsoft Florence-2: MIT License
HuggingFace Transformers: Apache License 2.0
Gradio: Apache License 2.0
EasyOCR: Apache License 2.0

🌟 Acknowledgments

Special thanks to:

Microsoft Research for the incredible Florence-2 vision-language model
HuggingFace for the transformers library and Spaces platform
Gradio Team for the amazing web interface framework
JaidedAI for EasyOCR fallback capabilities
Open Source Community for continuous support and contributions

📈 Project Status

Component	Status	Version
Core OCR	✅ Stable	v1.0.0
Web UI	✅ Stable	v1.0.0
Deployment	✅ Production	v1.0.0
API	✅ Stable	v1.0.0
Documentation	✅ Complete	v1.0.0

📊 Stats

Made with ❤️ for the AI community

⭐ Star this repo • 🔗 Try the demo • 📖 Read docs