---
title: TextLens - AI-Powered OCR
emoji: 🔍
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: mit
---

# 🔍 TextLens - AI-Powered OCR

[![Deploy to HuggingFace](https://img.shields.io/badge/🤗-Deploy%20to%20Spaces-blue)](https://huggingface.co/spaces/GoConqurer/textlens-ocr)
[![GitHub](https://img.shields.io/badge/GitHub-Repository-green)](https://github.com/KumarAmrit30/textlens-ocr)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)

A Vision-Language Model (VLM) based OCR application that extracts text from images using Microsoft Florence-2, with automatic fallback to EasyOCR and zero-downtime deployment.

## 🚀 Live Demo

**🔗 Try it now:** [https://huggingface.co/spaces/GoConqurer/textlens-ocr](https://huggingface.co/spaces/GoConqurer/textlens-ocr)

![TextLens Demo](https://img.shields.io/badge/Demo-Live-brightgreen)

## ✨ Key Features

### πŸ€– Advanced AI-Powered OCR

- **Microsoft Florence-2 VLM**: State-of-the-art vision-language model for text extraction
- **Intelligent Fallback System**: Automatic fallback to EasyOCR if primary model fails
- **Multi-Model Support**: Florence-2-base and Florence-2-large variants
- **Real-time Processing**: Instant text extraction on image upload
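
The fallback behaviour listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual implementation: `extract_with_fallback`, `failing_primary`, and `simple_fallback` are hypothetical names, and the two stand-in functions only simulate a failing Florence-2 call and a working EasyOCR call.

```python
from typing import Callable


def extract_with_fallback(image, primary: Callable, fallback: Callable) -> dict:
    """Try the primary OCR backend; use the fallback backend if it raises."""
    try:
        return {"text": primary(image), "backend": "primary"}
    except Exception as exc:  # broad on purpose: any primary failure triggers the fallback
        return {
            "text": fallback(image),
            "backend": "fallback",
            "primary_error": str(exc),
        }


# Hypothetical stand-ins for the real model calls.
def failing_primary(image):
    raise RuntimeError("primary model failed to load")


def simple_fallback(image):
    return "hello world"


result = extract_with_fallback(b"fake-image-bytes", failing_primary, simple_fallback)
print(result["backend"], "->", result["text"])
```

Recording which backend produced the text (and why the primary one failed) makes it easier to notice when the application is silently running on the fallback.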

### 🎨 Modern User Experience

- **Clean UI**: Professional Gradio interface with intuitive design
- **Multiple Input Methods**: Upload files, use webcam, or paste from clipboard
- **Copy-to-Clipboard**: One-click text copying functionality
- **Responsive Design**: Works seamlessly on desktop and mobile devices
- **Dark/Light Theme**: Automatic theme adaptation

### ⚑ Performance & Reliability

- **Hardware Acceleration**: Runs on CUDA GPUs and MPS (Apple Silicon), with CPU inference as a fallback
- **Smart Device Detection**: Automatically uses best available hardware
- **Error Resilience**: Robust error handling with graceful degradation
- **Memory Optimization**: Efficient model loading and cleanup
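
One way the device selection described above might look is sketched below. The function names `pick_device` and `detect_device` are hypothetical, and the priority order (CUDA, then MPS, then CPU) is an assumption based on the feature list rather than a description of the project's code. The sketch falls back to CPU when PyTorch is not installed.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return the preferred device name: CUDA first, then MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Query PyTorch if it is installed; otherwise assume CPU."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # torch.backends.mps is missing on older PyTorch builds, so look it up defensively.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps_backend and mps_backend.is_available())
    return pick_device(torch.cuda.is_available(), mps_ok)


print(detect_device())
```

Keeping the priority logic in a pure function makes it testable on machines that have no GPU at all.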

### 🛡️ Enterprise Features

- **Zero Downtime Deployment**: Blue-green deployment with health checks
- **Health Monitoring**: Built-in `/health` and `/ready` endpoints
- **Graceful Shutdown**: Signal handling for clean application restarts
- **Production Ready**: Scalable architecture with automated deployment
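
The `/health` and `/ready` endpoints could behave along these lines. This is only a sketch using the Python standard library: the README does not say how the endpoints are implemented, so the `STATE` dictionary, the `probe` function, the response bodies, and port 8080 are all assumptions made for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical application state; a real app would flip this after loading the model.
STATE = {"model_loaded": False}


def probe(path: str, state: dict) -> tuple[int, dict]:
    """Map a probe path to an HTTP status code and a JSON-serializable body."""
    if path == "/health":
        # Liveness: the process is up and able to answer.
        return 200, {"status": "healthy"}
    if path == "/ready":
        # Readiness: accept traffic only once the model is loaded.
        if state.get("model_loaded"):
            return 200, {"status": "ready"}
        return 503, {"status": "loading"}
    return 404, {"error": "not found"}


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = probe(self.path, STATE)
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8080) -> None:
    """Blocking helper; call it from a separate thread or process."""
    HTTPServer(("0.0.0.0", port), ProbeHandler).serve_forever()
```

Separating liveness from readiness lets a deployment script wait on `/ready` before switching traffic, while `/health` only confirms the process has not died.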


## 🚀 Quick Start

### 🌐 Online (Recommended)

**Instant access** - No installation required:
👉 [**Launch TextLens**](https://huggingface.co/spaces/GoConqurer/textlens-ocr)

### 💻 Local Development

1. **Clone Repository**

   ```bash
   git clone https://github.com/KumarAmrit30/textlens-ocr.git
   cd textlens-ocr
   ```

2. **Setup Environment**

   ```bash
   python -m venv textlens_env
   source textlens_env/bin/activate  # Windows: textlens_env\Scripts\activate
   pip install -r requirements.txt
   ```

3. **Launch Application**

   ```bash
   python app.py
   ```

   🌐 Open: `http://localhost:7860`

### 🧪 Quick Test

```bash
# Verify installation
python -c "from models.ocr_processor import OCRProcessor; print('✅ TextLens ready!')"
```

## 📊 Model Performance

| Model                | Size  | Speed     | Accuracy     | Best For               |
| -------------------- | ----- | --------- | ------------ | ---------------------- |
| **Florence-2-base**  | 270M  | ⚡ Fast   | 📈 High      | General OCR, Real-time |
| **Florence-2-large** | 770M  | 🐌 Medium | 📊 Very High | High accuracy needs    |
| **EasyOCR**          | ~100M | 🚀 Medium | 📋 Good      | Fallback, Multilingual |

## 🎯 Supported Use Cases

| Category            | Examples                        | Performance |
| ------------------- | ------------------------------- | ----------- |
| πŸ“„ **Documents**    | PDFs, Scanned papers, Forms     | ⭐⭐⭐⭐⭐  |
| 🧾 **Receipts**     | Shopping receipts, Invoices     | ⭐⭐⭐⭐    |
| πŸ“± **Screenshots**  | App interfaces, Error messages  | ⭐⭐⭐⭐⭐  |
| πŸš— **Vehicle**      | License plates, VIN numbers     | ⭐⭐⭐⭐    |
| πŸ“š **Books**        | Printed text, Handwritten notes | ⭐⭐⭐⭐    |
| 🌐 **Multilingual** | Multiple languages              | ⭐⭐⭐      |

## 🔧 Configuration

### 🎛️ Model Selection

```python
from models.ocr_processor import OCRProcessor

# Fast inference (recommended)
ocr = OCRProcessor(model_name="microsoft/Florence-2-base")

# Maximum accuracy
ocr = OCRProcessor(model_name="microsoft/Florence-2-large")
```

### 🎨 UI Customization

Modify `ui/styles.py` to customize appearance:

```python
# Change color scheme
PRIMARY_COLOR = "#1f77b4"
SECONDARY_COLOR = "#ff7f0e"

# Update layout
INTERFACE_WIDTH = "100%"
```

### ⚙️ Environment Variables

| Variable               | Description          | Default                |
| ---------------------- | -------------------- | ---------------------- |
| `SPACE_ID`             | HuggingFace Space ID | Auto-detected          |
| `DEPLOYMENT_STAGE`     | Deployment stage     | `production`           |
| `TRANSFORMERS_CACHE`   | Model cache path     | `~/.cache/huggingface` |
| `CUDA_VISIBLE_DEVICES` | GPU selection        | All available          |
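
As an illustration of how these variables might be read with the defaults from the table, here is a small sketch. The `Settings` dataclass and `load_settings` function are hypothetical and not part of the project; variables whose default is "Auto-detected" or "All available" are simply left as `None` when unset.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    space_id: Optional[str]
    deployment_stage: str
    cache_dir: str
    cuda_visible_devices: Optional[str]


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read configuration from the environment, applying the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        space_id=env.get("SPACE_ID"),
        deployment_stage=env.get("DEPLOYMENT_STAGE", "production"),
        cache_dir=os.path.expanduser(
            env.get("TRANSFORMERS_CACHE", "~/.cache/huggingface")
        ),
        cuda_visible_devices=env.get("CUDA_VISIBLE_DEVICES"),
    )


print(load_settings({"DEPLOYMENT_STAGE": "staging"}))
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to test without touching the real environment.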



**Deployment Flow:**

```mermaid
graph LR
    A[Code Push] --> B[Validate]
    B --> C[Deploy Staging]
    C --> D[Health Check]
    D --> E[Deploy Production]
    E --> F[Verify]
    F --> G[Complete ✅]
```
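
The flow in the diagram can be read as a sequence of gated steps in which a failure stops everything after it. The sketch below models only that idea; `run_pipeline` and the step names are hypothetical, and each lambda stands in for real validation, deployment, or health-check logic that this README does not describe.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], bool]]


def run_pipeline(steps: List[Step]) -> Tuple[bool, List[str]]:
    """Run steps in order; stop at the first one that returns False."""
    completed: List[str] = []
    for name, action in steps:
        if not action():
            return False, completed
        completed.append(name)
    return True, completed


# Placeholder steps mirroring the diagram; each would call real tooling in practice.
steps: List[Step] = [
    ("validate", lambda: True),
    ("deploy_staging", lambda: True),
    ("health_check", lambda: True),
    ("deploy_production", lambda: True),
    ("verify", lambda: True),
]

ok, done = run_pipeline(steps)
print(ok, done)
```

Because production is only reached after the staging health check passes, a failed check leaves the currently running production version untouched.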

## 🀝 Contributing

We welcome contributions! Here's how to get started:

### πŸ”§ Development Setup

1. **Fork & Clone**

   ```bash
   git clone https://github.com/YOUR_USERNAME/textlens-ocr.git
   cd textlens-ocr
   ```

2. **Create Branch**

   ```bash
   git checkout -b feature/your-feature-name
   ```

3. **Make Changes**

   - Add new features or fix bugs
   - Update tests and documentation
   - Follow code style guidelines

4. **Test Changes**

   ```bash
   python -m pytest tests/
   python -c "from models.ocr_processor import OCRProcessor; OCRProcessor()"
   ```

5. **Submit PR**

   ```bash
   git add .
   git commit -m "feat: add your feature description"
   git push origin feature/your-feature-name
   ```

### 📝 Contribution Guidelines

- **Code Style**: Follow PEP 8, use Black formatter
- **Documentation**: Update README and docstrings
- **Tests**: Add tests for new functionality
- **Commits**: Use conventional commit messages
- **Issues**: Link PRs to relevant issues

## 📄 License

This project is licensed under the **MIT License** - see the [LICENSE](LICENSE) file for details.

### 🙏 Third-Party Licenses

- **Microsoft Florence-2**: [MIT License](https://github.com/microsoft/Florence)
- **HuggingFace Transformers**: [Apache License 2.0](https://github.com/huggingface/transformers)
- **Gradio**: [Apache License 2.0](https://github.com/gradio-app/gradio)
- **EasyOCR**: [Apache License 2.0](https://github.com/JaidedAI/EasyOCR)

## 🌟 Acknowledgments

Special thanks to:

- **Microsoft Research** for the incredible Florence-2 vision-language model
- **HuggingFace** for the transformers library and Spaces platform
- **Gradio Team** for the amazing web interface framework
- **JaidedAI** for EasyOCR fallback capabilities
- **Open Source Community** for continuous support and contributions

## 📈 Project Status

| Component         | Status        | Version |
| ----------------- | ------------- | ------- |
| **Core OCR**      | ✅ Stable     | v1.0.0  |
| **Web UI**        | ✅ Stable     | v1.0.0  |
| **Deployment**    | ✅ Production | v1.0.0  |
| **API**           | ✅ Stable     | v1.0.0  |
| **Documentation** | ✅ Complete   | v1.0.0  |

### 📊 Stats

![GitHub stars](https://img.shields.io/github/stars/KumarAmrit30/textlens-ocr?style=social)
![GitHub forks](https://img.shields.io/github/forks/KumarAmrit30/textlens-ocr?style=social)
![GitHub watchers](https://img.shields.io/github/watchers/KumarAmrit30/textlens-ocr?style=social)

---

<div align="center">

**Made with ❤️ for the AI community**

[⭐ Star this repo](https://github.com/KumarAmrit30/textlens-ocr) • [🔗 Try the demo](https://huggingface.co/spaces/GoConqurer/textlens-ocr) • [📖 Read docs](DEPLOYMENT.md)

</div>