Spaces:

Adun
/

typhoon-ocr-v1

Runtime error

File size: 1,666 Bytes

34d8f3a

---

title: Typhoon OCR
emoji: 🌍
colorFrom: gray
colorTo: red
sdk: gradio
sdk_version: 5.29.1
app_file: app.py
pinned: false
license: apache-2.0
short_description: Convert Image & PDF to Markdown
---

## Typhoon OCR

Typhoon OCR is a model for extracting structured markdown from images or PDFs. It supports document layout analysis and table extraction, returning results in markdown or HTML. This package is a simple Gradio website to demonstrate the performance of Typhoon OCR.


### Features
- Upload a PDF or image (single page)
- Extracts and reconstructs document content as markdown
- Supports different prompt modes for layout or structure
- Language: English, Thai
- Uses a local or remote OpenAI-compatible API (e.g., vllm)

### Install
```bash

pip install -r requirements.txt

# edit .env

# pip install vllm # optional for hosting a local server

```

### Mac specific
```

brew install poppler

# The following binaries are required and provided by poppler:

# - pdfinfo

# - pdftoppm

```
### Linux specific
```

sudo apt-get update

sudo apt-get install poppler-utils

# The following binaries are required and provided by poppler-utils:

# - pdfinfo

# - pdftoppm

```


### Start vllm
```bash

vllm serve scb10x/typhoon-ocr-7b --served-model-name typhoon-ocr --dtype bfloat16 --port 8101

```

### Run Gradio demo
```bash

python app.py

```

### Dependencies
- openai
- python-dotenv
- ftfy
- pypdf
- gradio
- vllm (for hosting an inference server)
- pillow

### License
This project is licensed under the Apache 2.0 License. See individual datasets and checkpoints for their respective licenses.