kunato committed
Commit f0b40f2 · verified · 1 Parent(s): 56e5bed

Update README.md

Files changed (1):
  1. README.md +7 -7
README.md CHANGED
@@ -83,7 +83,7 @@ print(markdown)
 
 ```bash
 pip install vllm
-vllm serve scb10x/typhoon-ocr-7b --max-model-len 32000 --served-model-name typhoon-ocr-preview # OpenAI Compatible at http://localhost:8000 (or other port)
+vllm serve scb10x/typhoon-ocr-3b --max-model-len 32000 --served-model-name typhoon-ocr-preview # OpenAI Compatible at http://localhost:8000 (or other port)
 # then you can supply base_url to ocr_document
 ```
 
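The comment above refers to the `typhoon_ocr` helper package. As a minimal sketch of that call, assuming `ocr_document` takes the input path positionally and accepts `base_url`/`api_key` keyword arguments (the diff only confirms `base_url`; check the package for the exact signature):

```python
from typhoon_ocr import ocr_document

# Point the helper at the local vllm server started above.
markdown = ocr_document(
    "document.pdf",                       # hypothetical input file
    base_url="http://localhost:8000/v1",  # the OpenAI-compatible endpoint
    api_key="no-key",                     # assumption: vllm accepts any key by default
)
print(markdown)
```
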
@@ -105,7 +105,7 @@ from openai import OpenAI
 from PIL import Image
 from typhoon_ocr.ocr_utils import render_pdf_to_base64png, get_anchor_text
 
-PROMPTS_SYS = {
+PROMPTS = {
     "default": lambda base_text: (f"Below is an image of a document page along with its dimensions. "
         f"Simply return the markdown representation of this document, presenting tables in markdown format as they naturally appear.\n"
         f"If the document contains images, use a placeholder like dummy.png for each image.\n"
@@ -128,7 +128,7 @@ def get_prompt(prompt_name: str) -> Callable[[str], str]:
     :param prompt_name: The identifier for the desired prompt.
     :return: The system prompt as a string.
     """
-    return PROMPTS_SYS.get(prompt_name, lambda x: "Invalid PROMPT_NAME provided.")
+    return PROMPTS.get(prompt_name, lambda x: "Invalid PROMPT_NAME provided.")
 
 
 
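Taken together, the hunks above sketch the manual flow: extract anchor text from the PDF, render the page to an image, build the prompt, and call the served model. A minimal sketch, assuming `get_anchor_text` and `render_pdf_to_base64png` take a path and a page number (their exact signatures are not shown in this diff, so treat those two calls as illustrative):

```python
# Build the prompt from the PDF metadata and render page 1 to an image.
anchor_text = get_anchor_text("document.pdf", 1)        # assumed args: path, page number
image_b64 = render_pdf_to_base64png("document.pdf", 1)  # assumed args: path, page number
prompt = get_prompt("default")(anchor_text)             # fills base_text into the template

# Call the vllm server through its OpenAI-compatible endpoint.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="no-key")
response = client.chat.completions.create(
    model="typhoon-ocr-preview",  # matches --served-model-name above
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```
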
@@ -169,8 +169,8 @@ print(text_output)
 *(Not Recommended): Local Model - Transformers (GPU Required)*:
 ```python
 # Initialize the model
-model = Qwen2_5_VLForConditionalGeneration.from_pretrained("scb10x/typhoon-ocr-7b", torch_dtype=torch.bfloat16).eval()
-processor = AutoProcessor.from_pretrained("scb10x/typhoon-ocr-7b")
+model = Qwen2_5_VLForConditionalGeneration.from_pretrained("scb10x/typhoon-ocr-3b", torch_dtype=torch.bfloat16).eval()
+processor = AutoProcessor.from_pretrained("scb10x/typhoon-ocr-3b")
 
 device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
 model.to(device)
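The hunk cuts off before the generation step that the surrounding context references (`print(text_output)`). A minimal sketch of that step, following standard Qwen2.5-VL usage rather than the README's exact code; the image path and `max_new_tokens` value are placeholders:

```python
# Build a chat message with an image slot plus the prompt from get_prompt(...).
messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": prompt},
    ],
}]
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
image = Image.open("page.png")  # hypothetical rendered page image
inputs = processor(text=[text], images=[image], return_tensors="pt").to(device)

output_ids = model.generate(**inputs, max_new_tokens=4096, repetition_penalty=1.2)
# Decode only the newly generated tokens, not the echoed prompt.
new_tokens = output_ids[:, inputs["input_ids"].shape[1]:]
text_output = processor.batch_decode(new_tokens, skip_special_tokens=True)
print(text_output[0])
```
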
@@ -209,7 +209,7 @@ print(text_output[0])
 This model only works with the specific prompts defined below, where `{base_text}` refers to information extracted from the PDF metadata using the `get_anchor_text` function from the `typhoon-ocr` package. It will not function correctly with any other prompts.
 
 ```python
-PROMPTS_SYS = {
+PROMPTS = {
     "default": lambda base_text: (f"Below is an image of a document page along with its dimensions. "
         f"Simply return the markdown representation of this document, presenting tables in markdown format as they naturally appear.\n"
         f"If the document contains images, use a placeholder like dummy.png for each image.\n"
@@ -240,7 +240,7 @@ repetition_penalty: 1.2
 We recommend running inference for typhoon-ocr with [vllm](https://github.com/vllm-project/vllm) rather than Hugging Face transformers, and using the typhoon-ocr library to OCR documents. For more on vllm, see the [quickstart guide](https://docs.vllm.ai/en/latest/getting_started/quickstart.html).
 ```bash
 pip install vllm
-vllm serve scb10x/typhoon-ocr-7b --max-model-len 32000 --served-model-name typhoon-ocr-preview # OpenAI Compatible at http://localhost:8000
+vllm serve scb10x/typhoon-ocr-3b --max-model-len 32000 --served-model-name typhoon-ocr-preview # OpenAI Compatible at http://localhost:8000
 # then you can supply base_url to ocr_document
 ```
 
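The hunk header above carries the README's recommended decoding setting (`repetition_penalty: 1.2`). When calling the vllm server through the OpenAI client, sampling parameters outside the OpenAI schema go through `extra_body`; a minimal sketch:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="no-key")
response = client.chat.completions.create(
    model="typhoon-ocr-preview",                    # matches --served-model-name above
    messages=[{"role": "user", "content": "..."}],  # build the content as in the earlier sketch
    extra_body={"repetition_penalty": 1.2},         # vllm reads extra sampling params from extra_body
)
print(response.choices[0].message.content)
```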
 