Update README.md
README.md (changed)
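This commit pins the README's examples to the `scb10x/typhoon-ocr-3b` checkpoint: the `vllm serve` commands gain the full model id plus `--max-model-len` and `--served-model-name` flags, the prompt snippets are wrapped in a named `PROMPTS` dict, and `get_prompt` gets a working return statement.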
````diff
@@ -83,7 +83,7 @@ print(markdown)
 
 ```bash
 pip install vllm
-vllm serve scb10x/typhoon-ocr-
+vllm serve scb10x/typhoon-ocr-3b --max-model-len 32000 --served-model-name typhoon-ocr-preview # OpenAI Compatible at http://localhost:8000 (or other port)
 # then you can supply base_url in to ocr_document
 ```
 
````
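The comment in the hunk says the served endpoint can be passed straight to `ocr_document`. A minimal sketch, assuming the `typhoon-ocr` package exposes `ocr_document` with `base_url`/`api_key` keyword arguments (the file name is a placeholder):

```python
# Minimal sketch: point the typhoon-ocr helper at the local vLLM server.
# Assumption: ocr_document takes base_url/api_key kwargs, per the comment above.
from typhoon_ocr import ocr_document

markdown = ocr_document(
    "document.pdf",                        # placeholder path to a PDF or image
    base_url="http://localhost:8000/v1",   # vLLM's OpenAI-compatible endpoint
    api_key="no-key",                      # vLLM ignores the key unless one is configured
)
print(markdown)
```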
````diff
@@ -105,7 +105,7 @@ from openai import OpenAI
 from PIL import Image
 from typhoon_ocr.ocr_utils import render_pdf_to_base64png, get_anchor_text
 
-
+PROMPTS = {
     "default": lambda base_text: (f"Below is an image of a document page along with its dimensions. "
         f"Simply return the markdown representation of this document, presenting tables in markdown format as they naturally appear.\n"
         f"If the document contains images, use a placeholder like dummy.png for each image.\n"
````
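For context, the `PROMPTS` entries are lambdas that fold the PDF's anchor text into the instruction string. A hedged sketch of building one prompt with the helpers imported above; the page number, `pdf_engine` value, and image dimension are assumptions, not values from the diff:

```python
# Render page 1 and extract its anchor text with the imported helpers.
# Parameter names/values here are assumptions; check typhoon_ocr.ocr_utils.
image_base64 = render_pdf_to_base64png("document.pdf", 1, target_longest_image_dim=1800)
anchor_text = get_anchor_text("document.pdf", 1, pdf_engine="pdfreport")

# Each PROMPTS entry maps anchor text -> the final instruction string.
prompt = PROMPTS["default"](anchor_text)
```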
````diff
@@ -128,7 +128,7 @@ def get_prompt(prompt_name: str) -> Callable[[str], str]:
     :param prompt_name: The identifier for the desired prompt.
     :return: The system prompt as a string.
     """
-    return
+    return PROMPTS.get(prompt_name, lambda x: "Invalid PROMPT_NAME provided.")
 
 
 
````
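With that return in place, `get_prompt` resolves a name to a callable and falls back to an error lambda for unknown names:

```python
# Directly from the hunk above: known names return the prompt builder,
# unknown names return a lambda yielding the error string.
prompt_fn = get_prompt("default")
system_prompt = prompt_fn(anchor_text)
assert get_prompt("does-not-exist")("x") == "Invalid PROMPT_NAME provided."
```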
````diff
@@ -169,8 +169,8 @@ print(text_output)
 *(Not Recommended): Local Model - Transformers (GPU Required)*:
 ```python
 # Initialize the model
-model = Qwen2_5_VLForConditionalGeneration.from_pretrained("scb10x/typhoon-ocr-
-processor = AutoProcessor.from_pretrained("scb10x/typhoon-ocr-
+model = Qwen2_5_VLForConditionalGeneration.from_pretrained("scb10x/typhoon-ocr-3b", torch_dtype=torch.bfloat16 ).eval()
+processor = AutoProcessor.from_pretrained("scb10x/typhoon-ocr-3b")
 
 device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
 model.to(device)
````
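To round out the local path, a hedged sketch of one generation step with the model and processor initialized above. The message layout follows the stock Qwen2.5-VL chat-template usage in `transformers`; it is an assumption, not the repo's exact snippet:

```python
# Build a vision chat message from a PIL page image and the prompt.
messages = [{
    "role": "user",
    "content": [
        {"type": "image", "image": image},   # PIL.Image of the rendered page
        {"type": "text", "text": prompt},    # prompt from get_prompt("default")(...)
    ],
}]
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[text], images=[image], return_tensors="pt").to(device)
output_ids = model.generate(**inputs, max_new_tokens=4096, repetition_penalty=1.2)
# Strip the prompt tokens and decode only the newly generated ones.
text_output = processor.batch_decode(
    output_ids[:, inputs.input_ids.shape[1]:], skip_special_tokens=True
)
print(text_output[0])
```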
````diff
@@ -209,7 +209,7 @@ print(text_output[0])
 This model only works with the specific prompts defined below, where `{base_text}` refers to information extracted from the PDF metadata using the `get_anchor_text` function from the `typhoon-ocr` package. It will not function correctly with any other prompts.
 
 ```python
-
+PROMPTS = {
     "default": lambda base_text: (f"Below is an image of a document page along with its dimensions. "
         f"Simply return the markdown representation of this document, presenting tables in markdown format as they naturally appear.\n"
         f"If the document contains images, use a placeholder like dummy.png for each image.\n"
````
````diff
@@ -240,7 +240,7 @@ repetition_penalty: 1.2
 We recommend to inference typhoon-ocr using [vllm](https://github.com/vllm-project/vllm) instead of huggingface transformers, and using typhoon-ocr library to ocr documents. To read more about [vllm](https://docs.vllm.ai/en/latest/getting_started/quickstart.html)
 ```bash
 pip install vllm
-vllm serve scb10x/typhoon-ocr-
+vllm serve scb10x/typhoon-ocr-3b --max-model-len 32000 --served-model-name typhoon-ocr-preview # OpenAI Compatible at http://localhost:8000
 # then you can supply base_url in to ocr_document
 ```
 
````
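Since the README already imports `OpenAI` (see the hunk header above), calling the vLLM server directly looks roughly like this; the payload shape is the standard OpenAI vision format, and `extra_body` is how vLLM-specific sampling options pass through the client:

```python
from openai import OpenAI

# Talk to the vLLM server started above; the key is ignored unless configured.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="no-key")
response = client.chat.completions.create(
    model="typhoon-ocr-preview",  # the --served-model-name from `vllm serve`
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},  # built via get_prompt(...)
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_base64}"}},
        ],
    }],
    temperature=0.1,                         # assumption; pick per the repo's params
    extra_body={"repetition_penalty": 1.2},  # vLLM-specific, per the setting above
)
print(response.choices[0].message.content)
```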