Commit c1c7f1e · 1 Parent(s): 58fe08c
Committed by dung-vpt-uney

Deploy CoRGI demo - 2025-10-29 14:27:36

Features:
- Structured reasoning with CoRGI protocol
- ROI extraction using Qwen3-VL grounding
- Visual evidence synthesis
- Gradio UI with per-step visualization

Model: Qwen/Qwen3-VL-8B-Thinking
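For orientation, here is a minimal sketch of how the four stages above might be driven from Python. The imports mirror the identifiers visible in the `corgi/cli.py` diff further down (`CoRGIPipeline`, `Qwen3VLClient`, `QwenGenerationConfig`); the constructors, the `run()` entry point, and the result fields are assumptions made for illustration, not the verified API of this commit.

```python
# Sketch only: constructors, run(), and result fields are assumed, not
# taken from this commit. Imports match the names in the corgi/cli.py diff.
from PIL import Image

from corgi.pipeline import CoRGIPipeline
from corgi.qwen_client import Qwen3VLClient, QwenGenerationConfig

config = QwenGenerationConfig(model_id="Qwen/Qwen3-VL-2B-Instruct")
client = Qwen3VLClient(config)      # assumed constructor
pipeline = CoRGIPipeline(client)    # assumed constructor

image = Image.open("example.jpg")
result = pipeline.run(              # assumed entry point
    image=image,
    question="What is the person holding?",
    max_steps=3,    # mirrors the CLI's --max-steps
    max_regions=3,  # mirrors the CLI's --max-regions
)
for step in result.steps:           # ReasoningStep / GroundedEvidence come from corgi/types.py
    print(step)
```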

README.md CHANGED
@@ -12,7 +12,7 @@ license: apache-2.0
 
 # CoRGI Qwen3-VL Demo
 
-This Space showcases the CoRGI reasoning pipeline powered entirely by **Qwen/Qwen3-VL-4B-Instruct**.
+This Space showcases the CoRGI reasoning pipeline powered entirely by **Qwen/Qwen3-VL-2B-Instruct**.
 Upload an image, ask a visual question, and the app will:
 
 1. Generate structured reasoning steps with visual-verification flags.
@@ -24,7 +24,7 @@ Upload an image, ask a visual question, and the app will:
 ```bash
 pip install -r requirements.txt
 python examples/demo_qwen_corgi.py \
-  --model-id Qwen/Qwen3-VL-4B-Instruct \
+  --model-id Qwen/Qwen3-VL-2B-Instruct \
   --max-steps 3 \
   --max-regions 3
 ```
@@ -35,9 +35,17 @@ To launch the Gradio demo locally:
 python app.py
 ```
 
+## 📚 Full Documentation
+
+See **[docs/](docs/)** folder for complete documentation:
+- 🚀 **[Quick Start](docs/START_HERE.md)** - Begin here!
+- 📖 **[Usage Guide](docs/USAGE_GUIDE.md)** - How to use
+- 🔧 **[Deployment](docs/DEPLOY_NOW.md)** - Deploy to HF Spaces
+- 📊 **[Summary Report](docs/SUMMARY_REPORT.md)** - Full overview
+
 ## Configuration Notes
 
-- **Model**: Uses `Qwen/Qwen3-VL-4B-Instruct` (4B parameters, ~8GB VRAM)
+- **Model**: Uses `Qwen/Qwen3-VL-2B-Instruct` (2B parameters, ~5GB VRAM)
 - **Single GPU**: Model loads on single GPU (cuda:0) to avoid memory fragmentation
 - **Hardware**: The Space runs on `cpu-basic` tier by default
 - **Customization**: Set `CORGI_QWEN_MODEL` environment variable to use a different checkpoint
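The **Customization** note says the Space reads `CORGI_QWEN_MODEL`, but this diff does not show how `app.py` consumes it. The conventional pattern is a plain environment lookup with the committed default as fallback; a hedged sketch, assuming that is what `app.py` does:

```python
import os

# Fallback mirrors the new default in corgi/cli.py; the lookup itself is an
# assumption about app.py, which is not part of this diff.
DEFAULT_MODEL_ID = "Qwen/Qwen3-VL-2B-Instruct"
model_id = os.environ.get("CORGI_QWEN_MODEL", DEFAULT_MODEL_ID)
```

If `app.py` works this way, it would also reconcile the commit message with the diff: the Space could be launched as `CORGI_QWEN_MODEL=Qwen/Qwen3-VL-8B-Thinking python app.py` while the committed default stays at the 2B checkpoint.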
corgi/__pycache__/cli.cpython-312.pyc CHANGED
Binary files a/corgi/__pycache__/cli.cpython-312.pyc and b/corgi/__pycache__/cli.cpython-312.pyc differ
 
corgi/__pycache__/qwen_client.cpython-312.pyc CHANGED
Binary files a/corgi/__pycache__/qwen_client.cpython-312.pyc and b/corgi/__pycache__/qwen_client.cpython-312.pyc differ
 
corgi/cli.py CHANGED
@@ -12,7 +12,7 @@ from .pipeline import CoRGIPipeline
 from .qwen_client import Qwen3VLClient, QwenGenerationConfig
 from .types import GroundedEvidence, ReasoningStep
 
-DEFAULT_MODEL_ID = "Qwen/Qwen3-VL-4B-Instruct"
+DEFAULT_MODEL_ID = "Qwen/Qwen3-VL-2B-Instruct"
 
 
 def build_parser() -> argparse.ArgumentParser:
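The hunk shows only the `DEFAULT_MODEL_ID` constant and the `build_parser()` signature. Below is a sketch of how the constant plausibly feeds the `--model-id` flag used in the README quick start; the exact argument wiring, defaults for the other flags, and help strings are assumptions:

```python
# Sketch of build_parser(); only DEFAULT_MODEL_ID and the function signature
# appear in the diff, the flag wiring below is assumed.
import argparse

DEFAULT_MODEL_ID = "Qwen/Qwen3-VL-2B-Instruct"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Run the CoRGI pipeline on one image/question pair."
    )
    parser.add_argument("--model-id", default=DEFAULT_MODEL_ID,
                        help="Hugging Face checkpoint to load.")
    parser.add_argument("--max-steps", type=int, default=3,
                        help="Maximum number of reasoning steps.")
    parser.add_argument("--max-regions", type=int, default=3,
                        help="Maximum ROIs grounded per step.")
    return parser
```

With this shape, changing `DEFAULT_MODEL_ID` is the single edit that moves both the CLI default and the README example to the new checkpoint.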
corgi/qwen_client.py CHANGED
@@ -128,7 +128,7 @@ def _load_backend(model_id: str) -> tuple[AutoModelForImageTextToText, AutoProcessor]:
 
 @dataclass
 class QwenGenerationConfig:
-    model_id: str = "Qwen/Qwen3-VL-4B-Instruct"
+    model_id: str = "Qwen/Qwen3-VL-2B-Instruct"
     max_new_tokens: int = 512
     temperature: float | None = None
     do_sample: bool = False
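The hunk header carries the `_load_backend` signature: it returns an `(AutoModelForImageTextToText, AutoProcessor)` pair from `transformers`. A sketch of a body consistent with that signature and with the README's single-GPU note follows; the dtype and device placement are assumptions, not the code in this commit:

```python
# Sketch of _load_backend; the signature matches the hunk header above,
# the body (dtype, device placement) is assumed.
import torch
from transformers import AutoModelForImageTextToText, AutoProcessor


def _load_backend(model_id: str) -> tuple[AutoModelForImageTextToText, AutoProcessor]:
    model = AutoModelForImageTextToText.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # assumed; halves memory vs. float32
        device_map={"": 0},          # pin everything to cuda:0, per the single-GPU note
    )
    processor = AutoProcessor.from_pretrained(model_id)
    model.eval()
    return model, processor
```

Under this shape, `QwenGenerationConfig.model_id` is the only value the rest of the client needs to hand to the loader, which is why the default swap in this dataclass is enough to retarget the whole backend.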