jamescallander committed on
Commit 715b96a · verified · 1 Parent(s): abd0d44

Update README.md

Files changed (1):
  1. README.md +151 -6

README.md CHANGED
---
library_name: rkllm
pipeline_tag: text-generation
license: other
license_name: deepseek
license_link: >-
  https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-instruct/blob/main/LICENSE
language:
- en
base_model:
- deepseek-ai/deepseek-coder-1.3b-instruct
tags:
- rkllm
- rk3588
- rockchip
- code
- edge-ai
- llm
---
# deepseek-coder-1.3b-instruct — RKLLM build for RK3588 boards

### Built with DeepSeek (DeepSeek License Agreement)

**Author:** @jamescallander
**Source model:** [deepseek-ai/deepseek-coder-1.3b-instruct · Hugging Face](https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-instruct)

**Target:** Rockchip RK3588 NPU via RKNN-LLM Runtime

> This repository hosts a **conversion** of `deepseek-coder-1.3b-instruct` for use on Rockchip RK3588 single-board computers (Orange Pi 5 Plus, Radxa Rock 5B+, Banana Pi M7, etc.). The conversion was performed with the [RKNN-LLM toolkit](https://github.com/airockchip/rknn-llm).

#### Conversion details

- RKLLM-Toolkit version: v1.2.1
- NPU driver: v0.9.8
- Python: 3.12
- Quantization: `w8a8_g128`
- Output: single-file `.rkllm` artifact
- Tokenizer: not required at runtime (UI handles prompt I/O)

## ⚠️ Code generation disclaimer

🛑 **This model may produce incorrect or insecure code.**

- It is intended for **research, educational, and experimental purposes only**.
- Always **review, test, and validate code outputs** before using them in real projects.
- Do not rely on outputs for production, security-sensitive, or safety-critical systems.
- Use responsibly and in compliance with the source model's license and restrictions.

## Intended use

- On-device deployment of a **code-specialized LLM** on RK3588 SBCs.
- deepseek-coder-1.3b-instruct is tuned for **programming tasks, code completion, and instruction-following in developer workflows**, optimized for smaller edge hardware.

## Limitations

- Requires ~2.5 GB of free memory.
- Quantized build (`w8a8_g128`) may show small quality differences vs. the full-precision upstream model.
- Tested on a Radxa Rock 5B+; other devices may require different drivers/toolkit versions.
- Generated code should always be reviewed before use in production systems.

## Quick start (RK3588)

### 1) Install runtime

The RKNN-LLM toolkit and instructions can be found on your development board manufacturer's website or on [airockchip's GitHub page](https://github.com/airockchip).

Download and install the required packages as per the toolkit's instructions.

### 2) Simple Flask server deployment

The simplest way to deploy the converted `.rkllm` model is with the example script provided in the toolkit under `rknn-llm/examples/rkllm_server_demo`:

```bash
python3 <TOOLKIT_PATH>/rknn-llm/examples/rkllm_server_demo/flask_server.py \
    --rkllm_model_path <MODEL_PATH>/deepseek-coder-1.3b-instruct_w8a8_g128_rk3588.rkllm \
    --target_platform rk3588
```

### 3) Sending a request

A basic format for a message request is:

```json
{
    "model": "deepseek-coder-1.3b-instruct",
    "messages": [{
        "role": "user",
        "content": "<YOUR_PROMPT_HERE>"}],
    "stream": false
}
```

Example request using `curl`:

```bash
curl -s -X POST <SERVER_IP_ADDRESS>:8080/rkllm_chat \
    -H 'Content-Type: application/json' \
    -d '{"model":"deepseek-coder-1.3b-instruct","messages":[{"role":"user","content":"Create a python function to calculate factorials using recursive method."}],"stream":false}'
```
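
The same request can also be sent from Python. Below is a minimal sketch using only the standard library; the helper names (`build_payload`, `send_chat`) are illustrative, not part of the toolkit, and `<SERVER_IP_ADDRESS>` must be replaced with your board's address:

```python
import json
import urllib.request


def build_payload(prompt: str, model: str = "deepseek-coder-1.3b-instruct") -> dict:
    """Assemble the chat request body expected by the Flask demo server."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }


def send_chat(base_url: str, prompt: str) -> dict:
    """POST the payload to the /rkllm_chat endpoint and return the parsed JSON reply."""
    req = urllib.request.Request(
        f"{base_url}/rkllm_chat",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Replace the placeholder address before running on your network.
    reply = send_chat("http://<SERVER_IP_ADDRESS>:8080",
                      "Create a python function to calculate factorials using recursive method.")
    print(reply["choices"][0]["message"]["content"])
```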

The response is formatted in the following way:

```json
{
    "choices": [{
        "finish_reason": "stop",
        "index": 0,
        "logprobs": null,
        "message": {
            "content": "<MODEL_REPLY_HERE>",
            "role": "assistant"}}],
    "created": null,
    "id": "rkllm_chat",
    "object": "rkllm_chat",
    "usage": {
        "completion_tokens": null,
        "prompt_tokens": null,
        "total_tokens": null}
}
```

Example response:

```json
{"choices":[{"finish_reason":"stop","index":0,"logprobs":null,"message":{"content":"Sure! Here is the Python code for calculating factorial of an number (n) by implementing it in Recursion Method : ```python def Factorial(num): # Define Function with parameter num if num == 1 or num==0:# Base Case to stop recursive call when we reach one. It's the definition for factorial of any number n, where fact(n) = n * (n-1)! return 1 else: # Recursion Call -> Factorial function is called again with decreasing value until it reaches base case return num*Factorial(num-1) ``` You can call this Function as follows : `print (fact_of_5())`, where 5 will be the number for which you want to find factorial. This function works only with non negative integers and zero! It doesn't work properly if called without arguments or it is given a floating point argument because of integer division in Python2 when using `num*Factorial(num-1)`, this would result into an infinite recursion loop as the base case will never be reached.","role":"assistant"}}],"created":null,"id":"rkllm_chat","object":"rkllm_chat","usage":{"completion_tokens":null,"prompt_tokens":null,"total_tokens":null}}
```
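
When consuming the response programmatically, the assistant's text sits at `choices[0].message.content`. A small sketch of pulling it out (the helper name `extract_reply` is illustrative, not part of the toolkit):

```python
def extract_reply(response: dict) -> str:
    """Return the assistant's message text from an /rkllm_chat response."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Example with the documented response shape:
sample = {
    "choices": [{
        "finish_reason": "stop",
        "index": 0,
        "logprobs": None,
        "message": {"content": "def factorial(n): ...", "role": "assistant"},
    }],
    "created": None,
    "id": "rkllm_chat",
    "object": "rkllm_chat",
    "usage": {"completion_tokens": None, "prompt_tokens": None, "total_tokens": None},
}
print(extract_reply(sample))  # → def factorial(n): ...
```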

### 4) UI compatibility

This server exposes an **OpenAI-compatible Chat Completions API**.

You can connect it to any OpenAI-compatible client or UI (for example, [Open WebUI](https://github.com/open-webui/open-webui)):
- Configure your client with the API base `http://<SERVER_IP_ADDRESS>:8080` and the endpoint `/rkllm_chat`.
- Make sure the `model` field matches the converted model's name, for example:

```json
{
    "model": "deepseek-coder-1.3b-instruct",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": false
}
```

# License

This conversion follows the [DeepSeek License Agreement](LICENSE).

- **Attribution:** Built with DeepSeek (© 2023 DeepSeek).
- **Required notice:** see [`NOTICE`](NOTICE).
- **Modifications:** quantization (`w8a8_g128`), export to `.rkllm` format for RK3588 SBCs.
- **Use restrictions:** you may not use this model or its derivatives for the prohibited purposes listed in Attachment A of the DeepSeek License Agreement (including military use, harming minors, generating PII without authorization, harassment, or discrimination).