tech-envision committed · Commit e9772d2
Parent: c56348f

Improve README readability

README.md CHANGED

# llm-backend

`llm-backend` provides an asynchronous chat interface built around Ollama models. It supports running shell commands in an isolated Linux VM and persists conversations in SQLite.

## Features

- **Persistent chat history** – conversations are stored in `chat.db` per user and session so they can be resumed later.
- **Tool execution** – a built-in `execute_terminal` tool runs commands inside a Docker-based VM. Network access is enabled, and both stdout and stderr are captured up to the last 10,000 characters, with a short notice prepended when output is hidden (see the sketch after this list). The VM is reused across chats when `PERSIST_VMS=1` so installed packages remain available.
- **System prompts** – every request includes a system prompt that guides the assistant to plan tool usage, verify results and avoid unnecessary jargon.
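
A minimal sketch of that output-capping behaviour, with an illustrative helper name and notice text (the real project may implement it differently):

```python
TRUNCATION_LIMIT = 10_000  # keep only the most recent output

def cap_output(text: str, limit: int = TRUNCATION_LIMIT) -> str:
    """Cap combined stdout/stderr at the last `limit` characters,
    prepending a short notice whenever data is hidden."""
    if len(text) <= limit:
        return text
    hidden = len(text) - limit
    # Illustrative notice text; wording is an assumption.
    return f"[output truncated: {hidden} characters hidden]\n" + text[-limit:]
```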

## Quick Start

```bash
python run.py
```

The script issues a sample command to the model and prints the streamed response. Uploaded files go to `uploads` and are mounted in the VM at `/data`.

### Uploading Documents

```python
async with ChatSession() as chat:
    path = chat.upload_document("path/to/file.pdf")
    async for part in chat.chat_stream(f"Summarize {path}"):
        print(part)
```

## Discord Bot

1. Create a `.env` file with your bot token:

   ```bash
   DISCORD_TOKEN="your-token"
   ```

2. Start the bot:

   ```bash
   python -m bot
   ```

Attachments sent to the bot are uploaded automatically and the VM path is returned so they can be referenced in later messages.

## VM Configuration

Shell commands run inside a Docker container. By default the image defined by `VM_IMAGE` is used, falling back to `python:3.11-slim`. When `PERSIST_VMS=1` (the default) each user keeps the same container across sessions, so installed packages and files survive restarts. Set `VM_STATE_DIR` to choose the host directory for per-user data, which is mounted in the VM at `/state`.
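
For example, these variables can be exported before launching (the values shown are illustrative):

```bash
export VM_IMAGE=python:3.11-slim      # container image used for the VM
export PERSIST_VMS=1                  # keep each user's container across restarts
export VM_STATE_DIR="$HOME/vm-state"  # host directory mounted at /state
python run.py
```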

To build a more complete environment you can create your own image, for example using `docker/Dockerfile.vm`:

```Dockerfile
FROM ubuntu:22.04

RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        python3 \
        # … (several packages not shown in the diff view)
        git \
        build-essential \
    && rm -rf /var/lib/apt/lists/*

CMD ["sleep", "infinity"]
```
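
A typical build invocation for that image might be the following (the exact command is an assumption, not taken from this README):

```bash
# Assumed build step: tag the image so VM_IMAGE can reference it.
docker build -f docker/Dockerfile.vm -t llm-vm .
```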

Then point `VM_IMAGE` at the image and start the application:

```bash
export VM_IMAGE=llm-vm
python run.py
```

## REST API

Start the API server either as a module or via `uvicorn`:

```bash
python -m api_app
# or
uvicorn api_app:app --host 0.0.0.0 --port 8000
```

### Endpoints

- `POST /chat/stream` – stream the assistant's response as plain text.
- `POST /upload` – upload a document that can be referenced in chats.
- `GET /sessions/{user}` – list available session names for a user.

Example request:

```bash
curl -N -X POST http://localhost:8000/chat/stream \
  -d '{"user":"demo","session":"default","prompt":"Hello"}'
```
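
The same stream can be consumed from Python. This is a minimal sketch assuming the endpoint accepts the JSON body shown above; it uses the third-party `httpx` library, which is not necessarily a dependency of this project:

```python
import asyncio

import httpx

async def main() -> None:
    payload = {"user": "demo", "session": "default", "prompt": "Hello"}
    async with httpx.AsyncClient(timeout=None) as client:
        # Stream the plain-text response chunk by chunk as it arrives.
        async with client.stream(
            "POST", "http://localhost:8000/chat/stream", json=payload
        ) as response:
            async for chunk in response.aiter_text():
                print(chunk, end="", flush=True)

asyncio.run(main())
```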

## Command Line Interface

Run the interactive CLI on any platform:

```bash
python -m src.cli --user yourname
```

Existing sessions are listed and you can create new ones. Type messages to see streamed replies. Use `exit` or `Ctrl+D` to quit.

### Windows Executable

For a standalone Windows build, install `pyinstaller` and run:

```bash
pyinstaller --onefile -n llm-chat cli_app/main.py
```

The resulting `llm-chat.exe` works on Windows 10/11.