---
title: Prompt-Engineered Persona Agent
emoji: 🤖
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: 5.33.0
app_file: app.py
pinned: false
short_description: AI chatbot with a crafted personality (e.g., Wise Mentor)
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# 🤖 Prompt-Engineered Persona Agent with Mini-RAG
This project is an agentic chatbot built on a quantized LLM (`Gemma 1B`) that behaves according to a customizable persona prompt. It features a lightweight Retrieval-Augmented Generation (RAG) system using **TF-IDF + FAISS**, plus **dynamic context length estimation** to reduce inference time, making it well suited to CPU-only environments such as Hugging Face Spaces.
---
## 🚀 Features
* ✅ **Customizable persona** via system prompt
* ✅ **Mini-RAG** using TF-IDF + FAISS to retrieve relevant past conversation turns
* ✅ **Efficient memory**: only the most relevant chat history is used
* ✅ **Dynamic context length** estimation speeds up response time
* ✅ Gradio-powered UI
* ✅ Runs on the free CPU tier
---
## 🧠 How It Works
1. **User submits a query** along with a system persona prompt.
2. **Top-k similar past turns** are retrieved using FAISS over TF-IDF vectors.
3. Only **relevant chat history** is used to build the final prompt.
4. The LLM generates a response based on the combined system prompt, retrieved context, and current user message.
5. Context length (`n_ctx`) is dynamically estimated to minimize resource usage.
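The retrieval in steps 2–3 can be sketched as follows. The app itself uses scikit-learn's `TfidfVectorizer` with a FAISS index; this stdlib-only toy (function names are illustrative, not from `app.py`) shows the same top-k idea:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Whitespace-tokenized TF-IDF with sklearn-style smoothed IDF."""
    tokenized = [d.lower().split() for d in docs]
    df = Counter()                      # document frequency per term
    for toks in tokenized:
        df.update(set(toks))
    n = len(docs)
    vecs = []
    for toks in tokenized:
        tf = Counter(toks)
        vecs.append({t: (tf[t] / len(toks)) *
                        (math.log((1 + n) / (1 + df[t])) + 1.0)
                     for t in tf})
    return vecs

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k_turns(query, history, k=3):
    """Return the k past turns most similar to the query."""
    vecs = tfidf_vectors(history + [query])
    qv = vecs[-1]
    ranked = sorted(range(len(history)),
                    key=lambda i: cosine(qv, vecs[i]), reverse=True)
    return [history[i] for i in ranked[:k]]
```

FAISS replaces the brute-force cosine loop above with an index (e.g. inner product over normalized vectors), which matters once the chat history grows.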
---
## 🧪 Example Personas
You can change the persona in the UI system prompt box:
* 🎓 `"You are a wise academic advisor who offers up to 3 concise, practical suggestions."`
* 🧘 `"You are a calm mindfulness coach. Always reply gently and with encouragement."`
* 🕵️ `"You are an investigative assistant. Be logical, skeptical, and fact-focused."`
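A persona like the ones above is combined with the retrieved history and the new message to form the final prompt. The plain-text layout below is a hypothetical illustration; the real template (and Gemma's chat format) lives in `app.py`:

```python
def build_prompt(persona, retrieved_turns, user_msg):
    """Combine persona, retrieved history, and the new message.

    The section markers here are illustrative; the actual app applies
    the model's chat template via llama-cpp-python instead.
    """
    history = "\n".join(f"- {turn}" for turn in retrieved_turns)
    return (
        f"System: {persona}\n\n"
        f"Relevant history:\n{history}\n\n"
        f"User: {user_msg}\nAssistant:"
    )
```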
---
## 📦 Installation
**For local setup:**
```bash
git clone https://huggingface.co/spaces/YOUR_USERNAME/Prompt-Persona-Agent
cd Prompt-Persona-Agent
pip install -r requirements.txt
```
Create an environment variable:
```bash
export HF_TOKEN=your_huggingface_token
```
Then run:
```bash
python app.py
```
---
## 📁 Files
* `app.py`: Main application with chat + RAG + dynamic context
* `requirements.txt`: All Python dependencies
* `README.md`: This file
---
## 🛠️ Tech Stack
* [Gradio](https://gradio.app/)
* [llama-cpp-python](https://github.com/abetlen/llama-cpp-python)
* [FAISS](https://github.com/facebookresearch/faiss)
* [scikit-learn (TF-IDF)](https://scikit-learn.org/)
* [Gemma 1B IT GGUF](https://huggingface.co/google/gemma-1.1-1b-it-gguf)
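Tying the stack together, `n_ctx` can be estimated from the prompt length before the model is invoked. The heuristic below (≈4 characters per token, rounding up to a multiple of 256, capped at 2048) is an illustrative assumption, not the exact logic in `app.py`:

```python
def estimate_n_ctx(prompt, max_new_tokens=256, chars_per_token=4, cap=2048):
    """Rough context-size estimate: ~4 chars/token for English text."""
    prompt_tokens = len(prompt) // chars_per_token + 1
    needed = prompt_tokens + max_new_tokens
    # Round up to the next multiple of 256 so the KV cache stays modest.
    rounded = ((needed + 255) // 256) * 256
    return min(rounded, cap)

# Sketch of wiring this into llama-cpp-python (model path is illustrative):
# from llama_cpp import Llama
# llm = Llama(model_path="gemma-1.1-1b-it.Q4_K_M.gguf",
#             n_ctx=estimate_n_ctx(prompt))
```

A smaller `n_ctx` shrinks the KV cache and prompt-processing cost, which is the main lever for latency on a CPU-only Space.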
---
## 🚧 Limitations
* Retrieval uses basic TF-IDF + FAISS; it can be extended with semantic embedding models.
* Not all LLMs strictly follow the persona; prompt tuning helps but is not perfect.
* For longer-term memory, a database plus a summarizer would work better.
---
## 🤗 Deploy to Hugging Face Spaces
> Uses only CPU, no paid GPU required.
Make sure your `HF_TOKEN` is set as a secret or environment variable in your Hugging Face Space.
---