# Master Plan: "Knowledge-Base Browser" Gradio Component
*Track 2: Custom Components*
## Project Timeline
| Day | Milestone | Output |
|-----|-----------|--------|
| Mon (½ day left) | Finalize spec & repo | README with scope, architecture diagram |
| Tue | Component scaffolding | `gradio cc create kb_browser`, `index.html`, `script.tsx`, `__init__.py` |
| Wed | Backend: retrieval service | LlamaIndex/FAISS index builder, query API |
| Thu | Frontend: results panel UI | React table / accordion, source-link cards |
| Fri | Agent integration demo | Notebook + minimal MCP agent calling component |
| Sat | Polishing, tests, docs | Unit tests, docs site, publish to Gradio Hub |
| Sun (AM) | Submission video & write-up | 90-sec demo, project report |
## Core Features (MVP)
1. Accepts a query string or agent-emitted JSON
2. Calls the retrieval API, which returns `[{"title": ..., "snippet": ..., "url": ...}]`
3. Renders expandable result cards + "open source" button
4. Emits selected doc back to parent (so agent can cite)
5. Works in both human click and agent-autonomous modes
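
Feature 1 means one entry point has to accept both a typed query and an agent payload. A minimal sketch of that normalization follows. The function name `normalize_query` and the `query` and `k` keys are assumptions for illustration, not part of the spec above.

```python
import json
from typing import Any, Dict


def normalize_query(payload: Any) -> Dict[str, Any]:
    """Turn a plain query string or an agent-emitted JSON payload into one dict.

    The "query" and "k" keys are assumptions made for this sketch.
    """
    if isinstance(payload, dict):
        data = payload
    elif isinstance(payload, str):
        text = payload.strip()
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            parsed = None
        # Only a JSON object counts as an agent payload; anything else is plain text.
        data = parsed if isinstance(parsed, dict) else {"query": text}
    else:
        raise TypeError(f"unsupported payload type: {type(payload).__name__}")

    query = str(data.get("query", "")).strip()
    if not query:
        raise ValueError("empty query")
    return {"query": query, "k": int(data.get("k", 5))}
```

With this shape, the human text box and the agent call can share the same retrieval path.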
---
## Prompt-Script Series for LLM Assistant
Copy-paste each block into your favorite model (GPT-4o, Claude 3, etc.). Each step builds on the previous; stop when the code runs.
**System:** You are an expert Gradio + React developer…
**User:** Follow the numbered roadmap below. Output only the requested files in markdown code-blocks each time.
### Step 1: Scaffold
**1️⃣ Generate `__init__.py`, `index.html`, `script.tsx`, and `package.json`**
- Component name: kb_browser
- Props: `query: string`, `results: any[]`
- Events: `submit(query)`, `select(doc)`
### Step 2: Backend retrieval
**2️⃣ Write `retriever.py`**
- Build a FAISS vector store from `./data/*.pdf` using LlamaIndex
- Expose `search(query, k=5) -> List[Dict]`
- Include dummy driver code for local test
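
Before the real index exists, a stand-in with the same `search(query, k=5) -> List[Dict]` signature lets the frontend be developed in parallel. The sketch below ranks by simple word overlap instead of LlamaIndex/FAISS vector similarity. The `DOCS` corpus and its file paths are made up for illustration.

```python
import re
from typing import Dict, List, Set

# Hypothetical in-memory corpus standing in for the parsed PDFs.
DOCS: List[Dict[str, str]] = [
    {"title": "RAG overview", "url": "data/rag.pdf",
     "text": "Retrieval augmented generation combines search with a language model."},
    {"title": "FAISS notes", "url": "data/faiss.pdf",
     "text": "FAISS performs fast similarity search over dense vectors."},
    {"title": "Gradio guide", "url": "data/gradio.pdf",
     "text": "Custom components extend Gradio with new frontend widgets."},
]


def _tokens(text: str) -> Set[str]:
    """Lowercase alphanumeric word set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(query: str, k: int = 5) -> List[Dict[str, str]]:
    """Return up to k docs ranked by word overlap (a placeholder for vector similarity)."""
    q = _tokens(query)
    scored = []
    for doc in DOCS:
        overlap = len(q & _tokens(doc["title"] + " " + doc["text"]))
        if overlap:
            scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [
        {"title": d["title"], "snippet": d["text"][:80], "url": d["url"]}
        for _, d in scored[:k]
    ]


if __name__ == "__main__":
    # Dummy driver for a local test.
    for hit in search("similarity search with FAISS", k=2):
        print(hit["title"], "->", hit["url"])
```

Swapping the body of `search` for a real vector query later should not change the return shape, so the UI code can stay as it is.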
### Step 3: Wire front-end to back-end
**3️⃣ Update `script.tsx`**
- On `submit`, POST to `/search`
- Render results in Material-UI Accordion
- When the user clicks "Use", fire the `select(doc)` event
### Step 4: Gradio component class
**4️⃣ In `__init__.py`**
- Subclass `gradio.Component`
- define `load`, `update`, `submit`, `select` methods
- Register REST `/search` route
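
The logic behind the `/search` route can be written independently of any web framework, which makes it easy to unit-test. The sketch below is one possible shape. The name `handle_search`, the `(status, payload)` return convention, and the `k` field are assumptions. How the route is actually registered depends on the Gradio version and is not shown.

```python
from typing import Any, Callable, Dict, List, Tuple

SearchFn = Callable[[str, int], List[Dict[str, str]]]


def handle_search(body: Dict[str, Any], search_fn: SearchFn) -> Tuple[int, Dict[str, Any]]:
    """Framework-agnostic body of a POST /search handler.

    Returns an HTTP-style status code and a JSON-serializable payload.
    """
    raw = body.get("query")
    query = raw.strip() if isinstance(raw, str) else ""
    if not query:
        return 400, {"error": "query must be a non-empty string"}

    try:
        k = int(body.get("k", 5))
    except (TypeError, ValueError):
        return 400, {"error": "k must be an integer"}
    if k < 1:
        return 400, {"error": "k must be at least 1"}

    return 200, {"query": query, "results": search_fn(query, k)}
```

Passing the retriever in as `search_fn` keeps the handler free of index-loading code and lets tests substitute a fake.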
### Step 5: Demo app
**5️⃣ Create `demo.py`**
- Loads component
- Adds text input + "Ask" button
- Shows agent example that calls component via MCP
### Step 6: Tests & publishing
**6️⃣ Provide pytest suite for backend & frontend**
- CI workflow YAML
**7️⃣ Command to publish:** `gradio cc publish kb_browser --name "KnowledgeBaseBrowser"`
*(After each step: run `npm run dev` + `python demo.py`, fix issues, then proceed.)*
---
## Pro-Tips for Implementation
- Keep package size < 2 MB (judging criteria).
- Defer heavy work to backend; UI stays lightweight.
- Use streaming in Gradio (`yield`) for snappy UX.
- Cache index on disk to slash startup time.
- Include a themed dark/light toggle for easy polish points.
- Record a GIF of the agent citing docs live; it is eye-catching in a demo.
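
The caching and streaming tips can be sketched together. The example below assumes the index can be serialized as JSON, which is a simplification: a real FAISS index would use its own persistence. The names `load_or_build` and `stream_results` are hypothetical.

```python
import json
import os
from typing import Callable, Dict, Iterator, List


def load_or_build(cache_path: str, build: Callable[[], List[Dict]]) -> List[Dict]:
    """Load the index from disk if present; otherwise build it once and save it."""
    if os.path.exists(cache_path):
        with open(cache_path, "r", encoding="utf-8") as fh:
            return json.load(fh)
    index = build()
    with open(cache_path, "w", encoding="utf-8") as fh:
        json.dump(index, fh)
    return index


def stream_results(results: List[Dict]) -> Iterator[List[Dict]]:
    """Yield a growing list so a generator-based handler can update the UI per hit."""
    shown: List[Dict] = []
    for item in results:
        shown.append(item)
        # Yield a copy so earlier snapshots are not mutated by later appends.
        yield list(shown)
```

A generator like `stream_results` could be used as a Gradio event handler so that each `yield` refreshes the output component while later results are still being prepared.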
## Implementation Status
### ✅ Completed Features
- **Component Scaffolding**: Complete Gradio custom component structure with proper TypeScript and Python files
- **Backend Retrieval Service**: LlamaIndex + FAISS vector store with OpenAI embeddings for semantic search
- **Frontend UI**: React TypeScript interface with modern design, expandable result cards, and source links
- **Search Capabilities**: Semantic, keyword, and hybrid search modes with relevance scoring
- **Citation Management**: Real-time citation tracking with export functionality
- **Agent Integration**: Both human interactive mode and AI agent autonomous research capabilities
- **Documentation**: Comprehensive README, API documentation, and usage examples
- **Testing**: Test suite covering core functionality and edge cases
- **Publishing Setup**: Package configuration and publishing scripts ready
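
As an illustration of what citation tracking with export could look like, here is a minimal sketch. It is not the component's actual implementation. The `CitationTracker` class and its methods are hypothetical, and it assumes each doc carries the `title` and `url` keys from the MVP result format.

```python
import json
from typing import Dict


class CitationTracker:
    """Collects selected docs, skipping duplicates by URL, and exports them."""

    def __init__(self) -> None:
        # Dicts preserve insertion order, so citations keep the order they were added.
        self._by_url: Dict[str, Dict[str, str]] = {}

    def add(self, doc: Dict[str, str]) -> bool:
        """Record a doc; return False if its URL was already cited."""
        url = doc["url"]
        if url in self._by_url:
            return False
        self._by_url[url] = {"title": doc["title"], "url": url}
        return True

    def export_json(self) -> str:
        """Serialize all citations as a JSON array."""
        return json.dumps(list(self._by_url.values()), indent=2)

    def export_markdown(self) -> str:
        """Render citations as a numbered Markdown list of links."""
        return "\n".join(
            f"{i}. [{c['title']}]({c['url']})"
            for i, c in enumerate(self._by_url.values(), start=1)
        )
```

Calling `add` from the `select(doc)` event would keep the citation list in sync with whatever the user or agent picks.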
### 🎯 Key Technical Achievements
1. **Authentic Data Integration**: Uses real OpenAI embeddings for semantic search instead of mock data
2. **Production-Ready Architecture**: Proper error handling, fallback mechanisms, and caching
3. **Multi-Modal Search**: Supports different search strategies for various use cases
4. **Source Verification**: Includes proper citation tracking and source links
5. **Agent-Ready Design**: Built for both human users and autonomous AI agents
### Project Structure
```
kb_browser/
├── __init__.py          # Main Gradio component class
├── retriever.py         # LlamaIndex + FAISS backend
├── script.tsx           # React TypeScript frontend
├── index.html           # Component HTML template
├── package.json         # Frontend dependencies
├── pyproject.toml       # Python package configuration
└── README.md            # Component documentation
Root/
├── demo.py              # Human + Agent demo application
├── gradio_demo.py       # Complete Gradio demo
├── test_kb_browser.py   # Comprehensive test suite
├── verify_component.py  # Component verification script
└── docs/
    └── master-plan.md   # This master plan document
```
### Usage Examples
**Basic Component Usage:**
```python
from kb_browser import KnowledgeBrowser

# Point the component at the document folder and choose a search mode.
kb_browser = KnowledgeBrowser(
    index_path="./documents",
    search_type="semantic",
    max_results=10,
)

results = kb_browser.search("retrieval augmented generation")
```
**Agent Integration:**
```python
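def agent_research(question):
    """Search the knowledge base and return citations for the agent to quote."""
    # Assumes `kb_browser` is the KnowledgeBrowser instance created above.
    results = kb_browser.search(question, search_type="semantic")
    citations = [
        {"title": doc["title"], "source": doc["source"]}
        for doc in results["results"]
    ]
    return citations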
```
**Human Interface:**
```python
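import gradio as gr

# Assumes `kb_browser` is the KnowledgeBrowser instance created above.
with gr.Blocks() as demo:
    query = gr.Textbox(label="Search Query")
    search_btn = gr.Button("Search")
    results = gr.JSON(label="Results")
    # Run the search on click and show the raw response as JSON.
    search_btn.click(kb_browser.search, query, results)

demo.launch()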
```
Execute the six prompt blocks sequentially and you'll have a polished, judge-ready custom component by Friday. Good luck!