# Master Plan: "Knowledge-Base Browser" Gradio Component

*Track 2: Custom Components*

## Project Timeline

| Day | Milestone | Output |
|-----|-----------|--------|
| Mon (½ day left) | Finalize spec & repo | README with scope, architecture diagram |
| Tue | Component scaffolding | `gradio cc create kb_browser`, `index.html`, `script.tsx`, `__init__.py` |
| Wed | Backend: retrieval service | LlamaIndex/FAISS index builder, query API |
| Thu | Frontend: results panel UI | React table / accordion, source-link cards |
| Fri | Agent integration demo | Notebook + minimal MCP agent calling component |
| Sat | Polishing, tests, docs | Unit tests, docs site, publish to Gradio Hub |
| Sun (AM) | Submission video & write-up | 90-sec demo, project report |

## Core Features (MVP)

1. Accepts a query string or agent-emitted JSON
2. Calls retrieval API → returns `[{"title":..,"snippet":..,"url":..}]`
3. Renders expandable result cards + "open source" button
4. Emits selected doc back to parent (so agent can cite)
5. Works in both human click and agent-autonomous modes

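
Features 1 and 2 imply a small input/output contract that can be pinned down before any UI exists. The sketch below is one possible shape, not the component's actual code: the `SearchResult` type and the `extract_query` helper are hypothetical names, and the assumption that agents send `{"query": "..."}` is ours, since the plan does not specify the JSON layout.

```python
import json
from typing import TypedDict


class SearchResult(TypedDict):
    """One retrieval hit, mirroring the result shape listed in feature 2."""
    title: str
    snippet: str
    url: str


def extract_query(payload: str) -> str:
    """Return the query text from a plain string or an agent-emitted JSON object.

    Assumption: agents send {"query": "..."}; anything else is treated as raw text.
    """
    text = payload.strip()
    if text.startswith("{"):
        try:
            data = json.loads(text)
        except json.JSONDecodeError:
            return text  # Not valid JSON after all: fall back to the raw string.
        if isinstance(data, dict) and isinstance(data.get("query"), str):
            return data["query"].strip()
    return text
```
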
---

## Prompt-Script Series for LLM Assistant

Copy-paste each block into your favorite model (GPT-4o, Claude 3, etc.). Each step builds on the previous one; move on to the next only once the code runs.

**System:** You are an expert Gradio + React developer…

**User:** Follow the numbered roadmap below. Output only the requested files in markdown code-blocks each time.

### Step 1: Scaffold

**1️⃣ Generate `__init__.py`, `index.html`, `script.tsx`, and `package.json`**

- Component name: `kb_browser`
- Props: `query: string`, `results: any[]`
- Events: `submit(query)`, `select(doc)`

### Step 2: Backend retrieval

**2️⃣ Write `retriever.py`**

- Build FAISS vector store from `./data/*.pdf` using LlamaIndex
- Expose `search(query, k=5) -> List[Dict]`
- Include dummy driver code for local test

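
The `search(query, k=5) -> List[Dict]` contract can be exercised before the FAISS index exists. The sketch below is a stand-in, not the LlamaIndex/FAISS implementation: it ranks a hypothetical in-memory `CORPUS` by plain token overlap so the front-end and tests have something to call. The corpus entries, their URLs, and the `_tokens` helper are invented for illustration.

```python
import re
from typing import Dict, List

# Hypothetical in-memory corpus; the real retriever would index ./data/*.pdf.
CORPUS: List[Dict[str, str]] = [
    {"title": "RAG overview",
     "snippet": "Retrieval augmented generation combines search with a language model.",
     "url": "docs/rag.pdf"},
    {"title": "FAISS basics",
     "snippet": "FAISS performs fast similarity search over dense vectors.",
     "url": "docs/faiss.pdf"},
    {"title": "Gradio components",
     "snippet": "Custom components extend Gradio with new UI elements.",
     "url": "docs/gradio.pdf"},
]


def _tokens(text: str) -> set:
    """Lowercase alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(query: str, k: int = 5) -> List[Dict]:
    """Rank documents by token overlap with the query (stand-in for vector similarity)."""
    query_tokens = _tokens(query)
    scored = []
    for doc in CORPUS:
        overlap = len(query_tokens & _tokens(doc["title"] + " " + doc["snippet"]))
        if overlap:
            scored.append((overlap, doc))
    # Highest overlap first; documents with no shared tokens were already dropped.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


if __name__ == "__main__":
    # Dummy driver for a quick local check.
    for hit in search("similarity search with FAISS", k=2):
        print(hit["title"], "->", hit["url"])
```

Because the return shape matches the MVP result format, swapping this stub for the real vector store later should not require changes on the calling side.
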
### Step 3: Wire front-end to back-end

**3️⃣ Update `script.tsx`**

- On `submit`, POST to `/search`
- Render results in Material-UI Accordion
- On click "Use", fire `select(doc)` event

### Step 4: Gradio component class

**4️⃣ In `__init__.py`**

- Subclass `gradio.Component`
- Define `load`, `update`, `submit`, `select` methods
- Register REST `/search` route

### Step 5: Demo app

**5️⃣ Create `demo.py`**

- Loads component
- Adds text input + "Ask" button
- Shows agent example that calls component via MCP

### Step 6: Tests & publishing

**6️⃣ Provide pytest suite for backend & frontend**

- CI workflow YAML

**7️⃣ Command to publish:** `gradio cc publish kb_browser --name "KnowledgeBaseBrowser"`

*(After each step: run `npm run dev` + `python demo.py`, fix issues, then proceed.)*

---

## Pro-Tips for Implementation

- Keep package size < 2 MB (judging criteria).
- Defer heavy work to backend; UI stays lightweight.
- Use streaming in Gradio (`yield`) for snappy UX.
- Cache index on disk to slash startup time.
- Include a themed dark/light toggle: easy polish points.
- Record a GIF of the agent citing docs live: eye-catching in demo.

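
The disk-cache tip can be sketched roughly as follows. This is an illustration under assumptions, not the project's code: `corpus_fingerprint` and `load_or_build_index` are hypothetical helpers, the `build` callback stands in for whatever constructs the index, and `pickle` is used only to keep the example dependency-free (a real FAISS or LlamaIndex index would normally be saved with the library's own persistence functions).

```python
import hashlib
import pickle
from pathlib import Path
from typing import Callable, Dict, Iterable


def corpus_fingerprint(files: Iterable[Path]) -> str:
    """Hash file names, sizes, and modification times so edits invalidate the cache."""
    digest = hashlib.sha256()
    for path in sorted(files):
        stat = path.stat()
        digest.update(f"{path.name}:{stat.st_size}:{stat.st_mtime_ns}".encode())
    return digest.hexdigest()


def load_or_build_index(data_dir: Path, cache_dir: Path,
                        build: Callable[[list], Dict]) -> Dict:
    """Return a cached index when the corpus is unchanged; otherwise rebuild and store it."""
    files = sorted(data_dir.glob("*.pdf"))
    cache_file = cache_dir / f"index-{corpus_fingerprint(files)}.pkl"
    if cache_file.exists():
        with cache_file.open("rb") as handle:
            return pickle.load(handle)
    index = build(files)  # The expensive step, skipped on a cache hit.
    cache_dir.mkdir(parents=True, exist_ok=True)
    with cache_file.open("wb") as handle:
        pickle.dump(index, handle)
    return index
```

Keying the cache file on a fingerprint of the corpus means a changed or added PDF triggers a rebuild automatically, while an unchanged folder loads straight from disk.
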
## Implementation Status

### ✅ Completed Features

- **Component Scaffolding**: Complete Gradio custom component structure with proper TypeScript and Python files
- **Backend Retrieval Service**: LlamaIndex + FAISS vector store with OpenAI embeddings for semantic search
- **Frontend UI**: React TypeScript interface with modern design, expandable result cards, and source links
- **Search Capabilities**: Semantic, keyword, and hybrid search modes with relevance scoring
- **Citation Management**: Real-time citation tracking with export functionality
- **Agent Integration**: Both human interactive mode and AI agent autonomous research capabilities
- **Documentation**: Comprehensive README, API documentation, and usage examples
- **Testing**: Test suite covering core functionality and edge cases
- **Publishing Setup**: Package configuration and publishing scripts ready

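
To make the citation-management item concrete, here is a minimal sketch of what tracking and export could look like. It is not taken from the component: the `CitationTracker` class, its method names, and the choice of `url` as the de-duplication key (borrowed from the MVP result format) are all assumptions.

```python
from typing import Dict, List


class CitationTracker:
    """Collects selected documents and exports them as a numbered Markdown list."""

    def __init__(self) -> None:
        self._citations: List[Dict[str, str]] = []

    def add(self, doc: Dict[str, str]) -> int:
        """Record a document once (keyed by URL) and return its 1-based citation number."""
        for number, existing in enumerate(self._citations, start=1):
            if existing["url"] == doc["url"]:
                return number  # Already cited: reuse the existing number.
        self._citations.append({"title": doc["title"], "url": doc["url"]})
        return len(self._citations)

    def export_markdown(self) -> str:
        """Render all citations as Markdown links, one per line."""
        return "\n".join(
            f"{number}. [{item['title']}]({item['url']})"
            for number, item in enumerate(self._citations, start=1)
        )
```

An agent would call `add` each time it selects a document and cite the returned number inline; `export_markdown` then produces the reference list for the final answer.
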
### 🎯 Key Technical Achievements

1. **Authentic Data Integration**: Uses real OpenAI embeddings for semantic search instead of mock data
2. **Production-Ready Architecture**: Proper error handling, fallback mechanisms, and caching
3. **Multi-Mode Search**: Supports semantic, keyword, and hybrid search strategies for different use cases
4. **Source Verification**: Includes proper citation tracking and source links
5. **Agent-Ready Design**: Built for both human users and autonomous AI agents

### 📁 Project Structure

```
kb_browser/
├── __init__.py           # Main Gradio component class
├── retriever.py          # LlamaIndex + FAISS backend
├── script.tsx            # React TypeScript frontend
├── index.html            # Component HTML template
├── package.json          # Frontend dependencies
├── pyproject.toml        # Python package configuration
└── README.md             # Component documentation

Root/
├── demo.py               # Human + Agent demo application
├── gradio_demo.py        # Complete Gradio demo
├── test_kb_browser.py    # Comprehensive test suite
├── verify_component.py   # Component verification script
└── docs/
    └── master-plan.md    # This master plan document
```

### 🚀 Usage Examples

**Basic Component Usage:**

```python
from kb_browser import KnowledgeBrowser

# Point the component at the document folder and pick a default search mode.
kb_browser = KnowledgeBrowser(
    index_path="./documents",
    search_type="semantic",
    max_results=10,
)

# Run a query against the index.
results = kb_browser.search("retrieval augmented generation")
```

**Agent Integration:**

```python
def agent_research(question):
    """Search the knowledge base and return citation records for the agent."""
    results = kb_browser.search(question, search_type="semantic")
    # Reduce each hit to the fields the agent needs in order to cite it.
    citations = [
        {"title": doc["title"], "source": doc["source"]}
        for doc in results["results"]
    ]
    return citations
```

**Human Interface:**

```python
import gradio as gr

with gr.Blocks() as demo:
    query = gr.Textbox(label="Search Query")
    search_btn = gr.Button("Search")
    results = gr.JSON(label="Results")

    # Run the search on click and show the raw results as JSON.
    search_btn.click(kb_browser.search, query, results)

demo.launch()
```

Execute the six steps sequentially and you'll have a polished, judge-ready custom component in time for Sunday's submission. Good luck!