Master Plan – "Knowledge-Base Browser" Gradio Component
Track 2 – Custom Components
Project Timeline
| Day | Milestone | Output |
|---|---|---|
| Mon (½ day left) | Finalize spec & repo | README with scope, architecture diagram |
| Tue | Component scaffolding | `gradio cc create kb_browser`, `index.html`, `script.tsx`, `__init__.py` |
| Wed | Backend – retrieval service | LlamaIndex/FAISS index builder, query API |
| Thu | Frontend – results panel UI | React table / accordion, source-link cards |
| Fri | Agent integration demo | Notebook + minimal MCP agent calling component |
| Sat | Polishing, tests, docs | Unit tests, docs site, publish to Gradio Hub |
| Sun (AM) | Submission video & write-up | 90-sec demo, project report |
Core Features (MVP)
- Accepts a query string or agent-emitted JSON
- Calls retrieval API → returns `[{"title": .., "snippet": .., "url": ..}]` (see the payload sketch after this list)
- Renders expandable result cards + "open source" button
- Emits selected doc back to parent (so agent can cite)
- Works in both human click and agent-autonomous modes
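For concreteness, a sketch of the payload shapes this contract implies; the values are illustrative, and only the `title`/`snippet`/`url` fields come from the spec above.

```python
# Input: a plain query string from the user, or agent-emitted JSON
query_payload = {"query": "what is retrieval-augmented generation?"}

# Retrieval API response, rendered as expandable result cards
results_payload = [
    {
        "title": "rag_survey.pdf",
        "snippet": "Retrieval-augmented generation combines a retriever with a generator ...",
        "url": "./data/rag_survey.pdf",
    }
]

# select(doc) payload emitted back to the parent app so the agent can cite it
selected_doc = {"title": "rag_survey.pdf", "url": "./data/rag_survey.pdf"}
```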
Prompt-Script Series for LLM Assistant
Copy-paste each block into your favorite model (GPT-4o, Claude 3, etc.). Each step builds on the previous; stop when the code runs.
System: You are an expert Gradio + React developer…
User: Follow the numbered roadmap below. Output only the requested files in markdown code-blocks each time.
Step 1 – Scaffold
1️⃣ Generate `__init__.py`, `index.html`, `script.tsx`, and `package.json`
- Component name: `kb_browser`
- Props: `query: string`, `results: any[]`
- Events: `submit(query)`, `select(doc)`
Step 2 – Backend retrieval
2️⃣ Write `retriever.py`
- Build a FAISS vector store from `./data/*.pdf` using LlamaIndex
- Expose `search(query, k=5) -> List[Dict]`
- Include dummy driver code for a local test (a reference sketch follows this step)
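A minimal sketch of `retriever.py`, assuming the llama-index ≥ 0.10 package layout with the FAISS integration installed (`llama-index-vector-stores-faiss`, `faiss-cpu`) and OpenAI's default 1536-dimensional embeddings; the result fields map onto the MVP schema above.

```python
import faiss
from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.vector_stores.faiss import FaissVectorStore


def build_index(data_dir: str = "./data", dim: int = 1536) -> VectorStoreIndex:
    """Index every document under data_dir into a FAISS-backed vector store."""
    docs = SimpleDirectoryReader(data_dir).load_data()
    vector_store = FaissVectorStore(faiss_index=faiss.IndexFlatL2(dim))
    storage = StorageContext.from_defaults(vector_store=vector_store)
    return VectorStoreIndex.from_documents(docs, storage_context=storage)


_INDEX = None  # built lazily so importing the module stays cheap


def search(query: str, k: int = 5) -> list[dict]:
    """Return the top-k hits as {"title", "snippet", "url", "score"} dicts."""
    global _INDEX
    if _INDEX is None:
        _INDEX = build_index()
    retriever = _INDEX.as_retriever(similarity_top_k=k)
    return [
        {
            "title": n.node.metadata.get("file_name", "untitled"),
            "snippet": n.node.get_content()[:300],
            "url": n.node.metadata.get("file_path", ""),
            "score": n.score,
        }
        for n in retriever.retrieve(query)
    ]


if __name__ == "__main__":
    # Dummy driver for a quick local test
    for doc in search("retrieval augmented generation"):
        print(f'{doc["score"]:.3f}  {doc["title"]}')
```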
Step 3 – Wire front-end ↔ back-end
3️⃣ Update `script.tsx`
- On `submit`, POST to `/search`
- Render results in a Material-UI Accordion
- On click of "Use", fire the `select(doc)` event
Step 4 – Gradio component class
4️⃣ In `__init__.py`
- Subclass `gradio.Component`
- Define `load`, `update`, `submit`, `select` methods
- Register the REST `/search` route
(A hedged sketch of the component class follows this step.)
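As a companion to Step 4, a sketch of what `__init__.py` could look like. The plan's `load`/`update`/`submit`/`select` methods are mapped onto the Gradio 4 custom-component contract (`preprocess`/`postprocess` plus declared `EVENTS`), which is an assumption of this sketch; the class name and the `retriever` import come from this plan, not from an existing API, and registering the REST `/search` route is left to the demo app.

```python
from __future__ import annotations

from gradio.components.base import Component
from gradio.events import Events

from .retriever import search  # assumed backend module from Step 2


class KnowledgeBaseBrowser(Component):
    """Knowledge-base browser component (sketch)."""

    # Events the frontend is expected to emit
    EVENTS = [Events.submit, Events.select]

    def __init__(self, value=None, *, label=None, **kwargs):
        super().__init__(value=value, label=label, **kwargs)

    def preprocess(self, payload):
        # Frontend -> Python: accept a raw query string or agent-emitted JSON
        if isinstance(payload, dict):
            return payload.get("query", "")
        return payload or ""

    def postprocess(self, value):
        # Python -> frontend: normalize to {"query": ..., "results": [...]}
        if isinstance(value, str):
            return {"query": value, "results": search(value)}
        return value or {"query": "", "results": []}

    def example_payload(self):
        return "retrieval augmented generation"

    def example_value(self):
        return {"query": "retrieval augmented generation", "results": []}

    def api_info(self):
        return {"type": "object", "description": "Knowledge-base query/results"}
```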
Step 5 – Demo app
5️⃣ Create `demo.py`
- Loads the component
- Adds a text input + "Ask" button
- Shows an agent example that calls the component via MCP
(A minimal demo sketch follows this step.)
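A minimal `demo.py` sketch under the same assumptions (the `KnowledgeBaseBrowser` class from the Step 4 sketch and the Step 2 `search()` helper); only the `gr.Blocks` wiring is standard Gradio API, and the MCP agent hookup is reduced to a plain Python function here.

```python
import gradio as gr

from kb_browser import KnowledgeBaseBrowser  # assumed component (Step 4 sketch)
from kb_browser.retriever import search      # assumed retriever (Step 2 sketch)


def ask(query: str) -> dict:
    """Human path: run the retriever and hand results to the component."""
    return {"query": query, "results": search(query)}


def agent_ask(question: str) -> list[dict]:
    """Agent path: return citation dicts instead of rendering them."""
    return [{"title": r["title"], "url": r["url"]} for r in search(question)]


with gr.Blocks() as demo:
    gr.Markdown("## Knowledge-Base Browser demo")
    query = gr.Textbox(label="Ask the knowledge base")
    ask_btn = gr.Button("Ask")
    browser = KnowledgeBaseBrowser(label="Results")
    ask_btn.click(ask, inputs=query, outputs=browser)

if __name__ == "__main__":
    demo.launch()
```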
Step 6 – Tests & publishing
6️⃣ Provide a pytest suite for backend & frontend (example backend test below)
- CI workflow YAML
7️⃣ Command to publish: `gradio cc publish kb_browser --name "KnowledgeBaseBrowser"`
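An example of what the backend part of the pytest suite could check, assuming `kb_browser.retriever.search` returns the `title`/`snippet`/`url` schema from the MVP list and that `./data` already contains at least one indexed document.

```python
from kb_browser.retriever import search  # assumed module from Step 2


def test_search_returns_ranked_dicts():
    results = search("retrieval augmented generation", k=3)
    assert isinstance(results, list)
    assert len(results) <= 3
    for doc in results:
        # Every hit should carry the fields the UI and the agent rely on
        assert {"title", "snippet", "url"} <= set(doc)


def test_empty_query_does_not_crash():
    # Exact behaviour for an empty query is a design choice; at minimum it
    # should not raise and should still return a list.
    assert isinstance(search("", k=1), list)
```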
(After each step: run `npm run dev` + `python demo.py`, fix issues, then proceed.)
Pro-Tips for Implementation
- Keep package size < 2 MB (judging criteria).
- Defer heavy work to backend; UI stays lightweight.
- Use streaming in Gradio (yield) for a snappy UX (see the sketch after this list).
- Cache index on disk to slash startup time.
- Include a themed dark/light toggle – easy polish points.
- Record a GIF of the agent citing docs live – eye-catching in the demo.
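To illustrate the streaming tip above: a generator event handler lets Gradio push partial results to the UI on every `yield`. The `search()` helper is the assumed Step 2 retriever; everything else is standard Gradio.

```python
import gradio as gr

from kb_browser.retriever import search  # assumed retriever from Step 2


def stream_results(query: str):
    partial = []
    for doc in search(query, k=5):
        partial.append(doc)
        yield partial  # Gradio re-renders the output after every yield


with gr.Blocks() as demo:
    q = gr.Textbox(label="Query")
    out = gr.JSON(label="Results (streamed)")
    q.submit(stream_results, q, out)

demo.launch()
```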
Implementation Status
✅ Completed Features
- Component Scaffolding: Complete Gradio custom component structure with proper TypeScript and Python files
- Backend Retrieval Service: LlamaIndex + FAISS vector store with OpenAI embeddings for semantic search
- Frontend UI: React TypeScript interface with modern design, expandable result cards, and source links
- Search Capabilities: Semantic, keyword, and hybrid search modes with relevance scoring
- Citation Management: Real-time citation tracking with export functionality
- Agent Integration: Both human interactive mode and AI agent autonomous research capabilities
- Documentation: Comprehensive README, API documentation, and usage examples
- Testing: Test suite covering core functionality and edge cases
- Publishing Setup: Package configuration and publishing scripts ready
🎯 Key Technical Achievements
- Authentic Data Integration: Uses real OpenAI embeddings for semantic search instead of mock data
- Production-Ready Architecture: Proper error handling, fallback mechanisms, and caching
- Multi-Modal Search: Supports different search strategies for various use cases
- Source Verification: Includes proper citation tracking and source links
- Agent-Ready Design: Built for both human users and autonomous AI agents
📁 Project Structure

```
kb_browser/
├── __init__.py           # Main Gradio component class
├── retriever.py          # LlamaIndex + FAISS backend
├── script.tsx            # React TypeScript frontend
├── index.html            # Component HTML template
├── package.json          # Frontend dependencies
├── pyproject.toml        # Python package configuration
└── README.md             # Component documentation

Root/
├── demo.py               # Human + Agent demo application
├── gradio_demo.py        # Complete Gradio demo
├── test_kb_browser.py    # Comprehensive test suite
├── verify_component.py   # Component verification script
└── docs/
    └── master-plan.md    # This master plan document
```
🚀 Usage Examples
Basic Component Usage:
```python
from kb_browser import KnowledgeBrowser

kb_browser = KnowledgeBrowser(
    index_path="./documents",
    search_type="semantic",
    max_results=10,
)

results = kb_browser.search("retrieval augmented generation")
```
Agent Integration:
```python
def agent_research(question):
    results = kb_browser.search(question, search_type="semantic")
    citations = [
        {"title": doc["title"], "source": doc["source"]}
        for doc in results["results"]
    ]
    return citations
```
Human Interface:
```python
import gradio as gr

with gr.Blocks() as demo:
    query = gr.Textbox(label="Search Query")
    search_btn = gr.Button("Search")
    results = gr.JSON(label="Results")
    search_btn.click(kb_browser.search, query, results)
```
Execute the six prompt blocks sequentially and you'll have a polished, judge-ready custom component by Friday. Good luck!