Update readme.md
- README copy.md +0 -344
- README.md +337 -5
README copy.md
DELETED
@@ -1,344 +0,0 @@
|
|
1 |
-
---
|
2 |
-
title: KnowledgeBridge
|
3 |
-
emoji: π
|
4 |
-
colorFrom: yellow
|
5 |
-
colorTo: red
|
6 |
-
sdk: docker
|
7 |
-
pinned: false
|
8 |
-
license: mit
|
9 |
-
short_description: 'A sophisticated AI-powered knowledge retrieval and analysis '
|
10 |
-
tags:
|
11 |
-
- agent-demo-track
|
12 |
-
---
|
13 |
-
|
14 |
-
# KnowledgeBridge
|
15 |
-
|
16 |
-
π **An AI-Enhanced Knowledge Discovery Platform**
|
17 |
-
|
18 |
-
A sophisticated AI-powered knowledge retrieval and analysis system that combines semantic search, real-time web integration, and intelligent document processing for research and information discovery.
|
19 |
-
|
20 |
-

|
21 |
-

|
22 |
-

|
23 |
-

|
24 |
-
|
25 |
-
## 🎯 Hackathon Submission
|
26 |
-
|
27 |
-
**🤖 Track 3: Agentic Demo Showcase**
|
28 |
-
|
29 |
-
**Submitted to**: [Hugging Face Agents-MCP-Hackathon](https://huggingface.co/Agents-MCP-Hackathon)
|
30 |
-
|
31 |
-
**Live Demo**: [Try KnowledgeBridge on Hugging Face Spaces](https://huggingface.co/spaces/YOUR_USERNAME/KnowledgeBridge)
|
32 |
-
|
33 |
-
### **π "Show us the most incredible things that your agents can do!"**
|
34 |
-
|
35 |
-
KnowledgeBridge demonstrates sophisticated AI agent orchestration through multi-modal knowledge discovery, intelligent query enhancement, and autonomous research synthesis.
|
36 |
-
|
37 |
-
## 🤖 Agentic Capabilities Showcase
|
38 |
-
|
39 |
-
### π§ **Multi-Agent Orchestration**
|
40 |
-
- **Coordinated Search Agents**: Simultaneous deployment across GitHub, Wikipedia, ArXiv, and web sources
|
41 |
-
- **Intelligent Load Balancing**: Agents dynamically distribute workload based on query type and source availability
|
42 |
-
- **Fallback Agent Strategy**: Backup agents activate when primary sources fail or time out
|
43 |
-
- **Real-Time Coordination**: Agents communicate results and adapt search strategies collaboratively
|
44 |
-
|
45 |
-
### π **Query Enhancement Agents**
|
46 |
-
- **Intent Recognition Agents**: AI agents analyze user intent and suggest optimal search strategies
|
47 |
-
- **Semantic Expansion Agents**: Agents enhance queries with related terms and concepts
|
48 |
-
- **Context-Aware Agents**: Agents consider previous searches and user preferences
|
49 |
-
- **Multi-Modal Query Agents**: Agents adapt search approach based on content type (code, academic, general)
|
50 |
-
|
51 |
-
### π **Analysis & Synthesis Agents**
|
52 |
-
- **Document Processing Agents**: Autonomous analysis with configurable reasoning (summary, classification, key points)
|
53 |
-
- **Research Synthesis Agents**: AI agents combine insights from multiple sources into coherent analysis
|
54 |
-
- **Quality Assessment Agents**: Agents evaluate source credibility and content relevance
|
55 |
-
- **Format Adaptation Agents**: Agents dynamically adjust output format (markdown/plain text) based on user needs
|
56 |
-
|
57 |
-
### 🛡️ **Security & Validation Agents**
|
58 |
-
- **URL Validation Agents**: Intelligent agents verify link accessibility and content authenticity
|
59 |
-
- **Rate Limiting Agents**: Protective agents prevent API abuse (100 requests/15min, 10/min for sensitive endpoints)
|
60 |
-
- **Input Sanitization Agents**: Security agents validate and clean all user inputs
|
61 |
-
- **Error Recovery Agents**: Resilient agents handle failures gracefully and maintain system stability
|
62 |
-
|
63 |
-
### π **Intelligent Integration Agents**
|
64 |
-
- **ArXiv Academic Agents**: Specialized agents for academic paper validation and retrieval
|
65 |
-
- **GitHub Repository Agents**: Code-focused agents with author filtering and relevance scoring
|
66 |
-
- **Wikipedia Knowledge Agents**: Authoritative content agents with intelligent caching strategies
|
67 |
-
- **Cross-Platform Synthesis Agents**: Agents that combine and rank results across all sources
|
68 |
-
|
69 |
-
## ποΈ Technical Architecture
|
70 |
-
|
71 |
-
### **Frontend Stack**
|
72 |
-
- **React 18** with TypeScript for type-safe development
|
73 |
-
- **Wouter Router** for lightweight client-side routing
|
74 |
-
- **TanStack Query** for efficient data fetching and caching
|
75 |
-
- **Radix UI + Tailwind CSS** for accessible, modern components
|
76 |
-
- **Framer Motion** for smooth animations and transitions
|
77 |
-
|
78 |
-
### **Backend Stack**
|
79 |
-
- **Node.js + Express** with comprehensive middleware
|
80 |
-
- **Nebius AI** integration with DeepSeek models
|
81 |
-
- **Modal** for distributed processing and scalability
|
82 |
-
- **Express Rate Limit** for API protection
|
83 |
-
- **Helmet.js** for security headers
|
84 |
-
|
85 |
-
### **AI & Processing**
|
86 |
-
- **DeepSeek-R1-0528** for chat completions and document analysis
|
87 |
-
- **BAAI/bge-en-icl** for embedding generation
|
88 |
-
- **Modal Client** for distributed compute tasks
|
89 |
-
- **Smart Ingestion Service** for advanced document processing
|
90 |
-
|
91 |
-
## π Quick Start
|
92 |
-
|
93 |
-
### **Environment Configuration**
|
94 |
-
|
95 |
-
Create a `.env` file in the project root:
|
96 |
-
|
97 |
-
```bash
|
98 |
-
# Nebius AI Configuration (Required)
|
99 |
-
NEBIUS_API_KEY=your_nebius_api_key_here
|
100 |
-
|
101 |
-
# Modal Configuration (Optional - for advanced processing)
|
102 |
-
MODAL_TOKEN_ID=your_modal_token_id
|
103 |
-
MODAL_TOKEN_SECRET=your_modal_token_secret
|
104 |
-
MODAL_BASE_URL=your_modal_endpoint
|
105 |
-
|
106 |
-
# GitHub Configuration (Optional - for repository search)
|
107 |
-
GITHUB_TOKEN=your_github_token_here
|
108 |
-
|
109 |
-
# Node Environment
|
110 |
-
NODE_ENV=development
|
111 |
-
```
|
112 |
-
|
113 |
-
### **Development Setup**
|
114 |
-
|
115 |
-
```bash
|
116 |
-
# Install dependencies
|
117 |
-
npm install
|
118 |
-
|
119 |
-
# Start development server
|
120 |
-
npm run dev
|
121 |
-
|
122 |
-
# Build for production
|
123 |
-
npm run build
|
124 |
-
|
125 |
-
# Type checking
|
126 |
-
npm run check
|
127 |
-
```
|
128 |
-
|
129 |
-
The application will be available at `http://localhost:5000`
|
130 |
-
|
131 |
-
## 🎯 Usage Guide
|
132 |
-
|
133 |
-
### **Search Interface**
|
134 |
-
1. **Basic Search**: Enter queries in natural language
|
135 |
-
2. **AI Enhancement**: Click the sparkle icon to improve your query
|
136 |
-
3. **Advanced Search**: Use the AI tools panel for document analysis
|
137 |
-
4. **Export Results**: Generate citations in multiple formats
|
138 |
-
|
139 |
-
### **AI Tools**
|
140 |
-
- **Document Analysis**: Paste content for AI-powered analysis with configurable formatting
|
141 |
-
- **Embeddings**: Generate vector representations of text
|
142 |
-
- **Query Enhancement**: Get AI suggestions for better search queries
|
143 |
-
|
144 |
-
### **Knowledge Graph**
|
145 |
-
- Interactive visualization of document relationships
|
146 |
-
- Filter by concepts, authors, and source types
|
147 |
-
- Explore connections between research papers and topics
|
148 |
-
|
149 |
-
## π§ API Reference
|
150 |
-
|
151 |
-
### **Search Endpoints**
|
152 |
-
```typescript
|
153 |
-
POST /api/search
|
154 |
-
{
|
155 |
-
query: string;
|
156 |
-
searchType: "semantic" | "keyword" | "hybrid";
|
157 |
-
limit: number;
|
158 |
-
filters?: {
|
159 |
-
sourceTypes?: string[];
|
160 |
-
};
|
161 |
-
}
|
162 |
-
```
|
163 |
-
|
164 |
-
### **AI Analysis Endpoints**
|
165 |
-
```typescript
|
166 |
-
POST /api/analyze-document
|
167 |
-
{
|
168 |
-
content: string;
|
169 |
-
analysisType: "summary" | "classification" | "key_points" | "quality_score";
|
170 |
-
useMarkdown?: boolean;
|
171 |
-
}
|
172 |
-
|
173 |
-
POST /api/enhance-query
|
174 |
-
{
|
175 |
-
query: string;
|
176 |
-
context?: string;
|
177 |
-
}
|
178 |
-
|
179 |
-
POST /api/embeddings
|
180 |
-
{
|
181 |
-
input: string;
|
182 |
-
model?: string;
|
183 |
-
}
|
184 |
-
```
|
185 |
-
|
186 |
-
### **Health Check**
|
187 |
-
```typescript
|
188 |
-
GET /api/health
|
189 |
-
// Returns comprehensive health status of all services
|
190 |
-
```
|
191 |
-
|
192 |
-
## π Performance & Reliability
|
193 |
-
|
194 |
-
### **Response Times**
|
195 |
-
- Local search: <100ms for semantic queries
|
196 |
-
- Document analysis: ~3-5 seconds depending on content length
|
197 |
-
- URL validation: <2 seconds per URL with concurrent processing
|
198 |
-
- Embedding generation: ~500ms-1s per request
|
199 |
-
|
200 |
-
### **Scalability Features**
|
201 |
-
- Rate limiting prevents API abuse
|
202 |
-
- Concurrent URL validation with configurable limits
|
203 |
-
- Efficient caching for repeated queries
|
204 |
-
- Graceful degradation when external services are unavailable
|
205 |
-
|
206 |
-
### **Error Handling**
|
207 |
-
- React Error Boundaries prevent UI crashes
|
208 |
-
- Comprehensive API error responses
|
209 |
-
- Automatic retry logic for network requests
|
210 |
-
- User-friendly error messages
|
211 |
-
|
212 |
-
## π Security Features
|
213 |
-
|
214 |
-
### **Input Protection**
|
215 |
-
- Request body size limits (10MB)
|
216 |
-
- Comprehensive input sanitization
|
217 |
-
- SQL injection prevention
|
218 |
-
- XSS protection with CSP headers
|
219 |
-
|
220 |
-
### **API Security**
|
221 |
-
- Rate limiting on all endpoints
|
222 |
-
- Secure environment variable handling
|
223 |
-
- No hardcoded credentials
|
224 |
-
- Proper error logging without information disclosure
|
225 |
-
|
226 |
-
### **Infrastructure Security**
|
227 |
-
- Helmet.js security headers
|
228 |
-
- CORS configuration
|
229 |
-
- Secure cookie handling
|
230 |
-
- Production-ready error handling
|
231 |
-
|
232 |
-
## 🛠️ Development
|
233 |
-
|
234 |
-
### **Code Quality**
|
235 |
-
- 100% TypeScript coverage
|
236 |
-
- ESLint + Prettier configuration
|
237 |
-
- Comprehensive error handling
|
238 |
-
- Type-safe API contracts with Zod validation
|
239 |
-
|
240 |
-
### **Testing**
|
241 |
-
```bash
|
242 |
-
# Type checking
|
243 |
-
npm run check
|
244 |
-
|
245 |
-
# Development server
|
246 |
-
npm run dev
|
247 |
-
|
248 |
-
# Production build
|
249 |
-
npm run build
|
250 |
-
```
|
251 |
-
|
252 |
-
## π Recent Updates
|
253 |
-
|
254 |
-
- ✅ **Security Hardening**: Removed all hardcoded credentials, added comprehensive security middleware
|
255 |
-
- ✅ **TypeScript Migration**: Achieved 100% type safety across the entire codebase
|
256 |
-
- ✅ **URL Validation**: Intelligent filtering of broken and invalid links
|
257 |
-
- ✅ **Error Handling**: React Error Boundaries and improved server error handling
|
258 |
-
- ✅ **AI Enhancement**: Nebius AI integration with configurable document analysis
|
259 |
-
- ✅ **Performance**: Rate limiting, input validation, and optimized processing
|
260 |
-
|
261 |
-
## π Architecture Highlights
|
262 |
-
|
263 |
-
### **AI Integration**
|
264 |
-
- **Nebius AI**: Primary AI service for all language model tasks
|
265 |
-
- **DeepSeek Models**: State-of-the-art reasoning capabilities
|
266 |
-
- **Modal Integration**: Distributed processing for heavy workloads
|
267 |
-
- **Embedding Search**: Semantic similarity matching
|
268 |
-
|
269 |
-
### **Data Flow**
|
270 |
-
1. User query → AI query enhancement (optional)
|
271 |
-
2. Parallel search: local storage + external sources
|
272 |
-
3. URL validation and content verification
|
273 |
-
4. Result ranking and relevance scoring
|
274 |
-
5. AI-powered analysis and synthesis
|
275 |
-
|
276 |
-
### **Component Architecture**
|
277 |
-
- **Enhanced Search Interface**: Unified search and AI tools
|
278 |
-
- **Knowledge Graph**: Interactive data visualization
|
279 |
-
- **Result Cards**: Rich content display with citations
|
280 |
-
- **Error Boundaries**: Resilient error handling
|
281 |
-
|
282 |
-
## π Track 3: Agentic Demo Showcase Features
|
283 |
-
|
284 |
-
### **🤖 "Show us the most incredible things that your agents can do!"**
|
285 |
-
|
286 |
-
KnowledgeBridge demonstrates sophisticated multi-agent systems in action:
|
287 |
-
|
288 |
-
### **π§ Autonomous Agent Workflows**
|
289 |
-
- **Smart Agent Coordination**: Multiple specialized agents work together to fulfill complex research tasks
|
290 |
-
- **Adaptive Agent Behavior**: Agents dynamically adjust strategies based on query complexity and source availability
|
291 |
-
- **Multi-Modal Agent Processing**: Different agent types (search, analysis, validation) collaborate seamlessly
|
292 |
-
- **Intelligent Agent Fallbacks**: Backup agents activate automatically when primary agents encounter issues
|
293 |
-
|
294 |
-
### **π Real-Time Agent Decision Making**
|
295 |
-
- **Query Analysis Agents**: Instantly determine optimal search strategies across 4+ sources
|
296 |
-
- **Load Balancing Agents**: Distribute workload intelligently based on API response times and rate limits
|
297 |
-
- **Quality Control Agents**: Evaluate and filter results in real time for relevance and authenticity
|
298 |
-
- **Synthesis Agents**: Combine disparate information sources into coherent, actionable insights
|
299 |
-
|
300 |
-
### **π Advanced Agent Orchestration**
|
301 |
-
- **Parallel Agent Execution**: Simultaneous deployment of search agents across GitHub, Wikipedia, ArXiv
|
302 |
-
- **Agent Communication Protocols**: Real-time coordination between agents for optimal resource utilization
|
303 |
-
- **Adaptive Agent Learning**: Agents improve performance based on user interactions and feedback
|
304 |
-
- **Error Recovery Agents**: Autonomous problem-solving when individual agents encounter failures
|
305 |
-
|
306 |
-
### **🛡️ Production-Grade Agent Infrastructure**
|
307 |
-
- **Security Agent Monitoring**: Continuous protection against abuse with intelligent rate limiting
|
308 |
-
- **Validation Agent Networks**: Multi-layer content verification and URL authenticity checking
|
309 |
-
- **Performance Agent Optimization**: Automatic scaling and resource management for enterprise workloads
|
310 |
-
- **Resilience Agent Systems**: Graceful degradation and fault tolerance across all agent operations
|
311 |
-
|
312 |
-
### **⚡ Agent Performance Metrics**
|
313 |
-
- **Sub-second Agent Response**: Query analysis and routing in <100ms
|
314 |
-
- **Concurrent Agent Processing**: 4+ agents working simultaneously on complex research tasks
|
315 |
-
- **Intelligent Agent Caching**: Smart result storage and retrieval for enhanced performance
|
316 |
-
- **Scalable Agent Architecture**: Horizontal scaling support for enterprise deployment
|
317 |
-
|
318 |
-
## π License
|
319 |
-
|
320 |
-
MIT License - see [LICENSE](LICENSE) file for details.
|
321 |
-
|
322 |
-
## π Related Resources
|
323 |
-
|
324 |
-
- [Nebius AI Documentation](https://docs.nebius.ai/)
|
325 |
-
- [Modal Documentation](https://modal.com/docs)
|
326 |
-
- [React Query Documentation](https://tanstack.com/query/latest)
|
327 |
-
- [Radix UI Components](https://www.radix-ui.com/)
|
328 |
-
|
329 |
-
---
|
330 |
-
|
331 |
-
## π Agents-MCP-Hackathon Submission Summary
|
332 |
-
|
333 |
-
**KnowledgeBridge** showcases the incredible power of AI agents through:
|
334 |
-
|
335 |
-
🤖 **Multi-Agent Orchestration** - Coordinated intelligence across search, analysis, and synthesis agents
|
336 |
-
π **Real-Time Decision Making** - Agents adapt strategies and optimize performance dynamically
|
337 |
-
π **Advanced Agent Workflows** - Complex multi-step processes handled autonomously
|
338 |
-
🛡️ **Production-Ready Agent Infrastructure** - Enterprise-grade security and resilience
|
339 |
-
|
340 |
-
**Track 3: Agentic Demo Showcase** - Demonstrating what happens when sophisticated AI agents work together to revolutionize knowledge discovery and research workflows.
|
341 |
-
|
342 |
-
**Built for the Hugging Face Agents-MCP-Hackathon** π
|
343 |
-
|
344 |
-
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
README.md
CHANGED
@@ -1,12 +1,344 @@
|
|
1 |
---
|
2 |
title: KnowledgeBridge
|
3 |
-
emoji:
|
4 |
-
colorFrom:
|
5 |
-
colorTo:
|
6 |
sdk: docker
|
7 |
pinned: false
|
8 |
license: mit
|
9 |
-
short_description: '
|
10 |
---
|
11 |
|
12 |
-
|
1 |
---
|
2 |
title: KnowledgeBridge
|
3 |
+
emoji: π
|
4 |
+
colorFrom: yellow
|
5 |
+
colorTo: red
|
6 |
sdk: docker
|
7 |
pinned: false
|
8 |
license: mit
|
9 |
+
short_description: 'A sophisticated AI-powered knowledge retrieval and analysis '
|
10 |
+
tags:
|
11 |
+
- agent-demo-track
|
12 |
---
|
13 |
|
14 |
+
# KnowledgeBridge
|
15 |
+
|
16 |
+
π **An AI-Enhanced Knowledge Discovery Platform**
|
17 |
+
|
18 |
+
A sophisticated AI-powered knowledge retrieval and analysis system that combines semantic search, real-time web integration, and intelligent document processing for research and information discovery.
|
19 |
+
|
20 |
+

|
21 |
+

|
22 |
+

|
23 |
+

|
24 |
+
|
25 |
+
## 🎯 Hackathon Submission
|
26 |
+
|
27 |
+
**🤖 Track 3: Agentic Demo Showcase**
|
28 |
+
|
29 |
+
**Submitted to**: [Hugging Face Agents-MCP-Hackathon](https://huggingface.co/Agents-MCP-Hackathon)
|
30 |
+
|
31 |
+
**Live Demo**: [Try KnowledgeBridge on Hugging Face Spaces](https://huggingface.co/spaces/YOUR_USERNAME/KnowledgeBridge)
|
32 |
+
|
33 |
+
### **π "Show us the most incredible things that your agents can do!"**
|
34 |
+
|
35 |
+
KnowledgeBridge demonstrates sophisticated AI agent orchestration through multi-modal knowledge discovery, intelligent query enhancement, and autonomous research synthesis.
|
36 |
+
|
37 |
+
## 🤖 Agentic Capabilities Showcase
|
38 |
+
|
39 |
+
### π§ **Multi-Agent Orchestration**
|
40 |
+
- **Coordinated Search Agents**: Simultaneous deployment across GitHub, Wikipedia, ArXiv, and web sources
|
41 |
+
- **Intelligent Load Balancing**: Agents dynamically distribute workload based on query type and source availability
|
42 |
+
- **Fallback Agent Strategy**: Backup agents activate when primary sources fail or time out
|
43 |
+
- **Real-Time Coordination**: Agents communicate results and adapt search strategies collaboratively
|
44 |
+
|
45 |
+
### π **Query Enhancement Agents**
|
46 |
+
- **Intent Recognition Agents**: AI agents analyze user intent and suggest optimal search strategies
|
47 |
+
- **Semantic Expansion Agents**: Agents enhance queries with related terms and concepts
|
48 |
+
- **Context-Aware Agents**: Agents consider previous searches and user preferences
|
49 |
+
- **Multi-Modal Query Agents**: Agents adapt search approach based on content type (code, academic, general)
|
50 |
+
|
51 |
+
### π **Analysis & Synthesis Agents**
|
52 |
+
- **Document Processing Agents**: Autonomous analysis with configurable reasoning (summary, classification, key points)
|
53 |
+
- **Research Synthesis Agents**: AI agents combine insights from multiple sources into coherent analysis
|
54 |
+
- **Quality Assessment Agents**: Agents evaluate source credibility and content relevance
|
55 |
+
- **Format Adaptation Agents**: Agents dynamically adjust output format (markdown/plain text) based on user needs
|
56 |
+
|
57 |
+
### 🛡️ **Security & Validation Agents**
|
58 |
+
- **URL Validation Agents**: Intelligent agents verify link accessibility and content authenticity
|
59 |
+
- **Rate Limiting Agents**: Protective agents prevent API abuse (100 requests/15min, 10/min for sensitive endpoints)
|
60 |
+
- **Input Sanitization Agents**: Security agents validate and clean all user inputs
|
61 |
+
- **Error Recovery Agents**: Resilient agents handle failures gracefully and maintain system stability
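
The two rate limits quoted above (100 requests per 15 minutes, 10 per minute for sensitive endpoints) can be illustrated with a small fixed-window counter. This is only a sketch of the idea, not the project's middleware and not the `express-rate-limit` implementation; the `FixedWindowLimiter` class and the constant names are hypothetical.

```typescript
// Minimal fixed-window rate limiter. Illustrative only: the README says the
// project uses Express Rate Limit; this class is a hypothetical stand-in.
interface Limit {
  windowMs: number; // length of one counting window in milliseconds
  max: number; // requests allowed per window
}

// Numbers taken from the README; the constant names are assumptions.
const GENERAL_LIMIT: Limit = { windowMs: 15 * 60 * 1000, max: 100 };
const SENSITIVE_LIMIT: Limit = { windowMs: 60 * 1000, max: 10 };

class FixedWindowLimiter {
  private readonly limit: Limit;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(limit: Limit) {
    this.limit = limit;
  }

  // Returns true if the request from `key` (for example a client IP)
  // is allowed at time `now` (milliseconds).
  allow(key: string, now: number): boolean {
    const current = this.windows.get(key);
    if (current === undefined || now - current.start >= this.limit.windowMs) {
      // First request, or the previous window has expired: start a new one.
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }
    if (current.count >= this.limit.max) return false;
    current.count += 1;
    return true;
  }
}
```

A fixed window is the simplest variant; it permits short bursts at window boundaries, which a sliding window or token bucket would smooth out.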
|
62 |
+
|
63 |
+
### π **Intelligent Integration Agents**
|
64 |
+
- **ArXiv Academic Agents**: Specialized agents for academic paper validation and retrieval
|
65 |
+
- **GitHub Repository Agents**: Code-focused agents with author filtering and relevance scoring
|
66 |
+
- **Wikipedia Knowledge Agents**: Authoritative content agents with intelligent caching strategies
|
67 |
+
- **Cross-Platform Synthesis Agents**: Agents that combine and rank results across all sources
|
68 |
+
|
69 |
+
## ποΈ Technical Architecture
|
70 |
+
|
71 |
+
### **Frontend Stack**
|
72 |
+
- **React 18** with TypeScript for type-safe development
|
73 |
+
- **Wouter Router** for lightweight client-side routing
|
74 |
+
- **TanStack Query** for efficient data fetching and caching
|
75 |
+
- **Radix UI + Tailwind CSS** for accessible, modern components
|
76 |
+
- **Framer Motion** for smooth animations and transitions
|
77 |
+
|
78 |
+
### **Backend Stack**
|
79 |
+
- **Node.js + Express** with comprehensive middleware
|
80 |
+
- **Nebius AI** integration with DeepSeek models
|
81 |
+
- **Modal** for distributed processing and scalability
|
82 |
+
- **Express Rate Limit** for API protection
|
83 |
+
- **Helmet.js** for security headers
|
84 |
+
|
85 |
+
### **AI & Processing**
|
86 |
+
- **DeepSeek-R1-0528** for chat completions and document analysis
|
87 |
+
- **BAAI/bge-en-icl** for embedding generation
|
88 |
+
- **Modal Client** for distributed compute tasks
|
89 |
+
- **Smart Ingestion Service** for advanced document processing
|
90 |
+
|
91 |
+
## π Quick Start
|
92 |
+
|
93 |
+
### **Environment Configuration**
|
94 |
+
|
95 |
+
Create a `.env` file in the project root:
|
96 |
+
|
97 |
+
```bash
|
98 |
+
# Nebius AI Configuration (Required)
|
99 |
+
NEBIUS_API_KEY=your_nebius_api_key_here
|
100 |
+
|
101 |
+
# Modal Configuration (Optional - for advanced processing)
|
102 |
+
MODAL_TOKEN_ID=your_modal_token_id
|
103 |
+
MODAL_TOKEN_SECRET=your_modal_token_secret
|
104 |
+
MODAL_BASE_URL=your_modal_endpoint
|
105 |
+
|
106 |
+
# GitHub Configuration (Optional - for repository search)
|
107 |
+
GITHUB_TOKEN=your_github_token_here
|
108 |
+
|
109 |
+
# Node Environment
|
110 |
+
NODE_ENV=development
|
111 |
+
```
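
A startup check along the following lines could surface a missing key early. It is a minimal sketch rather than the project's actual loader: the `checkEnv` helper and its return shape are hypothetical, and only the variable names come from the `.env` listing above.

```typescript
// Hypothetical helper: reports which variables from the .env listing are unset.
// Variable names come from the README; the function itself is illustrative.
type Env = Record<string, string | undefined>;

const REQUIRED_VARS = ["NEBIUS_API_KEY"];
const OPTIONAL_VARS = [
  "MODAL_TOKEN_ID",
  "MODAL_TOKEN_SECRET",
  "MODAL_BASE_URL",
  "GITHUB_TOKEN",
];

interface EnvReport {
  missingRequired: string[];
  missingOptional: string[];
}

// A variable counts as set only if it is present and not blank.
function isSet(env: Env, name: string): boolean {
  const value = env[name];
  return value !== undefined && value.trim() !== "";
}

function checkEnv(env: Env): EnvReport {
  return {
    missingRequired: REQUIRED_VARS.filter((name) => !isSet(env, name)),
    missingOptional: OPTIONAL_VARS.filter((name) => !isSet(env, name)),
  };
}
```

In a Node.js process this would typically be called as `checkEnv(process.env)`, refusing to start when `missingRequired` is non-empty and logging `missingOptional` as a warning.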
|
112 |
+
|
113 |
+
### **Development Setup**
|
114 |
+
|
115 |
+
```bash
|
116 |
+
# Install dependencies
|
117 |
+
npm install
|
118 |
+
|
119 |
+
# Start development server
|
120 |
+
npm run dev
|
121 |
+
|
122 |
+
# Build for production
|
123 |
+
npm run build
|
124 |
+
|
125 |
+
# Type checking
|
126 |
+
npm run check
|
127 |
+
```
|
128 |
+
|
129 |
+
The application will be available at `http://localhost:5000`
|
130 |
+
|
131 |
+
## 🎯 Usage Guide
|
132 |
+
|
133 |
+
### **Search Interface**
|
134 |
+
1. **Basic Search**: Enter queries in natural language
|
135 |
+
2. **AI Enhancement**: Click the sparkle icon to improve your query
|
136 |
+
3. **Advanced Search**: Use the AI tools panel for document analysis
|
137 |
+
4. **Export Results**: Generate citations in multiple formats
|
138 |
+
|
139 |
+
### **AI Tools**
|
140 |
+
- **Document Analysis**: Paste content for AI-powered analysis with configurable formatting
|
141 |
+
- **Embeddings**: Generate vector representations of text
|
142 |
+
- **Query Enhancement**: Get AI suggestions for better search queries
|
143 |
+
|
144 |
+
### **Knowledge Graph**
|
145 |
+
- Interactive visualization of document relationships
|
146 |
+
- Filter by concepts, authors, and source types
|
147 |
+
- Explore connections between research papers and topics
|
148 |
+
|
149 |
+
## π§ API Reference
|
150 |
+
|
151 |
+
### **Search Endpoints**
|
152 |
+
```typescript
|
153 |
+
POST /api/search
|
154 |
+
{
|
155 |
+
query: string;
|
156 |
+
searchType: "semantic" | "keyword" | "hybrid";
|
157 |
+
limit: number;
|
158 |
+
filters?: {
|
159 |
+
sourceTypes?: string[];
|
160 |
+
};
|
161 |
+
}
|
162 |
+
```
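
As a rough illustration of the request shape above, the sketch below builds a body for `POST /api/search`. The `buildSearchRequest` helper, its defaults (`"hybrid"`, limit 10), and the `"arxiv"` source type value are assumptions made for the example, not documented behavior.

```typescript
// Request shape copied from the README; everything else is illustrative.
type SearchType = "semantic" | "keyword" | "hybrid";

interface SearchRequest {
  query: string;
  searchType: SearchType;
  limit: number;
  filters?: { sourceTypes?: string[] };
}

// Hypothetical helper: trims the query and fills in assumed defaults.
function buildSearchRequest(
  query: string,
  searchType: SearchType = "hybrid",
  limit: number = 10,
  sourceTypes?: string[]
): SearchRequest {
  const trimmed = query.trim();
  if (trimmed === "") throw new Error("query must not be empty");
  if (!Number.isInteger(limit) || limit < 1) {
    throw new Error("limit must be a positive integer");
  }
  const request: SearchRequest = { query: trimmed, searchType, limit };
  if (sourceTypes !== undefined && sourceTypes.length > 0) {
    request.filters = { sourceTypes };
  }
  return request;
}

// JSON string that would be sent as the POST /api/search body.
// "arxiv" is an assumed sourceTypes value, used here only as an example.
const searchBody = JSON.stringify(
  buildSearchRequest("  vector databases ", "semantic", 5, ["arxiv"])
);
```

Sending `searchBody` with a `Content-Type: application/json` header is the usual way to call an Express JSON endpoint like this one.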
|
163 |
+
|
164 |
+
### **AI Analysis Endpoints**
|
165 |
+
```typescript
|
166 |
+
POST /api/analyze-document
|
167 |
+
{
|
168 |
+
content: string;
|
169 |
+
analysisType: "summary" | "classification" | "key_points" | "quality_score";
|
170 |
+
useMarkdown?: boolean;
|
171 |
+
}
|
172 |
+
|
173 |
+
POST /api/enhance-query
|
174 |
+
{
|
175 |
+
query: string;
|
176 |
+
context?: string;
|
177 |
+
}
|
178 |
+
|
179 |
+
POST /api/embeddings
|
180 |
+
{
|
181 |
+
input: string;
|
182 |
+
model?: string;
|
183 |
+
}
|
184 |
+
```
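
The `/api/analyze-document` body could be checked at runtime before it is sent or handled. The README mentions Zod for this; to stay dependency-free, the sketch below uses a hand-written type guard instead, and the `isAnalyzeDocumentRequest` name is hypothetical.

```typescript
// Allowed values copied from the README's analysisType union.
const ANALYSIS_TYPES = ["summary", "classification", "key_points", "quality_score"] as const;
type AnalysisType = (typeof ANALYSIS_TYPES)[number];

interface AnalyzeDocumentRequest {
  content: string;
  analysisType: AnalysisType;
  useMarkdown?: boolean;
}

// Hypothetical runtime guard: a hand-rolled stand-in for a Zod schema.
function isAnalyzeDocumentRequest(value: unknown): value is AnalyzeDocumentRequest {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const analysisType = v["analysisType"];
  const useMarkdown = v["useMarkdown"];
  return (
    typeof v["content"] === "string" &&
    typeof analysisType === "string" &&
    (ANALYSIS_TYPES as readonly string[]).indexOf(analysisType) !== -1 &&
    (useMarkdown === undefined || typeof useMarkdown === "boolean")
  );
}
```

The same pattern extends to `/api/enhance-query` and `/api/embeddings`, whose bodies only have one required string field each.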
|
185 |
+
|
186 |
+
### **Health Check**
|
187 |
+
```typescript
|
188 |
+
GET /api/health
|
189 |
+
// Returns comprehensive health status of all services
|
190 |
+
```
|
191 |
+
|
192 |
+
## π Performance & Reliability
|
193 |
+
|
194 |
+
### **Response Times**
|
195 |
+
- Local search: <100ms for semantic queries
|
196 |
+
- Document analysis: ~3-5 seconds depending on content length
|
197 |
+
- URL validation: <2 seconds per URL with concurrent processing
|
198 |
+
- Embedding generation: ~500ms-1s per request
|
199 |
+
|
200 |
+
### **Scalability Features**
|
201 |
+
- Rate limiting prevents API abuse
|
202 |
+
- Concurrent URL validation with configurable limits
|
203 |
+
- Efficient caching for repeated queries
|
204 |
+
- Graceful degradation when external services are unavailable
|
205 |
+
|
206 |
+
### **Error Handling**
|
207 |
+
- React Error Boundaries prevent UI crashes
|
208 |
+
- Comprehensive API error responses
|
209 |
+
- Automatic retry logic for network requests
|
210 |
+
- User-friendly error messages
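
The "automatic retry logic" item can be pictured as retry with exponential backoff. The sketch below is an assumption about how such logic might look, not the project's code: `withRetry`, `backoffDelayMs`, and the 200 ms base and 5000 ms cap are all illustrative choices.

```typescript
// Exponential backoff: baseMs, 2 * baseMs, 4 * baseMs, ... capped at maxMs.
// The default values are assumptions chosen for the example.
function backoffDelayMs(attempt: number, baseMs: number = 200, maxMs: number = 5000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// The sleep function is injected so the helper stays easy to test.
type Sleep = (ms: number) => Promise<void>;

// Runs `task`, retrying up to `retries` more times after a failure and
// waiting backoffDelayMs(attempt) between attempts.
async function withRetry<T>(
  task: () => Promise<T>,
  retries: number,
  sleep: Sleep
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      if (attempt < retries) await sleep(backoffDelayMs(attempt));
    }
  }
  throw lastError;
}
```

In an application, `sleep` would usually wrap `setTimeout` in a promise; passing a no-op makes unit tests run instantly.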
|
211 |
+
|
212 |
+
## π Security Features
|
213 |
+
|
214 |
+
### **Input Protection**
|
215 |
+
- Request body size limits (10MB)
|
216 |
+
- Comprehensive input sanitization
|
217 |
+
- SQL injection prevention
|
218 |
+
- XSS protection with CSP headers
|
219 |
+
|
220 |
+
### **API Security**
|
221 |
+
- Rate limiting on all endpoints
|
222 |
+
- Secure environment variable handling
|
223 |
+
- No hardcoded credentials
|
224 |
+
- Proper error logging without information disclosure
|
225 |
+
|
226 |
+
### **Infrastructure Security**
|
227 |
+
- Helmet.js security headers
|
228 |
+
- CORS configuration
|
229 |
+
- Secure cookie handling
|
230 |
+
- Production-ready error handling
|
231 |
+
|
232 |
+
## 🛠️ Development
|
233 |
+
|
234 |
+
### **Code Quality**
|
235 |
+
- 100% TypeScript coverage
|
236 |
+
- ESLint + Prettier configuration
|
237 |
+
- Comprehensive error handling
|
238 |
+
- Type-safe API contracts with Zod validation
|
239 |
+
|
240 |
+
### **Testing**
|
241 |
+
```bash
|
242 |
+
# Type checking
|
243 |
+
npm run check
|
244 |
+
|
245 |
+
# Development server
|
246 |
+
npm run dev
|
247 |
+
|
248 |
+
# Production build
|
249 |
+
npm run build
|
250 |
+
```
|
251 |
+
|
252 |
+
## π Recent Updates
|
253 |
+
|
254 |
+
- ✅ **Security Hardening**: Removed all hardcoded credentials, added comprehensive security middleware
|
255 |
+
- ✅ **TypeScript Migration**: Achieved 100% type safety across the entire codebase
|
256 |
+
- ✅ **URL Validation**: Intelligent filtering of broken and invalid links
|
257 |
+
- ✅ **Error Handling**: React Error Boundaries and improved server error handling
|
258 |
+
- ✅ **AI Enhancement**: Nebius AI integration with configurable document analysis
|
259 |
+
- ✅ **Performance**: Rate limiting, input validation, and optimized processing
|
260 |
+
|
261 |
+
## π Architecture Highlights
|
262 |
+
|
263 |
+
### **AI Integration**
|
264 |
+
- **Nebius AI**: Primary AI service for all language model tasks
|
265 |
+
- **DeepSeek Models**: State-of-the-art reasoning capabilities
|
266 |
+
- **Modal Integration**: Distributed processing for heavy workloads
|
267 |
+
- **Embedding Search**: Semantic similarity matching
|
268 |
+
|
269 |
+
### **Data Flow**
|
270 |
+
1. User query → AI query enhancement (optional)
|
271 |
+
2. Parallel search: local storage + external sources
|
272 |
+
3. URL validation and content verification
|
273 |
+
4. Result ranking and relevance scoring
|
274 |
+
5. AI-powered analysis and synthesis
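
Steps 2 to 4 of the flow above can be sketched as a parallel fan-out followed by filtering and ranking. This is a simplified illustration under stated assumptions: the `Result` shape, `searchAll`, and `rankResults` are hypothetical names, and the URL check here is only a syntactic stand-in for the accessibility validation the README describes.

```typescript
// Hypothetical result shape; the README does not specify one.
interface Result {
  url: string;
  title: string;
  score: number;
  source: string;
}

type SearchSource = (query: string) => Promise<Result[]>;

// Step 2: query every source in parallel. A source that fails contributes
// no results instead of failing the whole search.
async function searchAll(query: string, sources: SearchSource[]): Promise<Result[]> {
  const perSource = await Promise.all(
    sources.map((search) => search(query).catch(() => [] as Result[]))
  );
  return ([] as Result[]).concat(...perSource);
}

// Steps 3-4 (simplified): keep only http(s) URLs, keep the best-scoring
// entry per URL, then sort by descending score.
function rankResults(results: Result[]): Result[] {
  const best = new Map<string, Result>();
  for (const r of results) {
    if (!/^https?:\/\/\S+$/.test(r.url)) continue;
    const seen = best.get(r.url);
    if (seen === undefined || r.score > seen.score) best.set(r.url, r);
  }
  return Array.from(best.values()).sort((a, b) => b.score - a.score);
}
```

Swallowing per-source errors is what gives the graceful degradation described elsewhere in the README: one unavailable source reduces coverage but does not block the response.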
|
275 |
+
|
276 |
+
### **Component Architecture**
|
277 |
+
- **Enhanced Search Interface**: Unified search and AI tools
|
278 |
+
- **Knowledge Graph**: Interactive data visualization
|
279 |
+
- **Result Cards**: Rich content display with citations
|
280 |
+
- **Error Boundaries**: Resilient error handling
|
281 |
+
|
282 |
+
## π Track 3: Agentic Demo Showcase Features
|
283 |
+
|
284 |
+
### **🤖 "Show us the most incredible things that your agents can do!"**
|
285 |
+
|
286 |
+
KnowledgeBridge demonstrates sophisticated multi-agent systems in action:
|
287 |
+
|
288 |
+
### **π§ Autonomous Agent Workflows**
|
289 |
+
- **Smart Agent Coordination**: Multiple specialized agents work together to fulfill complex research tasks
|
290 |
+
- **Adaptive Agent Behavior**: Agents dynamically adjust strategies based on query complexity and source availability
|
291 |
+
- **Multi-Modal Agent Processing**: Different agent types (search, analysis, validation) collaborate seamlessly
|
292 |
+
- **Intelligent Agent Fallbacks**: Backup agents activate automatically when primary agents encounter issues
|
293 |
+
|
294 |
+
### **π Real-Time Agent Decision Making**
|
295 |
+
- **Query Analysis Agents**: Instantly determine optimal search strategies across 4+ sources
|
296 |
+
- **Load Balancing Agents**: Distribute workload intelligently based on API response times and rate limits
|
297 |
+
- **Quality Control Agents**: Evaluate and filter results in real time for relevance and authenticity
|
298 |
+
- **Synthesis Agents**: Combine disparate information sources into coherent, actionable insights
|
299 |
+
|
300 |
+
### **π Advanced Agent Orchestration**
|
301 |
+
- **Parallel Agent Execution**: Simultaneous deployment of search agents across GitHub, Wikipedia, ArXiv
|
302 |
+
- **Agent Communication Protocols**: Real-time coordination between agents for optimal resource utilization
|
303 |
+
- **Adaptive Agent Learning**: Agents improve performance based on user interactions and feedback
|
304 |
+
- **Error Recovery Agents**: Autonomous problem-solving when individual agents encounter failures
|
305 |
+
|
306 |
+
### **🛡️ Production-Grade Agent Infrastructure**
|
307 |
+
- **Security Agent Monitoring**: Continuous protection against abuse with intelligent rate limiting
|
308 |
+
- **Validation Agent Networks**: Multi-layer content verification and URL authenticity checking
|
309 |
+
- **Performance Agent Optimization**: Automatic scaling and resource management for enterprise workloads
|
310 |
+
- **Resilience Agent Systems**: Graceful degradation and fault tolerance across all agent operations
|
311 |
+
|
312 |
+
### **⚡ Agent Performance Metrics**
|
313 |
+
- **Sub-second Agent Response**: Query analysis and routing in <100ms
|
314 |
+
- **Concurrent Agent Processing**: 4+ agents working simultaneously on complex research tasks
|
315 |
+
- **Intelligent Agent Caching**: Smart result storage and retrieval for enhanced performance
|
316 |
+
- **Scalable Agent Architecture**: Horizontal scaling support for enterprise deployment
|
317 |
+
|
318 |
+
## π License
|
319 |
+
|
320 |
+
MIT License - see [LICENSE](LICENSE) file for details.
|
321 |
+
|
322 |
+
## π Related Resources
|
323 |
+
|
324 |
+
- [Nebius AI Documentation](https://docs.nebius.ai/)
|
325 |
+
- [Modal Documentation](https://modal.com/docs)
|
326 |
+
- [React Query Documentation](https://tanstack.com/query/latest)
|
327 |
+
- [Radix UI Components](https://www.radix-ui.com/)
|
328 |
+
|
329 |
+
---
|
330 |
+
|
331 |
+
## π Agents-MCP-Hackathon Submission Summary
|
332 |
+
|
333 |
+
**KnowledgeBridge** showcases the incredible power of AI agents through:
|
334 |
+
|
335 |
+
🤖 **Multi-Agent Orchestration** - Coordinated intelligence across search, analysis, and synthesis agents
|
336 |
+
π **Real-Time Decision Making** - Agents adapt strategies and optimize performance dynamically
|
337 |
+
π **Advanced Agent Workflows** - Complex multi-step processes handled autonomously
|
338 |
+
🛡️ **Production-Ready Agent Infrastructure** - Enterprise-grade security and resilience
|
339 |
+
|
340 |
+
**Track 3: Agentic Demo Showcase** - Demonstrating what happens when sophisticated AI agents work together to revolutionize knowledge discovery and research workflows.
|
341 |
+
|
342 |
+
**Built for the Hugging Face Agents-MCP-Hackathon** π
|
343 |
+
|
344 |
+
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
|