File size: 7,953 Bytes
e07df60
67d801e
 
f063c68
 
e07df60
 
 
 
 
 
 
 
 
67d801e
490f257
67d801e
490f257
67d801e
490f257
67d801e
490f257
67d801e
 
 
 
 
 
 
 
490f257
67d801e
490f257
67d801e
 
 
 
 
 
 
 
 
 
 
 
 
490f257
67d801e
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
490f257
67d801e
 
 
 
490f257
67d801e
 
 
 
d8c261b
67d801e
8a19631
 
 
 
d8c261b
67d801e
d8c261b
67d801e
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
d8c261b
67d801e
d8c261b
 
67d801e
d8c261b
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
67d801e
 
d8c261b
 
67d801e
490f257
67d801e
 
 
490f257
67d801e
490f257
67d801e
 
 
490f257
67d801e
490f257
67d801e
 
6189533
67d801e
490f257
 
 
67d801e
 
 
 
490f257
67d801e
 
 
 
 
490f257
67d801e
490f257
8985c60
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
---
title: GAIA Agent - Q&A Chatbot
emoji: ๐Ÿค–
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: 5.25.2
app_file: app.py
pinned: false
hf_oauth: true
# optional, default duration is 8 hours/480 minutes. Max duration is 30 days/43200 minutes.
hf_oauth_expiration_minutes: 480
---

# ๐Ÿค– **GAIA Agent - Advanced Q&A Chatbot**

## ๐ŸŒŸ **Introduction**

**GAIA Agent** is a sophisticated AI-powered chatbot system designed to handle complex questions and tasks through an intuitive Q&A interface. Built on top of the GAIA benchmark framework, this agent combines advanced reasoning, code execution, web search, document processing, and multimodal understanding capabilities. The system features both a user-friendly chatbot interface and a comprehensive evaluation runner for benchmark testing.

## ๐Ÿš€ **Key Features**

- **๐Ÿ” Multi-Modal Search**: Web search, Wikipedia, and arXiv paper search
- **๐Ÿ’ป Code Execution**: Support for Python, Bash, SQL, C, and Java
- **๐Ÿ–ผ๏ธ Image Processing**: Analysis, transformation, OCR, and generation
- **๐Ÿ“„ Document Processing**: PDF, CSV, Excel, and text file analysis
- **๐Ÿ“ File Upload Support**: Handle multiple file types with drag-and-drop
- **๐Ÿงฎ Mathematical Operations**: Complete set of mathematical tools
- **๐Ÿ’ฌ Conversational Interface**: Natural chat-based interaction
- **๐Ÿ“Š Evaluation System**: Automated benchmark testing and submission

## ๐Ÿ—๏ธ **Project Structure**

```
gaia-agent/
โ”œโ”€โ”€ app.py                    # Main Q&A chatbot interface
โ”œโ”€โ”€ evaluation_app.py         # GAIA benchmark evaluation runner
โ”œโ”€โ”€ agent.py                  # Core agent implementation with tools
โ”œโ”€โ”€ code_interpreter.py       # Multi-language code execution
โ”œโ”€โ”€ image_processing.py       # Image processing utilities
โ”œโ”€โ”€ system_prompt.txt         # System prompt for the agent
โ”œโ”€โ”€ requirements.txt          # Python dependencies
โ”œโ”€โ”€ metadata.jsonl           # GAIA benchmark metadata
โ”œโ”€โ”€ explore_metadata.ipynb   # Data exploration notebook
โ””โ”€โ”€ README.md               # This file
```

## ๐Ÿ› ๏ธ **Tool Categories**

### **๐ŸŒ Browser & Search Tools**
- **Wikipedia Search**: Search Wikipedia with up to 2 results
- **Web Search**: Tavily-powered web search with up to 3 results  
- **arXiv Search**: Academic paper search with up to 3 results

### **๐Ÿ’ป Code Interpreter Tools**
- **Multi-Language Execution**: Python, Bash, SQL, C, Java support
- **Plot Generation**: Matplotlib visualization support
- **DataFrame Analysis**: Pandas data processing
- **Error Handling**: Comprehensive error reporting

### **๐Ÿงฎ Mathematical Tools**
- **Basic Operations**: Add, subtract, multiply, divide
- **Advanced Functions**: Modulus, power, square root
- **Complex Numbers**: Support for complex number operations

### **๐Ÿ“„ Document Processing Tools**
- **File Operations**: Save, read, and download files
- **CSV Analysis**: Pandas-based data analysis
- **Excel Processing**: Excel file analysis and processing
- **OCR**: Extract text from images using Tesseract

### **๐Ÿ–ผ๏ธ Image Processing & Generation Tools**
- **Image Analysis**: Size, color, and property analysis
- **Transformations**: Resize, rotate, crop, flip, adjust brightness/contrast
- **Drawing Tools**: Add shapes, text, and annotations
- **Image Generation**: Create gradients, noise patterns, and simple graphics
- **Image Combination**: Stack and combine multiple images

## ๐ŸŽฏ **How to Use**

### **Q&A Chatbot Interface (app.py)**

1. **Start the Chatbot:**
   ```bash
   python app.py
   ```

2. **Access the Interface:**
   - Open `http://localhost:7860` in your browser
   - Upload files (images, documents, CSV, etc.) if needed
   - Ask questions in natural language
   - Get comprehensive answers with tool usage

3. **Supported Interactions:**
   - **Text Questions**: "What is the capital of France?"
   - **Math Problems**: "Calculate the square root of 144"
   - **Code Requests**: "Write a Python function to sort a list"
   - **Image Analysis**: Upload an image and ask "What do you see?"
   - **Data Analysis**: Upload a CSV and ask "What are the trends?"
   - **Web Search**: "What are the latest AI developments?"

### **Evaluation Runner (evaluation_app.py)**

1. **Run the Evaluation:**
   ```bash
   python evaluation_app.py
   ```

2. **Benchmark Testing:**
   - Log in with your Hugging Face account
   - Click "Run Evaluation & Submit All Answers"
   - Monitor progress as the agent processes GAIA benchmark questions
   - View results and scores automatically

## ๐Ÿ”ง **Technical Architecture**

### **LangGraph State Machine**
```
START โ†’ Retriever โ†’ Assistant โ†’ Tools โ†’ Assistant
                     โ†‘              โ†“
                     โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
```

1. **Retriever Node**: Searches vector database for similar questions
2. **Assistant Node**: LLM processes question with available tools
3. **Tools Node**: Executes selected tools (web search, code, etc.)
4. **Conditional Routing**: Dynamically routes between assistant and tools

### **Vector Database Integration**
- **Supabase Vector Store**: Stores GAIA benchmark Q&A pairs
- **Semantic Search**: Finds similar questions for context
- **HuggingFace Embeddings**: sentence-transformers/all-mpnet-base-v2

### **Multi-Modal File Support**
- **Images**: JPG, PNG, GIF, BMP, WebP
- **Documents**: PDF, DOC, DOCX, TXT, MD
- **Data**: CSV, Excel, JSON
- **Code**: Python, Bash, SQL, C, Java

## โš™๏ธ **Installation & Setup**

### **1. Clone Repository**
```bash
git clone https://github.com/fisherman611/gaia-agent.git
cd gaia-agent
```

### **2. Install Dependencies**
```bash
pip install -r requirements.txt
```

### **3. Environment Variables**
Create a `.env` file with your API keys:
```env
SUPABASE_URL=your_supabase_url
SUPABASE_SERVICE_ROLE_KEY=your_supabase_key
GROQ_API_KEY=your_groq_api_key
TAVILY_API_KEY=your_tavily_api_key
HUGGINGFACEHUB_API_TOKEN=your_hf_token
LANGSMITH_API_KEY=your_langsmith_key

LANGSMITH_TRACING=true
LANGSMITH_PROJECT=ai_agent_course
LANGSMITH_ENDPOINT=https://api.smith.langchain.com
```

### **4. Database Setup (Supabase)**
Execute this SQL in your Supabase database:
```sql
-- Enable pgvector extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Create match function for documents2 table
CREATE OR REPLACE FUNCTION public.match_documents_2(
  query_embedding vector(768)
)
RETURNS TABLE(
  id         bigint,
  content    text,
  metadata   jsonb,
  embedding  vector(768),
  similarity double precision
)
LANGUAGE sql STABLE
AS $$
  SELECT
    id,
    content,
    metadata,
    embedding,
    1 - (embedding <=> query_embedding) AS similarity
  FROM public.documents2
  ORDER BY embedding <=> query_embedding
  LIMIT 10;
$$;

-- Grant permissions
GRANT EXECUTE ON FUNCTION public.match_documents_2(vector) TO anon, authenticated;
```

## ๐Ÿš€ **Running the Application**

### **Chatbot Interface**
```bash
python app.py
```
Access at: `http://localhost:7860`

### **Evaluation Runner**
```bash
python evaluation_app.py
```
Access at: `http://localhost:7860`

### **Live Demo**
Try it online: [Hugging Face Space](https://huggingface.co/spaces/fisherman611/gaia-agent)

## ๐Ÿ”— **Resources**

- [GAIA Benchmark](https://huggingface.co/spaces/gaia-benchmark/leaderboard)
- [Hugging Face Agents Course](https://huggingface.co/agents-course)
- [LangGraph Documentation](https://langchain-ai.github.io/langgraph/)
- [Supabase Vector Store](https://supabase.com/docs/guides/ai/vector-columns)

## ๐Ÿค **Contributing**

Contributions are welcome! Areas for improvement:
- **New Tools**: Add specialized tools for specific domains
- **UI Enhancements**: Improve the chatbot interface
- **Performance**: Optimize response times and accuracy
- **Documentation**: Expand examples and use cases

## ๐Ÿ“„ **License**

This project is licensed under the [MIT License](https://mit-license.org/).