Commit 8088298 by spagestic · verified · 1 parent: 89b22f4

Create .github\copilot-instructions.md

Files changed (1)
  1. .github//copilot-instructions.md +39 -0
.github//copilot-instructions.md ADDED
@@ -0,0 +1,39 @@
+ <!-- Use this file to provide workspace-specific custom instructions to Copilot. For more details, visit https://code.visualstudio.com/docs/copilot/copilot-customization#_use-a-githubcopilotinstructionsmd-file -->
+
+ # Web Scraper Project Instructions
+
+ This is a Python Gradio application for web scraping that:
+
+ - Scrapes text content from websites
+ - Formats content as markdown
+ - Generates sitemaps from page links
+ - Provides MCP (Model Context Protocol) server functionality
+
+ ## Key Libraries
+
+ - gradio[mcp]: For the web interface and MCP server capabilities
+ - requests: For HTTP requests
+ - beautifulsoup4: For HTML parsing
+ - markdownify: For converting HTML to markdown
+ - urllib.parse: For URL handling
+
+ ## Project Structure
+
+ - `app.py`: Main web interface application
+ - `mcp_server.py`: MCP server that exposes tools for AI integration
+
+ ## MCP Tools
+
+ The MCP server exposes three main tools:
+
+ - `scrape_content`: Extract website content as markdown
+ - `generate_sitemap`: Create sitemap from page links
+ - `analyze_website`: Complete analysis with content and sitemap
+
+ ## Code Style
+
+ - Use type hints where appropriate
+ - Include proper error handling for web requests
+ - Follow PEP 8 style guidelines
+ - Add docstrings for functions with clear parameter descriptions
+ - MCP functions should have descriptive docstrings as they become tool descriptions
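
The Code Style rules added in this file could look roughly like the sketch below. It is not taken from `app.py` or `mcp_server.py`: the names `fetch_html`, `extract_links`, and `_LinkCollector` are hypothetical, and the standard library (`urllib.request`, `html.parser`) stands in for `requests` and `beautifulsoup4` so that the example runs without extra installs. Returning an `"Error: ..."` string rather than raising is also an assumption about how a tool might report failures.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple
from urllib.parse import urljoin, urlparse
from urllib.request import Request, urlopen


class _LinkCollector(HTMLParser):
    """Collect href values from anchor tags (stand-in for beautifulsoup4)."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(
        self, tag: str, attrs: List[Tuple[str, Optional[str]]]
    ) -> None:
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Fetch a page and return its HTML, or an error message on failure.

    Args:
        url: Absolute http(s) URL of the page to fetch.
        timeout: Seconds to wait before giving up on the request.

    Returns:
        The decoded HTML, or a string starting with "Error:".
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return f"Error: invalid URL {url!r}"
    try:
        request = Request(url, headers={"User-Agent": "web-scraper-sketch"})
        with urlopen(request, timeout=timeout) as response:
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset, errors="replace")
    except (OSError, ValueError) as exc:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return f"Error: could not fetch {url!r}: {exc}"


def extract_links(html: str, base_url: str) -> List[str]:
    """Return the unique absolute links found in an HTML document.

    Args:
        html: Raw HTML markup to scan for anchor tags.
        base_url: URL of the page, used to resolve relative links.

    Returns:
        Absolute http(s) URLs in first-seen order, without duplicates.
    """
    collector = _LinkCollector()
    collector.feed(html)
    links: List[str] = []
    for href in collector.hrefs:
        absolute = urljoin(base_url, href)
        is_web_link = urlparse(absolute).scheme in ("http", "https")
        if is_web_link and absolute not in links:
            links.append(absolute)
    return links
```

A tool such as `generate_sitemap` might combine the two, for example `extract_links(fetch_html(url), url)` after checking that the fetch did not return an error string. The `Args:` and `Returns:` sections are the kind of docstring text the last rule refers to when it says docstrings become tool descriptions.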