webclaw
MCP
Rust-based MCP server for local-first web content extraction, with support for scraping, crawling, and structured data extraction for LLMs.
Install
git clone https://github.com/0xMassi/webclaw
Stars
1,353
Last active
today
About
Turn websites into clean markdown, JSON, and LLM-ready context. CLI, MCP server, REST API, and SDKs for AI agents and RAG pipelines.
Most web scraping tools give your agent one of two bad outputs:
- a blocked page, login wall, or empty app shell
- raw HTML full of nav, scripts, styling, ads, and duplicated boilerplate
webclaw.io is the hosted web extraction API for webclaw. This repo contains the open-source CLI, MCP server, extraction engine, and self-hostable server.
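To make the second failure mode above concrete (raw HTML cluttered with navigation, scripts, and styling), here is a minimal Python sketch that keeps only the readable text of a page. It uses the standard library's `html.parser` and nothing else. The names `SKIP_TAGS`, `TextExtractor`, and `extract_text` are hypothetical and chosen for illustration; this is not webclaw's API, and its real extraction engine is likely far more involved.

```python
from html.parser import HTMLParser

# Tags whose contents are treated as boilerplate in this sketch.
# This set is an assumption for illustration, not webclaw's rule set.
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside", "noscript"}


class TextExtractor(HTMLParser):
    """Collects visible text while ignoring anything inside SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while inside at least one skipped tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def extract_text(html: str) -> str:
    """Return the non-boilerplate text of an HTML document, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


SAMPLE = """<html><head><style>body{color:red}</style></head>
<body><nav><a href="/">Home</a></nav>
<main><h1>Title</h1><p>Useful paragraph.</p></main>
<script>track()</script><footer>Copyright</footer></body></html>"""

if __name__ == "__main__":
    print(extract_text(SAMPLE))
```

Run on the sample page, the sketch drops the stylesheet, navigation link, tracking script, and footer, leaving only the heading and the paragraph. A tag blocklist like this does nothing for the first failure mode (blocked pages, login walls, empty app shells), which has to be solved at fetch time rather than at parse time.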
Trust factors
- Source: community
- Known advisories: 0
- Maintenance: active
- License: AGPL-3.0
- Age: 3 months
web-scraping developer-tools #rust #scraping #crawling #data-extraction #html-to-markdown #self-hosted #ai #ai-agents #ai-scraping #cli #crawler #firecrawl-alternative #llm #markdown #mcp #rag #tls-fingerprinting #web-crawler #web-extraction #web-scraper