Web Scraper

Web Scraper MCP Connector for Claude

Equip your AI agent with the ability to read web pages, extract metadata, and crawl documentation sites as clean Markdown.

5 tools | Official Vinkius Partner | Updated Jun 28, 2026

Connect the Web Scraper utility to any AI agent to give it direct access to the public internet. Instead of letting the AI hallucinate facts, allow it to read real-time articles, parse documentation, and fetch clean text from any URL you provide.

What you can do

  • Reader View — Convert any cluttered webpage into pristine, readable Markdown by stripping out ads, navbars, and boilerplate using Mozilla Readability logic
  • Site Crawling — Instruct the AI to crawl from a starting URL (like a documentation hub or wiki), following links automatically until it has gathered up to 10 pages
  • Batch Processing — Fetch up to 10 different URLs in parallel to compare articles or summarize multiple sources at once
  • Metadata Extraction — Quickly pull SEO titles, descriptions, OG tags, canonical links, and all outbound hyperlinks without downloading the entire page body

How it works

  1. Subscribe to this server
  2. No API keys or authentication required
  3. Simply paste a link in your chat and tell your agent to 'read this URL' or 'crawl this documentation'
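
When you ask the agent to 'read this URL', an MCP client typically turns that into a `tools/call` request over JSON-RPC 2.0. The sketch below shows the rough shape of such a message. The helper `build_tool_call` is hypothetical, and the argument name `url` is an assumption; the connector's published input schema is the authority on parameter names.

```python
import json
from itertools import count

_ids = count(1)  # monotonically increasing JSON-RPC request ids


def build_tool_call(tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(request)


# Assumption: the `read` tool takes a single "url" argument.
message = build_tool_call("read", {"url": "https://example.com/docs"})
print(message)
```

In normal use you never write this message yourself; the agent builds it from your chat request.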

Who is this for?

  • Developers — point the agent to a new library's API docs and have it write code using the absolute latest syntax
  • Researchers — drop a handful of Wikipedia links and ask the AI to synthesize a comprehensive summary from the fetched data
  • SEO Specialists — audit a webpage's metadata, extracted titles, and outbound link structure dynamically
Tags: web-crawling, markdown-conversion, data-extraction, reader-view, content-parsing, url-fetching

5 tools expose this connector's capabilities to your AI agent.

read

Fetch any public web page and return its full content as clean Markdown. Uses @mozilla/readability (the library behind Firefox Reader View) to extract the main article content, then converts it to Markdown. Works best for articles, docs, blogs, and Wikipedia.

extract

Extract structured metadata from a web page. Returns: title, description, OG tags, lang, author, robots, canonical, and link count. For the full page content, use the read tool instead.
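
To make the returned fields concrete, here is a minimal sketch of how such metadata could be pulled out of raw HTML using only Python's standard library. It is not the connector's implementation; the class `MetadataParser`, the function `extract_metadata`, and the output key names are all illustrative assumptions.

```python
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Collects head metadata and counts anchors that carry an href."""

    def __init__(self):
        super().__init__()
        self.meta = {
            "title": None, "description": None, "author": None,
            "robots": None, "lang": None, "canonical": None,
            "og": {}, "link_count": 0,
        }
        self._in_title = False
        self._title_parts = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "html" and a.get("lang"):
            self.meta["lang"] = a["lang"]
        elif tag == "title":
            self._in_title = True
        elif tag == "meta":
            name = (a.get("name") or "").lower()
            prop = (a.get("property") or "").lower()
            if name in ("description", "author", "robots"):
                self.meta[name] = a.get("content")
            elif prop.startswith("og:"):
                self.meta["og"][prop] = a.get("content")
        elif tag == "link" and "canonical" in (a.get("rel") or "").lower().split():
            self.meta["canonical"] = a.get("href")
        elif tag == "a" and a.get("href"):
            self.meta["link_count"] += 1

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self.meta["title"] = "".join(self._title_parts).strip()

    def handle_data(self, data):
        if self._in_title:
            self._title_parts.append(data)


def extract_metadata(html: str) -> dict:
    parser = MetadataParser()
    parser.feed(html)
    return parser.meta
```

A sketch like this only parses markup it has already been given; how the connector fetches the page is not shown here.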

list_links

Extract all hyperlinks from a web page. Internal links are those that share the same hostname as the source page.
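
The same-hostname rule can be illustrated with a short standard-library sketch. The function `classify_links` and its return shape are hypothetical, and skipping non-HTTP schemes such as `mailto:` is an added assumption rather than documented behavior.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class _HrefCollector(HTMLParser):
    """Gathers the href of every <a> tag in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def classify_links(html: str, page_url: str) -> dict:
    """Split links into internal (same hostname as page_url) and external."""
    collector = _HrefCollector()
    collector.feed(html)
    source_host = urlparse(page_url).hostname
    internal, external = [], []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            continue  # assumption: ignore mailto:, javascript:, etc.
        if parsed.hostname == source_host:
            internal.append(absolute)
        else:
            external.append(absolute)
    return {"internal": internal, "external": external}
```

Under a strict hostname comparison like this one, a subdomain such as `docs.example.com` counts as external to `example.com`.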

batch_read

Fetch multiple web pages in parallel. Maximum 10 URLs per batch.
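
One way to picture parallel fetching with a batch cap is a thread pool that rejects oversized batches up front. This is a sketch under stated assumptions, not the connector's code: `batch_fetch`, `fetch_text`, and the result dictionary shape are hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

MAX_BATCH = 10  # mirrors the documented per-batch limit


def fetch_text(url: str, timeout: float = 10.0) -> str:
    """Download a URL and decode the body, defaulting to UTF-8."""
    with urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def batch_fetch(urls, fetch=fetch_text):
    """Fetch all URLs concurrently; results keep the input order."""
    if len(urls) > MAX_BATCH:
        raise ValueError(f"at most {MAX_BATCH} URLs per batch, got {len(urls)}")
    if not urls:
        return []

    def safe(url):
        # One failing URL should not sink the rest of the batch.
        try:
            return {"url": url, "ok": True, "content": fetch(url)}
        except Exception as exc:
            return {"url": url, "ok": False, "error": str(exc)}

    with ThreadPoolExecutor(max_workers=len(urls)) as pool:
        return list(pool.map(safe, urls))
```

Passing the fetcher in as a parameter keeps the sketch testable without network access, and capturing errors per URL means a single dead link still leaves the other results usable.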

crawl

Crawl a website starting from a URL. Returns a maximum of 10 pages to keep the response size manageable.
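
A page-capped crawl is commonly written as a breadth-first traversal that stops once the page budget is spent. The sketch below assumes that approach; the actual traversal order, same-host restriction, and fragment handling used by the connector are not documented here, and `crawl`, `fetch`, and `extract_links` are hypothetical names.

```python
from collections import deque
from urllib.parse import urldefrag, urlparse

MAX_PAGES = 10  # mirrors the documented page limit


def crawl(start_url, fetch, extract_links, max_pages=MAX_PAGES):
    """Breadth-first crawl that stays on the starting host.

    fetch(url) -> html and extract_links(html, url) -> absolute URLs
    are caller-supplied callables (assumptions of this sketch).
    """
    host = urlparse(start_url).hostname
    start = urldefrag(start_url).url  # drop any #fragment
    queue = deque([start])
    seen = {start}
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to load
        pages[url] = html
        for link in extract_links(html, url):
            link = urldefrag(link).url
            if link not in seen and urlparse(link).hostname == host:
                seen.add(link)
                queue.append(link)
    return pages
```

Tracking visited URLs in `seen` prevents loops between pages that link to each other, and the cap is checked before each fetch so the crawl never downloads more than `max_pages` pages.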

See how to talk to your AI agent using Web Scraper.

Read https://en.wikipedia.org/wiki/Artificial_intelligence and summarize its history.

I've fetched the Wikipedia page. The history of AI spans back to antiquity with myths of artificial beings, but the formal field was founded in 1956 at Dartmouth College. It experienced cycles of immense optimism followed by disappointment ('AI winters'), eventually leading to the modern deep learning revolution fueled by huge datasets and compute power.

Extract the links from https://news.ycombinator.com/

I've extracted the outbound links. The site currently links out to 30 primary article sources, including domains like github.com, wired.com, and nytimes.com, along with many internal navigational links to user profiles and comment threads.

Compare these two links: url1.com and url2.com

Using the batch reading tool, I've loaded both URLs simultaneously. URL 1 discusses a 'React-first' architecture and uses component styling. URL 2 advocates for 'HTML-first', server-rendered patterns. While both aim to increase web performance, they take fundamentally opposite approaches to client-side hydration.

Yes! You can use the `crawl` tool. For example: 'Crawl the getting started guide at https://example.com/docs'. The agent will fetch the starting page and automatically follow internal links to gather up to 10 pages of context.
