---
asIndexPage: true
---

# Firecrawl

import ToolInfo from "@/app/_components/tool-info";
import Badges from "@/app/_components/badges";
import TabbedCodeBlock from "@/app/_components/tabbed-code-block";
import TableOfContents from "@/app/_components/table-of-contents";
import ToolFooter from "@/app/_components/tool-footer";
import { Callout } from "nextra/components";

The Arcade Firecrawl MCP Server provides a pre-built set of tools for interacting with websites. These tools make it easy to build agents and AI apps that can:

- Scrape web pages
- Crawl websites
- Map website structures
- Retrieve crawl status and data
- Cancel ongoing crawls

## Available Tools

These tools are currently available in the Arcade Firecrawl MCP Server.

If you need to perform an action that's not listed here, you can [get in touch with us](mailto:contact@arcade.dev) to request a new tool, or [create your own tools](/guides/create-tools/tool-basics/build-mcp-server).

## Firecrawl.ScrapeUrl
Scrape a URL and return data in specified formats.

**Auth:**

- **Environment Variables Required:**
  - `FIRECRAWL_API_KEY`: Your [Firecrawl](https://www.firecrawl.dev/) API key.

**Parameters**

- **`url`** _(string, required)_ The URL to scrape.
- **`formats`** _(enum ([Formats](/resources/integrations/development/firecrawl/reference#formats)), optional)_ The format of the scraped web page. Defaults to `Formats.MARKDOWN`.
- **`only_main_content`** _(bool, optional)_ Only return the main content of the page. Defaults to `True`.
- **`include_tags`** _(list, optional)_ List of tags to include in the output.
- **`exclude_tags`** _(list, optional)_ List of tags to exclude from the output.
- **`wait_for`** _(int, optional)_ Delay in milliseconds before fetching content. Defaults to `10`.
- **`timeout`** _(int, optional)_ Timeout in milliseconds for the request. Defaults to `30000`.

---

## Firecrawl.CrawlWebsite
Crawl a website and return crawl status and data.

**Auth:**

- **Environment Variables Required:**
  - `FIRECRAWL_API_KEY`: Your [Firecrawl](https://www.firecrawl.dev/) API key.

**Parameters**

- **`url`** _(string, required)_ The URL to crawl.
- **`exclude_paths`** _(list, optional)_ URL patterns to exclude from the crawl.
- **`include_paths`** _(list, optional)_ URL patterns to include in the crawl.
- **`max_depth`** _(int, optional)_ Maximum depth to crawl. Defaults to `2`.
- **`ignore_sitemap`** _(bool, optional)_ Ignore the website sitemap. Defaults to `True`.
- **`limit`** _(int, optional)_ Limit the number of pages to crawl. Defaults to `10`.
- **`allow_backward_links`** _(bool, optional)_ Enable navigation to previously linked pages. Defaults to `False`.
- **`allow_external_links`** _(bool, optional)_ Allow following links to external websites. Defaults to `False`.
- **`webhook`** _(string, optional)_ URL that receives a POST request when the crawl is started, updated, and completed.
- **`async_crawl`** _(bool, optional)_ Run the crawl asynchronously. Defaults to `True`.

---

## Firecrawl.GetCrawlStatus
Retrieve the status of a crawl job.

**Auth:**

- **Environment Variables Required:**
  - `FIRECRAWL_API_KEY`: Your [Firecrawl](https://www.firecrawl.dev/) API key.

**Parameters**

- **`crawl_id`** _(string, required)_ The ID of the crawl job.

---

## Firecrawl.GetCrawlData
Retrieve data from a completed crawl job.

**Auth:**

- **Environment Variables Required:**
  - `FIRECRAWL_API_KEY`: Your [Firecrawl](https://www.firecrawl.dev/) API key.

**Parameters**

- **`crawl_id`** _(string, required)_ The ID of the crawl job.

---

## Firecrawl.CancelCrawl
Cancel an ongoing crawl job.

**Auth:**

- **Environment Variables Required:**
  - `FIRECRAWL_API_KEY`: Your [Firecrawl](https://www.firecrawl.dev/) API key.

**Parameters**

- **`crawl_id`** _(string, required)_ The ID of the asynchronous crawl job to cancel.

---

## Firecrawl.MapWebsite
Generate a map of an entire website starting from a single URL.

**Auth:**

- **Environment Variables Required:**
  - `FIRECRAWL_API_KEY`: Your [Firecrawl](https://www.firecrawl.dev/) API key.

**Parameters**

- **`url`** _(string, required)_ The base URL to start crawling from.
- **`search`** _(string, optional)_ Search query to use for mapping.
- **`ignore_sitemap`** _(bool, optional)_ Ignore the website sitemap. Defaults to `True`.
- **`include_subdomains`** _(bool, optional)_ Include subdomains of the website. Defaults to `False`.
- **`limit`** _(int, optional)_ Maximum number of links to return. Defaults to `5000`.

## Auth

The Arcade Firecrawl MCP Server uses [Firecrawl](https://www.firecrawl.dev/) to scrape, crawl, and map websites.

**Global Environment Variables:**

- `FIRECRAWL_API_KEY`: Your [Firecrawl](https://www.firecrawl.dev/) API key.
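
As a rough illustration of the environment variable requirement and the defaults documented for `Firecrawl.CrawlWebsite`, the following Python sketch checks that `FIRECRAWL_API_KEY` is set and assembles a set of crawl inputs. The helper names `require_api_key` and `build_crawl_inputs` are hypothetical and are not part of Arcade or Firecrawl. The sketch also assumes the check runs in the same environment as the MCP server. How the resulting inputs are passed to the tool depends on your client and is not shown.

```python
import os

# Defaults copied from the Firecrawl.CrawlWebsite parameter list above.
CRAWL_DEFAULTS = {
    "max_depth": 2,
    "ignore_sitemap": True,
    "limit": 10,
    "allow_backward_links": False,
    "allow_external_links": False,
    "async_crawl": True,
}

# Documented parameters that have no default value.
CRAWL_OPTIONAL = {"exclude_paths", "include_paths", "webhook"}


def require_api_key(env=None):
    """Hypothetical helper: return FIRECRAWL_API_KEY or raise if it is unset."""
    env = os.environ if env is None else env
    key = env.get("FIRECRAWL_API_KEY", "")
    if not key:
        raise RuntimeError("FIRECRAWL_API_KEY is not set")
    return key


def build_crawl_inputs(url, **overrides):
    """Hypothetical helper: merge overrides onto the documented defaults."""
    unknown = set(overrides) - set(CRAWL_DEFAULTS) - CRAWL_OPTIONAL
    if unknown:
        raise ValueError(f"Unknown parameters: {sorted(unknown)}")
    return {"url": url, **CRAWL_DEFAULTS, **overrides}
```

For example, `build_crawl_inputs("https://example.com", limit=5)` keeps every documented default except `limit`, and rejects parameter names that do not appear in the list for `Firecrawl.CrawlWebsite`.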