Web Toolset Documentation

Generated by AI

This toolset provides tools for making HTTP GET requests to retrieve web content.

http_get_web_page

Purpose: Sends an HTTP GET request to the given URL and, if the response is text, returns its content type and content. For HTML, JSON, and XML, it strips line indents and coalesces line breaks. For HTML, it also removes the head, script, and style elements and all class attributes, unless preserve_source is set to true. The response content must be less than 256K (262144 bytes).
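To make "strips line indents and coalesces line breaks" concrete, here is a minimal Python sketch of one plausible reading of that behavior. The function name condense is hypothetical, and the tool's actual rules (for example, how it treats whitespace-only lines) are not documented here, so treat this as an illustration rather than the implementation.

```python
import re

def condense(text: str) -> str:
    """Drop leading indentation on each line and collapse runs of line breaks."""
    # Remove leading spaces and tabs at the start of every line.
    unindented = re.sub(r"^[ \t]+", "", text, flags=re.MULTILINE)
    # Collapse any run of line breaks (LF or CRLF) into a single newline.
    collapsed = re.sub(r"(?:\r?\n)+", "\n", unindented)
    return collapsed.strip()

sample = "<ul>\n    <li>one</li>\n\n\n    <li>two</li>\n</ul>\n"
print(condense(sample))
```

Under this reading, the indented list items move to the left margin and the blank lines between them disappear, while the order and content of the lines are unchanged.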

Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| url | String | Yes | The URL to GET the web page from. |
| user_agent | String | No | User agent header to send. Default: Firefox for macOS. |
| accept | String | No | An acceptable content type (e.g., text/markdown) to ask the server for a specific format. Default: none. |
| preserve_source | Boolean | No | If true, returns the content from the server without any attempt to condense the text. Default: false. |
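The sketch below shows how these parameters could map onto an outgoing request, using Python's standard urllib.request. It is an assumption about the mapping, not the tool's code: build_request is a hypothetical helper, and DEFAULT_USER_AGENT is an illustrative Firefox-on-macOS string, since the exact default is not documented here. The request is only constructed, never sent.

```python
import urllib.request

# Illustrative value only; the tool's real default user agent string is not documented.
DEFAULT_USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:115.0) "
    "Gecko/20100101 Firefox/115.0"
)

def build_request(args: dict) -> urllib.request.Request:
    """Turn tool arguments into a GET request object (hypothetical helper)."""
    url = args.get("url")
    if not url:
        raise ValueError("The required URL was not specified")
    headers = {"User-Agent": args.get("user_agent") or DEFAULT_USER_AGENT}
    # Only send an Accept header when the caller asked for a specific format.
    if args.get("accept"):
        headers["Accept"] = args["accept"]
    # preserve_source is not a header: it would only affect post-processing.
    return urllib.request.Request(url, headers=headers, method="GET")

req = build_request({"url": "https://example.com", "accept": "text/markdown"})
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Note that urllib normalizes header names with str.capitalize(), so the user agent is read back with req.get_header("User-agent").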

Success Case Example

Input:

{
  "url": "https://example.com"
}

Output:

{
  "content_type": "text/html",
  "body": "<!DOCTYPE html>\n<html>\n<body>\n<h1>Example</h1>\n<p>This is an example page.</p>\n</body>\n</html>"
}

Error Case Examples

  1. Missing URL:

    {
      "error": "The required URL was not specified"
    }
  2. Non-Text Content:

    {
      "error": "HTTP request did not return text content; it's content type was: application/pdf"
    }
  3. HTTP Request Failed:

    {
      "error": "HTTP request failed with status: 404"
    }
  4. Content Too Large:

    {
      "error": "Received content > 262144. Response is too big."
    }
  5. Network Error:

    {
      "error": "An error occurred while making the HTTP request: Connection refused"
    }
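Since both success and failure come back as JSON objects, a caller has to check for the error key before reading the body. The following Python sketch shows one way to do that; WebToolError, unwrap, and failed_status are hypothetical names, and the status parsing relies only on the message format shown in the examples above.

```python
import re
from typing import Optional, Tuple

class WebToolError(Exception):
    """Raised when a tool result carries an "error" key."""

def unwrap(result: dict) -> Tuple[str, str]:
    """Return (content_type, body) on success, or raise WebToolError."""
    if "error" in result:
        raise WebToolError(result["error"])
    return result["content_type"], result["body"]

def failed_status(message: str) -> Optional[int]:
    """Extract the HTTP status from a "failed with status" message, if any."""
    match = re.search(r"failed with status: (\d{3})", message)
    return int(match.group(1)) if match else None

content_type, body = unwrap({"content_type": "text/html", "body": "<h1>Example</h1>"})
print(content_type, len(body))

try:
    unwrap({"error": "HTTP request failed with status: 404"})
except WebToolError as exc:
    print("status:", failed_status(str(exc)))
```

Matching on message text is brittle, so a caller that needs to branch on error kinds may prefer to treat any unrecognized message as a generic failure.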

Notes

  • Strips line indents and coalesces line breaks for HTML, JSON, and XML.
  • Removes head, script, and style elements and all class attributes from HTML content unless preserve_source is set to true.
  • Content size must be less than 256K (262144 bytes).
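The HTML cleanup described in these notes can be approximated with Python's standard html.parser module. This is a sketch under stated assumptions, not the tool's implementation: Cleaner and strip_html are hypothetical names, comments are dropped as a simplification, and character references are decoded and then re-escaped, so the output may not be byte-identical to the source outside the removed parts.

```python
from html import escape
from html.parser import HTMLParser

DROPPED = {"head", "script", "style"}

class Cleaner(HTMLParser):
    """Re-emits HTML without head/script/style elements or class attributes."""

    def __init__(self):
        super().__init__()  # convert_charrefs=True: text arrives decoded
        self.out = []
        self.skip_depth = 0  # greater than 0 while inside a dropped element

    def _open(self, tag, attrs, closer=">"):
        kept = "".join(
            f" {name}" if value is None else f' {name}="{escape(value, quote=True)}"'
            for name, value in attrs
            if name != "class"
        )
        self.out.append(f"<{tag}{kept}{closer}")

    def handle_starttag(self, tag, attrs):
        if tag in DROPPED:
            self.skip_depth += 1
        elif self.skip_depth == 0:
            self._open(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are emitted once, not as a pair.
        if tag not in DROPPED and self.skip_depth == 0:
            self._open(tag, attrs, closer=" />")

    def handle_endtag(self, tag):
        if tag in DROPPED:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.out.append(escape(data, quote=False))

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")

def strip_html(source: str, preserve_source: bool = False) -> str:
    """Return cleaned HTML, or the input unchanged when preserve_source is true."""
    if preserve_source:
        return source
    cleaner = Cleaner()
    cleaner.feed(source)
    cleaner.close()  # flush any buffered text
    return "".join(cleaner.out)

page = (
    "<!DOCTYPE html><html><head><title>T</title><style>p{}</style></head>"
    '<body><p class="x" id="a">Hi &amp; bye</p><script>var a = 1;</script></body></html>'
)
print(strip_html(page))
```

A depth counter is used rather than a boolean so that a script or style element nested inside head does not end the skipped region early.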

http_get_web_page_as_markdown

Purpose: Sends an HTTP GET request to the given website URL and returns the content as markdown, either because the website serves markdown directly or by converting its HTML to markdown. If the text content from the website is neither HTML nor markdown, the tool returns a markdown document with the content inside a code block.

Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| url | String | Yes | The URL to GET the web page from. |
| user_agent | String | No | User agent header to send. Default: Firefox for macOS. |

Success Case Example

Input:

{
  "url": "https://example.com"
}

Output:

{
  "markdown": "# Example\n\nThis is an example page.\n"
}

Error Case Examples

  1. Missing URL:

    {
      "error": "The required URL was not specified"
    }
  2. Non-Text Content:

    {
      "error": "HTTP response did not return text content; it's content type was: application/pdf"
    }
  3. HTTP Request Failed:

    {
      "error": "HTTP request failed with status: 404"
    }
  4. Network Error:

    {
      "error": "An error occurred while making the HTTP request: Connection refused"
    }

Notes

  • Returns content as markdown if the website supports it.
  • Converts HTML to markdown using an internal translator.
  • Wraps non-HTML/Markdown text content in a code block with appropriate syntax highlighting.
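The code block fallback from the last note might look like the Python sketch below. The names to_markdown and LANG_BY_TYPE are hypothetical, and the mapping from content type to highlighting label is an assumption, since the tool's actual mapping is not documented. HTML conversion is deliberately left out, because the internal translator is not described.

```python
# Hypothetical mapping from media type to a code fence language label.
LANG_BY_TYPE = {
    "application/json": "json",
    "application/xml": "xml",
    "text/xml": "xml",
    "text/css": "css",
    "text/javascript": "javascript",
}

FENCE = "`" * 3  # built this way to avoid a literal fence inside this example

def to_markdown(content_type: str, body: str) -> str:
    """Pass markdown through; wrap other non-HTML text in a fenced code block."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "text/markdown":
        return body
    if media_type == "text/html":
        raise NotImplementedError("HTML conversion is out of scope for this sketch")
    lang = LANG_BY_TYPE.get(media_type, "")
    # Lengthen the fence until it no longer appears in the body itself.
    fence = FENCE
    while fence in body:
        fence += "`"
    return f"{fence}{lang}\n{body.rstrip()}\n{fence}\n"

print(to_markdown("application/json; charset=utf-8", '{"a": 1}\n'))
```

Growing the fence when the body already contains backtick runs keeps the wrapped content from closing the block prematurely.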