The outputFormat parameter controls the format of the scraped content returned in the response.

Supported Formats

You can request one or more of the following:

  • html – raw page HTML (default DOM output)
  • text – plain text version of the page
  • markdown – converted Markdown output (good for LLMs and readability)
  • json – not a content format, but a wrapper to return all requested formats in a structured JSON response

Behavior

  • If you pass a single format like "html" or "text", the API returns just that content directly as a string.
  • If you pass multiple formats, the response will be a JSON object with each format as a separate key.
  • If you include json, the API wraps the response in a structured JSON object, even when only one content format is requested.

Use json to always get a structured response that’s easy to work with in code.
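
The rules above can be sketched as a small client-side check. This is only an illustration of the behavior described in this section, not part of the API: the helper name expects_json_response and the CONTENT_FORMATS constant are hypothetical.

```python
# Hypothetical helper: predicts the response shape from the outputFormat list,
# using only the rules described in this section.
CONTENT_FORMATS = {"html", "text", "markdown"}


def expects_json_response(output_format: list[str]) -> bool:
    """Return True when a JSON object is expected, False for a raw string."""
    # "json" forces a structured response, even with a single content format.
    if "json" in output_format:
        return True
    # Two or more distinct content formats are always returned as a JSON object.
    return len(set(output_format) & CONTENT_FORMATS) > 1


print(expects_json_response(["text"]))              # False
print(expects_json_response(["html", "markdown"]))  # True
print(expects_json_response(["json", "markdown"]))  # True
```

A check like this lets calling code decide up front whether to treat the body as plain text or to parse it as JSON.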

Example: One Format

{
  "outputFormat": ["text"]
}

Response:

Welcome to Example.com
This is a sample page...

Example: Multiple Formats

{
  "outputFormat": ["html", "markdown"]
}

Response:

{
  "requestMetadata": { /*...*/ },
  "content": "<html>...</html>",
  "markdown": "# Welcome to Example.com\nThis is a sample page...",
  "headers": { /*...*/ },
  "cookies": [],
  "screenshot": "https://..."
}

Forcing JSON Format with One Format Inside

If you want the response in JSON format but only need Markdown:

{
  "outputFormat": ["json", "markdown"]
}

Response:

{
  "markdown": "# Page title\n...",
  "requestMetadata": { /*...*/ },
  "headers": { /*...*/ },
  "cookies": []
}
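
Once the response is structured JSON, reading one format is a dictionary lookup. The following is a minimal sketch, assuming the response body is already available as a string. The helper read_format is hypothetical, and sample_body is a trimmed stand-in for the response above with illustrative values only.

```python
import json


def read_format(body: str, key: str) -> str:
    """Parse a structured JSON response and return the content stored under `key`."""
    data = json.loads(body)
    if key not in data:
        raise KeyError(f"{key!r} not found; available keys: {sorted(data)}")
    return data[key]


# Trimmed stand-in for a ["json", "markdown"] response (illustrative values only).
sample_body = '{"markdown": "# Page title\\n...", "headers": {}, "cookies": []}'
print(read_format(sample_body, "markdown"))
```

Raising on a missing key makes it obvious when a format was not requested, rather than silently returning an empty value.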

Notes

  • json is not a content format; it only controls the response structure
  • If you want to include markdown or text inside a JSON response, add json to the list
  • If you include multiple content formats, the response is always JSON, even without "json" explicitly listed
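
To make sure every request gets a structured response, the outputFormat list can be built by a small helper that always adds json. This is a sketch under stated assumptions: structured_output_format and SUPPORTED are hypothetical names, and placing json first simply mirrors the example above (this section does not say whether order matters).

```python
# Format names taken from the "Supported Formats" list in this section.
SUPPORTED = ("html", "text", "markdown", "json")


def structured_output_format(*formats: str) -> list[str]:
    """Build an outputFormat list that always requests a structured JSON response."""
    unknown = [fmt for fmt in formats if fmt not in SUPPORTED]
    if unknown:
        raise ValueError(f"unsupported format(s): {unknown}")
    # Put "json" first, then drop duplicates while keeping the caller's order.
    ordered = ["json"] + [fmt for fmt in formats if fmt != "json"]
    return list(dict.fromkeys(ordered))


payload = {"outputFormat": structured_output_format("markdown")}
print(payload)  # {'outputFormat': ['json', 'markdown']}
```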