The outputFormat parameter controls the format of the scraped content returned in the response.
You can request one or more of the following:
html – raw page HTML (default DOM output)
text – plain text version of the page
markdown – converted Markdown output (good for LLMs and readability)
json – not a content format, but a wrapper to return all requested formats in a structured JSON response
Behavior
- If you pass a single format like "html" or "text", the API returns just that content directly as a string.
- If you pass multiple formats, the response will be a JSON object with each format as a separate key.
- If you include json, it tells the API to wrap the response in a structured JSON object (even for a single format).
Use json to always get a structured response that’s easy to work with in code.
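The rules above can be sketched as a small client-side check that predicts whether a response will be a plain string or a JSON object. This is only an illustration of the documented behavior, not part of the API; the helper name expects_json and the CONTENT_FORMATS set are hypothetical.

```python
# Hypothetical helper: mirrors the documented outputFormat rules on the client side.
CONTENT_FORMATS = {"html", "text", "markdown"}


def expects_json(output_format):
    """Return True if the documented rules imply a JSON object response."""
    requested = list(output_format)
    # "json" forces a structured response, even for a single content format.
    if "json" in requested:
        return True
    # Without "json", only requests for multiple content formats yield JSON.
    content = {f for f in requested if f in CONTENT_FORMATS}
    return len(content) > 1


print(expects_json(["text"]))              # False
print(expects_json(["html", "markdown"]))  # True
print(expects_json(["json", "markdown"]))  # True
```

A check like this can help decide whether to parse the response body as JSON or treat it as raw text.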
If you request a single format:
{
"outputFormat": ["text"]
}
Response:
Welcome to Example.com
This is a sample page...
If you request multiple formats:
{
"outputFormat": ["html", "markdown"]
}
Response:
{
"requestMetadata": { /*...*/ },
"content": "<html>...</html>",
"markdown": "# Welcome to Example.com\nThis is a sample page...",
"headers": { /*...*/ },
"cookies": [],
"screenshot": "https://..."
}
If you want the response in JSON format but only need Markdown:
{
"outputFormat": ["json", "markdown"]
}
Response:
{
"markdown": "# Page title\n...",
"requestMetadata": { /*...*/ },
"headers": { /*...*/ },
"cookies": []
}
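As a rough sketch, the outputFormat fragment for a request like the one above could be built programmatically so that json is always included. The function name output_format_payload and its structured flag are hypothetical, and the rest of the request body (target URL, authentication, and so on) is intentionally left out because it is not described here.

```python
import json


def output_format_payload(content_formats, structured=True):
    """Build the outputFormat fragment, optionally forcing a JSON wrapper."""
    # Drop any "json" the caller passed so it is never listed twice.
    formats = [f for f in content_formats if f != "json"]
    if structured:
        # Listing "json" asks for a structured response even for one format.
        formats = ["json"] + formats
    return {"outputFormat": formats}


print(json.dumps(output_format_payload(["markdown"])))
# {"outputFormat": ["json", "markdown"]}
```

Always including json keeps the response shape predictable, at the cost of having to read the content out of a key instead of using the body directly.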
Notes
- json is not a content type; it controls the response structure
- If you want to include markdown or text inside a JSON response, add json to the list
- If you include multiple content formats, the response is always JSON, even without "json" explicitly listed
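Because the response can be either a bare string or a JSON object, client code may want to normalize both cases into one shape. The sketch below assumes the response body is already available as a string. The function name normalize_body is hypothetical, and wrapping a single-format body under its format name is an illustrative choice, not something the API does (the sample response above returns HTML under a "content" key).

```python
import json


def normalize_body(output_format, body):
    """Return a dict whether the API answered with a string or a JSON object."""
    content = {f for f in output_format if f != "json"}
    # Structured when "json" is listed or several content formats are requested.
    structured = "json" in output_format or len(content) > 1
    if structured:
        return json.loads(body)
    if len(content) != 1:
        raise ValueError("outputFormat must name at least one content format")
    # Single format without "json": the body is the raw content itself.
    (only_format,) = content
    return {only_format: body}


print(normalize_body(["text"], "Welcome to Example.com"))
# {'text': 'Welcome to Example.com'}
print(normalize_body(["json", "markdown"], '{"markdown": "# Page title"}'))
# {'markdown': '# Page title'}
```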