API Params
This page lists all parameters for the Web Scraping API.
Send a POST request to https://api.hasdata.com/scrape/web with a JSON body using the fields below.
Basic Configuration
The URL of the page to scrape. Must be a valid absolute URI (e.g. https://example.com).
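A minimal sketch of the request described above, using only the Python standard library. The endpoint comes from this page; the field name "url" and the "x-api-key" header are assumptions, since the parameter names are not shown here. The request is built but not sent.

```python
import json
import urllib.request

API_ENDPOINT = "https://api.hasdata.com/scrape/web"  # endpoint given on this page


def build_scrape_request(target_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request with a JSON body."""
    # Assumption: the page to scrape goes in a field named "url".
    body = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        API_ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "x-api-key": api_key,  # assumption: header name used for the API key
        },
    )


request = build_scrape_request("https://example.com", "YOUR_API_KEY")
# To actually send it: urllib.request.urlopen(request).read()
```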
Proxy Settings
Type of proxy to use. Options: datacenter, residential. Required if you’re targeting geo-restricted or bot-protected content.
ISO 3166-1 alpha-2 country code for proxy location (e.g. US, DE, IN).
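The proxy options can be checked client-side before sending. The sketch below assumes the hypothetical field names "proxyType" and "proxyCountry"; the regular expression only checks the two-letter uppercase shape of the code, not that the country is actually assigned in ISO 3166-1.

```python
import re

PROXY_TYPES = {"datacenter", "residential"}  # the two options listed above


def proxy_settings(proxy_type: str, country: str) -> dict:
    """Return proxy fields for the request body after basic validation."""
    if proxy_type not in PROXY_TYPES:
        raise ValueError(f"unknown proxy type: {proxy_type!r}")
    if not re.fullmatch(r"[A-Z]{2}", country):
        raise ValueError(f"expected a two-letter uppercase code, got {country!r}")
    # Assumption: "proxyType" and "proxyCountry" are hypothetical field names.
    return {"proxyType": proxy_type, "proxyCountry": country}


payload = {"url": "https://example.com", **proxy_settings("residential", "DE")}
```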
Data Extraction
CSS selectors for field-level extraction. Example: { "title": "h1", "link": "a @href" }.
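One way to read the rule syntax in that example is "selector, optionally followed by @attribute". The helper below is a hypothetical illustration of that reading, not the service's parser; it assumes the part after " @" names an attribute to read from the matched elements.

```python
from typing import Optional, Tuple


def split_rule(rule: str) -> Tuple[str, Optional[str]]:
    """Split 'selector @attr' into (selector, attr); attr is None when absent."""
    selector, sep, attr = rule.rpartition(" @")
    if not sep:
        # No " @" marker: the whole rule is a plain CSS selector.
        return rule.strip(), None
    return selector.strip(), attr.strip()


rules = {"title": "h1", "link": "a @href"}
parsed = {name: split_rule(rule) for name, rule in rules.items()}
```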
Structured AI rules for LLM-based extraction. Supports types: string, number, boolean, list, item.
Example:
To learn more, see LLM Extraction.
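As a rough sketch, AI rules could be validated against the five supported types before they are sent. The field name "aiExtractRules" and the rule shape (a "type" key plus an optional "description") are assumptions made for illustration; the LLM Extraction page is the authority on the real schema.

```python
AI_TYPES = {"string", "number", "boolean", "list", "item"}  # types listed above


def check_ai_rules(rules: dict) -> dict:
    """Raise ValueError if any rule declares a type outside the supported set."""
    for field, rule in rules.items():
        if rule.get("type") not in AI_TYPES:
            raise ValueError(f"{field}: unsupported type {rule.get('type')!r}")
    return rules


# Assumption: each rule is an object with "type" and an optional "description".
ai_rules = check_ai_rules({
    "title": {"type": "string", "description": "Product title"},
    "price": {"type": "number", "description": "Price as shown on the page"},
    "inStock": {"type": "boolean"},
})
payload = {"url": "https://example.com", "aiExtractRules": ai_rules}
```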
Capture a screenshot of the page.
Extract all email addresses found in the page content.
Extract all hyperlinks (<a href="...">) from the page.
Timing
Delay (in milliseconds) after page load before scraping. Max: 30000.
CSS selector to wait for before scraping begins.
Example: .product-listing
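The two timing options can be combined in one request body. This sketch enforces the documented 30000 ms ceiling locally; the field names "wait" and "waitFor" are assumptions.

```python
from typing import Optional

MAX_WAIT_MS = 30000  # maximum delay stated above


def timing_settings(delay_ms: int = 0, wait_for: Optional[str] = None) -> dict:
    """Return timing fields, rejecting delays outside 0..30000 ms."""
    if not 0 <= delay_ms <= MAX_WAIT_MS:
        raise ValueError(f"delay must be between 0 and {MAX_WAIT_MS} ms")
    # Assumption: "wait" and "waitFor" are hypothetical field names.
    settings = {"wait": delay_ms}
    if wait_for:
        settings["waitFor"] = wait_for
    return settings


timing = timing_settings(delay_ms=2000, wait_for=".product-listing")
```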
Resource Control
Block loading of images and stylesheets.
Block common ad scripts and tracking pixels.
Block any network requests containing these substrings or domains.
Example: ["googleanalytics", "doubleclick"]
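To preview which requests a pattern list would catch, plain substring matching can be simulated locally. This is only an illustration of the behaviour described above, assuming literal, case-sensitive matching; the service may match differently.

```python
from typing import List


def is_blocked(request_url: str, patterns: List[str]) -> bool:
    """Return True if any pattern occurs as a literal substring of the URL."""
    return any(pattern in request_url for pattern in patterns)


patterns = ["googleanalytics", "doubleclick"]
results = {
    url: is_blocked(url, patterns)
    for url in (
        "https://ad.doubleclick.net/pixel",
        "https://www.google-analytics.com/analytics.js",
        "https://example.com/app.js",
    )
}
```

Under this literal reading, "googleanalytics" does not match the hyphenated host google-analytics.com, so it is worth checking patterns against the exact URLs you expect to block.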
JavaScript Options
Enable JavaScript rendering (required for SPAs or dynamic content).
List of JavaScript actions to run on the page (click, scroll, wait, evaluate, etc.).
Example:
To learn more, see Page Interactions.
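A scenario might be expressed as an ordered list of single-action steps. Everything about the shape below is an assumption for illustration: the field names "jsRendering" and "jsScenario", the one-key-per-step layout, and the argument each action takes. The Page Interactions page defines the real format.

```python
ACTIONS = {"click", "scroll", "wait", "evaluate"}  # actions named above


def check_scenario(steps: list) -> list:
    """Require each step to be a dict with exactly one known action key."""
    for index, step in enumerate(steps):
        if len(step) != 1 or next(iter(step)) not in ACTIONS:
            raise ValueError(f"step {index}: expected one of {sorted(ACTIONS)}")
    return steps


payload = {
    "url": "https://example.com",
    "jsRendering": True,  # assumption: hypothetical field name
    "jsScenario": check_scenario([  # assumption: hypothetical field name and shape
        {"click": "#load-more"},         # CSS selector to click
        {"wait": 1000},                  # pause in milliseconds
        {"scroll": 2000},                # scroll distance in pixels
        {"evaluate": "document.title"},  # JavaScript expression to run
    ]),
}
```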
Advanced Settings
Custom headers to include in the request. Example: { "User-Agent": "custom-agent" }.
To learn more, see Custom Headers and Cookies.
Response format(s). Options: html, text, markdown, json. Multiple formats allowed.
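Custom headers and several response formats can be requested together. The sketch assumes the hypothetical field names "headers" and "outputFormat", and that multiple formats are passed as a list.

```python
import json
from typing import Dict, List, Optional

OUTPUT_FORMATS = ("html", "text", "markdown", "json")  # options listed above


def advanced_settings(formats: List[str], headers: Optional[Dict[str, str]] = None) -> dict:
    """Return output-format and header fields after checking the format names."""
    unknown = [fmt for fmt in formats if fmt not in OUTPUT_FORMATS]
    if unknown:
        raise ValueError(f"unsupported formats: {unknown}")
    # Assumption: "outputFormat" and "headers" are hypothetical field names.
    settings = {"outputFormat": list(formats)}
    if headers:
        settings["headers"] = dict(headers)
    return settings


body = json.dumps({
    "url": "https://example.com",
    **advanced_settings(["markdown", "json"], {"User-Agent": "custom-agent"}),
})
```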