This page lists all parameters for the Web Scraping API. Send a POST request to https://api.hasdata.com/scrape/web with a JSON body using the fields below.

Basic Configuration

url
string
required

The URL of the page to scrape. Must be a valid absolute URI (e.g. https://example.com).
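As a minimal sketch of the smallest valid call, the snippet below builds (but does not send) a POST request whose JSON body contains only url. The endpoint comes from this page; the x-api-key header name and the YOUR_API_KEY placeholder are assumptions for illustration, so check your account settings for the actual authentication scheme.

```python
import json
import urllib.request

API_URL = "https://api.hasdata.com/scrape/web"  # endpoint stated on this page


def build_scrape_request(target_url: str, api_key: str) -> urllib.request.Request:
    """Build (without sending) a POST request whose JSON body holds `url`."""
    body = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "x-api-key": api_key,  # assumed header name, not confirmed by this page
        },
    )


request = build_scrape_request("https://example.com", "YOUR_API_KEY")
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the body before spending API credits.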

Proxy Settings

proxyType
string

Type of proxy to use. Options: datacenter, residential. Set this when you’re targeting geo-restricted or bot-protected content.

proxyCountry
string

ISO 3166-1 alpha-2 country code for proxy location (e.g. US, DE, IN).
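The two proxy fields can be checked locally before a request is made. The helper below is a hypothetical sketch (with_proxy is not part of the API): it accepts only the two proxy types listed above and a two-letter country code. Upper-casing the code is a local convenience, not documented API behavior.

```python
PROXY_TYPES = {"datacenter", "residential"}  # options listed on this page


def with_proxy(payload: dict, proxy_type: str, country: str) -> dict:
    """Return a copy of `payload` with proxyType and proxyCountry added."""
    if proxy_type not in PROXY_TYPES:
        raise ValueError(f"proxyType must be one of {sorted(PROXY_TYPES)}")
    code = country.upper()  # local normalization only (assumption)
    # Shape check only: this does not verify the code is an assigned ISO country.
    if len(code) != 2 or not (code.isascii() and code.isalpha()):
        raise ValueError("proxyCountry must be a two-letter country code")
    return {**payload, "proxyType": proxy_type, "proxyCountry": code}


payload = with_proxy({"url": "https://example.com"}, "residential", "de")
```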

Data Extraction

extractRules
object

CSS selectors for field-level extraction. Example: { "title": "h1", "link": "a @href" }.

aiExtractRules
object

Structured AI rules for LLM-based extraction. Supports types: string, number, boolean, list, item.

Example:

{
    "company": { "description": "Company name", "type": "string" },
    "email": { "type": "string" },
    "founded": { "type": "number" },
    "isHiring": { "type": "boolean" }
}

To learn more, see LLM Extraction.
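The rule objects above can also be assembled in code. The sketch below uses a hypothetical helper, ai_rule, that rejects any type outside the five listed on this page. It covers only the flat rule shape shown in the example; how list and item rules nest is not described here, so that is left out.

```python
from typing import Optional

AI_RULE_TYPES = {"string", "number", "boolean", "list", "item"}  # from this page


def ai_rule(type_: str, description: Optional[str] = None) -> dict:
    """Build one flat aiExtractRules entry, validating its type locally."""
    if type_ not in AI_RULE_TYPES:
        raise ValueError(f"unsupported type: {type_!r}")
    rule = {"type": type_}
    if description is not None:
        rule["description"] = description
    return rule


payload = {
    "url": "https://example.com",
    "aiExtractRules": {
        "company": ai_rule("string", "Company name"),
        "email": ai_rule("string"),
        "founded": ai_rule("number"),
        "isHiring": ai_rule("boolean"),
    },
}
```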

screenshot
boolean

Capture a screenshot of the page.

extractEmails
boolean

Extract all email addresses found in the page content.

extractLinks
boolean

Extract all hyperlinks (<a href="...">) from the page.

Timing

wait
integer

Delay (in milliseconds) after page load before scraping. Max: 30000.

waitFor
string

CSS selector to wait for before scraping begins.

Example: .product-listing
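Both timing fields can be validated before sending. The sketch below uses a hypothetical helper, timing_options, that enforces the 30000 ms maximum stated above. The lower bound of 0 is an assumption; this page only documents the maximum.

```python
from typing import Optional

MAX_WAIT_MS = 30000  # maximum stated on this page


def timing_options(wait_ms: Optional[int] = None, wait_for: Optional[str] = None) -> dict:
    """Return the timing fields to merge into a request body."""
    options = {}
    if wait_ms is not None:
        # Lower bound of 0 is assumed; only the maximum is documented.
        if not 0 <= wait_ms <= MAX_WAIT_MS:
            raise ValueError(f"wait must be between 0 and {MAX_WAIT_MS} ms")
        options["wait"] = wait_ms
    if wait_for:
        options["waitFor"] = wait_for
    return options


payload = {"url": "https://example.com", **timing_options(2000, ".product-listing")}
```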

Resource Control

blockResources
boolean

Block loading of images and stylesheets.

blockAds
boolean

Block common ad scripts and tracking pixels.

blockUrls
string[]

Block any network requests containing these substrings or domains.

Example: ["googleanalytics", "doubleclick"]

JavaScript Options

jsRendering
boolean

Enable JavaScript rendering (required for SPAs or dynamic content).

jsScenario
array

List of JavaScript actions to run on the page (click, scroll, wait, evaluate, etc.).

Example:

[
    {
        "click": "#buttonId"
    },
    {
        "fill": [".text_input", "value"]
    }
]

To learn more, see Page Interactions.
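A scenario like the one above can be composed from small helpers. This sketch covers only the two action shapes shown in the example (click and fill); the helper names are hypothetical, and the shapes of scroll, wait, and evaluate are not given on this page, so they are omitted. Setting jsRendering alongside jsScenario is an assumption based on the scenario needing a rendered page.

```python
import json


def click(selector: str) -> dict:
    """Action that clicks the element matching `selector`."""
    return {"click": selector}


def fill(selector: str, value: str) -> dict:
    """Action that types `value` into the element matching `selector`."""
    return {"fill": [selector, value]}


payload = {
    "url": "https://example.com",
    "jsRendering": True,  # assumed to be needed for page interactions
    "jsScenario": [
        click("#buttonId"),
        fill(".text_input", "value"),
    ],
}

body = json.dumps(payload, indent=4)  # JSON text ready to use as a request body
```

Actions are kept in a list because the order of steps matters.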

Advanced Settings

headers
object

Custom headers to include in the request. Example: { "User-Agent": "custom-agent" }.

To learn more, see Custom Headers and Cookies.

outputFormat
string[]

Response format(s). Options: html, text, markdown, json. Multiple formats allowed.
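Since outputFormat takes an array, several formats can be requested in one call. The sketch below uses a hypothetical helper, output_formats, that checks names against the four options listed above and drops duplicates while keeping order. How the response is keyed per format is not described on this page, so the sketch stops at building the request body.

```python
OUTPUT_FORMATS = ("html", "text", "markdown", "json")  # options listed on this page


def output_formats(*formats: str) -> list:
    """Validate format names and return them de-duplicated, in order."""
    unknown = [name for name in formats if name not in OUTPUT_FORMATS]
    if unknown:
        raise ValueError(f"unsupported output format(s): {unknown}")
    return list(dict.fromkeys(formats))  # dict keys preserve insertion order


payload = {
    "url": "https://example.com",
    "headers": {"User-Agent": "custom-agent"},
    "outputFormat": output_formats("markdown", "json", "markdown"),
}
```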