Example Domain

> ## Documentation Index
> Fetch the complete documentation index at: https://docs.hasdata.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Quickstart - Web Scraping API

The **Web Scraping API** lets you scrape any public web page without handling proxies, headless browsers, or captchas. You send a URL, and we return the raw page content and optionally parsed data.

## Get Your API Key

Sign in at [hasdata.com](http://app.hasdata.com/sign-in), go to your account settings, and copy your API key.
All requests must include your key in the `x-api-key` header.

## Make Your First Request

<CodeGroup>
  ```bash cURL theme={null}
  curl --request POST \
    --url 'https://api.hasdata.com/scrape/web' \
    --header 'Content-Type: application/json' \
    --header 'x-api-key: <your-api-key>' \
    --data '{"url":"https://example.com","outputFormat":["html","text","markdown"],"screenshot":true}'
  ```

  ```bash HasData CLI theme={null}
  hasdata web-scraping \
    --url 'https://example.com' \
    --output-format 'html,text,markdown' \
    --screenshot
  ```

  ```javascript Node.js theme={null}
  const axios = require('axios').default;

  const options = {
    method: 'POST',
    url: 'https://api.hasdata.com/scrape/web',
    headers: {'Content-Type': 'application/json', 'x-api-key': '<your-api-key>'},
    data: {
      url: 'https://example.com',
      outputFormat: ['html', 'text', 'markdown'],
      screenshot: true
    }
  };

  try {
    const { data } = await axios.request(options);
    console.log(data);
  } catch (error) {
    console.error(error);
  }
  ```

  ```python Python theme={null}
  import requests

  url = "https://api.hasdata.com/scrape/web"

  payload = {
      "url": "https://example.com",
      "outputFormat": ["html", "text", "markdown"],
      "screenshot": True
  }
  headers = {
      "Content-Type": "application/json",
      "x-api-key": "<your-api-key>"
  }

  response = requests.post(url, json=payload, headers=headers)

  print(response.json())
  ```

  ```php PHP theme={null}
  <?php

  $payload = [
      "url" => "https://example.com",
      "outputFormat" => ["html", "text", "markdown"],
      "screenshot" => true,
  ];

  $curl = curl_init();

  curl_setopt_array($curl, [
    CURLOPT_URL => "https://api.hasdata.com/scrape/web",
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_CUSTOMREQUEST => "POST",
    CURLOPT_POSTFIELDS => json_encode($payload),
    CURLOPT_HTTPHEADER => [
      "Content-Type: application/json",
      "x-api-key: <your-api-key>",
    ],
  ]);

  $response = curl_exec($curl);
  curl_close($curl);

  echo $response;
  ```

  ```java Java theme={null}
  OkHttpClient client = new OkHttpClient();

  MediaType mediaType = MediaType.parse("application/json");
  String json = """
    {
      "url": "https://example.com",
      "outputFormat": [
        "html",
        "text",
        "markdown"
      ],
      "screenshot": true
    }
  """;
  RequestBody requestBody = RequestBody.create(json, mediaType);

  Request request = new Request.Builder()
    .url("https://api.hasdata.com/scrape/web")
    .post(requestBody)
    .addHeader("Content-Type", "application/json")
    .addHeader("x-api-key", "<your-api-key>")
    .build();

  Response response = client.newCall(request).execute();
  ```

  ```csharp C# theme={null}
  using System.Net.Http;
  using System.Text;

  var client = new HttpClient();

  var json = """
  {
    "url": "https://example.com",
    "outputFormat": [
      "html",
      "text",
      "markdown"
    ],
    "screenshot": true
  }
  """;
  var content = new StringContent(json, Encoding.UTF8, "application/json");

  var request = new HttpRequestMessage(new HttpMethod("POST"), "https://api.hasdata.com/scrape/web")
  {
      Content = content,
  };
  request.Headers.Add("x-api-key", "<your-api-key>");

  using var response = await client.SendAsync(request);
  response.EnsureSuccessStatusCode();
  var body = await response.Content.ReadAsStringAsync();
  Console.WriteLine(body);
  ```

  ```ruby Ruby theme={null}
  require 'net/http'
  require 'uri'
  require 'json'

  uri = URI("https://api.hasdata.com/scrape/web")
  payload = {
    "url" => "https://example.com",
    "outputFormat" => ["html", "text", "markdown"],
    "screenshot" => true,
  }

  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = true

  request = Net::HTTP::Post.new(uri)
  request["Content-Type"] = 'application/json'
  request["x-api-key"] = '<your-api-key>'
  request.body = payload.to_json

  response = http.request(request)
  puts response.read_body
  ```

  ```rust Rust theme={null}
  use reqwest::blocking::Client;
  use serde_json::json;

  fn main() -> Result<(), Box<dyn std::error::Error>> {
      let client = Client::new();
      let payload = json!({
          "url": "https://example.com",
          "outputFormat": [
              "html",
              "text",
              "markdown"
          ],
          "screenshot": true
      });
      let res = client
          .post("https://api.hasdata.com/scrape/web")
          .header("Content-Type", "application/json")
          .header("x-api-key", "<your-api-key>")
          .json(&payload)
          .send()?
          .text()?;
      println!("{}", res);
      Ok(())
  }
  ```

  ```go Go theme={null}
  package main

  import (
  	"bytes"
  	"fmt"
  	"io"
  	"net/http"
  )

  func main() {
  	payload := []byte(`{
    "url": "https://example.com",
    "outputFormat": [
      "html",
      "text",
      "markdown"
    ],
    "screenshot": true
  }`)

  	req, _ := http.NewRequest("POST", "https://api.hasdata.com/scrape/web", bytes.NewBuffer(payload))
  	req.Header.Add("Content-Type", "application/json")
  	req.Header.Add("x-api-key", "<your-api-key>")

  	res, _ := http.DefaultClient.Do(req)
  	defer res.Body.Close()

  	responseBody, _ := io.ReadAll(res.Body)
  	fmt.Println(string(responseBody))
  }
  ```
</CodeGroup>

<Accordion title="Response">
  ```json theme={null}
  {
      "requestMetadata": {
          "id": "e29e8506-143b-4872-a079-53b72e0edb10",
          "status": "ok"
      },
      "headers": {
          "accept-ranges":"bytes",
          "alt-svc":"h3=\":443\"; ma=93600,h3-29=\":443\"; ma=93600,quic=\":443\"; ma=93600; v=\"43\"",
          "cache-control":"max-age=2589",
          "content-encoding":"gzip",
          "content-length":"648",
          "content-type":"text/html",
          "date":"Fri, 11 Apr 2025 23:41:30 GMT",
          "etag":"\"84238dfc8092e5d9c0dac8ef93371a07:1736799080.121134\"",
          "last-modified":"Mon, 13 Jan 2025 20:11:20 GMT",
          "vary":"Accept-Encoding"
      },
      "screenshot": "https://f005.backblazeb2.com/file/hasdata-screenshots/e29e8506-143b-4872-a079-53b72e0edb10.jpeg",
      "content": "<!DOCTYPE html><html><head><title>Example Domain</title> ... </body></html>",
      "markdown": "# Example Domain\n\nThis domain is for use in ... [More information...](https://www.iana.org/domains/example)",
      "text": "Example Domain\n\nThis domain is for use in ... More information..."
  }
  ```
</Accordion>

<Warning>
  Maximum page load time is restricted to 300 seconds. In cases where a request fails after 300 seconds, you will not be charged for the unsuccessful request.
</Warning>

## Add Optional Parameters

You can include additional parameters to control what data is returned:

* `outputFormat` – Array of formats to return. Supports: html, text, markdown, json.
* `screenshot` – Set to `true` to include a screenshot of the rendered page.
* `extractLinks` – Set to `true` to extract all hyperlinks from the page.
* `extractEmails` – Set to `true` to extract all email addresses found on the page.

## Proxy Configuration

To scrape content with location-specific views or bypass stricter anti-bot systems, you can configure proxies:

* `proxyType` – Choose proxy type. Options: `residential`, `datacenter`.
* `proxyCountry` – Set a specific country for the proxy. Use [ISO 3166-1 alpha-2](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2) format (e.g. `US`, `DE`, `IN`).

That's it. You're ready to start scraping. For advanced usage and all parameters, see the [API Params](/apis/web-scraping-api/api-params) section.