You can check the status of a scraper job and fetch results manually using the job ID.
This is useful if you’re not using webhooks or need to monitor job progress in your system.
Check Job Status
To check whether a job is still running or finished, send a GET request with the job ID in place of :jobId:
curl --request GET \
  --url 'https://api.hasdata.com/scrapers/jobs/:jobId' \
  --header 'x-api-key: <your-api-key>'
Example Response
{
  "id": "dd1a8c53-2d47-4444-977d-8d653a6a3c82",
  "status": "in_progress",
  "creditsSpent": 200,
  "dataRowsCount": 20,
  "input": {
    /* job parameters */
  }
}
Once the job has finished, the response also includes a data object with download links for each export format:
{
  "id": "dd1a8c53-2d47-4444-977d-8d653a6a3c82",
  "status": "finished",
  "creditsSpent": 200,
  "dataRowsCount": 20,
  "data": {
    "csv": "https://api.hasdata.com/scrapers/jobs/dd1a8c53-2d47-4444-977d-8d653a6a3c82/results/b6cc6733-6d0e-4e44-9e94-38688aad3884.csv",
    "json": "https://api.hasdata.com/scrapers/jobs/dd1a8c53-2d47-4444-977d-8d653a6a3c82/results/9cb592e3-6700-42ff-b58c-e7da3f478f28.json",
    "xlsx": "https://api.hasdata.com/scrapers/jobs/dd1a8c53-2d47-4444-977d-8d653a6a3c82/results/ecea853c-e0ca-4a23-ae74-eea0588e54b6.xlsx"
  },
  "input": {
    "limit": 25,
    "urls": ["https://hasdata.com", "https://example.com"],
    "maxDepth": 5,
    "includePaths": "(blog/.+|articles/.+)",
    "webhook": {
      "url": "https://example.com/webhook",
      "events": ["scraper.job.started", "scraper.job.finished", "scraper.data.scraped"]
    }
  }
}
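As a rough illustration of reading the finished response, the helper below picks one export link out of the job object. The function name export_url and its fmt parameter are hypothetical, chosen for this sketch; only the status and data fields come from the example above. Whether the download URL itself needs the x-api-key header is not stated here, so the sketch stops at returning the link.

```python
def export_url(job, fmt="json"):
    """Return the download link for one export format, or None if unavailable.

    `job` is the parsed JSON body of the job status response.
    `fmt` is one of the keys seen under "data" in the example: csv, json, xlsx.
    """
    # Links only appear in the example once the status is "finished".
    if job.get("status") != "finished":
        return None
    return job.get("data", {}).get(fmt)
```

Returning None instead of raising keeps the helper safe to call on every status check, including while the job is still pending or in progress.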
Job Statuses
pending: Waiting to be processed
in_progress: Currently running
finished: Completed
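If you are not using webhooks, one way to monitor a job is to poll the status endpoint until it reports finished. The sketch below uses only the Python standard library. The names fetch_job and wait_for_job, the 5-second interval, and the max_checks cap are assumptions made for illustration, not part of the API. Because only the three statuses above are listed, the loop treats anything other than finished as "not done yet" and gives up after a fixed number of checks.

```python
import json
import time
import urllib.request

API_BASE = "https://api.hasdata.com/scrapers/jobs"


def fetch_job(job_id, api_key):
    """GET the job object, mirroring the curl status request."""
    request = urllib.request.Request(
        f"{API_BASE}/{job_id}",
        headers={"x-api-key": api_key},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_job(job_id, api_key, fetch=fetch_job,
                 interval=5.0, max_checks=120, sleep=time.sleep):
    """Poll until the job status is "finished", then return the job object.

    `fetch` and `sleep` are injectable so the loop can be exercised
    without network access or real delays.
    """
    for attempt in range(max_checks):
        job = fetch(job_id, api_key)
        if job["status"] == "finished":
            return job
        # Pause between checks, but not after the final one.
        if attempt < max_checks - 1:
            sleep(interval)
    raise TimeoutError(f"job {job_id} not finished after {max_checks} checks")
```

A call such as wait_for_job("<job-id>", "<your-api-key>") would block until the job completes or the cap is reached; tune the interval and cap to how long your jobs usually run.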
Fetch Results
Once the job status is finished, you can retrieve results:
curl --request GET \
  --url 'https://api.hasdata.com/scrapers/jobs/:jobId/results?page=1&limit=100' \
  --header 'x-api-key: <your-api-key>'
The maximum limit is 100 results per request. Use the page parameter to retrieve any remaining results.
Example Response
{
  "meta": {
    "total": 122,
    "perPage": 100,
    "currentPage": 1,
    "lastPage": 2,
    "firstPage": 1,
    "firstPageUrl": "/?page=1",
    "lastPageUrl": "/?page=2",
    "nextPageUrl": "/?page=2",
    "previousPageUrl": null
  },
  "data": [
    {
      "id": "8e705f7b-c542-403d-8acc-3e3d5c2f1271",
      "data": {
        "url": "https://example.com/page1",
        "statusCode": 200,
        "text": "Extracted content...",
        "title": "Example Page",
        "depth": 1
      },
      "createdAt": "2025-05-02T17:26:28.603+03:00",
      "updatedAt": "2025-05-02T17:26:28.603+03:00"
    },
    {
      "id": "01d1f7d2-43b4-4752-a114-b5e6601c5722",
      "data": {
        "url": "https://example.com/page2",
        "statusCode": 404,
        "error": "Page not found",
        "depth": 1
      },
      "createdAt": "2025-05-02T17:26:25.740+03:00",
      "updatedAt": "2025-05-02T17:26:25.740+03:00"
    }
  ]
}
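Since each request returns at most 100 rows, collecting a full result set means walking the pages. The following is a minimal sketch that relies on the currentPage and lastPage fields shown in meta above. The helper names fetch_results_page and iter_results are hypothetical, and the code assumes every page follows the same shape as the example response.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.hasdata.com/scrapers/jobs"


def fetch_results_page(job_id, api_key, page, limit=100):
    """GET one page of results, mirroring the curl results request."""
    query = urllib.parse.urlencode({"page": page, "limit": limit})
    request = urllib.request.Request(
        f"{API_BASE}/{job_id}/results?{query}",
        headers={"x-api-key": api_key},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_results(job_id, api_key, fetch_page=fetch_results_page, limit=100):
    """Yield every result row, requesting pages until meta.lastPage is reached.

    `fetch_page` is injectable so the paging logic can be tested offline.
    """
    page = 1
    while True:
        body = fetch_page(job_id, api_key, page, limit)
        yield from body["data"]
        meta = body["meta"]
        if meta["currentPage"] >= meta["lastPage"]:
            break
        page += 1
```

Rows for pages that failed to scrape, such as the 404 entry in the example, are yielded like any other row, so callers may want to check each row's data.statusCode before using its content.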