POST /v1/scrape
scrape endpoint
fetch a url. get parsed content back. handles captchas, proxies, rendering automatically.
production · idempotent · avg 204ms p50
POST https://api.scantir.dev/v1/scrape
takes a url, returns its rendered content. automatically rotates a proxy, solves any captcha challenge, and waits for the page to settle. response shape depends on the extract spec you provide.
TIP: use session_id to keep the same cookie jar across requests. it's the difference between scraping 10 pages and getting blocked on #11.
request body
| field | type | required | description |
|---|---|---|---|
| url | string | yes | target url. must be http(s). fragments are stripped. |
| render_js | boolean | no | run a real chromium and wait for the dom to settle. defaults to false. adds ~100ms. |
| session_id | string | no | arbitrary id. persists cookies, fingerprint, and proxy for ~30min of idle. |
| region | enum | no | us-east · us-west · eu-west · eu-central · ap-south · ap-northeast · auto |
| extract | object | no | map of name → selector. css, jsonpath, or $llm("..."). returns parsed values in place of raw html. |
| wait_for | string | no | css selector to await before capturing. overrides default domcontentloaded. |
| headers | object | no | extra headers to forward. user-agent is set automatically. |
| timeout_ms | number | no | hard cap. default 15000 · max 60000. |
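the field rules above can be checked client-side before a request goes out. a minimal sketch, assuming a hypothetical helper named build_scrape_body (not part of any scantir sdk). the region names and the timeout cap are copied from the table; the table does not say whether the server rejects or clamps out-of-range values, so this sketch simply raises.

```python
from urllib.parse import urldefrag, urlsplit

# values copied from the request body table
REGIONS = {"us-east", "us-west", "eu-west", "eu-central",
           "ap-south", "ap-northeast", "auto"}
MAX_TIMEOUT_MS = 60000


def build_scrape_body(url, **options):
    """hypothetical helper: validate fields locally and return a json-ready dict."""
    if urlsplit(url).scheme not in ("http", "https"):
        raise ValueError("url must be http(s)")
    # the api strips fragments anyway; doing it here keeps client logs consistent
    body = {"url": urldefrag(url).url}

    region = options.get("region")
    if region is not None and region not in REGIONS:
        raise ValueError(f"unknown region: {region}")

    timeout_ms = options.get("timeout_ms")
    if timeout_ms is not None and not 0 < timeout_ms <= MAX_TIMEOUT_MS:
        raise ValueError("timeout_ms must be between 1 and 60000")

    # omit unset fields so the server-side defaults apply
    body.update({k: v for k, v in options.items() if v is not None})
    return body
```

for example, build_scrape_body("https://shop.example/p/42#reviews", render_js=True) drops the fragment and leaves region and timeout_ms out of the body entirely.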
response
returns a ScrapeResult. content is a string when no extractor is set; otherwise data holds the parsed result.
| field | type | description |
|---|---|---|
| id | string | opaque request id. use in support tickets. |
| url | string | final url after redirects. |
| status | number | target's http status. 200 means the page loaded. |
| content | string | rendered html. present when no extract is specified. |
| data | object | parsed values, keyed by extractor name. |
| solved | array | list of challenges solved, e.g. ["cloudflare"]. |
| timing_ms | object | detailed stopwatch: dns, tls, ttfb, render, extract. |
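because content and data are alternatives, client code usually has to branch on which one is present. a small sketch, assuming the response has already been decoded into a dict; payload_of and slowest_phase are hypothetical names, and the "total" key is skipped only because it shows up in the sample response, not in the table.

```python
def payload_of(result):
    """return parsed values when extractors were used, else the rendered html."""
    if "data" in result:
        return result["data"]
    return result["content"]


def slowest_phase(result):
    """name of the phase with the largest timing, ignoring an aggregate 'total'."""
    phases = {k: v for k, v in result.get("timing_ms", {}).items() if k != "total"}
    return max(phases, key=phases.get) if phases else None
```

slowest_phase is handy when a request feels slow and you want to know whether the time went to the target (ttfb) or to rendering.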
errors
4xx for client errors (bad url, missing auth). 5xx only when scantir itself fails, never when the target did. a target failure still counts as a successful scrape: the target's 4xx/5xx code is reported in the status field of the response.
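the rule above means there are two status codes to look at: the one on the api response and the one inside the body. a minimal sketch of that split, assuming a successful scrape comes back with a 2xx api status; the function name and the four labels are made up for illustration.

```python
def classify(http_status, body):
    """sort one api response into ok, target_failed, client_error, or scantir_error."""
    if http_status >= 500:
        return "scantir_error"   # scantir itself failed
    if http_status >= 400:
        return "client_error"    # bad url, missing auth, and similar
    # the scrape succeeded; body["status"] is the target's own http status
    if body["status"] >= 400:
        return "target_failed"
    return "ok"
```

a caller might retry scantir_error, fix the request on client_error, and treat target_failed as data about the target rather than as a failed call.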
# basic render + extract
curl https://api.scantir.dev/v1/scrape \
  -H "authorization: bearer $KEY" \
  -H "content-type: application/json" \
  -d '{
    "url": "https://shop.example/p/42",
    "render_js": true,
    "session_id": "shopper-42",
    "extract": {
      "title": "h1.product-title",
      "price": ".price::text",
      "in_stock": "[data-stock=\"true\"]::exists"
    }
  }'
{
  "id": "req_01HF3X9Z2A",
  "url": "https://shop.example/p/42",
  "status": 200,
  "solved": ["cloudflare"],
  "data": {
    "title": "honeycomb jar, 8oz",
    "price": "$42.00",
    "in_stock": true
  },
  "timing_ms": {
    "dns": 8, "tls": 34,
    "ttfb": 92, "render": 48,
    "extract": 22, "total": 204
  }
}

{
  "error": {
    "code": "quota_exceeded",
    "message": "monthly request quota hit",
    "docs": "https://scantir.dev/docs/errors#402"
  }
}
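the curl call above translates to python's standard library roughly as follows. this is a sketch, not an official client: make_request and scrape are hypothetical names, and the 20 second client timeout is an assumption chosen to sit above the default timeout_ms of 15000.

```python
import json
import urllib.request

API_URL = "https://api.scantir.dev/v1/scrape"


def make_request(key, body):
    """build (but do not send) the POST request for a scrape body."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "authorization": f"bearer {key}",
            "content-type": "application/json",
        },
        method="POST",
    )


def scrape(key, body, timeout_s=20):
    """send the request and decode the json response body."""
    with urllib.request.urlopen(make_request(key, body), timeout=timeout_s) as resp:
        return json.load(resp)
```

urlopen raises urllib.error.HTTPError on a 4xx/5xx api status; the error json shown above can then be read from the exception with json.load(err).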