POST /v1/scrape

scrape endpoint

fetch a url. get parsed content back. handles captchas, proxies, rendering automatically.

production · idempotent · avg 204ms p50
POST https://api.scantir.dev/v1/scrape

takes a url, returns its rendered content. automatically rotates a proxy, solves any captcha challenge, and waits for the page to settle. response shape depends on the extract spec you provide.
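a minimal python sketch of the call described above, using only the standard library. the endpoint and the lowercase `authorization: bearer` header follow the curl example on this page; the helper names `build_request` and `scrape` and the 30 second client timeout are assumptions of this sketch, not part of the api.

```python
import json
import urllib.request

ENDPOINT = "https://api.scantir.dev/v1/scrape"


def build_request(body: dict, api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST to the scrape endpoint."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "authorization": f"bearer {api_key}",
            "content-type": "application/json",
        },
        method="POST",
    )


def scrape(body: dict, api_key: str, timeout_s: float = 30.0) -> dict:
    """Send the request and decode the JSON response body.

    urlopen raises urllib.error.HTTPError for 4xx/5xx answers,
    so callers should be ready to catch it.
    """
    with urllib.request.urlopen(build_request(body, api_key), timeout=timeout_s) as resp:
        return json.load(resp)
```

keeping `build_request` separate from `scrape` makes the request easy to inspect or log before anything goes over the wire.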

TIP: use session_id to keep the same cookie jar across requests. it's the difference between scraping 10 pages and getting blocked on #11.
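the tip can be sketched as follows: build one request body per page and give them all the same session_id. `session_payloads` is a hypothetical helper, the urls are placeholders, and generating a random id when none is supplied is a choice of this sketch.

```python
import json
import uuid


def session_payloads(urls, session_id=None):
    """Build one /v1/scrape request body per url, all sharing a session_id."""
    # session_id is an arbitrary string; a random one is generated here
    # when the caller does not pass one (an assumption of this sketch).
    sid = session_id or f"sess-{uuid.uuid4().hex[:12]}"
    return [{"url": u, "session_id": sid} for u in urls]


pages = [f"https://shop.example/p/{n}" for n in range(1, 4)]
for body in session_payloads(pages, session_id="shopper-42"):
    print(json.dumps(body))
```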

request body

field        type      description
url          string    required. target url. must be http(s). fragments are stripped.
render_js    boolean   run a real chromium and wait for the dom to settle. defaults to false. adds ~100ms.
session_id   string    arbitrary id. persists cookies, fingerprint, and proxy for ~30min of idle.
region       enum      us-east · us-west · eu-west · eu-central · ap-south · ap-northeast · auto
extract      object    map of name → selector. css, jsonpath, or $llm("..."). returns parsed values in place of raw html.
wait_for     string    css selector to await before capturing. overrides default domcontentloaded.
headers      object    extra headers to forward. user-agent is set automatically.
timeout_ms   number    hard cap. default 15000 · max 60000.
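a sketch of client-side validation for the fields above, so obviously bad requests fail before they are sent. the limits (http(s) only, the region list, the 60000 ms cap) come from the table; the function name `build_scrape_body` and the decision to omit unset fields are assumptions. the service may validate differently.

```python
from urllib.parse import urldefrag, urlsplit

REGIONS = {"us-east", "us-west", "eu-west", "eu-central",
           "ap-south", "ap-northeast", "auto"}
MAX_TIMEOUT_MS = 60000


def build_scrape_body(url, *, render_js=False, session_id=None, region=None,
                      extract=None, wait_for=None, headers=None, timeout_ms=None):
    """Validate arguments locally and return a JSON-serializable request body."""
    if urlsplit(url).scheme not in ("http", "https"):
        raise ValueError("url must be http(s)")
    # fragments are stripped by the service anyway; dropping them here
    # keeps local logs consistent with what is actually fetched
    body = {"url": urldefrag(url).url}
    if render_js:
        body["render_js"] = True
    if region is not None:
        if region not in REGIONS:
            raise ValueError(f"unknown region: {region}")
        body["region"] = region
    if timeout_ms is not None:
        if not 0 < timeout_ms <= MAX_TIMEOUT_MS:
            raise ValueError("timeout_ms must be in 1..60000")
        body["timeout_ms"] = timeout_ms
    # optional fields are only included when set, so server defaults apply
    for key, value in (("session_id", session_id), ("extract", extract),
                       ("wait_for", wait_for), ("headers", headers)):
        if value is not None:
            body[key] = value
    return body
```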

response

returns a ScrapeResult. content holds the rendered html as a string when no extract spec is given; otherwise data holds the parsed values.

field       type     description
id          string   opaque request id. use in support tickets.
url         string   final url after redirects.
status      number   target's http status. 200 means the page loaded.
content     string   rendered html. present when no extract is specified.
data        object   parsed values, keyed by extractor name.
solved      array    list of challenges solved, e.g. ["cloudflare"].
timing_ms   object   detailed stopwatch: dns, tls, ttfb, render, extract.
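one way to model the response in python, as a sketch. the field names are the ones listed above; the dataclass layout, the `payload` helper, and treating solved and timing_ms as optional are assumptions, since the page does not say which fields are always present.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ScrapeResult:
    id: str
    url: str
    status: int
    content: Optional[str] = None   # rendered html, when no extract was sent
    data: Optional[dict] = None     # parsed values, when extract was sent
    solved: list = field(default_factory=list)
    timing_ms: dict = field(default_factory=dict)

    @classmethod
    def from_json(cls, raw: str) -> "ScrapeResult":
        doc = json.loads(raw)
        return cls(id=doc["id"], url=doc["url"], status=doc["status"],
                   content=doc.get("content"), data=doc.get("data"),
                   solved=doc.get("solved", []),
                   timing_ms=doc.get("timing_ms", {}))

    def payload(self):
        """Return parsed data when extractors ran, else the rendered html."""
        return self.data if self.data is not None else self.content


sample = ('{"id": "req_01HF3X9Z2A", "url": "https://shop.example/p/42", '
          '"status": 200, "data": {"price": "$42.00"}}')
print(ScrapeResult.from_json(sample).payload())  # {'price': '$42.00'}
```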

errors

4xx for client errors (bad url, missing auth). 5xx only when scantir itself fails, never when the target did. target failures come back as a 4xx/5xx value in the status field of an otherwise successful scrape.
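the two levels of status can be told apart like this. it is a sketch: `classify` and its labels are hypothetical names, and it assumes the http status of the api response and the decoded json body are both at hand.

```python
def classify(http_status: int, body: dict) -> str:
    """Label the outcome of one /v1/scrape call.

    http_status is the status of the api response itself;
    body["status"] is the target site's status reported inside it.
    """
    if 400 <= http_status < 500:
        return "client_error"    # bad url, missing auth: fix the request
    if http_status >= 500:
        return "service_error"   # scantir itself failed
    if body.get("status", 0) >= 400:
        return "target_error"    # scrape worked, the target answered 4xx/5xx
    return "ok"


print(classify(200, {"status": 404}))  # target_error
```

checking the outer status first matters: an error response carries no target status, so reading body["status"] only makes sense once the api call itself has succeeded.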

# basic render + extract
curl https://api.scantir.dev/v1/scrape \
  -H "authorization: bearer $KEY" \
  -H "content-type: application/json" \
  -d '{
    "url": "https://shop.example/p/42",
    "render_js": true,
    "session_id": "shopper-42",
    "extract": {
      "title": "h1.product-title",
      "price": ".price::text",
      "in_stock": "[data-stock=\"true\"]::exists"
    }
  }'

response · 200
{
  "id": "req_01HF3X9Z2A",
  "url": "https://shop.example/p/42",
  "status": 200,
  "solved": ["cloudflare"],
  "data": {
    "title":    "honeycomb jar, 8oz",
    "price":    "$42.00",
    "in_stock": true
  },
  "timing_ms": {
    "dns": 8,  "tls": 34,
    "ttfb": 92, "render": 48,
    "extract": 22, "total": 204
  }
}

error · 402
{
  "error": {
    "code":    "quota_exceeded",
    "message": "monthly request quota hit",
    "docs":    "https://scantir.dev/docs/errors#402"
  }
}
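a sketch for turning an error body like the one above into a python exception. `ScantirError` and `parse_error` are hypothetical names, and it is an assumption that every error response uses the same error.code / error.message / error.docs envelope as this 402 example.

```python
import json


class ScantirError(Exception):
    """Error answer from the api itself (hypothetical class)."""

    def __init__(self, http_status, code, message, docs=None):
        super().__init__(f"{http_status} {code}: {message}")
        self.http_status = http_status
        self.code = code
        self.message = message
        self.docs = docs


def parse_error(http_status: int, raw: str) -> ScantirError:
    """Build a ScantirError from a raw response body."""
    try:
        err = json.loads(raw).get("error", {})
    except (ValueError, AttributeError):
        err = {}  # body was not a JSON object
    if not isinstance(err, dict):
        err = {}
    # fall back to generic values when the envelope is missing or partial
    return ScantirError(
        http_status,
        err.get("code", "unknown"),
        err.get("message", raw[:200]),
        err.get("docs"),
    )


raw = ('{"error": {"code": "quota_exceeded", '
       '"message": "monthly request quota hit", '
       '"docs": "https://scantir.dev/docs/errors#402"}}')
print(parse_error(402, raw))  # 402 quota_exceeded: monthly request quota hit
```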