scrape endpoint

fetch a url. get parsed content back. handles captchas, proxies, rendering automatically.

production · idempotent · avg 204ms p50

POST https://api.scantir.dev/v1/scrape

takes a url, returns its rendered content. automatically rotates a proxy, solves any captcha challenge, and waits for the page to settle. response shape depends on the extract spec you provide.

TIP: use session_id to keep the same cookie jar across requests. it's the difference between scraping 10 pages and getting blocked on #11.
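a minimal python sketch of the tip above: every request body in a crawl carries the same session_id, so the requests share one cookie jar. `session_payloads` and the `crawl-` id prefix are hypothetical names chosen for illustration, not part of the api.

```python
import uuid


def session_payloads(urls, session_id=None):
    """build one request body per url, all sharing a single session_id."""
    # session_id is an arbitrary string; the "crawl-" prefix is made up here.
    sid = session_id or f"crawl-{uuid.uuid4().hex[:8]}"
    return [{"url": url, "session_id": sid} for url in urls]


pages = [f"https://shop.example/p/{n}" for n in range(40, 43)]
bodies = session_payloads(pages, session_id="shopper-42")
```

each dict in `bodies` can then be sent as the json body of its own request.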

request body

field	type		description
url	string	required	target url. must be http(s). fragments are stripped.
render_js	boolean		run a real chromium and wait for the dom to settle. defaults to false. adds ~100ms.
session_id	string		arbitrary id. persists cookies, fingerprint, and proxy until the session has been idle for ~30min.
region	enum		`us-east` · `us-west` · `eu-west` · `eu-central` · `ap-south` · `ap-northeast` · `auto`
extract	object		map of `name → selector`. css, jsonpath, or `$llm("...")`. returns parsed values in place of raw html.
wait_for	string		css selector to await before capturing. overrides default domcontentloaded.
headers	object		extra headers to forward. `user-agent` is set automatically.
timeout_ms	number		hard cap. default 15000 · max 60000.
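the constraints in the table above can be checked client-side before a request is sent. the sketch below is one way to do that in python; `build_scrape_body` is a hypothetical helper, and the checks only mirror what the table states (http(s) urls, the region enum, the 60000ms cap). the api strips fragments on its own, so doing it locally is optional.

```python
from urllib.parse import urldefrag, urlsplit

REGIONS = {"us-east", "us-west", "eu-west", "eu-central",
           "ap-south", "ap-northeast", "auto"}
MAX_TIMEOUT_MS = 60000  # documented maximum
PASSTHROUGH = {"render_js", "session_id", "extract", "wait_for", "headers"}


def build_scrape_body(url, *, region=None, timeout_ms=None, **optional):
    """validate the documented constraints and return a json-ready dict."""
    if urlsplit(url).scheme not in ("http", "https"):
        raise ValueError("url must be http(s)")
    if region is not None and region not in REGIONS:
        raise ValueError(f"unknown region: {region}")
    if timeout_ms is not None and not 0 < timeout_ms <= MAX_TIMEOUT_MS:
        raise ValueError("timeout_ms must be between 1 and 60000")
    unknown = set(optional) - PASSTHROUGH
    if unknown:
        raise TypeError(f"unknown fields: {sorted(unknown)}")

    # drop the fragment locally; the server would strip it anyway.
    body = {"url": urldefrag(url).url}
    if region is not None:
        body["region"] = region
    if timeout_ms is not None:
        body["timeout_ms"] = timeout_ms
    # remaining documented fields are passed through untouched.
    body.update(optional)
    return body
```

fields left unset are omitted from the body, so the server-side defaults (render_js false, 15000ms timeout) apply.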

response

returns a ScrapeResult. content is a string when no extractor is set; otherwise data holds the parsed result.

id	string	opaque request id. use in support tickets.
url	string	final url after redirects.
status	number	target's http status. 200 means the page loaded.
content	string	rendered html. present when no extract is specified.
data	object	parsed values, keyed by extractor name.
solved	array	list of challenges solved, e.g. `["cloudflare"]`.
timing_ms	object	detailed stopwatch: `dns, tls, ttfb, render, extract, total`.
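a short python sketch of reading a ScrapeResult, assuming the body has already been json-decoded into a dict. `scrape_output` and `slowest_stage` are hypothetical helper names. the sketch also assumes `data` is absent (or null) when no extract spec was sent, which the table implies but does not state outright.

```python
def scrape_output(result):
    """return parsed values if an extract spec was sent, else the raw html."""
    if result.get("data") is not None:
        return result["data"]
    return result["content"]


def slowest_stage(result):
    """name the timing_ms stage that took longest, ignoring any total."""
    stages = {k: v for k, v in result["timing_ms"].items() if k != "total"}
    return max(stages, key=stages.get)


sample = {
    "id": "req_01HF3X9Z2A",
    "status": 200,
    "data": {"title": "honeycomb jar, 8oz", "price": "$42.00", "in_stock": True},
    "timing_ms": {"dns": 8, "tls": 34, "ttfb": 92, "render": 48,
                  "extract": 22, "total": 204},
}
print(scrape_output(sample)["price"])  # $42.00
print(slowest_stage(sample))           # ttfb
```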

errors

4xx for client errors (bad url, missing auth). 5xx only when scantir itself fails, never when the target did. target failures come back as a successful scrape whose `status` field carries the target's 4xx/5xx.
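the rules above separate two status codes: the http status of the scantir response and the `status` field inside its body. a minimal python sketch of that split follows; `classify` and its four labels are hypothetical, chosen only to make the branches explicit.

```python
def classify(http_status, body):
    """sort an outcome into ok / target_error / client_error / scantir_error."""
    if 400 <= http_status < 500:
        return "client_error"    # bad url, missing auth, etc.
    if http_status >= 500:
        return "scantir_error"   # scantir itself failed
    # the scrape succeeded; now look at what the target answered.
    if body.get("status", 0) >= 400:
        return "target_error"
    return "ok"


print(classify(200, {"status": 404}))  # target_error
```

since the endpoint is labelled idempotent, retrying a `scantir_error` is probably safe, though the page does not describe a retry policy.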

# basic render + extract
curl https://api.scantir.dev/v1/scrape \
  -H "authorization: bearer $KEY" \
  -H "content-type: application/json" \
  -d '{
    "url": "https://shop.example/p/42",
    "render_js": true,
    "session_id": "shopper-42",
    "extract": {
      "title": "h1.product-title",
      "price": ".price::text",
      "in_stock": "[data-stock=\"true\"]::exists"
    }
  }'

response · 200

{
  "id": "req_01HF3X9Z2A",
  "url": "https://shop.example/p/42",
  "status": 200,
  "solved": ["cloudflare"],
  "data": {
    "title":    "honeycomb jar, 8oz",
    "price":    "$42.00",
    "in_stock": true
  },
  "timing_ms": {
    "dns": 8,  "tls": 34,
    "ttfb": 92, "render": 48,
    "extract": 22, "total": 204
  }
}

error · 402

{
  "error": {
    "code":    "quota_exceeded",
    "message": "monthly request quota hit",
    "docs":    "https://scantir.dev/docs/errors#402"
  }
}
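the curl call and the 402 body above could be handled from python roughly as follows, using only the standard library. `ScrapeError`, `build_request`, `parse_error`, and `scrape` are hypothetical names. the sketch assumes every error response uses the same `{"error": {code, message, docs}}` envelope as the 402 example, which this page does not confirm.

```python
import json
import urllib.error
import urllib.request

ENDPOINT = "https://api.scantir.dev/v1/scrape"


class ScrapeError(Exception):
    """raised when scantir answers with a 4xx/5xx error envelope."""

    def __init__(self, http_status, code, message, docs=None):
        super().__init__(f"{http_status} {code}: {message}")
        self.http_status, self.code, self.docs = http_status, code, docs


def build_request(key, body):
    """mirror the curl example: POST a json body with a bearer token."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={"authorization": f"bearer {key}",
                 "content-type": "application/json"},
        method="POST",
    )


def parse_error(http_status, raw):
    """turn an error body (assumed envelope shape) into a ScrapeError."""
    err = json.loads(raw)["error"]
    return ScrapeError(http_status, err["code"], err["message"], err.get("docs"))


def scrape(key, body, timeout=65):
    """send the request; the client timeout sits just above the 60000ms cap."""
    try:
        with urllib.request.urlopen(build_request(key, body), timeout=timeout) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as exc:
        raise parse_error(exc.code, exc.read()) from exc
```

a caller would wrap `scrape(...)` in `try/except ScrapeError` and branch on `err.code`, for example backing off when it is `quota_exceeded`.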