Rate limits & concurrency
Per-plan ceilings on sustained RPS and concurrent in-flight requests. What 429 means, how to check your current utilisation, and how to ask for more.
Two ceilings apply to every API key:
- Concurrency - how many requests can be in-flight at the same moment.
- Sustained throughput - how many requests per second, averaged over a short window, you can send before we start returning 429.
Both are generous enough that most customers never hit them, and both scale with your plan.
Per-plan ceilings
| Plan | Concurrent in-flight | Sustained RPS | Burst ceiling |
|---|---|---|---|
| Free | 10 | 2 | 10 for 3s |
| Vibe ($19/mo) | 30 | 10 | 40 for 3s |
| Pro ($79/mo) | 50 | 25 | 60 for 3s |
| Custom ($100-$2,000/mo) | 100 - 500+ | 50 - 500 | 2x sustained for 3s |
“Burst ceiling” means you can briefly exceed the sustained RPS by that much before we start pushing back. Once the 3s window closes, you’re back to the sustained figure.
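One way to picture the interplay between sustained RPS and the burst ceiling is a token bucket. The server's exact algorithm isn't documented here, so treat this as an illustrative sketch: tokens refill at the sustained rate, the bucket holds up to the burst ceiling, and an empty bucket corresponds to a 429.

```javascript
// Illustrative token bucket — a sketch, not the server's actual algorithm.
// Tokens refill at `sustainedRps` per second; the bucket holds at most
// `burstCeiling` tokens. Each request spends one token.
class TokenBucket {
  constructor(sustainedRps, burstCeiling) {
    this.rate = sustainedRps;
    this.capacity = burstCeiling;
    this.tokens = burstCeiling;
    this.last = Date.now();
  }
  tryTake(now = Date.now()) {
    const elapsedSec = (now - this.last) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.rate);
    this.last = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true; // request allowed
    }
    return false; // back off, as you would on a real 429
  }
}
```

With the Pro figures from the table (25 sustained, burst ceiling 60), a cold start lets a burst of 60 requests through at once, after which throughput settles at the 25/s refill rate.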
What you see when you hit a limit
A 429 response with a standard Retry-After header:
```http
HTTP/1.1 429 Too Many Requests
Retry-After: 2
Content-Type: application/json

{
  "error": "rate_limited",
  "message": "Exceeded sustained RPS (25) for key asa_live_Tq0x...",
  "docs_url": "https://amazonscraperapi.com/docs/guides/rate-limits",
  "request_id": "req_9f2c..."
}
```
Always respect Retry-After. All of our official SDKs (Node / Python / Go / CLI) do this automatically. If you’re calling raw HTTP, the pattern is:
```javascript
if (res.status === 429) {
  await sleep(Number(res.headers.get("Retry-After") || 1) * 1000);
  return retry();
}
```
Checking your current utilisation
Every successful response carries two informational headers:
- `Asa-Concurrency: 3/20` - you have 3 requests in-flight out of 20 allowed.
- `Asa-Rps: 8/10` - you're sending 8 RPS on average in the current window against a 10 RPS ceiling.
Plot these in your own observability stack and you’ll see headroom before you run out of it.
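A small parser for those headers might look like this. The `used/limit` shape is taken from the examples above; treat the exact header grammar as an assumption:

```javascript
// Parse a "used/limit" utilisation header value such as "8/10"
// (from Asa-Rps) or "3/20" (from Asa-Concurrency).
function parseUtilisation(headerValue) {
  const [used, limit] = headerValue.split("/").map(Number);
  return { used, limit, headroom: limit - used, ratio: used / limit };
}

const rps = parseUtilisation("8/10");
// rps.headroom → 2; alert well before rps.ratio reaches 1.
```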
Concurrency, explained
If your ceiling is 30, you can have exactly 30 requests open to our API at once. Open a 31st and it queues client-side (in our SDKs) or gets a 429 immediately (raw HTTP). Long-running requests - for example a Batch poll that takes 8 minutes - consume a concurrency slot the whole time.
Rule of thumb: Concurrency = Target RPS x Average request duration (seconds). At 10 RPS with a 3 s average, you need 30 concurrent slots - exactly what the Vibe plan ships - and Pro's 50 covers roughly a 16 RPS sustained workload at the same 3 s average.
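The rule of thumb is simple enough to encode. `slotsNeeded` here is an illustrative helper, not SDK API:

```javascript
// Concurrency = target RPS x average request duration (seconds),
// rounded up because slots are whole requests.
const slotsNeeded = (targetRps, avgDurationSec) =>
  Math.ceil(targetRps * avgDurationSec);

slotsNeeded(10, 3); // 30 — matches the Vibe plan's ceiling
slotsNeeded(50, 2); // 100 — the bottom of the Custom range
```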
Batch endpoint
Batch submissions themselves are cheap - one POST /v1/amazon/batch with 1,000 items consumes one concurrency slot for the duration of the POST (a few seconds). The items inside the batch are processed by our worker under a separate internal concurrency budget that doesn’t count against yours.
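As a sketch of that trade-off: the `POST /v1/amazon/batch` path comes from this page, but the base URL, auth header, and request-body shape (an `items` array) are assumptions — check the Batch endpoint reference for the real schema:

```javascript
// Hypothetical batch submission helper. One POST = one concurrency slot,
// however many items it carries. Base URL, auth header, and body shape
// are assumptions, not the documented schema.
async function submitBatch(apiKey, asins) {
  return fetch("https://api.amazonscraperapi.com/v1/amazon/batch", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      items: asins.map((asin) => ({ query: asin, domain: "com" })),
    }),
  });
}
```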
How to get more
- Upgrade. Each paid tier raises both ceilings - see the table above for the exact figures.
- Custom plan. If you’re already on Pro and hitting 50 concurrent, the Custom tier ladder goes up to 1,000 concurrent for $2,000/mo.
- Short-term boosts. Email info@amazonscraperapi.com with the numbers you need and how long. If it fits the pool budget, we’ll flip it on without requiring a plan change.
Preventing bursts in your own code
Even when you have the headroom, a polite client is a resilient client. The Node SDK exposes a built-in token-bucket limiter:
```javascript
import { AmazonScraperAPI } from "amazon-scraper-api-sdk";

const asa = new AmazonScraperAPI("asa_live_…", {
  maxConcurrency: 30, // defaults to your plan's ceiling if omitted
  maxRps: 10,
});

// Now you can fire-and-forget 10,000 ASINs; the SDK paces them.
const results = await Promise.all(
  asins.map(a => asa.product({ query: a, domain: "com" }))
);
```
The Python, Go, and CLI SDKs ship the same limiter.