Rate limits & concurrency
Per-plan ceilings on sustained RPS and concurrent in-flight requests. What 429 means, how to check your current utilisation, and how to ask for more.
Two ceilings apply to every API key:
- Concurrency - how many requests can be in-flight at the same moment.
- Sustained throughput - how many requests per second, averaged over a short window, you can send before we start returning 429.
Both are generous enough that most customers never hit them, and both scale with your plan.
Per-plan ceilings
| Plan | Concurrent in-flight | Sustained RPS | Burst ceiling |
|---|---|---|---|
| Free | 10 | 2 | 10 for 3s |
| Vibe ($19/mo) | 30 | 10 | 40 for 3s |
| Pro ($79/mo) | 50 | 25 | 60 for 3s |
| Custom ($100-$2,000/mo) | 100 - 500+ | 50 - 500 | 2x sustained for 3s |
“Burst ceiling” means you can briefly exceed the sustained RPS by that much before we start pushing back. Once the 3s window closes, you’re back to the sustained figure.
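One way to picture the interplay between sustained RPS and the burst ceiling is a token bucket. The server's exact algorithm isn't documented here, so treat this as an illustrative sketch: tokens refill at the sustained rate, the bucket holds up to the burst ceiling, and an empty bucket corresponds to a 429.

```javascript
// Illustrative token bucket — a sketch, not the server's actual algorithm.
// Tokens refill at `sustainedRps` per second; the bucket holds at most
// `burstCeiling` tokens. Each request spends one token.
class TokenBucket {
  constructor(sustainedRps, burstCeiling) {
    this.rate = sustainedRps;
    this.capacity = burstCeiling;
    this.tokens = burstCeiling;
    this.last = Date.now();
  }
  tryTake(now = Date.now()) {
    const elapsedSec = (now - this.last) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.rate);
    this.last = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true; // request allowed
    }
    return false; // back off, as you would on a real 429
  }
}
```

With the Pro figures from the table (25 sustained, burst ceiling 60), a cold start lets a burst of 60 requests through at once, after which throughput settles at the 25/s refill rate.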
What you see when you hit a limit
A 429 response with a standard Retry-After header:
```http
HTTP/1.1 429 Too Many Requests
Retry-After: 2
Content-Type: application/json

{
  "error": "rate_limited",
  "message": "Exceeded sustained RPS (25) for key asa_live_Tq0x...",
  "docs_url": "https://amazonscraperapi.com/docs/guides/rate-limits",
  "request_id": "req_9f2c..."
}
```
Always respect Retry-After. All of our official SDKs (Node / Python / Go / CLI) do this automatically. If you’re calling raw HTTP, the pattern is:
```javascript
if (res.status === 429) {
  await sleep(Number(res.headers.get("Retry-After") || 1) * 1000);
  return retry();
}
```
Checking your current utilisation
Every successful response carries two informational headers:
- `Asa-Concurrency: 3/20` - you have 3 requests in-flight out of 20 allowed.
- `Asa-Rps: 8/10` - you're sending 8 RPS on average in the current window against a 10 RPS ceiling.
Plot these in your own observability stack and you’ll see headroom before you run out of it.
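A small parser for those headers might look like this. The `used/limit` shape is taken from the examples above; treat the exact header grammar as an assumption:

```javascript
// Parse a "used/limit" utilisation header value such as "8/10"
// (from Asa-Rps) or "3/20" (from Asa-Concurrency).
function parseUtilisation(headerValue) {
  const [used, limit] = headerValue.split("/").map(Number);
  return { used, limit, headroom: limit - used, ratio: used / limit };
}

const rps = parseUtilisation("8/10");
// rps.headroom → 2; alert well before rps.ratio reaches 1.
```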
Concurrency, explained
If your ceiling is 30, you can have exactly 30 requests open to our API at once. Open a 31st and it queues client-side (in our SDKs) or gets a 429 immediately (raw HTTP). Long-running requests - for example a Batch poll that takes 8 minutes - consume a concurrency slot the whole time.
Rule of thumb: Concurrency = Target RPS x Average request duration (seconds). At 10 RPS with a 3 s average, you need 30 concurrent slots - exactly what the Vibe plan ships - and Pro's 50 covers roughly a 16 RPS sustained workload at the same 3 s average.
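The rule of thumb is simple enough to encode. `slotsNeeded` here is an illustrative helper, not SDK API:

```javascript
// Concurrency = target RPS x average request duration (seconds),
// rounded up because slots are whole requests.
const slotsNeeded = (targetRps, avgDurationSec) =>
  Math.ceil(targetRps * avgDurationSec);

slotsNeeded(10, 3); // 30 — matches the Vibe plan's ceiling
slotsNeeded(50, 2); // 100 — the bottom of the Custom range
```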
Batch endpoint
Batch submissions themselves are cheap - one POST /v1/amazon/batch with 1,000 items consumes one concurrency slot for the duration of the POST (a few seconds). The items inside the batch are processed by our worker under a separate internal concurrency budget that doesn’t count against yours.
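As a sketch of that trade-off: the `POST /v1/amazon/batch` path comes from this page, but the base URL, auth header, and request-body shape (an `items` array) are assumptions — check the Batch endpoint reference for the real schema:

```javascript
// Hypothetical batch submission helper. One POST = one concurrency slot,
// however many items it carries. Base URL, auth header, and body shape
// are assumptions, not the documented schema.
async function submitBatch(apiKey, asins) {
  return fetch("https://api.amazonscraperapi.com/v1/amazon/batch", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      items: asins.map((asin) => ({ query: asin, domain: "com" })),
    }),
  });
}
```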
How to get more
- Upgrade. Each paid tier raises both ceilings - see the table above for the exact figures.
- Custom plan. If you’re already on Pro and hitting 50 concurrent, the Custom tier ladder goes up to 1,000 concurrent for $2,000/mo.
- Short-term boosts. Email info@amazonscraperapi.com with the numbers you need and how long. If it fits the pool budget, we’ll flip it on without requiring a plan change.
Preventing bursts in your own code
Even when you have the headroom, a polite client is a resilient client. The Node SDK exposes a built-in token-bucket limiter:
```javascript
import { AmazonScraperAPI } from "amazon-scraper-api-sdk";

const asa = new AmazonScraperAPI("asa_live_…", {
  maxConcurrency: 30, // defaults to your plan's ceiling if omitted
  maxRps: 10,
});

// Now you can fire-and-forget 10,000 ASINs; the SDK paces them.
const results = await Promise.all(
  asins.map(a => asa.product({ query: a, domain: "com" }))
);
```
The Python, Go, and CLI SDKs ship the same limiter.