Noah Bennett · ChocoData Amazon data expert · 11 min read

Is Scraping Amazon Legal? A 2026 Guide for Developers

Updated on 2026-06-21

The Answer

Scraping publicly visible Amazon data (product titles, prices, ratings, review excerpts, search results) is generally legal in the United States under the Ninth Circuit ruling in hiQ Labs v. LinkedIn, which held that scraping publicly accessible web data likely does not violate the Computer Fraud and Abuse Act. Amazon’s own Conditions of Use prohibit automated access, but that is a contract matter rather than a criminal one. The practical risks are therefore operational and civil (account termination, IP-level blocks and rate limits, cease-and-desist letters), not criminal. For commercial teams, the defensible posture is to scrape only what a logged-out browser can see, respect rate limits and robots.txt directives, avoid any flow that requires an Amazon login, and keep records of what was scraped and why. This is not legal advice; consult counsel in your jurisdiction.

What Is Web Scraping in the Context of Amazon?

Web scraping in the context of Amazon means automating the extraction of data from publicly visible Amazon pages, primarily product detail pages (URLs starting with /dp/), search result pages (/s?k=), and category pages. The data extracted typically includes product titles, prices, availability, star ratings, review counts, the first 8 to 10 featured reviews, seller names, and category breadcrumbs. Scraping does not require an Amazon account in its base form because all of this data renders to anonymous browsers without authentication.
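As a rough illustration of the URL patterns above, the following Python sketch sorts a URL into product detail page, search page, or anything else, and pulls the ASIN out of a /dp/ path. The function name classify_amazon_url is hypothetical, and B000000000 is a placeholder, not a real product. It assumes ASINs are 10 uppercase alphanumeric characters and that search URLs use the /s path with a k query parameter.

```python
import re
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlparse

# Assumption: an ASIN is a 10-character uppercase alphanumeric identifier.
_ASIN_RE = re.compile(r"/dp/([A-Z0-9]{10})(?:/|$)")


def classify_amazon_url(url: str) -> Tuple[str, Optional[str]]:
    """Return (page_type, asin); asin is None unless the URL is a product page."""
    parsed = urlparse(url)
    match = _ASIN_RE.search(parsed.path)
    if match:
        return "product", match.group(1)
    # Search result pages: path "/s" with a "k" (keyword) query parameter.
    if parsed.path == "/s" and "k" in parse_qs(parsed.query):
        return "search", None
    return "other", None


print(classify_amazon_url("https://www.amazon.com/dp/B000000000"))
print(classify_amazon_url("https://www.amazon.com/s?k=usb+cable"))
```

A classifier like this is mainly useful as an allowlist: a crawler can refuse to fetch anything that does not come back as "product" or "search".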

Scraping is legally distinct from data resale, ToS violation, and unauthorized access. The act of fetching a public URL and parsing the HTML is not the legally fraught part; what gets done with the data afterward and how it was fetched (rate, frequency, authentication state) carries the legal weight.

Is It Legal to Scrape Amazon in the United States?

In the United States, scraping publicly visible Amazon data is generally legal, based on the hiQ Labs v. LinkedIn ruling from the Ninth Circuit Court of Appeals (originally decided in 2019, vacated by the Supreme Court in 2021 for reconsideration in light of Van Buren v. United States, and reaffirmed on remand in 2022). The court held that scraping publicly accessible data likely does not violate the Computer Fraud and Abuse Act (CFAA), because the CFAA’s “without authorization” language applies to circumventing access controls (like login walls), not to viewing data that the operator has chosen to make publicly visible.

That precedent covers product titles, prices, ratings, review counts, Buy Box sellers, and the aggregate review data visible without login. It does not cover scraping behind an Amazon login, buyer order history, seller dashboards, or the private parts of Seller Central. Those flows touch authenticated content and bring CFAA liability back into scope.

The hiQ ruling is binding in the Ninth Circuit, which includes Washington (where Amazon is headquartered) and California. Other circuits have not directly adopted it, though no contradicting ruling appears to have emerged at the federal appellate level since 2022.

What About Amazon’s Terms of Service?

Amazon’s Conditions of Use explicitly prohibit “data mining, robots, or similar data gathering and extraction tools.” That clause is a contract term, not a criminal statute. Violating it can give Amazon grounds to terminate your account, revoke your Associates affiliate access, or send a cease-and-desist letter. It does not give them grounds to bring a federal CFAA claim, per hiQ.

In practice, Amazon’s enforcement is technical rather than legal: aggressive IP-range bans, account suspension on Seller Central, and the AWS WAF challenge layer that serves “Robot Check” pages to suspicious traffic. Most commercial scrapers experience the technical enforcement (CAPTCHAs, IP blocks) before they ever experience a legal letter.

Is It Legal to Scrape Amazon in the European Union?

In the European Union, scraping publicly visible Amazon data sits under a different framework. The relevant laws are the EU Database Directive (96/9/EC) for database protection, the GDPR for personal data, and each member state’s national computer-misuse statute for unauthorized access.

The 2021 CV-Online Latvia v. Melons ruling at the Court of Justice of the European Union held that database scraping is permissible if it does not “substantially harm” the database operator’s economic investment. That standard is more restrictive than US law: a scraper that pulls Amazon data at a rate or scale that meaningfully impacts Amazon’s infrastructure or commercial value could face liability under EU law where the same scraper would be safe in the US.

GDPR adds a separate constraint: any review text containing personal data of EU customers becomes subject to data minimization, purpose limitation, and erasure rights once you store it. For most pricing or catalog scraping, this is a non-issue (no personal data). For review-text scraping, it is significant, and it is why most commercial review datasets are anonymized at storage time.
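A minimal sketch of what anonymizing at storage time could look like, in Python. The field names (reviewer_name, reviewer_profile_url, reviewer_ref) and the function anonymize_review are hypothetical, chosen only for illustration. The sketch drops the personal fields and keeps a keyed hash so reviews by the same person can still be grouped. Note that a keyed hash is pseudonymization rather than full anonymization, and the review text itself may still contain personal data, so dropping the reference entirely is the stricter option.

```python
import hashlib
import hmac

# Hypothetical field names treated as personal data in this sketch.
PERSONAL_FIELDS = ("reviewer_name", "reviewer_profile_url")


def anonymize_review(review: dict, secret_key: bytes) -> dict:
    """Return a copy of `review` with the personal fields removed.

    A truncated keyed hash of the reviewer name is stored as `reviewer_ref`
    so duplicates can be grouped without keeping the name itself.
    """
    cleaned = {k: v for k, v in review.items() if k not in PERSONAL_FIELDS}
    name = review.get("reviewer_name")
    if name:
        digest = hmac.new(secret_key, name.encode("utf-8"), hashlib.sha256)
        cleaned["reviewer_ref"] = digest.hexdigest()[:16]
    return cleaned


raw = {"reviewer_name": "Jane Doe", "rating": 5, "text": "Works well"}
print(anonymize_review(raw, secret_key=b"rotate-me"))
```

The input dictionary is left untouched, so the raw record can be discarded explicitly once the cleaned copy has been written.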

What Amazon Data Can You Legally Scrape?

You can legally scrape Amazon data that is visible to a logged-out browser and not protected by an authentication wall. The defensible scope:

Product detail page fields: ASIN, title, price, strikethrough price, currency, bullets, full description, category ladder, images, variations, Buy Box seller, rating, review count, top reviews
Search result pages: organic and sponsored positions, prices, ratings, image URLs
Category pages: best-seller rank, category trees
Public seller profile pages (the ones you reach without logging in)
Help documentation, policy pages, and category landing pages

What you should not scrape under the cleanest legal posture:

Anything behind Amazon’s sign-in page (https://www.amazon.com/ap/signin)
Buyer order history (requires login)
Seller Central (requires authenticated seller account)
The post-November 2024 paginated review pages, which now redirect unauthenticated requests to a login wall
Any endpoint that requires solving a CAPTCHA or other bot-detection challenge to reach

The bright line is the login wall. Scraping past it shifts the analysis from “publicly visible data” to “unauthorized access,” which is exactly the framing that pre-hiQ courts used to find scrapers liable.
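One way to enforce that line in code is to classify every response and stop as soon as a login redirect or a bot challenge appears, rather than trying to get past it. The Python sketch below is an illustration only: the function name classify_response is hypothetical, and both the sign-in path prefix and the “Robot Check” marker are assumptions that should be verified against real responses.

```python
from typing import Optional
from urllib.parse import urlparse

# Assumed markers; verify both against real responses before relying on them.
LOGIN_PATH_PREFIXES = ("/ap/signin",)
CHALLENGE_MARKERS = ("Robot Check",)


def classify_response(status: int, location: Optional[str], body: str) -> str:
    """Return 'login_wall', 'bot_challenge', or 'ok' for a fetched response."""
    # A redirect whose target is the sign-in page means the content is gated.
    if 300 <= status < 400 and location:
        if urlparse(location).path.startswith(LOGIN_PATH_PREFIXES):
            return "login_wall"
    # A challenge page means the request was flagged; do not try to solve it.
    if any(marker in body for marker in CHALLENGE_MARKERS):
        return "bot_challenge"
    return "ok"


print(classify_response(302, "https://www.amazon.com/ap/signin?x=1", ""))
print(classify_response(200, None, "<title>Robot Check</title>"))
```

A pipeline built this way would record anything other than "ok" and drop the URL from its queue, which also produces a paper trail showing that gated pages were never fetched.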

What Are the Legal Risks of Scraping Amazon?

The practical legal risks of scraping Amazon, ordered by likelihood:

Account termination and IP bans. The most common outcome. Amazon’s anti-bot layer flags your IP, then your account, then your downstream uses (Associates affiliate links stop tracking, Seller Central locks).
Cease-and-desist letter. Amazon’s legal team sends these when scraping is large-scale, public-facing, or commercial. Most teams respond by changing infrastructure (rotating IPs, switching to a managed scraper API). A small number end up in court.
Civil contract suit for ToS breach. Rare and expensive for Amazon to bring, almost always reserved for commercial scrapers with public-facing products that Amazon views as competitive.
Federal CFAA suit. Effectively foreclosed by hiQ for publicly visible data. Real risk only if scraping crosses the login wall.
GDPR enforcement (EU only). Real risk if scraped data includes EU customer personal data and is stored without lawful basis.

Most commercial scrapers operating on publicly visible Amazon data end up at the first or second risk and rarely go further. The mitigation pattern is the same in both cases: switch to a managed scraping API like the Amazon Scraper API that rotates infrastructure on your behalf, so a single block does not take out the whole pipeline.

How Do You Scrape Amazon Legally and Defensibly?

The defensible posture for commercial Amazon scraping has six elements:

Stay on the public side of the login wall. Never scrape pages that require authentication.
Honor robots.txt directives, even though they are not legally binding in the US. Amazon’s robots.txt allows the URL paths most commercial scrapers use (/dp/, /s/, /gp/product/) and disallows the ones you should not be hitting anyway.
Use rate limits that mimic human browsing. A scraper that hits 100 requests per second from one IP looks like an attack. The same scraper distributed across residential IPs at one request per IP per minute looks like normal traffic.
Identify the use case in good faith. Document why you are scraping (price monitoring for MAP compliance, competitor analysis, market research, ML training data) so that you can produce that record if asked. Bad-faith uses (counterfeit listing creation, review fraud) lose any legal defense.
Anonymize personal data at storage time. Strip reviewer names from review datasets. Hash or drop IP addresses if you log them.
Keep the data internal or aggregate. Reselling raw Amazon data is the surest way to draw a legal letter. Aggregating (price indices, market shares, trend reports) is much safer.
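The rate-limit element can be sketched as a per-key throttle in Python, where each key stands for one exit IP. The class name PerKeyRateLimiter and the 60-second interval are illustrative assumptions based on the “one request per IP per minute” figure above, not a recommended setting. The clock is injectable so the behavior can be tested without sleeping.

```python
import time
from typing import Callable, Dict


class PerKeyRateLimiter:
    """Allow at most one request per `min_interval` seconds for each key."""

    def __init__(self, min_interval: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._last_sent: Dict[str, float] = {}

    def try_acquire(self, key: str) -> bool:
        """Return True and record the time if `key` may send now, else False."""
        now = self._clock()
        last = self._last_sent.get(key)
        if last is not None and now - last < self.min_interval:
            return False
        self._last_sent[key] = now
        return True


limiter = PerKeyRateLimiter(min_interval=60.0)
print(limiter.try_acquire("ip-a"))  # first request for this key is allowed
print(limiter.try_acquire("ip-a"))  # an immediate second request is refused
```

Returning False instead of sleeping leaves the scheduling decision to the caller, which can move on to another key or requeue the URL.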

What Is the Difference Between Scraping and Using Amazon’s Official APIs?

Amazon’s official APIs (Product Advertising API, Selling Partner API) are legally and technically distinct from scraping. The Product Advertising API requires an active Amazon Associates account and an approved use case (typically affiliate marketing). The Selling Partner API requires a registered seller account.

The official APIs return structured data with rate limits, support, and clear ToS. They do not require any of the residential proxy or anti-bot infrastructure that scraping does. They also impose stricter limits on what data you can access, how long you can store it, and what you can do with it.

Most commercial scrapers exist because the official APIs do not return the data the team needs (the Product Advertising API’s data is much more limited than what’s visible on a product page) or because the use case does not fit the API’s allowed-purposes clause (most market intelligence, competitor analysis, and MAP-compliance work falls outside Associates’ permitted use).

What Are Best Practices to Avoid Trouble?

Beyond the legal posture above, four operational practices keep most scrapers under Amazon’s enforcement radar:

Use country-matched residential IPs. A scraper hitting amazon.de from a US datacenter IP triggers anti-bot in milliseconds. The same scraper through a German residential IP rarely does.
Rotate fingerprints, not just IPs. Amazon’s WAF inspects TLS handshakes, HTTP/2 frame ordering, and header sets. Rotating only the IP while keeping the same fingerprint still gets blocked.
Cache aggressively. A product page rarely changes in under an hour. Caching responses for 30 to 60 minutes means repeat lookups within that window never reach Amazon, which cuts your request count and, with it, your detection exposure.
Use a managed Amazon scraper API for production traffic. Tools like Amazon Scraper API handle the proxy, fingerprint, and retry orchestration on their side, which means your team is not the one Amazon sees when an IP gets flagged. Pricing starts at $0.90 per 1,000 successful requests, with 1,000 free on signup.
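The caching practice can be illustrated with a small in-memory cache keyed by URL. This is a sketch under stated assumptions: the class name TTLCache is hypothetical, the 30-minute lifetime comes from the range suggested above, and a production pipeline would more likely use a shared store than a per-process dictionary.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Keep fetched page bodies for `ttl_seconds`, keyed by URL."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> Optional[str]:
        """Return the cached body, or None if it is missing or expired."""
        entry = self._store.get(url)
        if entry is None:
            return None
        stored_at, body = entry
        if self._clock() - stored_at >= self.ttl_seconds:
            del self._store[url]  # expired: evict so the caller refetches
            return None
        return body

    def put(self, url: str, body: str) -> None:
        self._store[url] = (self._clock(), body)


cache = TTLCache(ttl_seconds=30 * 60)
cache.put("https://www.amazon.com/dp/B000000000", "<html>...</html>")
print(cache.get("https://www.amazon.com/dp/B000000000") is not None)
```

The caller checks get first and only issues a network request when it returns None, then stores the fresh body with put.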

FAQ

Can Amazon sue me for scraping their site?

In theory yes, in practice rarely. Amazon’s primary enforcement against commercial scrapers is technical (IP bans, account terminations, CAPTCHA challenges) rather than legal. They do send cease-and-desist letters to particularly visible or large-scale scrapers, and a small number of those have escalated to civil suit. The legal exposure is to a contract-breach claim for ToS violation, not a federal CFAA claim, because hiQ closed off the CFAA path for publicly visible data.

Is scraping Amazon reviews legal?

Scraping the 8 to 10 featured reviews that render on the public product detail page is legal under hiQ. Scraping the full paginated review history past the featured block is more complicated because, after November 5, 2024, those pages redirect unauthenticated requests to a login wall. Scraping past that wall would require an authenticated session, which moves the activity into “unauthorized access” territory under the CFAA. For full review extraction at scale, see our Amazon reviews scraping guide.

Can I sell data scraped from Amazon?

Selling raw Amazon data (full product detail dumps, full review datasets) is the riskiest commercial use because it directly competes with Amazon’s own data licensing business and is the most likely to trigger legal action. Selling derivative or aggregated data (price indices, market share reports, category trend analyses) is much safer because the underlying data is anonymized and aggregated, and the buyer does not get a usable Amazon dataset out of the transaction.

Does Amazon’s robots.txt allow scraping?

Amazon’s robots.txt explicitly allows the product detail page (/dp/), search (/s/), and product (/gp/product/) URL patterns that commercial scrapers use most. It disallows authenticated pages (/gp/profile/, /gp/buy/, /gp/order-history/), which lines up with the legal “stay on the public side of the login wall” rule. Robots.txt is not legally binding in the US, but honoring it is a good-faith signal that meaningfully strengthens your defensive posture.
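Checking a URL against robots.txt before fetching it can be done with Python’s standard urllib.robotparser module. The sketch below parses a short, made-up rule set rather than Amazon’s actual file, and the user agent string example-bot and the example.com URLs are placeholders. In practice you would load the live robots.txt and re-check it periodically, since the rules can change.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules only; this is NOT a copy of Amazon's real robots.txt.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /gp/profile/
Disallow: /gp/buy/
"""


def build_checker(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a checker without making a network call."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


checker = build_checker(SAMPLE_ROBOTS_TXT)
for url in ("https://www.example.com/dp/B000000000",
            "https://www.example.com/gp/profile/abc"):
    print(url, checker.can_fetch("example-bot", url))
```

Logging each can_fetch decision alongside the URL gives you a record that the crawler honored the directives in force at the time.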

Is using a managed Amazon scraping API legal?

Using a managed Amazon scraping API is legal under the same principles as any other commercial scraping: the underlying data must be publicly visible, the use case must be in good faith, and personal data must be handled per applicable privacy law. The vendor (the API provider) is the party making the actual outbound requests, which is why teams often switch to a managed service after their first cease-and-desist. Amazon Scraper API handles the request rotation, IP geography, and fingerprint management, leaving the customer to focus on the data use case rather than the infrastructure.

Can Amazon ban my account for scraping?

Yes. Amazon’s enforcement against scraping is primarily technical, and account-level bans are the most common consequence. If your scraping is associated with an Amazon Associates affiliate account, Seller Central account, or AWS account, those accounts can be terminated independently of any legal action. The mitigation is to keep scraping infrastructure (IPs, accounts, traffic) fully separate from any commercial Amazon accounts you maintain.

What about scraping Amazon for AI training data?

Scraping public Amazon data to train ML or LLM models sits in the same legal framework as any other commercial scraping, so the hiQ precedent applies. New York Times v. OpenAI, filed in December 2023, is a copyright case and has not yet produced a binding rule on training-data scraping specifically, and most US AI training pipelines continue to scrape publicly visible web data. The pragmatic risk is the same as for any other large-scale scraper: technical enforcement first, legal letters second. For AI-agent runtime data (not training, but real-time fetching), an MCP-server-equipped scraper is the cleanest pattern because it routes through a vendor with explicit Amazon ToS handling.

Sources

hiQ Labs v. LinkedIn (Ninth Circuit, 2022) - controlling Ninth Circuit precedent on public-data scraping
Computer Fraud and Abuse Act, 18 USC 1030 - the statute hiQ interpreted
Amazon Conditions of Use - the contract terms scraping technically violates
Amazon robots.txt - official robot directives
CV-Online Latvia v. Melons (CJEU, 2021) - EU database scraping framework
Federal Trade Commission - guidance on web scraping - regulatory perspective on commercial data collection