Reddit v. Perplexity: Terms of Access as the Next Front in AI Data Litigation
November 04, 2025
Written by: Jameson Pasek
When Reddit sued Anthropic in June, the company drew attention for omitting copyright claims altogether. Instead, it framed data scraping as an issue of how content is accessed rather than who owns it, grounding its argument in platform contracts and system-use theories.[1] Reddit’s new lawsuit against Perplexity continues that approach while adding allegations involving third-party data-collection intermediaries. Together, these cases illustrate how platforms are increasingly turning to contract law to regulate AI training and data use.[2]
The New Case at a Glance
Reddit has filed suit in federal court against Perplexity, an AI-powered search and answer engine, along with several data-collection partners. The complaint alleges that the defendants bypassed Reddit’s rules and safeguards to obtain Reddit content at scale for commercial AI use without a license. Reddit seeks damages and an injunction blocking further use of its data. Perplexity denies the allegations and maintains that it merely summarizes public discussions rather than training on Reddit data.[3]
As in the Anthropic case, Reddit’s arguments rely on its User Agreement, Privacy Policy, and technical access controls. By emphasizing the manner of access rather than ownership of content, Reddit positions the case around contract, trespass, interference, and unfair competition claims, rather than copyright disputes.[4]
Strategic Through-Line from Reddit v. Anthropic
From a litigation perspective, the Perplexity case represents an extension of the same approach Reddit used earlier this year:
- Contract as the foundation. Reddit argues that anyone (human or automated) who accesses the platform is bound by its terms, including restrictions on commercial use and automated scraping outside approved application programming interfaces (APIs).
- System-use harms, not expressive harms. The company frames scraping as unauthorized use of systems and interference with its operations and relationships, rather than as copying protected expression.
- Licensing and damages. Reddit’s established licensing programs with companies such as Google and OpenAI are central to its case. They provide a benchmark for damages and help define what constitutes authorized access.[5]
This strategy aligns with how major platforms have invested in user-privacy controls, rate-limiting, and structured APIs. Enforcing access through contracts and technical compliance mechanisms allows for remedies that courts can more readily oversee.
What’s New Here: Intermediaries and “Data Laundering”
The Perplexity complaint names multiple scraping partners and alleges that they disguised their identities, evaded rate limits, and concealed the source of the data ultimately used by Perplexity. Those allegations broaden the factual and legal landscape, raising three important considerations:[6]
- Attribution and knowledge. Establishing that platform content was used to train or power AI systems may require examining logs, vendor agreements, and forensic evidence. When intermediaries are involved, plaintiffs will likely invoke joint-venture or agency theories to connect the dots.
- Scope of injunctive relief. If a court finds unauthorized access, remedies may need to reach not only the AI developer but also third-party data sources and cached datasets. That could complicate deletion or “model-purge” orders and highlight the need for clear data-handling protocols in future licenses.
- Damages and unjust enrichment. The inclusion of intermediaries invites alternative damages theories based on avoided license fees, disgorgement, or restitution tied to the value derived from Reddit content. Defendants are likely to dispute causation and the appropriate measure of enrichment.
The Legal Questions to Watch
- Contract formation and assent. A key question is whether automated agents can assent to online terms and whether those terms were properly presented. Courts have been increasingly willing to enforce such agreements against scrapers, but outcomes depend on notice, technical barriers, and available APIs.[7]
- Authorization and circumvention. Reddit will likely argue that defendants intentionally bypassed access limits and security measures. Defendants may counter that publicly viewable pages remain accessible to all users and that using search results is not equivalent to direct scraping. The tension between “public availability” and “authorized commercial use” will be central.
- State-law torts vs. federal preemption. Defendants may seek to reframe the dispute as one involving copyright to support preemption arguments or removal. Reddit would necessarily contend that its claims are rooted in contract and interference, not copyright. Early motion practice could help define the boundaries between these theories.
- Design of injunctions. Any injunction will likely need to specify technical obligations such as respecting robots.txt, using licensed APIs, or purging cached data. Courts increasingly expect clarity and practicality in these orders given the rapid pace of AI development.
The Broader Impact
If courts uphold this contract-based framework, AI developers will face a clearer choice: obtain data through licensed channels, avoid it entirely, or accept higher litigation risk. Over time, the market may shift toward verified data pipelines and provenance tracking as a competitive advantage.[8]
Platforms are likely to continue refining their terms of service and monetizing access through licensing. That could lead to more granular arrangements, such as community-level licenses, privacy-tiered data feeds, and built-in deletion features, that balance compliance and accessibility.
Conclusion
Reddit’s latest lawsuit against Perplexity illustrates a broader evolution in how data rights are asserted in the age of AI. By focusing on authorization and access terms rather than ownership of content, platforms are testing a practical and enforceable route for protecting their data ecosystems. The outcome could shape how courts, companies, and developers approach data licensing and compliance for years to come.[9]
This publication is distributed with the understanding that the author, publisher, and distributor of this publication and/or any linked publication are not rendering legal, accounting, or other professional advice or opinions on specific facts or matters and, accordingly, assume no liability whatsoever in connection with its use. Pursuant to applicable rules of professional conduct, portions of this publication may constitute Attorney Advertising. The choice of a lawyer is an important decision and should not be based solely upon advertisements.
[1] Michelle Morgante, Reddit Suit Against Anthropic Over Using Content to Train AI Cites Contract, Not Copyright, THE RECORDER (June 6, 2025), Law.com, https://www.law.com/therecorder/2025/06/06/reddit-suit-against-anthropic-over-using-content-to-train-ai-cites-contract-not-copyright/.
[2] Dylan Butts, Reddit Accuses Perplexity of Stealing User Posts, Expanding Data Rights Battle with AI Industry, CNBC (Oct. 23, 2025), https://www.cnbc.com/2025/10/23/reddit-user-data-battle-ai-industry-sues-perplexity-scraping-posts-openai-chatgpt-google-gemini-lawsuit.html.
[3] Reddit, Inc. v. SerpApi LLC, No. 1:25-cv-08736 (S.D.N.Y., Oct. 22, 2025).
[4] Complaint at ¶ __, Reddit, Inc. v. Anthropic, PBC, No. CGC-25-625892 (Cal. Super. Ct. filed June 4, 2025).
[5] Id.
[6] Reddit, No. 1:25-cv-08736.
[7] hiQ Labs, Inc. v. LinkedIn Corp., 31 F.4th 1180 (9th Cir. 2022).
[8] Butts, supra note 2.
[9] Reddit, No. CGC-25-625892.