The future is all bot vs. bot

Illustration: Gabriella Turrisi/Axios
A new kind of turf war is breaking out on the web, with AI bots battling other AI bots to seize or defend stockpiles of the AI era's most valuable commodity: data.
The big picture: AI makers hungry for more data to train their language models are grabbing everything they can, while information owners are increasingly fighting fire with fire by turning to AI-powered tools to protect their intellectual property.
Driving the news: Cloudflare, the infrastructure and security firm used by 1 in 5 websites, introduced a new service last week that protects clients' content from poaching by data-harvesting bots.
- "We hear clearly that customers don't want AI bots visiting their websites, and especially those that do so dishonestly. To help, we've added a brand new one-click to block all AI bots," Cloudflare said in a blog post.
Catch up quick: Since the '90s, websites have used a simple file called "robots.txt" to declare whether their content is available for automated tools to read and copy.
- These files are like "no trespassing" signs that rely on robots to respect them. They have always been informal pacts, and are neither technically nor legally enforceable.
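How it works: A minimal sketch of the robots.txt honor system, using Python's standard `urllib.robotparser` module. The bot name `ExampleAIBot`, the rules and the example.com URL are hypothetical, made up for illustration. The point is that the check runs inside the crawler itself, so it only works if the crawler chooses to run it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named AI crawler is told to stay out,
# every other visitor is allowed everywhere.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_txt permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/articles/1"
    # A well-behaved crawler performs this check before requesting a page.
    # Nothing in the file itself stops a crawler that skips the check
    # or reports a different user agent.
    for agent in ("ExampleAIBot", "SomeBrowser"):
        print(agent, is_allowed(ROBOTS_TXT, agent, page))
```

A crawler that announces itself under another name gets the permissive default rule, which is why the file works as a sign rather than a lock.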
In the AI era, AI makers are using increasingly aggressive scraping tactics, seizing any web data that's "publicly available."
- Microsoft AI CEO Mustafa Suleyman recently told the Aspen Ideas Festival that all "open web" content is "freeware" available for the taking under fair use principles: "Anyone can copy it, recreate with it, reproduce with it," he said.
- Suleyman called content protected by robots.txt a "gray area ... that's going to work its way through the courts."
Content owners and media publishers sharply disagree.
- But they've been torn between their instinct to protect their intellectual property and their eagerness to take money from those AI makers, like OpenAI, that have been opening their checkbooks to secure data rights.
- Some copyright holders, like the New York Times, have challenged AI data use in court.
Between the lines: Cloudflare's "one-click" tool goes further than robots.txt and actively blocks AI bots.
- Ironically, Cloudflare's tool depends on its own machine learning model to "fingerprint" which bots are working for the AI data stockpilers.
- The company says its model is smart enough to adapt fast when AI makers rename their bots or mask their identities.
- "We're proud to say that our global machine learning model has always recognized this activity as a bot, even when operators lie about their user agent," Cloudflare said in its post.
Zoom out: The bot versus bot free-for-all over access to web data is just the opening salvo of what will be an increasingly hot war in every realm, as AI fuels new attack strategies and defense tactics.
- You can already see it in social media content moderation, where AI-managed rule sets are being applied to identify and manage new floods of AI-generated posts.
- The same dynamic is playing itself out in cybersecurity, where malicious hackers are using AI to probe and exploit vulnerabilities while defenders increasingly rely on AI-based pattern detection to flag anomalous behavior.
- In finance, AI algorithms are already trading against one another, pushing edge-case crises to extremes — and we haven't even begun to see the longer-term impact of "smart contract"-driven crypto trading.
- The future of medicine could pit super-bugs unleashed by AI-empowered biohackers against AI-produced antidotes raced to the field by white-hat biohackers.
- Arms-makers and the military are already preparing for robot-filled, drone-drenched battlefields where AI agents call the shots.
The bottom line: There's no escape from this bot vs. bot "technodialectic," as Andrew Leonard prophetically wrote in a 1996 Wired story.
- Leonard's dictum that bots "cause as many problems as they solve" should be front of mind for every AI CEO today.
Editor's note: This story has been corrected to note that Cloudflare introduced its new tool last week, not Friday.
