Technology

AI site Perplexity uses “stealth tactics” to flout no-crawl edicts, Cloudflare says

Source

Ars Technica

Published

Aug 4, 2025

TL;DR

AI Generated

Cloudflare has accused AI search engine Perplexity of using stealth bots to bypass websites' no-crawl directives, violating long-standing Internet norms. Despite efforts by website owners to block Perplexity's scraping bots through robots.txt files and firewall rules, Cloudflare claims Perplexity still accessed site content. Cloudflare researchers ran tests and found that Perplexity employed stealth tactics to evade the blocks and kept scraping content undetected. The findings raise concerns about how AI crawlers can circumvent established web protocols.
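For readers unfamiliar with how a robots.txt block works, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the bot name `ExampleAIBot`, and the `may_fetch` helper are illustrative assumptions, not taken from Cloudflare's report. It shows that the rules are matched against whatever user-agent name a crawler reports about itself.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the bot name below is an assumption for illustration.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/article"
    # A crawler that identifies itself honestly is told to stay out.
    print(may_fetch(ROBOTS_TXT, "ExampleAIBot", page))
    # The same rules allow any agent that reports a different name.
    print(may_fetch(ROBOTS_TXT, "SomeBrowser", page))
```

The check is voluntary and runs on the crawler's side, so a robots.txt directive only restrains bots that choose to consult it and that declare their real identity.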


The Download: murderous ‘mirror’ bacteria, and Chinese workers fighting AI doubles

Scientists initially proposed creating mirror bacteria with reversed proteins and sugars for research purposes, but now fear these organisms could pose a catastrophic threat to life on Earth. In China, tech workers are concerned about AI replication of their skills and personalities, leading to fears of losing professional identity. Meanwhile, the White House and Anthropic are working on a compromise, and other tech news includes Palantir's manifesto, Germany's push for looser AI rules, and Nvidia's shift towards AI over gaming.

MIT Technology Review•

1 week ago

News outlets are blocking Wayback Machine from archiving their pages — 23 outlets concerned AI companies might abuse fair use and use it to train their models

Many news outlets are blocking the Wayback Machine from archiving their pages over concerns that AI companies may abuse fair use by training their models on archived content. The move could limit access to historical news stories and crucial information in an era when misinformation is rampant and AI models can generate convincing but false answers. Notably, 23 major publications, including USA Today and The New York Times, are currently blocking the Internet Archive's crawler. While some argue that publications should handle their own archiving, a neutral third party like the Wayback Machine is crucial for maintaining accurate historical records and tracking changes in online content. Despite the fair use concerns, blocking archiving services like the Wayback Machine could do society more harm than good.

Tom's Hardware•

2 weeks ago

MIT Technology Review

The Download: gig workers training humanoids, and better AI benchmarks

Gig workers worldwide, like medical student Zeus in Nigeria, are training humanoid robots by recording their daily activities for data collection by companies like Micro1. These workers are helping to train humanoids for various tasks, raising concerns about privacy and consent. Meanwhile, the AI evaluation process needs to shift from isolated problem-solving to assessing performance in real-world, complex environments to better understand AI capabilities and impacts. Additionally, a quantum computer competition in Oxford aims to solve healthcare problems that traditional computers cannot, with a $5 million prize at stake.

MIT Technology Review•

4 weeks ago

US judge sides with Anthropic, says company supply chain risk branding over Pentagon disagreement 'Orwellian' — Trump slapped AI company with designation after it refused to lower its guardrails for the military

A U.S. court has ruled in favor of Anthropic, temporarily preventing the Pentagon from labeling the company a supply chain risk. The dispute arose when the military demanded that Anthropic compromise its AI safety policies, which the company refused to do. Judge Rita Lin criticized the government's actions, stating that branding a company a potential adversary for disagreeing is unjust. Anthropic's CEO refused to allow the use of its AI for mass surveillance and autonomous weapons, leading President Trump to ban the company from federal agencies. Despite this win, Anthropic still faces legal battles against the government.