New Delhi, August 5: Perplexity, the AI-powered answer engine, is reportedly under scrutiny for its web crawling behaviour. The company is accused of using “stealth and undeclared crawlers” to access website data without following the rules set by website owners. The allegations could raise serious concerns about transparency, but what exactly is happening behind the scenes?

Perplexity is allegedly altering its identity online to bypass website restrictions. In a blog post, internet infrastructure provider Cloudflare said that while Perplexity starts off using its declared user agent, it seems to change its identity once blocked by a website. Cloudflare noted that “they appear to obscure their crawling identity” when access is denied, calling it an “attempt to circumvent the website’s preferences.”

Perplexity is reportedly trying to hide its identity while scraping web pages. Cloudflare claims there is “continued evidence” that Perplexity is modifying its user agent and switching its source ASNs (autonomous system numbers, which identify the networks that requests come from) to stay hidden while crawling websites. It is also reportedly ignoring, or sometimes not even checking, the robots.txt file that specifies which files and pages web crawlers can access.

Cloudflare reportedly spotted the behavior happening on a large scale, affecting "tens of thousands of domains" and generating "millions of requests per day." The company said it was able to identify the crawler by using “a combination of machine learning and network signals.” Cloudflare stated that “we created multiple brand-new domains” that were never shared publicly, had no links pointing to them, and were not indexed by any search engine. These test websites also included a robots.txt file instructing bots not to access any part of the content.
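For readers unfamiliar with how robots.txt works in practice, here is a minimal Python sketch of the check a compliant crawler is expected to run before fetching a page. It uses the standard library's `urllib.robotparser`. The robots.txt contents, the `ExampleBot` user agent, the `example.test` URLs, and the `is_allowed` helper are all assumptions made for illustration. They are not taken from Cloudflare's test domains or from any Perplexity code.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that tells every bot to stay away from the whole site,
# similar in spirit to the "do not access any content" setup described above.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# Hypothetical robots.txt that only fences off one directory.
BLOCK_PRIVATE = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" and example.test are placeholder names.
    print(is_allowed(BLOCK_ALL, "ExampleBot", "https://example.test/article"))
    print(is_allowed(BLOCK_PRIVATE, "ExampleBot", "https://example.test/article"))
    print(is_allowed(BLOCK_PRIVATE, "ExampleBot", "https://example.test/private/notes"))
```

Note that robots.txt is advisory: nothing in the protocol stops a crawler from skipping this check, which is why operators also rely on network-level blocking.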

Despite these precautions, Cloudflare reported that when they asked Perplexity AI about these domains, the platform still returned detailed information. The company described this as “unexpected,” noting they had “taken all necessary precautions” to prevent such data from being accessed by their crawlers.

Cloudflare said, "We observed that Perplexity uses not only their declared user-agent, but also a generic browser intended to impersonate Google Chrome on macOS when their declared crawler was blocked." However, according to a TechCrunch report, Perplexity spokesperson Jesse Dwyer dismissed Cloudflare’s blog post, calling it a “sales pitch.” Dwyer also claimed that the bot mentioned in the blog post “isn’t even ours.”

