Major websites are blocking AI crawlers from accessing their content

Nearly 20% of the world's top 1,000 websites are blocking crawler bots that gather web data for artificial intelligence (AI) services, according to new data from Originality.AI, an AI content detector. In the absence of clear legal or regulatory rules governing AI's use of copyrighted material, websites big and small are taking matters into their own hands. OpenAI introduced its GPTBot crawler in early August, declaring that the data gathered "may potentially be used to improve future models," promising that paywalled content would be excluded, and instructing websites on how to bar the crawler. Among the 1,000 most visited websites in the world, the share blocking OpenAI's crawler rose from 9.1% on Aug 22, 2023, to 12% on Aug 29, 2023, per Originality.AI's data. Google and other web firms see their data crawlers' work as fair use, but many publishers and intellectual property holders have long objected, and Google has faced multiple lawsuits over the practice.
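The article does not quote OpenAI's instructions, but barring a crawler is conventionally done with a robots.txt rule that names the crawler's user agent. The sketch below assumes that convention and the user agent token "GPTBot". The sample file contents, the helper name is_blocked, and the example.com URL are hypothetical and serve only as illustration. It uses Python's standard urllib.robotparser module to check whether a given robots.txt would turn a crawler away. This is roughly the kind of check a survey of blocking sites could run, not a description of Originality.AI's actual method.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: bars GPTBot from the whole site and
# leaves every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_blocked(robots_txt: str, user_agent: str,
               url: str = "https://example.com/") -> bool:
    """Return True if robots_txt disallows user_agent from fetching url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "SomeOtherBot"):
        print(agent, "blocked:", is_blocked(ROBOTS_TXT, agent))
```

A robots.txt rule is a request rather than an enforcement mechanism: it only works when the crawler chooses to honor it.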

