The Archive's Access Problem

Chrise, April 13, 2026 at 6 PM WAT

The Internet Archive Is Still Being Locked Out of News Sites

News sites are blocking the Internet Archive's crawler over AI fears.

The Wayback Machine has been saving web pages for nearly 30 years. Over a trillion of them. Journalists use it to check old versions of government sites, verify claims, and see what's been changed or deleted.

A recent USA Today investigation used the Wayback Machine to track how a US agency's public data had shifted over time. That's the kind of work the Archive was built for.

But USA Today's parent company, which runs over 200 news outlets, blocks the Wayback Machine from saving its own articles. The New York Times does the same. So does Reddit. Originality AI (a content verification platform) found 23 major news sites currently blocking the Archive's crawler, and 241 news sites across nine countries block at least one of its bots.
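For readers wondering what "blocking a crawler" looks like in practice: it is commonly done through a site's robots.txt file, which names crawlers by user agent and tells them which paths to stay out of. The sketch below is only an illustration, assuming that mechanism. The robots.txt content is made up rather than taken from any publisher, the `is_blocked` helper and the `news.example` URL are hypothetical, and `archive.org_bot` is assumed here to be the user agent of one of the Archive's crawlers.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary news site.
# Not copied from any real publisher; "archive.org_bot" is an assumed
# user-agent name for one of the Internet Archive's crawlers.
SAMPLE_ROBOTS_TXT = """\
User-agent: archive.org_bot
Disallow: /

User-agent: *
Allow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules disallow user_agent from fetching url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    story = "https://news.example/story"
    print(is_blocked(SAMPLE_ROBOTS_TXT, "archive.org_bot", story))
    print(is_blocked(SAMPLE_ROBOTS_TXT, "SomeOtherBot", story))
```

Under these made-up rules the Archive's crawler is shut out of the whole site while other crawlers are still allowed in. Note that robots.txt is a request that well-behaved crawlers honor, not a technical barrier.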

The reason: AI. Publishers worry that AI companies will scrape archived news content from the Wayback Machine to train their models without paying for it. The New York Times said AI companies are using its content on the Archive to compete with it directly. Reddit raised the same concern about its own content.

More than 100 journalists have signed a letter supporting the Internet Archive, calling the Wayback Machine a crucial resource for their work. They pointed out that with local newspapers struggling and libraries stretched thin, the Archive has become essential to preserving journalism.

If the Archive loses access to major news sources, a lot of early digital history could become much harder to find. Pages saved by the tool are also used as evidence in court cases, so it's not just about journalism.

Tags: #ai #digital-history #internet-archive #journalism #news