8.7 The “Disappeared” Web

Not Found. The requested URL was not found on this server.

One of the greatest fallacies of the information age is that information found online is permanent. Nothing could be further from the truth. After a candidate loses an election, her campaign website comes down. When a news organization puts a new publication system in place, the contents of its former website might disappear. Some information, like user comments, might be ephemeral, and updated information may overwrite what was previously there.

Archiving “born-digital” content is an issue that more specialists are paying attention to, and searchers should be aware of some of the efforts to capture the web.

A massive effort to archive the web is available through the Wayback Machine from the Internet Archive. The Internet Archive captures websites (somewhat sporadically), which allows people to retrieve archived web page contents by entering the URL of the site. Searchers can see what a site, and its links, looked like, sometimes from 10-15 years ago.
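For searchers who want to check many URLs at once, the lookup described above can also be scripted. The following is a minimal sketch in Python, assuming the Internet Archive's public availability endpoint at https://archive.org/wayback/available and the JSON layout it has historically returned (an "archived_snapshots" object containing a "closest" entry). The function names are illustrative only, and the endpoint's behavior may change over time.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the Internet Archive's availability endpoint.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query(page_url, timestamp=None):
    """Build the lookup URL for a page, optionally near a date (YYYYMMDD)."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Return (snapshot_url, capture_timestamp) from a JSON reply, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"], closest["timestamp"]
    return None


def lookup(page_url, timestamp=None):
    """Ask the archive for the capture closest to the requested date."""
    with urlopen(build_query(page_url, timestamp), timeout=10) as resp:
        return closest_snapshot(resp.read().decode("utf-8"))
```

Calling lookup("example.com", "20060101") would ask for the capture nearest to January 1, 2006. A result of None means the service reported no capture for that URL, which is not proof that the page never existed, since the archive's coverage is sporadic.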

It is also possible to look at the “cached” version of a web page if the current site appears to be unavailable. For example, you might search Google for a topic or issue, and when you click on one of the top search results you get the dreaded “404 Not Found” error, meaning the page cannot be retrieved. Sometimes this is because the server is temporarily down; sometimes it is because the site no longer exists. In that case, you can click on the little green down arrow right next to the URL in the Google results listing to see the “cached” version of the site. That is the version that was “spidered” or “crawled” before the site went dark or the server became unavailable. In most instances, you will be able to look at the home page of the site, though the links on that home page may no longer work. At least you can see what the home page looked like before the site disappeared.