Wednesday, June 15, 2011

A real copy of every book


"Internet Archive Starts Backing Up Digital Books … on Paper"

by Jon Stokes, Wired, June 14th, 2011

If you want real long-term backups of digitized e-books, then look no further than dead-tree. At least, that’s the consensus of the Internet Archive project, which has announced an incredibly ambitious plan to store one physical copy of every published book in the world.

“Internet Archive is building a physical archive for the long term preservation of one copy of every book, record, and movie we are able to attract or acquire… The goal is to preserve one copy of every published work,” writes IA’s Brewster Kahle in a lengthy blog post about the plan.

Kahle cites a number of reasons for wanting to preserve physical copies of works that are being digitized; for instance, a dispute could arise about the fidelity of the digital version, and only access to a copy of the original would resolve it. Kahle also told Kevin Kelly that we’ll eventually want to rescan these books at an even higher DPI, so the physical copies will be waiting when we do.

Another reason for keeping a physical copy, and one that Kahle doesn’t mention, is that the problem of long-term digital storage still isn’t completely solved. The cloud as a large-scale storage medium has only recently emerged, and it’s definitely not perfect as a long-term archival medium. Digital archivists have long pointed out that given a sufficient length of time, data loss is a problem even for highly redundant, highly available, distributed storage systems.

Apart from the well-known phenomenon of bit rot, bits can get flipped for any number of reasons as they traverse a network and multiple software stacks, and ECC and other failsafes don’t catch 100 percent of these errors. So on a long enough time horizon with a large-enough, complex-enough system, these undetected, flipped bits will begin to accumulate.
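To make the flipped-bit problem concrete, here is a minimal Python sketch of a checksum audit, a common way for archives to detect silent corruption. It is an illustration only and is not described in the article as the Internet Archive's method; the helper names (`sha256_hex`, `flip_bit`, `is_intact`) and the sample bytes are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Simulate silent corruption by flipping a single bit (hypothetical helper)."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


def is_intact(data: bytes, recorded_digest: str) -> bool:
    """Compare the current digest against the one recorded at ingest time."""
    return sha256_hex(data) == recorded_digest


# Sample content standing in for a scanned page (assumption for illustration).
original = b"A real copy of every book"
recorded = sha256_hex(original)   # digest stored when the file was first archived

damaged = flip_bit(original, 3)   # one flipped bit, same length as the original

print(is_intact(original, recorded))
print(is_intact(damaged, recorded))
```

A check like this only reports that something changed. Repairing the damage still requires an uncorrupted copy somewhere, which is the role the article assigns to the physical book.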

Realistically, Kahle and Co. expect to preserve 10 million books, out of an estimated 100 million published. These will be packed into climate-controlled storage containers, and stored in a facility in Richmond, CA that opens this month.

Kahle describes the details of the physical preservation as follows:

Books are cataloged, and have acid free paper inserts with information about the book and its location,
Boxes store approximately 40 books with labeling on the outside,
Pallets hold 24 boxes each,
Modified 40′ shipping containers are used as secure and individually controllable environments of 50 or 60 degrees Fahrenheit and 30% relative humidity,
Buildings contain shipping containers and environmental systems,
Non-profit organizations own and protect the property and its contents.
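The figures in this list allow a rough sense of scale. The Python sketch below works through the arithmetic, taking "approximately 40 books" per box as exactly 40 and using the 10 million book target mentioned earlier; the variable names are illustrative. The source does not say how many pallets fit in a container, so the calculation stops at pallets.

```python
# Figures from the article; "approximately 40" is treated as exactly 40 here.
BOOKS_PER_BOX = 40
BOXES_PER_PALLET = 24
TARGET_BOOKS = 10_000_000


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, so a partly filled box or pallet still counts."""
    return -(-a // b)


books_per_pallet = BOOKS_PER_BOX * BOXES_PER_PALLET
boxes_needed = ceil_div(TARGET_BOOKS, BOOKS_PER_BOX)
pallets_needed = ceil_div(boxes_needed, BOXES_PER_PALLET)

print(books_per_pallet)   # 960
print(boxes_needed)       # 250000
print(pallets_needed)     # 10417
```

Under these assumptions, each pallet holds 960 books, and the 10 million book target works out to 250,000 boxes on roughly 10,400 pallets.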

To reach its goal of collecting one copy of every book ever published, the Internet Archive is soliciting book contributions from everyone, from libraries to individual collectors.

