May 2009
Preserving Digital Data For Generations To Come
Thankfully, I’m not the only one who has thought of this.  Our society captures and stores a huge volume of data digitally every day, at an ever-increasing rate.  And it’s not just scientific data either.  Just think of all the digital photos and videos we take, not to mention all the movies, TV shows, articles, etc.  How can we ensure that this data lasts for multiple human generations, or even as long as humanity itself?  Storing the data is a problem in itself, but there are also all the different digital data formats to think about, particularly those that use lossy compression.  How can we ensure that the data in these files will always be readable with all the original information intact?  What happens when data formats change?

For example, our digital camera captures photos in the JPEG file format.  JPEG uses lossy compression, so some of the original information is already discarded when the photo is saved.  If I later convert the photo to a different lossy format, I could lose a little more each time.  After 200 years (i.e., several human generations) of being converted to newer formats, will the photos still be useful, or will they be garbage, useless to my descendants?  And what if I store my personal journals in Microsoft Word format?  Will anyone be able to read them 100 years from now?  I’m sure that the “popular” information like movies and historical records will be OK, but what about all the data generated by the average person?
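
To make the worry about repeated conversions a bit more concrete, here is a minimal Python sketch.  It is only a toy model, not how JPEG or any real format works: each hypothetical "format" simply rounds pixel values to the nearest multiple of some step size, and the names `lossy_encode`, `max_error` and `convert_chain` are made up for this illustration.

```python
def lossy_encode(samples, step):
    """Toy 'lossy format': snap each integer sample to the nearest multiple of step."""
    return [((s + step // 2) // step) * step for s in samples]


def max_error(original, converted):
    """Largest absolute difference between the original and converted samples."""
    return max(abs(a - b) for a, b in zip(original, converted))


def convert_chain(samples, steps):
    """Convert through each hypothetical format in turn.

    Returns the worst-case error against the original after every conversion.
    """
    current = list(samples)
    errors = []
    for step in steps:
        current = lossy_encode(current, step)
        errors.append(max_error(samples, current))
    return errors


original = list(range(256))  # stand-in for 8-bit pixel values

# Hypothetical format A (step 7), then hypothetical format B (step 5).
chain_errors = convert_chain(original, [7, 5])

# Converting the untouched original straight to format B, for comparison.
direct_error = max_error(original, lossy_encode(original, 5))

print("worst-case error after each conversion:", chain_errors)
print("worst-case error converting directly:", direct_error)
```

In this toy model, going through format A before format B leaves a larger worst-case error than converting the original straight to format B, and saving again in the same format changes nothing further.  Real codecs are far more complicated, but the basic concern is the same: every hop between lossy formats is a chance to lose a little more.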

There is at least one project out there that seeks to solve the data preservation dilemma.  It’s called CASPAR - short for “Cultural, Artistic and Scientific knowledge for Preservation, Access and Retrieval”.  Hopefully this project generates some serious energy and commitment around this issue.

