Heidorn’s article on data curation and E-science was an enjoyable and informative read, but I was left with a distressed feeling that academic libraries are ill-equipped to take on the task of providing long-term management of the mountains of data coming from scientists, scholars, and affiliated institutions. I actually shuddered when I read:
Instrumentation and computerization enable scholars and civil servants to collect data with volumes equal to the text content of the entire Library of Congress in a matter of days (Baraniuk, 2011).
How can underfunded and overworked libraries possibly keep up with this massive accumulation of digital material? I was glad to hear that the NIH and NSF are requiring data management plans when doling out grants, but I hardly think that is enough oversight: society now expects to see not just published results but raw data, which will have to be not only stored but also checked and migrated constantly. It seems like scholars would need an endowment in place to preserve their work, but it is more likely that the burden of preservation will fall into the lap of the LIS community. Plus, getting taxpayers to chip in for saving a 1983 clinical study on string cheese consumption is going to be difficult, to say the least.
For perspective, I found another interesting blog post from two years ago that also charted various organizations that create “a Library of Congress” amount of data. [http://blogs.loc.gov/digitalpreservation/2012/03/how-many-libraries-of-congress-does-it-take/] It was not surprising to see NASA and Facebook on that list.
With the advancements in commercial cloud servers, our hopes may lie in the private sector, and academic libraries must strive to work with these third-party vendors or risk distancing themselves from the role of collecting and sharing the intellectual output of society. We are drowning in data, and it may be up to the tech sector to throw libraries, scholars, and the general public a lifeline.