Weekly Response
Alana Mohamed
Wright’s description of Paul Otlet’s system paints him as an innovative mind who intuited the power of linked data in a social space. Rayward’s article provides even more context for his system by discussing the implicit biases and problems with the often-utilized Dewey Decimal System. The first idea of Otlet’s that impressed me was his idea of creating a social space for information. While Dewey’s system places information in hierarchies, Otlet’s UDC system outlines multiple horizontal relationships between information. I was surprised at how easily I understood the UDC system, though I certainly can imagine there would be some conflict deciding if something was a “philosophy” or a “science.” I suppose that’s the point of these articles, to point to the ways knowledge is constantly in flux. Organizing knowledge seems to mean constantly changing, or at the very least, operating within a classification system that allows for change.
Alex Wright’s article on Paul Otlet and his visionary (and mind-bogglingly ambitious) explorations to organize all the world’s data places him on a grand pedestal on par with forgotten geniuses like his contemporaries Leon Theremin and Nikola Tesla. Otlet’s work gives me another reason to marvel at the ingenuity and progress of the arts and sciences at the turn of the century. Wright’s details of the birth of the UDC system, with its functional faceted search similar to Ranganathan’s, and the creation of the Mundaneum, that unfortunately named uber database, were equal parts fascinating and tragic and set the stage for a deeper analysis by W. Rayward.
Rayward’s history of information science sheds much greater light on Otlet’s work and how it was a remarkable precursor to today’s stage of library and information science. While Otlet’s theories were sound, it would take a mere 60 years for the technological advances to catch up and make them fully realized in the interplay of the internet, Google and current cataloging practices. Otlet was thinking big with his universal catalog (RBU), universal classification system (UDC) and even the universal book. His attempts to reclaim the term ‘monograph’ from the printed codex and expand it to the 3×5″ index card or microfilm were admirable and prophetic with regard to today’s upheaval of traditional publishing as we move further into a more integrated digital age.
I would like to think of Paul Otlet smiling down on the many realizations of his work, but I imagine he would rather be busy tackling ever more complex theories in the pursuit of collating, analyzing and disseminating all of the world’s collective human knowledge in one tidy package.
“the one with the lions in front” – I like this description in a book.
I did enjoy Weinberger’s indictment of Dewey in their article – we must always remember in these conversations about classification that it has all been built on a flawed, white Christian male supremacist lens of a base. Weinberger does ask the question “why don’t we fix it?”, which, after reading too much detail about Dewey, I did not expect them to answer. But they did – saying essentially that even if we do revamp the entire structure in light of a hopefully less People-Like-Dewey-centric view, it will take work and people will still complain.
I am entirely unconvinced that this is a reason not to change a dated system that close to all librarians admit is problematically biased. They say “But that isn’t a good enough answer if you’re organizing physical objects,” as though it is a good enough answer to any question. They also see (continued) greatness in Dewey’s system because it lets patrons in libraries physically explore “What We Know.” This seemed absurd to me because hey man, it’s 2007, there are different cataloging systems with less messed up classification that also allow readers to browse books by subject in a physical library setting. This was a revolutionary system at the time it was designed, but that is not an excuse to leave it untouched.
With the Paul Otlet article, there seemed to be fewer of these petty arguments going on behind the historical scenes of cataloging. While Dewey and Carlyle and others of that ilk seem to be aggressive (well) nerds, Otlet seemed to me to be more of an artist. His approach to organizing and accessing reminded me a lot of current data mining practices. I can’t say I find any pleasure in the act of tagging data points within information directly, but seeing the resulting projects of some of these processes is really interesting, and beyond that, sometimes even really playful or creative. I think access in this day and age has moved beyond getting the information into the hands of the public and towards encouraging that public to engage with said information.
As a more basic appreciation, I find the term “social space of a document” really useful and plan to make everyone really tired of hearing it.
That being said, I am quite annoyed with the less than complimentary comparisons that I’ve seen drawn, more frequently than just in the Weinberger reading, between Dewey’s system and Amazon. Personal Amazon boycott aside, the company admittedly offers some of that same creative, innovative, even playful relationship building between information sources that I appreciated in Otlet’s approach. Still, it’s hardly fair to compare the two, and I’m not really sure it’s beneficial either. What suggestions could Amazon’s process really offer the library cataloging system? Beyond an interesting philosophical distinction that is perhaps handy to students of cataloging, the system mandated by physical space (not to mention resources) makes libraries and Amazon a case of apples and oranges.
Paul Otlet’s impressiveness, to me, lies in his understanding that information is social. Wright is hesitant to claim that Otlet outright influenced the creation of the web; both he and Rayward emphasize Otlet’s predictive, ambitious-beyond-his-era talents. He understood that information is relative and linked and he created concepts for information retrieval and networks. Yet while this eerily supernatural premonition of the internet is exciting to read about, I was most drawn to the way Otlet incorporated humanity into his Mundaneum. He articulated that information is best understood in context, including the reader’s relationship to the document. Not only does Otlet’s project allow the reader to find what to read, the reader can interact with the document, inherently changing “the social space” of a document by reading it. Wright refers to Otlet’s desire to create a “new ‘world city’”, a world in which information is shared and created across borders and languages. That Otlet frames his proposed information sharing network as a society illustrates to me the important characteristic of information: it is how everyone communicates with each other. Information is no longer experts communicating to lay people, or experts communicating to other experts. It includes everyone. The idea of reader contribution is nowhere more evident than in the myriad ways readers, writers, consumers, and producers interact on a daily basis online today. It is Otlet’s inclusiveness and understanding of the social aspect of information that impresses me most.
The Alex Wright article introduced me to the word Mundaneum, which evoked in me memories of certain temp jobs, so I went online to determine whether it actually derived from the word “mundane.” I failed that part of the quest but learned that this same author recently published a book about Paul Otlet. The summary of the book on the Amazon website provided phrases even more entertaining than the ones in the article, including defining the Mundaneum as the “steampunk version of hypertext,” and adding that, beyond dubbing his envisioned network merely a réseau, he described it as a réseau mondiale – a worldwide web. Perhaps this article, published in 2003, started the thread of interest for Alex Wright that led to the book, which was published only last June.
But outside the steampunk fun of it, Wright reveals Otlet as a true visionary: “he simply believed that documents could best be understood as three-dimensional, with the third dimension being their social context: their relationship to time, language, other readers, writers and topics.” And later, “With the advent of the Semantic Web and related technologies . . . we are moving towards an environment where social context is becoming just as important as topic content.”
The chapter “The Geography of Knowledge” from Everything is Miscellaneous was entertaining and informative. One never gets tired of reading about Melvil Dewey, his bizarre fascination with the metric system, his attempt to make English more efficient, and his limited world view, which is only recently being acknowledged. While Weinberger acknowledges that Dewey’s perspective is that of a “small-minded American Christian jingoist,” he concludes “today’s category easily becomes tomorrow’s embarrassment,” and answers a question raised earlier, when we discussed radical cataloging, “The Dewey Decimal Classification system can’t be fixed because knowledge itself is unfixed. Knowledge is diverse, changing, imbued with the cultural values of the moment.” Case in point: Weinberger’s analysis of amazon.com, which examines its classification system, is described as “fun” and “friendly.” He describes its collection as a “miscellaneous pile that can be digitally sorted to reflect the individual interests of each visitor.” In fact, human beings are responsible for sorting through that pile in order to fulfill purchases, human beings employed by other human beings who may also be described as “small-minded.”
Weinberger’s book was published in 2008. In June, 2014, the Department of Labor launched an investigation of Amazon’s labor practices after two worker deaths. Labor practices have also been decried here, and here.
Even when you’re digital, everything is geography.
I’ll be interested to find out whether there’s a distinct difference in data management conventions for LIS professionals as opposed to whoever is creating the data. The blog post about the scarf and the overview on the Penn State library website both seemed to be addressing people who create data and, more generally, assuming an expert understanding of the data’s significance. As a result, they seemed to place a significant fraction of the responsibility for facilitating access to the data on researchers. If researchers are expected to have the skills to create their own metadata and maintain their data, does that mean that the role of librarians in a scientific setting is more advisory than custodial? In the scarf analogy, the person who publishes knitting patterns may not have created the actual pieces or even be able to, and the person who makes the scarf may not know the instruction-manual industry conventions and best practices for representing physical actions in a language any other knitter can understand. At the same time, the author of the manual has to know enough about the process of knitting to be able to foresee what knitters need in a pattern. So how much do scientific librarians need to know about the research that produced a certain dataset? What exactly are they doing with the data that researchers aren’t expected to either do themselves or direct in minute detail?
Heidorn’s article on data curation and E-science was an enjoyable and informative read, but I was left with a distressed feeling that academic libraries are ill-equipped to take on the task of providing long-term management of the mountains of data coming from scientists, scholars and affiliated institutions. I actually shuddered when I read:
Instrumentation and computerization enable scholars and civil servants to collect data with volumes equal to the text content of the entire Library of Congress in a matter of days (Baraniuk, 2011).
How can underfunded and overworked libraries possibly keep up with this massive accumulation of digital material? I was glad to hear that the NIH and NSF are requiring data management plans when they are doling out grants, but I hardly think that is enough oversight, as society is expecting to see not just published results, but raw data that will have to be not only stored, but checked and migrated constantly. It seems like scholars would need an endowment in place to preserve their work, but it is more likely that the burden of preservation will fall in the lap of the LIS community. Plus, getting taxpayers to chip in for saving a 1983 clinical study on string cheese consumption is going to be difficult, to say the least.
For perspective, I found another interesting blog post from two years ago that also charted various organizations that create “a Library of Congress” amount of data. [http://blogs.loc.gov/digitalpreservation/2012/03/how-many-libraries-of-congress-does-it-take/] Not surprising to see NASA and Facebook on that list.
With the advancements in commercial cloud servers, our hopes may lie in the private sector, and academic libraries must strive to work with these third-party vendors or risk distancing themselves from the role of collecting and sharing the intellectual output of society. We are drowning in data and it may be up to the tech sector to throw the libraries, scholars and general public a lifeline.
I found the article The Emerging Role of Libraries in Data Curation and E-Science extremely interesting in its discussion of the role of the library and the changing needs of scientific data collection in the digital age. The article broke down what needs to be done and, just like the artists’ books we talked about last week, concluded that it would need to be a collaborative effort. It also traced the need for data curation to a shift from “data poor to data rich” in research.
What I thought was funny, and reread several times, were the instances where the author seemed to get very dramatic, especially in sentences like “when academic library administrators first hear that scholarly data now fall within the purview of the library, they may lose many nights’ sleep wondering who has cast the curse upon them…” or “Mornings, over coffee with public and school library friends, academic librarians may lament their fate.” While I understand that librarians are able to look at a situation like data curation and recognize, without knowing the details, what a huge undertaking it will be–isn’t that what we want? It seems like in other professions technology is making jobs more obsolete (a movie projectionist comes to mind–I worked in a movie theater for a long time). This is a perfect example of how and why librarians will always be needed. A lot of the articles we read this semester talked about how rigid certain things are, or how conservative, or how they are not keeping up with the times–records, Library of Congress headings, Dewey, cataloging. It also seems that at the same time we do not like things that are clearly defined or new. I think that these new areas where our librarian knowledge can be utilized are a good thing. As for the article’s remark that public and school librarians “may be relieved that this task of curating…is not their fate,” those are the jobs that always seem to get cut first in economically unstable times.
(When I saw the title “How Is a Scarf Like a Dataset?” I was really hoping it would connect the process of knitting to this topic, which is an accessible analogy for me. I wasn’t disappointed!) In any case, while this article was wholly entertaining, I do not feel that I learned a lot about the creation of datasets, and am now really curious about what the “bind off” of datasets is. I did find it to be generally applicable to how I USED datasets in other classes. It was a fun article, though.
I also found the Heidorn article very interesting. As someone who works with government documents, and documents produced by government entities (which do not always seem to be the same thing – the judicial branch never seems to be categorized with govdocs), I am so interested in how the proprietary databases and products that make these documents and data accessible work.
“Many scholars are unaware of the coming changes in the sociology of science and do not have the required skill sets to address the requirements in their new proposals (Cragin, Palmer, Carlson, & Witt, 2010). Worse, librarians know relatively little about current data management practices of scholars. Institutions have not yet established who will conduct data curation work.”
This is precisely why this kind of data ends up only functionally accessible through something like ProQuest. Equating scholars with government agencies and entities perhaps ignores nuance, particularly that government data is publicly funded and that whatever is not classified should be free and easily accessible, but training librarians in specialized data management practices can only make information more accessible to the public.
In direct contrast to my last comment, I think it is so interesting that data collection (as in, making data part of the collection) by academic libraries is even happening. It is something that had never occurred to me. The idea that libraries should be involved from the start of data collection is so intriguing, and I see where the idea is coming from, but is that standard feasible for scientists who are not working under an academic umbrella? I am sure that many scientific organizations do have librarians, but, my point is, could a rise in librarian-aided research lead to preference for that data, and therefore research from smaller organizations may become even less represented? Is data collection with the help of a librarian better data, or just easier to integrate into a library?