What is a controlled vocabulary?


I found this article quite interesting, especially given our class’s recent experience creating our virtual collections. One line stood out in particular: “The most basic, and often overlooked, form of controlled vocabulary is a consistent labeling system.” When creating the virtual collection, I found myself naturally wanting to use different words to describe objects that could be grouped in the same category. However, we had to create a controlled vocabulary of tags to make the collection more accessible to users. I do wonder what is lost by doing this: you cannot be as specific and on point in describing something, which can be a little frustrating. However, ensuring that users can locate items takes precedence.




Controlled vocabularies and thesauri

I think I used a thesaurus once or twice in college, back when we had to do our research in the library with physical materials, not on a computer. I remember thinking it was quite useful, this book of terms. Had it been there all along? Then computers took off, along with databases and online search capabilities. I don’t think I’ve consulted one since then, although I believe I’ve encountered them online (when the site responds to my misspelled search with, “did you mean xxxx?”). It’s easy to be unaware of the existence of a controlled vocabulary or thesaurus online when the site applies it automatically, correcting or suggesting terms. I wonder how many researchers purposefully consult such sources? Or does it not matter, when online searching seems to apply them whenever a search is performed? The tutorial on constructing thesauri demonstrated how much detail and thought must go into creating these tools that so many people are unaware of!

Katie B.


I thought Broughton’s article Essential Thesaurus Construction brought up good points both about what is valuable in using a controlled vocabulary and about the challenges of creating one that is reliable. On the one hand, a controlled vocabulary creates greater access by linking terms so that a user’s query more often gets them to the information they seek. On the other, the value of the controlled vocabulary depends on its quality. If it hasn’t been built with integrity, and isn’t maintained, it can become an obstacle for users. The article made me realize that what makes a controlled vocabulary effective is evaluating user needs and then creating consistency in terminology to meet them.

Chowdhury – Chapter 6

This chapter outlined the basics of the terminology control readings, which was helpful because I’ve never been really good at this kind of thing and I was getting some high school grammar/syntax test feelings. (I was never good at those but look at me now!) This is an aspect of information organization that is definitely really important, but not really one I’d ever thought about before. It’s one of those things that’s always been there in the LCSH but I never took notice before.

The chapter reminded me of how a Google search works: if you put in a term pretty close to what you mean, the engine will understand and return results for what you actually wanted. (I couldn’t remember the term ‘cosmology’ while trying to do a search on it, but when I searched for ‘inflation theory’, Google returned results for cosmology.) I know that Google is a much more advanced and powerful system, but that’s the parallel I see between the two. The semantically related phrases are all grouped together so that all the items with that information can be found or browsed with ease.

Controlled Vocabulary

I really loved the article on controlled vocabulary. It’s something that most people don’t consider when browsing a website or using the library. Most of us just go with what we know, and if it isn’t there we just give up. I think it is often difficult for users to realize just how much work needs to go into organizing a library. All of these terms need to be added so that search results can be easily found. It is interesting to think of all the ways that someone could potentially describe something. As librarians, where do we draw the line as to which terms are associated with which materials?

Controlled Vocabulary Challenge

The article by Fred Leise was very insightful, pointing out the power of a controlled vocabulary to help a user effectively query a database. The explanation of the need for authority files and hierarchical relationships (broader term and narrower term) was also quite clear. However, I question the assertion that the intended meaning of even simple terms can be understood by almost everyone.

The article gives the example of The Gap (clothing store), indicating that the controlled vocabulary (along with the accompanying pictures) makes the meaning perfectly clear. As one reader responded in the comments on the article, there is plenty of room for misinterpretation, even by speakers of the same language. The Gap uses the term “bottoms and pants.” In the United Kingdom, “pants” refers specifically to underwear.

As I discovered one evening in London, even a simple request for “cream” for my tea had negative results when Housekeeping arrived with several small containers of “clotted cream.” This product is great for spreading on scones, but of no use in a cup of tea.
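
To make the broader term / narrower term idea from this post concrete, here is a minimal sketch in Python. The clothing terms and the `NARROWER` table are made up for illustration (they are not The Gap's actual vocabulary or anything taken from Leise's article); the point is only that a search on a broader term can be expanded to include its narrower terms.

```python
# Hypothetical hierarchy: each broader term (BT) maps to its narrower terms (NT).
NARROWER = {
    "clothing": ["tops", "bottoms"],
    "tops": ["shirts", "sweaters"],
    "bottoms": ["pants", "shorts", "skirts"],
}

# Invert the table so each narrower term points back to its broader term.
BROADER = {
    child: parent
    for parent, children in NARROWER.items()
    for child in children
}


def expand(term, narrower=NARROWER):
    """Return the term followed by all of its narrower terms, depth-first."""
    result = [term]
    for child in narrower.get(term, []):
        result.extend(expand(child, narrower))
    return result


print(expand("bottoms"))  # ['bottoms', 'pants', 'shorts', 'skirts']
print(BROADER["pants"])   # bottoms
```

Note that a table like this does nothing about the “pants” problem above: the hierarchy only works if users read each label the way the vocabulary's builder intended.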


What is a controlled vocabulary?

Controlled vocabulary is described as a subset of natural language. It is not how we speak, but instead a translation of how we speak, made in order to understand the actual organization of words. When I first read the definition of a controlled vocabulary, I associated it with the “spell-check” mechanism of Google. Even when a user performs a Google search with incorrect spelling, the search engine will still auto-correct and run the intended search. By having the controlled vocabulary, Google is able to ensure that the user finds the intended results, even with errors like misspellings. I find the idea of a controlled vocabulary particularly interesting when it comes to some of the other examples the article listed, such as synonyms. I imagine that individuals from different geographical areas who speak different dialects would each use different terms for the same thing. Normally, those words would generate different results, but with a controlled vocabulary both users receive the same results.
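
The synonym case described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `AUTHORITY_FILE` entries and the function names are hypothetical, not taken from the article, and the sketch handles synonym control only (spelling correction would need a separate mechanism).

```python
# Hypothetical authority file: each preferred term lists its variant terms.
AUTHORITY_FILE = {
    "soft drink": ["soda", "pop", "fizzy drink"],
    "sneakers": ["trainers", "tennis shoes", "gym shoes"],
}


def build_lookup(authority_file):
    """Map every variant (and the preferred term itself) to the preferred term."""
    lookup = {}
    for preferred, variants in authority_file.items():
        lookup[preferred.lower()] = preferred
        for variant in variants:
            lookup[variant.lower()] = preferred
    return lookup


def normalize(query, lookup):
    """Return the preferred term for a query, or the query unchanged if unknown."""
    key = " ".join(query.lower().split())  # ignore case and extra spaces
    return lookup.get(key, query)


LOOKUP = build_lookup(AUTHORITY_FILE)

# Two users from different regions end up searching on the same term.
print(normalize("Pop", LOOKUP))   # soft drink
print(normalize("soda", LOOKUP))  # soft drink
```

Because both queries resolve to the same preferred term before the search runs, both users would see the same set of results.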

Otlet (Forgotten Forefather, Origins of Info Science)

How interesting to read about Paul Otlet and his ideas, creations and visions with respect to information classification and retrieval (among other related topics). It’s amazing that someone would undertake such a comprehensive task as creating a “master bibliography” of the entire world’s books and documents! And he envisioned it as a faceted system in which topical relationships are interconnected, something that we still seem to be discussing and perfecting today. It’s quite fascinating that someone can propose concepts that are not fully appreciated or comprehended for nearly 100 years.

I’d love to know more details about how Otlet’s work (or what was left of it) was more or less abandoned for fifty years at a university?!? How does that happen? It was the 1940s, and while I suppose the war going on in Europe likely had something to do with it at the time, what about afterwards? No one bothered to clean out that room until the 1990s? I thought space in Europe was at more of a premium than that, especially at a university! It seems to say a lot about Otlet’s fall from recognition for his contributions, although, as is the case with many big thinkers and creators, the value of his ideas seems to be more appreciated now than during his life.

Katie B.

Paul Otlet

“UDC’s most innovative and influential feature is its ability to express not just simple subjects but relations between subjects … In UDC, the universe of information (all recorded knowledge) is treated as a coherent system, built of related parts, in contrast to a specialised classification, in which related subjects are treated as subsidiary even though in their own right they may be of major importance.”

Expressing the relationships between subjects and ideas in a “web” is the entire dream of hypertext and hypermedia. The “links” between information under UDC are relational. Organizational structures for information such as the Dewey Decimal system, or anything organized in a hierarchy of distinct subjects, start from a general subject and become more specific in a top-down direction through each node. In relational information structures such as UDC or hypertext, information can be linked across numerous subject “nodes,” and users can access information in a nonlinear way. UDC implies that no documents have self-evident, eternal subjects and meanings; their aboutness is always being defined by new associations and amalgamations. Even subject matter from a long time ago is constantly being redefined by the present, so it seems that faceted organization is more significant than ever.

Learning about Paul Otlet’s contribution to information architecture, I especially loved hearing about his installation of index cards in a sprawling array of cabinets. This sounds bizarre and beautiful to me; Otlet literally was beginning to build a visual/physical analogue of the Internet at the beginning of the 20th century. 

Weinberger – The Geography of Knowledge

I thought this chapter was great because it touched on a lot of what we’ve gone over in class with radical cataloging, the general outdated feel of the Dewey Decimal System, and how Amazon pretty much has the coolest classification systems ever.  I liked that Weinberger pointed out that overhauling the system is much easier said than done, and it is very much a double-edged sword. It would be great if Dewey Decimal became more inclusive of other cultures and easier to understand and categorized for the contemporary age, but then you have to deal with the hundreds of thousands of libraries that now have to overhaul where their physical books go and how they are arranged.

When I worked at a library in high school I was there when they decided to give graphic novels/comic anthologies their own section separate from young adult books (which are placed in their own area in my library), and even though that was just one small section to rearrange we complained about it for weeks. I can’t imagine having to re-shelve half the library because the Dewey Decimals changed.

It’s hard to find a happy medium in a situation like this, but I think it is possible to find a solution that’s a bit better than “oh, we all know this is outdated, but it’s what we’ve got to work with.” Weinberger points out that information and knowledge are ever-changing and evolving, so it might actually be impossible to ever have a truly Amazon-esque cataloging system for a library. I think a possible solution may be for each library to individually consider its own users and what they’re looking for, but that would also unleash a whole other pile of problems. (A lack of a universal system might mean you’re out of luck if you go to a library outside your own neighborhood, etc.)