Haunted Spaces: Aboutness and Data-about-Data

In another post, the idea was brought up that there is a disconnect between the information that humans make, produce or understand (think) and the information (data) that computers are structured to use as they communicate with other parts of the machine (or between machines). This might have as much to do with tagging posts in a blog, or adding labels to items posted on publication platforms such as Google's Blogger, as with writing descriptions while cataloging items in Millennium or Ex Libris Voyager. These last two software options are reached through a library's search catalog, either its OPAC or a publicly available URL. The blogging interfaces mentioned first work differently.

There are some similarities among these. Basically, the similarities revolve around code built into the systems, because that code reflects how the designers assume knowledge is categorized. The above article highlighted as "tagging" suggests that platforms such as WordPress have categories and tags. The blend of these features creates a general "box" for the knowledge in a given post, while the tags add a little nuance that supposedly makes its "aboutness" clearer for readers. This knowledge organization structure is taken as given whenever the technology is used, and nothing more is available to the user at any given time than what the designers assumed to be more correct (or justified) at the time. Every piece of machinery has this arrangement, but the current ubiquity of these technologies means that these set modes of knowledge organization are foisted upon more and more people.
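
To make the "box plus nuance" arrangement concrete, here is a minimal sketch in Python. It is not WordPress's or Blogger's actual data model: the `Post` class, the `CATEGORIES` list and the `aboutness` method are all names made up for illustration, and the assumption of one category per post is mine. The point is only that the shape of the description is decided before the writer ever arrives.

```python
from dataclasses import dataclass, field

# Hypothetical category list; in this sketch the designers fix it in advance.
CATEGORIES = {"Libraries", "Technology", "Uncategorized"}


@dataclass
class Post:
    title: str
    category: str = "Uncategorized"              # the general "box"
    tags: set[str] = field(default_factory=set)  # the added nuance

    def __post_init__(self):
        # Anything outside the designers' list is simply not expressible.
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")

    def aboutness(self) -> str:
        # All the structure has to say about what the post is "about".
        return f"{self.category}: " + ", ".join(sorted(self.tags))


post = Post("Aboutness and Data-about-Data", "Libraries", {"metadata", "cataloging"})
print(post.aboutness())
```

However much the writer knows about the post, the structure records one box and a handful of words, and a category the designers did not anticipate is rejected outright.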

     Millennium and Ex Libris Voyager have their own sets of built-in assumptions about knowledge organization and their own ways of applying metadata to items - in this case, surrogate records that stand in for items and are not the items themselves. The distinction between the post and the surrogate record means that even though there are still many machine-specific assumptions in every technology mentioned thus far, the surrogate is STILL a very different interaction because, in most cases, it is not read for its own sake. Both of these technologies have certain set fields within their interfaces that cannot be changed - even if they can be fine-tuned to a far greater degree than any of the web-publishing technologies mentioned above.

     Today, however, I was in a conversation with a polyglot cataloger with the Library of Congress who catalogs serials in many languages (and is currently working with a collection of items from Harry Houdini's library donated to special collections). The conversation was specifically about data-about-data (metadata) and the ways in which technologies do and do not accomplish certain jobs that they could accomplish if certain arrangements were different. She told us that even with the code style used in cataloging (MARC - Machine-Readable Cataloging), the detailed set of rules for each field and sub-field (including the formatting of those sub-fields), and all the facets of information that can be added to the surrogate record made in the cataloging module, the technology is still quite limited. By this she meant at least one important point: even though there are so many methods within this technology for describing artifacts, the human mind recognizes, and is frustrated by, the single method offered for accomplishing the cataloger's goals.
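
For readers who have never seen a MARC record, the field and sub-field arrangement she described can be sketched roughly as follows. This is a toy model in Python, not how Millennium, Voyager or any real cataloging module stores records. The tags 100, 245 and 650 and their sub-field codes are genuine MARC conventions, but the `ALLOWED` table is a tiny, incomplete subset, the `Field` class is hypothetical, and the record values are made up.

```python
from dataclasses import dataclass, field

# A tiny, incomplete subset of MARC tags and sub-field codes, for illustration only.
ALLOWED = {
    "100": {"a"},            # main entry, personal name
    "245": {"a", "b", "c"},  # title statement
    "650": {"a", "x"},       # subject added entry, topical term
}


@dataclass
class Field:
    tag: str
    subfields: list[tuple[str, str]] = field(default_factory=list)

    def __post_init__(self):
        # The schema, not the cataloger, decides what can be said.
        if self.tag not in ALLOWED:
            raise ValueError(f"no field {self.tag} in this schema")
        for code, _ in self.subfields:
            if code not in ALLOWED[self.tag]:
                raise ValueError(f"field {self.tag} has no subfield ${code}")

    def render(self) -> str:
        # Conventional display: the tag, then each sub-field as "$code value".
        parts = " ".join(f"${code} {value}" for code, value in self.subfields)
        return f"{self.tag} {parts}"


# Made-up values, not a real catalog record.
surrogate = [
    Field("100", [("a", "Example, Author.")]),
    Field("245", [("a", "An example title /"), ("c", "Author Example.")]),
    Field("650", [("a", "Magic tricks.")]),
]

for f in surrogate:
    print(f.render())
```

Even in this toy version, an observation that fits none of the predefined fields has nowhere to go, which is one way of reading her frustration.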

     The same conversation included a man, also from the Library of Congress but from the Preservation Directorate - Re-formatting Division, who has written on the modes of expression that could be used to describe any given work but are not, because someone has already decided what kinds of information count as data. There are a great number of factors in these decisions, but many of them have to do with socio-economics. These decisions do not revolve only around issues about people or writing. Rather, they are also tied to "truths" about the physical and mathematical sciences as seen from positions of power. For a good read on this topic, I heartily recommend "Cataloging Theory in Search of Graph Theory and Other Ivory Towers," a paper that has this post's topic as one facet. The paper is available in a pre-print format from the American Library Association here. And again, both of these library-minded people recognize that even though computers and IT-minded groups/companies have done a lot in the world, they may not have set the world up for a multitude of knowledge organization structures, even though most technologies in use today are capable of so much more than what is being taken advantage of at present. Machines do certain things really well. But they only do what they do. Humans do the rest (and built those machines).

Thank you.

As always, dialogue is welcome here or @ Twitter.

Monday, September 10, 2012
