Bundle of joy

[Questions at the end]

del.icio.us has a new “bundle” feature that allows for non-exclusive categories to contain tags. All it does now is break your tags up into headings, but it’s clearly intended to provide a second level of hierarchy as users’ tag lists sprawl beyond manageability. At first I was freaked out because I wasn’t sure whether I wanted to further categorize my tags. The nice thing about del.icio.us is the way that it encourages you to make a thousand eccentrically tagged piles and not worry about careful taxonomy. Each tag has a meaning for you, and when you want to find something again you look at the list and back-trail just one of your several associations. But then it’s social software, and so you take into consideration both the intelligibility of your tagging for other people and, possibly, their respect for your ordering scheme. (Certainly, one avoids telling associations. See, for instance, someone’s “funny” tag.)

I found with the bundling that I was self-conscious about making stupid general categories. For instance, I tend to throw everything “humanistic” into [literature/philosophy/art], because I have trouble separating out, say, critical approaches to works of art. But there’s something else that I consider [culture] that involves pop art, current events and light news stories (but not [politics]) and sociology. The other thing to keep in mind is that the bundles are non-exclusive, but that they contain tags, i.e., sloppy on-the-fly categories, mostly chosen based upon associable attributes. Before I started writing this entry, I was chirping to myself, “It’s non-exclusive, no worries!” and even mused to myself that the law of noncontradiction only applies to things in the same time and place and respect, so I was okay. Now, I’m upset. My innocent, free-form, self-expressive tagging activity has been opened up to all this heavy stuff about the right way to chop up being.

Take, for instance, shirt. There’s this site that showcases different t-shirts. Right now, I put the tag “shirt” in [information] with my “apartment” tag. Now, if I decided that t-shirts were legitimately a form of art, then I would have no problem adding the tag shirt to the [literature/philosophy/art] bundle. What doesn’t make sense about this, though, is that usually an item tagged shirt is going to be of either informational or artistic value, not both! So if you click on the [information] bundle and find shirt items in it, you don’t know whether they have anything to do with shirt qua information…
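The shirt problem can be sketched in a few lines of Python. This is only a toy model, not the del.icio.us API: the dictionaries, the example.com URLs, the "poetry" tag and both function names are assumptions made up for illustration. It assumes that a bundle is just a set of tags and that a bookmark shows up in a bundle whenever it carries any one of those tags.

```python
# Hypothetical toy model of bundles, tags and bookmarks (not the del.icio.us API).
# A bundle is a set of tags; the same tag may sit in several bundles.
bundles = {
    "information": {"shirt", "apartment"},
    "literature/philosophy/art": {"shirt", "poetry"},
}

# Each bookmark (made-up URLs) carries a set of tags.
bookmarks = {
    "http://example.com/tshirt-gallery": {"shirt"},
    "http://example.com/flat-listings": {"apartment"},
    "http://example.com/sonnets": {"poetry"},
}


def items_in_bundle(bundle_name):
    """Return every bookmark carrying at least one tag from the bundle."""
    tags = bundles[bundle_name]
    return sorted(url for url, item_tags in bookmarks.items() if item_tags & tags)


def bundles_for_item(url):
    """Return every bundle the bookmark lands in through any of its tags."""
    return sorted(name for name, tags in bundles.items() if bookmarks[url] & tags)


print(items_in_bundle("information"))
print(bundles_for_item("http://example.com/tshirt-gallery"))
```

Under these assumptions the t-shirt gallery lands in both bundles at once, and nothing in the data records whether it was saved as information or as art. The bundle groups tags, not the respect in which an item was tagged.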

This problem is the reason why no librarian would ever use a del.icio.us-esque system for categorizing elements in an inventory. Actually, I think that tags tend to be backwards to normal taxonomy, because they’re not essentialist (the way, e.g., biological taxonomy is) but pull together disparate things by aspects. I think we choose the tags we use according to guesses about the relevance of the set of items that will share a given aspect, and I know I tend to discriminate against things that would make the contents of a tag too divergent and make a new tag instead.

We do the same thing on Google: we pick the words (aspects) that we think would be included in the type (essence) of page we’re looking for. (This is induction, right?) Here’s a question: do we actually have the concept of the page in our heads before we find what we want, or is this mostly a story we tell ourselves after the fact? I think the latter is the current (computerized) information science perspective; also, I have heard librarians complain about Google-style relevance searches applied to their Dewey-categorized catalogs. What the new IS people say is that old-school library science is tied to systems of categorization for physical elements, e.g., books, which actually have to reside in one place on a shelf (unless the library wants to keep multiple copies). So when the new IS people get excited they say that the same information can reside in many “places” and can be represented at the same time to many people in different ways. That’s the reasoning behind del.icio.us-style tagging and relevance searching: it’s easier to point to the thing that you’re looking for in the location where you know to look for it than to figure out deductively its location in a hierarchical system, especially if that system is impossibly big.

Except that physicality and distinctness keep reintruding. The W3C treats URIs (like http://www.stupididea.com/method/) as distinct locales that are always just one thing, at the same time and in the same respect (e.g., CGI inputs). For another thing, information always occupies a physical location, usually in the bits on a hard drive, although multiple identical instantiations of an item of information can exist simultaneously in different places. Since each instantiation of an item of information has the potential to be altered, the W3C uses URIs to establish the unique identities of authoritative documents. Further, while most documents are texts that allow for a wide degree of interpretation of intent and the nature of contents, many documents on the web are XML-like data files that rigidly describe a set of contents which are defined in relation to explicit definition files. These documents carry their own intended meaning, which is simply a set of hierarchical and associational relations between values (for example, a Friend-of-a-Friend (FOAF) file binds together values of people’s names, addresses, etc. into a network of relations).

Noncontradiction holds for information as well, if one takes “in the same respect” to mean something like “in the same system of categorization”. Moreover, relevance search and tagging, or reverse look-ups based on aspects, approach essential, i.e., hierarchical, categorization as more terms or more tags are used to single out an item. In Google, you should find only one document (and copies) once you reach the absurdity of searching for every word in that document in sequence. A del.icio.us bookmark list will be trash if every item is tagged with every tag, and bundling will only provide more information rather than less if tags are chosen to work rationally together with a carefully distinguished set of bundles. (I might be misunderstanding something, though. These bundles confuse me.)
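The narrowing effect, and the way it collapses when every item gets every tag, can be shown with a small Python sketch. Everything here is a made-up illustration: the bookmark names, the tags and the `matching` helper are assumptions, and real del.icio.us filtering may well work differently.

```python
# Hypothetical sketch: each added tag intersects the result set further.
def matching(tagged_items, wanted_tags):
    """Return the items that carry every one of the wanted tags."""
    wanted = set(wanted_tags)
    return sorted(name for name, tags in tagged_items.items() if wanted <= tags)


# A carefully tagged list: tags pick out distinct aspects of each item.
careful = {
    "bookmark-a": {"blue", "car"},
    "bookmark-b": {"blue", "sky"},
    "bookmark-c": {"red", "car"},
}

# The degenerate list: every item is tagged with every tag.
all_tags = {"blue", "car", "sky", "red"}
sloppy = {name: set(all_tags) for name in careful}

print(matching(careful, ["blue"]))         # two items share the aspect "blue"
print(matching(careful, ["blue", "car"]))  # adding "car" singles out one item
print(matching(sloppy, ["blue", "car"]))   # no tag combination discriminates
```

With the careful list, adding a second tag cuts the results from two bookmarks to one. With the sloppy list, every query returns everything, so the tags carry no information at all.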

For example, I had a tag named dog that was neither [people] nor [information] nor [culture], so I put it in its own bundle named [miscellaneous]. This is real lame, so I considered anticipating future tagging needs by creating an [animal] bundle, set against [people]. I realized how funny it was that in my supposedly informal categorization activity, I was being pushed toward established classification schemes (if I have [people] and [animals] I should have [places] and all sorts of things like [rocks] and [concepts]). Of course, I’m just imposing my prejudice for philosophically justified hierarchies onto a system of aspect-based tagging that is intended to provide a quick and dirty solution to mass organization of materials, right?

Okay, I’m going to stop now. Sorry for the incoherence and bad writing. My questions are:

  • (asked by Mr. David to Amy T) “Is there more than one way to divide up being?”
  • Does the need to reconcile the categorization of a large number of items in a system force specificity and thus truthfulness, or does it strain rationality, or is it just a perverse way to play a game of limited interest?
  • How should I organize my del.icio.us using bundles?
  • What is the being of a combined tag (like blue+car)?
  • Should we force computers to make inferences based on philosophically rationalized datasets? Does this make sense, and is it even possible?