So far we’ve seen that there are three sources of semantics in HTML:
- The built in semantics of HTML itself: its elements and attributes.
- The ad hoc semantics of developers inventing their own vocabularies, which are typically “injected” into HTML, largely through the class and id attributes.
- Semi-structured approaches to developing richer semantics, in particular the microformats project.
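
As a rough illustration of these three sources, here is a small fragment of markup. It is only a sketch: the id and class names in the second part (`staff-list`, `staff-member`, `job`) are invented for this example, the person and organisation are placeholders, and the third part uses the `vcard`, `fn` and `org` class names from the hCard microformat.

```html
<!-- 1. Built in semantics: the element itself says what the content is -->
<h1>Our team</h1>
<address>Contact us at <a href="mailto:info@example.com">info@example.com</a></address>

<!-- 2. Ad hoc semantics: id and class values a developer made up (hypothetical names) -->
<div id="staff-list">
  <p class="staff-member">Jane Doe, <span class="job">Editor</span></p>
</div>

<!-- 3. Semi-structured semantics: class names agreed on by the hCard microformat -->
<div class="vcard">
  <span class="fn">Jane Doe</span>,
  <span class="org">Example Press</span>
</div>
```

In the second part, nobody other than the page’s author knows what `staff-member` is supposed to mean, whereas the first and third parts rely on names that were agreed on elsewhere.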
It seems likely that future semantic developments in HTML will come from these or similar sources. In this article I want to take each in turn and consider its benefits and shortcomings as a way of developing richer semantics for HTML.
I’ll begin with the second approach, “bottom up” semantics, which I considered in the first article and have examined at some length in previous research. Bottom up ontologies, which Thomas Vander Wal terms “folksonomies”, have been a success elsewhere: common vocabularies for describing things emerge through ad hoc usage, and the tags at Flickr and Del.icio.us are well known examples. Despite that success, comparable vocabularies for describing common data on the web simply haven’t emerged. This is not just an assertion; my previous research bears it out. Nor should it come as a surprise. Class values, for example, are “hidden”, while tags at Del.icio.us or Flickr are visible, which gives rise to a positive feedback loop: when I as a user see a tag for a particular kind of thing, I am more likely to use it myself for similar kinds of things. Over time, particular terms appear to “win” and become the conventionally accepted tag for that kind of thing. With class and id values, on the other hand, we simply don’t get the network effect that anoints particular words as the names of things.