Clay Shirky: Ontology is Overrated
Writing about web page http://conferences.oreillynet.com/cs/et2005/view/e_sess/6117
Premise: The way we currently try to apply categorisation to the web is wrong, because we're re-applying categorisation techniques from the pre-web world
Assertion: It will get worse before it gets better; things get broken before they get fixed
Parable: Travel agents – Travelocity takes the place of traditional travel agents, but it made the mistake of trying to do everything that a travel agent does, including helping people choose which holiday to take. In fact, people don't want an online agent to do this: Travelocity transposed the offline categorisation into the online world.
It's not possible to avoid cultural assumptions in categorisation: Dewey Decimal categorisation has 10 religion categories – 9 for aspects of Christianity and 1 for 'other'. Similarly for the Library of Congress geographic terms.
Physical ontologies are optimised for physical access – librarians invent classifications to help them find books. If there is no shelf then there's no need for a librarian's ontology.
Hierarchical ontologies are fundamentally unsuitable for non-physical information, because they're predicated on an object being in one place at one time, which isn't true of information that isn't tied to a shelf.
hierarchy -> hierarchy + links. When the number of links becomes large enough, you don't need the hierarchy any more.
browse (hierarchy) -> search (network of links)
When does ontological organisation work well?
- Small domain, formal categories, stable entities, restricted entities, clear edges
- Coordinated expert users (searchers/browsers), expert catalogers, authoritative source
N.B. the web is the diametric opposite of this!
When categorisations are collapsed, there's always some signal loss. Clay's example: if I tag something "queer" and you tag it "homosexual", we probably mean something subtly different. When categorisations are fixed, errors will always creep in over time, e.g. "Dresden is in East Germany".
great minds don't think alike
The number of tags used per user on del.icio.us follows a power law, which indicates organic organisation. The same holds for the number of items per tag for an individual user. Looking at the number and distribution of tags for a given URL gives an indication of how clear 'the community' is about the categorisation of the item.
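To make the "distribution of tags for a given URL" idea concrete, here is a minimal Python sketch. It is my own illustration, not something shown in the talk: the `(user, url, tag)` tuple format, the sample data and the `top_tag_share` measure (the fraction of taggings that use the single most popular tag) are all assumptions, and a real analysis would probably want something less crude.

```python
from collections import Counter


def tag_distribution(taggings, url):
    """Count how often each tag was applied to `url`.

    `taggings` is an iterable of (user, url, tag) tuples; this format is
    a hypothetical one chosen for the example.
    """
    return Counter(tag for _, u, tag in taggings if u == url)


def top_tag_share(distribution):
    """Fraction of all taggings for a URL that use its most popular tag.

    Assumed here as a rough proxy for community agreement: close to 1.0
    means most people chose the same tag, close to 0 means little agreement.
    """
    total = sum(distribution.values())
    if total == 0:
        return 0.0
    return distribution.most_common(1)[0][1] / total


# Made-up sample data: four users tagging two URLs.
taggings = [
    ("alice", "http://example.com/a", "python"),
    ("bob", "http://example.com/a", "python"),
    ("carol", "http://example.com/a", "python"),
    ("dave", "http://example.com/a", "programming"),
    ("alice", "http://example.com/b", "funny"),
    ("bob", "http://example.com/b", "design"),
    ("carol", "http://example.com/b", "toread"),
    ("dave", "http://example.com/b", "art"),
]

print(top_tag_share(tag_distribution(taggings, "http://example.com/a")))  # 0.75
print(top_tag_share(tag_distribution(taggings, "http://example.com/b")))  # 0.25
```

In this toy data the first URL has a clear dominant tag while the second has none, which is the kind of difference the distribution is meant to expose.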
Key point: In a folksonomy, each categorisation is worth less individually than a 'professional' categorisation would have been – but when aggregated they have much more value.
User and time are important attributes of tags: you need to know who tagged a resource, and when, in order to assign a value to the tag. The semantics in a folksonomy are in the users, not in the system. When del.icio.us sees 'OSX' it doesn't know that it's an operating system; it just knows that things tagged 'OSX' are also often tagged 'Mac' or 'Apple'.
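The 'OSX' point can be sketched as simple tag co-occurrence counting. This is only an illustration under my own assumptions: I have no idea how del.icio.us actually computes related tags, and the `co_occurrence` function and the sample bookmarks are hypothetical. The system knows nothing about what a tag means; it only counts which tags appear together on the same bookmark.

```python
from collections import Counter, defaultdict
from itertools import permutations


def co_occurrence(bookmarks):
    """Count, for each tag, how often every other tag appears alongside it.

    `bookmarks` is an iterable of tag collections, one per bookmark
    (a hypothetical format chosen for this example).
    """
    related = defaultdict(Counter)
    for tags in bookmarks:
        # Every ordered pair of distinct tags on the same bookmark
        # counts as one co-occurrence.
        for a, b in permutations(sorted(set(tags)), 2):
            related[a][b] += 1
    return related


# Made-up sample data: each set is the tags on one bookmark.
bookmarks = [
    {"osx", "mac", "apple"},
    {"osx", "mac"},
    {"osx", "apple", "software"},
    {"linux", "software"},
]

related = co_occurrence(bookmarks)
for tag, count in related["osx"].most_common():
    print(tag, count)
```

Nothing in this code says that 'osx' is an operating system. Any sense of what the tag is about comes entirely from the choices users made, which is the point about semantics living in the users.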
Does the world make sense, or do we make sense of the world?