The Szekeres Chronicles: The Long Tail
So, what were we talking about last time? Oh yes, search behaviour. In the last post, we saw that one- and two-word searches are becoming less frequent, while longer searches are "gaining popularity". This is all very good, but why is this so important to some people, who seem to be collecting this data, plotting it, making tables about it, and performing all kinds of other numerical acrobatics? Well, guess what: it turns out there's money involved. Imagine you own a website that somehow tries to make a profit, maybe by selling things. Then you want to attract as many visitors as possible - since more visitors presumably mean greater profit - but this is not always easy, given that a lot of people discover your site simply by stumbling upon it after searching for something or other on a search engine. By using programs like Google Analytics, however, you can get an idea of which keywords bring the most traffic to your site, which can give you crucial insight into how to adapt your site to attract even more potential customers. Plus, there's something with publicity and advertisement involved, if you link your site to certain keywords or something. Go figure.
If we then plot each keyword against the number of "hits" that keyword gets (i.e. the number of visitors who visited your site after searching for that keyword), we might get something like this:
What this means is that there are a few keywords that attract most of the visitors (dark green), while the majority of keywords give very few hits each (light green). This kind of distribution, known as the "Pareto Distribution", pops up in all kinds of places - for example, the money made by an airline company, the wealth of a country's population, or the size of meteorites. Which is why I haven't labeled the axes.
Closely linked to the Pareto distribution is something called the "Pareto Principle" (aka the "80-20 Rule" aka the "law of the vital few"). It states that "roughly 80% of the effects come from 20% of the causes", as a rule of thumb. For instance, in most countries, about 20% of the population owns 80% of the country's wealth, and the same is true for the world population. In most businesses, 20% of the clients account for 80% of the sales. Microsoft is said to have noticed that usually, 20% of all the bugs cause 80% of all the problems. And one could easily imagine other applications: maybe 20% of the videos on YouTube get 80% of all the views, and maybe 20% of your actions cause 80% of your carbon footprint.
In our case, we can assume that the 20% most popular keywords generate 80% of the traffic to our fictional site. And this is where our discovery from the previous post gets interesting. See, if more and more people are starting to use longer keywords, then the one- and two-word searches that used to generate the bulk of all incoming traffic are no longer as important. Graphically, what is happening is that the head is getting smaller, while the tail is becoming bigger and longer (see above graph). Your website will probably have to adapt to this change, and if it does, it may even benefit from it.
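If you'd like to see the arithmetic behind the "head" and the "tail", here is a minimal Python sketch. Everything in it is made up for illustration: the hit counts are hypothetical numbers I picked so the sums come out round, and `head_share` is just a helper name I invented, not something from Google Analytics or any other tool.

```python
def head_share(hits, head_fraction=0.2):
    """Return the fraction of all hits that comes from the most popular keywords.

    hits: a list of hit counts, one per keyword (any order).
    head_fraction: which top slice of keywords counts as the "head".
    """
    ranked = sorted(hits, reverse=True)  # most popular keyword first
    head_size = max(1, round(len(ranked) * head_fraction))
    return sum(ranked[:head_size]) / sum(ranked)


# Hypothetical hit counts for ten keywords (invented numbers, not real data).
# "Before": two short keywords dominate the traffic.
hits_before = [500, 300, 50, 40, 30, 25, 20, 15, 12, 8]

# "After": same total traffic, but longer searches have taken a bigger slice.
hits_after = [350, 250, 80, 70, 60, 50, 45, 40, 30, 25]

print(f"Head share before: {head_share(hits_before):.0%}")  # 80%
print(f"Head share after:  {head_share(hits_after):.0%}")   # 60%
```

In the first list the top two keywords (20% of ten) bring in 800 of the 1000 hits, which is the 80-20 picture. In the second list the total is still 1000, but the head only brings in 600, so the tail has grown from 20% to 40% of the traffic. Real keyword data will of course not be this tidy.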
This change is not unique to this particular situation; it is also a general business phenomenon that has appropriately been labeled "The Long Tail". I said earlier that in most businesses, 20% of the clients account for 80% of the sales; likewise, it has traditionally been true that a few popular items have accounted for most of the revenue. It turns out that this is slowly changing, especially in the online world: the focus is slowly shifting from the head to the tail. This is why Amazon, eBay, online bookstores and other such retailers look the way they do. Popular items are still being sold, but the market is now flooded with so-called "non-hit items", items that are not very popular and that you wouldn't find in a traditional you-have-to-leave-your-house-to-get-there kind of shop. And a considerable number of sales are being made from this long tail, so this new business model seems to work.
Chris Anderson, who coined the term "The Long Tail", has written a few books on this topic. They are, naturally, available on Amazon.
Also, if my posts have whetted anyone's appetite for web analytics, a certain Avinash Kaushik maintains a blog named Occam's Razor, in which he writes about this subject in a clear and engaging way.
Finally, all this has reminded me of a short anecdote of mine: During a Quantitative Economics seminar (a dozen students in a small classroom), our teacher was telling us about Complementary Goods. "Perfect complementary goods are goods that have to be consumed together. They aren't worth anything alone; good A is nothing without good B and vice versa. A classic example is a left shoe and a right shoe - you need both of them together. What do you do if you have a right shoe only?"
The question was of course rhetorical but I couldn't help myself, and said: "You sell it on eBay..."