Over 20% of Google's Searches have never been seen by Google before.
Writing about web page http://www.johnston.co.uk/2007_06_01_blog-archive.html
I happened across a couple of ‘long tail of search’ quotes recently that confirm the rule that half of all search terms are unique.
Chris Anderson’s ‘Long Tail’ blog quoted a senior Microsoft executive’s letter confirming this ratio.
Steve Johnston’s blog confirms this ratio for Google and goes on to quote an even more amazing claim.
Udi Manber, VP of Google Engineering said at a recent conference, that there are three reasons why Search is only going to get harder in the future. One of these I already use – “scale and diversity are almost beyond comprehension” – one of which is not particularly relevant to this point, but the third will replace my previous reference: “20 to 25% of the queries we see today, we have never seen before”. I will convert this into ‘1 in 4 of the expressions typed into Google today have never been seen by Google before’. Ponder that for a moment. Google is tapped into our collective consciousness. It’s astonishing.
Truly astonishing indeed! In my experience, no matter the size of the server logs that I processed, 50% of the search terms were unique, which also confirms that these figures are taken from unprocessed logs. So extraneous characters, ignored by the search engines, have helped reach these easily remembered ratios. The other root causes are synonyms and word ordering; see word order counts.
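To illustrate how much raw logs can inflate the "unique" ratio, here is a minimal Python sketch. It rests on assumptions of mine, not on anything Google or the quoted blogs describe: "unique" is taken to mean a distinct search term that occurs exactly once in the log, the clean-up is just lowercasing, dropping punctuation and sorting the words, and the sample queries are invented. Synonyms are not handled at all.

```python
import re
from collections import Counter


def normalize(query):
    """Lowercase, drop extraneous characters, and ignore word order."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    return " ".join(sorted(words))


def unique_share(queries):
    """Fraction of distinct search terms that occur exactly once."""
    counts = Counter(queries)
    if not counts:
        return 0.0
    singles = sum(1 for n in counts.values() if n == 1)
    return singles / len(counts)


# Invented sample log, for illustration only.
raw_log = [
    "klaxon horn", "Klaxon Horn", "horn klaxon", "klaxon horn!",
    "klaxon horn", "vintage klaxon horn", "klaxon horn for sale",
    "horn, klaxon",
]

print("raw:       ", unique_share(raw_log))
print("normalized:", unique_share([normalize(q) for q in raw_log]))
```

In this made-up sample, six of the seven distinct raw strings occur only once, while after normalization only three distinct terms remain, two of which occur once. Real logs will behave differently, but the direction of the effect is the point.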
Most of these ‘very long tail’ search terms include words from the ‘short head’. My findings are that 80% of visitors to a well-titled page will have used one or more words from the title, yet 50% of the page’s search terms will still be unique! Having looked at all of the long tail terms for a page titled “klaxon horn” (and many others), I can also believe that 20-25% had never been seen by Google before!
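The title-word figure can be measured along these lines. This is only a sketch under my own assumptions: each visit is represented by the search term that brought the visitor, a "word" is any run of letters or digits compared case-insensitively, and the sample visits are invented rather than taken from my logs.

```python
import re


def word_set(text):
    """Set of lowercase alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def title_word_share(title, visit_terms):
    """Fraction of visits whose search term shares a word with the title."""
    if not visit_terms:
        return 0.0
    title_words = word_set(title)
    hits = sum(1 for term in visit_terms if word_set(term) & title_words)
    return hits / len(visit_terms)


# Invented visits to a page titled "klaxon horn", for illustration only.
visits = [
    "klaxon horn",
    "old car horn sound",
    "ahooga",
    "klaxon for sale",
    "vintage klaxon horn",
]

print(title_word_share("klaxon horn", visits))
```

Here four of the five invented search terms contain "klaxon" or "horn", even though every one of the five terms is different.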