All 2 entries tagged Search Engine Results

July 25, 2007

Over 20% of Google's Searches have never been seen by Google before.

I happened across a couple of ‘long tail of search’ quotes recently that confirm the rule that half of all search terms are unique.

Chris Anderson’s ‘Long Tail’ blog quoted a senior Microsoft executive’s letter confirming this ratio.

Steve Johnston’s blog confirms this ratio for Google and goes on to quote an even more amazing claim.

Udi Manber, VP of Google Engineering said at a recent conference, that there are three reasons why Search is only going to get harder in the future. One of these I already use – “scale and diversity are almost beyond comprehension” – one of which is not particularly relevant to this point, but the third will replace my previous reference: “20 to 25% of the queries we see today, we have never seen before”. I will convert this into ‘1 in 4 of the expressions typed into Google today have never been seen by Google before’. Ponder that for a moment. Google is tapped into our collective consciousness. It’s astonishing.

Truly astonishing indeed! In my experience, no matter the size of the server logs that I processed, 50% of the search terms were unique, which suggests that these figures are also taken from unprocessed logs. Extraneous characters, which the search engines ignore, have helped to reach these easily remembered ratios. The other root causes are synonyms and word ordering; see my entry on why word order counts.

Most of these ‘very long tail’ search terms can include words from the ‘short head’. I have found that 80% of visitors to a well-titled page will have used one or more words from the title, yet 50% of the page’s search terms will still be unique! Having looked at all of the long tail terms for a page titled “klaxon horn” (and many others), I can also believe that 20-25% had never been seen by Google before!
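A minimal sketch of how those two figures can sit side by side. The function name and the sample search terms are made up for illustration (they are not taken from the real logs), and "unique" is assumed to mean a term used by exactly one visit.

```python
from collections import Counter


def title_overlap_stats(title, search_terms):
    """Return two shares for one page.

    First value: share of visits whose search term contains a title word.
    Second value: share of visits whose search term was used only once.
    """
    title_words = set(title.lower().split())
    counts = Counter(term.lower().strip() for term in search_terms)
    visits = sum(counts.values())
    visits_with_title_word = sum(
        n for term, n in counts.items() if title_words & set(term.split())
    )
    visits_with_unique_term = sum(1 for n in counts.values() if n == 1)
    return visits_with_title_word / visits, visits_with_unique_term / visits


# Hypothetical sample: one entry per visit to a page titled "klaxon horn".
sample = [
    "klaxon horn", "klaxon horn", "klaxon horn",
    "klaxon", "klaxon",
    "old klaxon horn for sale",
    "vintage car horn",
    "klaxon sound effect",
    "ahooga",
    "submarine dive alarm",
]

title_share, unique_share = title_overlap_stats("klaxon horn", sample)
print(title_share, unique_share)
```

In this made-up sample, 8 of the 10 visits share a word with the title while 5 of the 10 used a term that nobody else typed, so the two ratios do not contradict each other.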

July 20, 2007

The Longest Tail of Search because word–order counts.

I have blogged before about how my OU MSc project found Chris Anderson’s “The Long Tail” distribution throughout my results. I have also found ‘long tails’ within ‘long tails’, as predicted by Chris, when I filtered the overall visitors to our site to record the visitors to company showcases and profiles.

The Long Tail, Anderson (2006)

I noted before that the last 30,000 visitors referred to our site by search engines used just over 20,000 different search terms. The most popular search terms were each used by hundreds of visitors, which left over 15,000 terms that were used by only one visitor.
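The arithmetic behind those figures can be worked through in a few lines. This sketch rounds "just over 20,000" and "over 15,000" down to round numbers, so the results are approximate.

```python
# Rounded figures from the paragraph above (approximations, not exact log counts).
visits = 30_000          # visitors referred by search engines
distinct_terms = 20_000  # "just over 20,000" different search terms
used_once = 15_000       # "over 15,000" terms used by only one visitor

# Share of all visits that arrived on a term nobody else used.
share_of_visits_unique = used_once / visits

# Share of the distinct terms that were used exactly once.
share_of_terms_used_once = used_once / distinct_terms

# The remaining terms carry the rest of the traffic.
repeated_terms = distinct_terms - used_once
visits_on_repeated_terms = visits - used_once
average_visits_per_repeated_term = visits_on_repeated_terms / repeated_terms

print(share_of_visits_unique)            # 0.5
print(share_of_terms_used_once)          # 0.75
print(average_visits_per_repeated_term)  # 3.0
```

So roughly half of the visits used a unique term, and about three quarters of the distinct terms were seen only once. The average of three visits per repeated term, set against top terms used by hundreds of visitors, is the long tail shape in miniature.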

These unique terms were captured raw from our server log, and their diversity was generated in two ways:
  • The permutations of words; caused by search refinement, sorting and the flexibility of the English language.
  • Extraneous characters, commas, brackets, etc., probably as a result of copying.
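To gauge how much each of these two sources inflates the count of unique terms, the raw terms can be normalised in two steps. This is a minimal sketch under stated assumptions: the function names and the sample terms are hypothetical, and stripping every punctuation character is only a rough stand-in for whatever the search engines actually ignore.

```python
import re


def strip_extraneous(term):
    """Lower-case the term, turn punctuation into spaces, collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", term.lower())
    return " ".join(cleaned.split())


def order_insensitive(term):
    """Additionally sort the words, so that permutations collapse together."""
    return " ".join(sorted(strip_extraneous(term).split()))


def distinct_counts(raw_terms):
    """Count distinct terms at each level of normalisation."""
    return {
        "raw": len(set(raw_terms)),
        "punctuation_stripped": len({strip_extraneous(t) for t in raw_terms}),
        "word_order_ignored": len({order_insensitive(t) for t in raw_terms}),
    }


# Hypothetical raw terms, as they might appear in a server log.
sample = [
    "bookshops online",
    "online bookshops",
    "(online bookshops)",
    "Bookshops, online",
    "online bookshops uk",
]

counts = distinct_counts(sample)
print(counts)
```

The drop from the raw count to the punctuation-stripped count shows the effect of extraneous characters, and the further drop when word order is ignored shows the effect of permutations. Sorting the words is only a measuring device here; it does not imply that the search engines treat reordered terms as the same query.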

The extra random characters will be ignored by the search engines, but does the order of words matter? I was finding really competitive ‘short head’ terms in our results for companies that could not compete with the top global sites. Were variations of popular ‘short head’ terms appearing in the ‘long tail’ because of word order?

Search on Google for “bookshops online” and “online bookshops” and you will find that the order of the top 10 results changes, with a couple of new entries. Research using Google’s AdWords shows that closely matching the searcher’s search term generates better click-through rates (CTRs). So if the search engines use CTR as even a small factor in producing their results pages, then word order will make a difference. I am convinced that the searchers’ ‘votes’ count with Google, and I can demonstrate that word order also varies the results with Yahoo!.

The visitors using these ‘long tail’ terms found our pages because the pages addressed the human audience and used natural, varied English. This ‘long tail’ of unique search terms cannot be addressed by mechanical key term stuffing, which explains why all the search engines recommend good copy in the human voice.

Chris Anderson has explored many significant long tails. His expertise from the music arena and the media has taken him from tracks to books, TV episodes and films. His long tails were all of physical goods or downloadable entities with one correct title.

I have found that searching on the web has created the longest tail, of natural language, because the word order counts.
