January 28, 2008

LSA

I mentioned in my last post that I've been looking at PLSA (Probabilistic Latent Semantic Analysis), an evolution of LSA (Latent Semantic Analysis). Now that I've thrown the term around, it really does need a bit of explaining.

Basically, it's about measuring which terms are used often, and used together, across a range of documents. I could use it to look through songs and find which terms occur together, which would suggest a theme. For example: 'cross', 'jesus', 'died', 'save', 'blood', 'me' and 'thank' might all appear together quite often in a set of songs. This suggests these songs all share a theme (Jesus' death / salvation in this example).
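To make the idea a bit more concrete, here is a rough sketch in Python that just counts how often pairs of terms turn up in the same song. It is only the co-occurrence part of the idea, not full LSA (which goes further and factorises a term-document matrix with a singular value decomposition). The song lines, the stopword list and the function name `cooccurrence_counts` are all made up for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy "songs"; real lyrics would be loaded from files.
songs = [
    "jesus died on the cross to save me",
    "thank you for the cross and the blood",
    "the blood of jesus will save me",
    "sing a new song of joy",
]

# A tiny, hand-picked stopword list (an assumption for this example).
STOPWORDS = {"the", "on", "to", "for", "and", "of", "a", "you", "will"}


def cooccurrence_counts(documents):
    """Count how many documents each pair of terms appears in together."""
    counts = Counter()
    for text in documents:
        # Each term counts once per document; sorting makes the pair
        # order stable, e.g. always ('jesus', 'save'), never the reverse.
        terms = sorted(set(text.lower().split()) - STOPWORDS)
        counts.update(combinations(terms, 2))
    return counts


pair_counts = cooccurrence_counts(songs)

# Show the pairs that occur together most often.
for pair, n in pair_counts.most_common(3):
    print(pair, n)
```

Pairs with high counts (such as 'jesus' and 'save' here) are the ones that hint at a shared theme; with real lyrics the stopword list and the counting would need a lot more care.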

I discovered that Google can do this! (Is there anything it can't do?!)

If you search for a term with a tilde ('~') before it, Google will return results containing terms similar to that word, which looks a lot like LSA at work.

Combining this with Google's NOT operator ('-'), if you search for '~computer -computer', you get results that contain terms similar to 'computer' but exclude the word 'computer' itself. So the results highlight words like 'hardware', 'PC', 'laptop' and 'computing', rather than 'computer' as a plain search for 'computer' would. Clever!

(If you're really interested: it seems that Google will only search for 5 terms beyond the search term, i.e. in my '~computer' example, the results are only for 'computer', 'PC', 'laptop', 'computing' and 'computerized'. If you exclude all of those terms with '-', nothing gets returned.)
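If you want to play with this, the query strings above are easy to build up in code. This is just a small sketch that glues the '~' and '-' operators together as described in this post; the helper name `related_terms_query` is my own invention and nothing here talks to Google.

```python
def related_terms_query(term, exclude=()):
    """Build a search string: '~term' followed by '-word' for each exclusion.

    The '~' and '-' operators are used as described in the post; this
    function only formats text and does not perform any search.
    """
    parts = ["~" + term]
    parts.extend("-" + word for word in exclude)
    return " ".join(parts)


# Similar terms to 'computer', without the word 'computer' itself.
print(related_terms_query("computer", exclude=["computer"]))

# Excluding every term from the '~computer' example above.
seen_terms = ["computer", "PC", "laptop", "computing", "computerized"]
print(related_terms_query("computer", exclude=seen_terms))
```

The second query is the one that, in my experiment, came back with nothing.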


James Williams
