All entries for Monday 28 January 2008

January 28, 2008


I mentioned in my last post that I've been looking at PLSA, an evolution of LSA (Latent Semantic Analysis). Well, this term really does need a bit of explaining now that I've posted it.

Basically, it measures which terms are used often, and used together, across a range of documents. I could use it to look through songs and find which terms occur together, which suggests a theme. For example, 'cross', 'jesus', 'died', 'save', 'blood', 'me' and 'thank' might all appear together quite often in a set of songs. This suggests that these songs share a theme (Jesus' death / salvation in this example).
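To make the "terms occurring together" idea concrete, here is a minimal Java sketch that counts how many songs each pair of terms shares. It is only raw co-occurrence counting, which is the kind of input LSA and PLSA start from, not LSA or PLSA themselves. The class name `CoOccurrence`, its methods and the lyric fragments are all made up for illustration.

```java
import java.util.*;

// Hypothetical helper: counts how many songs each pair of terms appears in together.
// This is plain co-occurrence counting, not LSA or PLSA.
class CoOccurrence {

    // Split lyrics into a sorted set of distinct lower-case words.
    static Set<String> terms(String lyrics) {
        Set<String> result = new TreeSet<>();
        for (String word : lyrics.toLowerCase().split("[^a-z']+")) {
            if (!word.isEmpty()) {
                result.add(word);
            }
        }
        return result;
    }

    // For every pair of terms, count the number of songs containing both.
    // Keys look like "blood|cross", with the two terms in alphabetical order.
    static Map<String, Integer> countPairs(List<String> songs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String song : songs) {
            List<String> sorted = new ArrayList<>(terms(song));
            for (int i = 0; i < sorted.size(); i++) {
                for (int j = i + 1; j < sorted.size(); j++) {
                    counts.merge(sorted.get(i) + "|" + sorted.get(j), 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up lyric fragments, purely for illustration.
        List<String> songs = Arrays.asList(
            "thank you for the cross where jesus died",
            "jesus died to save me by his blood",
            "the cross the blood you died for me");
        Map<String, Integer> counts = countPairs(songs);
        System.out.println("cross|died -> " + counts.get("cross|died"));
        System.out.println("died|jesus -> " + counts.get("died|jesus"));
    }
}
```

Pairs with high counts across many songs are candidates for a shared theme. A real system would also need to ignore very common words such as 'the' and 'for', which co-occur with almost everything.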

I discovered that Google can do this! (Is there anything it can't do?!)

If you search for a term with a tilde ('~') before it, Google will return results containing terms similar to that word, which looks very much like LSA at work.

Combining this with Google's NOT operator ('-'), a search for '~computer -computer' returns results that contain terms similar to 'computer' but leaves out the word 'computer' itself. So the results highlight words like 'hardware', 'PC', 'laptop' and 'computing', rather than 'computer' as a plain search for 'computer' would. Clever!

(If you're really interested - it seems that Google will only search for 5 terms beyond the search term, i.e. in my '~computer' example, the results are only for 'computer', 'PC', 'laptop', 'computing' and 'computerized'. If you 'NOT' all of those terms, nothing gets returned.)


Hi there... if you're reading this, there's a good chance it's because you've just completed my online questionnaire. In which case, thank you very much! If not, please feel free to take the survey.


Although I've been doing lots of reading recently on subjects such as 'Probabilistic Latent Semantic Analysis' (!) in an attempt to find ways of automatically detecting song themes from lyrics alone, there hasn't been much progress made in the last couple of weeks in actually making it work.

But some tweaks have been made to my PSALM program. The main ones are:

  1. Load a database supplied as an argument to the program (e.g. 'java -jar XMLGUI.jar mydb.xml').
  2. A keyword search now finds songs containing each word separately, rather than treating the words as a single phrase. To search for a single phrase, enclose it in "speech marks". This follows the convention used by Google.
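As a rough illustration of the second tweak, the sketch below splits a query into separate words and quoted phrases, then checks a song against them. It is not PSALM's actual code: the class `KeywordSearch` and its methods are hypothetical, and it assumes a song must contain every term (AND) and that matching is a case-insensitive substring check.

```java
import java.util.*;
import java.util.regex.*;

// Hypothetical sketch of Google-style keyword search; not PSALM's real implementation.
class KeywordSearch {

    // Matches either a "quoted phrase" (group 1) or a single bare word (group 2).
    private static final Pattern TERM = Pattern.compile("\"([^\"]*)\"|(\\S+)");

    // Split a query into search terms: a quoted phrase stays as one term,
    // and every other word becomes its own term.
    static List<String> parseQuery(String query) {
        List<String> termList = new ArrayList<>();
        Matcher m = TERM.matcher(query);
        while (m.find()) {
            String term = m.group(1) != null ? m.group(1) : m.group(2);
            term = term.trim().toLowerCase();
            if (!term.isEmpty()) {
                termList.add(term);
            }
        }
        return termList;
    }

    // Assumption: a song matches only if it contains every term (AND).
    // contains() is a plain substring check, so 'save' also matches 'saved'.
    static boolean matches(String lyrics, String query) {
        String haystack = lyrics.toLowerCase();
        for (String term : parseQuery(query)) {
            if (!haystack.contains(term)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String lyrics = "Amazing grace, how sweet the sound";
        System.out.println(matches(lyrics, "grace amazing"));      // words in any order
        System.out.println(matches(lyrics, "\"grace amazing\""));  // exact phrase only
    }
}
```

With this approach the query `grace amazing` finds the song because both words appear somewhere in it, while `"grace amazing"` does not, because that exact phrase never occurs.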

If you're wondering just what PSALM is, it is the software component of my project. Take a look at the progress report or specification for more details. But basically, it is a song organisation system that will soon also include theme detection and setlist/song recommendation functionality. (PSALM = Personal Software Aid for Leading Music)

So, here is Psalm 1...

Blessed is the man
    who does not walk in the counsel of the wicked
    or stand in the way of sinners
    or sit in the seat of mockers.

But his delight is in the law of the LORD,
    and on his law he meditates day and night.

He is like a tree planted by streams of water,
    which yields its fruit in season
    and whose leaf does not wither.
    Whatever he does prospers.

Not so the wicked!
    They are like chaff
    that the wind blows away.

Therefore the wicked will not stand in the judgment,
    nor sinners in the assembly of the righteous.

For the LORD watches over the way of the righteous,
    but the way of the wicked will perish.

And the newly released 'PSALM 1.1' ...

Download PSALM 1.1 here



James Williams
