All entries for Wednesday 19 May 2010

May 19, 2010

Getting a random word in a bash script

Earlier I got to wondering if there was an easy way to get a random word in a bash script. My first thought was:

$ sort -R /usr/share/dict/words | head -1

That works but takes over ten seconds. The first Google hit I got was this page which suggests:

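#Number of lines in $WORDFILE
tL=`awk 'NF!=0 {++c} END {print c}' $WORDFILE`
for i in `seq $NUMWORDS`
do
    # next line reconstructed from context (assumption): line number taken from $RANDOM
    rnum=$((RANDOM % tL + 1))
    sed -n "$rnum p" $WORDFILE
done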

Which works, but only returns words that start with a b or c. The problem is that bash's $RANDOM is a number between 0 and 32767 and there are many more words than that in /usr/share/dict/words, so only the first 32768 lines of the file can ever be picked.
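To make the limit concrete, here is a minimal sketch of the arithmetic. The helper name max_reachable_line and the line count passed to it are made up for illustration; the only fact relied on is that $RANDOM never exceeds 32767.

```shell
#!/usr/bin/env bash
# Hypothetical helper: highest line number that "RANDOM % total + 1" can produce
# for a file with the given number of lines.
max_reachable_line() {
    local total=$1
    local random_max=32767   # largest value bash's $RANDOM can return
    if (( total > random_max + 1 )); then
        # RANDOM % total is just RANDOM here, so the result tops out at 32768
        echo $(( random_max + 1 ))
    else
        # small files are fully covered
        echo "$total"
    fi
}

# 98569 is an assumed line count, used only as an example
max_reachable_line 98569   # prints 32768
```

To check a real word list, something like max_reachable_line "$(wc -l < /usr/share/dict/words)" should do.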

I didn't find any better suggestions, so after a bit of fiddling I came up with this:

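# seed random from pid
RANDOM=$$;
# using cat means wc outputs only a number, not number followed by filename
lines=$(cat $WORDFILE | wc -l);
# line number from the product of two $RANDOM values (this line reconstructed from the description below)
rnum=$((RANDOM*RANDOM%lines+1));
sed -n "$rnum p" $WORDFILE;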

The maximum product of two values of $RANDOM (32767 x 32767, a little over a billion) is far greater than the number of lines in /usr/share/dict/words, so the problem of only getting back words starting with a b or c is eliminated.
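A variation on the same idea, not what I used above: instead of multiplying, two $RANDOM draws can be joined with a bit shift to give one 30-bit number. This is only a sketch; random_line is a made-up function name, and it reads the line count with input redirection rather than cat.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one random line from the file named in $1.
random_line() {
    local file=$1
    local lines rnum
    # reading from stdin means wc prints only the number
    lines=$(wc -l < "$file")
    (( lines > 0 )) || return 1
    # two 15-bit $RANDOM draws joined into one value between 0 and 1073741823
    rnum=$(( ((RANDOM << 15) | RANDOM) % lines + 1 ))
    sed -n "${rnum}p" "$file"
}
```

Usage would be something like random_line /usr/share/dict/words. The modulo still gives a very slight bias towards earlier lines, which hardly matters for picking a word.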

Some time later, whilst looking at something else, I came across a reference to the shuf command. Turns out that does much the same as 'sort -R', only much faster (around 0.2 seconds). So if you want a simple and fast way to get a random word in a bash script all you need is:

$ shuf -n1 /usr/share/dict/words

Edit: Assuming you're using a *nix machine which has shuf installed. shuf is part of GNU core utilities so it should be included in any randomly chosen Linux install. It's not included in Mac OS X (at least not Leopard, which is what I have to hand) or Solaris 10, though. Mac users could get shuf by installing the coreutils package that's available via MacPorts.
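For a script that has to run on machines with and without shuf, a small wrapper along these lines might help. It is a sketch under assumptions: random_word is a made-up name, and the gshuf fallback assumes the MacPorts coreutils package installs its tools with a "g" prefix, which is worth checking on your own install.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one random line from a word list.
# Assumption: GNU shuf may be available as "gshuf" instead of "shuf".
random_word() {
    local file=${1:-/usr/share/dict/words}
    local cmd
    for cmd in shuf gshuf; do
        if command -v "$cmd" >/dev/null 2>&1; then
            "$cmd" -n1 "$file"
            return
        fi
    done
    echo "random_word: no shuf command found" >&2
    return 1
}
```

Calling random_word with no argument uses /usr/share/dict/words; pass a path to use a different list.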
