May 19, 2010

Getting a random word in a bash script

Earlier I got to wondering if there was an easy way to get a random word in a bash script. My first thought was

$ sort -R /usr/share/dict/words | head -1

That works but takes over ten seconds. The first Google hit I got was this page which suggets:

#Number of lines in $WORDFILE
tL=`awk 'NF!=0 {++c} END {print c}' $WORDFILE`
for i in `seq $NUMWORDS`
sed -n "$rnum p" $WORDFILE

Which works, but only returns words that start with a b or c. The problem is that bash's $RANDOM is a number between 0 and 32767 and there's many more words than that in /usr/share/dict/words.

I didn't find any better suggestions so after a bit of fiddling I came up with this.

# seed random from pid
# using cat means wc outputs only a number, not number followed by filename
lines=$(cat $WORDFILE | wc -l);
sed -n "$rnum p" $WORDFILE;

The maximum value of multiplying two values of $RANDOM is greater than the number of lines in /usr/share/dict/words thus the problem of only getting back words starting with a b or c is eliminated.

Some time later, whilst looking at something else I came accross reference to the shuf command. Turns out that does much the same as 'sort -R' only much faster (around 0.2 second). So if you want a simple and fast way to get a random word in a bash script all you need is:

$ shuf -n1  /usr/share/dict/words

Edit: Assuming you're using a *nix machine which has shuf installed. shuf is part of GNU core utilities so it should be included in any randomly chosen Linux install. It's not included in Mac OS X (at least not Leopard which is what I have to hand) though or Solaris 10. Mac users could get shuf by installing the coreutils package that's available via MacPorts.

- No comments Not publicly viewable

Add a comment

You are not allowed to comment on this entry as it has restricted commenting permissions.

Search this blog


Not signed in
Sign in

Powered by BlogBuilder