Blast from the past
For some unknown reason I've just stumbled across the documentation I wrote for the CS120 coursework I did a little under 3 years ago. It made me chuckle so I thought I'd share it with you. Feel free to ignore the first section – it isn't that interesting.
DOCUMENTATION
Utilities
I have written this assignment using shell script, awk and sed. While I realise that it is probably easier to implement using Perl, I decided to stick to what I know.
My submission for this assignment is 6 files: league, printout, cluboption, dateoption, ties, and capital. Printout is called by league when the -f option is given, or when no options are given at all. Cluboption is called when -c is given, and dateoption when -d is given. Ties is called by printout and dateoption to detect any ties between athletes, and capital, a sed script file, is called by printout and cluboption in order to solve the case insensitivity problem.
Generally, I have tried to use the simplest ways of solving the problem that I could think of (although it might not always look that way - this program may give you some fascinating insights into how my mind works ;-) ). I have used awk mainly for printing out certain parts of lines, avoiding long scripts and complicated functions. Similarly for sed, I know it is possible to do what I did in the "capital" file in about 5-10 lines, as I have seen examples on the internet, but I did not use this because I was not sure how it worked. I also tried to use tr as opposed to sed as often as possible, as it is simpler to understand.
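As an illustration of the tr approach to case insensitivity, here is a minimal sketch. It is not the contents of the "capital" file; the function names to_lower and same_name are hypothetical, and it assumes names are compared as whole strings.

```shell
#!/bin/sh
# Hypothetical helper: fold standard input to lower case using tr.
to_lower() {
    tr '[:upper:]' '[:lower:]'
}

# Hypothetical helper: succeed (exit status 0) when the two arguments
# are the same name, ignoring upper/lower case differences.
same_name() {
    a=$(printf '%s\n' "$1" | to_lower)
    b=$(printf '%s\n' "$2" | to_lower)
    [ "$a" = "$b" ]
}
```

For example, same_name "Smith" "SMITH" succeeds, while same_name "Smith" "Jones" fails.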
Other than that, I have used utilities such as getopts (to combine options), grep, sort, cut, and if statements and while loops in conjunction with test. I feel that these were generally the simplest solutions to the problems faced. Grep -n was especially useful, as it prints line numbers: exactly what was required for the position in the league.
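The grep -n trick can be sketched as follows. The input layout (a time followed by a name on each line) and the function name rank_lines are assumptions made for the example, not the format of the real results file.

```shell
#!/bin/sh
# Hypothetical helper: read "time name" lines on standard input, sort them
# numerically by time, then let grep -n prefix each line with its line
# number, which doubles as the league position.
rank_lines() {
    # LC_ALL=C keeps the decimal point handling predictable.
    LC_ALL=C sort -n | grep -n '.'
}
```

Running printf '12.5 bob\n11.9 alice\n' | rank_lines prints "1:11.9 alice" followed by "2:12.5 bob".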
One thing I have used repeatedly throughout my code is a while loop of the form:
counter=1
max=`grep -c . temp`
while [ $counter != `expr $max + 1` ]
do
    variable=`awk '{ print $string }' file`
    variable=`echo $variable | cut -f$counter -d ' '`
    .... do some task involving variable ....
    counter=`expr $counter + 1`
done
I have used this as a way of splitting up a variable which is initially in the form string1 string2 string.... into separate variables, one per iteration, which can then be used by grep (for example) as ways of testing if lines with that string exist in another file.
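A runnable sketch of this pattern is given here, under stated assumptions: the function name names_in_file is hypothetical, the list is separated by single spaces, and grep -w (whole-word matching, widely available but not required by POSIX) is used for the lookup.

```shell
#!/bin/sh
# Hypothetical helper: print each word of a space-separated list ($1)
# that appears as a whole word somewhere in a file ($2).
names_in_file() {
    list=$1
    file=$2
    # awk's NF gives the number of words, which is the loop bound.
    max=$(printf '%s\n' "$list" | awk '{ print NF }')
    counter=1
    while [ "$counter" -le "$max" ]
    do
        # Pick out the counter-th word, one per iteration.
        word=$(printf '%s\n' "$list" | cut -d ' ' -f "$counter")
        # -q: no output, just an exit status; -w: whole words only.
        if grep -qw -- "$word" "$file"
        then
            printf '%s\n' "$word"
        fi
        counter=$((counter + 1))
    done
}
```

Called as names_in_file "alice bob carol" results, it prints only those of the three names that occur in the file named results.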
I have also used uniq to solve the problem of ties - as a way of removing all but one occurrence of a repeated name from the input file.
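A minimal sketch of how uniq can be used here, assuming one name per line for the first function and "time name" lines for the second. Both function names are hypothetical, and the uniq -d step is an illustration of tie detection rather than what the ties script necessarily did. Note that uniq only collapses adjacent duplicates, so the input is sorted first.

```shell
#!/bin/sh
# Hypothetical helper: keep one occurrence of each name read from
# standard input. sort brings duplicates together so uniq can drop them.
unique_names() {
    LC_ALL=C sort | uniq
}

# Hypothetical helper: read "time name" lines and print each time that
# occurs more than once, i.e. the times on which athletes are tied.
tied_times() {
    cut -d ' ' -f 1 | LC_ALL=C sort -n | uniq -d
}
```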
Problems
The problems I have encountered while doing this assignment are almost too numerous to mention. Every time I thought I had solved something I found another little fault had cropped up. Even as I write this (at 1 am on Tuesday 11 March), having worked for about 3 and a half weeks on this program, I am in a state of frustration and annoyance at one or two things which I have not solved (I will discuss them later), and cannot solve without rewriting most of my script. Believe me, because about 3 hours ago I was trying to do just that, and found that it was creating even more problems and errors. I decided to stick with what you see now, as it seemed to me that this script contains the minimum number of errors I have been able to achieve.
I think the major problem I encountered was in trying to combine multiple options. This involved rewriting most of my script at one point, if I remember correctly. The problem was getting one option to recognise that another had been run, and to use the data provided by the previous option to create its own output. In the end, this was solved by having a file called temp being produced by every called script. This way, using the test [ -f temp ] in one script would tell you whether another has been executed. The options' scripts are executed in the order -f, -d, -c, so -f needs no test for the temp file, -d needs to test for it, and -c needs to test for both the temp file, and whether -d has been given, as the forms of the temp files created by printout and dateoption are slightly different. Once these tests have been done, it is possible to use the temp file, if necessary, as the input for the script, rather than the raw data from the results file.
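The general shape of this scheme can be sketched as follows. This is an illustration only, not the original league script: the function and variable names (parse_options, choose_input, fflag, dflag, cflag) are hypothetical, and the option string assumes that -f and -d each take an argument while -c takes none.

```shell
#!/bin/sh
# Hypothetical helper: record which options were given so the stages can
# later be run in the fixed order -f, -d, -c.
parse_options() {
    fflag= dflag= cflag=
    OPTIND=1
    while getopts "f:d:c" opt "$@"
    do
        case $opt in
            f) fflag=$OPTARG ;;   # file name given with -f
            d) dflag=$OPTARG ;;   # date given with -d
            c) cflag=1 ;;         # -c takes no argument
            *) return 1 ;;        # unknown option
        esac
    done
}

# Hypothetical helper: print the file a stage should read. If an earlier
# stage has left a temp file ($1), use it; otherwise fall back to the
# raw results file ($2).
choose_input() {
    if [ -f "$1" ]
    then
        printf '%s\n' "$1"
    else
        printf '%s\n' "$2"
    fi
}
```

A stage would then read from the file named by choose_input temp results, so it works whether or not an earlier option has already run.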
Aside from this, most of the implementation was reasonably easy going (for a given value of reasonable). Before I started trying to combine options my code was a lot more concise (believe it or not); it was only in undergoing this process that it turned into the monolith you see now.
Functionality
As I have already stated, I decided to make the options combine successfully. I also chose to run a test to make sure that the date given with the -d option was valid (i.e. the day must actually exist for the given month, and the month must be 01 to 12).
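A date check of this kind might look like the sketch below. The function name valid_day_month is hypothetical, and it assumes the day and month have already been split out of the date as two-digit strings. Since no year is involved, February is allowed up to 29 days.

```shell
#!/bin/sh
# Hypothetical helper: succeed when day ($1) exists in month ($2).
# Both are expected as two-digit strings such as 07 or 31.
valid_day_month() {
    day=$1
    month=$2
    # Reject anything that is not exactly two digits.
    case $day in [0-9][0-9]) ;; *) return 1 ;; esac
    case $month in [0-9][0-9]) ;; *) return 1 ;; esac
    # Look up the number of days in the month; unknown months fail.
    case $month in
        01|03|05|07|08|10|12) max=31 ;;
        04|06|09|11) max=30 ;;
        02) max=29 ;;   # assumption: no year given, so allow 29
        *) return 1 ;;
    esac
    # Strip a leading zero so the comparison uses a plain decimal number.
    day=${day#0}
    [ "$day" -ge 1 ] && [ "$day" -le "$max" ]
}
```

For example, valid_day_month 31 01 succeeds, while valid_day_month 31 04 and valid_day_month 15 13 both fail.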
Testing
I have thoroughly tested this script. Basic tests (league distance, league -c distance, league -d date distance, league -f file distance, league -c, league -x) have been run and given correct output. Also combinations of options have been tested, and in all cases the output was correct. I have also tested bad data in the club file, and in the results file, for various combinations of options, including both together. I have also tested for case insensitivity, and ties, and achieved correct output. The command also gives correct output for the extra white spaces test, but I think I solved this by fluke, and have yet to work out exactly how.
This brings me (regrettably) to the last section, which I shall call...
Things which didn't work
I have decided that honesty is the best policy and, as you are pretty sure to find these anyway, I might as well put them out in the open.
Firstly, I have only solved the bad data issues for the specific types of bad data given in the automatic test. Any other forms of bad data will likely produce incorrect output. Also, bad data in clubs.dat that does not involve adding an extra line (as in the test) will print out the bad line, but also use it to form the output (although the output IS in the correct form). Also, the combination of -f with more than one item of bad data and another option gives bad output and error messages from various utilities used in the script. Finally (I hope) the -f - option does not work properly if there is bad data typed in by the user, or if there is no input by the user for the given distance.
I believe that is it (or at least, this is what I have spotted). So now, for a bit of light entertainment (and please don't mark me down for this), I'd like to include my favourite saying from the last 4 weeks:
"CS120 Coursework - Like kicking a dead whale down a beach"