All entries for July 2019
July 24, 2019
SOUP: The FizzBuzz Variations
I like challenges - sometimes I do maths and programming puzzles for fun. It's good practice, and if you go into industry as a programmer you may well get hit with this sort of thing in an interview. Interviewers want to verify that somebody can vaguely code before moving on to more detailled questions, so often set something simple, but originally, novel. The novelty has rather been eroded now, so you can find entire sites dedicated to compiling these questions and drilling them - which we definitely don't recommend here - but they are a good source of challenges.
One classic "challenge" is FizzBuzz. You may have encountered this as a game - I remember it from Secondary School French lessons, to help learn the names of numbers. The basic idea is very simple - you go through reading out the numbers, usually from 1 to 100, and for every multiple of three, substitute the word "fizz", for every multiple of five, the word "buzz" and for multiples of three and five (i.e. fifteen) "fizzbuzz".
For anybody who has done maths past GCSE or so, you've probably met Modulo division in some form. But the best thing about FizzBuzz is that you don't need to have. In fact, there is a complete solution without using anything except addition and equality-checking. Moreover, the core problem is nice and simple, so anybody should be able to create the pseudo-code version even if they have no idea how to check for being a multiple of three or five.
I once read an article from somebody who thought that FizzBuzz was a terribly difficult problem that needed high-level maths to solve. In the comments was somebody claiming that anybody who could just write out the code must have spent far too long drilling interview questions. That seemed a little absurd, but it did make me want to produce a FizzBuzz solution which did indicate spending far too much time thinking about it. Just to see if it was possible.
More seriously though, simple problems like this can be made into a great little challenge. There are many solutions, and you can make the problem almost arbitrarily hard by imposing artificial restrictions. You can always verify the answer, and you can get surprisingly creative. I've compiled a bunch of variations on FizzBuzz, from the outline, to the "cleverest" solution I could create, to illustrate the learning opportunity of this sort of thing. They are all on Github here.
FizzBuzz outline code
Anybody who can program should be able to turn the description above into psuedo-code, which is itself a handy skill. Here, all you have to do is expand the words into something slightly more precise. For example:
loop variable i from 1 to 100 if i divides by 3, but not 5, print "fizz" else if i divides by 5, but not 3, print "buzz" else if i divides by 3 and 5, print "fizzbuzz" else print i end loop
So, here is the first place where knowing some maths will help us: I wrote those three branches very carefully to be independent. But they don't need to be! I could just do this:
if i divides by 3: print "fizz" set flag indicating not to print i if i divides by 5: print "buzz" set flag indicating not to print i if not flag: print i
Note that this is not necessarily better! It's different, sure, but to get this to work I need to add an extra flag variable, and have three independent "if"s rather than one "if-else" chain. We can work around this pretty easily, by constructing a string rather than printing it directly, such as (I have this in the repo in Fortran as StringConstructionFizzBuzz.f90)
string = "" if i divides by 3: string = "fizz" if i divides by 5: string += "buzz" if string =="": string = str(i)
Before going on to real code, note the other possible approach, if we didn't have a handy way of checking for divisibility:
loop variable i from 1 to 100 increment counter_3s increment counter_5s if counter_3s equals 3 print "fizz" set flag_not_print_i counter_3s = 1 if counter_5s equals 5 print "buzz" set flag_not_print_i counter_5s = 1 if not flag_not_print_i print i
Both of these "stock solutions" are in the repo in C, Fortran and Python as FizzBuzzModulo and FizzBuzzCounter.
FizzBuzz as a challenge part 1 - pseudo codes
So, what might we come up with as a deliberate variation on this? Well first off, lets fall back onto the "You Aren't Going to Need It" and assume that 3, 5, and 100 are absolutely fixed. Then we have a super simple solution:
print 1 print 2 print "fizz" print 4 print "buzz" ...
Simple, and it works. But it's seriously inflexible and we have to type a lot!
But wait a moment! There's a pattern here, isn't there, and it repeats every 15 lines. So what about
i = 1 while i is less than 100 print i increment i print i increment i print "fizz" increment i print i increment i print "buzz" ... print i increment i print "fizzbuzz" increment i end while
Where the ellipsis (...) is, we have enough lines that i ends up as 15 on iteration 1. The nice thing about this, is that we've relaxed the requirement that the end be at exactly 100. Except we have a rather large bug here: we get a multiple of 15 lines, every time. We'd need to find another way of stopping on the last round. In the GitHub repo for this post, I have a silly example using Bash taking this approach (UnrolledFizzBuzzGenerator.sh). It is undoubtedly silly, but this sort of approach is a pretty long way from the obvious first idea, which was after all the intention.
Note the biggest downsides of this code: it is not obvious what it's doing because it is unpacked a little too much, and the 3 and 5 are baked right into the code.
Fizz Buzz Without If
That last example was my first attempt at writing FizzBuzz without using "if", which technically it did achieve. However, "while" can trivially be used to replace if (assuming a language or variant that runs 0 times if the condition is untrue). We can do better than that, surely?
Of course we can, but first we need to think hard about what "if" is doing for us here.
Understanding What If Does
First, we need to find a way of describing the action of "if" without using the word "if". (Note that modern processors actually have inbuilt instructions called "jump-if" so this isn't strictly true, but helps us find a way to code the problem without using "if"). Note that "on" counts as a synonym here.
Consider a simple one line if chain such as
1 if (condition) 2 do thing one 3 else 4 do thing two 5 end if
If the condition is true, we end up on line 2, otherwise we end up on line 4. Then, we end up on line 5 and carry on. So, putting on our geeky hats and trying to be clever, we might think "aha! We end up on line (2 + 0 (true)|2 (false)).
The next step is far more intuitive if you come from C, or one of the other languages where true is just a number. In C, we can write something like
index = 2 + 2 * (condition != true) and get 2 if the condition is true, and 4 if its not, because "true" has value of 1, and false has 0.
Using an actual "goto" instruction is, of course, evil (not entirely true, but good enough), and in this case would be the essence of solving the letter of the problem, rather than the spirit. We need to think more generally.
Function Pointers
I have written before about using function pointers to replace "if" conditions. In this case we can't set a pointer beforehand, but we have just worked out how to replicate the if chains in our original pseudocode. So, what if we made an array, and used something like the expression above to calculate the offset into it, and then called the function we find there?
This variant works pretty well in C, but in Fortran, where Logicals are a distinct kind of variable, it is very clumsy to calculate the offsets, but can be done. This example is in the repo as FunctionPointerFizzBuzz.f90, .c and .py.
Once again, this a fairly silly way to solve FizzBuzz itself, but does use a technique that can be very very handy! Practice with this sort of thing on a silly example is a great way to learn, and if you keep these sorts of snippets around, you have something to refer to when you need to use the technique in anger later.
FizzBuzz for C programmers
The last example, which is the cleverest I could manage to be about this, is pretty deeply based on C semantics. It relies on both how C handles it's strings, and how its pointers work. I'm not going to explain what is going on, but I thought of it after trying to simplify the function pointers example because of how similar the print commands were. This example is in the repo as SillyStringFizzBuzz.c
This example really goes overboard on the cleverness and is not an approach I would ever use in "real code", but it was fun to write and a good challenge. I think it meets my original challenge pretty well - it is FizzBuzz being entirely too clever. If this sort of things appeals, look up the idea of Code Golf and/or Obfuscated C - coding answers to puzzles which are very very short and hard to understand.
That might sound exceptionally silly - why would anybody want to program in a way that's hard to understand? But of course, knowing what's hard to understand really helps know what makes things easy, and brings to your attention all of the sneaky things that might not do what you first think.
Wrap Up
To conclude - there can be a lot to learn even from super simple seeming challenges. In particular, you can lay the challenge of not using some code feature, especially one you first turn to to solve the problem normally. Developing the flair for thinking "outside the box" is very handy when you have such challenges imposed externally.
Cleverness that makes you say "oh that's clever" is often not a good thing - your first reaction should probably include "oh OK, I see how it works" before "that's clever". But developing clever solutions is a good way to learn. Think of it like something like keepy-uppies for a football (soccer) player. They're not a skill that's useful in a game (the Wikipedia anecdote about Scotland not-withstanding) but they develop control skills that absolutely are. It's also a handy display piece to demonstrate those skills, should you need one.
July 10, 2019
Datastructures – Linked lists part 3
Follow-up to Datastructures – Linked lists part 2 from Research Software Engineering at Warwick
This time we're presenting the final piece of the puzzle for linked lists - adding items. Adding items is very, very similar to removing them. To add an item "I" between two other items (lets call them "A" and "B", with A <-> B initially) you just have to set it up so that A's next pointer points to I, B's prev pointer also points to I and that I's prev pointer goes to A and I's next pointer goes to B. You do then have to take care if either A or B don't exist (i.e. you're adding at the start or the end of the list) but basically the idea is always the same. When you don't have a standard that you have to follow interface design in programming is often as much an art as a science and there are several ways of implementing a useful interface to do this. You can write a routine "insert_between(I,A,B)" that adds an item between two specified items (although this the has to deal with what should happen if you try to insert between two items that are not actually linked to each other and also requires the developer to provide redundant information. If this is going to work then A must be linked to B so if you know A you know B and vice versa). To add at the start of the list you specify "A" to be NULL, and to add at the end of the list you can specify "B" to be NULL. This isn't seen very often in the wild but it works perfectly well. You could also write "insert_after(I,A)" and "insert_before(I,B)" routines that would allow you to insert item I either "after A" or "before B", again specifying appropriate NULL items for A or B to add to the start or the end of the list. The most common approach in the wild at the moment is that used by the C++ Standard Type Library containers. Have a routine "insert(I,B)" that inserts the item before a specified item B, a routine "insert_after(I,A) to insert an item after the specified item, a routine "push_back(I)" that adds a new element to the end of your list and a "push_front(I)" routine to add the new item at the start of the list. While I'm not sure that I'd pick them from scratch I'll stick to the STL
void insert(llitem* head, llitem* new, llitem *before) { new->next = before; if (before->prev) { before->prev->next = new; new->prev = before->prev; } else { head = new; } before->prev = new; } void insert_after(llitem* new, llitem *after) { /*head is not used here because inserting after an item cannot touch the head*/ new->prev = after; if (after->next) { after->next->prev = new; new->next = after->prev; } after->next = new; } void push_front(llitem* head, llitem* new) { if (head){ head->prev = new; } head = new; } void push_back(llitem* head, llitem* new) { llitem * current = head; if (!current) { head = new; } else { /*spin on until you find an item that has no next element. That element is the end of the list*/ while(current->next) current = current->next; current->next = new; new->prev = current; } }
SUBROUTINE insert(head, new, before) TYPE(llitem), POINTER, INTENT(INOUT) :: head !Head of linked list TYPE(llitem), POINTER, INTENT(INOUT) :: new !Item to be inserted TYPE(llitem), POINTER, INTENT(INOUT) :: before !Item to insert before new%next => before IF (ASSOCIATED(before%prev)) THEN before%prev%next => new new%prev => before%prev ELSE head => new END IF before%prev => new END SUBROUTINE insert SUBROUTINE insert_after(new, after) !No head pointer because inserting after an item can't touch the head TYPE(llitem), POINTER, INTENT(INOUT) :: new !Item to be inserted TYPE(llitem), POINTER, INTENT(INOUT) :: after !Item to insert after new%prev => after IF (ASSOCIATED(after%next)) THEN after%next%prev => new new%next => after%next END IF after%next => new END SUBROUTINE insert_after SUBROUTINE push_front(head, new) !No head pointer because inserting after an item can't touch the head TYPE(llitem), POINTER, INTENT(INOUT) :: head !Head of the linked list TYPE(llitem), POINTER, INTENT(INOUT) :: new !Item to be added to the list IF (ASSOCIATED(head)) THEN head%prev => new END IF head => new END SUBROUTINE push_front SUBROUTINE push_front(head, new) !No head pointer because inserting after an item can't touch the head TYPE(llitem), POINTER, INTENT(INOUT) :: head !Head of the linked list TYPE(llitem), POINTER, INTENT(INOUT) :: new !Item to be added to the list TYPE(llitem), POINTER :: current current => head IF (.NOT. ASSOCIATED(current)) THEN head => new ELSE DO WHILE(ASSOCIATED(current%next)) current=>current%next END DO current%next=>new new%prev => current END IF END SUBROUTINE push_front
Once again a diagram will probably make things clearer.
There is one thing that you can spot easily from the implementations : in order to add to the end of the list you have to cycle through all of the intermediate items to get to the end. This isn't actually a major problem since for a lot of the uses of linked lists you can freely choose to add to the start of the list which happens immediately by just updating the head. However if you want to be able to add to the end of a linked list then it is common to actually store an equivalent of the "head" item that marks where the end of the list is. This item is generally called the "tail" item. I haven't described anything about using tails because it doesn't add anything to the fundamental nature of the linked list while making all of the implementation a little bit messier. Holding the tail also introduces a debugging risk because it is (technically) redundant and can be inconsistent with the actual last element that you would reach by iterating through the list (usually only because of errors in your implementation). The addition of multiple sources of information about something always complicates matters and you have to test very carefully to ensure that they can't become inconsistent.
This series has basically described how to implement a very simple linked list (if you wanted to). There are many limitations and weaknesses of this implementation (not being certain to nullify pointers on items when adding them, not having a tail item and also the fact that each item can only be in a single linked list at any point (usually you relax that requirement by having each llitem have a pointer to your real data rather than holding the data within the llitem structure itself. You do have to be careful to make sure that items are fully deleted when the last reference to them is released though!)) but it does, mostly, work. The final part of this series on linked lists is going to go through why you might not want to use them. Not just this limited implementation but when linked lists are not suitable datastructures and how you can create hybrids of linked lists and arrays that can perform much better.