All entries for Wednesday 07 August 2019

August 07, 2019

Fortran Memory Management

Memory Managment

Manually Managing

Memory management is one of the banes of the programmer's life in almost all programming languages. In many languages, such as C, you have to manually pair up all allocations of memory with associated deallocations or you will "leak" memory as your program runs. (Strictly a "memory leak" is when you create memory that you then lose track of in some way, so that you can't then deallocate it even if you want to. From the perspective of your program crashing when you run out of memory there's no difference between a true leak and just piling up unused but technically available memory somewhere so I tend to use the term a bit loosely).

Collecting the Garbage

Other languages, like Java, use mechanisms for counting how many times you use an item and when the last reference to an item is removed then the item will be deleted by something that is generally called a "garbage collector". The problem with these languages is that the garbage collector runs only periodically, not every time an item has its last reference released, so memory use can increase due to items that will be deleted when the garbage collector next runs but haven't been deleted yet.

Destructive Magic

Still other languages like C++ use objects that have systems of constructor and destructor functions that create memory when an object is created and release the memory when the object is deleted. This doesn't immediately sound like it's very helpful because surely that's still just a different way of saying that you have to have matched allocation and deallocation logic? The advantage comes from the fact that simple variables (like integers, floats etc.) don't have these memory management problems, the compiler automatically knows the lifetimes of these variables: they're either global variables the exist for the entire time that the code runs or they are local to a function and exist only when you are in that function (or in other functions called from that function etc.).

So long as you are dealing with a single instance of a C++ class and not a pointer to one or more of them then the compiler has the same level of lifetime guarantee that it has with the simple variables, and so long as the class knows how to allocate its memory when it is created (constructor function) and deallocate its memory when the compiler decides that it is finished with (destructor function) you don't have to worry about matching every single creation of the object with a matching destruction, you simply create them when you want them and let the compiler get rid of them when you are finished with them. To be strictly correct, modern C++ actually recommends against a developer doing memory management at all and recommends using STL containers to hold your data (which do the memory management themselves internally and have correctly implemented constructors and destructors). It really is a good idea to do this but scientific and research codes quite often find the odd edge cases where manually allocating and deallocating memory is the easiest solution.

Getting it for Free (Fortran Rules, OK?)

Since mostly in academic programming we tend to be working with arrays, C++ style objects are a fairly heavyweight way of dealing with problems of allocating and deallocating memory. It feels like it should be possible to have the same advantage of simply allocating an array when you need one but keep the advantage of allowing the compiler to automatically infer the lifetime of the variable and deallocate it when the lifetime is over without needing to go to a fully garbage collected model. In C that isn't really possible because arrays are mostly just pointers and the compiler can't be sure that a pointer is the only pointer to a block of memory. Programming would be impossible if your memory was deallocated as soon as any pointer to that memory went out of scope! But in more restrictive languages it is possible and fortunately Fortran is one of the languages that does have an option to do exactly that, and it's probably a feature of the language that you are already familiar with : the humble ALLOCATABLE array.


Allocatable Arrays in Fortran

Fortran allocatable arrays are very easy to use and anyone working in modern Fortran is probably familiar with them, but their properties are often not so well understood. For example to someone with a C background it feels as though this function should leak memory badly.

  SUBROUTINE alloc_fn(els)
    INTEGER, INTENT(IN) :: els
    INTEGER, DIMENSION(:), ALLOCATABLE :: array

    ALLOCATE(array(els))
  END SUBROUTINE alloc_fn

but it actually doesn't leak any memory at all (although it equally doesn't do anything useful here). It also doesn't crash because you are trying to reallocate an already allocated array. The Fortran standard requires that when an allocatable array goes out of scope it should be deallocated, so that function will run allocate the array as requested and then deallocate it again as soon as the function is over. One question that you might then ask is "What about if I did want the array to stick around?", and as usual in Fortran that is possible by using the SAVE attribute to the array

  SUBROUTINE alloc_fn(els)
    INTEGER, INTENT(IN) :: els
    INTEGER, DIMENSION(:), ALLOCATABLE, SAVE :: array

    IF (.NOT. ALLOCATED(array)) ALLOCATE(array(els))
  END SUBROUTINE alloc_fn

You'll notice that this time I've had to test if the array is allocated because otherwise I would wind up trying to allocate it twice and that will cause a runtime error. That is actually another nice feature of Fortran allocatable arrays that protects you from getting a common way of getting a memory leak in C - you cannot allocate an already allocated allocatable array. You can deallocate memory using a DEALLOCATE(array) statement and this can be useful if you want to explicitly manage the lifetime of your memory, for example if you have large intermediate arrays in a calculation that you don't want to have hanging around while you allocate other intermediate arrays. Many style guides do recommend manually deallocating memory on leaving a function, but that's mainly just a combination of caution and working round (mostly very old) broken compilers that don't comply with the standard.

Simple enough so far, but are there any pitfalls? Yes, but they tend to be a bit subtle.

Show your Intentions

Since Fortran 2003 an array argument to a function can have the "ALLOCATABLE" property, and that means that you can allocate and deallocate the array inside the function, for example

SUBROUTINE alloc_fn(els, array)
    INTEGER, INTENT(IN) :: els
    INTEGER, DIMENSION(:), ALLOCATABLE, INTENT(INOUT) :: array

    IF (ALLOCATED(array)) THEN
      PRINT *, "Deallocating"
      DEALLOCATE(array)
    END IF
    ALLOCATE(array(els))
  END SUBROUTINE alloc_fn

Nothing terribly surprising there. I call my function the first time on an unallocated array and it silently allocates, and if I call it again then it prints "Deallocating" and reallocates the array to the new size.

But what if I switch the INTENT argument of the array from "INOUT" (meaning that I can both read and write data to the array) to "IN" (meaning that I can read data from the array but not make changes). Happily as you might expect the compiler refuses to compile this code because it involves making changes to the array and I've specified that I can only read data from it.

But what about if I switch the intent to "OUT" (meaning that I can put data in the array but not read data from it)? You would probably expect this to work because I'm not using data from the array, but on second thoughts you might expect it to fail because I am testing the allocation status of the array. Well, if you try it it compiles, it runs and it allocates the array as expected. The strange thing is that the "Deallocating" print statement never triggers, and this is exactly how Fortran reads the "INTENT(OUT)" statement for allocatable arrays. Since INTENT(OUT) is supposed to mean that you don't take any information from this variable you must be assuming that it is not allocated when it enters the function SO IT DEALLOCATES IT IF IT IS ALLOCATED! This is useful but you have to be very careful! The same behaviour happens for types that contain allocatable components so watch out!

Coming Next - Pitfalls of Allocatables

There are more things to watch out for with Fortran allocatables, including the behaviour of array bounds when they are passed into functions in different ways, automatic reallocation of variables during array assignment (that sounds good but can cause absolute chaos!), the behaviour of types containing allocatable components and a few similar bits, but those will have to wait for part 2 of this post.


August 2019

Mo Tu We Th Fr Sa Su
Jul |  Today  | Sep
         1 2 3 4
5 6 7 8 9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30 31   

Search this blog

Tags

Galleries

Blog archive

Loading…
Not signed in
Sign in

Powered by BlogBuilder
© MMXXIV