Memory management

The way-too-long-don't-wanna-read summary for what to do when you wanna do dynamic memory management:

  1. Try not to — Ada offers a few features that makes it possible to work with dynamically sized types without manually allocating. Notably, out and in out parameters are by far the preffered way of modifying things from procedures. Subprograms can also take and return values of indefinite types (like String) without issue.
  2. Containers — the standard library containers are automatically managed.
  3. Indefinite holders — the indefinite holder container is a special container (meaning it has automatic memory management) which can hold a single value of an indefinite type. Values inside such a holder are allowed to change size without issue. Unbounded_String is similar, but only for strings.
  4. Controlled types — deriving from Controlled and Limited_Controlled from Ada.Finalization (see RM 7.6) let's you create RAII-style types by overriding Initialize and Finalize, as well as Adjust for Controlled.
  5. Smart pointersGNATCOLL's refcount package implements thread safe reference counting using controlled types. See also the Wikibook for more.
  6. Storage pools — these allow you to effectively override the new keyword with custom allocators, allowing you to implement arena allocators or bind conservative GC's like Boehm without pain.
  7. 'Storage_Size — access types can set the 'Storage_Size attribute with a dynamic value, in which case the compiler will create a storage pool that automatically reclaims the pool when the type is no longer visible.
  8. Unchecked_Deallocation — if none of the above help you, this should be a final resort. It is essentially a type-safe free, and should be treated as such.

Table of contents
  1. Introduction
  2. Do you have to?
  3. Unchecked deallocation
  4. Controlled types
  5. Holders
    1. Aside: containers
  6. Storage pools
    1. Aside: The 'Storage_Size attribute
  7. Garbage collection anyway
  8. Further reading

Introduction

Allocating items in Ada is easy!

with Ada.Text_IO; use Ada.Text_IO;

procedure Main is
    type Integer_Access is access Integer;
    X : Integer_Access := new Integer'(5);
begin
    Put_Line (Integer'Image (X.all));
end Main;

If we run this through Valgrind, however, we might see something troubling.

$ valgrind ./main
==7070== Memcheck, a memory error detector
==7070== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==7070== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==7070== Command: ./main
==7070==
 5
==7070==
==7070== HEAP SUMMARY:
==7070==     in use at exit: 4 bytes in 1 blocks
==7070==   total heap usage: 1 allocs, 0 frees, 4 bytes allocated
==7070==
==7070== LEAK SUMMARY:
==7070==    definitely lost: 4 bytes in 1 blocks
==7070==    indirectly lost: 0 bytes in 0 blocks
==7070==      possibly lost: 0 bytes in 0 blocks
==7070==    still reachable: 0 bytes in 0 blocks
==7070==         suppressed: 0 bytes in 0 blocks
==7070== Rerun with --leak-check=full to see details of leaked memory
==7070==
==7070== For lists of detected and suppressed errors, rerun with: -s
==7070== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

A memory leak! If you have a background in C, C++, Rust, or similar systems-level languages, this might ring some alarm bells. Leaks aren't necessarily bad, but if you only ever allocate and never deallocate, you might soon find yourself using a lot more memory than you actually need to. That holds especially true for Ada, which is meant to be useful for real-time and embedded devices, where memory might not be in an excessive surplus.

The relevant section of the reference manual implies that a compiler is allowed to use automatic reclamation strategies such as garbage collection – but …

An implementation need not support garbage collection [...]

… which in practice means you can't rely on it being supported. Indeed, GNAT does not compile your program to use garbage collection.

Do you have to?

It's worth noting that in the program above, deallocation is not actually necessary, considering we exit immedieatly after where the deallocation would happen anyway, at which point the OS will reclaim the memory.

This is probably one of the first things to consider when dealing with dynamic memory management in Ada. Sometimes, it's not actually needed. The language does provide a few features which makes dynamic memory management unnecessary in certain cases, such as being able to take and return values of dynamic size:

with Ada.Text_IO; use Ada.Text_IO;

procedure Main is

    function Happify (S : String) return String is
    begin
        return S & "!";
    end Happify;

begin
    Put_Line (Happify ("Hello!"));
end Main;

The above program does technically allocate, but in the form of a (secondary) stack, which reclaims the memory automatically. An equivalent and reasonably idiomatic C program needs heap allocation for this:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *happify(const char *s) {
    size_t len = strlen(s);
    char *res = (char *) malloc(len + 2);
    strcpy(res, s);
    res[len] = '!';
    res[len + 1] = 0;
    return res;
}

int main(void) {
  char *happy = happify("Hello!");
  printf("%s\n", happy);
  free(happy);
}

Combine Ada's powerful indefinite types with in out and out parameters, nested subprograms, packages and types, and a quite a few use cases of pointers and dynamic memory management can be substituted with other alternatives.

Still, there are cases where dynamic memory management is actually useful. Where we to put the initial program in a loop, for instance, we could quickly use up a lot of memory. Implementing recursive types such as ASTs is also an area where dynamic memory management can be useful.

Unchecked deallocation

If you've ever used malloc and free in C, this is pretty much the same deal. It is a pain to use, and for good reason, but it does do the job and deallocates what you give it.

with Ada.Unchecked_Deallocation;
with Ada.Text_IO;                use Ada.Text_IO;

procedure Main is
    type Integer_Access is access Integer;
    procedure Free is new Ada.Unchecked_Deallocation (Integer, Integer_Access);

    X : Integer_Access := new Integer'(5);
begin
    Put_Line (Integer'Image (X.all));
    Free (X);
end Main;

Run it through Valgrind, et voilĂ !

$ valgrind ./main
==10548== Memcheck, a memory error detector
==10548== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==10548== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==10548== Command: ./main
==10548==
 5
==10548==
==10548== HEAP SUMMARY:
==10548==     in use at exit: 0 bytes in 0 blocks
==10548==   total heap usage: 1 allocs, 1 frees, 4 bytes allocated
==10548==
==10548== All heap blocks were freed -- no leaks are possible
==10548==
==10548== For lists of detected and suppressed errors, rerun with: -s
==10548== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

1 allocation and 1 free. Wonderful!

Maybe not. See, the above is pretty verbose, even for Ada. Compare how easy it is to just new Integer'(5) versus how annoying it is to first have to import the procedure, then instantiate it, and only then can you actually go do manual memory management – something I don't even wanna do anyway!

Whether this is on purpose or not, I don't know, but it does serve to discourage you from using this method of memory management. For one, it is pretty error prone (something we don't like in Ada land!)

Controlled types

Controlled types offer RAII-like constructs that lets you write custom code on initialization, adjustment, and finalization. Note here that Ada differentiates between finalization and deallocation. The former is some process that automatically happens when an item is no longer accessible, whereas deallocation actually reclaims memory.

with Ada.Finalization; use Ada.Finalization;
with Ada.Text_IO;      use Ada.Text_IO;

procedure Main is
    -- Derive Ada.Finalization.Controlled to
    -- create a controlled type.
    type Integer_Holder is new Controlled with record
        Value : Integer;
    end record;

    -- Initialize(in out T) and Finalize(in out T)
    -- can then be implemented with the necessary
    -- logic.
    procedure Finalize (I : in out Integer_Holder) is
    begin
        Put_Line ("Finalized!");
    end Finalize;

    X : Integer_Holder := Integer_Holder'(Controlled with 5);
begin
    Put_Line (Integer'Image (X.Value));
end Main;

Note that there's no new anywhere, so the above doesn't actually allocate anything. Controlled types can in fact be used for more than just dynamic memory management – that just happens to be one area where it can be useful.

It's worth taking time to understand the difference between finalization and deallocation. Let's say we make a small adjustment to the program above:

type Holder_Access is access Integer_Holder;
X : Holder_Access := new Integer_Holder'(Controlled with Value => 5);

If we now run the program to Valgrind, we'll see that our finalizer runs, but the allocated memory isn't actually deallocated.

$ valgrind ./main
==13306== Memcheck, a memory error detector
==13306== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==13306== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==13306== Command: ./main
==13306==
 5
Finalized!
==13306==
==13306== HEAP SUMMARY:
==13306==     in use at exit: 32 bytes in 1 blocks
==13306==   total heap usage: 1 allocs, 0 frees, 32 bytes allocated
==13306==
==13306== LEAK SUMMARY:
==13306==    definitely lost: 32 bytes in 1 blocks
==13306==    indirectly lost: 0 bytes in 0 blocks
==13306==      possibly lost: 0 bytes in 0 blocks
==13306==    still reachable: 0 bytes in 0 blocks
==13306==         suppressed: 0 bytes in 0 blocks
==13306== Rerun with --leak-check=full to see details of leaked memory
==13306==
==13306== For lists of detected and suppressed errors, rerun with: -s
==13306== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Controlled types provide a convenient way to know when it may be appropriate to deallocate, but not how. Something to note is that controlled types do work nicely when nested – if a controlled type contains a field that is itself a value of a controlled type, both types' initializers, adjusters and finalizers get called. Finalizers are also called even when exceptions are raised.

One idea you might have is to combine finalizers with Unchecked_Deallocation to create something like this:

with Ada.Finalization; use Ada.Finalization;
with Ada.Text_IO;      use Ada.Text_IO;
with Ada.Unchecked_Deallocation;

procedure Main is
    type Integer_Access is access Integer;
    procedure Free is new Ada.Unchecked_Deallocation (Integer, Integer_Access);

    type My_Int is new Controlled with record
        Value : Integer_Access;
    end record;

    procedure Finalize (I : in out My_Int) is
    begin
        Free (I.Value);
    end Finalize;

    Value : Integer_Access := new Integer'(5);
    X     : My_Iny         := My_Int'(Controlled with Value => Value);
begin
    Put_Line (Integer'Image (X.Value.all));
end Main;

The above works – run it through Valgrind, and “no leaks are possible”. However, the above fails terribly the moment you start moving variables around:

declare
    Y : My_Int;
begin
    Put_Line (Integer'Image (X.Value.all));
    Y := X;
    Put_Line (Integer'Image (Y.Value.all));
end; -- Double free!

The problem is that My_Int assume singular ownership over the Value field, which is an assumption that breaks once you copy things around, because suddenly there are two values which both believe they own the Value access.

The above can, however, be a useful starting point for creating your own smart pointers, such as reference counters which automatically deallocate once no more objects use it. A fairly simple implementation is not too difficult and can be a nice exercise. There are also a few libraries out there, such as GNATCOLL's refcount package , which uses similar techniques to achieve automatic memory management.

Additionally, the type Limited_Controlled, which is a limited type, meaning you have the ability to restrict such assignments.

Holders

Ada's standard containers package has a generic pacakge, Indefinite_Holders, which exists to mainly to make indefinite types like String nicer to use , with an automatically and dynamically managed interface.

Because these holders use automatic dynamic memory management, we can in fact use them as a kind of sneaky, automatically managed pointer:

with Ada.Containers.Indefinite_Holders;
with Ada.Text_IO; use Ada.Text_IO;

procedure Main is
    package Ints is new Ada.Containers.Indefinite_Holders (Integer);
    use Ints;

    I : Ints.Holder := To_Holder (5);
begin
    Put_Line (Integer'Image (Element (I)));
end Main;

Run this through Valgrind, and sure enough – memory is both allocated and freed without leaks!

$ valgrind ./main
==19038== Memcheck, a memory error detector
==19038== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==19038== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==19038== Command: ./main
==19038==
 5
==19038==
==19038== HEAP SUMMARY:
==19038==     in use at exit: 0 bytes in 0 blocks
==19038==   total heap usage: 2 allocs, 2 frees, 20 bytes allocated
==19038==
==19038== All heap blocks were freed -- no leaks are possible
==19038==
==19038== For lists of detected and suppressed errors, rerun with: -s
==19038== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Holders, similar to controlled types, are deallocated properly in the face of exceptions. Unlike a naive controlled types/unchecked deallocation implementation, however, they also respect copying and avoid double frees.

Holders are markantly different from access types, and do not serve as a complete replacement. They make indefinite types a lot nicer to use, but they are difficult if not impossible to use with, say, recursive types.

Aside: containers

The indefinite holders are one example of Ada's standard containers, all of which are automatically managed. This means that we can use vectors, sets and hashmaps in a relatively care-free way, without worrying about manually managing the memory.

with Ada.Containers.Vectors;
with Ada.Text_IO; use Ada.Text_IO;

procedure Main is
    package Int_Vector is new Ada.Containers.Vectors
       (Index_Type => Natural, Element_Type => Integer);
    use Int_Vector;

    V : Vector := 1 & 2 & 3 & 4;
begin
    V.Append (5);
    V.Prepend (0);

    for Item of V loop
        Put_Line (Integer'Image (Item));
    end loop;
end Main;
$ valgrind ./main
==20066== Memcheck, a memory error detector
==20066== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==20066== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==20066== Command: ./main
==20066==
 0
 1
 2
 3
 4
 5
==20066==
==20066== HEAP SUMMARY:
==20066==     in use at exit: 0 bytes in 0 blocks
==20066==   total heap usage: 8 allocs, 8 frees, 152 bytes allocated
==20066==
==20066== All heap blocks were freed -- no leaks are possible
==20066==
==20066== For lists of detected and suppressed errors, rerun with: -s
==20066== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Storage pools

Personally, my mind goes to the heap when I think about dynamic memory management. Whereas the stack is "automatically" managed as functions and scopes are entered and exited, the heap allows for a more “random-access” style of managing memory. In C, we typically access the heap with malloc() and free(), which tends to be one of the primary ways we do dynamic memory management there.

Instead of talking about the heap, we generally use the term storage pool in Ada. A storage pool is basically just some object that let's us allocate and deallocate at will. By default , GNAT compiles uses of new and Unchecked_Deallocation into __gnat_malloc and __gnat_free, which are pretty nmuch the same as C's malloc and free:

void *__gnat_malloc(size_t size);
void __gnat_free(void *ptr);

Thus the “default storage pool” is what we tend to think of as the heap. However, you can actually implement your own storage pools, and these are not required to use the heap at all. Once you have a storage pool, you can then change the 'Storage_Pool attribute on access types, such that any uses of new for that access type will use that pool.

Creating a custom storage pool involves inheriting from Root_Storage_Pool in System.Storage_Pools, and overriding the Allocate, Deallocate, and Storage_Size subprograms.

As an example, we can create a custom storage pool which allocates into a fixed portion of memory. First, we need to create a type

with System;                  use System;
with System.Storage_Elements; use System.Storage_Elements;
with System.Storage_Pools;    use System.Storage_Pools;

package Fixed_Pools is

    type Fixed_Pool is new Root_Storage_Pool with record
        Size  : Storage_Count := 0;
        Store : Storage_Array (0 .. 1_024);
    end record;
    

Here, the type Fixed_Pool inherits from Root_Storage_Pool, which makes it a storage pool, and contains a Store array and a Size number. Store is where the actual data will be allocated into, and Size keeps track of the index of the last element.

    overriding procedure Deallocate
       (Self      : in out Fixed_Pool;
        Addr      :        Address;
        Size      :        Storage_Count;
        Alignment :        Storage_Count)
    is null;

    overriding function Storage_Size
       (Self : Fixed_Pool) return Storage_Count
    is (Self.Store'Last);
        

In this example, we don't actually care about making sure memory is deallocated properly, since it is just a simple bump allocator, so the Deallocate function does nothing.

The Storage_Size function measures “the number of storage elements reserved for the pool” (RM 13.11). How much this comes up in practice I'm not sure, but it effectively tells you how much you can allocate in a given pool

Finally, there's the allocation:

    overriding procedure Allocate
       (Self      : in out Fixed_Pool;
        Addr      :    out Address;
        Size      :        Storage_Count;
        Alignment :        Storage_Count)
    is
    begin
        Addr      := Self.Store (Self.Size)'Address;
        Self.Size := Self.Size + Size;
    end Allocate;

end Fixed_Pools;

We simply bump up the Size field, and return the appropriate address.

With that out of the way, we can now use our very own super-simple bump allocator for dynamic memory management!

with Ada.Text_IO; use Ada.Text_IO;
with Fixed_Pools; use Fixed_Pools;

procedure Main is
    Pool : Fixed_Pool;  -- (1)

    type Integer_Access is access Integer;
    for Integer_Access'Storage_Pool use Pool;  -- (2)

    X : Integer_Access := new Integer'(5);  -- (3)
begin
    Put_Line (Integer'Image (X.all));
end Main;

At (1), we actually create a Fixed_Pool which we will be allocating into, and specify that our Integer_Access type will use that pool in (2). Finally, at (3), we can use the Integer_Access type like any other access type, and allocate into it with new. This will actually call our custom allocation function instead of calling out to __gnat_malloc.

Naturally, the above storage pool is a pretty bad storage pool for most use cases – for one, if you just use it enough, it will fail with a Constraint_Error as we attempt to allocate past the end of the array. It does however demonstrate two important things:

  1. We can have very precise control over where and how things get allocated on a type-by-type basis.
  2. We don't need to use the heap for new.

When dealing with lots of tree-like and recursive structures, this is really nice, because it means that we can create a fairly simple bump allocator which the type uses, and once we're done with the trees, we can free it all in one go, without having to worry about leaks and use-after-frees.

Aside: The 'Storage_Size attribute

The section of the reference manual quoted above mentions something interesting (RM 13.11, emphasis mine):

If Storage_Size is specified for an access type T, an implementation-defined pool P is used for the type. The Storage_Size of P is at least that requested, and the storage for P is reclaimed when the master containing the declaration of the access type is left . [...] The storage pool P is used only for allocators returning type P or other access types specified to use T'Storage_Pool.

If we specify 'Storage_Size for some access type, the memory used to allocate objects of that access type is automatically reclaimed when that type is no longer visible!

procedure Do_Lots_Of_Work (N : Natural) is
    type Integer_Access is access Integer;
    for Integer_Access'Storage_Size use N * Integer'Size;

    X : Integer_Access;
begin
    for I in 1 .. N loop
        X := new Integer'(I);
        Put_Line (Integer'Image (X.all));
    end loop;
end Do_Lots_Of_Work;

Running the above with Valgrind shows that no heap allocations occur, and if we decompile and sift through the assembly, it seems like GNAT essentially allocates allocates our integers on the stack. As such, this is probably one of the best ways of using dynamic memory management if you can know the number of elements ahead of time.

This is actually quite powerful, since, as we see above, the 'Storage_Size attribute doesn't need to be a compile-time constant. As long as we are able to figure out the maximum number of elements we're going to allocate, this is a fairly nice and usable arena-like alternative.

Garbage collection anyway

As mentioned, Ada supports garbage collectors, but doesn't require them. But maybe we are inspired by the abilities of the storage pool and figure that we can just call some external garbage collector within the Allocate function.

This is in fact very possible: see for instance ytomino's Boehm GC Ada interface , which uses storage pools together with the Boehm-Demers-Weiser garbage collector to achieve nice and easy GC with your run-of-the-mill GNAT compiler. It should be noted that it has limitations: for one you can't just plug in the library and all of your previous non-GC access types will be GC'd; you have to specify the GC storage pool for all the access types you want to use. Secondly, the particular implementation above does not support multithreading. It probably is possible to use an external collector and support multithreading, but I'm not aware of what steps need to be taken to do that.

Further reading