Memory management
The way-too-long-don't-wanna-read summary for what to do when you wanna do dynamic memory management:
-
Try not to — Ada offers a few features
that makes it possible to work with dynamically sized types
without manually allocating. Notably,
out
andin out
parameters are by far the preffered way of modifying things from procedures. Subprograms can also take and return values of indefinite types (likeString
) without issue. - Containers — the standard library containers are automatically managed.
-
Indefinite holders — the indefinite
holder container is a special container (meaning it has
automatic memory management) which can hold a single value of an
indefinite type. Values inside such a holder are allowed to
change size without issue.
Unbounded_String
is similar, but only for strings. -
Controlled types — deriving from
Controlled
andLimited_Controlled
fromAda.Finalization
(see RM 7.6) let's you create RAII-style types by overridingInitialize
andFinalize
, as well asAdjust
forControlled
. - Smart pointers — GNATCOLL's refcount package implements thread safe reference counting using controlled types. See also the Wikibook for more.
-
Storage pools — these allow you to
effectively override the
new
keyword with custom allocators, allowing you to implement arena allocators or bind conservative GC's like Boehm without pain. -
'Storage_Size
— access types can set the'Storage_Size
attribute with a dynamic value, in which case the compiler will create a storage pool that automatically reclaims the pool when the type is no longer visible. -
Unchecked_Deallocation
— if none of the above help you, this should be a final resort. It is essentially a type-safefree
, and should be treated as such.
Table of contents
Introduction
Allocating items in Ada is easy!
with Ada.Text_IO; use Ada.Text_IO; procedure Main is type Integer_Access is access Integer; X : Integer_Access := new Integer'(5); begin Put_Line (Integer'Image (X.all)); end Main;
If we run this through Valgrind, however, we might see something troubling.
$ valgrind ./main ==7070== Memcheck, a memory error detector ==7070== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al. ==7070== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info ==7070== Command: ./main ==7070== 5 ==7070== ==7070== HEAP SUMMARY: ==7070== in use at exit: 4 bytes in 1 blocks ==7070== total heap usage: 1 allocs, 0 frees, 4 bytes allocated ==7070== ==7070== LEAK SUMMARY: ==7070== definitely lost: 4 bytes in 1 blocks ==7070== indirectly lost: 0 bytes in 0 blocks ==7070== possibly lost: 0 bytes in 0 blocks ==7070== still reachable: 0 bytes in 0 blocks ==7070== suppressed: 0 bytes in 0 blocks ==7070== Rerun with --leak-check=full to see details of leaked memory ==7070== ==7070== For lists of detected and suppressed errors, rerun with: -s ==7070== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
A memory leak! If you have a background in C, C++, Rust, or similar systems-level languages, this might ring some alarm bells. Leaks aren't necessarily bad, but if you only ever allocate and never deallocate, you might soon find yourself using a lot more memory than you actually need to. That holds especially true for Ada, which is meant to be useful for real-time and embedded devices, where memory might not be in an excessive surplus.
The relevant section of the reference manual implies that a compiler is allowed to use automatic reclamation strategies such as garbage collection – but …
An implementation need not support garbage collection [...]
… which in practice means you can't rely on it being supported. Indeed, GNAT does not compile your program to use garbage collection.
Do you have to?
It's worth noting that in the program above, deallocation is not actually necessary, considering we exit immedieatly after where the deallocation would happen anyway, at which point the OS will reclaim the memory.
This is probably one of the first things to consider when dealing with dynamic memory management in Ada. Sometimes, it's not actually needed. The language does provide a few features which makes dynamic memory management unnecessary in certain cases, such as being able to take and return values of dynamic size:
with Ada.Text_IO; use Ada.Text_IO; procedure Main is function Happify (S : String) return String is begin return S & "!"; end Happify; begin Put_Line (Happify ("Hello!")); end Main;
The above program does technically allocate, but in the form of a (secondary) stack, which reclaims the memory automatically. An equivalent and reasonably idiomatic C program needs heap allocation for this:
#include <stdio.h> #include <stdlib.h> #include <string.h> char *happify(const char *s) { size_t len = strlen(s); char *res = (char *) malloc(len + 2); strcpy(res, s); res[len] = '!'; res[len + 1] = 0; return res; } int main(void) { char *happy = happify("Hello!"); printf("%s\n", happy); free(happy); }
Combine Ada's powerful indefinite types with in out
and
out
parameters, nested subprograms, packages and types,
and a quite a few use cases of pointers and dynamic memory
management can be substituted with other alternatives.
Still, there are cases where dynamic memory management is actually useful. Where we to put the initial program in a loop, for instance, we could quickly use up a lot of memory. Implementing recursive types such as ASTs is also an area where dynamic memory management can be useful.
Unchecked deallocation
If you've ever used malloc
and free
in C,
this is pretty much the same deal. It is a pain to use, and for
good reason, but it does do the job and deallocates what you give
it.
with Ada.Unchecked_Deallocation; with Ada.Text_IO; use Ada.Text_IO; procedure Main is type Integer_Access is access Integer; procedure Free is new Ada.Unchecked_Deallocation (Integer, Integer_Access); X : Integer_Access := new Integer'(5); begin Put_Line (Integer'Image (X.all)); Free (X); end Main;
Run it through Valgrind, et voilĂ !
$ valgrind ./main ==10548== Memcheck, a memory error detector ==10548== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al. ==10548== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info ==10548== Command: ./main ==10548== 5 ==10548== ==10548== HEAP SUMMARY: ==10548== in use at exit: 0 bytes in 0 blocks ==10548== total heap usage: 1 allocs, 1 frees, 4 bytes allocated ==10548== ==10548== All heap blocks were freed -- no leaks are possible ==10548== ==10548== For lists of detected and suppressed errors, rerun with: -s ==10548== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
1 allocation and 1 free. Wonderful!
Maybe not. See, the above is pretty verbose, even for Ada. Compare
how easy it is to just new Integer'(5)
versus how
annoying it is to first have to import the procedure, then
instantiate it, and only then can you actually go do manual memory
management – something I don't even wanna do anyway!
Whether this is on purpose or not, I don't know, but it does serve to discourage you from using this method of memory management. For one, it is pretty error prone (something we don't like in Ada land!)
Controlled types
Controlled types offer RAII-like constructs that lets you write custom code on initialization, adjustment, and finalization. Note here that Ada differentiates between finalization and deallocation. The former is some process that automatically happens when an item is no longer accessible, whereas deallocation actually reclaims memory.
with Ada.Finalization; use Ada.Finalization; with Ada.Text_IO; use Ada.Text_IO; procedure Main is -- Derive Ada.Finalization.Controlled to -- create a controlled type. type Integer_Holder is new Controlled with record Value : Integer; end record; -- Initialize(in out T) and Finalize(in out T) -- can then be implemented with the necessary -- logic. procedure Finalize (I : in out Integer_Holder) is begin Put_Line ("Finalized!"); end Finalize; X : Integer_Holder := Integer_Holder'(Controlled with 5); begin Put_Line (Integer'Image (X.Value)); end Main;
Note that there's no new
anywhere, so the above doesn't
actually allocate anything. Controlled types can in fact be used for
more than just dynamic memory management – that just happens
to be one area where it can be useful.
It's worth taking time to understand the difference between finalization and deallocation. Let's say we make a small adjustment to the program above:
type Holder_Access is access Integer_Holder; X : Holder_Access := new Integer_Holder'(Controlled with Value => 5);
If we now run the program to Valgrind, we'll see that our finalizer runs, but the allocated memory isn't actually deallocated.
$ valgrind ./main ==13306== Memcheck, a memory error detector ==13306== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al. ==13306== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info ==13306== Command: ./main ==13306== 5 Finalized! ==13306== ==13306== HEAP SUMMARY: ==13306== in use at exit: 32 bytes in 1 blocks ==13306== total heap usage: 1 allocs, 0 frees, 32 bytes allocated ==13306== ==13306== LEAK SUMMARY: ==13306== definitely lost: 32 bytes in 1 blocks ==13306== indirectly lost: 0 bytes in 0 blocks ==13306== possibly lost: 0 bytes in 0 blocks ==13306== still reachable: 0 bytes in 0 blocks ==13306== suppressed: 0 bytes in 0 blocks ==13306== Rerun with --leak-check=full to see details of leaked memory ==13306== ==13306== For lists of detected and suppressed errors, rerun with: -s ==13306== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Controlled types provide a convenient way to know when it may be appropriate to deallocate, but not how. Something to note is that controlled types do work nicely when nested – if a controlled type contains a field that is itself a value of a controlled type, both types' initializers, adjusters and finalizers get called. Finalizers are also called even when exceptions are raised.
One idea you might have is to combine finalizers with
Unchecked_Deallocation
to create something like this:
with Ada.Finalization; use Ada.Finalization; with Ada.Text_IO; use Ada.Text_IO; with Ada.Unchecked_Deallocation; procedure Main is type Integer_Access is access Integer; procedure Free is new Ada.Unchecked_Deallocation (Integer, Integer_Access); type My_Int is new Controlled with record Value : Integer_Access; end record; procedure Finalize (I : in out My_Int) is begin Free (I.Value); end Finalize; Value : Integer_Access := new Integer'(5); X : My_Iny := My_Int'(Controlled with Value => Value); begin Put_Line (Integer'Image (X.Value.all)); end Main;
The above works – run it through Valgrind, and “no leaks are possible”. However, the above fails terribly the moment you start moving variables around:
declare Y : My_Int; begin Put_Line (Integer'Image (X.Value.all)); Y := X; Put_Line (Integer'Image (Y.Value.all)); end; -- Double free!
The problem is that My_Int
assume singular ownership
over the Value
field, which is an assumption that
breaks once you copy things around, because suddenly there are
two values which both believe they own the Value
access.
The above can, however, be a useful starting point for creating your own smart pointers, such as reference counters which automatically deallocate once no more objects use it. A fairly simple implementation is not too difficult and can be a nice exercise. There are also a few libraries out there, such as GNATCOLL's refcount package , which uses similar techniques to achieve automatic memory management.
Additionally, the type Limited_Controlled
, which is a
limited type, meaning you have the ability to restrict
such assignments.
Holders
Ada's standard containers package has a generic pacakge,
Indefinite_Holders
, which exists to mainly to make
indefinite types like String
nicer to use
, with an automatically and dynamically managed interface.
Because these holders use automatic dynamic memory management, we can in fact use them as a kind of sneaky, automatically managed pointer:
with Ada.Containers.Indefinite_Holders; with Ada.Text_IO; use Ada.Text_IO; procedure Main is package Ints is new Ada.Containers.Indefinite_Holders (Integer); use Ints; I : Ints.Holder := To_Holder (5); begin Put_Line (Integer'Image (Element (I))); end Main;
Run this through Valgrind, and sure enough – memory is both allocated and freed without leaks!
$ valgrind ./main ==19038== Memcheck, a memory error detector ==19038== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al. ==19038== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info ==19038== Command: ./main ==19038== 5 ==19038== ==19038== HEAP SUMMARY: ==19038== in use at exit: 0 bytes in 0 blocks ==19038== total heap usage: 2 allocs, 2 frees, 20 bytes allocated ==19038== ==19038== All heap blocks were freed -- no leaks are possible ==19038== ==19038== For lists of detected and suppressed errors, rerun with: -s ==19038== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Holders, similar to controlled types, are deallocated properly in the face of exceptions. Unlike a naive controlled types/unchecked deallocation implementation, however, they also respect copying and avoid double frees.
Holders are markantly different from access types, and do not serve as a complete replacement. They make indefinite types a lot nicer to use, but they are difficult if not impossible to use with, say, recursive types.
Aside: containers
The indefinite holders are one example of Ada's standard containers, all of which are automatically managed. This means that we can use vectors, sets and hashmaps in a relatively care-free way, without worrying about manually managing the memory.
with Ada.Containers.Vectors; with Ada.Text_IO; use Ada.Text_IO; procedure Main is package Int_Vector is new Ada.Containers.Vectors (Index_Type => Natural, Element_Type => Integer); use Int_Vector; V : Vector := 1 & 2 & 3 & 4; begin V.Append (5); V.Prepend (0); for Item of V loop Put_Line (Integer'Image (Item)); end loop; end Main;
$ valgrind ./main ==20066== Memcheck, a memory error detector ==20066== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al. ==20066== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info ==20066== Command: ./main ==20066== 0 1 2 3 4 5 ==20066== ==20066== HEAP SUMMARY: ==20066== in use at exit: 0 bytes in 0 blocks ==20066== total heap usage: 8 allocs, 8 frees, 152 bytes allocated ==20066== ==20066== All heap blocks were freed -- no leaks are possible ==20066== ==20066== For lists of detected and suppressed errors, rerun with: -s ==20066== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Storage pools
Personally, my mind goes to the heap when I think about dynamic
memory management. Whereas the stack is "automatically" managed as
functions and scopes are entered and exited, the heap allows for
a more “random-access” style of managing memory. In C,
we typically access the heap with malloc()
and
free()
, which tends to be one of the primary ways we do
dynamic memory management there.
Instead of talking about the heap, we generally use the term
storage pool in Ada. A storage pool is basically just some
object that let's us allocate and deallocate at will.
By default
, GNAT compiles uses of new
and
Unchecked_Deallocation
into __gnat_malloc
and __gnat_free
, which are pretty nmuch the same as C's
malloc
and free
:
void *__gnat_malloc(size_t size); void __gnat_free(void *ptr);
Thus the “default storage pool” is what we tend to think
of as the heap. However, you can actually implement your own storage
pools, and these are not required to use the heap at all. Once you
have a storage pool, you can then change the
'Storage_Pool
attribute on access types, such that any
uses of new
for that access type will use that pool.
Creating a custom storage pool involves inheriting from
Root_Storage_Pool
in System.Storage_Pools
,
and overriding the Allocate
, Deallocate
,
and Storage_Size
subprograms.
As an example, we can create a custom storage pool which allocates into a fixed portion of memory. First, we need to create a type
with System; use System; with System.Storage_Elements; use System.Storage_Elements; with System.Storage_Pools; use System.Storage_Pools; package Fixed_Pools is type Fixed_Pool is new Root_Storage_Pool with record Size : Storage_Count := 0; Store : Storage_Array (0 .. 1_024); end record;
Here, the type Fixed_Pool
inherits from
Root_Storage_Pool
, which makes it a storage pool, and
contains a Store
array and a Size
number.
Store
is where the actual data will be allocated into,
and Size
keeps track of the index of the last element.
overriding procedure Deallocate (Self : in out Fixed_Pool; Addr : Address; Size : Storage_Count; Alignment : Storage_Count) is null; overriding function Storage_Size (Self : Fixed_Pool) return Storage_Count is (Self.Store'Last);
In this example, we don't actually care about making sure memory is
deallocated properly, since it is just a simple bump allocator, so
the Deallocate
function does nothing.
The Storage_Size
function measures “the number of
storage elements reserved for the pool”
(RM 13.11).
How much this comes up in practice I'm not sure, but it effectively
tells you how much you can allocate in a given pool
Finally, there's the allocation:
overriding procedure Allocate (Self : in out Fixed_Pool; Addr : out Address; Size : Storage_Count; Alignment : Storage_Count) is begin Addr := Self.Store (Self.Size)'Address; Self.Size := Self.Size + Size; end Allocate; end Fixed_Pools;
We simply bump up the Size
field, and return the
appropriate address.
With that out of the way, we can now use our very own super-simple bump allocator for dynamic memory management!
with Ada.Text_IO; use Ada.Text_IO; with Fixed_Pools; use Fixed_Pools; procedure Main is Pool : Fixed_Pool; -- (1) type Integer_Access is access Integer; for Integer_Access'Storage_Pool use Pool; -- (2) X : Integer_Access := new Integer'(5); -- (3) begin Put_Line (Integer'Image (X.all)); end Main;
At (1), we actually create a Fixed_Pool
which we will
be allocating into, and specify that our Integer_Access
type will use that pool in (2). Finally, at (3), we can use the
Integer_Access
type like any other access type, and
allocate into it with new
. This will actually call our
custom allocation function instead of calling out to
__gnat_malloc
.
Naturally, the above storage pool is a pretty bad storage pool for
most use cases – for one, if you just use it enough, it will
fail with a Constraint_Error
as we attempt to allocate
past the end of the array. It does however demonstrate two important
things:
- We can have very precise control over where and how things get allocated on a type-by-type basis.
-
We don't need to use the heap for
new
.
When dealing with lots of tree-like and recursive structures, this is really nice, because it means that we can create a fairly simple bump allocator which the type uses, and once we're done with the trees, we can free it all in one go, without having to worry about leaks and use-after-frees.
Aside: The 'Storage_Size
attribute
The section of the reference manual quoted above mentions something interesting (RM 13.11, emphasis mine):
If Storage_Size is specified for an access type
T
, an implementation-defined poolP
is used for the type. The Storage_Size ofP
is at least that requested, and the storage forP
is reclaimed when the master containing the declaration of the access type is left . [...] The storage poolP
is used only for allocators returning typeP
or other access types specified to useT'Storage_Pool
.
If we specify 'Storage_Size
for some access type, the
memory used to allocate objects of that access type is automatically
reclaimed when that type is no longer visible!
procedure Do_Lots_Of_Work (N : Natural) is type Integer_Access is access Integer; for Integer_Access'Storage_Size use N * Integer'Size; X : Integer_Access; begin for I in 1 .. N loop X := new Integer'(I); Put_Line (Integer'Image (X.all)); end loop; end Do_Lots_Of_Work;
Running the above with Valgrind shows that no heap allocations occur, and if we decompile and sift through the assembly, it seems like GNAT essentially allocates allocates our integers on the stack. As such, this is probably one of the best ways of using dynamic memory management if you can know the number of elements ahead of time.
This is actually quite powerful, since, as we see above, the
'Storage_Size
attribute doesn't need to be a
compile-time constant. As long as we are able to figure out the
maximum number of elements we're going to allocate, this is a fairly
nice and usable arena-like alternative.
Garbage collection anyway
As mentioned, Ada supports garbage collectors, but doesn't require
them. But maybe we are inspired by the abilities of the storage pool
and figure that we can just call some external garbage collector
within the Allocate
function.
This is in fact very possible: see for instance ytomino's Boehm GC Ada interface , which uses storage pools together with the Boehm-Demers-Weiser garbage collector to achieve nice and easy GC with your run-of-the-mill GNAT compiler. It should be noted that it has limitations: for one you can't just plug in the library and all of your previous non-GC access types will be GC'd; you have to specify the GC storage pool for all the access types you want to use. Secondly, the particular implementation above does not support multithreading. It probably is possible to use an external collector and support multithreading, but I'm not aware of what steps need to be taken to do that.