BIM (Binary Index Map)

So, to get this out of the way early: I'm dead certain that not only am I not the first person to think of doing things this way, but that I previously read an article outlining this exact same data structure. But when I went to look it up, I couldn't find it, only stuff about the Two-Level Segregated Fit Memory Allocator, which is tangentially similar in using multiple levels of bit sets to speed up searching for free memory.

The basic idea of what I'm calling a Binary Index Map is that we have 64 buckets of 64 slots, with each slot having one bit for whether they're occupied or not. Then there's another 64-bit integer doing the same thing on a higher level, with a bit for each bucket indicating whether it's full. This lets us find the first unoccupied slot in O(1) time: counting the trailing ones of the bucket-full set gives us the bucket index, and counting the trailing ones of that bucket's slot set gives us the index within it. Multiply the former by 64 and add them together, and you have your index.

An intrusive free list can serve the same role of freeing and reusing indices, and is so simple that I usually just do that, but this has the advantage of always clustering around the start of the array, instead of gradually fragmenting 'til each allocation gives a pretty much random index. And it's not so complex as all that. With C23's <stdbit.h>:

#include <stdint.h>
#include <stdbit.h>

#define BIM_MAX (64 * 64)

typedef struct {
	uint64_t blk[64];
	uint64_t full;
} BIM;

int bim_new(BIM *bim) {
	if (~bim->full) {
		uint64_t blk = stdc_trailing_ones(bim->full);
		uint64_t idx = stdc_trailing_ones(bim->blk[blk]);
		bim->blk[blk] |= (uint64_t)1 << idx;
		if (~bim->blk[blk] == 0)
			bim->full |= (uint64_t)1 << blk;
		return 64 * blk + idx;
	} else {
		return -1;
	}
}

void bim_del(BIM *bim, int i) {
	uint64_t blk = i / 64;
	uint64_t idx = i & 63;
	bim->blk[blk] &= ~((uint64_t)1 << idx);
	bim->full     &= ~((uint64_t)1 << blk);
}

This particular implementation is limited to 4096 slots, which is plenty enough for my purposes, but it should be pretty easy to extend. One advantage of this approach over a free list is that it already contains the machinery to iterate over the set (e.g. you'd probably want some way to track which slots are occupied or not anyway):

static inline int bim_ext(BIM *bim, int i) {
	return (bim->blk[i / 64] >> (i & 63)) & 1;
}

void print_bims(BIM *bim) {
	for (int i = 0; i < BIM_MAX; i++) {
		if (bim_ext(i)) {
			printf("%d\n", i);
		}
	}
}

Does this thing already have a name? If anyone knows the original article (and has access to tilde mail I suppose), just send a link to wrmr@tilde.town, or @wrmr@tilde.zone on Mastodon, and I'll add it here.

Update: The original is Jakub Tomšů's “bit pool,” from this article (a much better name, frankly).