posted 1 year ago

down with 64 bit pointers, up with 32 bit indices

lately I’ve been experimenting with a trick to save space on 64 bit architectures: rather than using pointers (which take 8 bytes of memory! 8! bytes! omg!), I’m using 32 bit indices into a single, gigantic 32 gig block I allocate at startup. as in:

u64 *bigblock=(u64*)malloc(32ull*1024*1024*1024); // at startup. the ull keeps the multiply out of 32 bit int. or choose a number smaller than 32 :)

struct foo { int x, y, whatever, whatever; };

// later…

u32 myfoo=MyAllocatorForBigBlock(sizeof(foo)); // using a custom allocator of your choice…

((foo*)(bigblock+myfoo))->x=100;

((foo*)(bigblock+myfoo))->y=200;

with some macro goodness, or in C++, a template class that looks a bit like a smart pointer but is in fact not smart, those casts I made explicit above go away and it can be made to look really neat:

P<foo> myfoo=P<foo>::Alloc(); // P<foo> is really a 32 bit index just like before.

myfoo->x=100;myfoo->y=200; // the overloaded -> operator does the bigblock+index thing for us
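for the curious, here's roughly what such a template could look like. this is a minimal sketch, not my actual code: the block is shrunk to 1 MiB so it runs anywhere, and BumpAlloc, kBlockWords, nextFree, get() and isNull() are all made-up names standing in for "a custom allocator of your choice". it also assumes index 0 is reserved to mean null.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Assumption: a 1 MiB block instead of 32 gigs, zero-filled by calloc.
static const std::uint32_t kBlockWords = (1u << 20) / 8;  // block size in 8 byte units
static std::uint64_t *bigblock =
    static_cast<std::uint64_t *>(std::calloc(kBlockWords, sizeof(std::uint64_t)));
static std::uint32_t nextFree = 1;  // slot 0 is never handed out, so index 0 can mean "null"

// Hypothetical stand-in for MyAllocatorForBigBlock: a bump allocator that
// never frees. It returns an index in 8 byte units, not a pointer.
std::uint32_t BumpAlloc(std::size_t bytes) {
    std::uint32_t words = static_cast<std::uint32_t>((bytes + 7) / 8);  // round up to the stride
    assert(bigblock && nextFree + words <= kBlockWords);
    std::uint32_t index = nextFree;
    nextFree += words;
    return index;
}

template <typename T>
struct P {
    std::uint32_t index;  // the only data member, so a P<T> is 4 bytes

    static P Alloc() {
        static_assert(alignof(T) <= 8, "the 8 byte stride only guarantees 8 byte alignment");
        P p = { BumpAlloc(sizeof(T)) };
        new (p.get()) T();  // value-initialise the object inside the block
        return p;
    }

    T *get() const { return reinterpret_cast<T *>(bigblock + index); }
    T *operator->() const { return get(); }  // does the bigblock+index arithmetic
    T &operator*() const { return *get(); }
    bool isNull() const { return index == 0; }
};

struct foo { int x, y; };
```

with that in place, P<foo> myfoo=P<foo>::Alloc(); myfoo->x=100; works as written above, and sizeof(P<foo>) comes out as 4.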

The indices in my case assume an 8 byte stride, ie all allocations happen on 8 byte boundaries (that’s why bigblock is a u64*), so 2^32 indices * 8 bytes = 32 gigs are addressable with 32 bit indices. For an in-memory database (my use case, as it happens), that’s plenty. I don’t want it swapping anyway, and 32 gig machines are reasonable these days. The compiler seems to do the right thing and generate quite efficient code (eg it keeps bigblock in a 64 bit register (rcx), puts indices in say edx, which zero-extends into rdx, and does things like mov dword ptr [rcx+rdx*8+4],200).
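the stride arithmetic can be checked at compile time. a small sketch, with kStride, kIndexCount, ByteOffset and WordsFor being names I've invented for illustration:

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kStride = 8;              // bytes per index step
constexpr std::uint64_t kIndexCount = 1ull << 32; // distinct 32 bit index values
constexpr std::uint64_t kAddressable = kIndexCount * kStride;

static_assert(kAddressable == 32ull * 1024 * 1024 * 1024,
              "32 bit indices with an 8 byte stride cover exactly 32 gigs");

// Byte offset from the start of the block for a given index.
// The widening cast matters: index * 8 would wrap in 32 bit arithmetic.
constexpr std::uint64_t ByteOffset(std::uint32_t index) {
    return static_cast<std::uint64_t>(index) * kStride;
}

// Number of 8 byte slots an allocation of `bytes` bytes occupies (rounded up).
constexpr std::uint32_t WordsFor(std::uint64_t bytes) {
    return static_cast<std::uint32_t>((bytes + kStride - 1) / kStride);
}
```

the price of the stride is that every allocation gets rounded up to a multiple of 8 bytes, so a 17 byte object eats 3 slots.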

another cute side benefit is that all the pointers in your system, now that they’re indices from a base address, are sort of ‘position independent’… you can mmap() or fwrite() the entire bigblock to and from disk, and next time you boot, it doesn’t matter if the base address moves, it all just works. no need for pointer fixups or anything like that.  win!
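a tiny demonstration of the position independence, as a sketch under some assumptions: the block here is only 16 slots, the Node layout and the function names are invented for the example, and a memcpy into a second buffer stands in for the fwrite()/fread() round trip (the point is just that the bytes end up at a different base address).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <new>
#include <vector>

struct Node {
    std::int32_t value;
    std::uint32_t next;  // index of the next node in 8 byte units, 0 = end of list
};

// Lay out `count` nodes at indices 1..count and link them by index.
// Returns the index of the head node (0 if the list is empty).
std::uint32_t BuildList(std::uint64_t *base, const int *values, std::uint32_t count) {
    for (std::uint32_t i = 0; i < count; ++i) {
        std::uint32_t nextIndex = (i + 1 < count) ? i + 2 : 0;
        new (base + 1 + i) Node{values[i], nextIndex};
    }
    return count ? 1 : 0;
}

// Walk the list using only the base address passed in plus the stored indices.
int SumList(const std::uint64_t *base, std::uint32_t head) {
    int sum = 0;
    for (std::uint32_t i = head; i != 0;) {
        const Node *n = reinterpret_cast<const Node *>(base + i);
        sum += n->value;
        i = n->next;
    }
    return sum;
}

// Build a list in one block, copy the raw bytes to a block at a different
// address, wipe the original, and walk the copy with no fixups at all.
int RoundTripSum() {
    const std::size_t kWords = 16;
    std::vector<std::uint64_t> original(kWords, 0);
    const int values[] = {10, 20, 30};
    std::uint32_t head = BuildList(original.data(), values, 3);

    std::vector<std::uint64_t> reloaded(kWords, 0);
    std::memcpy(reloaded.data(), original.data(), kWords * sizeof(std::uint64_t));
    original.assign(kWords, 0);  // nothing in the copy may depend on the old block

    return SumList(reloaded.data(), head);
}
```

the only thing that has to survive outside the block is the head index (or you store it at a known slot inside the block).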

kinda makes 64 bit linux/windows feel a bit more like a nice embedded system with a predictable memory map and nice small 4 byte pointers. lovely!

posted 2 years ago
posted 2 years ago and tagged as lbp coding

c trick

I love indexing constant strings, I’m sad like that.

for example,

char c="0123456789abcdef"[b&15];

converts the low nybble of byte b into its hex character.
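the same trick applied twice gives you both hex digits of a byte. a small sketch (ByteToHex is just a name I picked for the example):

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Convert one byte to two lowercase hex characters by indexing a
// string literal directly: high nybble first, then low nybble.
std::string ByteToHex(std::uint8_t b) {
    std::string out;
    out += "0123456789abcdef"[b >> 4];
    out += "0123456789abcdef"[b & 15];
    return out;
}
```

so ByteToHex(0x3f) gives "3f".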

oldie, but goldie.

posted 2 years ago and tagged as tips c
posted 2 years ago

goto considered harmful? rubbish - it’s fine

— Anton Kirczenow

posted 2 years ago

warning signs

when I’m coding, sometimes I get warning signs that I haven’t understood the problem fully. Normally, they take the form of gross looking code - ‘-1’ or ‘+1’ hanging round in loop limits, even worse, seemingly arbitrary constants, large amounts of unavoidable pointer dereferencing,…

nearly always this is a sign that either I’m doing it wrong, or I got the data structure wrong. It is ALWAYS better to stop and rethink.

as I’ve got more experienced coding, I’ve continually lowered my tolerance to these warning signs, and it always helps. even if I find myself doing something fairly innocuous, like a nested loop, or if I find an unexpected number of edge cases I need to ‘code around’, I now stop and rethink. I don’t always get the right answer… but I’ve never regretted that mental check.

tl;dr version: develop your spidey coder sense for anything gross, and listen to it. it’s like the human form of -Wall, and it’s always worth it.

posted 2 years ago