Paul Bone

How to Allocate Memory — Code Generation

How do you generate code that allocates memory? This isn’t about memory management algorithms, or language design concepts. But how to translate Plasma code that allocates memory (like use of a data constructor) into PZ abstract machine code.

The options

Let’s explore the options.

Allocate memory then fill it in

The first option is to create the memory and initialise it in two separate steps. Let’s imagine we’re executing:

p = Point(x, 3)

Where Point is a constructor containing two Int fields. In PZ machine code this would look something like.

# Allocate memory, leaves the pointer to the memory on the stack.
alloc point_struct_id

# Create a copy of the pointer which is consumed by write, get x (places it
# on the stack) then write it into the structure.
dup
pick x_location
write point_struct_id 1

# Create a copy of the pointer, load 3 (onto the stack) then write it into
# the structure.
dup
load 3
write point_struct_id 2

# Optionally apply tag.
load tag_value
make_tag

Currently all PZ machine instructions consume their arguments, and therefore dup instructions are used (actually pick 1) to make a copy of the pointer which will be consumed by write. I may change this in the future if I find that some instructions are frequently used with a dup, such as write, and therefore would be more efficient if they didn’t consume this argument. However for now I’m sticking with a simpler design.

Allocating memory and filling it in separately is pretty simple. It means that the alloc instruction only needs to do a single thing. Additionally this is pretty flexible and makes filling in a structure bit-by-bit, for example for last call modulo construction optimisation (LCMC), easy.

Prepare the fields, allocate memory, then fill it in

Like the above, except that the preparation of the fields should be done first.

# Place the values of the fields onto the stack
pick x_location
load 3

alloc point_struct_id
dup
write point_struct_id 1
dup
write point_struct_id 2

# Optionally apply tag.
load tag_value
make_tag

I’m anticipating a design for a garbage collector that is optimised for heaps with few cycles. My original inspiration was Ralph Backet’s History-based Garbage Collector however the design I’m considering has many important differences. What this means is that the time between the allocation and setting the last field to its value is GC unsafe and the kinds of things the program can do should be restricted, particularly it must not allocate memory, therefore preparing the field values before allocating the memory is a good idea, even though it requires more stack space.

Create and initialise cell in one step

# Place the values of the fields onto the stack
pick x_location
load 3

new point_struct_id

# Optionally apply tag.
load tag_value
make_tag

In this case the new instruction (rather than alloc above) allocates the memory and fills it in. This probably makes things a little clearer, which could make errors less likely but I’m not sure if it has any other advantages. It makes last call modulo construction optimisation more difficult.

GC considerations

The garbage collector (GC) must be able to locate all the pointers from heap objects to other heap objects, but also from the stack and registers (roots) into the heap. Below we will not discuss registers, since this is a stack machine that is okay. During execution some stack slots are registers, however that’s an implementation detail not relevant at the current abstraction level. This means that it must know which stack slots contain pointers, and which do not (because they contain ints or floats). Data structures containing this information are known as stack maps and can potentially be a lot of data!

Most if not all accurate garbage collectors reduce the amount of data by only storing this information for specific points in the program, known as GC safe points. It is pretty clear that a GC safe point should not be placed during initialisation in the first two examples. If a GC safe point is needed it should be before the allocation occurs, a collection may be triggered then anyway.

Regardless in addition to initialising a structure this code should ensure that whether or not each stack slot is a pointer is reflected correctly in the current stack map at any GC safe points. This appears to add more work for the later two options. However it is not much more work, since the variables used for the fields of a construction are already on the stack and need to be marked correctly in the stack map.

When memory is allocated and has been initialised then the GC must also know which if its fields is a pointer. Like other things there are multiple ways to handle this, however I will probably provide a bitmap to the garbage collector at allocation/initialisation time. I haven’t shown either this, or any code related to the stack maps in the examples above. Because the initialisation phase in the first two examples is not GC safe, then it probably doesn’t matter whether the code provides this bitfield to the collector at allocation time, or once the object is completely initialised.

Finally this bitfield needs to be created or updated by the generated code based on the same information used by the stack maps. Why not statically? Surely bitfields could be shared between structures of the same type? Because Plasma will support polymorphism and therefore a polymorphic field’s actual type may or may not be a pointer. I have not thought deeply about how to do this. I don’t think that my choice of how to allocate and initialise memory will affect how to handle stack maps, bit fields in the heap or polymorphism. Nevertheless it is important to give some thought to these problems now.

Conclusion

I will most likely choose the second option. It offers more versatility than the third, it may make LCMC and other optimisations easier. It also restricts garbage collection less than the first option.

Of course at this stage in development I’m willing to change my mind if problems arise.