How do you generate code that allocates memory? This isn’t about memory management algorithms, or language design concepts. But how to translate Plasma code that allocates memory (like use of a data constructor) into PZ abstract machine code.
The options
Let’s explore the options.
Allocate memory then fill it in
The first option is to create the memory and initialise it in two separate steps. Let’s imagine we’re executing:
p = Point(x, 3)
Where Point
is a constructor containing two Int
fields.
In PZ machine code this would look something like.
# Allocate memory, leaves the pointer to the memory on the stack. alloc point_struct_id # Create a copy of the pointer which is consumed by write, get x (places it # on the stack) then write it into the structure. dup pick x_location write point_struct_id 1 # Create a copy of the pointer, load 3 (onto the stack) then write it into # the structure. dup load 3 write point_struct_id 2 # Optionally apply tag. load tag_value make_tag
Currently all PZ machine instructions consume their arguments,
and therefore dup
instructions are used (actually pick 1
) to make a copy
of the pointer which will be consumed by write
.
I may change this in the future if I find that some instructions are
frequently used with a dup
, such as write
, and therefore would be more
efficient if they didn’t consume this argument.
However for now I’m sticking with a simpler design.
Allocating memory and filling it in separately is pretty simple.
It means that the alloc
instruction only needs to do a single thing.
Additionally this is pretty flexible and makes filling in a structure
bit-by-bit, for example for
last call
modulo construction optimisation (LCMC), easy.
Prepare the fields, allocate memory, then fill it in
Like the above, except that the preparation of the fields should be done first.
# Place the values of the fields onto the stack pick x_location load 3 alloc point_struct_id dup write point_struct_id 1 dup write point_struct_id 2 # Optionally apply tag. load tag_value make_tag
I’m anticipating a design for a garbage collector that is optimised for heaps with few cycles. My original inspiration was Ralph Backet’s History-based Garbage Collector however the design I’m considering has many important differences. What this means is that the time between the allocation and setting the last field to its value is GC unsafe and the kinds of things the program can do should be restricted, particularly it must not allocate memory, therefore preparing the field values before allocating the memory is a good idea, even though it requires more stack space.
Create and initialise cell in one step
# Place the values of the fields onto the stack pick x_location load 3 new point_struct_id # Optionally apply tag. load tag_value make_tag
In this case the new
instruction (rather than alloc
above) allocates the
memory and fills it in.
This probably makes things a little clearer, which could make errors less
likely but I’m not sure if it has any other advantages.
It makes last call modulo construction optimisation more difficult.
GC considerations
The garbage collector (GC) must be able to locate all the pointers from heap
objects to other heap objects, but also from the stack and registers
(roots) into the heap.
Below we will not discuss registers, since this is a stack machine that is
okay.
During execution some stack slots are registers, however that’s an
implementation detail not relevant at the current abstraction level.
This means that it must know which stack slots contain pointers, and which
do not (because they contain ints
or floats
).
Data structures containing this information are known as
stack maps and
can potentially be a lot of data!
Most if not all accurate garbage collectors reduce the amount of data by only storing this information for specific points in the program, known as GC safe points. It is pretty clear that a GC safe point should not be placed during initialisation in the first two examples. If a GC safe point is needed it should be before the allocation occurs, a collection may be triggered then anyway.
Regardless in addition to initialising a structure this code should ensure that whether or not each stack slot is a pointer is reflected correctly in the current stack map at any GC safe points. This appears to add more work for the later two options. However it is not much more work, since the variables used for the fields of a construction are already on the stack and need to be marked correctly in the stack map.
When memory is allocated and has been initialised then the GC must also know which if its fields is a pointer. Like other things there are multiple ways to handle this, however I will probably provide a bitmap to the garbage collector at allocation/initialisation time. I haven’t shown either this, or any code related to the stack maps in the examples above. Because the initialisation phase in the first two examples is not GC safe, then it probably doesn’t matter whether the code provides this bitfield to the collector at allocation time, or once the object is completely initialised.
Finally this bitfield needs to be created or updated by the generated code based on the same information used by the stack maps. Why not statically? Surely bitfields could be shared between structures of the same type? Because Plasma will support polymorphism and therefore a polymorphic field’s actual type may or may not be a pointer. I have not thought deeply about how to do this. I don’t think that my choice of how to allocate and initialise memory will affect how to handle stack maps, bit fields in the heap or polymorphism. Nevertheless it is important to give some thought to these problems now.
Conclusion
I will most likely choose the second option. It offers more versatility than the third, it may make LCMC and other optimisations easier. It also restricts garbage collection less than the first option.
Of course at this stage in development I’m willing to change my mind if problems arise.