GAP Internals - GASMAN
The following post will only be of interest to people who:
- Know a bit about GAP’s internals
- Know a little bit of C
- Want to know more about how GAP handles memory.
I’m also going to not explain much about GAP kernel programming in general – if you would like me to, then tell me and I might write something up! This post is divided into two sections. A brief overview of gasman, GAP’s memory manager, and then some practical examples of errors I have found.
Overview
The most important feature of GAP’s memory manager is that it moves memory on garbage collection. This causes problems, because in C it is natural to have pointers, and these pointers can become invalidated.
So, how does GAP represent memory, and do this garbage collection? How, when we move an object, can we ensure that every pointer to that object is updated to it’s new location?
The answer is (as with many things in computing), another layer of indirection. GAP’s main function for allocating memory is NewBag
. Unlike C’s malloc
, which returns a pointer, NewBag
returns a Bag
, which is a pointer to a pointer. These Bag
pointers are known as master pointers. We can see that a Bag
is two points from it’s definition:
typedef UInt * * Bag;
To get to the actual allocated memory, we must dereference our Bag
twice. Why do this? To move a normal block of C memory you have to move every pointer which points to it. We can have many references to a Bag
, and to move the memory it points to we only have to change one pointer – the first level of indirection.
Here is some pseudo-code showing the idea (don’t do this in practice, GAP has other bookkeeping to do and this would cause GAP to horribly explode!).
void moveBag(Bag b, void* newLocation, size_t len) {
// First, move the Bag's contents
(newLocation, *b, len);
memcpy// Then, move the master pointer
*b = newLocation;
}
So, how does this effect us? There are two main issues this causes, which unfortunately are in conflict:
- It is slightly slower to access memory in a
Bag bag
than with a simple C pointer, as we have to keep dereferencing pointers twice. Therefore it is tempting to calculate*bag
, and cache the value. - We must recalculate
*bag
, or any explicit C pointer inside a bag, whenever an GC occurs. This can occur whenever a GAP memory allocation function, in particularNewBag
orResizeBag
, are called.
The second point is the most important, as the bugs this creates are very hard to find. To hit a bug caused by holding a pointer to the inside of a bag we need to:
- Take a pointer to the inside of a
Bag
- Call
NewBag
,ResizeBag
, or a function which calls one of these. - Have that allocation cause a garbage collection to occur.
Most allocations don’t cause a garbage collection, so don’t cause an issue. Also, the resulting errors are incredibly hard to fix – if a GC happens, then you try to write into your Bag
, you end up writing over a random location in a random other Bag
– and this change may not be noticable until much later, when you then access that now corrupted bag!
Examples
These are all genuine examples of this kind of bug, from the GAP 4.8 beta release. Many of them had been around for years, but I made a deep effort to dig them out (how? That’s for a later post!) I suspect these bugs were, very occasionally, causing unexpected crashes but they were extremely hard to reproduce. I will group them into categories.
Bad Strings
It is often tempting to take a raw C pointer to a string in GAP – these are required to call C standard library functions for example. However, we must not hold these raw C pointers over a call to NewBag
!
The standard function to get a raw C pointer from a GAP string CSTR_STRING
(this actually only works for certain GAP strings, but we won’t go into that here).
These first three set off immediate alarm bells. We are passing a C string (from CSTR_STRING
) into functions which sound like they might allocate a new bag at some point (and they all do). Therefore we have to rewrite them to avoid that.
Note there is nothing wrong with passing the C string "object, value"
into a function – it is just C strings which we stored inside GAP Bag
s we have to worry about.
// src/opers.c : FuncSETTER_FUNCTION
= NewFunctionCT( T_FUNCTION, SIZE_FUNC, CSTR_STRING(fname), 2,
func "object, value", DoSetterFunction );
// src/opers.c : FuncGETTER_FUNCTION
= NewFunctionCT( T_FUNCTION, SIZE_FUNC, CSTR_STRING(fname), 1,
func "object, value", DoGetterFunction );
// src/streams.c : FuncREAD_GAP_ROOT
return READ_GAP_ROOT(CSTR_STRING(filename)) ? True : False;
Sometimes, we just call CSTR_STRING
too early. In this next function we just calculated init_key
too early, and moved it’s construction later in the function.
// src/intfuncs.c : FuncInitRandomMT
= CHARS_STRING(initstr); init_key
We need to explain some functions for this next example. NameGVar
gives us a raw char*
to a string, and C_NEW_STRING_DYN
makes a new GAP string by copying a C string. Therefore we are, in a round-about way, copying a GAP string. However, this copy must make a new GAP bag, which will invalidate name
! The answer is to use a string copy function which directly copies a bag.
// src/gvars.c : AssGVar
* name;
Char= NameGVar(gvar);
name (onam, name); C_NEW_STRING_DYN
Holding pointers
This class of bugs is just holding a generic C pointer too long. In this code snippet (repeated similarly in many functions in the same file), we calculate cache
before we start looping, then use it each time around the loop. The loop makes new bags, so we must recalculate cache
before each use.
// src/opers.c : DoOperation0Args (and many others)
/* try to find an applicable method in the cache */
= 1+ADDR_OBJ( CacheOper( oper, 2 ) );
cache ...
do {
...
if ( cache[i] != 0 && cache[i+1] == prec) {
= cache[i];
method ...
} while (res == TRY_NEXT_METHOD );
Bad Assignments
This is the most complicated type of error, and most common. It is caused partly by the strangeness of the C programming language.
Let’s pick one, and pull it apart (I’ve unfolded some macros for you). In this example, NAME_FUNC
takes a Bag
which represents a function, and lets us name the function.
// src/opers.c : NewOperationArgs
#define NAME_FUNC(func) (*( *(func) + 8))
(func) = CopyObj( name, 0 ); NAME_FUNC
At first this seems fine. We calculate CopyObj
(which might do a garbage collection), then we do the derefencing within NAME_FUNC
to find where we are assigning to, then do the assignment. Except, that’s not how C works!.
The compiler is allowed (and does) mix up the left and right hand sides of an =
, and if it calculates the left first to get the memory location we are writing to, then calculates the right (causing a garbage collection), then assigning, bad things happen! Fixing this is easy (if boring), just put an explicit extra assignment in.
// src/opers.c : NewOperationArgs
#define NAME_FUNC(func) (*( *(func) + 8))
= CopyObj( name, 0 );
Bag temp (func) = temp; NAME_FUNC
The same problem occurs in these other examples.
// src/opers.c : SetterAndFilter
(setter) = SetterFilter( FLAG1_FILT(getter) );
FLAG1_FILT(setter) = SetterFilter( FLAG2_FILT(getter) ); FLAG2_FILT
// src/opers.c : FuncGETTER_FUNCTION
(func) = INTOBJ_INT( RNamObj(name) ); ENVI_FUNC
Conclusions
In general, the important thing is to be careful whenever taking raw pointers, and resisting the urge to hold them for a long time – don’t worry about efficiency unless you have profiled your code! Also, never call a function on both the left, and right, hand side of an assignment.