Intel® Memory Protection Extensions (Intel® MPX) support in the GCC compiler


Invalid memory accesses are a common problem in C/C++ programs and lead to time-consuming debugging, program instability and security vulnerabilities. Many attacks exploit software bugs related to inappropriate memory accesses caused by buffer overflows (or buffer overruns). The existing techniques and tools for finding such memory bugs and defending programs against these attacks are software-only solutions, which results in poor performance of the protected code.

We present a new compiler pass which adds support for the recently announced Intel® Memory Protection Extensions technology. It is one part of the support for this technology, which also covers the OS kernel, binutils and system libraries (see the #Links section below).

Intel® MPX introduces new registers, called bound registers, to hold bounds for pointers. It also introduces new instructions to create bounds, move bounds, load and store bounds in bounds tables, and check pointers against bounds. The new compiler pass, Pointer Checker, is designed in the most general way, so support could be implemented for any other target using a similar hardware feature or software emulation. The current support should be considered an enabling of the technology; more changes for performance tuning will follow.

How to get Intel® MPX enabled GCC

Currently all necessary changes can be found in a separate branch named 'mpx' and are being considered for submission into the GCC trunk.

Note that you need Intel® MPX enabled binutils for that (see the #Links section). You should either install the new binutils on your system or use the --with-as and --with-ld configure options. To produce instrumented binaries, add the '-fcheck-pointer-bounds -mmpx' flags to your compilation command. Note that these flags are supported for C/C++ and the x86 target only. For more information on the new flags see the #Compiler options section.

Definition of terms

Here are some terms we will use in this article.

Code instrumentation

Memory protection is achieved via code instrumentation. Each memory reference is instrumented with checks of the base pointer used for the memory access against the bounds associated with that pointer. The main problem for the compiler is to determine the correct bounds for each dereferenced pointer.
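
The check placed before each memory reference can be pictured with a small software model. The sketch below is only an illustration in plain C, not the code the compiler emits: the bounds_t type and the bounds_make, bounds_check and checked_load helpers are hypothetical names, and a real Intel® MPX check raises a hardware exception instead of returning a flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of bounds: both ends are inclusive. */
typedef struct {
    uintptr_t lb; /* lowest valid address */
    uintptr_t ub; /* highest valid address */
} bounds_t;

/* Build bounds [base, base + size - 1] for an object of SIZE bytes. */
static bounds_t bounds_make(const void *base, size_t size)
{
    assert(size > 0);
    bounds_t b = { (uintptr_t)base, (uintptr_t)base + size - 1 };
    return b;
}

/* Model of the check for an access of ACCESS_SIZE bytes at ADDR.
   Returns 1 when the whole access lies inside B, 0 on a violation. */
static int bounds_check(bounds_t b, uintptr_t addr, size_t access_size)
{
    uintptr_t last = addr + access_size - 1;
    return addr >= b.lb && last <= b.ub;
}

/* Checked load: reads buf[i] only when the access passes the check.
   The address is computed on integers so that an out-of-range index
   never forms an invalid C pointer. */
static int checked_load(const int *buf, bounds_t b, ptrdiff_t i, int *out)
{
    uintptr_t addr = (uintptr_t)buf + (uintptr_t)(i * (ptrdiff_t)sizeof(int));
    if (!bounds_check(b, addr, sizeof(int)))
        return 0;
    *out = buf[i];
    return 1;
}
```

In this model the bounds travel next to the pointer as an explicit argument; finding which bounds belong to which pointer is exactly the part the compiler has to work out.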

Bounds computation

Bounds are computed by compiler analysis of the data flow of the pointer in use. Note that pointer arithmetic and type casts do not affect bounds values; we just keep looking until we reach one of the basic pointer sources. There are five such sources:

There are some cases when we cannot find a valid source for a pointer. This happens when the pointer is the result of a cast from a non-pointer type to a pointer. By default the compiler does not check such pointers (it uses INIT bounds for them). We are considering a compiler switch to change this behavior.
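
A minimal sketch of these two properties, assuming a software model in which a pointer value is carried together with its bounds. The checked_ptr type and all function names are hypothetical, and INIT bounds are modeled here as bounds covering the whole address space, so that every check against them passes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: a pointer value travelling with its bounds. */
typedef struct {
    uintptr_t value; /* the pointer value itself */
    uintptr_t lb;    /* inclusive lower bound */
    uintptr_t ub;    /* inclusive upper bound */
} checked_ptr;

/* A basic pointer source: the address of an object of known size. */
static checked_ptr from_object(const void *obj, size_t size)
{
    assert(size > 0);
    checked_ptr p = { (uintptr_t)obj, (uintptr_t)obj,
                      (uintptr_t)obj + size - 1 };
    return p;
}

/* Pointer arithmetic changes only the value; bounds are inherited. */
static checked_ptr advance(checked_ptr p, ptrdiff_t bytes)
{
    p.value += (uintptr_t)bytes;
    return p;
}

/* A pointer made from an integer has no known source, so it gets
   INIT bounds and is effectively unchecked. */
static checked_ptr from_integer(uintptr_t raw)
{
    checked_ptr p = { raw, 0, UINTPTR_MAX };
    return p;
}

/* Returns 1 when an access of ACCESS_SIZE bytes at P is in bounds. */
static int in_bounds(checked_ptr p, size_t access_size)
{
    return p.value >= p.lb && p.value + access_size - 1 <= p.ub;
}
```

Moving a pointer past the end of its object keeps the original bounds, so a later access fails the check, while the same address reached through an integer cast passes because it carries INIT bounds.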

Bounds table management

The compiler is responsible for managing the Bounds Table. It has to find all pointer stores and generate the corresponding Bounds Table modifications. It is important that the Bounds Table does not associate bounds with the pointer location alone or with the pointer alone, but with the pair of the pointer and its location. It allows us:


When the address of an object's field is taken, we narrow the object's bounds to the field. No narrowing is applied when the address of an array element is taken. The following rules are applied when narrowing takes place (including nested field accesses):

There are some compilation flags which affect narrowing. See #Compiler options.
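
Narrowing to a field can be sketched as an intersection of the object's bounds with the extent of the field. This is a plain C model under stated assumptions: bounds_t, bounds_intersect and narrow_to_field are hypothetical names, and the struct is only a sample layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uintptr_t lb, ub; } bounds_t; /* inclusive ends */

/* Sample layout used only for this illustration. */
struct S { int a; int b[10]; int c; };

/* Intersection: the larger lower end and the smaller upper end. */
static bounds_t bounds_intersect(bounds_t x, bounds_t y)
{
    bounds_t r = { x.lb > y.lb ? x.lb : y.lb,
                   x.ub < y.ub ? x.ub : y.ub };
    return r;
}

/* Bounds for the address of a field located OFFSET bytes into the
   object at BASE and occupying SIZE bytes: the object's bounds OBJ
   narrowed to the field. */
static bounds_t narrow_to_field(bounds_t obj, uintptr_t base,
                                size_t offset, size_t size)
{
    assert(size > 0);
    bounds_t field = { base + offset, base + offset + size - 1 };
    return bounds_intersect(obj, field);
}
```

With these bounds a pointer obtained from the array field b can reach every element of b but not the neighbouring field c. An element address such as &s.b[3] keeps the bounds of the whole array, since no narrowing is applied to array elements.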

Programming Model

In most cases enabling Pointer Checker is as simple as adding a single compiler switch. But in some cases the user may be required to change code to avoid unexpected bound violations. Here are some examples of when the user may have to change code:

  1. Field address. In some cases the address of a structure's field is used to access the whole structure. In instrumented code this may cause failures due to narrowing. A different way to obtain the field's address should be used in such cases.
    • Example:

    •     #include <stdio.h>
          struct S
          {
            int a;
            int b[10];
            int c;
          };
          #define OFFSETOF(s,f) (((char *)(&s)) + __builtin_offsetof (typeof (s), f))
          void print (int *p, int i)
          {
            printf ("%d\n", p[i]);
          }
          int foo (S &s)
          {
            print ((int *)s.b, 10); //Bounds violation
            print (&s.c, -1); //Bounds violation
            print ((int *)OFFSETOF (s, b), 10); //OK
            print ((int *)OFFSETOF (s, c), -1); //OK
            return 0;
          }
  2. Variable arrays in objects. It is common practice to implement a variable-length object using a flexible array member. To avoid problems with narrowing, the compiler should know about all such fields. It automatically assumes that an array field with zero length has variable length, but there are cases when flexible arrays are declared with a non-zero length. In such cases you should mark them with the special attribute 'bnd_variable_size' (see #Compiler intrinsics and attributes).

In some cases users may also want to manually control bounds for some pointers (e.g. when custom memory allocation is used), modify the Bounds Table (e.g. to reflect a pointer update in memory performed by legacy code) and introduce manual checks (e.g. before a legacy code call). The compiler provides a set of intrinsics for manual code instrumentation. Note that the behavior of all intrinsics depends on the -fcheck-pointer-bounds switch.
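
As an illustration of the custom allocation case, here is a hedged sketch of a bump allocator that attaches bounds to each block it hands out. The arena, the arena_alloc function, the SET_BOUNDS wrapper and the USE_BND_INTRINSICS switch are all hypothetical names invented for this example; only __bnd_set_ptr_bounds comes from the list of intrinsics in this article. Without the switch the wrapper is a plain pointer copy, which matches the documented behavior of the intrinsic when instrumentation is disabled.

```c
#include <assert.h>
#include <stddef.h>

/* USE_BND_INTRINSICS is a hypothetical switch for this sketch; define
   it only when building with a Pointer Checker enabled compiler. */
#ifdef USE_BND_INTRINSICS
#define SET_BOUNDS(p, size) __bnd_set_ptr_bounds((p), (size))
#else
#define SET_BOUNDS(p, size) ((void *)(p))
#endif

#define ARENA_SIZE 1024

/* The union forces suitable alignment of the byte array. */
static union {
    max_align_t force_alignment;
    unsigned char bytes[ARENA_SIZE];
} arena;
static size_t arena_used;

/* Hand out SIZE bytes from the arena and associate the result with
   bounds [p, p + size - 1], so the caller cannot reach neighbouring
   blocks through this pointer. Returns NULL when out of space. */
static void *arena_alloc(size_t size)
{
    const size_t align = sizeof(max_align_t);
    size_t rounded = (size + align - 1) / align * align;
    if (size == 0 || rounded < size || rounded > ARENA_SIZE - arena_used)
        return NULL;
    unsigned char *p = arena.bytes + arena_used;
    arena_used += rounded;
    assert(arena_used <= ARENA_SIZE);
    return SET_BOUNDS(p, size);
}
```

The bounds are set to the requested size rather than the rounded size, so padding bytes between blocks also stay out of reach in instrumented builds.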

Compiler intrinsics and attributes

Here is a full list of intrinsics added to GCC:

  1. void * __bnd_set_ptr_bounds (const void * q, size_t size)

    • Return a new pointer with the value of q, and associate it with the bounds [q, q+size-1]
    • Example:

    • p = __bnd_set_ptr_bounds (q, 8); //Associate p with bounds [q, q + 7] and value q
                                       //Equal to p = q when instrumentation is disabled
  2. void * __bnd_narrow_ptr_bounds (const void *p, const void *q, size_t size)

    • Return a new pointer with the value of p and associate it with the narrowed bounds formed by the intersection of bounds associated with q and the [p, p + size - 1].
    • Example:

    • r = __bnd_narrow_ptr_bounds (p, q, 8); //Associate pointer r with bounds formed by the intersection (bnd(q), [p, p + 7]) and the value p
                                             //Equal to r = p when instrumentation is disabled
  3. void * __bnd_copy_ptr_bounds (const void *q, const void *r)

    • Return a new pointer with the value of q, and associate it with the bounds already associated with pointer r (essentially BNDMOV with pointer association).
    • Example:

    • p = __bnd_copy_ptr_bounds (q, r); //Associate pointer p with bounds of r and the value q
                                        //Equal to p = q when instrumentation is disabled
  4. void * __bnd_init_ptr_bounds (const void *q)

    • Return a new pointer with the value of q, and associate it with INIT bounds.
    • Example:

    • p = __bnd_init_ptr_bounds (q); //Associate pointer p with INIT bounds and the value q
                                     //Equal to p = q when instrumentation is disabled
  5. void * __bnd_null_ptr_bounds (const void *q)

    • Return a new pointer with the value of q, and associate it with NULL bounds.
    • Example:

    • p = __bnd_null_ptr_bounds (q); //Associate pointer p with NULL bounds and the value q
                                     //Equal to p = q when instrumentation is disabled
  6. void __bnd_store_ptr_bounds (const void **ptr_addr, const void *ptr_val)

    • Store the bounds associated with pointer ptr_val and location ptr_addr into Bounds table. This can be useful to propagate bounds from legacy code without touching the associated pointer's memory when pointers were copied as integers.
    • Example:

    • __bnd_store_ptr_bounds (p, q); //Store the bounds associated with pointer q and location p to Bounds Table.
                                     //Ignored when instrumentation is disabled
  7. void   __bnd_chk_ptr_lbounds (const void *q)

    • Check that the pointer is not below the lower bound of its associated bounds.
    • Example:

    • __bnd_chk_ptr_lbounds (q); //Get the bounds associated with q and check q against the lower bound
                                 //Ignored when instrumentation is disabled
  8. void   __bnd_chk_ptr_ubounds (const void *q)

    • Check that the pointer is not above the upper bound of its associated bounds.
    • Example:

    • __bnd_chk_ptr_ubounds (q); //Get the bounds associated with q and check q against the upper bound
                                 //Ignored when instrumentation is disabled
  9. void   __bnd_chk_ptr_bounds (const void *q, size_t size)

    • Check that [q, q + size - 1] is within the lower and upper bounds associated with q.
    • Example:

    • __bnd_chk_ptr_bounds (q, 8); //Get the bounds associated with q and check [q, q + 7] against them
                                   //Ignored when instrumentation is disabled
  10. const void * __bnd_get_ptr_lbound (const void * q)

    • Return the lower bound (which is a pointer) associated with the pointer q. This is at least useful for debugging using printf.
    • Example:

    • lb = __bnd_get_ptr_lbound (q);
      printf ("lb(q)=%p", lb);
  11. const void * __bnd_get_ptr_ubound (const void * q)

    • Return the upper bound (which is a pointer) associated with the pointer q. This is at least useful for debugging using printf.
    • Example:

    • ub = __bnd_get_ptr_ubound (q);
      printf ("ub(q)=%p", ub);

Here is a full list of attributes added to GCC:

  1. bnd_legacy

    • Used to prevent generation of BND prefix and parameter passing code for a legacy function call. Also used to prevent instrumentation for selected functions.
    • Example:

    •       __attribute__((bnd_legacy)) int* legacy_function (int*);
            <some code>
            int *p = legacy_function (q); //No BND prefix for CALL and no bounds arguments passed
    • Example:

    •       __attribute__((bnd_legacy)) int* legacy_function (int* p) //No incoming bounds
            {
                *p = 0; //No bounds checks
                <some code>
                return p; //No BND prefix for RET and no bounds returned
            }
  2. bnd_variable_size

    • This attribute is used to mark variable sized fields in objects.
    • Example:

    • struct dyn_data
      {
        int additional_data_length;
        char contents[4] __attribute__((bnd_variable_size)); //No narrowing for this field
      };
  3. bnd_instrument

    • This attribute is used to mark functions to instrument in case -fchkp-instrument-marked-only is used.

Compiler macros

If the user wants to have a special code version for the pointer checker, a special macro may be used. Here are the macros defined by the compiler:

Compiler options

Here is a set of compiler switches that may be used to control instrumentation:

Mixing instrumented and legacy code

Pointer Checker is designed to make it possible to mix instrumented and legacy code. Legacy code does not experience any change in its functionality. Instrumented applications can link with, call into, or be called from legacy software. This allows granular control, providing protection to higher priority modules first. The ability to mix instrumented and legacy code is achieved by the following few rules:

The first two conditions should be achieved by designing a proper ABI which is backward compatible with the legacy one. The last one is achieved by the Bounds Table, which associates a pointer and its location with bounds. If legacy code changes a pointer, then when we request bounds for the new pointer value, the Bounds Table detects the pointer change and returns INIT bounds. Note that in rare cases the Bounds Table mechanism may miss bounds changes. For example, legacy code may overwrite a pointer in memory with a pointer of the same value but with different bounds. In such a case a false bound violation may occur. The user is responsible for avoiding such cases. To get a higher level of protection, try to use instrumentation for modules generating external data.
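
The way the Bounds Table detects a pointer changed by legacy code can be sketched with a small software model. This is an assumption-laden illustration, not the hardware table layout: bt_entry, bt_store, bt_load and the fixed-size table are hypothetical, and INIT bounds are modeled as bounds covering the whole address space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uintptr_t lb, ub; } bounds_t; /* inclusive ends */

static const bounds_t INIT_BOUNDS = { 0, UINTPTR_MAX };

/* One entry ties bounds to the pair (location, pointer value). */
typedef struct {
    void **loc;
    void *ptr;
    bounds_t bnd;
    int used;
} bt_entry;

#define BT_SLOTS 64
static bt_entry table[BT_SLOTS];

/* Find the entry for LOC, or a free one, by linear probing. */
static bt_entry *bt_slot(void **loc)
{
    size_t start = ((uintptr_t)loc / sizeof(void *)) % BT_SLOTS;
    for (size_t i = 0; i < BT_SLOTS; i++) {
        bt_entry *e = &table[(start + i) % BT_SLOTS];
        if (!e->used || e->loc == loc)
            return e;
    }
    return NULL; /* table full */
}

/* Model of a bounds store: remember B for pointer PTR stored at LOC. */
static void bt_store(void **loc, void *ptr, bounds_t b)
{
    bt_entry *e = bt_slot(loc);
    assert(e != NULL);
    e->loc = loc;
    e->ptr = ptr;
    e->bnd = b;
    e->used = 1;
}

/* Model of a bounds load: if the pointer now found at LOC is not the
   one the bounds were stored for, the entry is stale and INIT bounds
   are returned instead. */
static bounds_t bt_load(void **loc, void *ptr)
{
    bt_entry *e = bt_slot(loc);
    if (e == NULL || !e->used || e->ptr != ptr)
        return INIT_BOUNDS;
    return e->bnd;
}
```

The model also shows the rare miss described above: if legacy code stores a pointer with the same value but a different intended extent, the comparison succeeds and the old bounds are returned.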

Other checkers in GCC

Currently GCC has address sanitizer and mudflap, which were added to solve the problem of invalid memory accesses. The main difference between the described approach and these existing solutions is that each memory access is checked against pointer bounds, not against tables of valid addresses. This implies the following advantages and drawbacks:

Implementation details

To operate with bounds in GIMPLE we introduce a new basic type, BOUND_TYPE. We also add an additional operand to the return statement to hold returned bounds. We do not introduce any new expressions to work with the new type but use builtin function calls instead.

Builtins used for instrumentation

We define a set of generic builtins operating with bounds, but a target may have its own implementation and provide it via the builtin_chkp_function hook. Currently the generic builtins are not implemented and a target has to provide its own version of all of them to support the instrumentation pass. Here is a list of the introduced builtins:

  1. bnd __chkp_bndmk (const void *lb, size_t s) - BUILT_IN_CHKP_BNDMK

    • Create and return bounds with low bound LB and size S.
  2. void __chkp_bndstx (const void **loc, const void *ptr, bnd b) - BUILT_IN_CHKP_BNDSTX

    • Associate bounds B with pointer PTR and its location LOC in Bounds Table.
  3. bnd __chkp_bndldx (const void **loc, const void *ptr) - BUILT_IN_CHKP_BNDLDX

    • Return bounds associated with pointer PTR and location LOC in Bounds Table.
  4. void __chkp_bndcl (bnd b, const void *ptr) - BUILT_IN_CHKP_BNDCL

    • Check pointer PTR against lower bound in B.
  5. void __chkp_bndcu (bnd b, const void *ptr) - BUILT_IN_CHKP_BNDCU

    • Check pointer PTR against upper bound in B.
  6. bnd __chkp_bndret (void *p) - BUILT_IN_CHKP_BNDRET

    • Return bounds associated with the return value of the last called function.
  7. bnd __chkp_intersect (bnd b1, bnd b2) - BUILT_IN_CHKP_INTERSECT

    • Return intersection of bounds B1 and B2.
  8. bnd __chkp_narrow (const void *ptr, bnd b, size_t s) - BUILT_IN_CHKP_NARROW

    • Return intersection of bounds B and [ptr, ptr + s - 1].
  9. size_t __chkp_sizeof (tree var) - BUILT_IN_CHKP_SIZEOF

    • Return a size of an object VAR. Used when object has incomplete type and its size cannot be determined statically.
  10. const void *__chkp_extract_lower (bnd b) - BUILT_IN_CHKP_EXTRACT_LOWER

    • Extract lower bound from B and return it.
  11. const void *__chkp_extract_upper (bnd b) - BUILT_IN_CHKP_EXTRACT_UPPER

    • Extract upper bound from B and return it.

Target hooks

A set of hooks is introduced to make instrumentation adjustable to different targets. Here is a list of introduced target hooks.

  1. tree builtin_chkp_function (unsigned fcode)

    • Return a target specific hook fndecl to be used instead of generic hook with code FCODE.
  2. tree chkp_bound_type (void)

    • Return a type node to be used for bounds.
  3. enum machine_mode chkp_bound_mode (void)

    • Return a mode to be used for bounds.
  4. tree chkp_make_bounds_constant (HOST_WIDE_INT lb, HOST_WIDE_INT ub)

    • Return constant used to statically initialize constant bounds with specified lower bound LB and upper bound UB.
  5. int chkp_initialize_bounds (tree var, tree lb, tree ub, tree *stmts)

    • Generate a list of statements STMTS to initialize pointer bounds variable VAR with bounds LB and UB. Return the number of generated statements.
  6. tree fn_abi_va_list_bounds_size (tree fndecl)

    • Return size for va_list object for specified function FNDECL or integer_zero_node if it is a scalar pointer.
  7. rtx load_bounds_for_arg (rtx location, rtx pointer, rtx bnd_slot)

    • Expand pass uses this hook to emit insn to get input bounds from specified BND_SLOT for POINTER passed in LOCATION.
  8. rtx store_bounds_for_arg (rtx pointer, rtx location, rtx bounds, rtx bnd_slot)

    • Store BOUNDS passed for POINTER located at LOCATION into BND_SLOT.
  9. rtx load_returned_bounds (rtx slot)

    • This hook is used by expand pass to emit insn to load bounds returned by function call in SLOT. Hook returns RTX holding loaded bounds.
  10. rtx store_returned_bounds (rtx slot, rtx bounds)

    • This hook is used by expand pass to emit insn to store BOUNDS returned by function call into SLOT.
  11. void setup_incoming_vararg_bounds (cumulative_args_t args_so_far, enum machine_mode mode, tree type, int *pretend_args_size, int second_time)

    • Use it to store bounds for anonymous register arguments stored into the stack.
  12. rtx chkp_function_value_bounds (const_tree ret_type, const_tree fn_decl_or_type, bool outgoing)

    • Define this to return an RTX representing the place where a function returns bounds for returned pointers.

Instrumentation clones

In instrumented code each pointer is provided with bounds. For input pointer parameters it means we also have bounds passed. For calls it means we have additional bounds arguments for pointer arguments.

To have all IPA optimizations working correctly we have to express dataflow between passed and received bounds explicitly via additional entries in function declaration arguments list and in function type. Since we may have both instrumented and not instrumented code at the same time, we cannot replace all original functions with their instrumented variants. Therefore we create clones (versions) instead.

Instrumentation clones creation is a separate IPA pass which is a part of early local passes. Clones are created after SSA is built (because instrumentation pass works on SSA) and before any transformations which may change pointer flow and therefore lead to incorrect code instrumentation (possibly causing false bounds check failures).

Instrumentation clones have pointer bounds arguments added right after pointer arguments. Clones have the assembler name of the original function with a suffix added. The new assembler name is in a transparent alias chain with the original name. Thus we expect all calls to the original and instrumented functions to look similar in assembler.
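
The shape of an instrumentation clone can be pictured by writing one by hand in plain C. This is only a sketch: the compiler builds clones internally and passes bounds through target-specific slots, whereas here the bounds are an ordinary struct argument. The name sum_chkp, the bounds_t type and the ok flag used to report a violation are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uintptr_t lb, ub; } bounds_t; /* inclusive ends */

/* Original function: pointer argument only, no checks. */
static int sum(const int *p, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += p[i];
    return total;
}

/* Hand-written picture of an instrumented clone: the bounds argument
   P_BND follows the pointer argument P it belongs to, and every
   access through P is checked against it first. */
static int sum_chkp(const int *p, bounds_t p_bnd, int n, int *ok)
{
    int total = 0;
    *ok = 1;
    for (int i = 0; i < n; i++) {
        uintptr_t first = (uintptr_t)p + (uintptr_t)i * sizeof(int);
        uintptr_t last = first + sizeof(int) - 1;
        if (first < p_bnd.lb || last > p_bnd.ub) {
            *ok = 0; /* a real check would trap here */
            return total;
        }
        total += p[i];
    }
    return total;
}
```

Keeping both versions side by side mirrors why clones are needed: legacy callers keep using the original signature while instrumented callers pass bounds explicitly.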

During the instrumentation versioning pass we create instrumented versions of all functions with a body and also of all their aliases and thunks. Clones for functions with no body are created on demand (usually during call instrumentation).

Original and instrumented function nodes are connected with IPA reference IPA_REF_CHKP. It is mostly done to have reachability analysis working correctly. We may have no references to the instrumented function in the code but it still should be counted as reachable if the original function is reachable.

When original function bodies are not needed anymore we release them and transform the functions into a special kind of thunk. Each thunk has a call edge to the instrumented version. These thunks help to keep externally visible instrumented functions visible when linker resolution files are used. The linker has no information about the connection between the original and the instrumented function, and therefore we may wrongly decide (due to the difference in assembler names) that the instrumented function version is local and can be removed.

Instrumentation pass

The instrumentation pass works after all instrumentation clones are created and code is transformed into SSA form. Here is a list of pass responsibilities (all example dumps use i386 target versions of builtins):

Static constructors

We introduce static constructors to initialize static bounds and the Bounds Table for statically initialized pointers. All statically initialized objects are registered by the front-end parsers in a special map. After module compilation we generate a constructor (or several constructors, to avoid very big functions) to initialize all emitted static pointers. The technique for building such constructors is very simple. We just create functions holding the initialization code and put a special mark (attribute) on each function. During the instrumentation pass all required Bounds Table modifications are added automatically, and after that we detect the constructor mark and simply remove the original initialization code.
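
The need for such constructors can be illustrated with the GCC constructor attribute in user code. This is a hand-written analogy rather than the code the compiler generates: the bounds_t type, the message_ptr_bounds variable standing in for a Bounds Table entry, and the bounds_initialized flag are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uintptr_t lb, ub; } bounds_t; /* inclusive ends */

static char message[] = "hello";

/* Statically initialized pointer: its value exists before any code
   runs, so no ordinary store could have recorded its bounds. */
static char *message_ptr = message;

/* Hypothetical stand-in for the Bounds Table entry of message_ptr. */
static bounds_t message_ptr_bounds;
static int bounds_initialized;

/* Runs before main (GCC/Clang constructor attribute) and records the
   bounds of the object that message_ptr was initialized with. */
__attribute__((constructor))
static void init_static_bounds(void)
{
    message_ptr_bounds.lb = (uintptr_t)message;
    message_ptr_bounds.ub = (uintptr_t)message + sizeof message - 1;
    bounds_initialized = 1;
}
```

By the time main starts, the bounds for the static pointer are already in place, which is the effect the generated constructors achieve for every statically initialized pointer in a module.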

Support in expand

We introduce some changes in the expand pass to handle incoming bounds, bounds passed to calls and returned bounds. We define two new macros (DECL_BOUNDS_RTL and SET_DECL_BOUNDS_RTL) to associate a PARM_DECL or RESULT_DECL with the RTL slot where input or returned bounds are located.
To handle bounds passed to calls we change the structure representing an argument by adding fields to hold the original bounds, the expanded bounds and the bounds slot to be used to pass bounds. If a passed argument should have a slot for its bounds, then the target hook function_arg returns it in a PARALLEL expr (similarly to the way several regular registers are returned). The special function chkp_split_slot is used to split the returned slot into one used to pass the value and another one used to pass the bounds. If the bounds slot is a register, the expanded bounds are simply moved to it. If it is an integer constant, it is the id of a special slot in the Bounds Table to be used for bounds passing; the target hook store_bounds_for_arg is then used to store the bounds. The same hook is used when the bounds slot is NULL (i.e. when the pointer is passed in memory, bounds are stored in the Bounds Table in the regular way). Bounds loading happens in a similar way but uses the load_bounds_for_arg target hook.

Target support

Currently support is implemented for the i386 target only. Target support consists of:

Known issues

* Support of the Intel® MPX in binutils -

* Since real hardware does not exist so far, anyone can use Intel® SDE to try the new technology. Binaries of MPX binutils and GCC are provided there.

None: Intel MPX support in the GCC compiler (last edited 2014-10-14 06:41:25 by IlyaEnkovich)