[PATCH, MPX, 2/X] Pointers Checker [5/25] Tree and gimple ifaces

Ilya Enkovich enkovich.gnu@gmail.com
Mon Nov 11 14:09:00 GMT 2013


2013/11/11 Richard Biener <richard.guenther@gmail.com>:
> On Thu, Nov 7, 2013 at 10:07 PM, Ilya Enkovich <enkovich.gnu@gmail.com> wrote:
>> 2013/11/7 Jeff Law <law@redhat.com>:
>>> On 11/04/13 05:40, Richard Biener wrote:
>>>>>
>>>>>
>>>>> Effectively the bounds are passed "on the side".
>>>>
>>>>
>>>> Well, not exactly.  I can see __bound_tmp.0_4 passed to access_and_store.
>>>
>>> I'm referring to how they're dealt with in FUNCTION_ARG and friends, ie, the
>>> low level aspects.  Maybe that's why we're crossing wires here.
>>>
>>>
>>>
>>>>
>>>> Are __builtin_ia32_bndcu and access_and_store supposed to be
>>>> called directly after each other?  What if for example profile
>>>> instrumentation
>>>> inserts a call inbetween them?
>>>
>>> That's a question for Ilya.
>>
>> I think there is some misunderstanding. __builtin_ia32_bndcu and
>> access_and_store calls are completely independent in this example.
>> This is just a code from example on how user-level builtins may be
>> used with legacy code.
>>
>>>
>>>
>>>>>
>>>>> You can't actually interleave them -- that results in MPX and normal code
>>>>> not being able to interact.   Passing the bound at the end doesn't really
>>>>> work either -- varargs and the desire to pass some of the bounds around
>>>>> in
>>>>> bound registers.
>>>>
>>>>
>>>> I'm only looking at how bound arguments are passed at the GIMPLE
>>>> level - which I think is arbitrary given they are passed on-the-side
>>>> during code-generation.  So I'm looking for a more "sensible" option
>>>> for the GIMPLE representation which looks somewhat fragile to me
>>>> (it looks the argument is only there to have a data dependence
>>>> between the bndcu intrinsic and the actual call?)
>>>
>>> OK, we're clearly looking at two different aspects here.  While I think we
>>> can express them arbitrarily in GIMPLE, we have to pass them on the side
>>> once we get into the low level code for compatibility purposes.
>>>
>>> I think argument passing for MPX is going to ultimately be ugly/fragile
>>> regardless of what we do given the constraints.  Given we're passing on the
>>> side at the low level, I'd prefer them passed on the side in GIMPLE, but I'm
>>> willing to consider other alternatives.
>>>
>>>
>>>>
>>>>>>
>>>>>> /* Return the number of arguments used by call statement GS
>>>>>>      ignoring bound ones.  */
>>>>>>
>>>>>> static inline unsigned
>>>>>> gimple_call_num_nobnd_args (const_gimple gs)
>>>>>> {
>>>>>>     unsigned num_args = gimple_call_num_args (gs);
>>>>>>     unsigned res = num_args;
>>>>>>     for (unsigned n = 0; n < num_args; n++)
>>>>>>       if (POINTER_BOUNDS_P (gimple_call_arg (gs, n)))
>>>>>>         res--;
>>>>>>     return res;
>>>>>> }
>>>>>>
>>>>>> the choice means that gimple_call_num_nobnd_args is not O(1).
>>>>>
>>>>>
>>>>> Yes, but I don't see that's terribly problematical.
>>>>
>>>>
>>>> Well, consider compile.exp limits-fnargs.c written with int * parameters
>>>> and bound support ;)  [there is a limit on the number of bound registers
>>>> available, but the GIMPLE parts doesn't put any limit on instrumented
>>>> args?]
>>>
>>> Right.  Go back to Ilya's earlier messages :-)
>>>
>>> There's these two magic hooks TARGET_{LOAD,STORE}_BOUNDS_FOR_ARG.  At some
>>> point when you run out of regs, you start shoving these things into memory.
>>> I think we're ultimately going to need further clarification here.
>>>
>>>
>>>>>>
>>>>>> GIMPLE layout depending on flag_check_pointer_bounds sounds
>>>>>> like a recipie for desaster if you consider TUs compiled with and
>>>>>> TUs compiled without and LTO.  Or if you consider using
>>>>>> optimized attribute with that flag.
>>>>>
>>>>>
>>>>> Sorry, I don't follow.  Can you elaborate please.
>>>>
>>>>
>>>> Compile TU1 with -fno-chk, TU2 with -fchk, LTO both object files
>>>> which gives you - well, either -fno-chk or -fchk.  Then you read in
>>>> the GIMPLE and when calling gimple_call_nobnd_arg you get
>>>> a mismatch between where you generated the gimple and where
>>>> you use the gimple.
>>>>
>>>> A flag on whether a call is instrumented is better (consider inlining
>>>> from TU1 into TU2 where a flag on struct function for example wouldn't
>>>> be enough either).
>>>
>>> OK.  I see what you're saying.  Whether or not we have these extra args is a
>>> property of the call site, not the CFUN, target DECL or the TU. That makes
>>> sense.
>>>
>>>
>>> Ilya, I think there's enough confusion & issues around the call ABI/API that
>>> we need to sort it out before moving forward in any significant way.
>>>
>>> Let's go back to argument passing and go through some concrete examples of
>>> the various cases.  Make sure to tie them into the calls to
>>> TARGET_{LOAD,STORE}_BOUNDS_FOR_ARG and the new gimple_call_* routines.
>>
>> OK. Lets see at how expand pass handles instrumented calls.  As usual,
>> for each passed argument expand calls target hook to determine where
>> argument is passed.  Now target may return not only slot for arg, but
>> also a slot for bounds in case arg may have them (PARALLEL expr is
>> used to hold all slots).  If bounds slot is register, then expand
>> simply puts bounds to it.  If it is not a register, target hook is
>> called to store bounds (TARGET_STORE_BOUNDS_FOR_ARG).  Incoming bounds
>> are handled in a similar way but with usage of
>> TARGET_LOAD_BOUNDS_FOR_ARG hook.
>>
>> The only remained problem for expand here is to identify which bounds
>> should be passed for arg.  To tie bounds to argument I put all bounds
>> passed for arg (in case arg is a structure, we may have multiple
>> bounds) as additional arguments in the call.  Each regular argument is
>> immediately followed by all bounds passed for it. For example:
>>
>> test (&"Hello world!"[0], 0);
>>
>> is translated into
>>
>> test (&"Hello world!"[0], __bound_tmp.0_1, 0);
>>
>> where __bounds_tmp.0_1 - bounds of passed string.
>>
>> Of course, such call modification has some implications.  Some
>> optimizations may rely on number of arguments in the call (e,g, strlen
>> pass) and indexes of arguments.  Also now index of arg in fndecl does
>> not match the index of actual arg in the call.  For such cases new
>> functions are introduced to get args by it's original index, like
>> there is no instrumentation.  BTW there are not many usages of these
>> new functions and almost all of them are in strlen pass.
>>
>> Also some changes, like the patch I sent for calls modifications and
>> copy, are required.  I suppose such changes are quite simple. and not
>> very wide.
>>
>> Regarding proposal to put all bounds to the end of call's arg list - I
>> do not think it makes life much easier.  It still requires
>> modifications in verifier (because it checks arg count), calls copy
>> etc.  It becomes more complex to identify bounds of arg.  Also GIMPLE
>> dump for calls becomes less informative.  Surely it allows to get rid
>> of gimple_call_nobnd_arg and gimple_call_get_nobnd_arg_index but will
>> require additional interface functions for bound args.
>
> What about passing bounds together with the value in the same slot?
> Like via
>
>   arg = __builtin_tie_arg_and_bound (arg, bound);
>   foo (arg);

There are other ways to get bounds. E.g. load bounds from table. If I
use the same builtin for both binding and bounds load, then It is not
clear what to do in case same bounds are used for multiple aegs. For
Example:

int *p
int foo()
{
  bar(p, p+1, p+2);
}

Here all 3 args have same bounds, which are result of __builtin_bndldx
call. If I use the same builtin for binding then I have 3 loads. So,
probably, a separate builtin is better.

>
> you say you can have bounds for aggregates, do you mean
>
>   struct X { int *x; int *y; };
>   foo (struct X);
>
> the ABI says for a call to foo () you pass two bound arguments?
> Or do you merely mean the degenerate case struct X { int *x; }?

In case of structure we pass bounds for all pointers in it.

Ilya

>
> Richard.
>
>> Ilya
>>
>>>
>>> Jeff



More information about the Gcc-patches mailing list