This is the mail archive of the mailing list for the GCC project.

Re: [RFC] Using function clones for Pointer Bounds Checker

2014-05-15 15:27 GMT+04:00 Richard Biener <>:
> On Thu, May 15, 2014 at 1:07 PM, Ilya Enkovich <> wrote:
>> 2014-05-14 19:09 GMT+04:00 H.J. Lu <>:
>>> On Wed, May 14, 2014 at 1:18 AM, Ilya Enkovich <> wrote:
>>>> 2014-05-13 23:21 GMT+04:00 Jeff Law <>:
>>>>> On 05/13/14 02:38, Ilya Enkovich wrote:
>>>>>>>> propagate constant bounds value and remove checks in called function).
>>>>>>> So from a linking standpoint, presumably you have to mangle the
>>>>>>> instrumented
>>>>>>> caller/callee in some manner.  Right?  Or are you dynamically dispatching
>>>>>>> somehow?
>>>>>> Originally the idea was for the instrumented clone to have the same
>>>>>> assembler name as the original function. Since instrumented code is
>>>>>> fully compatible with not instrumented code, we always emit only one
>>>>>> version. Usage of the same assembler name allows instrumented and not
>>>>>> instrumented calls to look similar in assembler. It worked fine until
>>>>>> I tried it with LTO where assembler name is used as a unique
>>>>>> identifier. With linker resolution files it became even harder
>>>>>> to use such approach. To resolve these issues I started to use new
>>>>>> assembler name with postfix, but linked with the original name using
>>>>>> IDENTIFIER_TRANSPARENT_ALIAS. It gives different assembler names for
>>>>>> clones and originals during compilation, but both clone and original
>>>>>> functions have similar name in output assembler.
>>>>> OK.  So if I read that correctly, it implies that the existence of bounds
>>>>> information does not change the signature of the callee.   This is obviously
>>>>> important for C++.
>>>>> Sounds like I need to sit down with the branch and see how this works in the
>>>>> new scheme.
>>>> Both mpx branch and Wiki
>>>> (
>>>> page are up-to-date now and may be tried out either in NOP mode or
>>>> with simulator. Let me know if you have any troubles with using it.
>>> I built it.  But "-fcheck-pointer-bounds -mmpx" doesn't generate
>>> MPX enabled executable which runs on both MPX-enabled and
>>> non MPX-enabled hardwares. I didn't see any MPX run-time library.
>> Just checked out the branch and checked generated code.
>> #cat test.c
>> int
>> test (int *p, int i)
>> {
>>   return p[i];
>> }
>> #gcc -fcheck-pointer-bounds -mmpx test.c -S -O2
>> #cat test.s
>>         .file   "test.c"
>>         .section        .text.unlikely,"ax",@progbits
>> .LCOLDB0:
>>         .text
>> .LHOTB0:
>>         .p2align 4,,15
>>         .globl  test
>>         .type   test, @function
>> test:
>> .LFB1:
>>         .cfi_startproc
>>         movslq  %esi, %rsi
>>         leaq    (%rdi,%rsi,4), %rax
>>         bndcl   (%rax), %bnd0
>>         bndcu   3(%rax), %bnd0
>>         movl    (%rax), %eax
>>         bnd ret
>>         .cfi_endproc
>> ...
>> Checks are here. What do you see in your test?
> Wow, that's quite an overhead compared to the non-instrumented variant
>         movslq  %esi, %rsi
>         movl    (%rdi,%rsi,4), %eax
>         ret

The overhead is actually two instructions: the checks for the lower and
upper bounds. The lea instruction is probably a missed optimization. The
checks are cheap instructions and do not introduce new dependencies for
the load, which is the heaviest instruction here. BTW, the checks are not
the main source of overhead in instrumented code; bounds table management
(storing/loading bounds for stored/loaded pointers) is. Anyway, it is
too early to speak about overhead until we have hardware to measure it.
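
To make the two checks concrete, here is a rough software model in plain C
of what the instrumented test() above does. It is only a sketch: struct
bounds, violates_lower(), violates_upper() and checked_test() are
hypothetical names invented for illustration, not part of GCC or of any MPX
run-time, and returning -1 stands in for the #BR exception the hardware
would raise. It assumes both bounds are inclusive, so the upper check is
applied to the last byte of the 4-byte access (the 3(%rax) operand).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one bounds register: both limits are inclusive. */
struct bounds {
    uintptr_t lb;   /* address of the lowest valid byte  */
    uintptr_t ub;   /* address of the highest valid byte */
};

/* Models bndcl: nonzero when addr lies below the lower bound. */
static int violates_lower(struct bounds b, uintptr_t addr)
{
    return addr < b.lb;
}

/* Models bndcu: nonzero when addr lies above the upper bound. */
static int violates_upper(struct bounds b, uintptr_t addr)
{
    return addr > b.ub;
}

/*
 * Models the instrumented test(): returns 0 and stores p[i] in *out when
 * the access is in bounds, or -1 where the hardware would raise #BR.
 * The address is computed in integer arithmetic so that an out-of-range
 * index never forms an invalid pointer.
 */
static int checked_test(const int *p, int i, struct bounds b, int *out)
{
    uintptr_t addr = (uintptr_t)p
                   + (uintptr_t)((intptr_t)i * (intptr_t)sizeof(int));

    if (violates_lower(b, addr))                     /* bndcl (%rax)  */
        return -1;
    if (violates_upper(b, addr + sizeof(int) - 1))   /* bndcu 3(%rax) */
        return -1;

    *out = *(const int *)addr;                       /* movl (%rax)   */
    return 0;
}
```

For an array int a[4] with bounds { (uintptr_t)a, (uintptr_t)a + sizeof a - 1 },
indices 0..3 pass both checks, while -1 fails the lower check and 4 fails
the upper one.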

> I thought bounds-checking was done with some clever prefixes thus
> that
>         movslq  %esi, %rsi
>         bndmovl    (%rdi,%rsi,4), %eax, %bnd0
>         bnd ret
> would be possible (well, replace with valid ISA).

I doubt it would be possible to encode that while staying backward
compatible with existing hardware. Also, putting all the logic into one
instruction does not mean it executes faster than a sequence of
instructions, especially on out-of-order CPUs.


> Richard.
>> Ilya
>>> --
>>> H.J.
