This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Add support to trace comparison instructions and switch statements


Hi

Implementing __sanitizer_cov_trace_cmp[1248]_const is OK . 
And I will try to find some determinate way to judge this comparison is for loop or not.
Because all the loops(for() or while()) seem to be transformed to "if" and "goto" before running sancov pass.
Does it necessary to include APIs for float and double comparison ? like __sanitizer_cov_trace_cmpf(float, float)
Why you do not include them ?

Now, I am following some suggests about my codes from Gribow. Final patch is still on processing.
I am also waiting for copyright assignment of FSF which is requested by Jeff.

With Regards
Wish Wu

------------------------------------------------------------------
From:Dmitry Vyukov <dvyukov@google.com>
Time:2017 Jul 15 (Sat) 13:41
To:Kostya Serebryany <kcc@google.com>
Cc:Wish Wu <wishwu007@gmail.com>; gcc <gcc@gcc.gnu.org>; gcc-patches <gcc-patches@gcc.gnu.org>; Wish Wu <weixi.wwx@antfin.com>; Alexander Potapenko <glider@google.com>; andreyknvl <andreyknvl@google.com>; Victor Chibotaru <tchibo@google.com>; Yuri Gribov <tetra2005@gmail.com>
Subject:Re: Add support to trace comparison instructions and switch statements


On Fri, Jul 14, 2017 at 11:17 PM, Kostya Serebryany <kcc@google.com> wrote:
>>>> > Hi
>>>> >
>>>> > I wrote a test for "-fsanitize-coverage=trace-cmp" .
>>>> >
>>>> > Is there anybody tells me if these codes could be merged into gcc ?
>>>>
>>>>
>>>> Nice!
>>>>
>>>> We are currently working on Linux kernel fuzzing that use the
>>>> comparison tracing. We use clang at the moment, but having this
>>>> support in gcc would be great for kernel land.
>>>>
>>>> One concern I have: do we want to do some final refinements to the API
>>>> before we implement this in both compilers?
>>>>
>>>> 2 things we considered from our perspective:
>>>>  - communicating to the runtime which operands are constants
>>>>  - communicating to the runtime which comparisons are counting loop checks
>>>>
>>>> First is useful if you do "find one operand in input and replace with
>>>> the other one" thing. Second is useful because counting loop checks
>>>> are usually not useful (at least all but one).
>>>> In the original Go implementation I also conveyed signedness of
>>>> operands, exact comparison operation (<, >, etc):
>>>> https://github.com/dvyukov/go-fuzz/blob/master/go-fuzz-defs/defs.go#L13
>>>> But I did not find any use for that.
>>>> I also gave all comparisons unique IDs:
>>>> https://github.com/dvyukov/go-fuzz/blob/master/go-fuzz-dep/sonar.go#L24
>>>> That turned out to be useful. And there are chances we will want this
>>>> for C/C++ as well.
>>>>
>>>> Kostya, did anything like this pop up in your work on libfuzzer?
>>>> Can we still change the clang API? At least add an additional argument
>>>> to the callbacks?
>>>
>>>
>>> I'd prefer not to change the API, but extend it (new compiler flag, new
>>> callbacks), if absolutely needed.
>>> Probably make it trace-cmp-guard (similar to trace-pc-guard, with an extra
>>> parameter that has the ID).
>>> I don't like the approach with compiler-generated constant IDs.
>>
>> Yes, if we do it for C/C++, we need to create globals and pass pointer
>> to a global to the callbacks. IDs do not work for C/C++.
>>
>>> Yes, it's a bit more efficient, but much less flexible in presence of
>>> multiple modules, DSOs, dlopen, etc.
>>>
>>> I was also looking at completely inlining this instrumentation because it's
>>> pretty expensive even in it's current form
>>> (adding more parameters will make things worse).
>>> This is going to be much less flexible, of course, so I'll attack it only
>>> once I settle on the algorithm to handle CMPs in libFuzzer.
>>
>> This will require a new, completely different API for
>> compiler<->runtime anyway, so we can put this aside for now.
>>
>>
>>>> At the very least I would suggest that we add an additional arg that
>>>> contains some flags (1/2 arg is a const, this is counting loop check,
>>>> etc). If we do that we can also have just 1 callback that accepts
>>>> uint64's for args because we can pass operand size in the flags:
>>>
>>>
>>> How many flag combinations do we need and do we *really* need them?
>>>
>>> If the number of flag combinations is small, I'd prefer to have separate
>>> callbacks (__sanitizer_cov_trace_cmp_loop_bound ?)
>>>
>>> Do we really need to know that one arg is a const?
>>> It could well be a constant in fact, but compiler won't see it.
>>>
>>> int foo(int n) {   ... if (i < n) ... }
>>> ...
>>> foo(42);  // not inlined.
>>>
>>> We need to handle both cases the same way.
>>
>>
>> Well, following this line of reasoning we would need only
>> __asan_load/storeN callbacks for asan and remove
>> __asan_load/store1/2/4/8, because compiler might not know the size at
>> compile time. Constant-ness is only an optimization. If compiler does
>> not know that something is a const, fine. Based on my experience with
>> go-fuzz and our early experience with kernel, we badly need const
>> hint.
>> Otherwise fuzzer generates gazillions of candidates based on
>> comparison arguments. Note that kernel is several order of magnitude
>> larger than what people usually fuzz in user-space, inputs are more
>> complex and at the same time execution speed is several order of
>> magnitude lower. We can't rely on raw speed.
>>
>> Thinking of this more, I don't thing that globals will be useful in
>> the kernel context (the main problem is that we have multiple
>> transient isolated kernels). If we track per-comparison site
>> information, we will probably use PCs. So I am ready to give up on
>> this.
>>
>> Both of you expressed concerns about performance. Kostya says we
>> should not break existing clang API.
>> If we limit this to only constant-ness, then I think we can make this
>> both forward and backward compatible, which means we don't need to
>> handle it now. E.g. we can:
>>  - if both operands are const (if it's possible at all), don't emit any callback
>>  - if only one is const, emit __sanitizer_cov_trace_cmp_const1 and
>> pass the const in a known position (i.e. always first/second arg)
>>  - if none are const, emit callback __sanitizer_cov_trace_cmp_dyn1
>> Then compiler emits weak aliases form
>> __sanitizer_cov_trace_cmp_const/dyn1 to the old
>> __sanitizer_cov_trace_cmp1, which makes it backwards compatible.
>> New runtimes that implement __sanitizer_cov_trace_cmp_const/dyn1, also
>> need to provide the old __sanitizer_cov_trace_cmp1 for old compilers.
>
> SGTM++
>
> Although, maybe we can actually break the API this way to keep things
> simpler (no weak aliases)
> * leave __sanitizer_cov_trace_cmp[1248] for general case
> * add __sanitizer_cov_trace_cmp[1248]_const for constant in the 2-nd parameter
> * add __sanitizer_cov_trace_cmp[1248]_loop for loop bound in the 2nd parameter

This would work for us.

> * do we need __sanitizer_cov_trace_cmp[1248]_const_loop ?

We don't know yet. Potentially yes, because const is reading/writing
to a static buffer, while non-const is reading/writing to a buffer
with user-provided size.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]