This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PING 2][PATCH] enhance -Wrestrict to handle string built-ins (PR 78918)


On 11/25/2017 05:53 PM, Martin Sebor wrote:
> On 11/22/2017 04:50 PM, Jeff Law wrote:
>> On 11/16/2017 02:29 PM, Martin Sebor wrote:
>>>> On 10/23/2017 08:42 PM, Martin Sebor wrote:
>>>>> Attached is a reworked solution to enhance -Wrestrict while
>>>>> avoiding changing tree-vrp.c or any other VRP machinery.  Richard,
>>>>> in considering you suggestions I realized that the ao_ref struct
>>>>> isn't general enough to detect the kinds of problems I needed to
>>>>> etect (storing bit-offsets in HOST_WIDE_INT means out-of-bounds
>>>>> offsets cannot be represented or detected, leading to either false
>>>>> positives or false negatives).
>> So this seems to be a recurring theme, which makes me wonder if we
>> should have an ao_ref-like structure that deals in bytes rather than
>> bits and make it a first class citizen.   There's certainly clients that
>> work on bits and certainly clients that would prefer to work on bytes.
> 
> The class I introduced serves a different purpose than ao_ref and
> stores a lot more data.
> 
> In https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82042#c3 Richard
> says that "this [offset being HOST_WIDE_INT and storing bits] is
> a know deficiency in ao_ref 'offset' (and also size and maxsize).
> Blowing up to offset_int isn't really a good idea."
Understood.  My point was that we seem to be stumbling into the same
class of problems in more than once place.  So we may want to consider
having a class that looks like like ao_ref, but which operates in byte
sized chunks.

I don't think it should be a requirement to go forward through.  It's
more of a long term question/concern.


>> So I realize you don't have any code to answer this question, but your
>> thoughts on how much we might loose effectiveness if we didn't do the
>> warnings within gimple_fold_builtin_<whatever>, but instead broke out a
>> distinct pass to handle warnings?  My biggest design concern is the
>> warning from within the folder aspects.
> 
> With optimization enabled the folder folds things like this into
> MEM_REF which would prevent the warning unless the pass pass ran
> with early optimizations.
> 
>   struct S { char a[7], b[7]; };
> 
>   void sink (void*);
> 
>   void f (void)
>   {
>     struct S s;
>     sink (&s);
> 
>     unsigned n = sizeof s.a;
>     memcpy (s.a + 4, s.b, n);
> 
>     sink (&s);
>   }
> 
> Without optimization it isn't folded and so the memcpy call is
> emitted (but there is no warning).  The overlapping MEM_REF copy
> is safe but the overlapping memcpy call is not, so warning on it
> is helpful.
Understood.  I'm just trying to get a sense for how an implementation as
a separate pass would affect the quality of the diags we generate.  I
expect we'd lose some, the question is do we lose so many that the
ultimate result really isn't useful in practice and to balance that
against the various pros/cons of the different approaches.


> 
> I prototyped a pass over Thanksgiving for the -Wrestict code from
> builtins.c to and ran it just after the sprintf pass.  There are
> still lots of failures in the new test because it's non-trivial
> to compute the same data as builtins.c does (the data is also
> computed for -Wstringop-overflow).  I could move all that code
> to the new pass as well to clear up the failures.  I could also
> arrange for the pass to run multiple times to catch cases like
> the one above.  I suspect this would trigger some false negatives
> and/or positives due to the differences in range information, so
> it would mean some cleanup in the tests.  What I can't do without
> seriously compromising the feature is avoid calling into the new
> pass from tree-ssa-strlen.c, but that would presumably be fine.
I note that you posted that as a follow-up.  I'll take a look at it
momentarily.


> 
> With that said, and although I'm not necessarily opposed to it,
> moving all this code into its own pass would mean a non-trivial
> amount of work for what seems like a questionable benefit.  All
> it would achieve, as far as I can see, is duplicating some of
> the work that's already been done: iterating over the GIMPLE,
> testing for built-ins to handle, and extracting their arguments.
> What exactly do you hope to accomplish by moving it into its own
> pass?  (If it's a matter of keeping the warning code separate
> that can easily be done by moving it to its own file.)
By moving the warning into its own pass we get several benefits.

1. There is a general desire not to mix diagnostics and code
transformations.  Splitting it out is in line with that goal.

2. We get the flexibility to put the diagnostic pass wherever in the
pipeline makes the most sense.  We can even have early/late versions to
try and increase precision

3. Sometimes, but not always the analysis can be shared.  In that model
the underlying analysis engine is where the bulk of the work happens.
The optimizations and diagnostics are just clients of the analysis
module.  It also encourages re-use of the analysis module.  Oh how I
wish someone had done this with tree-ssa-uninit.c :(

4. It sometimes makes the code cleaner.   You don't have to worry about
the optimizer coming along and invaliding some chunk of data, or
introducing new names that you don't have information for, etc etc.

Of course there's a series of pros for a single pass that handles both.
 I won't enumerate them.

THe point is to make a reasoned evaluation of the pros and cons of an
integrated vs separate approach.


> 
> The only similarity between builtin_memref and ao_ref is between
> the ctor and ao_ref_init_from_ptr_and_size.  The ctor is a bit more
> involved because it deals with non-constant offsets.  But I'm not
> aware of any corner cases in ao_ref_init_from_ptr_and_size (the
> ctor started as a copy of the function and evolved to what it looks
> like now).
Understood.  Let me take a look at the follow-ups.

Jeff


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]