This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [PATCH][Middle-end]2nd patch of PR78809 and PR83026

From: Qing Zhao <qing dot zhao at oracle dot com>
To: Jakub Jelinek <jakub at redhat dot com>
Cc: gcc-patches <gcc-patches at gcc dot gnu dot org>, jeff Law <law at redhat dot com>, richard Biener <rguenther at suse dot de>
Date: Fri, 15 Dec 2017 11:17:37 -0600
Subject: Re: [PATCH][Middle-end]2nd patch of PR78809 and PR83026
Authentication-results: sourceware.org; auth=none
References: <C5493A43-682F-4977-BDC8-5A21BB2CE7CA@oracle.com> <20171214204542.GB2353@tucnak> <D98A8D57-3A65-4DA2-B661-1E0531C380BF@oracle.com> <20171215164245.GL2353@tucnak>

> On Dec 15, 2017, at 10:42 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> 
> On Fri, Dec 15, 2017 at 10:08:03AM -0600, Qing Zhao wrote:
>> a little confused here:
>> 
>> in the current code:
>> 	. the first case is:          result = strcmp() != 0
>> 	. the second case is:    if (strcmp() != 0)
>> 
>> so, the missing case you mentioned above is:
>> 
>>        result = if (strcmp() != 0) 
>> 
>> or something else?
> 
> result = (strcmp () != 0 ? 15 : 37);
> or similar.  Though, usually COND_EXPRs are added by the tree-if-conversion
> pass, so you might need -ftree-loop-if-convert option and it probably needs
> to be within some loop which will have just a single bb after the
> if-conversion.

I see. thanks.

>> 
>>> if you just use else, you don't need to initialize const_string_leni
>>> either.
>> 
>> I think that const_string_leni still need to be initialized in this case, because when idx2 is non-zero,  
>> const_string_leni is initialized to compute_string_length (idx2). 
> 
> Sure.  But
>  type uninitialized_var;
>  if (cond1)
>    uninitialized_var = foo;
>  else if (cond2)
>    uninitialized_var = bar;
>  use (uninitialized_var);
> is a coding style which asks for -Wmaybe-uninitialized warnings, in order
> not to warn, the compiler has to prove that cond1 || cond2 is always true,
> which might not be always easy for the compiler.

in my case, I already initialize the “uninitialized_var” when declared it:

  HOST_WIDE_INT const_string_leni = -1;

  if (idx1)
    {
      const_string_leni = compute_string_length (idx1);
      var_string = arg2;
    }
  else if (idx2)
    {
      const_string_leni = compute_string_length (idx2);
      var_string = arg1;
    }

so, the -Wmaybe-uninitialized should NOT issue warning, right?

but anyway, I can change the above as following:

 HOST_WIDE_INT const_string_leni = -1;

  if (idx1)
    {
      const_string_leni = compute_string_length (idx1);
      var_string = arg2;
    }
  else
    {
      gcc_assert (idx2);
      const_string_leni = compute_string_length (idx2);
      var_string = arg1;
    }

is this better?

> 
>>> This is something that looks problematic to me.  get_range_strlen returns
>>> some conservative upper bound on the string length, which is fine if
>>> var_string points to say a TREE_STATIC variable where you know the allocated
>>> size, or automatic variable.  But if somebody passes you a pointer to a
>>> structure and the source doesn't contain aggregate copying for it, not sure
>>> if you can take for granted that all the bytes are readable after the '\0'
>>> in the string.  Hopefully at least for flexible array members and arrays in
>>> such positions get_range_strlen will not provide the upper bound, but even
>>> in other cases it doesn't feel safe to me.
>> 
>> this is the part that took me most of the time during the implementation. 
>> 
>> I have considered the following 3 approaches to decide the size of the variable array:
>> 
>> 	A. use “compute_builtin_object_size” in tree-object-size.h to decide the size of the
>> object.   However, even with the simplest case, it cannot provide the information. 
> 
> compute_builtin_object_size with modes 0 or 1 computes upper bound, what you
> are really looking for is lower bound,

you mean: 0, 1 is for maximum object size, and 2 is for minimum object size?

yes, I am looking for minimum object size for this optimization. 

> so that would be mode 2, though that
> mode isn't actually used in real-world code and thus might be not fully
> tested.

so, using this routine with mode 2 should be the right approach to go? and we need fully testing on this too?

> 
>> 	B. use “get_range_strlen” in gimple-fold.h to decide the size of the object.  however, 
>> it cannot provide valid info for simple array, either. 
> 
> get_range_strlen returns you a range, the minval is not what you're looking
> for, that is the minimum string length, so might be too short for your
> purposes.  And maxval is an upper bound, but you are looking for lower
> bound, you need guarantees this amount of memory can be accessed, even if
> there is 0 in the first byte.

my understanding is that: get_range_strlen returns the minimum and maximum length of the string pointed by the 
pointer, and the maximum length of the string is determined by the size of the allocated memory pointed by the
pointer, so, it should serve my purpose,   did I misunderstand it?

> 
>>> Furthermore, in the comments you say that you do it only for small strings,
>>> but in the patch I can't see any upper bound, so you could transform strlen
>>> that would happen to return say just 1 or 2 with a function call that
>>> possibly reads megabytes of data (memcmp may read all bytes, not just stop
>>> at the first difference).
>> 
>> do you mean for very short constant string, we should NOT change it to a. call to memcmp?  instead we should just 
>> inline it with byte comparison sequence?
> 
> I mean we should never ever replace strcmp or strncmp call with library call to
> memcmp, you don't know how it is implemented and it could access any bytes
> in the new range, while previously strcmp/strncmp wouldn't.  

> If you have
> sufficient guarantees that the memory must be all mapped and thus read-only
> accessible (see above, I don't think get_range_strlen provides that), it
> might be fine to inline expand the memcmp some way, because there you have
> full control over what the compiler will emit.

agreed. and the size check in the implementation is try to guarantee the transformation
is safe.

so, the major issue here is whether “get_range_strlen” can provide the good info on this. whether we should use
compute_builtin_object_size mode 2 for this purpose.

I can change to use “compute_builtin_object_size” and mode 2, and try to see whether any improvement is needed 
to compute_builtin_object_size for simple cases.

what’s your opinion?


>  But if we find out during
> expansion we don't want to expand it inline, we should fall back to calling
> strcmp or strncmp.

under what situation we will NOT expand the memcpy_eq call inline?

Qing

>  Which means, instead of changing the call to the
> memcmp_eq builtin, change it to some STRCMP or STRNCMP internal function,
> ensure it is expanded inline the way you prefer and if it can't, it will
> call strcmp or strncmp.
> 
> 	Jakub

Follow-Ups:
- Re: [PATCH][Middle-end]2nd patch of PR78809 and PR83026
  - From: Jakub Jelinek

References:
- [PATCH][Middle-end]2nd patch of PR78809 and PR83026
  - From: Qing Zhao
- Re: [PATCH][Middle-end]2nd patch of PR78809 and PR83026
  - From: Jakub Jelinek
- Re: [PATCH][Middle-end]2nd patch of PR78809 and PR83026
  - From: Qing Zhao
- Re: [PATCH][Middle-end]2nd patch of PR78809 and PR83026
  - From: Jakub Jelinek

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]