This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Convert character arrays to string csts


On Tue, Aug 8, 2017 at 1:47 PM, Martin Liška <mliska@suse.cz> wrote:
> On 11/09/2016 11:22 AM, Richard Biener wrote:
>> On Fri, Nov 4, 2016 at 2:33 PM, Martin Liška <mliska@suse.cz> wrote:
>>> On 11/03/2016 02:00 PM, Jan Hubicka wrote:
>>>>> On 11/03/2016 01:12 PM, Martin Liška wrote:
>>>>>> +  tree init = DECL_INITIAL (decl);
>>>>>> +  if (init
>>>>>> +      && TREE_READONLY (decl)
>>>>>> +      && can_convert_ctor_to_string_cst (init))
>>>>>> +    DECL_INITIAL (decl) = build_string_cst_from_ctor (init);
>>>>>
>>>>> I'd merge these two new functions since they're only ever called
>>>>> together. We'd then have something like
>>>>>
>>>>> if (init && TREE_READONLY (decl))
>>>>>   init = convert_ctor_to_string_cst (init);
>>>>> if (init)
>>>>>   DECL_INITIAL (decl) = init;
>>>
>>> Done.
>>>
>>>>>
>>>>> I'll defer to Jan on whether finalize_decl seems like a good place
>>>>> to do this.
>>>>
>>>> I think finalize_decl may be bit too early because frontends may expects the
>>>> ctors to be in a way they produced them.  We only want to convert those arrays
>>>> seen by middle-end so I would move the logic to varpool_node::analyze
>>>>
>>>> Otherwise the patch seems fine to me.
>>>>
>>>> Honza
>>>>>
>>>>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple.c b/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple.c
>>>>>> index 283bd1c..b2d1fd5 100644
>>>>>> --- a/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple.c
>>>>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple.c
>>>>>> @@ -4,12 +4,15 @@
>>>>>> char *buffer1;
>>>>>> char *buffer2;
>>>>>>
>>>>>> +const char global[] = {'a', 'b', 'c', 'd', '\0'};
>>>>>> +
>>>>>> #define SIZE 1000
>>>>>>
>>>>>> int
>>>>>> main (void)
>>>>>> {
>>>>>>   const char* const foo1 = "hello world";
>>>>>> +  const char local[] = "abcd";
>>>>>>
>>>>>>   buffer1 = __builtin_malloc (SIZE);
>>>>>>   __builtin_strcpy (buffer1, foo1);
>>>>>> @@ -45,6 +48,10 @@ main (void)
>>>>>>     __builtin_abort ();
>>>>>>   if (__builtin_memchr (foo1, null, 12) != foo1 + 11)
>>>>>>     __builtin_abort ();
>>>>>> +  if (__builtin_memchr (global, null, 5) == 0)
>>>>>> +    __builtin_abort ();
>>>>>> +  if (__builtin_memchr (local, null, 5) == 0)
>>>>>> +    __builtin_abort ();
>>>>>
>>>>> How is that a meaningful test? This seems to work even with an
>>>>> unpatched gcc. I'd have expected something that shows a benefit for
>>>>> doing this conversion, and maybe also a test that shows it isn't
>>>>> done in cases where it's not allowed.
>>>
>>> It's meaningful as it scans that there's no __builtin_memchr in optimized dump.
>>> I'm adding new tests that does the opposite test.
>>>
>>>>>
>>>>>> tree
>>>>>> -build_string_literal (int len, const char *str)
>>>>>> +build_string_literal (int len, const char *str, bool build_addr_expr)
>>>>>
>>>>> New arguments should be documented in the function comment.
>>>
>>> Yep, improved.
>>>
>>>>>
>>>>>> +/* Return TRUE when CTOR can be converted to a string constant.  */
>>>>>
>>>>> "if", not "when".
>>>
>>> Done.
>>>
>>>>>
>>>>>> +  unsigned HOST_WIDE_INT elements = CONSTRUCTOR_NELTS (ctor);
>>>>>> +  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (ctor), idx, key, value)
>>>>>> +    {
>>>>>> +      if (key == NULL_TREE
>>>>>> +     || TREE_CODE (key) != INTEGER_CST
>>>>>> +     || !tree_fits_uhwi_p (value)
>>>>>> +     || !useless_type_conversion_p (TREE_TYPE (value), char_type_node))
>>>>>> +   return false;
>>>>>
>>>>> Shouldn't all elements have the same type, or do you really have to
>>>>> call useless_type_conversion in a loop?
>>>>>
>>>>>> +      /* Allow zero character just at the end of a string.  */
>>>>>> +      if (integer_zerop (value))
>>>>>> +   return idx == elements - 1;
>>>>>
>>>>> Don't you also have to explicitly check it's there?
>>>>>
>>>>>
>>>>> Bernd
>>>
>>>
>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>
>> I'm curious about the
>>
>> @@ -1495,7 +1495,8 @@ gimplify_decl_expr (tree *stmt_p, gimple_seq *seq_p)
>>         {
>>           if (!TREE_STATIC (decl))
>>             {
>> -             DECL_INITIAL (decl) = NULL_TREE;
>> +             if (!TREE_READONLY (decl) || TREE_CODE (init) != STRING_CST)
>> +               DECL_INITIAL (decl) = NULL_TREE;
>>               init = build2 (INIT_EXPR, void_type_node, decl, init);
>>               gimplify_and_add (init, seq_p);
>>               ggc_free (init);
>>
>> change.  Why keep DECL_INITIAL if you build a INIT_EXPR anyway?
>>
>> @@ -4438,6 +4439,19 @@ gimplify_init_constructor (tree *expr_p,
>> gimple_seq *pre_p, gimple_seq *post_p,
>>             break;
>>           }
>>
>> +       /* Replace a ctor with a string constant with possible.  */
>> +       if (TREE_READONLY (object)
>> +           && VAR_P (object))
>> +         {
>> +           tree string_ctor = convert_ctor_to_string_cst (ctor);
>> +           if (string_ctor)
>> +             {
>> +               TREE_OPERAND (*expr_p, 1) = string_ctor;
>> +               DECL_INITIAL (object) = string_ctor;
>> +               break;
>> +             }
>> +         }
>> +
>>         /* Fetch information about the constructor to direct later processing.
>>            We might want to make static versions of it in various cases, and
>>            can only do so if it known to be a valid constant initializer.  */
>>
>> hmm, so both these hunks will end up keeping a DECL_INITIAL
>> for non-static local consts?  Usually we end up with
>>
>> main ()
>> {
>>   const char local[5];
>>
>>   <bb 2>:
>>   local = "abcd";
>>
>> here.  So you keep DECL_INITIAL for folding?
>>
>> I believe we should create CONST_DECLs for the above (and make
>> CONST_DECLs first-class
>> symtab citizens), thus avoid runtime stack initialization for the
>> above and instead emit
>> "abcd" to the constant pool?
>
> Hi Richi.
>
> I've just returned to this in order to resolve long pending unfinished stuff.
> I have couple of questions:
>
> a) what should create a CONST_DECL, a FE or gimplifier when it identifies that
> it can be converted to CONST_DECL?

The gimplifier (prototype patches of mine did it that way when trying
to get rid of
&STRING_CST).

> b) Why do we need to put such local variables to symtab?

We don't need to, but we eventually should?

> c) Do we have a target that uses CONST_DECL for such (or similar) purpose?

Not sure.  gimplify_init_constructor ends up with .LC0* on for example aarch64
for initializers of arrays for example.

Attached old &STRING_CST -> &CONST_DECL gimplification patch (no
longer applies).

Richard.

> Thanks,
> Martin
>
>>
>> +  /* Skip constructors with a hole.  */
>> +  if (CONSTRUCTOR_NELTS (ctor) != ctor_length)
>> +    return false;
>>
>> not sure if that's maybe too clever ;)
>>
>> +  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (ctor), idx, key, value)
>> +    {
>> +      if (key == NULL_TREE
>> +         || !tree_fits_uhwi_p (key)
>> +         || !tree_fits_uhwi_p (value))
>> +       return false;
>>
>> I think key == NULL is very well valid (you are not using it...).  I'd
>> instead do
>>
>>      if (key && compare_tree_int (key, idx) != 0)
>>        return false;
>>
>> for the hole check.  value should always fit uhwi given the earlier
>> element type check.
>>
>> +tree
>> +convert_ctor_to_string_cst (tree ctor)
>> +{
>> +  if (!can_convert_ctor_to_string_cst (ctor))
>> +    return NULL_TREE;
>> +
>> +  unsigned HOST_WIDE_INT idx;
>> +  tree value;
>> +  vec<constructor_elt, va_gc> *elts = CONSTRUCTOR_ELTS (ctor);
>> +  char *str = XNEWVEC (char, elts->length ());
>> +
>> +  FOR_EACH_CONSTRUCTOR_VALUE (elts, idx, value)
>> +    str[idx] = (char)tree_to_uhwi (value);
>> +
>> +  tree init = build_string_literal (elts->length (),
>> +                                   ggc_alloc_string (str, elts->length ()),
>> +                                   false);
>> +  free (str);
>>
>> why alloc str on the heap, copy it to gc and the copy it again?
>> That is, you can pass 'str' to build_string_literal directly I think.
>>
>> In fact as you are refactoring build_string_literal please
>> refactor build_string into a build_string_raw taking just 'len'
>> (thus leaving out the actual string initialization apart from
>> 'appending' '\0') and then avoid the heap allocation.
>>
>> Refactor build_string_literal to take a tree STRING_CST
>> and build_string_literal_addr to build it's address.
>>
>> Thanks,
>> Richard.
>>
>>
>>
>>
>>> Martin
>>>
>

Attachment: fix-pr50199
Description: Binary data


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]