[PATCH, ivopt] Try aligned offset when get_address_cost

Richard Biener richard.guenther@gmail.com
Wed Aug 6 12:25:00 GMT 2014


On Wed, Aug 6, 2014 at 8:34 AM, Zhenqiang Chen
<zhenqiang.chen@linaro.org> wrote:
> On 5 August 2014 21:59, Richard Biener <richard.guenther@gmail.com> wrote:
>> On Mon, Aug 4, 2014 at 11:09 AM, Zhenqiang Chen <zhenqiang.chen@arm.com> wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Bin.Cheng [mailto:amker.cheng@gmail.com]
>>>> Sent: Monday, August 04, 2014 4:41 PM
>>>> To: Zhenqiang Chen
>>>> Cc: gcc-patches List
>>>> Subject: Re: [PATCH, ivopt] Try aligned offset when get_address_cost
>>>>
>>>> On Mon, Aug 4, 2014 at 2:28 PM, Zhenqiang Chen
>>>> <zhenqiang.chen@arm.com> wrote:
>>>> > Hi,
>>>> >
>>>> > For some TARGET, like ARM THUMB1, the offset in a load/store must be
>>>> > naturally aligned.  But in function get_address_cost, when computing
>>>> > max_offset, it only tries byte-aligned offsets:
>>>> >
>>>> >   ((unsigned HOST_WIDE_INT) 1 << i) - 1
>>>> >
>>>> > which cannot pass the thumb_legitimate_offset_p check called from
>>>> > thumb1_legitimate_address_p for HImode and SImode.
>>>> >
>>>> > The patch adds additional try for aligned offset:
>>>> >
>>>> >   ((unsigned HOST_WIDE_INT) 1 << i) - GET_MODE_SIZE (address_mode).
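[Editor's aside: the two candidate offsets above can be sketched as small C helpers.  The names below are illustrative only, not GCC's; "mode_size" stands in for GET_MODE_SIZE (address_mode).]

```c
/* Hypothetical helpers mirroring the two candidate offsets discussed
   in the thread.  Not GCC code; a minimal sketch of the arithmetic.  */

/* What get_address_cost already tries: all bits below i set.  For
   i == 5 this is 31, which is not a multiple of 4 and so fails the
   Thumb-1 legitimacy check for SImode accesses.  */
static unsigned long
byte_aligned_off (int i)
{
  return ((unsigned long) 1 << i) - 1;
}

/* The patch's additional candidate: for a power-of-two mode_size,
   the largest multiple of mode_size below 1 << i.  For i == 5 and
   mode_size == 4 this is 28, which is naturally aligned.  */
static unsigned long
naturally_aligned_off (int i, unsigned long mode_size)
{
  return ((unsigned long) 1 << i) - mode_size;
}
```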
>>>> >
>>>> > Bootstrap and no make check regression on X86-64.
>>>> > No make check regression on qemu for Cortex-m0 and Cortex-m3.
>>>> > For Cortex-m0, no performance changes with coremark and dhrystone.
>>>> > Coremark code size is ~0.44% smaller, eembcv2 code size is ~0.22%
>>>> > smaller, and CSiBE code size is ~0.05% smaller.
>>>> >
>>>> > OK for trunk?
>>>> >
>>>> > Thanks!
>>>> > -Zhenqiang
>>>> >
>>>> > ChangeLog
>>>> > 2014-08-04  Zhenqiang Chen  <zhenqiang.chen@arm.com>
>>>> >
>>>> >         * tree-ssa-loop-ivopts.c (get_address_cost): Try aligned offset.
>>>> >
>>>> > testsuite/ChangeLog:
>>>> > 2014-08-04  Zhenqiang Chen  <zhenqiang.chen@arm.com>
>>>> >
>>>> >         * gcc.target/arm/get_address_cost_aligned_max_offset.c: New test.
>>>> >
>>>> > diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
>>>> > index 3b4a6cd..562122a 100644
>>>> > --- a/gcc/tree-ssa-loop-ivopts.c
>>>> > +++ b/gcc/tree-ssa-loop-ivopts.c
>>>> > @@ -3308,6 +3308,18 @@ get_address_cost (bool symbol_present, bool var_present,
>>>> >           XEXP (addr, 1) = gen_int_mode (off, address_mode);
>>>> >           if (memory_address_addr_space_p (mem_mode, addr, as))
>>>> >             break;
>>>> > +         /* For some TARGET, like ARM THUMB1, the offset should be
>>>> > +            naturally aligned.  Try an aligned offset if address_mode
>>>> > +            is not QImode.  */
>>>> > +         off = (address_mode == QImode)
>>>> > +               ? 0
>>>> > +               : ((unsigned HOST_WIDE_INT) 1 << i)
>>>> > +                   - GET_MODE_SIZE (address_mode);
>>>> > +         if (off > 0)
>>>> > +           {
>>>> > +             XEXP (addr, 1) = gen_int_mode (off, address_mode);
>>>> > +             if (memory_address_addr_space_p (mem_mode, addr, as))
>>>> > +               break;
>>>> > +           }
>>>> Hi, why not just check "address_mode != QImode"?  Setting off to 0 and
>>>> then checking it seems unnecessary.
>>>
>>> Thanks for the comments.
>>>
>>> ((unsigned HOST_WIDE_INT) 1 << i) - GET_MODE_SIZE (address_mode) can be
>>> negative (for small i) for any mode other than QImode.  A negative value
>>> cannot be max_offset, so the off > 0 check already filters it out.
>>>
>>> For QImode, "((unsigned HOST_WIDE_INT) 1 << i) - GET_MODE_SIZE
>>> (address_mode)" equals "((unsigned HOST_WIDE_INT) 1 << i) - 1", which
>>> was already tried, so there is no need to check it again.
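[Editor's aside: the skip-if-non-positive behavior described above can be sketched as follows.  The helper name is illustrative; in GCC, off is a signed HOST_WIDE_INT, modeled here as long.]

```c
/* Sketch of the guard discussed above: with a signed off, a
   non-positive aligned candidate is simply not tried, and for
   QImode (mode_size == 1) the candidate equals the byte-aligned
   value already tested.  */
static long
aligned_candidate (int i, long mode_size)
{
  long off = ((long) 1 << i) - mode_size;
  return off > 0 ? off : 0;  /* 0 means "nothing new to try" */
}
```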
>>>
>>> I think the compiler can optimize the patch like
>>>
>>> diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
>>> index 3b4a6cd..213598a 100644
>>> --- a/gcc/tree-ssa-loop-ivopts.c
>>> +++ b/gcc/tree-ssa-loop-ivopts.c
>>> @@ -3308,6 +3308,19 @@ get_address_cost (bool symbol_present, bool
>>> var_present,
>>>           XEXP (addr, 1) = gen_int_mode (off, address_mode);
>>>           if (memory_address_addr_space_p (mem_mode, addr, as))
>>>             break;
>>> +         /* For some TARGET, like ARM THUMB1, the offset should be
>>> +            naturally aligned.  Try an aligned offset if address_mode
>>> +            is not QImode.  */
>>> +         if (address_mode != QImode)
>>> +           {
>>> +             off = ((unsigned HOST_WIDE_INT) 1 << i)
>>> +                     - GET_MODE_SIZE (address_mode);
>>> +             if (off > 0)
>>> +               {
>>> +                 XEXP (addr, 1) = gen_int_mode (off, address_mode);
>>> +                 if (memory_address_addr_space_p (mem_mode, addr, as))
>>> +                   break;
>>> +               }
>>> +           }
>>>         }
>>>        if (i == -1)
>>>          off = 0;
>>
>> But is off now guaranteed to be the max value? (1 << (i-1) ) - 1 for
>> small i is larger than (1 << i) - GET_MODE_SIZE (address_mode).
>>
>> That is, I think you want to guard this with 1 << (i - 1) >
>> GET_MODE_SIZE (address_mode)?
>
> Yes.  Without the off > 0 check, off is not guaranteed to be the max
> value.  With off > 0, it is guaranteed that
>
> (1 << i) - GET_MODE_SIZE (address_mode) is greater than (1 << (i-1)) - 1.
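[Editor's aside: the claim that off > 0 suffices can be checked exhaustively for power-of-two mode sizes.  If off = (1 << i) - size > 0, then size < 2^i, hence size <= 2^(i-1) when size is a power of two, hence off >= 2^(i-1) > 2^(i-1) - 1.  A sketch (names illustrative):]

```c
/* Returns 1 if, for every probed i and power-of-two mode size, a
   positive aligned candidate (1 << i) - size strictly exceeds the
   previous byte-aligned bound (1 << (i - 1)) - 1; 0 otherwise.  */
static int
aligned_off_exceeds_previous_bound (void)
{
  for (int i = 1; i < 30; i++)
    for (long size = 1; size <= 16; size <<= 1)
      {
        long off = ((long) 1 << i) - size;
        if (off > 0 && off <= ((long) 1 << (i - 1)) - 1)
          return 0;  /* counterexample to the claim */
      }
  return 1;
}
```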
>
>> You don't adjust the negative offset side - why?
>
> -((unsigned HOST_WIDE_INT) 1 << i) is already the min aligned offset.

Ok.

The patch is ok then.

Thanks,
Richard.

> Thanks!
> -Zhenqiang
>
>> Richard.
>>
>>>
>>>> Thanks,
>>>> bin
>>>> >         }
>>>> >        if (i == -1)
>>>> >          off = 0;
>>>> > diff --git a/gcc/testsuite/gcc.target/arm/get_address_cost_aligned_max_offset.c b/gcc/testsuite/gcc.target/arm/get_address_cost_aligned_max_offset.c
>>>> > new file mode 100644
>>>> > index 0000000..cc3e2f7
>>>> > --- /dev/null
>>>> > +++ b/gcc/testsuite/gcc.target/arm/get_address_cost_aligned_max_offset.c
>>>> > @@ -0,0 +1,28 @@
>>>> > +/* { dg-do compile } */
>>>> > +/* { dg-options "-mthumb -O2" }  */
>>>> > +/* { dg-require-effective-target arm_thumb1_ok } */
>>>> > +
>>>> > +unsigned int
>>>> > +test (const short p16[6 * 64])
>>>> > +{
>>>> > +  unsigned int i = 6;
>>>> > +  unsigned int ret = 0;
>>>> > +
>>>> > +  do
>>>> > +    {
>>>> > +      unsigned long long *p64 = (unsigned long long*) p16;
>>>> > +      unsigned int *p32 = (unsigned int*) p16;
>>>> > +      ret += ret;
>>>> > +      if (p16[1] || p32[1])
>>>> > +       ret++;
>>>> > +      else if (p64[1] | p64[2] | p64[3])
>>>> > +       ret++;
>>>> > +      p16 += 64;
>>>> > +      i--;
>>>> > +    } while (i != 0);
>>>> > +
>>>> > +  return ret;
>>>> > +}
>>>> > +
>>>> > +/* { dg-final { scan-assembler-not "#22" } } */
>>>> > +/* { dg-final { scan-assembler-not "#14" } } */



More information about the Gcc-patches mailing list