[PATCH] Reduce stack usage in sha512 (PR target/77308)

Kyrill Tkachov kyrylo.tkachov@foss.arm.com
Wed Oct 19 16:47:00 GMT 2016


On 19/10/16 17:06, Bernd Edlinger wrote:
> On 10/19/16 12:44, Christophe Lyon wrote:
>> On 19 October 2016 at 10:34, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote:
>>> On 19/10/16 07:55, Christophe Lyon wrote:
>>>> On 18 October 2016 at 17:35, Christophe Lyon <christophe.lyon@linaro.org>
>>>> wrote:
>>>>> On 18 October 2016 at 16:45, Bernd Edlinger <bernd.edlinger@hotmail.de>
>>>>> wrote:
>>>>>> On 10/18/16 10:36, Christophe Lyon wrote:
>>>>>>> I am seeing a lot of regressions since this patch was committed:
>>>>>>>
>>>>>>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/241273/report-build-info.html
>>>>>>>
>>>>>>> (you can click on "REGRESSED" to see the list of regressions, "sum"
>>>>>>> and "log" to download
>>>>>>> the corresponding .sum/.log)
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Christophe
>>>>>>>
>>>>>> Oh, sorry, I have completely missed that.
>>>>>> Unfortunately I have tested one of the good combinations.
>>>>>>
>>>>>> I have not considered the case that in and out could be
>>>>>> the same register.  But that happens here.
>>>>>>
>>>>>>
>>>>>> This should solve it.
>>>>>>
>>>>>> Can you give it a try?
>>>>>>
>>>>> Sure, I have started the validations, it will take a few hours.
>>>>>
>>>> It looks OK, thanks.
>>>
>>> Ok. Thanks for testing Christophe.
>>> Sorry for not catching it in review.
>>> Kyrill
>>>
>> Since it was painful to check that this 2nd patch fixed all the regressions
>> observed in all the configurations, I ran another validation with
>> the 2 patches combined, on top of r241272 to check the effect of the two
>> over the previous reference.
>>
>> It turns out there is still a failure:
>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/241272-r241273-combined/report-build-info.html
>> gcc.c-torture/execute/pr34971.c   -O0  execution test
>> now fails on arm-none-linux-gnueabihf when using thumb mode, expect
>> when targeting cortex-a5.
>>
> Thanks Christophe, I looked in the test case,
> and I think that is a pre-existing issue.
>
> I can make the test case fail before my patch:
>
> struct foo
> {
>     unsigned long long b:40;
> } x;
>
> extern void abort (void);
>
> void test1(unsigned long long res)
> {
>     /* Build a rotate expression on a 40 bit argument.  */
>     if ((x.b<<9) + (x.b>>31) != res)
>       abort ();
> }
>
> int main()
> {
>     x.b = 0x0100000001;
>     test1(0x0000000202);
>     x.b = 0x0100000000;
>     test1(0x0000000002);
>     return 0;
> }
>
>
> fails with -mcpu=cortex-a9 -mthumb -mfpu=neon-fp16 -O0
> even before my patch.
>
> before reload we have this insn:
> (insn 19 18 20 2 (parallel [
>               (set (reg:DI 113 [ _4 ])
>                   (lshiftrt:DI (reg:DI 112 [ _3 ])
>                       (const_int 31 [0x1f])))
>               (clobber (scratch:SI))
>               (clobber (scratch:SI))
>               (clobber (scratch:DI))
>               (clobber (reg:CC 100 cc))
>           ]) "pr34971.c":11 1232 {lshrdi3_neon}
>
>
> which is replaced to this insn:
> (insn 19 18 20 2 (parallel [
>               (set (reg:DI 1 r1 [orig:113 _4 ] [113])
>                   (lshiftrt:DI (reg:DI 0 r0 [orig:112 _3 ] [112])
>                       (const_int 31 [0x1f])))
>               (clobber (scratch:SI))
>               (clobber (scratch:SI))
>               (clobber (scratch:DI))
>               (clobber (reg:CC 100 cc))
>           ]) "pr34971.c":11 1232 {lshrdi3_neon}
>        (nil))
>
> And now that is not supposed to happen, when this code
> is executed for shifts by a constant less than 32:
>
>             emit_insn (SET (out_down, LSHIFT (code, in_down, amount)));
>             emit_insn (SET (out_down,
>                             ORR (REV_LSHIFT (code, in_up, reverse_amount),
>                                  out_down)));
>             emit_insn (SET (out_up, SHIFT (code, in_up, amount)));
>
>
> out_down may not be the same hard register than in_up for this to work.
>
> I opened a new BZ for this:
>    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78041
>
> I am not sure if this is a LRA issue or if it can be handled in the
> shift expansion.
>
> Is the second patch good for trunk as it is?

Thanks for the investigation and the bug report.
The patch is ok.
Kyrill

>
> Thanks
> Bernd.



More information about the Gcc-patches mailing list