This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [ARM] PR 67591 ARM v8 Thumb IT blocks are deprecated


On 13 September 2017 at 18:33, Kyrill  Tkachov
<kyrylo.tkachov@foss.arm.com> wrote:
> Hi Christophe,
>
>
> On 13/09/17 16:23, Christophe Lyon wrote:
>>
>> Hi,
>>
>> On 12 October 2016 at 11:22, Christophe Lyon <christophe.lyon@linaro.org>
>> wrote:
>>>
>>> On 12 October 2016 at 11:14, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com>
>>> wrote:
>>>>
>>>> On 12/10/16 09:59, Christophe Lyon wrote:
>>>>>
>>>>> Hi Kyrill,
>>>>>
>>>>> On 7 October 2016 at 17:00, Kyrill Tkachov
>>>>> <kyrylo.tkachov@foss.arm.com>
>>>>> wrote:
>>>>>>
>>>>>> Hi Christophe,
>>>>>>
>>>>>>
>>>>>> On 07/09/16 21:05, Christophe Lyon wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> The attached patch is a first part to solve PR 67591: it removes
>>>>>>> several occurrences of "IT blocks containing 32-bit Thumb
>>>>>>> instructions are deprecated in ARMv8" messages in the
>>>>>>> gcc/g++/libstdc++/fortran testsuites.
>>>>>>>
>>>>>>> It does not remove them all yet. This patch only modifies the
>>>>>>> *cmp_and, *cmp_ior, *ior_scc_scc, *ior_scc_scc_cmp,
>>>>>>> *and_scc_scc and *and_scc_scc_cmp patterns.
>>>>>>> Additional work is required in sub_shiftsi etc, at least.
>>>>>>> I've started looking at these, but I decided I could already
>>>>>>> post this self-contained patch to check if this implementation
>>>>>>> is OK.
>>>>>>>
>>>>>>> Regarding *cmp_and and *cmp_ior patterns, the addition of the
>>>>>>> enabled_for_depr_it attribute is aggressive in the sense that it
>>>>>>> keeps
>>>>>>> only the alternatives with 'l' and 'Py' constraints, while in some
>>>>>>> cases the constraints could be relaxed. Indeed, these 2 patterns can
>>>>>>> swap their input comparisons, meaning that any of them can be emitted
>>>>>>> in the IT-block, and is thus subject to the ARMv8 deprecation.
>>>>>>> The generated code is possibly suboptimal in the cases where the
>>>>>>> operands are not swapped, since 'r' could be used.
>>>>>>>
>>>>>>> Cross-tested on arm-none-linux-gnueabihf with -mthumb/-march=armv8-a
>>>>>>> and --with-cpu=cortex-a57 --with-mode=thumb, showing only
>>>>>>> improvements:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/239850-depr-it-4/report-build-info.html
>>>>>>>
>>>>>>> Bootstrapped OK on armv8l HW.
>>>>>>>
>>>>>>> Is this OK?
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Christophe
>>>>>>
>>>>>>
>>>>>>    (define_insn_and_split "*ior_scc_scc"
>>>>>> -  [(set (match_operand:SI 0 "s_register_operand" "=Ts")
>>>>>> +  [(set (match_operand:SI 0 "s_register_operand" "=Ts,Ts")
>>>>>>           (ior:SI (match_operator:SI 3 "arm_comparison_operator"
>>>>>> -                [(match_operand:SI 1 "s_register_operand" "r")
>>>>>> -                 (match_operand:SI 2 "arm_add_operand" "rIL")])
>>>>>> +                [(match_operand:SI 1 "s_register_operand" "r,l")
>>>>>> +                 (match_operand:SI 2 "arm_add_operand" "rIL,lPy")])
>>>>>>                   (match_operator:SI 6 "arm_comparison_operator"
>>>>>> -                [(match_operand:SI 4 "s_register_operand" "r")
>>>>>> -                 (match_operand:SI 5 "arm_add_operand" "rIL")])))
>>>>>> +                [(match_operand:SI 4 "s_register_operand" "r,l")
>>>>>> +                 (match_operand:SI 5 "arm_add_operand" "rIL,lPy")])))
>>>>>>
>>>>>> Can you please put the more restrictive alternatives (lPy) first?
>>>>>> Same with the other patterns your patch touches.
>>>>>> Ok with that change if a rebased testing run is ok.
>>>>>> Sorry for the delay in reviewing.
>>>>>>
>>>>> OK, I will update my patch accordingly.
>>>>>
>>>>> However, when I discussed this with Ramana during the Cauldron,
>>>>> he requested benchmark results. So far, I was able to run spec2006
>>>>> on an APM machine, and I'm seeing performance changes in the
>>>>> range 11% improvement (465.tonto) to 7% degradation (433.milc).
>>>>>
>>>>> Would that be acceptable?
>>>>
>>>>
>>>> Those sound like quite large swings.
>>>
>>> Indeed, but most are in the -1%-+1% range.
>>>
>>>> Are you sure the machine was not running anything else at the time
>>>> or playing tricks with frequency scaling?
>>>
>>> No, I had no such guarantee. I used this machine temporarily,
>>> first to check that bootstrap worked. I planed to use another
>>> board with an A57 "standard" microarch for proper
>>> benchmarking, but I'm not sure when I'll have access to it
>>> (wrt to e/o gcc stage1), that's why I reported these early
>>> figures.
>>>
>>>> Did all iterations of SPEC show a consistent difference?
>>>>
>>>> If the changes are consistent, could you have a look at the codegen
>>>> to see if there are any clues to the differences?
>>>
>>> I will update my patch according to your comment, re-run the bench
>>> and have a deeper look at the codegen differences.
>>>
>> I have finally been able to run benchmarks with my patch updated
>> according to your comment, on new machines where we have
>> better control of the environment (frequency, etc...).
>>
>> These machines use cortex-a57 CPUs and spec2006 shows little
>> difference with and without this patch.
>>
>> Comparing several runs with and without the patch:
>> - gcc is about 1% slower
>> - povray about 1% faster
>> - omnetpp about 1.5% faster
>
>
> Yeah, that looks line with what I'd expect for this change.
>
>> The others are in the noise level.
>>
>> OK for trunk?
>
>
> This is ok if a bootstrap and test run on top of a recent compiler
> configured for --with-thumb
> and for armv8-a passes okay (to exercise the new code).
> Thanks for persevering with this!
>

Thanks for the prompt approval.
Committed after bootstrap+regtest on top of r252757.

I'll take a look at the remaining similar warnings.

Christophe

> Kyrill
>
>
>>
>> Thanks,
>>
>> Christophe
>>
>>
>>>> I'd like to get an explanation for these differences before committing
>>>> this patch. If they are just an effect of the more restrictive
>>>> constraints
>>>> then there may be not much we can do, but I'd like to make sure there's
>>>> not
>>>> anything else unexpected going on.
>>>>
>>> Thanks,
>>>
>>> Christophe
>>>
>>>>> The number of warnings (IT blocks containing 32-bit Thumb instructions
>>>>> are deprecated in ARMv8)
>>>>> was 712 without my patch and 122 with it. (using the hosts's binutils
>>>>> 2.24/ubuntu).
>>>>> I expected some warning, since as I said earlier other patterns need
>>>>> to be updated.
>>>>
>>>>
>>>> Understood. That's fine.
>>>>
>>>> Thanks,
>>>> Kyrill
>>>>
>>>>
>>>>> Christophe
>>>>>
>>>>>
>>>>>> Thanks for improving this area!
>>>>>> Kyrill
>>>>>>
>


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]