On 12 October 2016 at 11:14, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote:
On 12/10/16 09:59, Christophe Lyon wrote:
Hi Kyrill,
On 7 October 2016 at 17:00, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com>
wrote:
Hi Christophe,
On 07/09/16 21:05, Christophe Lyon wrote:
Hi,
The attached patch is a first part to solve PR 67591: it removes
several occurrences of "IT blocks containing 32-bit Thumb
instructions are deprecated in ARMv8" messages in the
gcc/g++/libstdc++/fortran testsuites.
It does not remove them all yet. This patch only modifies the
*cmp_and, *cmp_ior, *ior_scc_scc, *ior_scc_scc_cmp,
*and_scc_scc and *and_scc_scc_cmp patterns.
Additional work is required in sub_shiftsi etc, at least.
I've started looking at these, but I decided I could already
post this self-contained patch to check if this implementation
is OK.
Regarding *cmp_and and *cmp_ior patterns, the addition of the
enabled_for_depr_it attribute is aggressive in the sense that it keeps
only the alternatives with 'l' and 'Py' constraints, while in some
cases the constraints could be relaxed. Indeed, these 2 patterns can
swap their input comparisons, meaning that any of them can be emitted
in the IT-block, and is thus subject to the ARMv8 deprecation.
The generated code is possibly suboptimal in the cases where the
operands are not swapped, since 'r' could be used.
Cross-tested on arm-none-linux-gnueabihf with -mthumb/-march=armv8-a
and --with-cpu=cortex-a57 --with-mode=thumb, showing only improvements:
http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/239850-depr-it-4/report-build-info.html
Bootstrapped OK on armv8l HW.
Is this OK?
Thanks,
Christophe
(define_insn_and_split "*ior_scc_scc"
- [(set (match_operand:SI 0 "s_register_operand" "=Ts")
+ [(set (match_operand:SI 0 "s_register_operand" "=Ts,Ts")
(ior:SI (match_operator:SI 3 "arm_comparison_operator"
- [(match_operand:SI 1 "s_register_operand" "r")
- (match_operand:SI 2 "arm_add_operand" "rIL")])
+ [(match_operand:SI 1 "s_register_operand" "r,l")
+ (match_operand:SI 2 "arm_add_operand" "rIL,lPy")])
(match_operator:SI 6 "arm_comparison_operator"
- [(match_operand:SI 4 "s_register_operand" "r")
- (match_operand:SI 5 "arm_add_operand" "rIL")])))
+ [(match_operand:SI 4 "s_register_operand" "r,l")
+ (match_operand:SI 5 "arm_add_operand" "rIL,lPy")])))
Can you please put the more restrictive alternatives (lPy) first?
Same with the other patterns your patch touches.
Ok with that change if a rebased testing run is ok.
Sorry for the delay in reviewing.
OK, I will update my patch accordingly.
However, when I discussed this with Ramana during the Cauldron,
he requested benchmark results. So far, I was able to run spec2006
on an APM machine, and I'm seeing performance changes in the
range 11% improvement (465.tonto) to 7% degradation (433.milc).
Would that be acceptable?
Those sound like quite large swings.
Indeed, but most are in the -1%-+1% range.
Are you sure the machine was not running anything else at the time
or playing tricks with frequency scaling?
No, I had no such guarantee. I used this machine temporarily,
first to check that bootstrap worked. I planed to use another
board with an A57 "standard" microarch for proper
benchmarking, but I'm not sure when I'll have access to it
(wrt to e/o gcc stage1), that's why I reported these early
figures.
Did all iterations of SPEC show a consistent difference?
If the changes are consistent, could you have a look at the codegen
to see if there are any clues to the differences?
I will update my patch according to your comment, re-run the bench
and have a deeper look at the codegen differences.