97075 – [11 regression] powerpc64 vector tests fails after r11-3230

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	tree-optimization
Version:	11.0

Importance:	P3 normal
Target Milestone:	11.0
Assignee:	Kewen Lin

Reported:	2020-09-16 18:15 UTC by seurer
Modified:	2020-09-24 05:49 UTC
CC List:	5 users

Host:	powerpc64*-linux-gnu
Target:	powerpc64*-linux-gnu
Build:	powerpc64*-linux-gnu
Last reconfirmed:	2020-09-17 00:00:00

Description seurer 2020-09-16 18:15:15 UTC

g:052204fac580b21c967e57e6285d99a9828b8fac, r11-3230

FAIL: gcc.target/powerpc/p9-vec-length-epil-7.c scan-assembler-times \\mstxvl\\M 10
FAIL: gcc.target/powerpc/p9-vec-length-full-6.c scan-assembler-not \\mlxvx\\M
FAIL: gcc.target/powerpc/p9-vec-length-full-6.c scan-assembler-not \\mstxvx\\M
FAIL: gcc.target/powerpc/p9-vec-length-full-6.c scan-assembler-times \\mlxvl\\M 16
FAIL: gcc.target/powerpc/p9-vec-length-full-6.c scan-assembler-times \\mstxvl\\M 16

For p9-vec-length-epil-7.c, here is a diff of the assembler output between r11-3229 and r11-3230:

seurer@makalu-lp1:~/gcc/git/build/gcc-test$ diff p9-vec-length-epil-7.s.r11-3229 p9-vec-length-epil-7.s.r11-3230
322,323c322,323
< 	li 6,28
< 	addis 7,2,.LC8@toc@ha
---
> 	li 7,28
> 	addis 8,2,.LC8@toc@ha
325d324
< 	addis 10,2,.LANCHOR0@toc@ha
327,330c326,328
< 	li 8,0
< 	mtctr 6
< 	addi 7,7,.LC8@toc@l
< 	addi 10,10,.LANCHOR0@toc@l
---
> 	li 10,0
> 	mtctr 7
> 	addi 8,8,.LC8@toc@l
333,335c331,333
< 	lxv 32,0(7)
< 	addis 7,2,.LANCHOR0+904@toc@ha
< 	std 8,.LANCHOR0+904@toc@l(7)
---
> 	lxv 32,0(8)
> 	addis 8,2,.LANCHOR0+904@toc@ha
> 	std 10,.LANCHOR0+904@toc@l(8)
343,349c341,343
< 	addis 7,2,.LC9@toc@ha
< 	li 8,8
< 	addi 9,10,1360
< 	addi 7,7,.LC9@toc@l
< 	sldi 10,8,56
< 	lxv 0,0(7)
< 	stxvl 0,9,10
---
> 	li 9,57
> 	addis 10,2,.LANCHOR0+1360@toc@ha
> 	std 9,.LANCHOR0+1360@toc@l(10)
368,369c362,363
< 	li 6,28
< 	addis 7,2,.LC8@toc@ha
---
> 	li 7,28
> 	addis 8,2,.LC8@toc@ha
371d364
< 	addis 10,2,.LANCHOR0@toc@ha
373,376c366,368
< 	li 8,0
< 	mtctr 6
< 	addi 7,7,.LC8@toc@l
< 	addi 10,10,.LANCHOR0@toc@l
---
> 	li 10,0
> 	mtctr 7
> 	addi 8,8,.LC8@toc@l
379,381c371,373
< 	lxv 32,0(7)
< 	addis 7,2,.LANCHOR0+1416@toc@ha
< 	std 8,.LANCHOR0+1416@toc@l(7)
---
> 	lxv 32,0(8)
> 	addis 8,2,.LANCHOR0+1416@toc@ha
> 	std 10,.LANCHOR0+1416@toc@l(8)
389,395c381,383
< 	addis 7,2,.LC9@toc@ha
< 	li 8,8
< 	addi 9,10,1872
< 	addi 7,7,.LC9@toc@l
< 	sldi 10,8,56
< 	lxv 0,0(7)
< 	stxvl 0,9,10
---
> 	li 9,57
> 	addis 10,2,.LANCHOR0+1872@toc@ha
> 	std 9,.LANCHOR0+1872@toc@l(10)
414,415c402,403
< 	addis 6,2,.LC10@toc@ha
< 	addis 7,2,.LC11@toc@ha
---
> 	addis 6,2,.LC9@toc@ha
> 	addis 7,2,.LC10@toc@ha
421,422c409,410
< 	addi 6,6,.LC10@toc@l
< 	addi 7,7,.LC11@toc@l
---
> 	addi 6,6,.LC9@toc@l
> 	addi 7,7,.LC10@toc@l
441c429
< 	addis 7,2,.LC12@toc@ha
---
> 	addis 7,2,.LC11@toc@ha
444c432
< 	addi 7,7,.LC12@toc@l
---
> 	addi 7,7,.LC11@toc@l
466,469c454,456
< 	addis 7,2,.LC13@toc@ha
< 	li 6,28
< 	addis 8,2,.LC14@toc@ha
< 	addis 10,2,.LANCHOR0@toc@ha
---
> 	addis 8,2,.LC12@toc@ha
> 	li 7,28
> 	addis 10,2,.LC13@toc@ha
472,475c459,461
< 	addi 7,7,.LC13@toc@l
< 	mtctr 6
< 	addi 8,8,.LC14@toc@l
< 	addi 10,10,.LANCHOR0@toc@l
---
> 	addi 8,8,.LC12@toc@l
> 	mtctr 7
> 	addi 10,10,.LC13@toc@l
477,480c463,466
< 	lxv 0,0(7)
< 	lxv 11,0(8)
< 	addis 8,2,.LANCHOR0+2184@toc@ha
< 	stfd 12,.LANCHOR0+2184@toc@l(8)
---
> 	lxv 0,0(8)
> 	lxv 11,0(10)
> 	addis 10,2,.LANCHOR0+2184@toc@ha
> 	stfd 12,.LANCHOR0+2184@toc@l(10)
488,494c474,477
< 	addis 7,2,.LC15@toc@ha
< 	li 8,8
< 	addi 9,10,2640
< 	addi 7,7,.LC15@toc@l
< 	sldi 10,8,56
< 	lxv 0,0(7)
< 	stxvl 0,9,10
---
> 	addis 9,2,.LC14@toc@ha
> 	lfd 0,.LC14@toc@l(9)
> 	addis 9,2,.LANCHOR0+2640@toc@ha
> 	stfd 0,.LANCHOR0+2640@toc@l(9)

Comment 1 Kewen Lin 2020-09-17 02:28:49 UTC

I'll take a look at this.

Comment 2 akrl 2020-09-17 06:42:15 UTC

Thanks Kewen, unfortunately I've no Power setup.  Sorry for the inconvenience.

Comment 3 Kewen Lin 2020-09-17 09:31:02 UTC

(In reply to akrl from comment #2)
> Thanks Kewen, unfortunately I've no Power setup.  Sorry for the
> inconvenience.

My pleasure! If you are interested in running on Power machines, you can apply for access to the Power8/Power9 machines in the CFarm machine pool: https://cfarm.tetaneutral.net/machines/list/.

Comment 4 Kewen Lin 2020-09-17 09:41:17 UTC

> gcc.target/powerpc/p9-vec-length-full-6.c

This is a test case issue: with Andrea's improvement, the 64bit/32bit pairs now use full vectors instead of partial vectors.

> gcc.target/powerpc/p9-vec-length-epil-7.c

This exposed a problem. vect_need_peeling_or_partial_vectors_p is called from vect_analyze_loop_2 during the analysis stage. If the loop is an epilogue loop, its loop_vinfo has not been fixed up yet at that point (LOOP_VINFO_INT_NITERS, for example), so the function can give the wrong answer. For some of the 64bit type functions in this failing case, it returns false for the epilogue loops even though the remaining iterations cannot fill a full vector.

One simple fix is to exclude epilogue loops from this check.

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index ab627fbf029..7273e998a99 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -2278,7 +2278,8 @@ start_over:
     {
       /* Don't use partial vectors if we don't need to peel the loop.  */
       if (param_vect_partial_vector_usage == 0
-          || !vect_need_peeling_or_partial_vectors_p (loop_vinfo))
+          || (!LOOP_VINFO_EPILOGUE_P (loop_vinfo)
+              && !vect_need_peeling_or_partial_vectors_p (loop_vinfo)))
         LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo) = false;
       else if (vect_verify_full_masking (loop_vinfo)
                || vect_verify_loop_lens (loop_vinfo))

Testing is ongoing.
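
For readers without the GCC sources at hand, the effect of the changed condition can be sketched in standalone C. This is only an illustrative model, not the vectorizer code: `struct loop_info`, `need_peeling_or_partial_vectors`, `use_partial_vectors_old` and `use_partial_vectors_new` are hypothetical names, the real predicate checks more than a simple remainder, and the real code goes on to call vect_verify_full_masking / vect_verify_loop_lens instead of returning true directly.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for GCC's loop_vec_info.  */
struct loop_info
{
  bool is_epilogue;     /* models LOOP_VINFO_EPILOGUE_P */
  bool niters_known;    /* iteration count is a compile-time constant */
  unsigned long niters; /* models LOOP_VINFO_INT_NITERS; for an epilogue
                           loop this is still stale during analysis */
  unsigned vf;          /* vectorization factor */
};

/* Simplified model of vect_need_peeling_or_partial_vectors_p (assumption:
   only the "known niters is a multiple of VF" case is modelled).  */
bool
need_peeling_or_partial_vectors (const struct loop_info *l)
{
  assert (l->vf != 0);
  return !l->niters_known || l->niters % l->vf != 0;
}

/* Condition before the proposed change: the predicate is trusted for
   every loop, including epilogue loops whose niters is stale.  */
bool
use_partial_vectors_old (const struct loop_info *l, int partial_vector_usage)
{
  if (partial_vector_usage == 0 || !need_peeling_or_partial_vectors (l))
    return false;
  return true;
}

/* Condition after the proposed change: the predicate is only consulted
   for non-epilogue loops.  */
bool
use_partial_vectors_new (const struct loop_info *l, int partial_vector_usage)
{
  if (partial_vector_usage == 0
      || (!l->is_epilogue && !need_peeling_or_partial_vectors (l)))
    return false;
  return true;
}
```

With made-up numbers, an epilogue loop whose stale count is 8 and whose vf is 2 looks evenly divisible, so the old condition turns partial vectors off for it, while the new condition keeps them available. A non-epilogue loop with the same numbers is treated identically by both versions.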

Comment 5 GCC Commits 2020-09-24 05:47:19 UTC

The master branch has been updated by Kewen Lin <linkw@gcc.gnu.org>:

https://gcc.gnu.org/g:5427bd4d57c0376e51fc7b256e76aa46c43aa8cf

commit r11-3422-g5427bd4d57c0376e51fc7b256e76aa46c43aa8cf
Author: Kewen Lin <linkw@linux.ibm.com>
Date:   Thu Sep 24 00:40:47 2020 -0500

    test: Adjust case p9-vec-length-full-6.c [PR97075]
    
    The commit r11-3230 brings a nice improvement to use full
    vectors instead of partial vectors when available.  This
    patch is to fix the test failures on p9-vec-length-full-6.c,
    where 64bit/32bit pairs are able to use full vector instead.
    
    Bootstrapped/regtested on powerpc64le-linux-gnu P9.
    
    gcc/testsuite/ChangeLog:
    
            PR tree-optimization/97075
            * gcc.target/powerpc/p9-vec-length-full-6.c: Adjust.

Comment 6 Kewen Lin 2020-09-24 05:49:47 UTC

Richard's rework r11-3393 has taken care of the failure on gcc.target/powerpc/p9-vec-length-epil-7.c.  All failures should be gone on latest trunk.