This is an old one but I am trying to clean up the failing tests on powerpc 64 that have just been ignored. This isn't really a regression as it looks like it has never worked right on power 7. It works OK on other power hardware including other BE. I am pretty sure it is just something not supported on power 7.

make -k check-gcc RUNTESTFLAGS=gomp.exp=gcc.dg/gomp/pr82374.c

# of expected passes 1
# of unexpected failures 1

FAIL: gcc.dg/gomp/pr82374.c scan-tree-dump-times vect "vectorized 1 loops" 2

g:5c38262d95bedc091083cc881d9e21cd1f215a9a, r8-3584
Author: Jakub Jelinek <jakub@redhat.com>
Date: Wed Oct 4 09:50:38 2017 +0200

    re PR tree-optimization/82374 (#pragma GCC optimize is not applied to openmp-generated functions)

    PR tree-optimization/82374
    * omp-low.c (create_omp_child_function): Copy DECL_ATTRIBUTES,
    DECL_FUNCTION_SPECIFIC_OPTIMIZATION, DECL_FUNCTION_SPECIFIC_TARGET
    and DECL_FUNCTION_VERSIONED from current_function_decl to the new decl.

    * gcc.dg/gomp/pr82374.c: New test.

    From-SVN: r253395
This issue only exists on gcc8 and gcc9; it's gone with gcc10 and trunk. The main difference is listed below.

With gcc8/gcc9, the cost model says vectorization is not profitable because of the high cost of realigning vector loads/stores in the vectorized loop body:

gcc/testsuite/gcc.dg/gomp/pr82374.c:27:3: note: Cost model analysis:
  Vector inside of loop cost: 32
  Vector prologue cost: 6
  Vector epilogue cost: 0
  Scalar iteration cost: 4
  Scalar outside cost: 0
  Vector outside cost: 6
  prologue iterations: 0
  epilogue iterations: 0
gcc/testsuite/gcc.dg/gomp/pr82374.c:27:3: note: cost model: the vector iteration cost = 32 divided by the scalar iteration cost = 4 is greater or equal to the vectorization factor = 4.
gcc/testsuite/gcc.dg/gomp/pr82374.c:27:3: note: not vectorized: vectorization not profitable.
gcc/testsuite/gcc.dg/gomp/pr82374.c:27:3: note: not vectorized: vector version will never be profitable.

With gcc10 and trunk, the information looks like:

gcc/testsuite/gcc.dg/gomp/pr82374.c:27:3: note: Cost model analysis:
  Vector inside of loop cost: 6
  Vector prologue cost: 0
  Vector epilogue cost: 0
  Scalar iteration cost: 6
  Scalar outside cost: 0
  Vector outside cost: 0
  prologue iterations: 0
  epilogue iterations: 0
  Calculated minimum iters for profitability: 0
gcc/testsuite/gcc.dg/gomp/pr82374.c:27:3: note: Runtime profitability threshold = 4
gcc/testsuite/gcc.dg/gomp/pr82374.c:27:3: note: Static estimate profitability threshold = 4

Tracing back, I noticed the difference comes from:

  gcc8/gcc9: can't force alignment of ref: a[i_12]
  gcc10/trunk: force alignment of a[i_12]

I guess it's not a good idea to backport a patch to get the alignment forced (probably risky?). Instead, I think we can either append -mefficient-unaligned-vsx together with -mvsx to ensure we can use unaligned vector load/store, or change the target requirement to powerpc_vsx_ok && vect_hw_misalign. Both meet the original purpose of the test.

Hi @Jakub, what do you think of this?
To be more specific, the reason the alignment can be forced is the default setting of -fcommon. Since GCC 10, -fno-common is the default, which makes decl_binds_to_current_def_p return true. I can observe this case fail with an explicit -fcommon.
> I can observe this case fail if with explicit -fcommon.

I mean even with gcc10 or trunk.
So add -fcommon to the gcc8/gcc9 version then? What the test wants to test is whether the optimize attribute is propagated properly...
I mean -fno-common, sorry.
(In reply to Jakub Jelinek from comment #5)
> I mean -fno-common, sorry.

Good idea, that works! I'll send a patch adding -fno-common to dg-options. Thanks for your suggestion!
The releases/gcc-8 branch has been updated by Kewen Lin <linkw@gcc.gnu.org>:

https://gcc.gnu.org/g:6786b369ab2851b25e8fd2aae33d3b1bf20de132

commit r8-10403-g6786b369ab2851b25e8fd2aae33d3b1bf20de132
Author: Kewen Lin <linkw@linux.ibm.com>
Date: Wed Aug 12 04:19:16 2020 -0500

    testsuite: Add -fno-common to pr82374.c [PR94077]

    As the PR comments show, the case gcc.dg/gomp/pr82374.c fails on
    Power7 since gcc8. But it passes from gcc10. By looking into the
    difference, it's due to that gcc10 sets -fno-common as default,
    which makes vectorizer force the alignment and be able to use
    aligned vector load/store on those targets which doesn't support
    unaligned vector load/store (here it's Power7).

    As Jakub suggested in the PR, this patch is to append -fno-common
    into dg-options.

    Verified with gcc8/gcc9 releases on ppc64-redhat-linux (Power7).

    gcc/testsuite/ChangeLog:

    PR testsuite/94077
    * gcc.dg/gomp/pr82374.c: Add option -fno-common.
The releases/gcc-9 branch has been updated by Kewen Lin <linkw@gcc.gnu.org>:

https://gcc.gnu.org/g:ffb32ba2fb79d90be4be9a59ef0336d3404ff538

commit r9-8806-gffb32ba2fb79d90be4be9a59ef0336d3404ff538
Author: Kewen Lin <linkw@linux.ibm.com>
Date: Wed Aug 12 04:19:16 2020 -0500

    testsuite: Add -fno-common to pr82374.c [PR94077]

    As the PR comments show, the case gcc.dg/gomp/pr82374.c fails on
    Power7 since gcc8. But it passes from gcc10. By looking into the
    difference, it's due to that gcc10 sets -fno-common as default,
    which makes vectorizer force the alignment and be able to use
    aligned vector load/store on those targets which doesn't support
    unaligned vector load/store (here it's Power7).

    As Jakub suggested in the PR, this patch is to append -fno-common
    into dg-options.

    Verified with gcc8/gcc9 releases on ppc64-redhat-linux (Power7).

    gcc/testsuite/ChangeLog:

    PR testsuite/94077
    * gcc.dg/gomp/pr82374.c: Add option -fno-common.
Should be fixed now.