Bug 94077 - gcc.dg/gomp/pr82374.c fails on power 7
Summary: gcc.dg/gomp/pr82374.c fails on power 7
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: testsuite
Version: 8.4.1
Importance: P3 normal
Target Milestone: ---
Assignee: Kewen Lin
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-03-06 16:27 UTC by seurer
Modified: 2020-08-12 09:51 UTC (History)

See Also:
Host: powerpc64-linux-gnu
Target: powerpc64-linux-gnu
Build: powerpc64-linux-gnu
Known to work:
Known to fail:
Last reconfirmed: 2020-08-12 00:00:00


Description seurer 2020-03-06 16:27:53 UTC
This is an old one, but I am trying to clean up the failing tests on powerpc64 that have just been ignored.

This isn't really a regression as it looks like it has never worked right on power 7.  It works OK on other power hardware including other BE.  I am pretty sure it is just something not supported on power 7.

make -k check-gcc RUNTESTFLAGS=gomp.exp=gcc.dg/gomp/pr82374.c

# of expected passes		1
# of unexpected failures	1
FAIL: gcc.dg/gomp/pr82374.c scan-tree-dump-times vect "vectorized 1 loops" 2


g:5c38262d95bedc091083cc881d9e21cd1f215a9a, r8-3584

Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Oct 4 09:50:38 2017 +0200

    re PR tree-optimization/82374 (#pragma GCC optimize is not applied to openmp-generated functions)
    
            PR tree-optimization/82374
            * omp-low.c (create_omp_child_function): Copy DECL_ATTRIBUTES,
            DECL_FUNCTION_SPECIFIC_OPTIMIZATION,
            DECL_FUNCTION_SPECIFIC_TARGET and DECL_FUNCTION_VERSIONED from
            current_function_decl to the new decl.
    
            * gcc.dg/gomp/pr82374.c: New test.
    
    From-SVN: r253395
Comment 1 Kewen Lin 2020-08-12 06:11:47 UTC
This issue only exists on gcc8 and gcc9; it's gone with gcc10 and trunk.

The main difference is listed below:

With gcc8/gcc9, the cost model says vectorization is not profitable because of the high cost of realigning vector loads/stores in the vectorized loop body:

gcc/testsuite/gcc.dg/gomp/pr82374.c:27:3: note: Cost model analysis:
  Vector inside of loop cost: 32
  Vector prologue cost: 6
  Vector epilogue cost: 0
  Scalar iteration cost: 4
  Scalar outside cost: 0
  Vector outside cost: 6
  prologue iterations: 0
  epilogue iterations: 0
gcc/testsuite/gcc.dg/gomp/pr82374.c:27:3: note: cost model: the vector iteration cost = 32 divided by the scalar iteration cost = 4 is greater or equal to the vectorization factor = 4.
gcc/testsuite/gcc.dg/gomp/pr82374.c:27:3: note: not vectorized: vectorization not profitable.
gcc/testsuite/gcc.dg/gomp/pr82374.c:27:3: note: not vectorized: vector version will never be profitable.


With gcc10 and trunk, the cost model output looks like:

gcc/testsuite/gcc.dg/gomp/pr82374.c:27:3: note:  Cost model analysis:
  Vector inside of loop cost: 6
  Vector prologue cost: 0
  Vector epilogue cost: 0
  Scalar iteration cost: 6
  Scalar outside cost: 0
  Vector outside cost: 0
  prologue iterations: 0
  epilogue iterations: 0
  Calculated minimum iters for profitability: 0
gcc/testsuite/gcc.dg/gomp/pr82374.c:27:3: note:    Runtime profitability threshold = 4
gcc/testsuite/gcc.dg/gomp/pr82374.c:27:3: note:    Static estimate profitability threshold = 4
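
The two dumps above come down to one comparison, which the gcc8/gcc9 message spells out: the vector iteration cost divided by the scalar iteration cost is compared against the vectorization factor. A minimal sketch of that comparison follows. The function and parameter names are hypothetical and this is not the vectorizer's actual code; it only restates the arithmetic in the dump.

```c
#include <stdbool.h>

/* Simplified restatement of the check described by the dump message.
   Names are hypothetical; this is not GCC's implementation.  */
bool
vector_never_profitable (int vec_inside_cost, int scalar_iter_cost, int vf)
{
  /* One vector iteration replaces VF scalar iterations, so it has to
     cost less than VF scalar iterations to ever pay off.  */
  return vec_inside_cost >= scalar_iter_cost * vf;
}
```

With the gcc8/gcc9 numbers (32, 4, 4) the function returns true, since 32 >= 4 * 4. With the gcc10 numbers (6, 6), and assuming the vectorization factor is still 4 (the gcc10 dump does not print it), it returns false, since 6 < 6 * 4.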

By tracing back, I noticed the difference comes from:

gcc8/gcc9:
  can't force alignment of ref: a[i_12]
  
gcc10/trunk:
  force alignment of a[i_12]
  
I guess it's not a good idea to backport a patch to get the alignment forced (probably risky?). Instead, I think we can either append -mefficient-unaligned-vsx alongside -mvsx to ensure unaligned vector load/store can be used, or change the target requirement to powerpc_vsx_ok && vect_hw_misalign. Both serve the test's original purpose.

Hi @Jakub, what do you think of this?
Comment 2 Kewen Lin 2020-08-12 07:09:26 UTC
To be more specific, whether the alignment can be forced depends on the default setting of -fcommon. Since GCC 10, -fno-common is the default, which makes decl_binds_to_current_def_p return true.

I can observe this case fail with an explicit -fcommon.
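
For readers unfamiliar with the option, here is a small sketch of the kind of declaration that is affected. The array name, size, and function are hypothetical and not taken from pr82374.c. The only point is that a global defined without an initializer is a tentative definition, and -fcommon/-fno-common decides how it is emitted.

```c
#define N 1024

/* Tentative definition: no initializer and no "extern".
   With -fcommon this becomes a common symbol that the linker may
   merge with a definition from another object file, so the compiler
   cannot assume its own definition (and alignment) is the final one.
   With -fno-common it is allocated directly in this object file.  */
float a[N];

/* Hypothetical loop over the global; a candidate for vectorization.  */
void
scale (float k)
{
  for (int i = 0; i < N; i++)
    a[i] = a[i] * k;
}
```

Compiling such a file with -S, once with -fcommon and once with -fno-common, and comparing how the array is emitted in the assembly is one way to see the difference; the exact directives depend on the target.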
Comment 3 Kewen Lin 2020-08-12 07:10:12 UTC
> I can observe this case fail with an explicit -fcommon.

I mean even with gcc10 or trunk.
Comment 4 Jakub Jelinek 2020-08-12 07:19:27 UTC
So add -fcommon to the gcc8/gcc9 version then?
What the test wants to test is whether the optimize attribute is propagated properly...
Comment 5 Jakub Jelinek 2020-08-12 07:21:23 UTC
I mean -fno-common, sorry.
Comment 6 Kewen Lin 2020-08-12 07:59:23 UTC
(In reply to Jakub Jelinek from comment #5)
> I mean -fno-common, sorry.

Good idea, that works!  I'll send a patch adding -fno-common to dg-options.  Thanks for your suggestion!
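
As an illustration of where the option goes, below is a hypothetical reduced test in the DejaGnu style. It is not the real pr82374.c: the option list, array names, and scan count are assumptions, and the OpenMP pragma that the real test exercises is left out to keep the sketch short.

```c
/* Hypothetical reduced test; not copied from the real pr82374.c.  */
/* { dg-do compile } */
/* { dg-options "-O2 -fno-tree-vectorize -fno-common -fdump-tree-vect-details" } */

#define N 1024

/* Tentative definitions; -fno-common keeps them out of common storage.  */
float a[N], b[N], c[N];

/* The optimize attribute re-enables vectorization for this function
   only, which is the behaviour such a test would check.  */
__attribute__((optimize ("O2", "tree-vectorize"))) void
foo (void)
{
  for (int i = 0; i < N; i++)
    a[i] = b[i] + c[i];
}

/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
```

Putting -fno-common in dg-options leaves the test's intent unchanged (checking that the optimize setting is propagated) while removing the dependence on the branch's -fcommon default.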
Comment 7 GCC Commits 2020-08-12 09:42:45 UTC
The releases/gcc-8 branch has been updated by Kewen Lin <linkw@gcc.gnu.org>:

https://gcc.gnu.org/g:6786b369ab2851b25e8fd2aae33d3b1bf20de132

commit r8-10403-g6786b369ab2851b25e8fd2aae33d3b1bf20de132
Author: Kewen Lin <linkw@linux.ibm.com>
Date:   Wed Aug 12 04:19:16 2020 -0500

    testsuite: Add -fno-common to pr82374.c [PR94077]
    
    As the PR comments show, the case gcc.dg/gomp/pr82374.c fails on
    Power7 since gcc8.  But it passes from gcc10.  By looking into
    the difference, it's due to that gcc10 sets -fno-common as default,
    which makes vectorizer force the alignment and be able to use
    aligned vector load/store on those targets which doesn't support
    unaligned vector load/store (here it's Power7).
    
    As Jakub suggested in the PR, this patch is to append -fno-common
    into dg-options.
    
    Verified with gcc8/gcc9 releases on ppc64-redhat-linux (Power7).
    
    gcc/testsuite/ChangeLog:
    
            PR testsuite/94077
            * gcc.dg/gomp/pr82374.c: Add option -fno-common.
Comment 8 GCC Commits 2020-08-12 09:44:13 UTC
The releases/gcc-9 branch has been updated by Kewen Lin <linkw@gcc.gnu.org>:

https://gcc.gnu.org/g:ffb32ba2fb79d90be4be9a59ef0336d3404ff538

commit r9-8806-gffb32ba2fb79d90be4be9a59ef0336d3404ff538
Author: Kewen Lin <linkw@linux.ibm.com>
Date:   Wed Aug 12 04:19:16 2020 -0500

    testsuite: Add -fno-common to pr82374.c [PR94077]
    
    As the PR comments show, the case gcc.dg/gomp/pr82374.c fails on
    Power7 since gcc8.  But it passes from gcc10.  By looking into
    the difference, it's due to that gcc10 sets -fno-common as default,
    which makes vectorizer force the alignment and be able to use
    aligned vector load/store on those targets which doesn't support
    unaligned vector load/store (here it's Power7).
    
    As Jakub suggested in the PR, this patch is to append -fno-common
    into dg-options.
    
    Verified with gcc8/gcc9 releases on ppc64-redhat-linux (Power7).
    
    gcc/testsuite/ChangeLog:
    
            PR testsuite/94077
            * gcc.dg/gomp/pr82374.c: Add option -fno-common.
Comment 9 Kewen Lin 2020-08-12 09:51:26 UTC
Should be fixed now.