Bug 94707 - [8/9 Regression] class with empty base passed incorrectly with -std=c++17 on powerpc64le
Summary: [8/9 Regression] class with empty base passed incorrectly with -std=c++17 on ...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 10.0
: P1 blocker
Target Milestone: 10.0
Assignee: Jakub Jelinek
URL:
Keywords: ABI, wrong-code
: 94720 (view as bug list)
Depends on: 94383
Blocks:
  Show dependency treegraph
 
Reported: 2020-04-22 07:14 UTC by Jonathan Wakely
Modified: 2020-04-29 07:06 UTC (History)
6 users (show)

See Also:
Host:
Target: powerpc64le-*-linux
Build:
Known to work:
Known to fail:
Last reconfirmed: 2020-04-22 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Jonathan Wakely 2020-04-22 07:14:43 UTC
+++ This bug was initially created as a clone of Bug #94383 +++

I see these failures for Jakub's new test for PR 94383:

fail 50.30
fail 116.55
fail 116.56
fail 116.74
fail 116.74
fail 116.74
fail 116.74
fail 116.72
fail 116.74
fail 116.74
fail 120.10
fail 120.11
fail 120.17
fail 120.55
fail 120.56
fail 120.18
fail 124.30
fail 127.30
fail 140.30
fail 187.10
fail 187.11
fail 187.12
fail 187.17
fail 187.50
fail 187.55
fail 187.56
fail 187.18
FAIL: tmpdir-g++.dg-struct-layout-1/t032 cp_compat_x_tst.o-cp_compat_y_tst.o execute


fail 2613.30
fail 2638.30
FAIL: tmpdir-g++.dg-struct-layout-1/t058 cp_compat_x_tst.o-cp_compat_y_tst.o execute


fail 2854.30
FAIL: tmpdir-g++.dg-struct-layout-1/t059 cp_compat_x_tst.o-cp_compat_y_tst.o execute

I've only checked powerpc64le but it's probably wrong on powerpc64 too.
Comment 1 Jakub Jelinek 2020-04-22 08:53:55 UTC
E.g. the 120.* failures can be seen on
struct empty_base {};
struct S : public empty_base { float a[5]; };
S s;
int i;

void foo (S, int *);

int
bar ()
{
  foo (s, &i);
  return 0;
}
compile with -O2 -std=c++14 and then with -O2 -std=c++17 and see it generating different code.
Comment 2 Jakub Jelinek 2020-04-22 08:59:29 UTC
With -std=c++14, the structure is passed in floating point registers 1, 2, 3, 4, 5 (SFmode each elt), with -std=c++17 the structure is passed in gprs 3, 4, 5 (where the first two contain 64-bits each, the last one contains 32 bits zero extended to 64 bits).
Comment 3 Jakub Jelinek 2020-04-22 09:13:07 UTC
--- gcc/config/rs6000/rs6000-call.c.jj	2020-03-30 22:53:40.746640328 +0200
+++ gcc/config/rs6000/rs6000-call.c	2020-04-22 11:07:05.507723606 +0200
@@ -5636,6 +5636,16 @@ rs6000_aggregate_candidate (const_tree t
 	    if (TREE_CODE (field) != FIELD_DECL)
 	      continue;
 
+	    /* Ignore C++17 empty base fields, while their type indicates
+	       they do contain padding, they have zero size and thus don't
+	       contain any padding.  */
+	    if (DECL_ARTIFICIAL (field)
+		&& DECL_NAME (field) == NULL_TREE
+		&& RECORD_OR_UNION_TYPE_P (TREE_TYPE (field))
+		&& DECL_SIZE (field)
+		&& integer_zerop (DECL_SIZE (field)))
+	      continue;
+
 	    sub_count = rs6000_aggregate_candidate (TREE_TYPE (field), modep);
 	    if (sub_count < 0)
 	      return -1;

fixes the ABI difference, but more work will be needed to actually do the -Wpsabi diagnostics, bet similarly to the aarch64 patch it needs to tell the callers that the empty base field has been seen and ignored; now if the type turns out to be a homogenous aggregate and whatever calls that function would make a different decision based on whether it is homogenous or not, and the c++17 empty base has been seen in it, then it should perform -Wpsabi diagnostics.
Comment 4 Jakub Jelinek 2020-04-22 10:12:12 UTC
Updated incomplete patch on top of https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544276.html

I've handled one rs6000_discover_homogeneous_aggregate caller where it wasn't that hard to figure out
how to report different decisions based on if GCC 7/8/9 with -std=c++17 would make the aggregate
non-homogeneous because of the C++17 empty base artificial FIELD_DECL and we'd return true because of that, while
in -std=c++14 and in patched GCC trunk we'd return false.
But I'm getting lost in all the other spots, figuring out if we made different decisions is harder.

--- gcc/config/rs6000/rs6000-internal.h.jj	2020-03-20 09:11:36.229903622 +0100
+++ gcc/config/rs6000/rs6000-internal.h	2020-04-22 11:31:23.943522525 +0200
@@ -129,7 +129,8 @@ extern int rs6000_darwin64_struct_check_
 extern bool rs6000_discover_homogeneous_aggregate (machine_mode mode,
 						   const_tree type,
 						   machine_mode *elt_mode,
-						   int *n_elts);
+						   int *n_elts,
+						   bool *cxx17_empty_base_seen);
 extern void rs6000_output_mi_thunk (FILE *file,
 				    tree thunk_fndecl ATTRIBUTE_UNUSED,
 				    HOST_WIDE_INT delta,
--- gcc/config/rs6000/rs6000-call.c.jj	2020-03-30 22:53:40.746640328 +0200
+++ gcc/config/rs6000/rs6000-call.c	2020-04-22 12:06:00.066843119 +0200
@@ -5528,7 +5528,8 @@ const struct altivec_builtin_types altiv
    sub-tree.  */
 
 static int
-rs6000_aggregate_candidate (const_tree type, machine_mode *modep)
+rs6000_aggregate_candidate (const_tree type, machine_mode *modep,
+			    bool *cxx17_empty_base_seen)
 {
   machine_mode mode;
   HOST_WIDE_INT size;
@@ -5598,7 +5599,8 @@ rs6000_aggregate_candidate (const_tree t
 	    || TREE_CODE (TYPE_SIZE (type)) != INTEGER_CST)
 	  return -1;
 
-	count = rs6000_aggregate_candidate (TREE_TYPE (type), modep);
+	count = rs6000_aggregate_candidate (TREE_TYPE (type), modep,
+					    cxx17_empty_base_seen);
 	if (count == -1
 	    || !index
 	    || !TYPE_MAX_VALUE (index)
@@ -5636,7 +5638,15 @@ rs6000_aggregate_candidate (const_tree t
 	    if (TREE_CODE (field) != FIELD_DECL)
 	      continue;
 
-	    sub_count = rs6000_aggregate_candidate (TREE_TYPE (field), modep);
+	    if (cxx17_empty_base_field_p (field))
+	      {
+		if (cxx17_empty_base_seen)
+		  *cxx17_empty_base_seen = true;
+		continue;
+	      }
+
+	    sub_count = rs6000_aggregate_candidate (TREE_TYPE (field), modep,
+						    cxx17_empty_base_seen);
 	    if (sub_count < 0)
 	      return -1;
 	    count += sub_count;
@@ -5669,7 +5679,8 @@ rs6000_aggregate_candidate (const_tree t
 	    if (TREE_CODE (field) != FIELD_DECL)
 	      continue;
 
-	    sub_count = rs6000_aggregate_candidate (TREE_TYPE (field), modep);
+	    sub_count = rs6000_aggregate_candidate (TREE_TYPE (field), modep,
+						    cxx17_empty_base_seen);
 	    if (sub_count < 0)
 	      return -1;
 	    count = count > sub_count ? count : sub_count;
@@ -5700,7 +5711,8 @@ rs6000_aggregate_candidate (const_tree t
 bool
 rs6000_discover_homogeneous_aggregate (machine_mode mode, const_tree type,
 				       machine_mode *elt_mode,
-				       int *n_elts)
+				       int *n_elts,
+				       bool *cxx17_empty_base_seen)
 {
   /* Note that we do not accept complex types at the top level as
      homogeneous aggregates; these types are handled via the
@@ -5710,7 +5722,8 @@ rs6000_discover_homogeneous_aggregate (m
       && AGGREGATE_TYPE_P (type))
     {
       machine_mode field_mode = VOIDmode;
-      int field_count = rs6000_aggregate_candidate (type, &field_mode);
+      int field_count = rs6000_aggregate_candidate (type, &field_mode,
+						    cxx17_empty_base_seen);
 
       if (field_count > 0)
 	{
@@ -5734,6 +5747,8 @@ rs6000_discover_homogeneous_aggregate (m
     *elt_mode = mode;
   if (n_elts)
     *n_elts = 1;
+  if (cxx17_empty_base_seen)
+    *cxx17_empty_base_seen = false;
   return false;
 }
 
@@ -5790,9 +5805,14 @@ rs6000_return_in_memory (const_tree type
     }
 
   /* The ELFv2 ABI returns homogeneous VFP aggregates in registers */
+  bool cxx17_empty_base_seen = false;
   if (rs6000_discover_homogeneous_aggregate (TYPE_MODE (type), type,
-					     NULL, NULL))
-    return false;
+					     NULL, NULL,
+					     &cxx17_empty_base_seen))
+    {
+      if (!cxx17_empty_base_seen || !warn_psabi)
+	return false;
+    }
 
   /* The ELFv2 ABI returns aggregates up to 16B in registers */
   if (DEFAULT_ABI == ABI_ELFv2 && AGGREGATE_TYPE_P (type)
@@ -5802,7 +5822,21 @@ rs6000_return_in_memory (const_tree type
   if (AGGREGATE_TYPE_P (type)
       && (aix_struct_return
 	  || (unsigned HOST_WIDE_INT) int_size_in_bytes (type) > 8))
-    return true;
+    {
+      if (cxx17_empty_base_seen)
+	{
+	  inform (input_location,
+		  "prior to GCC 10, parameters of type %qT were passed "
+		  "incorrectly for C++17", type);
+	  return false;
+	}
+
+      return true;
+    }
+
+  /* The ELFv2 ABI returns homogeneous VFP aggregates in registers.  */
+  if (cxx17_empty_base_seen)
+    return false;
 
   /* Allow -maltivec -mabi=no-altivec without warning.  Altivec vector
      modes only exist for GCC vector types if -maltivec.  */
@@ -6141,7 +6175,8 @@ rs6000_function_arg_boundary (machine_mo
   machine_mode elt_mode;
   int n_elts;
 
-  rs6000_discover_homogeneous_aggregate (mode, type, &elt_mode, &n_elts);
+  rs6000_discover_homogeneous_aggregate (mode, type, &elt_mode, &n_elts,
+					 NULL);
 
   if (DEFAULT_ABI == ABI_V4
       && (GET_MODE_SIZE (mode) == 8
@@ -6409,7 +6444,8 @@ rs6000_function_arg_advance_1 (CUMULATIV
   machine_mode elt_mode;
   int n_elts;
 
-  rs6000_discover_homogeneous_aggregate (mode, type, &elt_mode, &n_elts);
+  rs6000_discover_homogeneous_aggregate (mode, type, &elt_mode, &n_elts,
+					 NULL);
 
   /* Only tick off an argument if we're not recursing.  */
   if (depth == 0)
@@ -6993,7 +7029,9 @@ rs6000_function_arg (cumulative_args_t c
       return GEN_INT (cum->call_cookie & ~CALL_LIBCALL);
     }
 
-  rs6000_discover_homogeneous_aggregate (mode, type, &elt_mode, &n_elts);
+  bool cxx17_empty_base_seen = false;
+  rs6000_discover_homogeneous_aggregate (mode, type, &elt_mode, &n_elts,
+					 &cxx17_empty_base_seen);
 
   if (TARGET_MACHO && rs6000_darwin64_struct_check_p (mode, type))
     {
@@ -7229,7 +7267,7 @@ rs6000_arg_partial_bytes (cumulative_arg
   int n_elts;
 
   rs6000_discover_homogeneous_aggregate (arg.mode, arg.type,
-					 &elt_mode, &n_elts);
+					 &elt_mode, &n_elts, NULL);
 
   if (DEFAULT_ABI == ABI_V4)
     return 0;
--- gcc/config/rs6000/rs6000.c.jj	2020-04-17 08:49:49.040683868 +0200
+++ gcc/config/rs6000/rs6000.c	2020-04-22 12:06:34.988310057 +0200
@@ -22428,7 +22428,8 @@ rs6000_function_value (const_tree valtyp
   mode = TYPE_MODE (valtype);
 
   /* The ELFv2 ABI returns homogeneous VFP aggregates in registers.  */
-  if (rs6000_discover_homogeneous_aggregate (mode, valtype, &elt_mode, &n_elts))
+  if (rs6000_discover_homogeneous_aggregate (mode, valtype, &elt_mode, &n_elts,
+					     NULL))
     {
       int first_reg, n_regs;
Comment 6 Bill Schmidt 2020-04-22 16:08:27 UTC
The ELFv2 ABI has a prominent note specifying:

"Floating-point and vector aggregates that contain padding words and integer fields with a width of 0 should not be treated as homogeneous aggregates."
Comment 7 Bill Schmidt 2020-04-22 16:08:52 UTC
ELF V1 does not have a concept of homogeneous aggregates.
Comment 8 Bill Schmidt 2020-04-22 16:11:51 UTC
Thus the compiler is acting as expected in both cases, so far as I can see.  If C++17 has added new hidden fields, that seems to have introduced an incompatibility between C++17 and C++14 targeted code for the ELFv2 ABI.
Comment 9 Jakub Jelinek 2020-04-22 16:33:41 UTC
CCing Jason on the rationale for adding the empty base field.  The PowerPC ABI certainly describes what is at the source level and how it is mapped to the calling conventions, not what the compiler adds to the TYPE_FIELDS.
And, because the empty base has 0 size, there is no padding as far as the layout of the class that contains it is involved.
Comment 10 Jonathan Wakely 2020-04-22 16:56:06 UTC
(In reply to Bill Schmidt from comment #8)
> If C++17 has added new hidden fields

It hasn't. What changed is how G++ represents such types. The code is the same in C++14 and C++17, i.e. there's an empty base class present in the source code, not something introduced by the language. And there's no such thing as a "hidden field" in any version of the C++ standard. C++17 didn't introduce anything here.
Comment 11 Andrew Pinski 2020-04-22 18:16:21 UTC
*** Bug 94720 has been marked as a duplicate of this bug. ***
Comment 12 CVS Commits 2020-04-23 08:00:58 UTC
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:a39ed81b8a0b46320a7c6ece3f7ad4c3f8519609

commit r10-7907-ga39ed81b8a0b46320a7c6ece3f7ad4c3f8519609
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Thu Apr 23 09:59:57 2020 +0200

    rs6000: Fix C++14 vs. C++17 ABI bug on powerpc64le [PR94707]
    
    As mentioned in the PR and on IRC, the recently added struct-layout-1.exp
    new tests FAIL on powerpc64le-linux (among other targets).
    FAIL: tmpdir-g++.dg-struct-layout-1/t032 cp_compat_x_tst.o-cp_compat_y_tst.o execute
    FAIL: tmpdir-g++.dg-struct-layout-1/t058 cp_compat_x_tst.o-cp_compat_y_tst.o execute
    FAIL: tmpdir-g++.dg-struct-layout-1/t059 cp_compat_x_tst.o-cp_compat_y_tst.o execute
    in particular.  The problem is that the presence or absence of the C++17
    artificial empty base fields, which have non-zero TYPE_SIZE, but zero
    DECL_SIZE, change the ABI decisions, if it is present (-std=c++17), the type
    might not be considered homogeneous, while if it is absent (-std=c++14), it
    can be.
    
    The following patch fixes that and emits a -Wpsabi inform; perhaps more
    often than it could, because the fact that rs6000_discover_homogeneous_aggregate
    returns true when it didn't in in GCC 7/8/9 with -std=c++17 doesn't still
    mean it will make a different ABI decision, but the warning triggered only
    on the test I've changed (the struct-layout-1.exp tests use -w -Wno-psabi
    already).
    
    2020-04-23  Jakub Jelinek  <jakub@redhat.com>
    
            PR target/94707
            * config/rs6000/rs6000-call.c (rs6000_aggregate_candidate): Add
            cxx17_empty_base_seen argument.  Pass it to recursive calls.
            Ignore cxx17_empty_base_field_p fields after setting
            *cxx17_empty_base_seen to true.
            (rs6000_discover_homogeneous_aggregate): Adjust
            rs6000_aggregate_candidate caller.  With -Wpsabi, diagnose homogeneous
            aggregates with C++17 empty base fields.
    
            * g++.dg/tree-ssa/pr27830.C: Use -Wpsabi -w for -std=c++17 and higher.
Comment 13 Jakub Jelinek 2020-04-23 08:08:23 UTC
Fixed in 10; I assume we don't want to backport.
Comment 14 CVS Commits 2020-04-23 12:44:02 UTC
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:239cfd92e9ce5014a7616f692e0c6d4f337227b8

commit r10-7910-g239cfd92e9ce5014a7616f692e0c6d4f337227b8
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Thu Apr 23 14:43:18 2020 +0200

    rs6000: Small improvement to the C++17 ABI fix [PR94707]
    
    Anyway, based on IRC discussion with Richard Sandiford on IRC, we should
    probably test type uids instead of type pointers because type uids aren't
    reused, but type pointers in a very bad luck case could be, and having the
    static var at filescope and GTY((deletable)) is an overkill (and with costs
    during GC time).
    
    2020-04-23  Jakub Jelinek  <jakub@redhat.com>
    
            PR target/94707
            * config/rs6000/rs6000-call.c (rs6000_discover_homogeneous_aggregate):
            Use TYPE_UID (TYPE_MAIN_VARIANT (type)) instead of type to check
            if the same type has been diagnosed most recently already.
Comment 15 CVS Commits 2020-04-29 07:06:49 UTC
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:575ac27fd5f18ffc9cfce8a99e987f52c5b898c9

commit r10-8027-g575ac27fd5f18ffc9cfce8a99e987f52c5b898c9
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Apr 29 09:01:49 2020 +0200

    c++, middle-end, rs6000: Fix C++17 ABI incompatibilities during class layout and [[no_unique_address]] handling [PR94707]
    
    As reported by Iain and David, powerpc-darwin and powerpc-aix* have C++14
    vs. C++17 ABI incompatibilities which are not fixed by mere adding of
    cxx17_empty_base_field_p calls.  Unlike the issues that were seen on other
    targets where the artificial empty base field affected function argument
    passing or returning of values, on these two targets the difference is
    during class layout, not afterwards (e.g.
    struct empty_base {};
    struct S : public empty_base { unsigned long long l[2]; };
    will have different __alignof__ (S) between C++14 and C++17 (or possibly
    with double instead of unsigned long long too)).
    
    I've tried:
    struct X { };
    struct Y { int : 0; };
    struct Z { int : 0; Y y; };
    struct U : public X { X q; };
    struct A { float a, b, c, d; };
    struct B : public X { float a, b, c, d; };
    struct C : public Y { float a, b, c, d; };
    struct D : public Z { float a, b, c, d; };
    struct E : public U { float a, b, c, d; };
    struct F { [[no_unique_address]] X x; float a, b, c, d; };
    struct G { [[no_unique_address]] Y y; float a, b, c, d; };
    struct H { [[no_unique_address]] Z z; float a, b, c, d; };
    struct I { [[no_unique_address]] U u; float a, b, c, d; };
    struct J { float a, b; [[no_unique_address]] X x; float c, d; };
    struct K { float a, b; [[no_unique_address]] Y y; float c, d; };
    struct L { float a, b; [[no_unique_address]] Z z; float c, d; };
    struct M { float a, b; [[no_unique_address]] U u; float c, d; };
     #define T(S, s) extern S s; extern void foo##s (S); int bar##s () { foo##s (s); return 0; }
    T (A, a)
    T (B, b)
    T (C, c)
    T (D, d)
    T (E, e)
    T (F, f)
    T (G, g)
    T (H, h)
    T (I, i)
    T (J, j)
    T (K, k)
    T (L, l)
    T (M, m)
    testcase on powerpc64-linux.  Results:
    G++ 9 -std=c++14                A, B, C passed in fprs, the rest in gprs
    G++ 9 -std=c++17                A passed in fprs, the rest in gprs
    current trunk -std=c++14 & 17   A, B, C passed in fprs, the rest in gprs
    patched trunk -std=c++14 & 17   A, B, C, F, G, J, K passed in fprs, the rest in gprs
    clang++ [*] -std=c++14 & 17     A, B, C, F, G, J, K passed in fprs, the rest in gprs
    [*] clang version 11.0.0 (git@github.com:llvm/llvm-project.git 5c352e69e76a26e4eda075e20aa6a9bb7686042c)
    
    Is that what we want?  I think it matches the stated intent of P0840R2 or
    what Jason/Jonathan said, and doing something different like e.g. not
    treating C, G and K as homogenous because of the int : 0 in empty bases
    or in zero sized [[no_unique_address] fields would be quite hard to
    implement (because for C++14 the FIELD_DECL just isn't there).
    
    2020-04-29  Jakub Jelinek  <jakub@redhat.com>
    
            PR target/94707
            * tree-core.h (tree_decl_common): Note decl_flag_0 used for
            DECL_FIELD_ABI_IGNORED.
            * tree.h (DECL_FIELD_ABI_IGNORED): Define.
            * calls.h (cxx17_empty_base_field_p): Change into a temporary
            macro, check DECL_FIELD_ABI_IGNORED flag with no "no_unique_address"
            attribute.
            * calls.c (cxx17_empty_base_field_p): Remove.
            * tree-streamer-out.c (pack_ts_decl_common_value_fields): Handle
            DECL_FIELD_ABI_IGNORED.
            * tree-streamer-in.c (unpack_ts_decl_common_value_fields): Likewise.
            * lto-streamer-out.c (hash_tree): Likewise.
            * config/rs6000/rs6000-call.c (rs6000_aggregate_candidate): Rename
            cxx17_empty_base_seen to empty_base_seen, change type to int *,
            adjust recursive calls, use DECL_FIELD_ABI_IGNORED instead of
            cxx17_empty_base_field_p, if "no_unique_address" attribute is
            present, propagate that to the caller too.
            (rs6000_discover_homogeneous_aggregate): Adjust
            rs6000_aggregate_candidate caller, emit different diagnostics
            when c++17 empty base fields are present and when empty
            [[no_unique_address]] fields are present.
            * config/rs6000/rs6000.c (rs6000_special_round_type_align,
            darwin_rs6000_special_round_type_align): Skip DECL_FIELD_ABI_IGNORED
            fields.
    
            * class.c (build_base_field): Set DECL_FIELD_ABI_IGNORED on C++17 empty
            base artificial FIELD_DECLs.
            (layout_class_type): Set DECL_FIELD_ABI_IGNORED on empty class
            field_poverlapping_p FIELD_DECLs.
    
            * lto-common.c (compare_tree_sccs_1): Handle DECL_FIELD_ABI_IGNORED.
    
            * g++.target/powerpc/pr94707-1.C: New test.
            * g++.target/powerpc/pr94707-2.C: New test.
            * g++.target/powerpc/pr94707-3.C: New test.
            * g++.target/powerpc/pr94707-4.C: New test.
            * g++.target/powerpc/pr94707-5.C: New test.
            * g++.target/powerpc/pr94707-4.C: New test.