Bug 93945 - [9/10 Regression] memset of non-zero constant followed by bitfield read big-endian miscompilation since r9-547
Summary: [9/10 Regression] memset of non-zero constant followed by bitfield read big-e...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 9.2.1
: P2 normal
Target Milestone: 9.3
Assignee: Jakub Jelinek
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2020-02-26 11:08 UTC by Jakub Jelinek
Modified: 2020-02-27 10:29 UTC (History)
1 user (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2020-02-26 00:00:00


Attachments
gcc10-pr93945.patch (1.77 KB, patch)
2020-02-26 13:01 UTC, Jakub Jelinek
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jakub Jelinek 2020-02-26 11:08:56 UTC
I think since r9-547-gb72feab889cd7925fab59771269638fcc88bc195 we miscompile following testcase on big-endian (verified on powerpc64-linux) at -O2:
union U { char a[8]; struct S { unsigned int b : 8, c : 13, d : 11; } e; } u;

__attribute__((noipa)) int
foo (void)
{
  __builtin_memset (&u.a, 0xf4, sizeof (u.a));
  return u.e.c;
}

__attribute__((noipa)) int
bar (void)
{
  asm volatile ("" : : "g" (&u) : "memory");
  return u.e.c;
}

int
main ()
{
  int a = foo ();
  __builtin_memset (&u.a, 0xf4, sizeof (u.a));
  int b = bar ();
  if (a != b)
    __builtin_abort ();
  return 0;
}
Comment 1 Jakub Jelinek 2020-02-26 11:13:20 UTC
For 9.x, I think we need to require for that case that ref->size is a multiple of BITS_PER_UNIT (and I must say I'm a little bit worried that for this native_interpret_expr case we don't require that known_eq (ref->size, maxsize)).
For the trunk, I'm actually working on support for arbitrary bitfield reads.
Comment 2 Andrew Pinski 2020-02-26 11:19:35 UTC
My bet is there missing the following somewhere:
+         if (BYTES_BIG_ENDIAN)
+           bitpos = TYPE_PRECISION (type) - bitpos - bitsize;

I have not looked where though.  I do know BIT_INSERT_EXPR is broken with that respect.
The following patch fixes that:
commit 7490f8e4f52d38d1ae265bec50ab518e86a53d19
Author: Andrew Pinski <apinski@marvell.com>
Date:   Tue Dec 17 03:54:46 2019 +0000

    Fix big-endian constant folding of BIT_INSERT_EXPR

    Big-endian has a different idea of what bit order
    should be compared to little-endian.  This fixes
    it by correcting the shift/mask to be in the correct
    order.

    Change-Id: I3e4ba3c27d62a20affd08100083cee5f921edc7d
    Signed-off-by: Andrew Pinski <apinski@marvell.com>

diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index aefa91666e2..f515d776c57 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -12506,6 +12506,8 @@ fold_ternary_loc (location_t loc, enum tree_code code, tree type,
        {
          unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
          unsigned bitsize = TYPE_PRECISION (TREE_TYPE (arg1));
+         if (BYTES_BIG_ENDIAN)
+           bitpos = TYPE_PRECISION (type) - bitpos - bitsize;
          wide_int tem = (wi::to_wide (arg0)
                          & wi::shifted_mask (bitpos, bitsize, true,
                                              TYPE_PRECISION (type)));
Comment 3 Jakub Jelinek 2020-02-26 11:25:25 UTC
For 9.x all we need is add && multiple_p (ref->size, BITS_PER_UNIT) because the code doesn't attempt to handle loads or stores that aren't byte aligned and whole byte sized.
I found this by code inspection when trying to add trunk support for arbitrary load bit offset or ref->size.
Comment 4 Jakub Jelinek 2020-02-26 13:01:11 UTC
Created attachment 47915 [details]
gcc10-pr93945.patch

Untested trunk patch.
Comment 5 GCC Commits 2020-02-27 09:14:24 UTC
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:5f9cd512c4278621435cce486dd00248ea2e821c

commit r10-6885-g5f9cd512c4278621435cce486dd00248ea2e821c
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Thu Feb 27 10:12:52 2020 +0100

    sccvn: Handle non-byte aligned offset or size for memset (, 123, ) [PR93945]
    
    The following is the last spot in vn_reference_lookup_3 that didn't allow
    non-byte aligned offsets or sizes.  To be precise, it did allow size that
    wasn't multiple of byte size and that caused a wrong-code issue on
    big-endian, as the pr93945.c testcase shows, so for GCC 9 we should add
    && multiple_p (ref->size, BITS_PER_UNIT) check instead.
    For the memset with SSA_NAME middle-argument, it still requires byte-aligned
    offset, as we'd otherwise need to rotate the value at runtime.
    
    2020-02-27  Jakub Jelinek  <jakub@redhat.com>
    
    	PR tree-optimization/93582
    	PR tree-optimization/93945
    	* tree-ssa-sccvn.c (vn_reference_lookup_3): Handle memset with
    	non-zero INTEGER_CST second argument and ref->offset or ref->size
    	not a multiple of BITS_PER_UNIT.
    
    	* gcc.dg/tree-ssa/pr93582-9.c: New test.
    	* gcc.c-torture/execute/pr93945.c: New test.
Comment 6 GCC Commits 2020-02-27 10:24:18 UTC
The releases/gcc-9 branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:a460bf38dc3582ce1f559cc84084ca27e429b34c

commit r9-8299-ga460bf38dc3582ce1f559cc84084ca27e429b34c
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Thu Feb 27 11:21:52 2020 +0100

    sccvn: Punt on ref->size not multiple of 8 for memset (, 123, ) in 9.x [PR93945]
    
    And here is the corresponding 9.x change where we the patch just punts if
    ref->size is not whole bytes, like we already punt if offseti is not byte
    aligned.
    
    2020-02-27  Jakub Jelinek  <jakub@redhat.com>
    
    	PR tree-optimization/93945
    	* tree-ssa-sccvn.c (vn_reference_lookup_3): For memset with non-zero
    	second operand, require ref->size to be a multiple of BITS_PER_UNIT.
    
    	* gcc.c-torture/execute/pr93945.c: New test.
Comment 7 Jakub Jelinek 2020-02-27 10:29:09 UTC
Fixed for 9.3+.