I think since r9-547-gb72feab889cd7925fab59771269638fcc88bc195 we miscompile the following testcase on big-endian (verified on powerpc64-linux) at -O2:

union U { char a[8]; struct S { unsigned int b : 8, c : 13, d : 11; } e; } u;

__attribute__((noipa)) int
foo (void)
{
  __builtin_memset (&u.a, 0xf4, sizeof (u.a));
  return u.e.c;
}

__attribute__((noipa)) int
bar (void)
{
  asm volatile ("" : : "g" (&u) : "memory");
  return u.e.c;
}

int
main ()
{
  int a = foo ();
  __builtin_memset (&u.a, 0xf4, sizeof (u.a));
  int b = bar ();
  if (a != b)
    __builtin_abort ();
  return 0;
}
For 9.x, I think we need to require for that case that ref->size is a multiple of BITS_PER_UNIT (and I must say I'm a little worried that for this native_interpret_expr case we don't require known_eq (ref->size, maxsize)). For the trunk, I'm actually working on support for arbitrary bitfield reads.
My bet is that something like the following is missing somewhere:

+  if (BYTES_BIG_ENDIAN)
+    bitpos = TYPE_PRECISION (type) - bitpos - bitsize;

I have not looked where, though. I do know BIT_INSERT_EXPR is broken in that respect. The following patch fixes that:

commit 7490f8e4f52d38d1ae265bec50ab518e86a53d19
Author: Andrew Pinski <apinski@marvell.com>
Date:   Tue Dec 17 03:54:46 2019 +0000

    Fix big-endian constant folding of BIT_INSERT_EXPR

    Big-endian has a different idea of what bit order should be compared to
    little-endian.  This fixes it by correcting the shift/mask to be in the
    correct order.

    Change-Id: I3e4ba3c27d62a20affd08100083cee5f921edc7d
    Signed-off-by: Andrew Pinski <apinski@marvell.com>

diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index aefa91666e2..f515d776c57 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -12506,6 +12506,8 @@ fold_ternary_loc (location_t loc, enum tree_code code, tree type,
 	{
 	  unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
 	  unsigned bitsize = TYPE_PRECISION (TREE_TYPE (arg1));
+	  if (BYTES_BIG_ENDIAN)
+	    bitpos = TYPE_PRECISION (type) - bitpos - bitsize;
 	  wide_int tem = (wi::to_wide (arg0)
 			  & wi::shifted_mask (bitpos, bitsize, true,
 					      TYPE_PRECISION (type)));
For 9.x, all we need is to add && multiple_p (ref->size, BITS_PER_UNIT), because the code doesn't attempt to handle loads or stores that aren't byte-aligned and a whole number of bytes in size. I found this by code inspection while trying to add trunk support for arbitrary load bit offsets or ref->size.
Created attachment 47915 [details] gcc10-pr93945.patch Untested trunk patch.
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:5f9cd512c4278621435cce486dd00248ea2e821c

commit r10-6885-g5f9cd512c4278621435cce486dd00248ea2e821c
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Thu Feb 27 10:12:52 2020 +0100

    sccvn: Handle non-byte aligned offset or size for memset (, 123, ) [PR93945]

    The following is the last spot in vn_reference_lookup_3 that didn't allow
    non-byte aligned offsets or sizes.  To be precise, it did allow size that
    wasn't a multiple of byte size and that caused a wrong-code issue on
    big-endian, as the pr93945.c testcase shows, so for GCC 9 we should add
    a && multiple_p (ref->size, BITS_PER_UNIT) check instead.  For the memset
    with SSA_NAME middle-argument, it still requires byte-aligned offset, as
    we'd otherwise need to rotate the value at runtime.

    2020-02-27  Jakub Jelinek  <jakub@redhat.com>

    	PR tree-optimization/93582
    	PR tree-optimization/93945
    	* tree-ssa-sccvn.c (vn_reference_lookup_3): Handle memset with
    	non-zero INTEGER_CST second argument and ref->offset or ref->size
    	not a multiple of BITS_PER_UNIT.

    	* gcc.dg/tree-ssa/pr93582-9.c: New test.
    	* gcc.c-torture/execute/pr93945.c: New test.
The releases/gcc-9 branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:a460bf38dc3582ce1f559cc84084ca27e429b34c

commit r9-8299-ga460bf38dc3582ce1f559cc84084ca27e429b34c
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Thu Feb 27 11:21:52 2020 +0100

    sccvn: Punt on ref->size not multiple of 8 for memset (, 123, ) in 9.x [PR93945]

    And here is the corresponding 9.x change, where the patch just punts if
    ref->size is not whole bytes, like we already punt if offseti is not
    byte-aligned.

    2020-02-27  Jakub Jelinek  <jakub@redhat.com>

    	PR tree-optimization/93945
    	* tree-ssa-sccvn.c (vn_reference_lookup_3): For memset with non-zero
    	second operand, require ref->size to be a multiple of BITS_PER_UNIT.

    	* gcc.c-torture/execute/pr93945.c: New test.
Fixed for 9.3+.