struct S0 { unsigned f0; signed f2 : 11; signed : 6; } GlobS, *Ptr = &GlobS;
const struct S0 Initializer = {7, 3};

int main (void)
{
  for (unsigned i = 0; i <= 2; i++)
    *Ptr = Initializer;
  if (GlobS.f2 != 3)
    __builtin_abort ();
  return 0;
}

gcc -march=z13 -O2 t.c

(should fail for any arch which supports vector extensions)

During ifcvt we have

Start lowering bitfields
Lowering:
        Ptr.0_1->f2 = 3;
to:
        _ifc__24 = Ptr.0_1->D.2918;
        _ifc__25 = BIT_INSERT_EXPR <_ifc__24, 3, 0 (11 bits)>;
        Ptr.0_1->D.2918 = _ifc__25;
Done lowering bitfields
...
Match-and-simplified BIT_INSERT_EXPR <_ifc__24, 3, 0 (11 bits)> to 3
RHS BIT_INSERT_EXPR <_ifc__24, 3, 0 (11 bits)> simplified to 3
Setting value number of _ifc__25 to 3 (changed)
Replaced BIT_INSERT_EXPR <_ifc__24, 3, 0 (11 bits)> with 3 in all uses of
_ifc__25 = BIT_INSERT_EXPR <_ifc__24, 3, 0 (11 bits)>;
Value numbering stmt = Ptr.0_1->D.2918 = _ifc__25;

which in the end leads to the optimized tree output

int main ()
{
  struct S0 * Ptr.0_1;
  unsigned int _2;
  unsigned int _3;

  <bb 2> [local count: 268435458]:
  Ptr.0_1 = Ptr;
  MEM <vector(2) unsigned int> [(void *)Ptr.0_1] = { 7, 3 };
  _2 = BIT_FIELD_REF <GlobS, 32, 32>;
  _3 = _2 & 4292870144;
  if (_3 != 6291456)
    goto <bb 3>; [0.00%]
  else
    goto <bb 4>; [100.00%]

  <bb 3> [count: 0]:
  __builtin_abort ();

  <bb 4> [local count: 268435456]:
  return 0;
}

Since bitfields are left aligned on s390, the constant 3 is wrong and should rather be 0x600000.
Had a look at this and I see similar codegen for aarch64 when compiling for big-endian.

If I disable tree if-conversion (-fdisable-tree-ifcvt) I end up with:

MEM <unsigned long> [(void *)Ptr.0_1] = 30071062528;

which is the behaviour we want to see. This is achieved by store merging; we should have a look at how that pass handles this.

When ifcvt is enabled, lower_bitfield generates:

_ifc__24 = Ptr.0_1->D.4418;
_ifc__25 = BIT_INSERT_EXPR <_ifc__24, 3, 0 (11 bits)>;

That gets optimized to:

Ptr.0_1->D.4418 = 3;

whereas I expected it to get big-endianised to = 0x600000.

I was also surprised to see that the front-end already transforms:
'if (GlobS.f2 != 3)'
into
'(BIT_FIELD_REF <GlobS, 32, 32> & 4292870144) != 6291456'

Anyway, that's as far as I got. Not sure what the right solution is - should BIT_INSERT_EXPR <_ifc__24, 3, 0 (11 bits)> not fold to 0x600000?
> I was also surprised to see that the front-end already transforms:
> 'if (GlobS.f2 != 3)'
> into
> '(BIT_FIELD_REF <GlobS, 32, 32> & 4292870144) != 6291456'

That is from fold, specifically optimize_bit_field_compare.
https://gcc.gnu.org/pipermail/gcc-patches/2020-January/537612.html

I can't remember if the fix was committed or not ...
(In reply to Andrew Pinski from comment #3)
> https://gcc.gnu.org/pipermail/gcc-patches/2020-January/537612.html
>
> I can't remember if the fix was committed or not ...

It wasn't - it would be good to check with that fixed.

BIT_FIELD_REF and BIT_INSERT_EXPR work as if operating on memory, which is why they only operate on mode-precision registers. So interpret BIT_INSERT_EXPR as: spill the register to memory, do the bit operation in memory, and read the result back into a register. That in particular means the bit numbering for big-endian differs from little-endian.
I checked it locally and Pinski's recommended change does seem to fix the codegen for this case. I end up with:

MEM <vector(2) unsigned int> [(void *)Ptr.0_1] = { 7, 6291456 };

I am regression testing the change now on aarch64_be-none-elf. @Stefan, maybe you can test it on your z13?

The change in question is:

--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -13712,6 +13712,8 @@ fold_ternary_loc (location_t loc, enum tree_code code, tree type,
 	{
 	  unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
 	  unsigned bitsize = TYPE_PRECISION (TREE_TYPE (arg1));
+	  if (BYTES_BIG_ENDIAN)
+	    bitpos = TYPE_PRECISION (type) - bitpos - bitsize;
 	  wide_int tem = (wi::to_wide (arg0)
 			  & wi::shifted_mask (bitpos, bitsize, true,
 					      TYPE_PRECISION (type)));
I gave it a try on s390 and I also end up with

MEM <vector(2) unsigned int> [(void *)Ptr.0_1] = { 7, 6291456 };

Thanks for the very fast fix :)
My aarch64_be-none-elf regression testing also came back with no new failures.

@Pinski: given it was your suggestion, do you want the commit? ;)
(In reply to avieira from comment #7)
> My aarch64_be-none-elf regression testing also came back with no new
> failures.
>
> @Pinski: given it was your suggestion, do you want the commit? ;)

I am working on other things right now. It won't be for a few weeks until I have some time.
The master branch has been updated by Andre Simoes Dias Vieira <avieira@gcc.gnu.org>:

https://gcc.gnu.org/g:2e30e90a0c2bf8147a6d24854aa653c332c8f84f

commit r15-4336-g2e30e90a0c2bf8147a6d24854aa653c332c8f84f
Author: Andre Vieira <andre.simoesdiasvieira@arm.com>
Date:   Mon Oct 14 16:24:07 2024 +0100

    fold-const: Fix BIT_INSERT_EXPR folding for BYTES_BIG_ENDIAN [PR116997]

    Fix constant folding of BIT_INSER_EXPR for BYTES_BIG_ENDIAN targets.

    gcc/ChangeLog:

            PR middle-end/116997
            * fold-const.cc (fold_ternary_loc): Fix BIT_INSERT_EXPR constant
            folding for BYTES_BIG_ENDIAN targets.

    gcc/testsuite/ChangeLog:

            * gcc.dg/vect/pr116997.c: New test.

    Co-authored-by: Andrew Pinski <quic_apinski@quicinc.com>
The releases/gcc-14 branch has been updated by Andre Simoes Dias Vieira <avieira@gcc.gnu.org>:

https://gcc.gnu.org/g:b51b45eaf7131ec97b7fa180ffa6e8dedc24e74f

commit r14-10938-gb51b45eaf7131ec97b7fa180ffa6e8dedc24e74f
Author: Andre Vieira <andre.simoesdiasvieira@arm.com>
Date:   Mon Oct 14 16:24:07 2024 +0100

    fold-const: Fix BIT_INSERT_EXPR folding for BYTES_BIG_ENDIAN [PR116997]

    Fix constant folding of BIT_INSER_EXPR for BYTES_BIG_ENDIAN targets.

    gcc/ChangeLog:

            PR middle-end/116997
            * fold-const.cc (fold_ternary_loc): Fix BIT_INSERT_EXPR constant
            folding for BYTES_BIG_ENDIAN targets.

    gcc/testsuite/ChangeLog:

            * gcc.dg/vect/pr116997.c: New test.

    Co-authored-by: Andrew Pinski <quic_apinski@quicinc.com>
    (cherry picked from commit 2e30e90a0c2bf8147a6d24854aa653c332c8f84f)
The releases/gcc-13 branch has been updated by Andre Simoes Dias Vieira <avieira@gcc.gnu.org>:

https://gcc.gnu.org/g:57df36f0365218987fe3565523d4c272935a6561

commit r13-9200-g57df36f0365218987fe3565523d4c272935a6561
Author: Andre Vieira <andre.simoesdiasvieira@arm.com>
Date:   Mon Oct 14 16:24:07 2024 +0100

    fold-const: Fix BIT_INSERT_EXPR folding for BYTES_BIG_ENDIAN [PR116997]

    Fix constant folding of BIT_INSER_EXPR for BYTES_BIG_ENDIAN targets.

    gcc/ChangeLog:

            PR middle-end/116997
            * fold-const.cc (fold_ternary_loc): Fix BIT_INSERT_EXPR constant
            folding for BYTES_BIG_ENDIAN targets.

    gcc/testsuite/ChangeLog:

            * gcc.dg/vect/pr116997.c: New test.

    Co-authored-by: Andrew Pinski <quic_apinski@quicinc.com>
    (cherry picked from commit 2e30e90a0c2bf8147a6d24854aa653c332c8f84f)
Fixed on all known affected branches. Closing.