Take the following code: struct s { int i1:1; int i2:1; int i3:1; }; void f(struct s *x, struct s *y) { *x = *y; } In 3.4.0 (and currently with the C++ compiler too), we were able to get optimal code: movl 8(%esp), %eax movl (%eax), %edx movl 4(%esp), %eax movl %edx, (%eax) ret But in (and after) 4.0.0 and the C compiler we get: f: pushl %esi pushl %ebx movl 16(%esp), %eax movl 12(%esp), %esi movzbl (%eax), %eax movb %al, %dl movb %al, %bl salb $5, %al sarb $7, %al movzbl %al, %ecx movl (%esi), %eax salb $6, %dl andl $1, %ecx sarb $7, %dl movzbl %dl, %edx salb $7, %bl andl $-7, %eax sall $2, %ecx andl $1, %edx sarb $7, %bl addl %edx, %edx orl %ecx, %eax orl %edx, %eax movzbl %bl, %edx andl $1, %edx andl $-2, %eax orl %edx, %eax movl %eax, (%esi) popl %ebx popl %esi ret Which is much worse
Note this also effects PPC, though not as badly as there is only one load: in 3.3.3: _f: lwz r0,0(r4) stw r0,0(r3) blr in 4.0.0 and above: _f: lwz r11,0(r4) lwz r0,0(r3) slwi r2,r11,2 slwi r9,r11,1 rlwimi r0,r2,30,2,2 rlwimi r0,r9,31,1,1 rlwimi r0,r11,0,0,0 stw r0,0(r3) blr
The main difference between the C++ front-end and the C front-end in this respect is how they repesent bit-fields. for the C++ front-end we get the following output from SRA: Cannot scalarize variable D.1693 because its type cannot be decomposed but for the C front-end we get: Scan results: D.1237: n_uses=0 n_copies=2 Initial instantiation for D.1237 Using element-copy for D.1237 D.1237.i1 -> SR.12 D.1237.i2 -> SR.13 D.1237.i3 -> SR.14 So to me this looks like a SRA bug in that it should be using BIT_FIELD_REF instead if it can.
I have a fix, it does what I recommended in 22157.
I am no longer working on this, this is a much harder problem than I sugested. This is basically the same as PR 18268.
If we add more bit-fields, that is now fixed in 4.0.2 and 4.1.0 by: 2005-08-04 Richard Henderson <rth@redhat.com> PR 21529 * params.def (PARAM_SRA_MAX_STRUCTURE_COUNT): New. * params.h (SRA_MAX_STRUCTURE_COUNT): New. * tree-sra.c (decide_block_copy): Use it. Disable element copy if we'd have to instantiate too many members. But the orginal code is not fixed as we don't have too many members.
*** Bug 22157 has been marked as a duplicate of this bug. ***
Nothing can happen until 4.2.0 for this issue really as we need a real copy prop for structs.
This bug ought to have a much higher priority than P5 IMHO.
Steven, if we increased the priority of the bug, what solution would you envision? This is not in any way meant to be a rhetorical or sarcastic question. My reading of the audit trail is that this bug is essentially a feature request for a new optimization; given that, I don't think we can make it release-critical. Am I misunderstanding? Thanks, -- Mark
Will not be fixed in 4.1.1; adjust target milestone to 4.1.2.
Subject: Bug number PR 22156 A patch for this bug has been added to the patch tracker. The mailing list url for the patch is http://gcc.gnu.org/ml/gcc-patches/2007-03/msg01625.html
Subject: Bug 22156 Author: aoliva Date: Thu Apr 5 19:50:34 2007 New Revision: 123524 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=123524 Log: PR middle-end/22156 * tree-sra.c (struct sra_elt): Add in_bitfld_block. Remove all_no_warning. (struct sra_walk_fns): Remove use_all parameter from use. (sra_hash_tree): Handle BIT_FIELD_REFs. (sra_elt_hash): Don't hash bitfld blocks. (sra_elt_eq): Skip them in parent compares as well. Handle BIT_FIELD_REFs. (sra_walk_expr): Don't maintain or pass down use_all_p. (scan_use): Remove use_all parameter. (scalarize_use): Likewise. Re-expand assignment to BIT_FIELD_REF of gimple_reg. De-scalarize before input or output, and re-scalarize after output. Don't mark anything for no warning. (scalarize_ldst): Adjust. (scalarize_walk_gimple_modify_statement): Likewise. (build_element_name_1): Handle BIT_FIELD_REFs. (instantiate_element): Don't warn for any element whose parent is used as a whole. (instantiate_missing_elements_1): Return the sra_elt. (canon_type_for_field): New. (try_instantiate_multiple_fields): New. (instantiate_missing_elemnts): Use them. (mark_no_warning): Removed. (generate_one_element_ref): Handle BIT_FIELD_REFs. (REPLDUP, sra_build_elt_assignment): New. (generate_copy_inout): Use them. (generate_element_copy): Likewise. Handle bitfld differences. (generate_element_zero): Don't recurse for blocks. Use sra_build_elt_assignment. (generate_one_element_int): Take elt instead of var. Use sra_build_elt_assignment. (generate_element_init_1): Adjust. (scalarize_use, scalarize_copy): Use REPLDUP. (scalarize_ldst): Move assert before dereference. (dump_sra_elt_name): Handle BIT_FIELD_REFs. Modified: trunk/gcc/ChangeLog trunk/gcc/tree-sra.c
Fixed in trunk for 4.3, not sure about backporting...
Subject: Bug 22156 Author: aoliva Date: Mon Apr 30 18:41:11 2007 New Revision: 124302 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=124302 Log: PR middle-end/22156 Temporarily revert: 2007-04-06 Andreas Tobler <a.tobler@schweiz.org> 2007-04-05 Alexandre Oliva <aoliva@redhat.com> * tree-sra.c (try_instantiate_multiple_fields): Needlessly initialize align to silence bogus warning. 2007-04-05 Alexandre Oliva <aoliva@redhat.com> * tree-sra.c (struct sra_elt): Add in_bitfld_block. Remove all_no_warning. (struct sra_walk_fns): Remove use_all parameter from use. (sra_hash_tree): Handle BIT_FIELD_REFs. (sra_elt_hash): Don't hash bitfld blocks. (sra_elt_eq): Skip them in parent compares as well. Handle BIT_FIELD_REFs. (sra_walk_expr): Don't maintain or pass down use_all_p. (scan_use): Remove use_all parameter. (scalarize_use): Likewise. Re-expand assignment to BIT_FIELD_REF of gimple_reg. De-scalarize before input or output, and re-scalarize after output. Don't mark anything for no warning. (scalarize_ldst): Adjust. (scalarize_walk_gimple_modify_statement): Likewise. (build_element_name_1): Handle BIT_FIELD_REFs. (instantiate_element): Don't warn for any element whose parent is used as a whole. (instantiate_missing_elements_1): Return the sra_elt. (canon_type_for_field): New. (try_instantiate_multiple_fields): New. (instantiate_missing_elemnts): Use them. (mark_no_warning): Removed. (generate_one_element_ref): Handle BIT_FIELD_REFs. (REPLDUP, sra_build_elt_assignment): New. (generate_copy_inout): Use them. (generate_element_copy): Likewise. Handle bitfld differences. (generate_element_zero): Don't recurse for blocks. Use sra_build_elt_assignment. (generate_one_element_int): Take elt instead of var. Use sra_build_elt_assignment. (generate_element_init_1): Adjust. (scalarize_use, scalarize_copy): Use REPLDUP. (scalarize_ldst): Move assert before dereference. (dump_sra_elt_name): Handle BIT_FIELD_REFs. Modified: trunk/gcc/ChangeLog trunk/gcc/tree-sra.c
Here is a new testcase which shows the problem on the trunk on powerpc-darwin: struct s { int i1:1; int i2:1; }; void f(struct s *x, struct s *y) { *x = *y; }
Subject: Bug 22156 Author: aoliva Date: Mon Oct 1 16:35:55 2007 New Revision: 128908 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=128908 Log: PR middle-end/22156 * tree-sra.c (struct sra_elt): Add in_bitfld_block. (sra_hash_tree): Handle BIT_FIELD_REFs. (sra_elt_hash): Don't hash bitfld blocks. (sra_elt_eq): Skip them in parent compares as well. Handle BIT_FIELD_REFs. (build_element_name_1): Handle BIT_FIELD_REFs. (instantiate_element): Propagate nowarn from parents. Create BIT_FIELD_REF for variables that are widened by scalarization. Gimple-zero-initialize all bit-field variables that are not part of parameters that are going to be scalarized on entry. (instantiate_missing_elements_1): Return the sra_elt. (canon_type_for_field): New. (try_instantiate_multiple_fields): New. Infer widest possible access mode from decl or member type, but clip it at word size, and only widen it if a field crosses an alignment boundary. (instantiate_missing_elements): Use them. (generate_one_element_ref): Handle BIT_FIELD_REFs. (scalar_bitfield_p): New. (sra_build_assignment): Optimize assignments from scalarizable BIT_FIELD_REFs. Use BITS_BIG_ENDIAN to determine shift counts. (REPLDUP): New. (sra_build_bf_assignment): New. Optimize assignments to scalarizable BIT_FIELD_REFs. (sra_build_elt_assignment): New. Optimize BIT_FIELD_REF assignments to full variables. (generate_copy_inout): Use the new macros and functions. (generate_element_copy): Likewise. Handle bitfld differences. (generate_element_zero): Don't recurse for blocks. Use sra_build_elt_assignment. (generate_one_element_init): Take elt instead of var. Use sra_build_elt_assignment. (generate_element_init_1): Adjust. (bitfield_overlap_info): New struct. (bitfield_overlaps_p): New. (sra_explode_bitfield_assignment): New. Adjust widened variables to account for endianness. (sra_sync_for_bitfield_assignment): New. (scalarize_use): Re-expand assignment to/from scalarized BIT_FIELD_REFs. Explode or sync needed members for BIT_FIELD_REFs accesses or assignments. Use REPLDUP. (scalarize_copy): Use REPLDUP. (scalarize_ldst): Move assert before dereference. Adjust EH handling. (dump_sra_elt_name): Handle BIT_FIELD_REFs. Modified: trunk/gcc/ChangeLog trunk/gcc/tree-sra.c
Patch checked in.
Subject: Bug 22156 Author: aoliva Date: Tue Oct 9 04:41:39 2007 New Revision: 129149 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=129149 Log: Add missing hunk in r129143 check in. Add references to PR 22156. Modified: trunk/gcc/ChangeLog
Subject: Bug 22156 Author: aoliva Date: Tue Oct 9 04:45:22 2007 New Revision: 129150 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=129150 Log: PR middle-end/22156 * tree-sra.c (instantiate_element): Use BYTES_BIG_ENDIAN for bit-field layout. (sra_build_assignment): Likewise. Set up mask depending on precision, not type. (sra_build_bf_assignment): Use BYTES_BIG_ENDIAN. Don't overflow computing bit masks. (sra_build_elt_assignment): Don't view-convert from signed to unsigned. (sra_explode_bitfield_assignment): Use bit-field type if possible. Use BYTES_BIG_ENDIAN. Modified: trunk/gcc/ChangeLog trunk/gcc/tree-sra.c
Subject: Bug 22156 Author: aoliva Date: Tue Oct 9 04:55:17 2007 New Revision: 129152 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=129152 Log: PR tree-optimization/33655 PR middle-end/22156 * tree-sra.c (bitfield_overlaps_p): When fld->element is INTEGER_CST, convert it to bitsizetype before size_binop call. Modified: trunk/gcc/ChangeLog trunk/gcc/tree-sra.c