Bug 22156

Summary: [4.0/4.1/4.2/4.3 Regression] bit-field copying regressed
Product: gcc Reporter: Andrew Pinski <pinskia>
Component: middle-endAssignee: Alexandre Oliva <aoliva>
Status: RESOLVED FIXED    
Severity: minor CC: gcc-bugs, kazu, mark, pawel_sikora, rth
Priority: P5 Keywords: missed-optimization
Version: 4.1.0   
Target Milestone: 4.3.0   
See Also: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=15596
Host: Target:
Build: Known to work:
Known to fail: Last reconfirmed: 2007-03-20 04:42:47
Bug Depends on: 14295, 18268, 22157    
Bug Blocks: 19466    

Description Andrew Pinski 2005-06-23 05:02:43 UTC
Take the following code:
struct s
{
  int i1:1;
  int i2:1;
  int i3:1;
};
void f(struct s *x, struct s *y) { *x = *y; }

In 3.4.0 (and currently with the C++ compiler too), we were able to get optimal code:
        movl    8(%esp), %eax
        movl    (%eax), %edx
        movl    4(%esp), %eax
        movl    %edx, (%eax)
        ret

But in (and after) 4.0.0 and the C compiler we get:
f:
        pushl   %esi
        pushl   %ebx
        movl    16(%esp), %eax
        movl    12(%esp), %esi
        movzbl  (%eax), %eax
        movb    %al, %dl
        movb    %al, %bl
        salb    $5, %al
        sarb    $7, %al
        movzbl  %al, %ecx
        movl    (%esi), %eax
        salb    $6, %dl
        andl    $1, %ecx
        sarb    $7, %dl
        movzbl  %dl, %edx
        salb    $7, %bl
        andl    $-7, %eax
        sall    $2, %ecx
        andl    $1, %edx
        sarb    $7, %bl
        addl    %edx, %edx
        orl     %ecx, %eax
        orl     %edx, %eax
        movzbl  %bl, %edx
        andl    $1, %edx
        andl    $-2, %eax
        orl     %edx, %eax
        movl    %eax, (%esi)
        popl    %ebx
        popl    %esi
        ret

Which is much worse
Comment 1 Andrew Pinski 2005-06-23 05:04:32 UTC
Note this also effects PPC, though not as badly as there is only one load:
in 3.3.3:
_f:
        lwz r0,0(r4)
        stw r0,0(r3)
        blr

in 4.0.0 and above:
_f:
        lwz r11,0(r4)
        lwz r0,0(r3)
        slwi r2,r11,2
        slwi r9,r11,1
        rlwimi r0,r2,30,2,2
        rlwimi r0,r9,31,1,1
        rlwimi r0,r11,0,0,0
        stw r0,0(r3)
        blr
Comment 2 Andrew Pinski 2005-06-23 05:11:40 UTC
The main difference between the C++ front-end and the C front-end in this respect is how they 
repesent bit-fields.
for the C++ front-end we get the following output from SRA:
Cannot scalarize variable D.1693 because its type cannot be decomposed

but for the C front-end we get:
Scan results:
D.1237: n_uses=0 n_copies=2

Initial instantiation for D.1237
Using element-copy for D.1237
  D.1237.i1 -> SR.12
  D.1237.i2 -> SR.13
  D.1237.i3 -> SR.14

So to me this looks like a SRA bug in that it should be using BIT_FIELD_REF instead if it can.
Comment 3 Andrew Pinski 2005-06-23 19:49:18 UTC
I have a fix, it does what I recommended in 22157.
Comment 4 Andrew Pinski 2005-07-04 10:26:30 UTC
I am no longer working on this, this is a much harder problem than I sugested.  This is basically the 
same as PR 18268.
Comment 5 Andrew Pinski 2005-08-05 20:45:18 UTC
If we add more bit-fields, that is now fixed in 4.0.2 and 4.1.0 by:
2005-08-04  Richard Henderson  <rth@redhat.com>

        PR 21529
        * params.def (PARAM_SRA_MAX_STRUCTURE_COUNT): New.
        * params.h (SRA_MAX_STRUCTURE_COUNT): New.
        * tree-sra.c (decide_block_copy): Use it.  Disable element copy
        if we'd have to instantiate too many members.

But the orginal code is not fixed as we don't have too many members.
Comment 6 Andrew Pinski 2005-09-15 14:27:37 UTC
*** Bug 22157 has been marked as a duplicate of this bug. ***
Comment 7 Andrew Pinski 2005-10-27 20:31:57 UTC
Nothing can happen until 4.2.0 for this issue really as we need a real copy prop for structs.
Comment 8 Steven Bosscher 2006-02-05 21:21:18 UTC
This bug ought to have a much higher priority than P5 IMHO.
Comment 9 Mark Mitchell 2006-02-07 09:26:27 UTC
Steven, if we increased the priority of the bug, what solution would you envision?  This is not in any way meant to be a rhetorical or sarcastic question.  My reading of the audit trail is that this bug is essentially a feature request for a new optimization; given that, I don't think we can make it release-critical.  Am I misunderstanding?

Thanks,

-- Mark
Comment 10 Mark Mitchell 2006-05-25 02:36:12 UTC
Will not be fixed in 4.1.1; adjust target milestone to 4.1.2.
Comment 11 patchapp@dberlin.org 2007-03-26 04:40:32 UTC
Subject: Bug number PR 22156

A patch for this bug has been added to the patch tracker.
The mailing list url for the patch is http://gcc.gnu.org/ml/gcc-patches/2007-03/msg01625.html
Comment 12 Alexandre Oliva 2007-04-05 19:50:47 UTC
Subject: Bug 22156

Author: aoliva
Date: Thu Apr  5 19:50:34 2007
New Revision: 123524

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=123524
Log:
PR middle-end/22156
* tree-sra.c (struct sra_elt): Add in_bitfld_block.  Remove
all_no_warning.
(struct sra_walk_fns): Remove use_all parameter from use.
(sra_hash_tree): Handle BIT_FIELD_REFs.
(sra_elt_hash): Don't hash bitfld blocks.
(sra_elt_eq): Skip them in parent compares as well.  Handle
BIT_FIELD_REFs.
(sra_walk_expr): Don't maintain or pass down use_all_p.
(scan_use): Remove use_all parameter.
(scalarize_use): Likewise.  Re-expand assignment to
BIT_FIELD_REF of gimple_reg.  De-scalarize before input or
output, and re-scalarize after output.  Don't mark anything
for no warning.
(scalarize_ldst): Adjust.
(scalarize_walk_gimple_modify_statement): Likewise.
(build_element_name_1): Handle BIT_FIELD_REFs.
(instantiate_element): Don't warn for any element whose parent
is used as a whole.
(instantiate_missing_elements_1): Return the sra_elt.
(canon_type_for_field): New.
(try_instantiate_multiple_fields): New.
(instantiate_missing_elemnts): Use them.
(mark_no_warning): Removed.
(generate_one_element_ref): Handle BIT_FIELD_REFs.
(REPLDUP, sra_build_elt_assignment): New.
(generate_copy_inout): Use them.
(generate_element_copy): Likewise.  Handle bitfld differences.
(generate_element_zero): Don't recurse for blocks.  Use
sra_build_elt_assignment.
(generate_one_element_int): Take elt instead of var.  Use
sra_build_elt_assignment.
(generate_element_init_1): Adjust.
(scalarize_use, scalarize_copy): Use REPLDUP.
(scalarize_ldst): Move assert before dereference.
(dump_sra_elt_name): Handle BIT_FIELD_REFs.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-sra.c

Comment 13 Alexandre Oliva 2007-04-05 23:06:39 UTC
Fixed in trunk for 4.3, not sure about backporting...
Comment 14 Alexandre Oliva 2007-04-30 18:41:22 UTC
Subject: Bug 22156

Author: aoliva
Date: Mon Apr 30 18:41:11 2007
New Revision: 124302

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=124302
Log:
PR middle-end/22156
Temporarily revert:
2007-04-06  Andreas Tobler  <a.tobler@schweiz.org>
2007-04-05  Alexandre Oliva  <aoliva@redhat.com>
* tree-sra.c (try_instantiate_multiple_fields): Needlessly
initialize align to silence bogus warning.
2007-04-05  Alexandre Oliva  <aoliva@redhat.com>
* tree-sra.c (struct sra_elt): Add in_bitfld_block.  Remove
all_no_warning.
(struct sra_walk_fns): Remove use_all parameter from use.
(sra_hash_tree): Handle BIT_FIELD_REFs.
(sra_elt_hash): Don't hash bitfld blocks.
(sra_elt_eq): Skip them in parent compares as well.  Handle
BIT_FIELD_REFs.
(sra_walk_expr): Don't maintain or pass down use_all_p.
(scan_use): Remove use_all parameter.
(scalarize_use): Likewise.  Re-expand assignment to
BIT_FIELD_REF of gimple_reg.  De-scalarize before input or
output, and re-scalarize after output.  Don't mark anything
for no warning.
(scalarize_ldst): Adjust.
(scalarize_walk_gimple_modify_statement): Likewise.
(build_element_name_1): Handle BIT_FIELD_REFs.
(instantiate_element): Don't warn for any element whose parent
is used as a whole.
(instantiate_missing_elements_1): Return the sra_elt.
(canon_type_for_field): New.
(try_instantiate_multiple_fields): New.
(instantiate_missing_elemnts): Use them.
(mark_no_warning): Removed.
(generate_one_element_ref): Handle BIT_FIELD_REFs.
(REPLDUP, sra_build_elt_assignment): New.
(generate_copy_inout): Use them.
(generate_element_copy): Likewise.  Handle bitfld differences.
(generate_element_zero): Don't recurse for blocks.  Use
sra_build_elt_assignment.
(generate_one_element_int): Take elt instead of var.  Use
sra_build_elt_assignment.
(generate_element_init_1): Adjust.
(scalarize_use, scalarize_copy): Use REPLDUP.
(scalarize_ldst): Move assert before dereference.
(dump_sra_elt_name): Handle BIT_FIELD_REFs.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-sra.c

Comment 15 Andrew Pinski 2007-07-09 09:15:10 UTC
Here is a new testcase which shows the problem on the trunk on powerpc-darwin:
struct s
{
  int i1:1;
  int i2:1;
};
void f(struct s *x, struct s *y) { *x = *y; }
Comment 16 Alexandre Oliva 2007-10-01 16:36:05 UTC
Subject: Bug 22156

Author: aoliva
Date: Mon Oct  1 16:35:55 2007
New Revision: 128908

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=128908
Log:
PR middle-end/22156
* tree-sra.c (struct sra_elt): Add in_bitfld_block.
(sra_hash_tree): Handle BIT_FIELD_REFs.
(sra_elt_hash): Don't hash bitfld blocks.
(sra_elt_eq): Skip them in parent compares as well.  Handle
BIT_FIELD_REFs.
(build_element_name_1): Handle BIT_FIELD_REFs.
(instantiate_element): Propagate nowarn from parents.  Create
BIT_FIELD_REF for variables that are widened by scalarization.
Gimple-zero-initialize all bit-field variables that are not
part of parameters that are going to be scalarized on entry.
(instantiate_missing_elements_1): Return the sra_elt.
(canon_type_for_field): New.
(try_instantiate_multiple_fields): New.  Infer widest possible
access mode from decl or member type, but clip it at word
size, and only widen it if a field crosses an alignment
boundary.
(instantiate_missing_elements): Use them.
(generate_one_element_ref): Handle BIT_FIELD_REFs.
(scalar_bitfield_p): New.
(sra_build_assignment): Optimize assignments from scalarizable
BIT_FIELD_REFs.  Use BITS_BIG_ENDIAN to determine shift
counts.
(REPLDUP): New.
(sra_build_bf_assignment): New.  Optimize assignments to
scalarizable BIT_FIELD_REFs.
(sra_build_elt_assignment): New.  Optimize BIT_FIELD_REF
assignments to full variables.
(generate_copy_inout): Use the new macros and functions.
(generate_element_copy): Likewise.  Handle bitfld differences.
(generate_element_zero): Don't recurse for blocks.  Use
sra_build_elt_assignment.
(generate_one_element_init): Take elt instead of var.  Use
sra_build_elt_assignment.
(generate_element_init_1): Adjust.
(bitfield_overlap_info): New struct.
(bitfield_overlaps_p): New.
(sra_explode_bitfield_assignment): New.  Adjust widened
variables to account for endianness.
(sra_sync_for_bitfield_assignment): New.
(scalarize_use): Re-expand assignment to/from scalarized
BIT_FIELD_REFs.  Explode or sync needed members for
BIT_FIELD_REFs accesses or assignments.  Use REPLDUP.
(scalarize_copy): Use REPLDUP.
(scalarize_ldst): Move assert before dereference.  Adjust EH
handling.
(dump_sra_elt_name): Handle BIT_FIELD_REFs.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-sra.c

Comment 17 Alexandre Oliva 2007-10-01 16:46:30 UTC
Patch checked in.
Comment 18 Alexandre Oliva 2007-10-09 04:41:52 UTC
Subject: Bug 22156

Author: aoliva
Date: Tue Oct  9 04:41:39 2007
New Revision: 129149

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=129149
Log:
Add missing hunk in r129143 check in.  Add references to PR 22156.

Modified:
    trunk/gcc/ChangeLog

Comment 19 Alexandre Oliva 2007-10-09 04:45:34 UTC
Subject: Bug 22156

Author: aoliva
Date: Tue Oct  9 04:45:22 2007
New Revision: 129150

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=129150
Log:
PR middle-end/22156
* tree-sra.c (instantiate_element): Use BYTES_BIG_ENDIAN for
bit-field layout.
(sra_build_assignment): Likewise.  Set up mask depending on
precision, not type.
(sra_build_bf_assignment): Use BYTES_BIG_ENDIAN.  Don't overflow
computing bit masks.
(sra_build_elt_assignment): Don't view-convert from signed to
unsigned.
(sra_explode_bitfield_assignment): Use bit-field type if
possible.  Use BYTES_BIG_ENDIAN.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-sra.c

Comment 20 Alexandre Oliva 2007-10-09 04:55:38 UTC
Subject: Bug 22156

Author: aoliva
Date: Tue Oct  9 04:55:17 2007
New Revision: 129152

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=129152
Log:
PR tree-optimization/33655
PR middle-end/22156
* tree-sra.c (bitfield_overlaps_p): When fld->element is INTEGER_CST,
convert it to bitsizetype before size_binop call.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-sra.c