This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[patch, combine] Bit-field insertion optimization #2


Hi, asides from Andrew's patch, we also have this alternate patch using combine. With respect to the bitfield case here, the two patches are mostly similar in effect, but we thought we'd post both for discussion anyways.

Currently, combine detects and rejects cases where the combine target
insn is a "modification" (LHS is a SUBREG, STRICT_LOW_PART, or
ZERO_EXTRACT), plus one of the prior merged insns has a SET destination
overlap, citing wrong semantics when such two insns are merged.

For the same testcase:
struct bits {
  unsigned a : 5;
  unsigned b : 5;
  unsigned c : 5;
  unsigned d : 5;
};

struct bits
f (unsigned int a)
{
  struct bits bits;
  bits.a = 1;
  bits.b = 2;
  bits.c = 3;
  bits.d = a;
  return bits;
}

The RTL encountered by combine on ARM and fails to optimize is like this:
  2: r138:SI = r0:SI
  6: r139:SI = 1
 23: r136:SI = 0				@
  7: zero_extract [4:0] r136:SI = r139:SI	@ See how r136 is
  8: r140:SI = 2				@ modified throughout
  9: zero_extract [9:5] r136:SI = r140:SI	@ the insns
 10: r141:SI = 3
 11: zero_extract [14:10] r136:SI = r141:SI
 12: zero_extract [19:15] r136:SI = r138:SI
 17: r0:SI = r136:SI
 20: <use r0:SI>

For cases like ARM, this patch enhances combine's handled cases by using
its own expand_field_assignment() function (currently applied for prior
merged insn patterns) to transform the combine target insn which happens
to be a modification into a larger IOR/AND expression; this can turn the
LHS into a normal register write.

For example,
change (zero_extract reg1 #5 #0) = <foo>
into   (reg1) = (ior (and reg1 0xffffffe0) (and <foo> 0x0000001f))

This plus the various propagation and simplification done later in
combine increases the chances it successfully finds a new combined
bitfield operation.

The combine patch is quite small, basically a call to
expand_field_assignment() + another try at combinable_i3pat() before
giving up and returning 0, enabled when flag_expensive_optimizations is set.

With this patch, the testcase generated code for ARM improves from:
        mov     r2, #1
        mov     r3, #0
        bfi     r3, r2, #0, #5
        mov     r2, #2
        bfi     r3, r2, #5, #5
        mov     r2, #3
        bfi     r3, r2, #10, #5
        bfi     r3, r0, #15, #5
        mov     r0, r3
        bx      lr
into
        movw    r3, #3137
        bfi     r3, r0, #15, #5
        mov     r0, r3
        bx      lr

As mentioned, for this bitfield case, Andrew's CSE patch and mine is mostly similar in effect. From the ordering of passes however, the CSE patch will shadow this combine one if both were applied :P

Thanks,
Chung-Lin

2010-12-09 Chung-Lin Tang <cltang@codesourcery.com>

	* combine.c (combinable_i3pat): Try field expanding the combine
	target insn when it modifies its LHS output, changing it into
	overwriting form, before failing the combine attempt.

Attachment: cb.diff
Description: Text document


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]