This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[Committed] PR22563: Further bitfield improvements
- From: Roger Sayle <roger at eyesopen dot com>
- To: gcc-patches at gcc dot gnu dot org
- Date: Sun, 14 May 2006 09:08:50 -0600 (MDT)
- Subject: [Committed] PR22563: Further bitfield improvements
The following patch address the performance regression
PR rtl-optimization/22563 and is a more aggressive variant of my
earlier http://gcc.gnu.org/ml/gcc-patches/2006-04/msg01049.html
Consider this reduced C++ test case from the PR (from bench++):
class my_rec {
public:
signed int a1 : 3;
bool a2 : 1;
unsigned int a3 : 4;
};
static my_rec b_rec;
void h000007_test(void){
b_rec.a1 = -2;
b_rec.a2 = false;
b_rec.a3 = 5;
}
Currently with mainline, even after my earlier patches, we generate the
following on i686 with -O2 -fomit-frame-pointer:
movzbl b_rec, %eax
andl $-8, %eax
orl $6, %eax
movb %al, b_rec
andb $-9, b_rec
movzbl b_rec, %eax
andl $15, %eax
orl $80, %eax
movb %al, b_rec
ret
with the change below, we now get:
movb $86, b_rec
ret
The problem/remaining issue was that I was too conservative previsouly,
and didn't force intermediates into registers if we only performed a
single AND or a single IOR. Unfortunately, this prevents the a2 = false
above from using a pseudo that CSE/GCSE could work with, that then
prevents the three bitfield stores from being combined. We now always
force these intermediates into pseudos. I've also confirmed that combine
does its job and optimizes the stores of zeros and stores of allones
into a single AND with a memory operand or a IOR with a memory operand
respectively.
I also retested the code size implications on CSiBE:
object file before after delta
catdvi,fontinfo 3692 3660 -32
sed,compile 7669 7670 +1
sed,fmt 1696 1719 +23
sed,regex 587 591 +4
And unlike before, being more aggressive is actually a net size win,
(on a statistically insignificant sample) so I've removed the previous
optimize_size tests.
The following patch has been tested on i686-pc-linux-gnu with a full
"make bootstrap", all default langauges including Ada, and regression
tested with a top-level "make -k check" with no new failures.
Committed to mainline as revision 113762. After a while on mainline,
I'll investigate a similar backport for the 4.0 and 4.1 branches.
2006-05-14 Roger Sayle <roger@eyesopen.com>
PR rtl-optimization/22563
* expmed.c (store_fixed_bit_field): When using AND and IOR to store
a fixed width bitfield, always force the intermediates into psuedos.
Index: expmed.c
===================================================================
*** expmed.c (revision 113720)
--- expmed.c (working copy)
*************** store_fixed_bit_field (rtx op0, unsigned
*** 793,799 ****
{
enum machine_mode mode;
unsigned int total_bits = BITS_PER_WORD;
! rtx subtarget, temp;
int all_zero = 0;
int all_one = 0;
--- 793,799 ----
{
enum machine_mode mode;
unsigned int total_bits = BITS_PER_WORD;
! rtx temp;
int all_zero = 0;
int all_one = 0;
*************** store_fixed_bit_field (rtx op0, unsigned
*** 919,947 ****
/* Now clear the chosen bits in OP0,
except that if VALUE is -1 we need not bother. */
! subtarget = op0;
if (! all_one)
{
! /* Don't try and keep the intermediate in memory, if we need to
! perform both a bit-wise AND and a bit-wise IOR (except when
! we're optimizing for size). */
! if (MEM_P (subtarget) && !all_zero && !optimize_size)
! subtarget = force_reg (mode, subtarget);
! temp = expand_binop (mode, and_optab, subtarget,
mask_rtx (mode, bitpos, bitsize, 1),
! subtarget, 1, OPTAB_LIB_WIDEN);
! subtarget = temp;
}
- else
- temp = op0;
/* Now logical-or VALUE into OP0, unless it is zero. */
if (! all_zero)
! temp = expand_binop (mode, ior_optab, temp, value,
! subtarget, 1, OPTAB_LIB_WIDEN);
if (op0 != temp)
emit_move_insn (op0, temp);
}
--- 919,946 ----
/* Now clear the chosen bits in OP0,
except that if VALUE is -1 we need not bother. */
+ /* We keep the intermediates in registers to allow CSE to combine
+ consecutive bitfield assignments. */
! temp = force_reg (mode, op0);
if (! all_one)
{
! temp = expand_binop (mode, and_optab, temp,
mask_rtx (mode, bitpos, bitsize, 1),
! NULL_RTX, 1, OPTAB_LIB_WIDEN);
! temp = force_reg (mode, temp);
}
/* Now logical-or VALUE into OP0, unless it is zero. */
if (! all_zero)
! {
! temp = expand_binop (mode, ior_optab, temp, value,
! NULL_RTX, 1, OPTAB_LIB_WIDEN);
! temp = force_reg (mode, temp);
! }
!
if (op0 != temp)
emit_move_insn (op0, temp);
}
Roger
--