This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [patch] partial register update for a bit mask operation on x86
So, what's the size difference between this and movstrictqi_xor?
2 bytes for xor or mov, versus 5/6 for and.
I think the patch is not doing the right thing anyway.
First, movstrictqi_xor should have a condition like this:
"reload_completed
&& ((!TARGET_PARTIAL_REG_STALL && !TARGET_USE_MOV0) || optimize_size)"
and this would probably remove the need for a new pattern altogether.
movstrictqi's should not be generated at all for partial register stall
targets; there should be no need for a workaround there, since the
standard bit twiddling would be generated instead. (BTW, could any guru
enlighten me on the need for "reload_completed" in the insn's condition?)
Second, movstrictqi_1 should have an "i" alternative for the source
operand, to allow other constants. I am pretty sure the current
restriction may cause problems on the K6 (i.e. the only TARGET_USE_MOV0
target), and adding the alternative also allows more optimization for
-Os (or processors without register stalls). Combine should be able to
synthesize a movstrictqi_1 from "x |= 255;", and even from "x = (x &
~255) | 100".
If this is done, movstrictqi_xor should be moved in front of
movstrictqi_1, or eliminated completely since it does not have any size
benefit.
Third, the same should be done for movstricthi, except that in this case
movstricthi_xor should be kept because it *does* have size benefits (2
bytes for xor, 3 for mov, 5/6 for and).
/* { dg-do compiler { { target i?86-*-* x86_64-*-* } && ilp32 } } */
"dg-do compile", of course. :-)
I would also add a test that tries "-mtune=pentium" and checks that the
"xor" version is generated.
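Such a test might look like the sketch below; the function body, the option, and the scanned mnemonic are all assumptions here, since they depend on the final form of the patterns:

```c
/* Hypothetical testcase; the scan string is an assumption.  */
/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
/* { dg-options "-O2 -mtune=pentium" } */

struct s { unsigned int x; };

void
clear_low_byte (struct s *p)
{
  p->x &= ~255;   /* candidate for a strict_low_part byte store */
}

/* { dg-final { scan-assembler "xor" } } */
```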
Hope this helps!
Paolo