This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [rfc] multi-word subreg lowering pass


Martin Koegler wrote on Monday, 27 June 2005 09:33:
> On Sat, 7 May 2005 18:35:51 -0700, Richard Henderson wrote:
> > For AND, IOR, XOR, you should be able to delete the multi-word patterns
> > entirely, and leave those operations to be generated by the middle-end.
>
> This will not work for the current CVS version, which requires that an
> operation for the mode mode_for_size(sizeof(int),MODE_INT,0) be present
> (at least for xor).
>
> e.g.:
> long long foo (double x, double y)
> {
>   return !__builtin_isunordered (x, y);
> }
>
>
> If it is compiled (without any optimizations), the gimple form is:
>
> foo (x, y)
> {
>   long long int D.1390;
>   _Bool D.1391;
>   int D.1392;
>
>   D.1391 = x unord y;
>   D.1392 = !D.1391;
>   D.1390 = (long long int) D.1392;
>   return D.1390;
> }
>
> sizeof(_Bool) is 1 for most architectures. expand_binop for D.1392 =
> !D.1391 will be called with a QI Register and (CONST_INT 1) as operands and
> a result register with mode_for_size(sizeof(int),MODE_INT,0) mode.
>
> On i386, the result will be a SI register. As xor for SI mode is available,
> the following case will be used:
>
>   if (methods != OPTAB_MUST_WIDEN
>       && binoptab->handlers[(int) mode].insn_code != CODE_FOR_nothing)
> In this case, the QI operands are converted to the right mode.
>
> On AVR, word_mode is 1 byte wide (QI), while sizeof(int) is 2 (unless an
> option is specified). Therefore the result will be a HI register. If xor
> were only available for QI mode, the following case would be used:
>
>   /* These can be done a word at a time.  */
>   if ((binoptab == and_optab || binoptab == ior_optab || binoptab == xor_optab)
>       && class == MODE_INT
>       && GET_MODE_SIZE (mode) > UNITS_PER_WORD
>       && binoptab->handlers[(int) word_mode].insn_code != CODE_FOR_nothing)
>
> Here operand_subword_force will be called for a QI operand (with
> mode=HI as parameter), which will cause an internal compiler error.
>
> Regards, Martin Kögler
Thanks for reviewing this. IIUC, I had also observed a couple of regressions
when removing the patterns completely. In my local experimental working
version of the AVR back-end that makes use of Richard's patch, I have used
explicit expanders for lowering xor:HI and xor:SI to a sequence of xor:QI
operations.
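
To illustrate what lowering xor:HI and xor:SI to a sequence of xor:QI
operations amounts to, here is a minimal sketch in plain C. It is only a model,
not GCC code: the helper names (split_words, join_words, xor_lowered, ...) are
made up for this example, and the subwords are assumed to be stored least
significant first. The second entry point shows the case Martin describes,
where one operand starts out narrower than the result and is assumed to be
widened before it is split.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model, not GCC code: a multi-word value is an array of
   8-bit subwords, least significant first (AVR's word_mode is QImode).  */
static void
split_words (uint32_t value, uint8_t *words, size_t nwords)
{
  for (size_t i = 0; i < nwords; i++)
    words[i] = (uint8_t) (value >> (8 * i));
}

static uint32_t
join_words (const uint8_t *words, size_t nwords)
{
  uint32_t value = 0;
  for (size_t i = 0; i < nwords; i++)
    value |= (uint32_t) words[i] << (8 * i);
  return value;
}

/* Lower one wide xor into NWORDS independent xor:QI operations.  */
static uint32_t
xor_lowered (uint32_t a, uint32_t b, size_t nwords)
{
  uint8_t wa[4], wb[4], wr[4];

  assert (nwords >= 1 && nwords <= 4);
  split_words (a, wa, nwords);
  split_words (b, wb, nwords);
  for (size_t i = 0; i < nwords; i++)
    wr[i] = wa[i] ^ wb[i];      /* one xor:QI per subword */
  return join_words (wr, nwords);
}

/* xor:SI expressed as four xor:QI operations.  */
uint32_t
xor_si_lowered (uint32_t a, uint32_t b)
{
  return xor_lowered (a, b, 4);
}

/* xor:HI where one operand starts out as QI: it is zero-extended to HI
   first, then split like any other HI value.  */
uint16_t
xor_hi_with_qi_operand (uint8_t narrow, uint16_t wide)
{
  uint16_t widened = narrow;    /* explicit QI -> HI extension */
  return (uint16_t) xor_lowered (widened, wide, 2);
}
```

Since the per-byte xors do not depend on each other, each result subword can
be allocated and can die on its own.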

BTW: My present judgement concerning subreg lowering before reload is:

1.) It is very helpful to expose the complexity to the register allocator,
and in many cases the resulting code is much more efficient. Expressions
using sign/zero-extension or shifts larger than one architecture word benefit
most. Also, when operating frequently on variables held in memory or
initialized with immediates, early subreg lowering can help considerably by
reducing register pressure, since some of the subregs can die earlier than
others.
2.) The difficulty is that in a couple of situations one would like to retain
the information about what the zoo of individual lowered subregs corresponds
to. The key issue, IMO, is condition-code re-use. It would be, e.g., extremely
cumbersome to teach the mid-end passes to understand which condition code
is calculated by the expanded sequences. E.g. a cp:SI (reg:SI xx) (const_int
a) on AVR would be expanded into one compare-QI-with-immediate, three
load-QI-with-immediates and three compare_register_with_register_with_carry.
The generated sequences would be highly target-specific, so a generic
approach does not seem to be easy.
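
As a rough illustration of the compare sequence mentioned in 2.), the
following plain C sketch models how the flags of a four-byte compare come out
of one byte compare followed by three compare-with-carry steps. It is a model
under assumptions, not back-end code: the struct and function names are
invented for this example, and only the carry and zero flags are modelled
(carry as borrow, zero as "sticky" across the chain).

```c
#include <stdbool.h>
#include <stdint.h>

/* Only the two flags needed for unsigned compares are modelled.  */
struct flags
{
  bool carry;   /* set when the subtraction borrows */
  bool zero;    /* set when all bytes compared so far were equal */
};

/* Model of a plain byte compare (a - b, result discarded).  */
static struct flags
cmp_qi (uint8_t a, uint8_t b)
{
  struct flags f;
  f.carry = a < b;
  f.zero = (uint8_t) (a - b) == 0;
  return f;
}

/* Model of compare-with-carry (a - b - carry).  The zero flag can only
   stay set, never become set again, so it describes the whole value.  */
static struct flags
cmp_qi_carry (uint8_t a, uint8_t b, struct flags in)
{
  unsigned rhs = (unsigned) b + (in.carry ? 1u : 0u);
  struct flags f;
  f.carry = (unsigned) a < rhs;
  f.zero = in.zero && (uint8_t) (a - rhs) == 0;
  return f;
}

/* Compare a 32-bit value with a constant, one byte at a time.  */
struct flags
cmp_si_lowered (uint32_t x, uint32_t k)
{
  /* One compare-QI-with-immediate for the low byte.  */
  struct flags f = cmp_qi ((uint8_t) x, (uint8_t) k);

  for (int i = 1; i < 4; i++)
    {
      /* One load-QI-with-immediate ...  */
      uint8_t tmp = (uint8_t) (k >> (8 * i));
      /* ... followed by one compare-register-with-register-with-carry.  */
      f = cmp_qi_carry ((uint8_t) (x >> (8 * i)), tmp, f);
    }
  return f;
}
```

Only the flags after the last step describe the SI comparison as a whole
(carry set means x < k unsigned, zero set means x == k); this is the
information a generic pass cannot easily recover from the seven individual
instructions.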

For the real-world test cases I have studied so far, 2.) results in an
overall reduced efficiency when lowering before reload. IMO, in order to
overcome this difficulty, one cannot avoid placing some additional
knowledge in the rtl sequences. I.e. while the actual rtl that generates code
would always refer to the lowered subregs, I think it would be necessary
to add instructions for the sole purpose of telling which kind of value is
located in which registers, so that the different CSE passes have a chance to
find out where an existing condition-code value is re-calculated.

One possible option, IMO, is set-myself-to-myself instructions carrying
register-equal (REG_EQUAL) notes, such as the optabs expanders generate. The
other option that I think would work is dedicated "marker" instructions that
never generate text and only serve as hooks for CSE and combine. When aiming
to use the reg-equal notes method, one would, however, IMO need to change the
mid-end, since presently, if I see correctly, they don't even survive the
first jump optimization pass. For the second (marker instruction) approach
one would need an additional pass for removing them prior to register
allocation.

Yours,

Björn

