[rfc] multi-word subreg lowering pass

Sat May 7 14:04:00 GMT 2005

> Richard Henderson writes:
> I hadn't posted this yet because I've yet to show real measurable
> improvements in long long arithmetic on x86.  I suspect that I
> won't be able to show this until we have a new register allocator.
> 
> I'd also planned to restructure this so that it uses recog_data to
> replace values in real operands, rather than replacing subregs
> wherever they may be found.
> 
> However, it may be that it's good enough to help the problems that
> AVR is having.  If so, I'll go ahead with the rewrite to recog_data.
> 
> The object is to replace (reg:LARGE X) with a set of (reg:WORD Yi).
> We do this when X is only used in decomposable contexts.  Such
> contexts are (1) subregs smaller than or equal to word sized and
> (2) decomposable moves.
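
As a minimal sketch of the criterion quoted above (this is not the pass's
actual code; the types and names below are hypothetical and only model
the two kinds of decomposable context):

```c
#include <stddef.h>

/* Hypothetical classification of one use of a multi-word pseudo. */
enum use_kind {
    USE_SUBREG,             /* used through a subreg of some size  */
    USE_DECOMPOSABLE_MOVE,  /* used in a move that can be split    */
    USE_OTHER               /* anything else, e.g. a multi-word op */
};

struct reg_use {
    enum use_kind kind;
    unsigned subreg_bytes;  /* only meaningful for USE_SUBREG */
};

/* Returns 1 if every use of the pseudo is a decomposable context,
   i.e. a subreg no larger than a word or a decomposable move.  */
int is_decomposable(const struct reg_use *uses, size_t n,
                    unsigned units_per_word)
{
    for (size_t i = 0; i < n; i++) {
        switch (uses[i].kind) {
        case USE_SUBREG:
            if (uses[i].subreg_bytes > units_per_word)
                return 0;
            break;
        case USE_DECOMPOSABLE_MOVE:
            break;
        default:
            return 0;   /* one non-decomposable use blocks the split */
        }
    }
    return 1;
}
```

A single USE_OTHER entry is enough to keep the value in its multi-word
pseudo, which matches the "only used in" wording of the quoted text.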

- For a target with a byte-sized word like AVR, for which
  UNITS_PER_WORD == 1, would all operations be decomposed to QImode
  operations?

- What effect, if any, would AVR's presently necessary lie about its
  word size (i.e. it lies within libgcc2 and claims UNITS_PER_WORD == 4
  in order to trick it into selecting reasonably correct operand modes)
  have on the generated code for functions defined within libgcc2?
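
To make the first question concrete, here is a rough C model (not GCC
code; the function name and parameters are hypothetical) of a multi-byte
AND split into pieces of units_per_word bytes each.  It only counts how
many word-sized operations result for a given word size:

```c
#include <assert.h>
#include <stddef.h>

/* AND two `size`-byte values one word-sized piece at a time and
   return the number of word-sized operations that were needed.
   `units_per_word` stands in for the target's UNITS_PER_WORD.  */
size_t and_by_words(unsigned char *dst,
                    const unsigned char *a,
                    const unsigned char *b,
                    size_t size, size_t units_per_word)
{
    size_t ops = 0;

    assert(units_per_word > 0 && size % units_per_word == 0);

    for (size_t off = 0; off < size; off += units_per_word) {
        /* One word-sized AND; the inner loop just models its bytes. */
        for (size_t i = 0; i < units_per_word; i++)
            dst[off + i] = a[off + i] & b[off + i];
        ops++;
    }
    return ops;
}
```

Under this model a 4-byte operand yields four byte-sized operations when
units_per_word is 1, and a single operation when it is claimed to be 4.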

> If the target leaves logical arithmetic to the middle end, this
> means that the posted example,
>
>        long long foo(int x, long long y)
>        {
>          return x & y;
>        }
>
> will decompose to
>
>        (set (reg:SI 100) (reg:SI x))
>        (set (reg:SI 101) (const_int 0))
>        (set (reg:SI 102) (and:SI (reg:SI 100) (reg:SI ylow)))
>        (set (reg:SI 103) (and:SI (reg:SI 101) (reg:SI yhigh)))
>        (set (reg:SI eax) (reg:SI 102))
>        (set (reg:SI edx) (reg:SI 103))
>
> which after cse and combine will be just perfect.
>
> Where it fails is when the target has patterns that take multi-word
> inputs.  In this case we have to stop and leave the data in the
> multi-word pseudo.  This approach does have the advantage of being
> able to work incrementally, though it's also true that incomplete
> conversion can hurt.
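
For reference, the quoted decomposition of foo can be mirrored in plain
C as below.  This is only a sketch: the helper names are hypothetical,
and x is taken as unsigned so that its high word is zero, as the
(const_int 0) in the quoted RTL implies (a signed int would need its
sign bits in the high word instead).

```c
#include <stdint.h>

/* One 64-bit value viewed as two 32-bit word-sized pieces. */
struct word_pair {
    uint32_t low;
    uint32_t high;
};

struct word_pair split64(uint64_t v)
{
    struct word_pair p = { (uint32_t)v, (uint32_t)(v >> 32) };
    return p;
}

uint64_t join64(struct word_pair p)
{
    return ((uint64_t)p.high << 32) | p.low;
}

/* Word-by-word AND following the quoted RTL sequence. */
uint64_t and_decomposed(uint32_t x, uint64_t y)
{
    struct word_pair yw = split64(y);

    uint32_t r100 = x;               /* (set (reg:SI 100) (reg:SI x))     */
    uint32_t r101 = 0;               /* (set (reg:SI 101) (const_int 0))  */
    uint32_t r102 = r100 & yw.low;   /* and:SI with the low word of y     */
    uint32_t r103 = r101 & yw.high;  /* and:SI with the high word of y    */

    struct word_pair result = { r102, r103 };
    return join64(result);           /* eax = r102, edx = r103 */
}
```

Since r101 is a known zero, r103 folds to zero as well, which is the
simplification the quoted text expects cse and combine to find.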

- Out of curiosity, why not leave all decomposed operations in their
  subreg form, thereby maintaining the logical integrity of their operand
  modes? i.e. the above decomposes to something like:

     (set (subreg:SI 100 0) (and:SI (reg:SI x) (subreg:SI y 0)))
     (set (subreg:SI 100 1) (and:SI (const_int 0) (subreg:SI y 1)))

  which seems simpler, and does not require introducing new semantics
  that complicate multi-word/subreg input operand expressions.