This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [rfc] multi-word subreg lowering pass


Michael Matz <matz@suse.de> writes:

> Initially
> 
>   p1:DI <-- some expression
>   p2:DI <-- p1:DI   ;  p1 dies, hence p1 and p2 do not conflict
>   use   <-- p2:DI
> 
> is transformed into (p1->(p10,p11) , p2->(p20,p21))
> 
>   p10:SI <-- some exp1
>   p11:SI <-- some exp2
>   p20:SI <-- p10:SI ;  p10 dies
>   p21:SI <-- p11:SI ;  p11 dies
>   use1   <-- p20:SI
>   use2   <-- p21:SI
> 
> p10 and p11 must conflict anyway, as must p20 and p21, and they do with 
> the normal conflict graph builder.  That one will also create a conflict 
> between p11 and p20 (not between p10 and p21, though).  Are you worried 
> about that one?  Generally you cannot do without that conflict, because 
> if you don't have it, they might get the same register, and then p11 would 
> be clobbered by writing to p20.  To be able to do without this conflict 
> you need to ensure certain conditions on how p10/p11 and p20/p21 are laid 
> out in the end, but those would again just implicitly capture the effect 
> of the conflict edge, namely that the ex-high part conflicts with the 
> ex-low part of a multi-word reg.
> 
> Note that this p11-p20 conflict does not prevent you from coalescing both 
> copies away, I guess that's what you are interested in.  p11 doesn't 
> conflict with p21, so they can be merged, as can p10 and p20.  So 
> everything just works as expected without any special handling of the 
> conflicts.  But perhaps I'm misunderstanding what you tried to achieve.
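
As a rough illustration of the conflict set described in the quoted text, here is a minimal sketch of a liveness-based conflict builder for a single basic block. It is not GCC's code: the Insn tuple, build_conflicts, and the rule that a copy's destination does not conflict with its source are assumptions made for this model only.

```python
from collections import namedtuple

# Hypothetical instruction model: dest is None for a pure use;
# is_copy marks a plain register-to-register move.
Insn = namedtuple("Insn", "dest srcs is_copy")

def build_conflicts(insns, live_out=()):
    """Scan one basic block backwards and return a set of conflicting pairs."""
    live = set(live_out)          # registers live after the current insn
    conflicts = set()
    for insn in reversed(insns):
        if insn.dest is not None:
            for other in live:
                # Assumed rule: a copy's destination may share a register
                # with its source, so no edge is added for that pair.
                if other == insn.dest or (insn.is_copy and other in insn.srcs):
                    continue
                conflicts.add(frozenset((insn.dest, other)))
            live.discard(insn.dest)
        live.update(insn.srcs)
    return conflicts

block = [
    Insn("p10", (), False),        # p10:SI <-- some exp1
    Insn("p11", (), False),        # p11:SI <-- some exp2
    Insn("p20", ("p10",), True),   # p20:SI <-- p10:SI ; p10 dies
    Insn("p21", ("p11",), True),   # p21:SI <-- p11:SI ; p11 dies
    Insn(None, ("p20",), False),   # use1   <-- p20:SI
    Insn(None, ("p21",), False),   # use2   <-- p21:SI
]

conflicts = build_conflicts(block)
print(sorted(tuple(sorted(pair)) for pair in conflicts))
# [('p10', 'p11'), ('p11', 'p20'), ('p20', 'p21')]
```

In this model p11 and p20 conflict while p10 and p21 do not, and neither copy pair (p10/p20, p11/p21) conflicts, so both copies remain candidates for coalescing.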

Yes, that case, the case in which the registers can be fully
decomposed, works OK.  The case that doesn't work OK is when one
register can be decomposed but the other cannot be.  Then you get

  p20:SI <-- subreg:SI p10:DI 0
  p21:SI <-- subreg:SI p10:DI 4

Let's assume that p10 dies in the second instruction.  Then we want to
allocate p20/p21 to overlap with p10.  But unfortunately the first
instruction causes a conflict between p20 and p10, since p10 is still
live after the first instruction, so the register allocator can't do
that.
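
The problem can be reproduced in a small model that tracks liveness per whole register, which is roughly what the conflict described above amounts to. This is only a sketch: conflicts_for and the tuple encoding are hypothetical, the trailing use of p20 and p21 is added so that both halves are live, and a subreg read is assumed not to qualify for any copy exception.

```python
def conflicts_for(insns):
    """Backward scan with whole-register liveness; returns conflicting pairs."""
    live, conflicts = set(), set()
    for dest, srcs in reversed(insns):
        if dest is not None:
            # dest conflicts with every other register live after this insn
            conflicts |= {frozenset((dest, r)) for r in live if r != dest}
            live.discard(dest)
        live |= set(srcs)
    return conflicts

block = [
    ("p20", ["p10"]),         # p20:SI <-- subreg:SI p10:DI 0
    ("p21", ["p10"]),         # p21:SI <-- subreg:SI p10:DI 4 ; p10 dies
    (None, ["p20", "p21"]),   # assumed later use of both halves
]

conflicts = conflicts_for(block)
print(sorted(tuple(sorted(pair)) for pair in conflicts))
# [('p10', 'p20'), ('p20', 'p21')]
```

Because p10 as a whole is still live after the first instruction, p20 picks up a conflict with p10 even though the word it copied is never read again.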

In fact I have now implemented this case.  I've introduced a
computation of REG_SUBREG_DEAD, though only within a basic block.
Then the first instruction above will get a REG_SUBREG_DEAD note for
subreg:SI p10:DI 0.  I've modified global.c to
see that and to not introduce a conflict between p20 and p10, and
p20/p21 and p10 now get allocated to the same register pair in my test
case.
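
The idea can be sketched as follows. This is a simplified model, not the actual patch: subreg_dead_notes, conflicts_with_notes, and the tuple encoding are hypothetical, p10 is assumed to be dead at the end of the block, and the model does not check that p20/p21 end up aligned with the matching words of p10's register pair.

```python
def subreg_dead_notes(insns):
    """Map insn index -> (reg, offset) when that word is read for the last
    time while another word of the same register is still live afterwards."""
    live_words, notes = set(), {}
    for i in range(len(insns) - 1, -1, -1):
        _dest, src = insns[i]
        if src not in live_words and any(reg == src[0] for reg, _ in live_words):
            notes[i] = src
        live_words.add(src)
    return notes

def conflicts_with_notes(insns, notes, live_out):
    """Whole-register conflict scan that skips the edge between dest and a
    register whose subreg is marked dead on the same insn."""
    live, conflicts = set(live_out), set()
    for i in range(len(insns) - 1, -1, -1):
        dest, src = insns[i]
        for r in live:
            if r == dest or (i in notes and notes[i][0] == r):
                continue
            conflicts.add(frozenset((dest, r)))
        live.discard(dest)
        live.add(src[0])
    return conflicts

block = [
    ("p20", ("p10", 0)),   # p20:SI <-- subreg:SI p10:DI 0
    ("p21", ("p10", 4)),   # p21:SI <-- subreg:SI p10:DI 4 ; p10 dies
]

notes = subreg_dead_notes(block)
with_notes = conflicts_with_notes(block, notes, live_out={"p20", "p21"})
without_notes = conflicts_with_notes(block, {}, live_out={"p20", "p21"})
print(notes)   # {0: ('p10', 0)}
```

With the note in place the only remaining conflict in this model is p20/p21; without it the p10/p20 edge reappears.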

I now need to do more timing tests to see if I get improvements on
real code.

Ian

