RFA: patch to prohibit IRA undoing sched1 [was IRA undoing sched1]

H.J. Lu hjl.tools@gmail.com
Sat Dec 18 23:24:00 GMT 2010

On Thu, Dec 2, 2010 at 2:17 PM, Vladimir Makarov <vmakarov@redhat.com> wrote:
> On 12/01/2010 02:14 PM, Paul Koning wrote:
>> On Nov 29, 2010, at 9:51 PM, Vladimir Makarov wrote:
>>> On 11/29/2010 08:52 PM, Paul Koning wrote:
>>>> I'm doing some experiments to get to know GCC better, and something is
>>>> puzzling me.
>>>> I have defined an md file with DFA and costs describing the fact that
>>>> loads take a while (as do stores). Also, there is no memory to memory move,
>>>> only memory to/from register.
>>>> Test program is basically a=b; c=d; e=f; g=h;
>>>> Sched1, as expected, turns this into four loads followed by four stores,
>>>> exploiting the pipeline.
>>>> Then IRA kicks in.  It shuffles the insns back into load/store,
>>>> load/store pairs, essentially the source code order.  It looks like it's
>>>> doing that to reduce the number of registers used.  Fair enough, but this
>>>> makes the code less efficient.  I don't see a way to tell IRA not to do
>>>> this.
>>> Most probably that happens because of ira.c::update_equiv_regs.   This
>>> function was inherited from the old register allocator.  The major goal of
>>> the function is to find equivalent memory/constants/invariants for pseudos
>>> which can be used by the reload pass.  Pseudo equivalence also affects
>>> live range splitting decisions in IRA.
>>> Update_equiv_regs can also move insns initiating pseudo equivalences
>>> close to the pseudo usage.  You could try to prevent this and see what
>>> happens.  IMO preventing such insn moving will do more harm to performance
>>> on SPEC benchmarks for x86/x86-64 processors.
>>>> As it happens, there's a secondary reload involved: the loads are into
>>>> one set of registers but the stores from another, so a register to register
>>>> move is added in by reload.  Does that explain the behavior?  I tried
>>>> changing the cover_classes, but that doesn't make a difference.
>>> It is hard to say without the dump file.  If everything is correctly
>>> defined, it should not happen.
>> I extended the test code a little, and fed it to a mips64el-elf targeted
>> gcc.  It showed the same pattern in one of the two functions but not the
>> other.  The test code is test8.c (attached).
>> What I see in the assembly output (test8.s, also attached) is that foo()
>> has a load then store then load then store pattern, which contradicts what
>> sched1 constructed and doesn't take advantage of the pipeline.  However,
>> bar() does use the pipeline.  I don't know what's different between these
>> two.
>> Do you want some dump file (which ones)?  Or you could just reproduce this
>> with the current gcc, it's a standard target build.  The compile was -O2
>> -mtune=mips64r2 -mabi=n32.
> As I guessed, the problem is in the update_equiv_regs transformation,
> which tries to move an initialization insn close to its single use to
> decrease register pressure.  A lot of people have already complained
> about this function undoing scheduling.
> The following patch solves the problem when you use
> -fsched-pressure.  I would not like to do that for regular (not
> register pressure-sensitive) insn scheduling for obvious reasons.
> I think most RISC targets (including MIPS ones) should enable
> -fsched-pressure by default.
> 2010-12-02  Vladimir Makarov <vmakarov@redhat.com>
>    * ira.c (update_equiv_regs): Prohibit move insns if
>    pressure-sensitive scheduling was done.
> Jeff, sorry for bothering you.  Is it ok to commit the patch to the
> trunk?

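For anyone reading without the attachment, here is a minimal sketch of the
kind of test described in the quoted thread ("a=b; c=d; e=f; g=h;").  It is
not the attached test8.c; the global names and the function name foo are
assumptions chosen to match the description.

```c
#include <assert.h>  /* only needed for an optional run-time sanity check */

/* Hypothetical stand-in for the test described in the thread; all names
   are assumptions.  Globals force real loads and stores. */
int a, b, c, d, e, f, g, h;

/* Four independent memory-to-memory copies.  On a load/store target each
   copy needs a load into a register followed by a store, so a scheduler
   is free to group the four loads ahead of the four stores. */
void foo(void)
{
    a = b;
    c = d;
    e = f;
    g = h;
}
```

Compiling something like this with the options quoted above (-O2
-mtune=mips64r2 -mabi=n32), with and without -fsched-pressure, and comparing
the output of -fdump-rtl-sched1 and -fdump-rtl-ira should show whether the
load/store order chosen by sched1 survives IRA.
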
This caused:


