This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH], PR 50341, Fix TOC load scheduling for powerpc on GCC 4.6 and 4.7


On Thu, Sep 15, 2011 at 11:07:43AM -0700, Andrew Pinski wrote:
> On Thu, Sep 15, 2011 at 10:52 AM, Michael Meissner
> <meissner@linux.vnet.ibm.com> wrote:
> > The recent changes for the new code models tickled a latent bug in the GCC 4.6
> > and 4.7 compilers with the AIX calling sequence (PR 50341). ?When you call a
> > function indirectly, the compiler has to save the current function's TOC
> > pointer that is in r2 on the stack frame, and then load the new function's TOC
> > pointer before doing the call. ?If the indirect function that you are calling
> > has the same TOC value as the current function, the code will run, but if the
> > indirection function has a different TOC (for example it is in the main
> > program, and the indirect call is in a shared library), the wrong address will
> > be used.
> >
> > The existing code did this by doing a split after reload has finished.
> > Scheduling after reload then moves the load of the new TOC. ?These patches for
> > GCC 4.6 and GCC 4.7 disable the splits so that the load of the new TOC is
> > always next to the bctrl instruction, preventing the compiler from moving the
> > load. ?Ideally, we can eventually tackle the underlying problem that we did not
> > have the right dependencies to prevent the scheduler from moving the new TOC
> > load before the use.
> 
> ENOPATCH.

Yes, I realized that after the fact and put it out later.
 
> It was originally splitting before reload and I had changed it to be
> splitting after reload assuming the dependencies would be correct.
> What is the instruction which is being described as not dependent on
> the load of r2?

>From the example in the bug, it is the 3rd cpu_fprintf, where it decides to
move loading up an address into register 27, and move the load of the address
before the call, since register 27 is preserved across calls.

void cpu_dump_state (struct CPUPPCState *env, FILE *f, fprintf_function cpu_fprintf,
                     int flags)
{



    int i;

    cpu_fprintf(f, "NIP " "%016" "l" "x" "   LR " "%016" "l" "x" " CTR "
                "%016" "l" "x" " XER " "%016" "l" "x" "\n",
                env->nip, env->lr, env->ctr, env->xer);
    cpu_fprintf(f, "MSR " "%016" "l" "x" " HID0 " "%016" "l" "x" "  HF "
                "%016" "l" "x" " idx %d\n", env->msr, env->spr[(0x3F0)],
                env->hflags, env->mmu_idx);

    cpu_fprintf(f, "TB %08" "u" " %08" "l" "u"

                " DECR %08" "u"

                "\n",
                cpu_ppc_load_tbu(env), cpu_ppc_load_tbl(env)

                , cpu_ppc_load_decr(env)

                );

These are the insns that are moved before the call:

(insn 1105 153 203 2 (set (reg/f:DI 27 27 [603])
        (const:DI (plus:DI (reg:DI 2 2)
                (high:DI (const:DI (unspec:DI [
                                (symbol_ref/f:DI ("*.LC5") [flags 0x82]  <var_decl 0xfff931e6360 *.LC5>)
                            ] UNSPEC_TOCREL)))))) 513 {largetoc_high}
     (expr_list:REG_EQUIV (const:DI (plus:DI (reg:DI 2 2)
                (high:DI (const:DI (unspec:DI [
                                (symbol_ref/f:DI ("*.LC5") [flags 0x82]  <var_decl 0xfff931e6360 *.LC5>)
                            ] UNSPEC_TOCREL)))))
        (nil)))

(insn 1107 161 204 2 (set (reg/f:DI 27 27 [601])
        (lo_sum:DI (reg/f:DI 27 27 [603])
            (const:DI (unspec:DI [
                        (symbol_ref/f:DI ("*.LC5") [flags 0x82]  <var_decl 0xfff931e6360 *.LC5>)
                    ] UNSPEC_TOCREL)))) 514 {largetoc_low}
     (expr_list:REG_EQUIV (symbol_ref/f:DI ("*.LC5") [flags 0x82]  <var_decl 0xfff931e6360 *.LC5>)
        (nil)))

The RTL logic is not looking into the const.

This is triggered by Alan's patch in June:

2011-06-20  Alan Modra  <amodra@gmail.com>

	* config/rs6000/rs6000.c (create_TOC_reference): Wrap high part
	of toc-relative address in CONST.
	(rs6000_delegitimize_address): Recognize changed address.
	(rs6000_legitimize_reload_address): Likewise.
	(rs6000_emit_move): Don't force these constants to memory.
	* config/rs6000/rs6000.md (tls_gd, tls_gd_high): Wrap high part of
	toc-relative address in CONST.
	(tls_ld, tls_ld_high, tls_got_dtprel, tls_got_dtprel_high): Likewise.
	(tls_got_tprel, tls_got_tprel_high, largetoc_high): Likewise.

Now, I anticipate that the patch in 4.7 will be temporary until we are
confident that we have the right solution, but it is better to put a bandaid on
the patch to limit the damage.

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meissner@linux.vnet.ibm.com	fax +1 (978) 399-6899


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]