This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH, i386]: Do not limit the cost of moves to/from XMM register to minimum 8.
- From: Uros Bizjak <ubizjak at gmail dot com>
- To: Hongtao Liu <crazylht at gmail dot com>
- Cc: Richard Biener <richard dot guenther at gmail dot com>, Jakub Jelinek <jakub at redhat dot com>, Alan Modra <amodra at gmail dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Mon, 2 Sep 2019 10:41:24 +0200
- Subject: Re: [PATCH, i386]: Do not limit the cost of moves to/from XMM register to minimum 8.
On Mon, Sep 2, 2019 at 10:13 AM Hongtao Liu <crazylht@gmail.com> wrote:
>
> > which is not the case with core_cost (and similar with skylake_cost):
> >
> > 2, 2, 4, /* cost of moving XMM,YMM,ZMM register */
> > {6, 6, 6, 6, 12}, /* cost of loading SSE registers
> > in 32,64,128,256 and 512-bit */
> > {6, 6, 6, 6, 12}, /* cost of storing SSE registers
> > in 32,64,128,256 and 512-bit */
> > 2, 2, /* SSE->integer and integer->SSE moves */
> >
> > We have the same cost of moving between integer registers (by default
> > set to 2), between SSE registers and between integer and SSE register
> > sets. I think that at least the cost of moves between regsets should
> > be substantially higher, rs6000 uses 3x cost of intra-regset moves;
> > that would translate to the value of 6. The value should be low enough
> > to keep the cost below the value that forces move through the memory.
> > Changing core register allocation cost of SSE <-> integer to:
> >
> > --cut here--
> > Index: config/i386/x86-tune-costs.h
> > ===================================================================
> > --- config/i386/x86-tune-costs.h (revision 275281)
> > +++ config/i386/x86-tune-costs.h (working copy)
> > @@ -2555,7 +2555,7 @@ struct processor_costs core_cost = {
> > in 32,64,128,256 and 512-bit */
> > {6, 6, 6, 6, 12}, /* cost of storing SSE registers
> > in 32,64,128,256 and 512-bit */
> > - 2, 2, /* SSE->integer and integer->SSE moves */
> > + 6, 6, /* SSE->integer and integer->SSE moves */
> > /* End of register allocator costs. */
> > },
> >
> > --cut here--
> >
> > still produces direct move in gcc.target/i386/minmax-6.c
> >
> > I think that in addition to attached patch, values between 2 and 6
> > should be considered in benchmarking. Unfortunately, without access to
> > regressed SPEC tests, I can't analyse these changes by myself.
> >
> > Uros.
>
> After applying a similar change to skylake_cost, we got the following
> performance on a Skylake workstation:
> ---------------------------------------------------------------------
> version                                     | 548_exchange_r score
> gcc10_20180822                              | 10
> apply remove_max8                           | 8.9
> also apply increase integer_tofrom_sse cost | 9.69
> ---------------------------------------------------------------------
> There is still a 3% regression, which is related to
> _gfortran_mminloc0_4_i4 in libgfortran.so.5.0.0.
>
> I found the suspicious code below; does it have an effect?
Hard to say without access to the test, but I'm glad that changing the
knob has a noticeable effect. I think that (as said by Alan) fine-tuning
of the register pressure calculation will be needed to push this forward.
Uros.
> ------------------
> modified gcc/config/i386/i386-features.c
> @@ -590,7 +590,7 @@ general_scalar_chain::compute_convert_gain ()
> if (dump_file)
> fprintf (dump_file, " Instruction conversion gain: %d\n", gain);
>
> - /* ??? What about integer to SSE? */
> + /* ??? What about integer to SSE? */???
> EXECUTE_IF_SET_IN_BITMAP (defs_conv, 0, insn_uid, bi)
> cost += DF_REG_DEF_COUNT (insn_uid) * ix86_cost->sse_to_integer;
> ------------------
> --
> BR,
> Hongtao
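
For readers following the thread, the constraint discussed above (an
SSE <-> integer move cost that is higher than an intra-regset move but
still below the cost that forces the move through memory) can be sketched
as below. This is only an illustration under stated assumptions, not GCC
code: the struct and function names are hypothetical, and the "store plus
load" model of a move through memory is a simplification of what the
i386 backend actually computes. The last helper reproduces the arithmetic
behind the quoted 3% figure (10 versus 9.69).

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the cost fields discussed in the
   thread; this is not the real struct processor_costs.  */
struct move_costs
{
  int direct;  /* cost of a direct SSE <-> integer register move */
  int store;   /* cost of storing the source register to memory */
  int load;    /* cost of loading the destination register from memory */
};

/* Assumed model: a move through memory costs one store plus one load.  */
static int
via_memory_cost (const struct move_costs *c)
{
  return c->store + c->load;
}

/* True when the direct inter-regset move is cheaper than going through
   memory under the assumed model above.  */
static bool
direct_move_preferred (const struct move_costs *c)
{
  return c->direct < via_memory_cost (c);
}

/* Relative change of SCORE against BASELINE, in percent.  A negative
   result is a regression.  */
static double
percent_change (double baseline, double score)
{
  return (score - baseline) / baseline * 100.0;
}
```

With direct = 6 and store = load = 6 (the SSE load/store values from the
quoted table; using 6 for the integer side as well is an assumption), the
modelled memory path costs 12, so the direct move is still preferred. This
is consistent with the observation that minmax-6.c keeps the direct move.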