This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Fix PR91522
On Wed, 28 Aug 2019, Iain Sandoe wrote:
>
>
>
> > On 26 Aug 2019, at 11:06, Uros Bizjak <ubizjak@gmail.com> wrote:
> >
> > On Mon, Aug 26, 2019 at 10:40 AM Richard Biener <rguenther@suse.de> wrote:
> >>
> >> On Fri, 23 Aug 2019, Richard Biener wrote:
> >>
> >>> On Fri, 23 Aug 2019, Richard Biener wrote:
> >>>
> >>>> On Fri, 23 Aug 2019, Uros Bizjak wrote:
> >>>>
> >>>>> On Thu, Aug 22, 2019 at 3:35 PM Richard Biener <rguenther@suse.de> wrote:
> >>>>>>
> >>>>>>
> >>>>>> This fixes quadraticness in STV and makes
> >>>>>>
> >>>>>> machine dep reorg : 89.07 ( 95%) 0.02 ( 18%) 89.10 (
> >>>>>> 95%) 54 kB ( 0%)
> >>>>>>
> >>>>>> drop to zero. Anybody remembers why it is the way it is now?
> >>>>>>
> >>>>>> Bootstrap / regtest running on x86_64-unknown-linux-gnu.
> >>>>>>
> >>>>>> OK?
> >>>>>
> >>>>> Looking at the PR, comment #3 [1], I assume this patch is obsoltete
> >>>>> and will be updated?
> >>>>>
> >>>>> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91522#c3
> >>>>
> >>>> Yes. I'm still learning how STV operates (and learing DF and RTL...).
> >>>> The following is a rewrite of the non-TImode chain conversion
> >>>> according to I think how it should operate als allowing the hunk
> >>>> that fixes the compile-time and fixing PR91527 on the way
> >>>> (which I ran into during more extensive testing of the patch myself).
> >>>>
> >>>> So compared to the state before which I still do not 100% understand
> >>>> we now do the following. Chain detection works as before including
> >>>> recording of all defs (both defined by the insns in the chain and
> >>>> insns outside) that need copy-in or copy-out operations.
> >>>>
> >>>> But then the patch changes things as to guarantee that
> >>>> after the conversion all uses/defs of a pseudo are
> >>>> of the (subreg:Vmode ..) form or of the original scalar form.
> >>>> In particular it avoids the need to change any insns that
> >>>> are not part of the chain (besides emitting the extra copy
> >>>> instructions). For this all defs that were marked as needing
> >>>> copies (thus they have uses/defs both in the chain and outside)
> >>>> the chain will use a new pseudo that we copy to from scalar sources
> >>>> and that we copy from for scalar uses. There's the new defs_map
> >>>> which records the mapping of old to new reg. pseudos that are
> >>>> only used in the chain already are not remapped.
> >>>>
> >>>> The conversion itself then happens in two stages, first,
> >>>> in make_vector_copies, we emit the copy-in insns and
> >>>> allocate all pseudos we need. Then the rest of the
> >>>> conversion happens fully inside of convert_insn where
> >>>> we generate the copy-outs of the insns def, replace
> >>>> defs and uses according to the mapping and replace uses
> >>>> and defs with the (subreg:Vmode ..) style.
> >>>>
> >>>> For PR91527 IRA doesn't like the REG_EQUIV note in
> >>>>
> >>>> (insn 4 24 5 2 (set (subreg:V4SI (reg/v:SI 90 [ c ]) 0)
> >>>> (subreg:V4SI (reg:SI 100) 0))
> >>>> "/space/rguenther/src/svn/trunk2/gcc/testsuite/g++.dg/tree-ssa/pr21463.C":11:4
> >>>> 1248 {movv4si_internal}
> >>>> (expr_list:REG_DEAD (reg:SI 100)
> >>>> (expr_list:REG_EQUIV (mem/c:SI (plus:DI (reg/f:DI 16 argp)
> >>>> (const_int 16 [0x10])) [1 c+0 S4 A64])
> >>>> (nil))))
> >>>>
> >>>> because the SET_DEST is not a REG_P. I'm not sure if this
> >>>> is invalid RTL, docs say SET_DEST might be a strict_low_part
> >>>> or a zero_extract but doesn't mention a subreg. So I opted
> >>>> to simply remove equal/equiv notes on insns we convert
> >>>> and since the above has a REG_DEAD note I took the liberty
> >>>> to update that according to the mapping (so that would have
> >>>> been not needed before this patch) rather than dropping it.
> >>>>
> >>>> Bootstrapped with and without --with-march=westmere (to get
> >>>> some STV coverage, this included all languages) on
> >>>> x88_64-unknown-linux-gnu, testing in progress.
> >>>>
> >>>> OK if testing succeeds?
> >>>
> >>> Testing revealed I made an error in general_scalar_chain::convert_insn
> >>> failing to move down SET_SRC extraction below replacing with
> >>> the defs map. This showed in 4 execute FAILs in 32bit fortran
> >>> testing (only). Fixed by moving down the whole def_set/src/dst
> >>> extraction.
> >>>
> >>> Re-testing on x86_64-unknown-linux-gnu.
> >>
> >> Bootstrapped / tested on x86_64-unknown-linux-gnu. The "no-costmodel"
> >> run runs into the latent PR91528 building 32bit libada in stage3
> >> for a few sources, I've manually built those with -mno-stv and
> >> bootstrap finishes successfully. I hope HJ can help with this
> >> dynamic stack-alignment issue.
> >>
> >> So - OK for trunk?
> >>
> >> As followup we can now remove general_remove_non_convertible_regs
> >> because we can handle defs that cannot be converted just fine
> >> with the patch since we're splitting live-ranges of all defs at
> >> the chain boundary.
> >>
> >> Thanks,
> >> Richard.
> >>
> >>> Updated patch below. I'm feeling adventurous and will run
> >>> the "westmere" bootstrap with costing disabled (aka always
> >>> convert detected chains...).
> >>>
> >>> Richard.
> >>>
> >>> 2019-08-23 Richard Biener <rguenther@suse.de>
> >>>
> >>> PR target/91522
> >>> PR target/91527
> >>> * config/i386/i386-features.h (general_scalar_chain::defs_map):
> >>> New member.
> >>> (general_scalar_chain::replace_with_subreg): Remove.
> >>> (general_scalar_chain::replace_with_subreg_in_insn): Likewise.
> >>> (general_scalar_chain::convert_reg): Adjust signature.
> >>> * config/i386/i386-features.c (scalar_chain::add_insn): Do not
> >>> iterate over all defs of a reg.
> >>> (general_scalar_chain::replace_with_subreg): Remove.
> >>> (general_scalar_chain::replace_with_subreg_in_insn): Likewise.
> >>> (general_scalar_chain::make_vector_copies): Populate defs_map,
> >>> place copy only after defs that are used as vectors in the chain.
> >>> (general_scalar_chain::convert_reg): Emit a copy for a specific
> >>> def in a specific instruction.
> >>> (general_scalar_chain::convert_op): All reg uses are converted here.
> >>> (general_scalar_chain::convert_insn): Emit copies for scalar
> >>> uses of defs here. Replace uses with the copies we created.
> >>> Replace and convert the def. Adjust REG_DEAD notes, remove
> >>> REG_EQUIV/EQUAL notes.
> >>> (general_scalar_chain::convert_registers): Only handle copies
> >>> into the chain here.
> >
> > Rubberstamped with LGTM. It looks you are the master of this domain now ;)
>
> This breaks bootstrap for i686-darwin (and most likely is the cause of bootstrap fail on
> i686-linux i686-linux-gnu at https://gcc.gnu.org/ml/gcc-regression/2019-08/msg00398.html
> et al)
> It gives a bunch of compare errors spread around the tree - so no specific pointer there.
>
> There’s a secondary fail overlaying it between 274933-or so and 274980 which confused
> my initial search.
Please file a bugreport. Don't have time today to dig into but will do
tomorrow (but now at least try to reproduce).
Richard.