This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix PR91522


On Wed, 28 Aug 2019, Iain Sandoe wrote:

> 
> 
> 
> > On 26 Aug 2019, at 11:06, Uros Bizjak <ubizjak@gmail.com> wrote:
> > 
> > On Mon, Aug 26, 2019 at 10:40 AM Richard Biener <rguenther@suse.de> wrote:
> >> 
> >> On Fri, 23 Aug 2019, Richard Biener wrote:
> >> 
> >>> On Fri, 23 Aug 2019, Richard Biener wrote:
> >>> 
> >>>> On Fri, 23 Aug 2019, Uros Bizjak wrote:
> >>>> 
> >>>>> On Thu, Aug 22, 2019 at 3:35 PM Richard Biener <rguenther@suse.de> wrote:
> >>>>>> 
> >>>>>> 
> >>>>>> This fixes quadraticness in STV and makes
> >>>>>> 
> >>>>>> machine dep reorg                  :  89.07 ( 95%)   0.02 ( 18%)  89.10 (
> >>>>>> 95%)      54 kB (  0%)
> >>>>>> 
> >>>>>> drop to zero.  Anybody remembers why it is the way it is now?
> >>>>>> 
> >>>>>> Bootstrap / regtest running on x86_64-unknown-linux-gnu.
> >>>>>> 
> >>>>>> OK?
> >>>>> 
> >>>>> Looking at the PR, comment #3 [1], I assume this patch is obsoltete
> >>>>> and will be updated?
> >>>>> 
> >>>>> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91522#c3
> >>>> 
> >>>> Yes.  I'm still learning how STV operates (and learing DF and RTL...).
> >>>> The following is a rewrite of the non-TImode chain conversion
> >>>> according to I think how it should operate als allowing the hunk
> >>>> that fixes the compile-time and fixing PR91527 on the way
> >>>> (which I ran into during more extensive testing of the patch myself).
> >>>> 
> >>>> So compared to the state before which I still do not 100% understand
> >>>> we now do the following.  Chain detection works as before including
> >>>> recording of all defs (both defined by the insns in the chain and
> >>>> insns outside) that need copy-in or copy-out operations.
> >>>> 
> >>>> But then the patch changes things as to guarantee that
> >>>> after the conversion all uses/defs of a pseudo are
> >>>> of the (subreg:Vmode ..) form or of the original scalar form.
> >>>> In particular it avoids the need to change any insns that
> >>>> are not part of the chain (besides emitting the extra copy
> >>>> instructions).  For this all defs that were marked as needing
> >>>> copies (thus they have uses/defs both in the chain and outside)
> >>>> the chain will use a new pseudo that we copy to from scalar sources
> >>>> and that we copy from for scalar uses.  There's the new defs_map
> >>>> which records the mapping of old to new reg.  pseudos that are
> >>>> only used in the chain already are not remapped.
> >>>> 
> >>>> The conversion itself then happens in two stages, first,
> >>>> in make_vector_copies, we emit the copy-in insns and
> >>>> allocate all pseudos we need.  Then the rest of the
> >>>> conversion happens fully inside of convert_insn where
> >>>> we generate the copy-outs of the insns def, replace
> >>>> defs and uses according to the mapping and replace uses
> >>>> and defs with the (subreg:Vmode ..) style.
> >>>> 
> >>>> For PR91527 IRA doesn't like the REG_EQUIV note in
> >>>> 
> >>>> (insn 4 24 5 2 (set (subreg:V4SI (reg/v:SI 90 [ c ]) 0)
> >>>>        (subreg:V4SI (reg:SI 100) 0))
> >>>> "/space/rguenther/src/svn/trunk2/gcc/testsuite/g++.dg/tree-ssa/pr21463.C":11:4
> >>>> 1248 {movv4si_internal}
> >>>>     (expr_list:REG_DEAD (reg:SI 100)
> >>>>        (expr_list:REG_EQUIV (mem/c:SI (plus:DI (reg/f:DI 16 argp)
> >>>>                    (const_int 16 [0x10])) [1 c+0 S4 A64])
> >>>>            (nil))))
> >>>> 
> >>>> because the SET_DEST is not a REG_P.  I'm not sure if this
> >>>> is invalid RTL, docs say SET_DEST might be a strict_low_part
> >>>> or a zero_extract but doesn't mention a subreg.  So I opted
> >>>> to simply remove equal/equiv notes on insns we convert
> >>>> and since the above has a REG_DEAD note I took the liberty
> >>>> to update that according to the mapping (so that would have
> >>>> been not needed before this patch) rather than dropping it.
> >>>> 
> >>>> Bootstrapped with and without --with-march=westmere (to get
> >>>> some STV coverage, this included all languages) on
> >>>> x88_64-unknown-linux-gnu, testing in progress.
> >>>> 
> >>>> OK if testing succeeds?
> >>> 
> >>> Testing revealed I made an error in general_scalar_chain::convert_insn
> >>> failing to move down SET_SRC extraction below replacing with
> >>> the defs map.  This showed in 4 execute FAILs in 32bit fortran
> >>> testing (only).  Fixed by moving down the whole def_set/src/dst
> >>> extraction.
> >>> 
> >>> Re-testing on x86_64-unknown-linux-gnu.
> >> 
> >> Bootstrapped / tested on x86_64-unknown-linux-gnu.  The "no-costmodel"
> >> run runs into the latent PR91528 building 32bit libada in stage3
> >> for a few sources, I've manually built those with -mno-stv and
> >> bootstrap finishes successfully.  I hope HJ can help with this
> >> dynamic stack-alignment issue.
> >> 
> >> So - OK for trunk?
> >> 
> >> As followup we can now remove general_remove_non_convertible_regs
> >> because we can handle defs that cannot be converted just fine
> >> with the patch since we're splitting live-ranges of all defs at
> >> the chain boundary.
> >> 
> >> Thanks,
> >> Richard.
> >> 
> >>> Updated patch below.  I'm feeling adventurous and will run
> >>> the "westmere" bootstrap with costing disabled (aka always
> >>> convert detected chains...).
> >>> 
> >>> Richard.
> >>> 
> >>> 2019-08-23  Richard Biener  <rguenther@suse.de>
> >>> 
> >>>      PR target/91522
> >>>      PR target/91527
> >>>      * config/i386/i386-features.h (general_scalar_chain::defs_map):
> >>>      New member.
> >>>      (general_scalar_chain::replace_with_subreg): Remove.
> >>>      (general_scalar_chain::replace_with_subreg_in_insn): Likewise.
> >>>      (general_scalar_chain::convert_reg): Adjust signature.
> >>>      * config/i386/i386-features.c (scalar_chain::add_insn): Do not
> >>>      iterate over all defs of a reg.
> >>>      (general_scalar_chain::replace_with_subreg): Remove.
> >>>      (general_scalar_chain::replace_with_subreg_in_insn): Likewise.
> >>>      (general_scalar_chain::make_vector_copies): Populate defs_map,
> >>>      place copy only after defs that are used as vectors in the chain.
> >>>      (general_scalar_chain::convert_reg): Emit a copy for a specific
> >>>      def in a specific instruction.
> >>>      (general_scalar_chain::convert_op): All reg uses are converted here.
> >>>      (general_scalar_chain::convert_insn): Emit copies for scalar
> >>>      uses of defs here.  Replace uses with the copies we created.
> >>>      Replace and convert the def.  Adjust REG_DEAD notes, remove
> >>>      REG_EQUIV/EQUAL notes.
> >>>      (general_scalar_chain::convert_registers): Only handle copies
> >>>      into the chain here.
> > 
> > Rubberstamped with LGTM. It looks you are the master of this domain now ;)
> 
> This  breaks bootstrap for i686-darwin (and most likely is the cause of bootstrap fail on 
> i686-linux i686-linux-gnu at https://gcc.gnu.org/ml/gcc-regression/2019-08/msg00398.html 
> et al) 
> It gives a bunch of compare errors spread around the tree - so no specific pointer there.
> 
> There’s a secondary fail overlaying it between 274933-or so and 274980 which confused
> my initial search.

Please file a bugreport.  Don't have time today to dig into but will do
tomorrow (but now at least try to reproduce).

Richard.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]