This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug middle-end/85090] [8 Regression] wrong code with -O2 -fno-tree-dominator-opts -mavx512f -fira-algorithm=priority
- From: "vmakarov at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Fri, 06 Apr 2018 21:32:45 +0000
- Subject: [Bug middle-end/85090] [8 Regression] wrong code with -O2 -fno-tree-dominator-opts -mavx512f -fira-algorithm=priority
- Auto-submitted: auto-generated
- References: <bug-85090-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85090
--- Comment #13 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
(In reply to Uroš Bizjak from comment #11)
> (In reply to Jakub Jelinek from comment #5)
> > I guess it depends on what exactly a normal subreg on lhs means.
> > The documentation says:
> > When used as an lvalue, 'subreg' is a word-based accessor.
> > Storing to a 'subreg' modifies all the words of REG that
> > overlap the 'subreg', but it leaves the other words of REG
> > alone.
>
>
> But this wording applies only to multi-word registers. We can't use the
> above wording for 512bit single-word register, since we don't know how the
> move will affect the bits outside the subreg. We can say that the move
> "modifies all the words of REG that overlap the 'subreg'", since we have only
> one 512-bit word of a 512-bit register.
>
OK.
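
To make the two readings concrete, here is a minimal C model of the store semantics discussed above. It is only a sketch and not GCC code: every name in it (model_store_subreg, model_store_strict_low_part, MODEL_POISON, and so on) is hypothetical, and "unspecified" bits are modelled as a fixed poison byte. It assumes the word size divides the register size and that the subreg lies inside the register. Treating the whole 512-bit register as a single word corresponds to word_bytes == 64.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_REG_BYTES 64   /* one 512-bit register, modelled as bytes */
#define MODEL_POISON    0xAA /* stands in for "unspecified" contents */

/* Plain subreg store (hypothetical model): every word of REG that
   overlaps [off, off + len) is modified.  Bytes of those words that lie
   outside the subreg become unspecified (poison here); words that do
   not overlap the subreg are left alone.  */
static void
model_store_subreg (unsigned char *reg, size_t word_bytes,
                    size_t off, const unsigned char *src, size_t len)
{
  size_t first = off / word_bytes * word_bytes;
  size_t last = (off + len + word_bytes - 1) / word_bytes * word_bytes;

  memset (reg + first, MODEL_POISON, last - first);
  memcpy (reg + off, src, len);
}

/* strict_low_part store (hypothetical model): only the bytes covered by
   the subreg change; the rest of the register is preserved.  */
static void
model_store_strict_low_part (unsigned char *reg, size_t off,
                             const unsigned char *src, size_t len)
{
  memcpy (reg + off, src, len);
}
```

With word_bytes == 8, an 8-byte store at offset 0 leaves bytes 8..63 intact. With word_bytes == 64, the same store may clobber them, which is the case the strict_low_part form is meant to rule out.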
> So, I think that the transformation in the Comment 10 is invalid for
> registers that can't be decomposed to independent word-sized registers (to
> use "word-based accessor"), e.g. V32HImode xmm20. Perhaps the mentioned
> alter_subreg should choose correct approach based on TARGET_HARD_REGNO_NREGS?
Actually, I do the same thing as the old reload does. It has practically the
same alter_subreg code. Maybe the reload and LRA code is not up to date enough
to treat this situation correctly. I don't know.
What I can do is generate (strict_low_part (subreg:DI (reg:V32HI <sse
pseudo>))) to reflect the new semantics. Something like:
Index: lra.c
===================================================================
--- lra.c (revision 258691)
+++ lra.c (working copy)
@@ -487,14 +487,26 @@ int lra_curr_reload_num;
void
lra_emit_move (rtx x, rtx y)
{
- int old;
-
+ int old, regno;
+ machine_mode mode;
+ rtx reg;
+
if (GET_CODE (y) != PLUS)
{
if (rtx_equal_p (x, y))
return;
old = max_reg_num ();
- emit_move_insn (x, y);
+ if (GET_CODE (x) == SUBREG
+ && REG_P (reg = SUBREG_REG (x))
+ && GET_MODE_SIZE (mode = GET_MODE (reg)).to_constant () > UNITS_PER_WORD
+ && (regno = REGNO (reg)) >= FIRST_PSEUDO_REGISTER
+ && ira_reg_class_max_nregs[lra_get_allocno_class (regno)][mode] == 1)
+ {
+ x = gen_rtx_STRICT_LOW_PART (VOIDmode, x);
+ emit_insn (gen_rtx_SET (x, y));
+ }
+ else
+ emit_move_insn (x, y);
if (REG_P (x))
lra_reg_info[ORIGINAL_REGNO (x)].last_reload = ++lra_curr_reload_num;
/* Function emit_move can create pseudos -- so expand the pseudo
But we need insn patterns for such cases, which are absent from the i386 md
files. Without adding them, the compiler will crash in LRA.
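
The condition added to lra_emit_move in the patch above can be restated as a standalone predicate. The following is a simplified sketch, not GCC code: struct model_dest and model_needs_strict_low_part are hypothetical names, and the fields stand in for the results of the GET_CODE, REG_P, REGNO, GET_MODE_SIZE and ira_reg_class_max_nregs checks in the patch.

```c
#include <stdbool.h>

/* Hypothetical summary of a move destination; the fields mirror the
   tests made in the patch, not any real GCC data structure.  */
struct model_dest
{
  bool is_subreg;            /* destination is a SUBREG */
  bool inner_is_pseudo;      /* inner expression is a pseudo register */
  unsigned inner_mode_bytes; /* size of the inner register's mode */
  unsigned class_max_nregs;  /* hard registers needed for that mode in
                                the pseudo's allocno class */
};

/* Return true when the store should be wrapped in strict_low_part:
   the inner register is wider than a word, yet it occupies a single
   hard register, so it cannot be split into independent words.  */
static bool
model_needs_strict_low_part (const struct model_dest *d,
                             unsigned units_per_word)
{
  return d->is_subreg
         && d->inner_is_pseudo
         && d->inner_mode_bytes > units_per_word
         && d->class_max_nregs == 1;
}
```

Under these assumptions, a 64-byte vector pseudo that fits in one SSE register with 8-byte words takes the strict_low_part path, while a 16-byte integer pseudo that needs two general registers keeps the ordinary emit_move_insn path.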