[PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

Mon May 25 11:36:43 GMT 2020

On Mon, 25 May 2020, Uros Bizjak wrote:

> On Mon, May 25, 2020 at 8:27 AM Richard Biener <rguenther@suse.de> wrote:
> >
> > On May 25, 2020 8:12:12 AM GMT+02:00, Uros Bizjak <ubizjak@gmail.com> wrote:
> > >On Mon, May 25, 2020 at 7:53 AM Hongtao Liu <crazylht@gmail.com> wrote:
> > >
> > >> > We have to introduce a new expander, that will have conforming mode
> > >of
> > >> > output operand (V2SF) and will produce RTX that will match
> > >> > *float<floatunssuffix>v2div2sf2. A paradoxical output subreg from
> > >> > V2SFmode V4SFmode is needed, generated by simplify_gen_subreg as is
> > >> > the case with paradoxical input subreg.
> > >>
> > >> Problem`is simplify_gen_subreg (V4SFmode, operands[0], V2SFmode, 0)
> > >> will return NULL since
> > >> ----
> > >> 948  /* Subregs involving floating point modes are not allowed to
> > >> 949     change size.  Therefore (subreg:DI (reg:DF) 0) is fine, but
> > >> 950     (subreg:SI (reg:DF) 0) isn't.  */
> > >
> > >But, we are not changing size, we are still operating with SFmode. It
> > >looks to me that this limitation is too strict, the intention is to
> > >not expand scalar SFmode to DFmode.
> >
> > I guess so. The test probably wants to tes the component mode.
> 
> There is already some fishy x86 specific workaround in place, which I
> took the liberty to extend to vector modes. The attached patch that
> exercises this code works as expected, and produces expected code.
> 
> Liu, can you please test the attached patch?
> 
> Richi, is the middle end part OK?

+     i.e. (subreg:V4SF (reg:SF) 0) or (subreg:V4SF (reg:V2SF) 0).  This
+     surely isn't the cleanest way to represent this.  It's questionable
+     if this ought to be represented at all -- why can't this all be 
hidden
+     in post-reload splitters that make arbitrarily mode changes to the
+     registers themselves.  */
+  else if (VECTOR_MODE_P (omode)
+          && (GET_MODE_INNER (omode) == imode
+              || (VECTOR_MODE_P (imode)
+                  && (GET_MODE_INNER (omode) == GET_MODE_INNER 
(imode)))))
     ;

I'd formulate this simpler, as

  else if (VECTOR_MODE_P (omode)
           && GET_MODE_INNER (omode) == GET_MODE_INNER (imode))

since GET_MODE_INNER (DFmode) == DFmode.

We also allow (subreg:V4SF (reg:V2DF) 0) where we effectively
subref DF with SF but the size check covers the whole vector, so
the comment on the code with float modes is a bit misleading IMHO.

It also leads to the strange situation that both
(subreg:V4SF (reg:V2DF) 0) and (subreg:V8SF (reg:V4SF) 0) is valid
but (subreg:V8SF (reg:V2DF) 0) is not.

Anyway, I think generalizing the existing exception is obvious enough
thus OK with my suggestion for the simplification if nobody else
has differing comments.

Thanks,
Richard.