This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 10/11][RS6000] Migrate reduction optabs to reduc_..._scal


On Thu, 2014-11-06 at 16:44 +0000, Alan Lawrence wrote:
> Hmmm. I am a little surprised by your mention of "saturation points" as I would 
> not expect any variety of reduc_plus to be a saturating operation???

I wouldn't either, but the underlying vsum4ubs and vsum4sbs instructions
used in these patterns do both a reduction and an add to another value.
If that other value is large enough this can trigger a saturation event.
However, the patterns use vzero for this other value, so it's not
possible to approach the saturation cutoff for either instruction since
the reductions are being done on byte values.  (Each word in the vector
result is the sum of the corresponding four byte values in the vector
source, added to the other value, which here is zero.)

Thanks,
Bill

> 
> A.
> 
> Bill Schmidt wrote:
> > On Fri, 2014-10-24 at 19:49 -0400, David Edelsohn wrote:
> >> On Fri, Oct 24, 2014 at 8:06 AM, Alan Lawrence <alan.lawrence@arm.com> wrote:
> >>> This migrates the reduction patterns in altivec.md and vector.md to the new
> >>> names. I've not touched paired.md as I wasn't really sure how to fix that
> >>> (how do I vec_extractv2sf ?), moreover the testing I did didn't seem to
> >>> exercise any of those patterns (iow: I'm not sure what would be an
> >>> appropriate target machine?).
> >>>
> >>> I note the reduc_uplus_v16qi (which I've removed, as unsigned and signed
> >>> addition should be equivalent) differed from reduc_splus_v16qi in using
> >>> gen_altivec_vsum4ubs rather than gen_altivec_vsum4sbs.  Testcases
> >>> gcc.dg/vect/{slp-24-big-array.c,slp-24.c,vect-reduc-1char-big-array.c,vert-reduc-1char.c}
> >>> thus produce assembly which differs from previously (only) in that
> >>> "vsum4ubs" becomes "vsum4sbs". These tests are still passing so I assume
> >>> this is OK.
> > 
> > Given that the only 32-bit quantity being added here is zero, the
> > difference in saturation points for vsum4ubs and vsum4sbs won't come
> > into play, so I agree this should be fine.
> > 
> > I would like to ask Mike Meissner to look over the changes to the
> > reduction patterns in vector.md.  He wrote those and is more familiar
> > with that piece than I am.  On the surface I don't see any problems, but
> > I could miss something subtle.
> > 
> > Otherwise I'm ok with the patch.
> > 
> > Thanks,
> > Bill
> > 
> > (p.s. Sorry for the delay on reviewing this.  As David noted, I was
> > traveling, and I ended up having no access to my mail for most of the
> > week due to an IT snafu.)
> > 
> >>> The combining of signed and unsigned addition also improves
> >>> gcc.dg/vect/{vect-outer-4i.c,vect-reduc-1short.c,vect-reduc-dot-u8b.c,vect-reduc-pattern-1c-big-array.c,vect-reduc-pattern-1c.c}
> >>> : these are now reduced using direct vector reduction, rather than with
> >>> shifts as previously (because there was only a reduc_splus rather than the
> >>> reduc_uplus these tests looked for).
> >>>
> >>> ((Side note: the RTL changes to vector.md are to match the combine patterns
> >>> in vsx.md; now that we now longer depend upon combine to generate those
> >>> patterns (as the optab outputs them directly), one might wish to remove the
> >>> smaller pattern from vsx.md, and/or simplify the RTL. I theorize that a
> >>> reduction of a two-element vector is just adding the first element to the
> >>> second, so maybe to something like
> >>>
> >>>   [(parallel [(set (match_operand:DF 0 "vfloat_operand" "")
> >>>                    (VEC_reduc:V2DF
> >>>                     (vec_select:DF
> >>>                      (match_operand:V2DF 1 "vfloat_operand" "")
> >>>                      (parallel [(const_int 1)]))
> >>>                     (vec_select:DF
> >>>                      (match_dup 1)
> >>>                      (parallel [(const_int 0)]))))
> >>>               (clobber (match_scratch:V2DF 2 ""))])]
> >>>
> >>> but I think it's best for me to leave that to the port maintainers.))
> >>>
> >>> Bootstrapped and check-gcc on powerpc64-none-linux-gnu
> >>> (gcc110.fsffrance.org, with thanks to the GCC Compile Farm).
> >>>
> >>> gcc/ChangeLog:
> >>>
> >>>         * config/rs6000/altivec.md (reduc_splus_<mode>): Rename to...
> >>>         (reduc_plus_scal_<mode>): ...this, and rs6000_expand_vector_extract.
> >>>         (reduc_uplus_v16qi): Remove.
> >>>
> >>>         * config/rs6000/vector.md (VEC_reduc_name): change "splus" to "plus"
> >>>         (reduc_<VEC_reduc_name>_v2df): Rename to...
> >>>         (reduc_<VEC_reduc_name>_scal_v2df): ...this, wrap VEC_reduc in a
> >>>         vec_select of element 1.
> >>>         (reduc_<VEC_reduc_name>_v4sf): Rename to...
> >>>         (reduc_<VEC_reduc_name>_scal_v4sf): ...this, wrap VEC_reduc in a
> >>>         vec_select of element 3, add scratch register.
> >>>
> >> This needs some input from Bill, but he will be busy with a conference
> >> this week.
> >>
> >> - David
> >>
> > 
> > 
> > 
> 
> 



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]