This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
RE: SSE5 patches round 3
- From: "Meissner, Michael" <michael dot meissner at amd dot com>
- To: "Uros Bizjak" <ubizjak at gmail dot com>, "GCC Patches" <gcc-patches at gcc dot gnu dot org>, "Harle, Christophe" <christophe dot harle at amd dot com>, "rajagopal, dwarak" <dwarak dot rajagopal at amd dot com>, "H. J. Lu" <hjl at lucon dot org>
- Date: Mon, 10 Sep 2007 13:09:52 -0400
- Subject: RE: SSE5 patches round 3
I had forgotten about the smmintrin.h/mmintrin-common.h changes in the
ChangeLog. Good catch. Yes, I should have added them. The changes to
smmintrin.h were to move the common stuff between bmmintrin.h and
smmintrin.h into a common file (mmintrin-common.h), so that
sse-1{2,3,4}.c can be compiled. This is needed because both SSE4.1 and
SSE5 define the same instructions (ROUND and PTEST). H. J., what are
your feelings about my changes to smmintrin.h?
Ok, I will look at merging SSE5_INTRINSIC, since now that I'm using
vec_merge, the two cases can be handled. I just wanted to make sure the
main changes got in.
In terms of sse-1{2,3,4}.c and bmmintrin.h/ammintrin.h. I originally
did it that way, and I can change it back. I was just being cautious in
case Intel/Via decides to implement SSE5 but not SSE4A (right now,
-msse5 does turn on -msse4a, but they are logically disjoint
extensions).
--
Michael Meissner
AMD, MS 83-29
90 Central Street
Boxborough, MA 01719
> -----Original Message-----
> From: Uros Bizjak [mailto:ubizjak@gmail.com]
> Sent: Monday, September 10, 2007 1:00 PM
> To: Meissner, Michael; Uros Bizjak; GCC Patches; Harle, Christophe;
> rajagopal, dwarak; H. J. Lu
> Subject: Re: SSE5 patches round 3
>
> Michael Meissner wrote:
>
> > This patch is the latest combined patch. If I don't hear any strong
> > objections, I will check this in a few hours, and we can fix any
> problems in
> > subsequent patches. I did a full boostrap build based on changes as
> they were
> > in the tree on Saturday. I just rev'ed up to today's changes, fixed
the
> merge
> > conflicts, and did a quick build, and I will do a full boostrap
> build/test
> > after sending this mail.
> >
>
> I would like some remarks (outlined below) to be addressed before
> committing this large patch (they should be mostly mechanical
> changes/minor problems), but I'd like just to look over them to avoid
> making large changes after your patch is committed. If necessary, we
can
> wait with other changes to i386 directory for a day or two to ease
your
> merging. Also, as your changes are strictly to files in i386
> directories, you can get a stage3 deadline extension to finish your
work.
>
> > Thanks for doing the review.
> >
>
> Thanks for your changes!
>
> Please note that your patch changes smmintrin.h, but there is no
mention
> of this change in ChangeLog, as well as no mention of
mmintrin-common.h.
> I think that since smmintrin.h header was defined by Intel, H.J.
should
> also approve this change. I have added his mail into CC:
>
> I have one objection to this patch. As discussed on 6th of September,
> you should merge
>
> + (UNSPEC_SSE5_INTRINSIC_P 150)
> + (UNSPEC_SSE5_INTRINSIC_S 151)
>
> into UNSPEC_SSE5_INTRINSIC to describe both, vector and scalar version
> of insn with single unspec constant:
>
> --quote--
> The insn with unspec is OK for vector modes (to prevent automatic
> combining of vec mul and add into vector fmadd insn as outlined in the
> comment above insn), but for scalar instruction, a vec_merge form
should
> be used. instead. Basically, you wrap body of parallel instruction
with:
>
> [(set (match_operand:V4SF 0 "register_operand" "=x")
> (vec_merge:V4SF
> (...parallel pattern body...)
> (match_dup 1)
> (const_int 1)))]
> --unquote--
>
> where
>
> --quote--
> It is a de-facto convention to name this pattern "sseX_vm....", where
> 'vm' stands for vec_merge. This way, you will need only
> UNSPEC_SSE5_INTRINSIC in both, parallel and scalar (actually vm)
> instruction.
>
> You can look into i.e. *subv4sf3 and sse_vmsubv4sf3 insn patterns as
an
> example of this approach.
> --unquote--
>
> > * config/i386/driver-i386.c (host_detect_local_cpu): Add basic
>
> The description was truncated.
>
> > *** gcc/testsuite/gcc.target/i386/sse-13.c.~1~ 2007-09-10
> 11:32:14.774764000 -0400
> > --- gcc/testsuite/gcc.target/i386/sse-13.c 2007-09-09
> 22:53:47.157284000 -0400
> > ***************
> > *** 1,8 ****
> > ! /* { dg-do compile } */
> > ! /* { dg-options "-O2 -msse4.1 -msse4a" } */
> >
> > /* Test that the intrinsics compile with optimization. All of
them
> are
> > ! defined as inline functions in {,x,e,p,t,s,a}mmintrin.h that
> reference
> > the proper builtin functions. Defining away "static" and
> "__inline"
> > results in all of them being compiled as proper functions. */
> >
> > --- 1,8 ----
> > ! /* { dg-do compile { target i?86-*-* x86_64-*-* } } */
> > ! /* { dg-options "-O2 -msse4.1 -msse4a -msse5 " } */
> You should substitute -msse4a with -msse5, as -msse5 should implicitly
> define -msse4a.
>
> > #include <ammintrin.h>
> > + #include <bmmintrin.h>
> > #include <smmintrin.h>
>
> Also, you should remove ammintrin.h include, as bmmintrin.h should
> include ammintrin.h.
>
> Please also include changes to sse-12.c, sse-13.c and sse-14.c in
> testsuite/ChangeLog.
>
> Thanks for your cooperation,
> Uros.
>