[PATCH][libgcc-math] Vectorized intrinsics for x86_64
Tue Apr 4 15:40:00 GMT 2006
On Fri, 31 Mar 2006, Richard Henderson wrote:
> On Tue, Mar 28, 2006 at 10:23:47AM +0200, Richard Guenther wrote:
> > The intrinsics implementation was contributed by AMD to be licensed
> > as GPL + libgcc exception.
> It's a shame they were written in hand-coded assembly; otherwise
> we could use them for 32-bit as well. I don't have any trouble
> adding these routines, but I think the data sections need some work.
> First, none of the data put in .data is writable. At bare minimum
> this should be going into .rodata. But I also see that there is
> quite a bit of overlap between the various routines, so it would be
> Much Better if we could make use of the constant pooling features of
> the linker. So the tables should go into .rodata, but the individual
> double-precision values should go into
> .section .rodata.cst8, "M", @progbits, 8
> .align 8
Here's an updated patch with your suggestions applied (but with .align 16,
to have it cacheline aligned), like so
.section .rodata.cst16, "M", @progbits, 16
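For illustration only, a minimal sketch of how one routine might place and
load such a constant. The label, the function name and the choice of ln(2)
are hypothetical and not taken from the patch. The "a" (allocatable) flag is
also an assumption, added because that is what GCC usually emits for its own
constant-pool sections.

```asm
        # One 16-byte mergeable entry: ln(2) duplicated in both lanes.
        .section .rodata.cst16, "aM", @progbits, 16
        .align 16
.L_ln2_pair:                            # hypothetical local label
        .quad 0x3FE62E42FEFA39EF        # ln(2) as an IEEE-754 double
        .quad 0x3FE62E42FEFA39EF        # ln(2) as an IEEE-754 double

        .text
        .globl  example_load_ln2        # hypothetical helper, not in the patch
        .type   example_load_ln2, @function
example_load_ln2:
        # movapd requires a 16-byte aligned memory operand, hence .align 16.
        movapd  .L_ln2_pair(%rip), %xmm0
        ret
        .size   example_load_ln2, .-example_load_ln2
```

Because every entry in such a section has the same size (16 bytes), the
linker can fold byte-identical entries from different object files into a
single copy.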
Bootstrapped on x86_64-unknown-linux-gnu.
Ok for mainline? (I'll wait until someone looks at / approves Zdenek's
patch to utilize these functions)
(patch attached due to size)
2006-04-04 Richard Guenther <firstname.lastname@example.org>
* configure.ac: Handle x86_64 subdir.
* configure: Regenerate.
* Makefile.in: Regenerate.
* x86_64/Makefile.am: New file.
* x86_64/Makefile.in: Regenerate.
* x86_64/libm_util_amd.h: New file.
* x86_64/remainder_piby2d2f.c: Likewise.
* x86_64/remainder_piby2.c: Likewise.
* x86_64/vrd2log.s: Likewise.
* x86_64/vrd2log10.s: Likewise.
* x86_64/vrd4log.s: Likewise.
* x86_64/vrs4powxf.s: Likewise.
* x86_64/vrd4log10.s: Likewise.
* x86_64/vrd2cos.s: Likewise.
* x86_64/vrs4sincosf.s: Likewise.
* x86_64/vrd4cos.s: Likewise.
* x86_64/vrd2sin.s: Likewise.
* x86_64/vrd4sin.s: Likewise.
* x86_64/vrs4logf.s: Likewise.
* x86_64/vrs8logf.s: Likewise.
* x86_64/vrs4sinf.s: Likewise.
* x86_64/vrs4expf.s: Likewise.
* x86_64/vrs8expf.s: Likewise.
* x86_64/vrs4log2f.s: Likewise.
* x86_64/vrd2exp.s: Likewise.
* x86_64/vrs4powf.s: Likewise.
* x86_64/vrs8log2f.s: Likewise.
* x86_64/vrd2sincos.s: Likewise.
* x86_64/vrd4exp.s: Likewise.
* x86_64/vrd2log2.s: Likewise.
* x86_64/vrd4log2.s: Likewise.
* x86_64/vrs4log10f.s: Likewise.
* x86_64/vrs4cosf.s: Likewise.
* x86_64/vrs8log10f.s: Likewise.
* x86_64/mv.map: New version map.