[PATCH][libgcc-math] Vectorized intrinsics for x86_64

Richard Guenther rguenther@suse.de
Tue Apr 4 15:40:00 GMT 2006


On Fri, 31 Mar 2006, Richard Henderson wrote:

> On Tue, Mar 28, 2006 at 10:23:47AM +0200, Richard Guenther wrote:
> > The intrinsics implementation was contributed by AMD to be licensed
> > as GPL + libgcc exception.
> 
> It's a shame they were written in hand-coded assembly; otherwise
> we could use them for 32-bit as well.  I don't have any trouble
> adding these routines, but I think the data sections need some work.
> 
> First, none of the data put in .data is writable.  At bare minimum
> this should be going into .rodata.  But I also see that there is 
> quite a bit of overlap between the various routines, so it would be
> Much Better if we could make use of the constant pooling features of
> the linker.  So the tables should go into .rodata, but the individual
> double-precision values should go into
> 
> 	.section .rodata.cst8, "M", @progbits, 8
> 	.align 8

Here's an updated patch with your suggestions applied (but with .align 16,
to have it cacheline aligned), like so:

.section .rodata.cst16, "M", @progbits, 16
.align 16
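
For illustration only (a sketch, not code taken from the patch), a 16-byte
constant placed in such a mergeable section and loaded from a routine might
look like the following.  The label .L__real_half, the function name
example_load_half and the constant value are hypothetical.  The flags are
written as "aM" here, which is what GCC emits for its own constant pools;
that spelling just makes the allocatable flag explicit.

```
	# Hypothetical constant: two copies of 0.5 as IEEE doubles.
	.section .rodata.cst16, "aM", @progbits, 16
	.align 16
.L__real_half:
	.quad 0x3fe0000000000000	# 0.5, low lane
	.quad 0x3fe0000000000000	# 0.5, high lane

	.text
	.globl example_load_half	# hypothetical routine name
	.type example_load_half, @function
example_load_half:
	movapd	.L__real_half(%rip), %xmm0	# aligned 16-byte load
	ret
	.size example_load_half, .-example_load_half
```

With an entity size of 16, the linker can merge identical 16-byte entries
across object files, so two routines that define the same pair of doubles
should end up sharing one copy.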

Bootstrapped on x86_64-unknown-linux-gnu.

Ok for mainline?  (I'll wait until someone looks at / approves Zdenek's
patch to utilize these functions)

Thanks,
Richard.

(patch attached due to size)

2006-04-04  Richard Guenther  <rguenther@suse.de>

        * configure.ac: Handle x86_64 subdir.
        * configure: Regenerate.
        * Makefile.in: Regenerate.
        * x86_64/Makefile.am: New file.
        * x86_64/Makefile.in: Regenerate.
        * x86_64/libm_util_amd.h: New file.
        * x86_64/remainder_piby2d2f.c: Likewise.
        * x86_64/remainder_piby2.c: Likewise.
        * x86_64/vrd2log.s: Likewise.
        * x86_64/vrd2log10.s: Likewise.
        * x86_64/vrd4log.s: Likewise.
        * x86_64/vrs4powxf.s: Likewise.
        * x86_64/vrd4log10.s: Likewise.
        * x86_64/vrd2cos.s: Likewise.
        * x86_64/vrs4sincosf.s: Likewise.
        * x86_64/vrd4cos.s: Likewise.
        * x86_64/vrd2sin.s: Likewise.
        * x86_64/vrd4sin.s: Likewise.
        * x86_64/vrs4logf.s: Likewise.
        * x86_64/vrs8logf.s: Likewise.
        * x86_64/vrs4sinf.s: Likewise.
        * x86_64/vrs4expf.s: Likewise.
        * x86_64/vrs8expf.s: Likewise.
        * x86_64/vrs4log2f.s: Likewise.
        * x86_64/vrd2exp.s: Likewise.
        * x86_64/vrs4powf.s: Likewise.
        * x86_64/vrs8log2f.s: Likewise.
        * x86_64/vrd2sincos.s: Likewise.
        * x86_64/vrd4exp.s: Likewise.
        * x86_64/vrd2log2.s: Likewise.
        * x86_64/vrd4log2.s: Likewise.
        * x86_64/vrs4log10f.s: Likewise.
        * x86_64/vrs4cosf.s: Likewise.
        * x86_64/vrs8log10f.s: Likewise.
        * x86_64/mv.map: New version map.


