This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Advice on creating an AMD-specific matmul

From: Thomas Koenig <tkoenig at netcologne dot de>
To: "fortran at gcc dot gnu dot org" <fortran at gcc dot gnu dot org>, gcc mailing list <gcc at gcc dot gnu dot org>
Date: Sat, 20 May 2017 15:45:57 +0200
Subject: Advice on creating an AMD-specific matmul
Authentication-results: sourceware.org; auth=none

Hello world,

I am wondering how best to implement an AMD-specific version
of MATMUL for libfortran.

What we currently have in there is restricted to Intel chips,
with a run-time selection of versions depending on availability
of AVX, AVX2+FMA and AVX512F.

The specific function is then declared using (as an example)

matmul_r4_avx (gfc_array_r4 * const restrict retarray,
        gfc_array_r4 * const restrict a, gfc_array_r4 * const restrict b,
        int try_blas, int blas_limit, blas_call gemm)
        __attribute__((__target__("avx")));
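
For illustration, the run-time selection described above can be sketched
roughly as follows. This is a simplified sketch, not the actual libgfortran
code: the names matmul_fn, MATMUL_BODY, select_matmul and matmul are
hypothetical, the kernels take plain square row-major float matrices instead
of array descriptors, and the cached function pointer is not thread-safe.
It assumes GCC (or a compatible compiler) on x86 with the vectorizer enabled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel type: c = a * b for n x n row-major float matrices. */
typedef void (*matmul_fn) (float *restrict c, const float *restrict a,
                           const float *restrict b, size_t n);

/* The same loop nest is expanded once per target attribute, so the
   compiler may vectorize each copy for a different instruction set. */
#define MATMUL_BODY                                        \
  for (size_t i = 0; i < n * n; i++)                       \
    c[i] = 0.0f;                                           \
  for (size_t i = 0; i < n; i++)                           \
    for (size_t k = 0; k < n; k++)                         \
      for (size_t j = 0; j < n; j++)                       \
        c[i * n + j] += a[i * n + k] * b[k * n + j];

static void
matmul_vanilla (float *restrict c, const float *restrict a,
                const float *restrict b, size_t n)
{ MATMUL_BODY }

__attribute__((__target__("avx"))) static void
matmul_avx (float *restrict c, const float *restrict a,
            const float *restrict b, size_t n)
{ MATMUL_BODY }

__attribute__((__target__("avx2,fma"))) static void
matmul_avx2 (float *restrict c, const float *restrict a,
             const float *restrict b, size_t n)
{ MATMUL_BODY }

__attribute__((__target__("avx512f"))) static void
matmul_avx512f (float *restrict c, const float *restrict a,
                const float *restrict b, size_t n)
{ MATMUL_BODY }

/* Pick the most capable kernel that the running CPU supports. */
static matmul_fn
select_matmul (void)
{
  __builtin_cpu_init ();
  if (__builtin_cpu_supports ("avx512f"))
    return matmul_avx512f;
  if (__builtin_cpu_supports ("avx2") && __builtin_cpu_supports ("fma"))
    return matmul_avx2;
  if (__builtin_cpu_supports ("avx"))
    return matmul_avx;
  return matmul_vanilla;
}

/* Entry point: select a kernel on the first call, then reuse it. */
void
matmul (float *restrict c, const float *restrict a,
        const float *restrict b, size_t n)
{
  static matmul_fn impl;   /* cached choice; not thread-safe in this sketch */
  if (impl == NULL)
    impl = select_matmul ();
  assert (impl != NULL);
  impl (c, a, b, n);
}
```

Note that in this sketch the selection depends only on the instruction sets
the CPU reports, not on the vendor.
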

This scheme doesn't work well with AMD chips because, according to
everything I have read, their performance with 256-bit AVX is worse than
not using AVX at all. FMA would certainly come in handy for matrix
multiplication. Compiling a separate file with -mprefer-avx128 might
work, but would certainly create headaches from mixing m4, CPP and
conditional compilation using autoconf.

As for doing it without a separate file, I don't think something like
-mprefer-avx128 can be specified as a target attribute (but I'd like
to be proven wrong here).

Any advice on how best to proceed?

Regards

	Thomas
