This is the mail archive of the
gcc-help@gcc.gnu.org
mailing list for the GCC project.
Using SSE2 with the old i386 ABI
- From: Florian Weimer <fweimer at redhat dot com>
- To: "gcc-help at gcc dot gnu dot org" <gcc-help at gcc dot gnu dot org>
- Date: Mon, 31 Jul 2017 16:04:54 +0200
- Subject: Using SSE2 with the old i386 ABI
Fedora is considering getting rid of the i686 kernel. If that happens,
all i686 installations will in fact run on x86-64 hardware, so they will
have SSE2.
There was an uncoordinated, breaking ABI change for i386 when the
stack alignment requirements were changed. A lot of software still uses
4-byte stack alignment, perhaps based on this recommendation from the
GCC manual:
    '-mincoming-stack-boundary=NUM'
         Assume the incoming stack is aligned to a 2 raised to NUM byte
         boundary. If '-mincoming-stack-boundary' is not specified, the one
         specified by '-mpreferred-stack-boundary' is used.
         On Pentium and Pentium Pro, 'double' and 'long double' values
         should be aligned to an 8-byte boundary (see '-malign-double') or
         suffer significant run time performance penalties. On Pentium III,
         the Streaming SIMD Extension (SSE) data type '__m128' may not work
         properly if it is not 16-byte aligned.
         …
         This extra alignment does consume extra stack space, and generally
         increases code size. Code that is sensitive to stack space usage,
         such as embedded systems and operating system kernels, may want to
         reduce the preferred alignment to '-mpreferred-stack-boundary=2'.
If we start compiling system libraries with SSE2 support enabled, we
must make sure that they do not assume the stack is aligned to more
than 4 bytes. Would -mincoming-stack-boundary=2 do that?
Will GCC still maintain stack alignment if such code is called with a
properly aligned stack? (This is important so that callbacks can still
use SSE2 with the default ABI.)
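The property being asked about can be sketched as a small C check. This is only an illustration, not code from any existing library: the function names is_aligned_16 and aligned_local_ok are hypothetical, and the 16-byte buffer merely stands in for an SSE2 spill slot or an '__m128' local. The assumption to verify is that, when the file is built with something like "gcc -m32 -msse2 -mincoming-stack-boundary=2", the compiler realigns the stack in the prologue, so the check returns 1 no matter how the caller aligned the stack.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if p is a multiple of 16, else 0. */
int is_aligned_16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/*
 * Hypothetical library entry point. The local buffer requests 16-byte
 * alignment, as an SSE2 temporary would. If the compiler may only assume
 * a 4-byte aligned incoming stack, it has to realign the stack itself
 * for this request to be honoured.
 */
int aligned_local_ok(void)
{
    _Alignas(16) unsigned char buf[16];

    memset(buf, 0, sizeof buf);   /* touch the buffer so it is really allocated */
    return is_aligned_16(buf);
}
```

Calling aligned_local_ok() from a caller built with '-mpreferred-stack-boundary=2' and again from a caller built with the default settings would exercise both directions of the question; inspecting the generated prologue would show whether a realignment is actually emitted.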
Thanks,
Florian