[PATCH] x86_64: Avoid rorx rotation instructions with -Os

Roger Sayle roger@nextmovesoftware.com
Mon Nov 15 13:54:47 GMT 2021


This patch teaches the i386 backend to avoid using BMI2's rorx
instructions when optimizing for size.  The benefits are shown
with the following example:

unsigned int ror1(unsigned int x) { return (x >> 1) | (x << 31); }
unsigned int ror2(unsigned int x) { return (x >> 2) | (x << 30); }
unsigned int rol2(unsigned int x) { return (x >> 30) | (x << 2); }
unsigned int rol1(unsigned int x) { return (x >> 31) | (x << 1); }

which, with -Os -march=cascadelake, currently generates:

ror1:   rorx    $1, %edi, %eax          // 6 bytes
        ret
ror2:   rorx    $2, %edi, %eax          // 6 bytes
        ret
rol2:   rorx    $30, %edi, %eax         // 6 bytes
        ret
rol1:   rorx    $31, %edi, %eax         // 6 bytes
        ret

but with this patch now generates:

ror1:   movl    %edi, %eax              // 2 bytes
        rorl    %eax                    // 2 bytes
        ret
ror2:   movl    %edi, %eax              // 2 bytes
        rorl    $2, %eax                // 3 bytes
        ret
rol2:   movl    %edi, %eax              // 2 bytes
        roll    $2, %eax                // 3 bytes
        ret
rol1:   movl    %edi, %eax              // 2 bytes
        roll    %eax                    // 2 bytes
        ret

I've confirmed that this patch is a win on the CSiBE benchmark,
even though rotations are rare there; for example,
libmspack/test/md5.o shrinks from 5824 bytes to 5632 bytes
(a saving of 192 bytes).

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check with no new failures.  Ok for mainline?


2021-11-15  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/i386/i386.md (*bmi2_rorx<mode>3_1): Make conditional
	on !optimize_function_for_size_p.
	(*<any_rotate><mode>3_1): Add preferred_for_size attribute.
	(define_splits): Conditionalize on !optimize_function_for_size_p.
	(*bmi2_rorxsi3_1_zext): Likewise.
	(*<any_rotate>si3_1_zext): Add preferred_for_size attribute.
	(define_splits): Conditionalize on !optimize_function_for_size_p.

Thanks in advance,
Roger
--

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: patchz.txt
URL: <https://gcc.gnu.org/pipermail/gcc-patches/attachments/20211115/e692d641/attachment.txt>

