This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[PATCH, i386] automatic MMX/x87 FPU mode switching (the real one)


Hello!

This patch implements the much-requested feature of automatic mode switching between
the MMX and x87 register sets. It uses the LCM (lazy code motion) algorithm to insert
(f)emms instructions where appropriate. Thanks also to rth for his valuable
help and to Roger for his encouragement!
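
To illustrate the idea of inserting emms only where the register set actually changes hands, here is a minimal sketch in C. It is not code from the patch: it handles only a straight-line instruction sequence, whereas LCM works on the whole control-flow graph, and it ignores the opposite (x87 to MMX) transition. The names toy_mode, TOY_ANY, TOY_X87, TOY_MMX and toy_insert_emms are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mode names for this sketch; not taken from the patch.  */
enum toy_mode { TOY_ANY, TOY_X87, TOY_MMX };

/* Walk a straight-line sequence of per-instruction mode requirements.
   Set emms_before[i] to 1 if an emms is needed just before instruction i,
   else to 0.  Return the number of emms instructions inserted.  */
size_t
toy_insert_emms (const enum toy_mode *needed, size_t n,
                 enum toy_mode entry, int *emms_before)
{
  enum toy_mode cur = entry;
  size_t count = 0;

  /* The mode on entry must be known.  */
  assert (entry != TOY_ANY);

  for (size_t i = 0; i < n; i++)
    {
      emms_before[i] = 0;

      /* Instructions that do not care leave the current mode alone.  */
      if (needed[i] == TOY_ANY)
        continue;

      /* Only the MMX -> x87 transition costs an emms in this model.  */
      if (cur == TOY_MMX && needed[i] == TOY_X87)
        {
          emms_before[i] = 1;
          count++;
        }
      cur = needed[i];
    }
  return count;
}
```

For the requirement sequence MMX, ANY, ANY, X87, X87, MMX, X87 entered in x87 mode, the sketch marks an emms before the fourth and the seventh instruction and nowhere else.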

This patch now handles ASM patterns, as discussed with rth. The only limitation
is that x87 and MMX registers may not be mixed in the input and output
constraints of a single ASM pattern. Function calls are handled in the same way as
discussed before.
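
The ASM limitation can be pictured as a check over the operand constraint strings. The following is a rough sketch, not the check the patch performs: it assumes the usual i386 constraint letters ('f', 't' and 'u' for x87 registers, 'y' for MMX registers), does a plain character scan, and does not understand multi-letter constraints. The function name toy_asm_mixes_x87_and_mmx is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if the given operand constraints name both x87 registers
   ('f', 't', 'u') and MMX registers ('y'), and 0 otherwise.  */
int
toy_asm_mixes_x87_and_mmx (const char *const *constraints, size_t n)
{
  int uses_x87 = 0;
  int uses_mmx = 0;

  for (size_t i = 0; i < n; i++)
    {
      assert (constraints[i] != NULL);
      if (strpbrk (constraints[i], "ftu") != NULL)
        uses_x87 = 1;
      if (strchr (constraints[i], 'y') != NULL)
        uses_mmx = 1;
    }
  return uses_x87 && uses_mmx;
}
```

Under this model an asm with constraints "=y", "y", "ym" is fine, as is one with "=f", "0", "u", while "=t" together with "y" would be rejected.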

So, the testcase:

--cut here--
#include <math.h>
#include <stdio.h>
#include <mmintrin.h>

__v8qi
aaa (__v8qi x, __v8qi y)
{
  __v8qi mm1;

  mm1 = _mm_add_pi8 (x, y);

  return mm1;
}

int main() {
  __v8qi mm0 = { 1,2,3,4,5,6,7,8 };
  __v8qi mm1 = { 11,12,13,14,15,16,17,18 };

  double a = 0.0;

  union ttt {
    __v8qi mm;
    char x[8];
  } temp;

  temp.mm = mm0;
  temp.x[1] = cos(a);

  temp.mm = aaa (temp.mm, mm1);
  printf ("%i %f\n", temp.x[0], sqrt(temp.x[1]));

  return 0;
}
--cut here--

produces (gcc -O2 -mmmx -ffast-math -fomit-frame-pointer):

aaa:
        paddb %mm1, %mm0
        ret

main:
        pushl %ebp
        movl %esp, %ebp
        subl $24, %esp
        andl $-16, %esp
        subl $16, %esp
        movl $67305985, %edx
        movl $134678021, %ecx
        movb $1, %dh
        movq .LC1, %mm1
        movl %edx, -8(%ebp)
        movl %ecx, -4(%ebp)
        movq -8(%ebp), %mm2
        movq %mm2, %mm0
        call aaa
        movq %mm0, -8(%ebp)
        movl -8(%ebp), %edx
        movsbl %dh, %eax
        cbtw
        emms                       <<< inserted by LCM here
        pushw %ax
        movsbl %dl,%eax
        filds (%esp)
        addl $2, %esp
        movl %eax, 4(%esp)
        movl $.LC2, (%esp)
        fsqrt
        fstpl 8(%esp)
        call printf
        xorl %eax, %eax
        leave
        ret

And the binary works as expected:

./a.out
12 3.605551
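
The output can be checked by hand: temp.x[0] is 1 + 11 = 12, temp.x[1] is cos(0) + 12 = 13, and sqrt(13) is about 3.605551. The immediates 67305985 and 134678021 in the assembly are the little-endian packings of the bytes {1,2,3,4} and {5,6,7,8}; the movb $1, %dh then overwrites byte 1 with the stored cos(0) = 1. The portable sketch below redoes this arithmetic without MMX. The helper names toy_paddb, toy_pack_le32 and toy_testcase_bytes are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Byte-wise add modulo 256, the effect of paddb on eight packed bytes.  */
void
toy_paddb (const uint8_t x[8], const uint8_t y[8], uint8_t out[8])
{
  for (int i = 0; i < 8; i++)
    out[i] = (uint8_t) (x[i] + y[i]);
}

/* Pack four bytes into a 32-bit word, lowest address first.  */
uint32_t
toy_pack_le32 (const uint8_t b[4])
{
  return (uint32_t) b[0] | ((uint32_t) b[1] << 8)
         | ((uint32_t) b[2] << 16) | ((uint32_t) b[3] << 24);
}

/* Recompute temp.x[] as the testcase leaves it after the call to aaa.  */
void
toy_testcase_bytes (uint8_t result[8])
{
  uint8_t mm0[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
  const uint8_t mm1[8] = { 11, 12, 13, 14, 15, 16, 17, 18 };

  /* temp.x[1] = cos (0.0) stores 1 into byte 1.  */
  mm0[1] = 1;
  assert (toy_pack_le32 (mm0) == 0x04030101u);

  toy_paddb (mm0, mm1, result);
}
```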
 

A nice feature of this patch, IMHO, is that a manually inserted emms (via the
_mm_empty() intrinsic) is also handled by the LCM approach. If no emms is needed
at that point, none is emitted. The patch also handles (stupid)
code like:

#include <mmintrin.h>

__v8qi
aaa (__v8qi x, __v8qi y)
{
  __v8qi mm1;

  mm1 = _mm_add_pi8 (x, y);
  _mm_empty ();
  return mm1;
}

to produce correct asm code:

aaa:
        subl $12, %esp
        paddb %mm1, %mm0
        movq %mm0, (%esp)
        emms
        movq (%esp), %mm0
        addl $12, %esp
        ret
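
The treatment of a user-written emms can be pictured with a small model: the request turns into a real emms only if MMX state is live at that point, and is dropped otherwise. This is only a sketch of the behaviour described in this mail for straight-line code, not the patch's implementation; toy_op, its enumerators and toy_count_kept_emms are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operation kinds for this sketch.  */
enum toy_op { TOY_OP_MMX, TOY_OP_X87, TOY_OP_USER_EMMS };

/* Return how many user-written emms requests must remain as real emms
   instructions.  An x87 operation reached with live MMX state is assumed
   to get a compiler-inserted emms, which is not counted here.  */
size_t
toy_count_kept_emms (const enum toy_op *ops, size_t n, int mmx_live_on_entry)
{
  int mmx_live = mmx_live_on_entry != 0;
  size_t kept = 0;

  for (size_t i = 0; i < n; i++)
    switch (ops[i])
      {
      case TOY_OP_MMX:
        mmx_live = 1;
        break;
      case TOY_OP_X87:
        /* The automatic switch clears MMX state before x87 code.  */
        mmx_live = 0;
        break;
      case TOY_OP_USER_EMMS:
        if (mmx_live)
          {
            kept++;
            mmx_live = 0;
          }
        break;
      default:
        assert (!"unknown toy_op");
      }
  return kept;
}
```

In this model a lone user emms with no live MMX state is dropped, an MMX operation followed by a user emms keeps it (as in the function above), and a second emms directly after the first is dropped again.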

The patch was bootstrapped on i686-pc-linux-gnu and regtested for c and c++. It
introduces one new failure into the testsuite (__builtin_apply problems,
gcc.dg/20020218-1.c); otherwise it produced correct code for all testcases I have
thrown at it. I think this patch is ready for wider exposure in current mainline.

For the __builtin_apply () problems, I suggest that the called function (for i386) should
NOT use MMX registers, and that it is always called in FPU_MODE_387. Otherwise,
there is no way to determine MODE_AFTER of such a function.

2005-06-21  Uros Bizjak  <uros@kss-loka.si>

	* mode-switching.c (optimize_mode_switching): Change MODE_AFTER
	to include entity.

	* reg-stack.c (subst_stack_regs): Handle MMX/x87 FPU mode
	switching instructions.

	* config/sh/sh.h: MODE_AFTER: Change define to include entity.

	* config/i386/i386-modes.def: ALLREGS: New RANDOM_MODE.

	* config/i386/i386-protos.h (emit_i387_cw_initialization):
	Remove prototype.
	(ix86_mode_after): New prototype.
	(ix86_mode_entry): New prototype.
	(ix86_mode_exit): New prototype.
	(ix86_emit_mode_set): New prototype.

	* config/i386/i386.h (enum ix86_fpu_mode): New enum.
	(FPU_MODE_DEFAULT): New define.
	(enum ix86_entity): Add new I387_FPU_MODE entity.
	(NUM_MODES_FOR_MODE_SWITCHING): Add FPU_MODE_ANY to
	enable switching for I387_FPU_MODE entity.
	(MODE_AFTER): New define.
	(MODE_ENTRY): New define.
	(MODE_EXIT): New define.
	(EMIT_MODE_SET): Change definition to use ix86_emit_mode_set.
	(HARD_REGNO_NREGS): Return 8 for ALLREGS mode.

	* config/i386/i386.c (ix86_mode_needed): Handle
	entity I387_FPU_MODE.
	(ix86_mode_after): New function.
	(ix86_mode_entry): New function.
	(ix86_mode_exit): New function.
	(ix86_emit_mode_set): Renamed from emit_i387_cw_initialization.
	Handle entity I387_FPU_MODE.
	(ix86_init_machine_status): Set optimize_mode_switching flag
	for I387_FPU_MODE entity if TARGET_MMX.
	(ix86_expand_builtin) [IX86_BUILTIN_FEMMS]: Use "mmx_emms"
	instruction pattern.

	* config/i386/i386.md (UNSPECV_FEMMS): Remove constant.
	(UNSPECV_EFPU, UNSPECV_NOP, FIRSTFP_REG, FIRSTMMX_REG): New
	constants.

	* config/i386/mmx.md ("mmx_emms"): Change instruction definition
	to use UNSPECV_NOP. Set "unit" attribute to i387.
	("efpu", "emms"): New instruction patterns.

Uros.

Attachment: emms.diff
Description: Binary data

