This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH, i386] automatic MMX/x87 FPU mode switching (the real one)
- From: Richard Henderson <rth at redhat dot com>
- To: Uros Bizjak <uros dot bizjak at kss-loka dot si>
- Cc: gcc-patches at gcc dot gnu dot org, rth at gcc dot gnu dot org, roger at eyesopen dot com
- Date: Sun, 26 Jun 2005 22:20:44 -0700
- Subject: Re: [PATCH, i386] automatic MMX/x87 FPU mode switching (the real one)
On Tue, Jun 21, 2005 at 12:43:31PM +0200, Uros Bizjak wrote:
> +(define_insn "efpu"
> + [(set (reg:ALLREGS FIRSTFP_REG)
> + (unspec_volatile:ALLREGS [(const_int 0)] UNSPECV_EFPU))]
> + ""
> + ""
> + [(set_attr "length" "0")])
> +(define_insn "emms"
> + [(set (reg:ALLREGS FIRSTMMX_REG)
> + (unspec_volatile:ALLREGS [(const_int 0)] UNSPECV_EMMS))]
> + "TARGET_MMX"
> + return TARGET_3DNOW ? "femms" : "emms";
Both of these patterns need to use the registers of the opposite unit.
Otherwise the registers won't be live; you'll instead get REG_UNUSED
markers on the insn. So:
  [(set (reg:ALLREGS FIRSTFP_REG)
        (unspec_volatile:ALLREGS [(reg:ALLREGS FIRSTMMX_REG)]
                                 UNSPECV_EFPU))]
You're also missing a change to EPILOGUE_USES to force the registers
of the unit *not* active at the end of the function to be live.
Both of these are required in order to prevent the register allocator
from allocating registers from the unit that is not active.
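As a rough illustration of the EPILOGUE_USES point, here is a minimal sketch in plain C, not the actual i386 backend change. The register numbers, the enum, and the function name epilogue_uses_inactive_unit are all hypothetical; only the rule comes from the text above: registers of the unit that is *not* active at function exit are reported as used by the epilogue, so they stay live.

```c
#include <stdbool.h>

/* Hypothetical register numbering, chosen only for this sketch;
   it is not the real i386 register layout.  */
enum { FIRST_FP = 8, LAST_FP = 15, FIRST_MMX = 16, LAST_MMX = 23 };

/* Which unit owns the shared x87/MMX register file at function exit.  */
enum fpu_unit { UNIT_X87, UNIT_MMX };

static bool is_fp_reg(int regno)
{
    return regno >= FIRST_FP && regno <= LAST_FP;
}

static bool is_mmx_reg(int regno)
{
    return regno >= FIRST_MMX && regno <= LAST_MMX;
}

/* Return true when REGNO belongs to the unit that is NOT active at
   the end of the function.  Treating such registers as used by the
   epilogue keeps them live, which models the effect described above:
   the allocator cannot pick registers from the inactive unit.  */
static bool epilogue_uses_inactive_unit(int regno, enum fpu_unit active_at_exit)
{
    if (active_at_exit == UNIT_X87)
        return is_mmx_reg(regno);
    return is_fp_reg(regno);
}
```

With the x87 unit active at exit, this predicate is true for the (hypothetical) MMX register numbers and false for everything else, and the reverse when the MMX unit is active.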
> +;; This instruction pattern sets attribute "unit" to "i387".
> +;; Based on "unit" attribute, mode switching pass will insert an
> +;;"(f)emms" instruction in appropriate place.
> (define_insn "mmx_emms"
> + [(unspec_volatile [(const_int 0)] UNSPECV_NOP)]
> + ""
> + [(set_attr "unit" "i387")])
I guess this is a reasonable approximation. Though I wonder if we
could actually get away with emitting nothing at all for the builtins.
If we're actually doing our job in the LCM, then we can *always* place
these better than the user. Indeed, emitting nothing would optimize

        static void foo()
        {
          /* use mmx */
          _m_empty (); /* required in source for pedantic correctness */
        }

        void bar()
        {
          for (int i = 0; i < 100; ++i)
            foo ();
        }

where foo gets inlined into bar.
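To make the expected gain concrete, here is a toy model in C. It is not the LCM algorithm and not GCC code: it flattens the inlined loop into a straight-line trace of required modes, and every name in it (fpu_mode, count_source_emms, count_lazy_emms) is hypothetical. It compares one emms per MMX use, as written in the source, against emitting emms only where the trace actually needs the x87 unit back.

```c
#include <stddef.h>

/* Mode each step of a straight-line trace requires.  MODE_ANY means
   the step does not care which unit is active.  */
enum fpu_mode { MODE_ANY, MODE_X87, MODE_MMX };

/* Source-level placement: the user writes _m_empty () after every
   MMX use, so there is one emms per MMX step in the trace.  */
static int count_source_emms(const enum fpu_mode *needs, size_t n)
{
    int emms = 0;
    for (size_t i = 0; i < n; ++i)
        if (needs[i] == MODE_MMX)
            ++emms;
    return emms;
}

/* Lazy placement: track the current mode along the trace and count
   an emms only on a real MMX -> x87 transition.  */
static int count_lazy_emms(const enum fpu_mode *needs, size_t n,
                           enum fpu_mode entry_mode)
{
    enum fpu_mode cur = entry_mode;
    int emms = 0;
    for (size_t i = 0; i < n; ++i) {
        if (needs[i] == MODE_ANY || needs[i] == cur)
            continue;
        if (cur == MODE_MMX && needs[i] == MODE_X87)
            ++emms;
        cur = needs[i];
    }
    return emms;
}
```

For a trace of 100 MMX steps followed by a single step that needs x87 at function exit, entered in x87 mode, count_source_emms gives 100 and count_lazy_emms gives 1. A real mode-switching pass works on the control-flow graph rather than a flat trace, so this only shows the direction of the saving.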