This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
performance regression: inlining with constant arguments
- From: Michel LESPINASSE <walken at zoy dot org>
- To: gcc at gcc dot gnu dot org, gcc-bugs at gcc dot gnu dot org
- Date: Mon, 17 Jun 2002 02:45:58 -0700
- Subject: performance regression: inlining with constant arguments
Hi,
This is a small piece of code from an MPEG-2 decoding library. It uses
an inline function to compile the same code twice, first using the
pavgb instruction, then using the pavgusb instruction. Both
instructions have identical semantics, but they are supported by
different processors (SSE vs. 3DNow!).
This all worked fine with gcc 2.95 or 3.0, but with 3.1 gcc apparently
cannot figure out that the processor type is known at compile time.
It loads the processor type value into a register and, a few
instructions later, generates a conditional jump based on that
value. Previous versions of gcc would 'know' the processor type and
directly generate the right instruction.
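
To make the expectation concrete without any inline assembly, here is a minimal portable sketch of the same pattern. The names avg_mmxext, avg_3dnow, avg_row and the two wrappers are hypothetical and not part of the library; the two helpers are plain-C stand-ins for pavgb and pavgusb that are written differently on purpose, so the two paths can be told apart in the generated assembly while still returning the same rounded average.

```c
#include <stdint.h>

#define CPU_MMXEXT 0
#define CPU_3DNOW  1

/* Hypothetical stand-in for pavgb: rounded-up average of two bytes. */
static inline uint8_t avg_mmxext (uint8_t a, uint8_t b)
{
    return (uint8_t) ((a + b + 1) >> 1);
}

/* Hypothetical stand-in for pavgusb: same result, different expression. */
static inline uint8_t avg_3dnow (uint8_t a, uint8_t b)
{
    return (uint8_t) ((a | b) - ((a ^ b) >> 1));
}

/* Generic worker. Every caller passes a literal for cpu, so once this
 * is inlined the comparison below has a constant operand and the
 * untaken branch can be removed entirely. */
static inline void avg_row (uint8_t * dest, const uint8_t * ref, int n,
                            int cpu)
{
    int i;

    for (i = 0; i < n; i++) {
        if (cpu == CPU_MMXEXT)
            dest[i] = avg_mmxext (dest[i], ref[i]);
        else
            dest[i] = avg_3dnow (dest[i], ref[i]);
    }
}

/* Specializations: each one should contain a single code path. */
void avg_row_mmxext (uint8_t * dest, const uint8_t * ref, int n)
{
    avg_row (dest, ref, n, CPU_MMXEXT);
}

void avg_row_3dnow (uint8_t * dest, const uint8_t * ref, int n)
{
    avg_row (dest, ref, n, CPU_3DNOW);
}
```

Compiling this with optimization and -S and reading the output for the two wrappers is one way to check the behaviour: if the constant is propagated, neither wrapper should contain a compare against the cpu value.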
I have attached a small code sample, which I think is simple enough
(it compiles to less than 128 bytes with either gcc version):
/* small macros to generate MMX instructions */
#define mmx_m2r(op,mem,reg) \
    __asm__ __volatile__ (#op " %0, %%" #reg \
                          : /* nothing */ \
                          : "m" (mem))
#define mmx_r2m(op,reg,mem) \
    __asm__ __volatile__ (#op " %%" #reg ", %0" \
                          : "=m" (mem) \
                          : /* nothing */ )
#define movq_m2r(var,reg) mmx_m2r (movq, var, reg)
#define movq_r2m(reg,var) mmx_r2m (movq, reg, var)
#define pavgusb_m2r(var,reg) mmx_m2r (pavgusb, var, reg)
#define pavgb_m2r(var,reg) mmx_m2r (pavgb, var, reg)
/* CPU_MMXEXT/CPU_3DNOW adaptation layer */
#define CPU_MMXEXT 0
#define CPU_3DNOW 1
#define pavg_m2r(src,dest) \
do { \
    if (cpu == CPU_MMXEXT) \
        pavgb_m2r (src, dest); \
    else \
        pavgusb_m2r (src, dest); \
} while (0)
/* motion compensation code */
static inline void MC_avg1_8 (int height, char * dest, char * ref, int stride,
                              int cpu)
{
    do {
        movq_m2r (*ref, mm0);
        pavg_m2r (*dest, mm0);
        ref += stride;
        movq_r2m (mm0, *dest);
        dest += stride;
    } while (--height);
}
/* SSE/3dnow specializations */
void MC_avg_o_8_mmxext (char * dest, char * ref, int stride, int height)
{
    MC_avg1_8 (height, dest, ref, stride, CPU_MMXEXT);
}
void MC_avg_o_8_3dnow (char * dest, char * ref, int stride, int height)
{
    MC_avg1_8 (height, dest, ref, stride, CPU_3DNOW);
}
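
One possible workaround, offered only as a sketch and not as what the library does, is to let the preprocessor pick the operation, so the result no longer depends on the optimizer propagating the constant through the inline call. Everything below is hypothetical: DEFINE_MC_AVG, AVG_MMXEXT, AVG_3DNOW and the two generated function names are invented for illustration, and the averaging macros are portable stand-ins for pavgb and pavgusb rather than the real instructions. Each row is 8 bytes wide to mirror the 64-bit movq in the sample above.

```c
#include <stdint.h>

/* Hypothetical stand-ins for pavgb / pavgusb (rounded-up byte average). */
#define AVG_MMXEXT(a, b) ((uint8_t) (((a) + (b) + 1) >> 1))
#define AVG_3DNOW(a, b)  ((uint8_t) (((a) | (b)) - (((a) ^ (b)) >> 1)))

/* Template macro: expands to one complete function per averaging op,
 * so no run-time or optimizer-dependent test on the cpu type remains. */
#define DEFINE_MC_AVG(name, AVG)                                 \
void name (uint8_t * dest, const uint8_t * ref, int stride,      \
           int height)                                           \
{                                                                \
    do {                                                         \
        int i;                                                   \
        for (i = 0; i < 8; i++)                                  \
            dest[i] = AVG (dest[i], ref[i]);                     \
        ref += stride;                                           \
        dest += stride;                                          \
    } while (--height);                                          \
}

DEFINE_MC_AVG (mc_avg_8_mmxext_sketch, AVG_MMXEXT)
DEFINE_MC_AVG (mc_avg_8_3dnow_sketch, AVG_3DNOW)
```

The trade-off is readability: the function body lives inside a macro, which is harder to debug than the inline-function version, so this only makes sense if the compiler cannot be relied on to fold the constant.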
Cheers,
--
Michel "Walken" LESPINASSE
Is this the best that god can do ? Then I'm not impressed.