This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



performance regression: inlining with constant arguments


Hi,

This is a small piece of code from an MPEG-2 decoding library. It uses
an inline function to compile the same code twice, first using the
pavgb instruction, then using the pavgusb instruction. Both
instructions have identical semantics but are supported by different
processors (SSE vs. 3DNow).

This all worked fine with gcc 2.95 and 3.0, but gcc 3.1 apparently
cannot figure out that the processor type is known at compile time:
it loads the processor type value into a register and, a few
instructions later, generates a conditional jump based on that
value. Previous versions of gcc would 'know' the processor type and
directly generate the right instruction.
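
To make the expectation concrete, here is a minimal portable sketch of
the same pattern with the inline asm replaced by plain C so that it runs
on any target. The names avg_op_mmxext, avg_op_3dnow and avg_rows are
hypothetical stand-ins, not part of the library. Since the cpu argument
is a constant at each call site, the hope is that, after inlining, the
comparison is folded away and each wrapper contains only one of the two
operations.

```c
#define CPU_MMXEXT 0
#define CPU_3DNOW 1

/* Hypothetical stand-ins for pavgb and pavgusb: both compute the
   rounded-up average of two unsigned bytes. */
static inline unsigned char avg_op_mmxext (unsigned char a, unsigned char b)
{
    return (unsigned char) ((a + b + 1) >> 1);
}

static inline unsigned char avg_op_3dnow (unsigned char a, unsigned char b)
{
    return (unsigned char) ((a + b + 1) >> 1);
}

/* Generic loop: cpu is an ordinary parameter here, but every caller
   below passes a compile-time constant. height must be at least 1. */
static inline void avg_rows (int height, unsigned char * dest,
			     const unsigned char * ref, int stride, int cpu)
{
    do {
	if (cpu == CPU_MMXEXT)
	    *dest = avg_op_mmxext (*dest, *ref);
	else
	    *dest = avg_op_3dnow (*dest, *ref);
	ref += stride;
	dest += stride;
    } while (--height);
}

/* Specializations: after inlining, the test on cpu should be decidable
   at compile time in each of these. */
void avg_rows_mmxext (unsigned char * dest, const unsigned char * ref,
		      int stride, int height)
{
    avg_rows (height, dest, ref, stride, CPU_MMXEXT);
}

void avg_rows_3dnow (unsigned char * dest, const unsigned char * ref,
		     int stride, int height)
{
    avg_rows (height, dest, ref, stride, CPU_3DNOW);
}
```

Compiling this with optimization and reading the generated assembly for
the two wrappers is one way to check whether the branch on cpu survives.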

I have attached a small code sample, which I think is simple enough
(it compiles to less than 128 bytes with either gcc version):

/* small macros to generate MMX instructions */

#define	mmx_m2r(op,mem,reg) \
	__asm__ __volatile__ (#op " %0, %%" #reg \
			      : /* nothing */ \
			      : "m" (mem))

#define	mmx_r2m(op,reg,mem) \
	__asm__ __volatile__ (#op " %%" #reg ", %0" \
			      : "=m" (mem) \
			      : /* nothing */ )

#define	movq_m2r(var,reg)	mmx_m2r (movq, var, reg)
#define	movq_r2m(reg,var)	mmx_r2m (movq, reg, var)
#define pavgusb_m2r(var,reg)	mmx_m2r (pavgusb, var, reg)
#define	pavgb_m2r(var,reg)	mmx_m2r (pavgb, var, reg)


/* CPU_MMXEXT/CPU_3DNOW adaptation layer; pavg_m2r expects an int
   named cpu to be in scope at the point of use */

#define CPU_MMXEXT 0
#define CPU_3DNOW 1

#define pavg_m2r(src,dest)		\
do {					\
    if (cpu == CPU_MMXEXT)		\
	pavgb_m2r (src, dest);		\
    else				\
	pavgusb_m2r (src, dest);	\
} while (0)


/* motion compensation code */

static inline void MC_avg1_8 (int height, char * dest, char * ref, int stride,
			      int cpu)
{
    do {
	movq_m2r (*ref, mm0);
	pavg_m2r (*dest, mm0);
	ref += stride;
	movq_r2m (mm0, *dest);
	dest += stride;
    } while (--height);
}

/* SSE/3dnow specializations */

void MC_avg_o_8_mmxext (char * dest, char * ref, int stride, int height)
{
    MC_avg1_8 (height, dest, ref, stride, CPU_MMXEXT);
}

void MC_avg_o_8_3dnow (char * dest, char * ref, int stride, int height)
{
    MC_avg1_8 (height, dest, ref, stride, CPU_3DNOW);
}
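
One way to sidestep the dependence on the inliner, sketched here as an
untested assumption rather than a confirmed fix, would be to let the
preprocessor generate each specialization, so that the choice of
operation is made before the compiler proper ever sees a cpu value. The
sketch below is portable C: op_a and op_b are hypothetical stand-ins for
the two asm macros, and DEFINE_MC_AVG and the generated function names
are illustrative only.

```c
/* Hypothetical stand-ins for the two averaging instructions; both
   return the rounded-up average of two unsigned bytes. */
static inline unsigned char op_a (unsigned char a, unsigned char b)
{
    return (unsigned char) ((a + b + 1) >> 1);
}

static inline unsigned char op_b (unsigned char a, unsigned char b)
{
    return (unsigned char) ((a + b + 1) >> 1);
}

/* Expands to one complete function per invocation. OP is substituted
   textually, so no runtime or inlined cpu test is involved.
   height must be at least 1. */
#define DEFINE_MC_AVG(name, OP)					\
void name (unsigned char * dest, const unsigned char * ref,	\
	   int stride, int height)				\
{								\
    do {							\
	*dest = OP (*dest, *ref);				\
	ref += stride;						\
	dest += stride;						\
    } while (--height);						\
}

DEFINE_MC_AVG (mc_avg_variant_a, op_a)
DEFINE_MC_AVG (mc_avg_variant_b, op_b)
```

The cost of this approach is that the loop body lives inside a macro,
which is harder to read and debug than the inline function.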



Cheers,

-- 
Michel "Walken" LESPINASSE
Is this the best that god can do ? Then I'm not impressed.

