This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



egcs-1.1.1 on x86 and 64-bit operations


Hi,

This is not a bug report but a suggestion for improving code
generation.  My machine is an Intel Pentium II running Linux.

I'm developing an application where some 64-bit operations are needed
and most of its parts are really performance critical.  When looking
at the generated code I was somewhat disappointed by the 64-bit ops.

I've compiled everything with -O2 -fomit-frame-pointer.

The C code:

	void g(int);
	unsigned long long d, e;
	int shifttab[256];

Then

	void f0(void) { if (d&1) g(0); }

produces

	testb $1,d
	je .L2
	pushl $0
	call g
	addl $4,%esp

That's ok.  But

	void f1(void) { if (!(d&1)) g(1); }

is compiled into

        movl d,%edx
        movl d+4,%ecx
        andl $1,%edx
        andl $0,%ecx
        movl %edx,%eax
        orl %ecx,%eax
        jne .L4
        pushl $1
        call g
        addl $4,%esp

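which is clearly not optimal: the andl $0 on the high word and the
movl/orl pair are unnecessary, since the same testb as in f0 with the
branch sense inverted would do.  Also, when compiling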

	void f2(void) { if (d&e) g(2); }

I'd like the unnecessary movl to be removed:

        movl d,%edx
        movl d+4,%ecx
        andl e,%edx
        andl e+4,%ecx
        movl %edx,%eax		# remove this
        orl %ecx,%eax		# replace by orl %edx,%ecx
        je .L6
        pushl $2
        call g
        addl $4,%esp
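
One possible source-level workaround for the f2 case, as a sketch only:
split both operands into 32-bit halves by hand and OR the two partial
ANDs, which is the computation the comments above ask for.  The names
and_nonzero_64, and_nonzero_split and check_equivalent are made up for
this illustration, and whether a given compiler version emits better
code for the split form is an assumption that has to be checked in the
assembler output.

```c
#include <assert.h>
#include <stdint.h>

/* Reference form: the test exactly as written in f2. */
int and_nonzero_64(uint64_t d, uint64_t e)
{
    return (d & e) != 0;
}

/* Hand-split form: AND the low and high words separately, then OR the
   two 32-bit results; the OR is nonzero iff d and e share a 1-bit. */
int and_nonzero_split(uint64_t d, uint64_t e)
{
    uint32_t lo = (uint32_t)d & (uint32_t)e;
    uint32_t hi = (uint32_t)(d >> 32) & (uint32_t)(e >> 32);

    return (lo | hi) != 0;
}

/* Hypothetical helper: asserts that both forms agree for one input pair. */
void check_equivalent(uint64_t d, uint64_t e)
{
    assert(and_nonzero_64(d, e) == and_nonzero_split(d, e));
}
```

Both functions return the same truth value for every input, so the
split form can stand in for (d&e) in the if condition.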

Also, bitwise AND with larger constants, without a ~, generates an
AND $0 and the unnecessary movl:

	void f3(void) { if (d & 0xFE) g(3); }

produces

        movl d,%edx
        movl d+4,%ecx
        andl $254,%edx
        andl $0,%ecx
        movl %edx,%eax
        orl %ecx,%eax
        je .L8
        pushl $3
        call g
        addl $4,%esp

while the following looks OK:

	void f4(void) { g(shifttab[d & 0xFE]); }

produces

        movl d,%eax
        andl $254,%eax
        movl shifttab(,%eax,4),%eax
        pushl %eax
        call g
        addl $4,%esp
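
The f4 result hints at why the constant cases could be cheaper: a mask
such as 1 or 0xFE has no bits in the high word, so only the low 32 bits
of d can affect the test.  A sketch of making that explicit in the
source follows.  The function names are invented for this example, and
it is an assumption, not something verified here, that the truncated
form avoids the andl $0 / movl / orl sequence.

```c
#include <stdint.h>

/* Test as written in f3: 64-bit AND with a constant that fits in 8 bits. */
int test_mask_64(uint64_t d)
{
    return (d & 0xFE) != 0;
}

/* Truncate first: the mask has no high-word bits, so the result is the same. */
int test_mask_low(uint64_t d)
{
    return ((uint32_t)d & 0xFE) != 0;
}

/* Negated single-bit test as in f1, again looking at the low word only. */
int low_bit_clear(uint64_t d)
{
    return !((uint32_t)d & 1);
}
```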


The case in my code is a loop that has to examine all 1-bits in a
64-bit int and perform some operation for each of them.  I tried two
alternatives, both of which produce suboptimal code:

void f5(void)
{
    int i;
    unsigned long long v = d;	/* work on a local copy of the global d */

    for (i = 0; v != 0; v & 0xFE ? (v >>= 1, i++) : (v >>= 8, i += 8)) {
	if (!(v&1))
	    continue;
	g(i);
    }
}

void f6(void)
{
    int i, s;
    unsigned long long v = d;	/* work on a local copy of the global d */

    for (i = 0; v != 0; s = shifttab[v & 0xFE], v >>= s, i += s) {
	if (!(v&1))
	    continue;
	g(i);
    }
}
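
The contents of shifttab are not shown above.  A plausible reading of
f6, given here only as an assumption, is that shifttab[x] holds the
position of the lowest 1-bit of x (x already has bit 0 masked off), or
8 when x is 0 so that an empty low byte is skipped in one step.  The
sketch below fills the table that way and records bit positions in an
array instead of calling g; init_shifttab and scan_bits are names
invented for this example.

```c
#include <stdint.h>

static int shifttab[256];

/* Assumed table layout: shifttab[x] = position of the lowest 1-bit of x,
   or 8 if x == 0.  Only even x are looked up, since bit 0 is masked off. */
void init_shifttab(void)
{
    int x, s;

    shifttab[0] = 8;
    for (x = 1; x < 256; x++) {
        for (s = 0; !((x >> s) & 1); s++)
            ;
        shifttab[x] = s;
    }
}

/* Same loop shape as f6, but storing each bit position in out[]
   (room for 64 entries) instead of calling g.  Returns the count. */
int scan_bits(uint64_t d, int *out)
{
    int i, s, n = 0;

    for (i = 0; d != 0; s = shifttab[d & 0xFE], d >>= s, i += s) {
        if (!(d & 1))
            continue;
        out[n++] = i;
    }
    return n;
}
```

After init_shifttab(), scan_bits(10, out) stores the positions 1 and 3
(10 is binary 1010) and returns 2.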


Is someone working on this, and can better code generation for these
cases be expected in future versions of egcs?


urs

