Strange R3000 MIPS code generation

Dave Brown dave@snsys.com
Sun Jan 31 23:58:00 GMT 1999


Hi,

I have been trying egcs C and C++ builds to produce MIPS R3000 code for
the Sony PlayStation, and on the whole the results have been rather good,
especially for C++.

The one problem I've noticed is with loads. For the C example below, egcs
generates four single-byte loads (lbu) where gcc 2.8.1 generates one 4-byte
load (lw):

unsigned long ulVertex = 0x04030201;

extern  int TestCompiler(void);

int TestCompiler(void)
{
  /* We need to keep the extracted vertex indices   */
  /* around, as we use them for normal lookup later */
  unsigned char ucVertex0;
  unsigned char ucVertex1;
  unsigned char ucVertex2;
  unsigned char ucVertex3;

  /* In this example I'm just splitting a 32-bit global  */
  /* but in our rendering code this read/split operation */
  /* takes place for _every_ polygon.                    */
  ucVertex0   = ((ulVertex)>>0)&0xff;
  ucVertex1   = ((ulVertex)>>8)&0xff;
  ucVertex2   = ((ulVertex)>>16)&0xff;
  ucVertex3   = ((ulVertex)>>24)&0xff;

  /* I've just put this here to use the results..    */
  /* otherwise the optimiser removes all the code in */
  /* this function.                                  */
  return (ucVertex0 + ucVertex1 + ucVertex2 + ucVertex3);
}

main()
{
        return (0);
}

egcs 1.1 and 1.1.1 produce:

TestCompiler:
        .frame  $sp,0,$31
        .mask   0x00000000,0
        .fmask  0x00000000,0
        lbu     $2,ulVertex
        lbu     $3,ulVertex+1
        lbu     $4,ulVertex+2
        lbu     $5,ulVertex+3
        addu    $2,$2,$3
        addu    $2,$2,$4
        .set    noreorder
        .set    nomacro
        j       $31
        addu    $2,$2,$5
        .set    macro
        .set    reorder

whereas gcc 2.8.1 gives the better code:

TestCompiler:
        .frame  $sp,0,$31
        .mask   0x00000000,0
        .fmask  0x00000000,0
        lw      $5,ulVertex
        lbu     $2,ulVertex
        srl     $3,$5,8
        srl     $4,$5,16
        andi    $3,$3,0x00ff
        addu    $2,$2,$3
        andi    $4,$4,0x00ff
        addu    $2,$2,$4
        srl     $5,$5,24
        .set    noreorder
        .set    nomacro
        j       $31
        addu    $2,$2,$5
        .set    macro
        .set    reorder

Does anyone have any idea why egcs is using four single-byte loads?  Or why
gcc emits an extra lbu after the lw, which as far as I can tell is redundant?
Both sets of asm were generated using -O2.
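
For reference, here is a minimal sketch of why the two sequences compute the
same values, assuming a little-endian target (the lbu of ulVertex+0 for the
low byte in both listings suggests this). The helper names and the use of
uint32_t in place of the 32-bit unsigned long are assumptions made for
illustration only; they are not part of the test case above.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: uint32_t stands in for the 32-bit unsigned long of the R3000. */

/* Byte i (0..3) of v by shift and mask, as written in TestCompiler(). */
unsigned char byte_by_shift(uint32_t v, unsigned i)
{
    return (unsigned char)((v >> (8u * i)) & 0xffu);
}

/* Byte i (0..3) of v read from memory at offset i, which is what
   "lbu ulVertex+i" does. */
unsigned char byte_by_load(uint32_t v, unsigned i)
{
    unsigned char bytes[sizeof v];
    memcpy(bytes, &v, sizeof v);
    return bytes[i];
}

/* Returns 1 when the host stores the least significant byte first. */
int is_little_endian(void)
{
    uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

On a little-endian host, byte_by_shift(0x04030201, i) and
byte_by_load(0x04030201, i) agree for i = 0..3, so the egcs sequence would be
a valid, if not necessarily faster, translation of the source.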

I've tried meddling with the settings in mips.h, but to no avail.  Can
anyone help?
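
One source-level variation that might be worth trying is sketched below: the
global is read once through a volatile-qualified lvalue into a local, and the
bytes are split from that local. This is an untested assumption, not a
confirmed fix; compilers generally avoid splitting or repeating a volatile
access, but whether egcs 1.1 then emits a single lw has not been checked. The
function name TestCompilerVolatile is hypothetical.

```c
/* Same global as in the test case above. */
unsigned long ulVertex = 0x04030201;

/* Hypothetical variant of TestCompiler(): one volatile read, then split. */
int TestCompilerVolatile(void)
{
    /* Read the whole word once; the volatile qualifier asks the
       compiler not to break up or duplicate this access. */
    unsigned long ulLocal = *(volatile unsigned long *)&ulVertex;

    unsigned char ucVertex0 = (ulLocal >> 0)  & 0xff;
    unsigned char ucVertex1 = (ulLocal >> 8)  & 0xff;
    unsigned char ucVertex2 = (ulLocal >> 16) & 0xff;
    unsigned char ucVertex3 = (ulLocal >> 24) & 0xff;

    return (ucVertex0 + ucVertex1 + ucVertex2 + ucVertex3);
}
```

The result is the same as TestCompiler() for the same global; only the
generated load sequence would be expected to differ, and that needs checking
in the -O2 assembly output.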

Many thanks,

Dave Brown


