This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Strange R3000 MIPS code generation


Dave,

 After looking at the assembler output, it seems that egcs has
recognized that:

a) the shift ops are not necessary in this case and
b) that the masking with "ff" is also redundant.

 The question is which version is actually faster.
If the gcc code is faster, it could be that egcs is a bit overaggressive :-)

 The lbu to $2 is the assignment to ucVertex0, which is then
used to accumulate the final sum.
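
 To illustrate why the byte loads are a legal replacement, here is a
minimal sketch in C. It assumes a little-endian target (consistent with
the egcs output above, where ">>0" becomes a load from ulVertex+0) and
uses uint32_t where the original used a 32-bit unsigned long. The helper
names (sum_by_shift, sum_by_bytes, byte_by_load, check) are made up for
this example and do not come from the original code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Sum of the four bytes, extracted with shifts and masks as in the C source. */
static unsigned sum_by_shift(uint32_t v)
{
    unsigned char b0 = (v >> 0)  & 0xff;
    unsigned char b1 = (v >> 8)  & 0xff;
    unsigned char b2 = (v >> 16) & 0xff;
    unsigned char b3 = (v >> 24) & 0xff;
    return b0 + b1 + b2 + b3;
}

/* Sum of the four bytes read straight from memory, like the four lbu's. */
static unsigned sum_by_bytes(const uint32_t *p)
{
    unsigned char b[4];
    memcpy(b, p, sizeof b);
    return b[0] + b[1] + b[2] + b[3];
}

/* Byte n read with a single byte load from address p + n. */
static unsigned char byte_by_load(const uint32_t *p, unsigned n)
{
    const unsigned char *bytes = (const unsigned char *)p;
    return bytes[n];
}

/* Returns nonzero when the host stores the low byte at the lowest address. */
static int is_little_endian(void)
{
    uint32_t one = 1;
    unsigned char first;
    memcpy(&first, &one, 1);
    return first == 1;
}

/* Checks that both forms agree for the value v. */
static void check(uint32_t v)
{
    unsigned n;

    /* The sum does not depend on byte order, so this holds everywhere. */
    assert(sum_by_shift(v) == sum_by_bytes(&v));

    /* Per-byte equivalence only holds on a little-endian machine. */
    if (is_little_endian()) {
        for (n = 0; n < 4; n++)
            assert(byte_by_load(&v, n) == ((v >> (8 * n)) & 0xff));
    }
}
```

 Calling check(0x04030201) from a test program exercises both forms.
This only shows that the two forms compute the same value; it says
nothing about which one is faster on the R3000.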

Martin



===
------------------------------------------------------
Martin Knoblauch
email: knobi@knobisoft.de or knobi@rocketmail.com
www:   http://www.knobisoft.de







---Dave Brown <dave@snsys.com> wrote:
>
> Hi,
> 
> I have been trying egcs C & C++ builds for producing mips R3000 code for
> the Sony Playstation, and on the whole the results have been rather good,
> especially for C++.
> 
> Just one problem I've noticed is where you perform a load, as in this C
> example:
> 
> egcs generates 4 single byte loads (lbu) instead of one 4-byte load (lw),
> as GCC does.
> 
> ¦
> ¦unsigned long ulVertex = 0x04030201;
> ¦
> ¦extern  int TestCompiler(void);
> ¦
> ¦int TestCompiler(void)
> ¦{
> ¦  /* We need to keep the extracted vertex indices   */
> ¦  /* around, as we use them for normal lookup later */
> ¦  unsigned char ucVertex0;
> ¦  unsigned char ucVertex1;
> ¦  unsigned char ucVertex2;
> ¦  unsigned char ucVertex3;
> ¦
> ¦  /* In this example I'm just splitting a 32-bit global  */
> ¦  /* but in our rendering code this read/split operation */
> ¦  /* takes place for _every_ polygon.                    */
> ¦  ucVertex0   = ((ulVertex)>>0)&0xff;
> ¦  ucVertex1   = ((ulVertex)>>8)&0xff;
> ¦  ucVertex2   = ((ulVertex)>>16)&0xff;
> ¦  ucVertex3   = ((ulVertex)>>24)&0xff;
> ¦
> ¦  /* I've just put this here to use the results..    */
> ¦  /* otherwise the optimiser removes all the code in */
> ¦  /* this function.                                  */
> ¦  return (ucVertex0 + ucVertex1 + ucVertex2 + ucVertex3);
> ¦}
> ¦
> ¦int main(void)
> ¦{
> ¦        return (0);
> ¦}
> ¦
> 
> egcs 1.1 & 1.1.1 produces :
> 
> TestCompiler:
>         .frame  $sp,0,$31
>         .mask   0x00000000,0
>         .fmask  0x00000000,0
>         lbu     $2,ulVertex
>         lbu     $3,ulVertex+1
>         lbu     $4,ulVertex+2
>         lbu     $5,ulVertex+3
>         addu    $2,$2,$3
>         addu    $2,$2,$4
>         .set    noreorder
>         .set    nomacro
>         j       $31
>         addu    $2,$2,$5
>         .set    macro
>         .set    reorder
> 
> whereas gcc 2.8.1 gives the better code:
> 
> TestCompiler:
>         .frame  $sp,0,$31
>         .mask   0x00000000,0
>         .fmask  0x00000000,0
>         lw      $5,ulVertex
>         lbu     $2,ulVertex
>         srl     $3,$5,8
>         srl     $4,$5,16
>         andi    $3,$3,0x00ff
>         addu    $2,$2,$3
>         andi    $4,$4,0x00ff
>         addu    $2,$2,$4
>         srl     $5,$5,24
>         .set    noreorder
>         .set    nomacro
>         j       $31
>         addu    $2,$2,$5
>         .set    macro
>         .set    reorder
> 
> Does anyone have any idea why egcs is using 4 single byte loads? Or even
> why gcc puts the extra (as far as I can tell) redundant lbu after the lw?
> Both sets of asm were generated using -O2.
> 
> I've tried meddling with the settings in mips.h but to no avail.  Can
> anyone help ?
> 
> Many thanks,
> 
> Dave Brown
> 


