This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: How to implement efficiently builtins for dual-result instructions ?


Hi, Paolo and Ian,

Thanks a lot for your suggestions. I have tried the one suggested by Paolo.
It improved my code considerably by eliminating the loads and stores.
However, the generated code is still not as efficient as I would like it to be.

For the following function
int bar()
{
  int r1, r2;
  long long tmp;
  tmp = __super_ld32_di(&addr);
  *(&r1) = (int)tmp;
  *(&r2) = (int)(tmp>>32);
  return  r1+r2;
}

I get two unnecessary register-copy ("ident") instructions:
_bar:
        .function
        /***************************/
        uimm(_addr) -> r64
        super_ld32_di r64 -> r64 r65
        ident r64 -> r66
        ident r65 -> r5
        iadd r5 r66 -> r5
        /***************************/
        ijmpf r0 r2             [FREQ=10000]
        .endfunction

Here is what I saw happening: the low and high subregs of the DI pseudo
that is the destination of the super_ld were explicitly copied to
separate SI pseudos.
I expected these RTL instructions to be removed by copyprop or some other
optimization pass, but that did not happen, and they became the
superfluous "ident" instructions. Do you know what the reason for that
could be?

In Paolo's response, I also don't understand how the DI pseudo could be
mapped onto two non-consecutive SI regs. I thought gcc always maps a
multiword pseudo onto consecutive word-size regs. Am I wrong here?

Dmitry




On 2/4/08, Paolo Bonzini <bonzini@gnu.org> wrote:
>
> > To invoke this instruction from the source level, a compiler builtin
> > is provided.
> > Since C syntax doesn't provide functions with two results, this builtin refers
> > to them via pointers: __super_ld32 (int *x, int *y, int *a)
>
> I did something similar in a private port by folding the builtin to
>
>   long long tmp = __super_ld32_dimode (a);
>   *x = (int) tmp;
>   *y = (int) (tmp >> 32);
>
> Then you create the builtin as returning a DImode, but the lower-subreg
> pass will be able to split the DImode pseudo into non-consecutive hard
> registers.
>
> Paolo
>
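
For readers without the target at hand, the folding Paolo describes can be sketched in portable C. This is only an illustration under stated assumptions: fake_super_ld32_di and fake_super_ld32 are hypothetical stand-ins for the real builtins, and the sketch assumes the instruction loads two adjacent 32-bit words, with the first one ending up in the low half of the 64-bit result. It shows the source-level rewriting only and says nothing about how the RTL passes allocate registers.

```c
#include <stdint.h>

/* Hypothetical stand-in for the DImode-returning builtin.
   Assumption: it loads two adjacent 32-bit words and packs them
   into one 64-bit value, first word in the low half. */
static int64_t fake_super_ld32_di(const int32_t *a)
{
    uint64_t lo = (uint32_t)a[0];
    uint64_t hi = (uint32_t)a[1];
    /* Unsigned-to-signed conversion of an out-of-range value is
       implementation-defined; GCC wraps modulo 2^64. */
    return (int64_t)((hi << 32) | lo);
}

/* Hypothetical pointer-style interface, folded into the
   DImode-returning form as in the quoted suggestion. */
static void fake_super_ld32(int32_t *x, int32_t *y, const int32_t *a)
{
    int64_t tmp = fake_super_ld32_di(a);
    *x = (int32_t)tmp;          /* low word  */
    *y = (int32_t)(tmp >> 32);  /* high word */
}

/* Same shape as bar() in the message above: load both results
   and return their sum. */
static int32_t demo_sum(const int32_t *addr)
{
    int32_t r1, r2;
    fake_super_ld32(&r1, &r2, addr);
    return r1 + r2;
}
```

Calling demo_sum on an array holding 7 and -3 exercises both halves, including sign handling in the high word. On a real port the two extractions would ideally become plain subreg accesses of the DImode result rather than shifts.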

