Bug 33790 - postreload can handle the case where the memory locations use different modes
Summary: postreload can handle the case where the memory locations use different modes
Status: RESOLVED WONTFIX
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization
Version: 4.3.0
Importance: P3 enhancement
Target Milestone: ---
Assignee: Andrew Pinski
URL:
Keywords: missed-optimization
Depends on:
Blocks: 4.4pending
Reported: 2007-10-16 01:36 UTC by Andrew Pinski
Modified: 2008-03-29 08:55 UTC

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2007-10-16 01:37:37


Attachments
Patch (501 bytes, patch)
2007-10-16 01:39 UTC, Andrew Pinski

Description Andrew Pinski 2007-10-16 01:36:04 UTC
Take the following testcase (either on spu-elf or powerpc-linux-gnu with -maltivec):
#define vector __attribute__((__vector_size__(16) ))

typedef vector float vec_float4;
typedef struct {
	vec_float4 data;
} VecFloat4;

typedef struct {
	vec_float4 a;
	vec_float4 b;
} VecFloat4x2;


VecFloat4 test1(VecFloat4 a, VecFloat4 b)
{
	a.data = a.data+b.data;
	return a;
}


VecFloat4x2 test2(VecFloat4x2 data)
{
	data.a = data.a+data.a;
	data.b = data.b+data.b;
	return data;
}

----- cut -----
Right now we generate (for spu-elf; PPC has a similar issue):
_Z5test211VecFloat4x2:
        hbr     .L5,$lr
        stqd    $sp,-128($sp)
        ai      $sp,$sp,-128
        stqd    $3,64($sp)
        stqd    $4,80($sp)
        lqd     $5,80($sp)
        lqd     $4,64($sp)
        fa      $2,$5,$5
        fa      $3,$4,$4
        stqd    $2,48($sp)
        stqd    $3,32($sp)
        lqd     $4,48($sp)
        lqd     $3,32($sp)
        ai      $sp,$sp,128
.L5:
        bi      $lr

---- cut ----
With the patch which I will attach, we get:
_Z5test211VecFloat4x2:
        fa      $2,$3,$3
        hbr     .L5,$lr
        stqd    $sp,-128($sp)
        ai      $sp,$sp,-128
        nop     127
        stqd    $3,64($sp)
        fa      $3,$4,$4
        stqd    $4,80($sp)
        nop     127
        stqd    $2,32($sp)
        ori     $4,$3,0
        stqd    $3,48($sp)
        ori     $3,$2,0
        ai      $sp,$sp,128
.L5:
        bi      $lr


----------- cut ------
Notice that the loads from the stack are gone; the values are now copied between registers (the ori instructions) instead.
Note that DSE could do the same.
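
To make the idea concrete, here is a minimal toy sketch in C of forwarding a stored register to a later load of the same stack slot. It is not the attached patch and not GCC's cselib code. All names (toy_mode, toy_record_store, toy_forward_load, toy_demo) are hypothetical, the choice of TI for the stores and V4SF for the loads is an assumption made for illustration, and register clobbers between the store and the load are not modelled at all. The only point is the difference between requiring identical modes and accepting any two modes of the same size.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical machine modes, loosely named after GCC's; not GCC's real enum. */
enum toy_mode { TOY_SI, TOY_TI, TOY_V4SF };

static int toy_mode_size(enum toy_mode m)
{
    switch (m) {
    case TOY_SI:   return 4;
    case TOY_TI:   return 16;
    case TOY_V4SF: return 16;
    }
    return 0;
}

/* One remembered store: register src_reg was stored at sp+offset in `mode`. */
struct toy_store {
    int offset;
    enum toy_mode mode;
    int src_reg;
};

#define TOY_MAX_STORES 8
static struct toy_store toy_stores[TOY_MAX_STORES];
static int toy_n_stores;

static void toy_reset(void)
{
    toy_n_stores = 0;
}

static void toy_record_store(int offset, enum toy_mode mode, int src_reg)
{
    assert(toy_n_stores < TOY_MAX_STORES);
    toy_stores[toy_n_stores].offset = offset;
    toy_stores[toy_n_stores].mode = mode;
    toy_stores[toy_n_stores].src_reg = src_reg;
    toy_n_stores++;
}

/* Return the register a load of `mode` at sp+offset could be replaced with,
   or -1 if the load has to stay.  With strict_mode != 0 the store and load
   modes must be identical; otherwise equal size at the same offset is enough. */
static int toy_forward_load(int offset, enum toy_mode mode, int strict_mode)
{
    int size = toy_mode_size(mode);
    for (int i = toy_n_stores - 1; i >= 0; i--) {
        const struct toy_store *s = &toy_stores[i];
        int s_size = toy_mode_size(s->mode);
        /* Skip stores that do not overlap the bytes being loaded. */
        if (s->offset + s_size <= offset || offset + size <= s->offset)
            continue;
        /* The most recent overlapping store decides the outcome. */
        if (s->offset != offset || s_size != size)
            return -1;
        if (strict_mode && s->mode != mode)
            return -1;
        return s->src_reg;
    }
    return -1;
}

/* Replays the two spill stores of test2 (offsets 64 and 80, registers 3 and 4)
   and asks whether the reload at offset 80 can become a register move. */
static void toy_demo(void)
{
    toy_reset();
    toy_record_store(64, TOY_TI, 3);
    toy_record_store(80, TOY_TI, 4);
    printf("strict:  %d\n", toy_forward_load(80, TOY_V4SF, 1));
    printf("relaxed: %d\n", toy_forward_load(80, TOY_V4SF, 0));
}
```

Calling toy_demo() from a main function shows the strict lookup giving up on the slot at offset 80 while the relaxed lookup returns register 4, which is the load-to-move replacement visible in the second assembly listing.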
Comment 1 Andrew Pinski 2007-10-16 01:37:37 UTC
Mine.
Comment 2 Andrew Pinski 2007-10-16 01:39:36 UTC
Created attachment 14357 [details]
Patch

This patch has been tested on powerpc64-linux-gnu and on spu-elf with no regressions.  I have not looked into the code size differences, but the patch should only decrease code size, not increase it: it changes a load into a move, and register renaming may then be able to rename registers so that one fewer register is used.
Comment 3 Andrew Pinski 2008-03-29 08:55:54 UTC
Hmm, I wonder how important this is now, after the DSE patch for PR 33927, which basically does the same thing and also runs after reload.  I really don't want to make cselib any slower than it already is.  To me, post-reload CSE is really a hack for reload (or really the RA) not doing its job, so I don't want to slow it down either.

I am going to close this as won't fix for the reasons mentioned above.
Comment 4 pinskia@gmail.com 2008-03-29 09:02:55 UTC
I forgot to mention that the DSE patch fixes the problem earlier on, so we now do the optimization pre-reload. We still have an extra store, but that is recorded in another bug I filed.
