Bug 30271 - -mstrict-align can an store extra for struct agrument passing
Summary: -mstrict-align can an store extra for struct agrument passing
Status: ASSIGNED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 4.3.0
: P3 normal
Target Milestone: ---
Assignee: Kenneth.Zadeck@NaturalBridge.com
URL:
Keywords: missed-optimization
Depends on: 38532
Blocks: 16996 argument, return 29215
  Show dependency treegraph
 
Reported: 2006-12-21 02:49 UTC by Andrew Pinski
Modified: 2022-03-08 16:20 UTC (History)
7 users (show)

See Also:
Host:
Target: powerpc64-linux-gnu
Build:
Known to work:
Known to fail:
Last reconfirmed: 2007-10-15 21:30:29


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Andrew Pinski 2006-12-21 02:49:05 UTC
Testcase:
struct a { short t, t1;};
int f(struct a b) { return b.t; }
Without -mstrict-align, we get:
.L.f:
        srawi 3,3,16
        extsw 3,3
        blr

But with we get:
.L.f:
        stw 3,48(1)
        nop
        nop
        nop
        lha 3,48(1)
        extsw 3,3
        blr
Comment 1 Andrew Pinski 2006-12-21 22:23:13 UTC
The problem is that we force the struct to be BLKmode from DImode with:
  /* If structure's known alignment is less than what the scalar
     mode would need, and it matters, then stick with BLKmode.  */
  if (TYPE_MODE (type) != BLKmode
      && STRICT_ALIGNMENT
      && ! (TYPE_ALIGN (type) >= BIGGEST_ALIGNMENT
	    || TYPE_ALIGN (type) >= GET_MODE_ALIGNMENT (TYPE_MODE (type))))
    {
      /* If this is the only reason this type is BLKmode, then
	 don't force containing types to be BLKmode.  */
      TYPE_NO_FORCE_BLK (type) = 1;
      TYPE_MODE (type) = BLKmode;
    }

Which causes us to store this to the stack all the time.
Comment 2 Andrew Pinski 2007-07-24 02:14:10 UTC
This is improved by
http://gcc.gnu.org/ml/gcc-patches/2007-07/msg00573.html

But for:
struct a
{
  int t, t1;
};

int f(struct a b)
{
  return b.t + b.t1;
}

We still get a store/load.
Comment 3 Andrew Pinski 2007-07-24 02:14:51 UTC
(In reply to comment #2)
> This is improved by
> http://gcc.gnu.org/ml/gcc-patches/2007-07/msg00573.html
Also the store is still there:
.L.f:
        srawi 0,3,16
        stw 3,48(1)
        extsw 3,0
        blr
Comment 4 Andrew Pinski 2007-10-15 21:30:29 UTC
The store is still there but there is no longer an extra load.
The load was fixed for the testcase in comment #2 by:
2007-10-12  Richard Sandiford  <rsandifo@nildram.co.uk>

        * dse.c (find_shift_sequence): Reinstate "<= UNITS_PER_WORD" condition.
Comment 5 Andrew Pinski 2007-10-15 21:36:21 UTC
Note comment #1 compiles to:
        srdi 0,3,32
        std 3,48(1)
        add 3,3,0
        extsw 3,3
        blr
Comment 6 Kenneth Zadeck 2007-10-29 14:09:22 UTC
These stores to the stack are not really anything that dse can handle.  It is good at removing stores addressed off the frame pointer that go dead when the function returns, but it must be more conservative with stack pointer based addresses.

This is a result of the fact that gcc does not allow the program to play as many games with values accessed off of the frame as it does with values accessed off of the stack.  In fact, i think it would be wrong for dse to remove these stores because (i believe) that the caller is allowed to see values accessed off the stack pointer when the function returns.

On the other hand, if i am wrong about this, it would not be hard to teach dse to do some of the same tricks to stack based addresses as it currently does for frame based addresses but i think that i pressed iant pretty hard on this stuff when i was writing this pass and this was all he would allow me to do.

I think that if this bug is going to be "cleared", the right way to go is do better analysis about generating those stores in the first place.  
Comment 7 Kenneth Zadeck 2007-10-30 00:38:44 UTC
Subject: Re:  -mstrict-align can an extra for struct agrument
 passing

zadeck at naturalbridge dot com wrote:
> ------- Comment #6 from zadeck at naturalbridge dot com  2007-10-29 14:09 -------
> These stores to the stack are not really anything that dse can handle.  It is
> good at removing stores addressed off the frame pointer that go dead when the
> function returns, but it must be more conservative with stack pointer based
> addresses.
>
> This is a result of the fact that gcc does not allow the program to play as
> many games with values accessed off of the frame as it does with values
> accessed off of the stack.  In fact, i think it would be wrong for dse to
> remove these stores because (i believe) that the caller is allowed to see
> values accessed off the stack pointer when the function returns.
>
> On the other hand, if i am wrong about this, it would not be hard to teach dse
> to do some of the same tricks to stack based addresses as it currently does for
> frame based addresses but i think that i pressed iant pretty hard on this stuff
> when i was writing this pass and this was all he would allow me to do.
>
> I think that if this bug is going to be "cleared", the right way to go is do
> better analysis about generating those stores in the first place.  
>
>
>   
ian t and i discussed this earlier tonight.

ian does not believe that the solution that we will pursue for this bug
merits inclusion into stage 3 so i am writing here so we can come back
for it when stage 1 opens.

one of the features that is encoded into the current implementation of
dse is the ability to apply very aggressive treatment to addresses that
have been given their own magical alias sets.  We use these magical
alias sets in the current dse to mark the stack locations that reload
uses during spilling as being known never to be aliased.    This allows
us to bypass the currently conservative aliasing code and treat these
locations as if they have completely transparent semantics, which in
fact they do.

our plan is to mark the locations referenced off of virtual registers as
having these magical alias sets iff there is no place in the code that
takes the address of one of these locations.  A quick scan at the start
of pass_instantiate_virtual_regs should be able to identify the places
where the address are taken. 

This should allow us to clear any dead when the function returns or
globally redundant stores of the variables addressed off these virtual
registers. 

There may be other places where these magical alias sets are applicable;
if anyone wants to suggest some, we will look into it.

Comments?

kenny
Comment 8 Andrew Pinski 2008-12-11 04:42:41 UTC
So the problem with the stores here is that the base is arg_pointer_rtx which is still a frame pointer related offset.  I think the same can be said is true of stack_pointer_rtx too.  We only set frame_related for frame_pointer_rtx and hard_frame_pointer_rtx but arg_pointer_rtx will become a frame pointer later one too.  The only issue is that there might not be correct dependencies with respect of arg_pointer_rtx.

Kenny,
  Do you agree?
Comment 9 Kenneth Zadeck 2008-12-15 15:32:48 UTC
Andrew, 

What is your point here?

1) Is it your claim that anything that is arg_pointer_rtx related would automatically qualify as being safe enough to remove dead stores to?

or

2) Is it your claim that if we could generalize the game proposed in comment #7 to cover the arg_pointer_rtx's also?

Kenny
Comment 10 Andrew Pinski 2008-12-16 02:08:02 UTC
(In reply to comment #9)
> Andrew, 
> 
> What is your point here?

My point here is that currently we do:
      gi->frame_related =
        (base == frame_pointer_rtx) || (base == hard_frame_pointer_rtx);

But if we change it to be:
      gi->frame_related =
        (base == frame_pointer_rtx) || (base == hard_frame_pointer_rtx)
        || (base == arg_pointer_rtx && fixed_regs[ARG_POINTER_REGNUM]);

It would delete the store (at least in a 4.3 based compiler).  arg_pointer_rtx is the incoming argument space so if it is a fixed register it will be also frame related and we can safely delete the stores to this space.
Comment 11 Andrew Pinski 2008-12-29 23:51:51 UTC
(In reply to comment #10)
Hmm, this patch bootstraps just fine on the 4.3 branch but causes miscompares on the trunk (the 4.3 branch had checking on also when I bootstrapped it on i386-darwin8.11).
Comment 12 Xionghu Luo (luoxhu@gcc.gnu.org) 2020-05-21 06:15:14 UTC
Fixed at least from GCC 4.9.4?
$ /opt/at8.0/bin/gcc -O3 -c -S  pr30271.c -mstrict-align
$ cat pr30271.s
        .file   "pr30271.c"
        .abiversion 2
        .section        ".toc","aw"
        .section        ".text"
        .align 2
        .p2align 4,,15
        .globl f
        .type   f, @function
f:
        extsh 9,3
        srawi 3,3,16
        add 3,9,3
        extsw 3,3
        blr
        .long 0
        .byte 0,0,0,0,0,0,0,0
        .size   f,.-f
        .ident  "GCC: (GNU) 4.9.4 20150824 (Advance-Toolchain-at8.0) [ibm/gcc-4_9-branch, revision: 227153 merged from gcc-4_9-branch, revision 227151]"
        .section        .note.GNU-stack,"",@progbits