[RFA] Attach MEM_EXPR information when flushing BLKmode args to the stack

Jeff Law jeffreyalaw@gmail.com
Tue Jul 6 04:01:56 GMT 2021



On 7/5/2021 5:17 AM, Richard Biener via Gcc-patches wrote:
> On Fri, Jul 2, 2021 at 6:13 PM Jeff Law <jlaw@tachyum.com> wrote:
>>
>> This is a minor missed optimization we found with our internal port.
>>
>> Given this code:
>>
>> typedef struct {short a; short b;} T;
>>
>> extern void g1();
>>
>> void f(T s)
>> {
>>           if (s.a < 0)
>>                   g1();
>> }
>>
>>
>> "s" is passed in a register, but it's still a BLKmode object because the
>> alignment of T is smaller than the alignment that an integer of the same
>> size would have (16 bits vs 32 bits).
>>
>>
>> Because "s" is BLKmode function.c is going to store it into a stack slot
>> and we'll load it from the that slot for each reference.  So on the v850
>> (just to pick a port that likely has the same behavior we're seeing) we
>> have this RTL from CSE2:
>>
>>
>> (insn 2 4 3 2 (set (mem/c:SI (plus:SI (reg/f:SI 34 .fp)
>>                   (const_int -4 [0xfffffffffffffffc])) [2 S4 A32])
>>           (reg:SI 6 r6)) "j.c":6:1 7 {*movsi_internal}
>>        (expr_list:REG_DEAD (reg:SI 6 r6)
>>           (nil)))
>> (note 3 2 8 2 NOTE_INSN_FUNCTION_BEG)
>> (insn 8 3 9 2 (set (reg:HI 44 [ s.a ])
>>           (mem/c:HI (plus:SI (reg/f:SI 34 .fp)
>>                   (const_int -4 [0xfffffffffffffffc])) [1 s.a+0 S2 A32]))
>> "j.c":7:5 3 {*movhi_internal}
>>        (nil))
>> (insn 9 8 10 2 (parallel [
>>               (set (reg:SI 45)
>>                   (ashift:SI (subreg:SI (reg:HI 44 [ s.a ]) 0)
>>                       (const_int 16 [0x10])))
>>               (clobber (reg:CC 32 psw))
>>           ]) "j.c":7:5 94 {ashlsi3_clobber_flags}
>>        (expr_list:REG_DEAD (reg:HI 44 [ s.a ])
>>           (expr_list:REG_UNUSED (reg:CC 32 psw)
>>               (nil))))
>> (insn 10 9 11 2 (parallel [
>>               (set (reg:SI 43)
>>                   (ashiftrt:SI (reg:SI 45)
>>                       (const_int 16 [0x10])))
>>               (clobber (reg:CC 32 psw))
>>           ]) "j.c":7:5 104 {ashrsi3_clobber_flags}
>>        (expr_list:REG_DEAD (reg:SI 45)
>>           (expr_list:REG_UNUSED (reg:CC 32 psw)
>>               (nil))))
>>
>>
>> Insn 2 is the store into the stack. insn 8 is the load for s.a in the
>> conditional.  DSE1 replaces the MEM in insn 8 with (reg 6) since (reg 6)
>> has the value we want.  After that the store at insn 2 is dead.  Sadly
>> DSE never removes the store.
>>
>> The problem is RTL DSE considers a store with no MEM_EXPR as escaping,
>> which keeps the MEM live.  The lack of a MEM_EXPR is due to call to
>> change_address to twiddle the mode on the MEM for the store at insn 2.
>> It should be safe to copy the MEM_EXPR (which should always be a
>> PARM_DECL) from the original memory to the memory returned by
>> change_address.  Doing so results in DSE1 removing the store at insn 2.
>>
>> It would be nice to remove the stack setup/teardown.   I'm not offhand
>> aware of mechanisms to remove the setup/teardown after we've already
>> allocated a slot, even if the slot is no longer used.
>>
>> Bootstrapped and regression tested on x86, though I don't think that's a
>> particularly useful test.  So I also ran it through my tester across
>> those pesky embedded targets without regressions as well.
>>
>> I didn't include a test simply because I didn't want to have an insane
>> target selector.  I guess if we really wanted a test we could look after
>> DSE1 is done and verify there aren't any MEMs left at all.  Willing to
>> try that if the consensus is we want this tested.
>>
>> OK for the trunk?
> I wonder why the code doesn't use adjust_address instead?  That
> handles most cases already and the code doesn't change the
> address but just the mode (and access size)?
No idea.  It should be easy enough to try that approach though.

jeff



More information about the Gcc-patches mailing list