[PATCH 0/3] Power10 PCREL_OPT support
Bill Schmidt
wschmidt@linux.ibm.com
Tue Aug 25 04:24:54 GMT 2020
On 8/24/20 11:01 PM, Michael Meissner wrote:
> On Sat, Aug 22, 2020 at 07:05:51PM -0500, Bill Schmidt wrote:
>> What is necessary in order to allow this optimization to occur
>> earlier is to make this hidden dependency explicit. When the
>> relocation is inserted, we have to change the "pld" instruction to
>> have a specific clobber of (in this case) r5, which represents what
>> will happen if the linker makes the substitution.
>>
>> I agree that it's too fragile to force this to be the last pass, so
>> I think if Mike can look into introducing a clobber of the hard
>> register when performing the optimization, that would at least allow
>> us to move this anywhere after reload.
>>
>> I don't immediately see a solution that works prior to register
>> allocation because we basically are representing two potential
>> starting points of a live range, only one of which will survive in
>> the final code. That is too ugly a problem to hand to the register
>> allocator.
> As I said in a private message, I have the appropriate clobbers and such
> already.
Great, thanks! I hadn't realized this was already there -- excellent.
I started cobbling up some code to use the du-chains to simplify things
a little (but only a little). Dealing with intervening loads and stores
is still a little messy. I'll send you something in the next day
or two, I hope.
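
To make the hidden dependency concrete, here is a toy model of the safety check. It is only a sketch: it is not the GCC code, it does not use the du-chains, and every type and function name in it is hypothetical. The assumption is that if the linker makes the substitution, the loaded value is written at the point of the address load, so nothing between the two instructions may touch the value register or the address register. The sketch also rejects any intervening store, which is more conservative than a real pass would need to be.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction record.  All names are hypothetical, not GCC types. */
enum toy_kind { TOY_ADDR_LOAD, TOY_MEM_LOAD, TOY_MEM_STORE, TOY_ALU };

struct toy_insn {
    enum toy_kind kind;
    int dest;   /* register written, or -1 if none */
    int src1;   /* first register read (base register for memory ops), or -1 */
    int src2;   /* second register read, or -1 */
};

/* True if INSN reads or writes register R. */
static bool toy_touches(const struct toy_insn *insn, int r)
{
    return insn->dest == r || insn->src1 == r || insn->src2 == r;
}

/* Return true if the address load at index ADDR and the memory load at
   index LOAD may be treated as a pair.  Liveness of the address register
   after LOAD is not modelled here.  */
bool toy_can_pair(const struct toy_insn *insns, size_t n,
                  size_t addr, size_t load)
{
    assert(addr < load && load < n);

    if (insns[addr].kind != TOY_ADDR_LOAD || insns[load].kind != TOY_MEM_LOAD)
        return false;

    /* The memory load must use the loaded address as its base. */
    if (insns[load].src1 != insns[addr].dest)
        return false;

    int base_reg = insns[addr].dest;
    int value_reg = insns[load].dest;

    for (size_t i = addr + 1; i < load; i++) {
        /* Conservative assumption: any store in between might alias. */
        if (insns[i].kind == TOY_MEM_STORE)
            return false;
        /* The value register is effectively clobbered at ADDR, and the
           address register may never hold the address, so neither may
           be read or written in between.  */
        if (toy_touches(&insns[i], value_reg)
            || toy_touches(&insns[i], base_reg))
            return false;
    }
    return true;
}
```

Feeding it the instruction order from the sched2 dump quoted below (three address loads, then three memory loads) accepts all three pairs; putting a use of r3 between the address load of a and the load through r8 makes it reject that pair.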
Bill
>
> Here is the program I used in my previous reply to Segher:
>
> extern int a, b, c;
>
> int sum (void)
> {
>   return a + b + c;
> }
>
>
> Here is the RTL before the PCREL_OPT pass from sched2:
>
> ;; Load the address of a into r8
> (insn:TI 5 13 6 2 (set (reg/f:DI 8 8 [123])
> (symbol_ref:DI ("a") [flags 0xc0] <var_decl 0x7ff100832480 a>)) "foo02.c":5:12 722 {*pcrel_extern_addr}
> (expr_list:REG_EQUIV (symbol_ref:DI ("a") [flags 0xc0] <var_decl 0x7ff100832480 a>)
> (nil)))
>
> ;; Load the address of b into r10
> (insn 6 5 10 2 (set (reg/f:DI 10 10 [124])
> (symbol_ref:DI ("b") [flags 0xc0] <var_decl 0x7ff100832510 b>)) "foo02.c":5:12 722 {*pcrel_extern_addr}
> (expr_list:REG_EQUIV (symbol_ref:DI ("b") [flags 0xc0] <var_decl 0x7ff100832510 b>)
> (nil)))
>
> ;; Load the address of c into r9
> (insn 10 6 7 2 (set (reg/f:DI 9 9 [128])
> (symbol_ref:DI ("c") [flags 0xc0] <var_decl 0x7ff1008325a0 c>)) "foo02.c":5:16 722 {*pcrel_extern_addr}
> (expr_list:REG_EQUIV (symbol_ref:DI ("c") [flags 0xc0] <var_decl 0x7ff1008325a0 c>)
> (nil)))
>
> ;; Load a's value into r3, using r8 as the base register
> (insn:TI 7 10 8 2 (set (reg:DI 3 3)
> (zero_extend:DI (mem/c:SI (reg/f:DI 8 8 [123]) [1 a+0 S4 A32]))) "foo02.c":5:12 16 {zero_extendsidi2}
> (expr_list:REG_DEAD (reg/f:DI 8 8 [123])
> (nil)))
>
> ;; Load b's value into r10, using r10 as the base register
> (insn 8 7 11 2 (set (reg:DI 10 10)
> (zero_extend:DI (mem/c:SI (reg/f:DI 10 10 [124]) [1 b+0 S4 A32]))) "foo02.c":5:12 16 {zero_extendsidi2}
> (nil))
>
> ;; Load c's value into r9, using r9 as the base register
> (insn 11 8 9 2 (set (reg:DI 9 9)
> (zero_extend:DI (mem/c:SI (reg/f:DI 9 9 [128]) [1 c+0 S4 A32]))) "foo02.c":5:16 16 {zero_extendsidi2}
> (nil))
>
> ;; Add a+b
> (insn:TI 9 11 12 2 (set (reg:SI 3 3 [125])
> (plus:SI (reg:SI 3 3 [orig:126 a ] [126])
> (reg:SI 10 10 [orig:127 b ] [127]))) "foo02.c":5:12 65 {*addsi3}
> (expr_list:REG_DEAD (reg:SI 10 10 [orig:127 b ] [127])
> (nil)))
>
> ;; Add (a+b)+c
> (insn:TI 12 9 18 2 (set (reg:SI 3 3 [122])
> (plus:SI (reg:SI 3 3 [125])
> (reg:SI 9 9 [orig:129 c ] [129]))) "foo02.c":5:16 65 {*addsi3}
> (expr_list:REG_DEAD (reg:SI 9 9 [orig:129 c ] [129])
> (nil)))
>
> ;; Sign extend
> (insn:TI 18 12 19 2 (set (reg/i:DI 3 3)
> (sign_extend:DI (reg:SI 3 3 [122]))) "foo02.c":6:1 31 {extendsidi2}
> (nil))
>
> ;; Return
> (insn 19 18 29 2 (use (reg/i:DI 3 3)) "foo02.c":6:1 -1
> (nil))
> (note 29 19 25 2 NOTE_INSN_EPILOGUE_BEG)
> (jump_insn 25 29 26 2 (simple_return) "foo02.c":6:1 866 {simple_return}
> (nil)
> -> simple_return)
>
>
> And here is the RTL after the PCREL_OPT:
>
> ;; Load of address a into r8, a will be loaded into r3
> (insn:TI 5 13 6 2 (parallel [
> (set (reg/f:DI 8 8 [123])
> (unspec:DI [
> (symbol_ref:DI ("a") [flags 0xc0] <var_decl 0x7ff100832480 a>)
> (const_int 1 [0x1])
> ] UNSPEC_PCREL_OPT_LD_ADDR))
> (set (reg:DI 3 3)
> (unspec:DI [
> (const_int 0 [0])
> ] UNSPEC_PCREL_OPT_LD_ADDR))
> ]) "foo02.c":5:12 2198 {pcrel_opt_ld_addr}
> (expr_list:REG_EQUIV (symbol_ref:DI ("a") [flags 0xc0] <var_decl 0x7ff100832480 a>)
> (nil)))
>
> ;; Load of address b into r10, which will be the same register b's value is loaded into
> (insn 6 5 10 2 (set (reg/f:DI 10 10 [124])
> (unspec:DI [
> (symbol_ref:DI ("b") [flags 0xc0] <var_decl 0x7ff100832510 b>)
> (const_int 2 [0x2])
> ] UNSPEC_PCREL_OPT_LD_ADDR_SAME_REG)) "foo02.c":5:12 2199 {pcrel_opt_ld_addr_same_reg}
> (expr_list:REG_EQUIV (symbol_ref:DI ("b") [flags 0xc0] <var_decl 0x7ff100832510 b>)
> (nil)))
>
> ;; Load of address c into r9, which will be the same register c's value is loaded into
> (insn 10 6 7 2 (set (reg/f:DI 9 9 [128])
> (unspec:DI [
> (symbol_ref:DI ("c") [flags 0xc0] <var_decl 0x7ff1008325a0 c>)
> (const_int 3 [0x3])
> ] UNSPEC_PCREL_OPT_LD_ADDR_SAME_REG)) "foo02.c":5:16 2199 {pcrel_opt_ld_addr_same_reg}
> (expr_list:REG_EQUIV (symbol_ref:DI ("c") [flags 0xc0] <var_decl 0x7ff1008325a0 c>)
> (nil)))
>
> ;; Load & zero extend the variable a into r3, using base register r8
> (insn:TI 7 10 8 2 (parallel [
> (set (reg:DI 3 3)
> (zero_extend:DI (unspec:SI [
> (mem/c:SI (reg/f:DI 8 8 [123]) [1 a+0 S4 A32])
> (reg:DI 3 3)
> (const_int 1 [0x1])
> ] UNSPEC_PCREL_OPT_LD_RELOC)))
> (clobber (reg/f:DI 8 8 [123]))
> ]) "foo02.c":5:12 2207 {*pcrel_opt_ldsi_udi_gpr}
> (expr_list:REG_DEAD (reg/f:DI 8 8 [123])
> (nil)))
>
> ;; Load & zero extend the variable b into r10, using r10 as the base register
> (insn 8 7 11 2 (parallel [
> (set (reg:DI 10 10)
> (zero_extend:DI (unspec:SI [
> (mem/c:SI (reg/f:DI 10 10 [124]) [1 b+0 S4 A32])
> (reg:DI 10 10)
> (const_int 2 [0x2])
> ] UNSPEC_PCREL_OPT_LD_RELOC)))
> (clobber (scratch:DI))
> ]) "foo02.c":5:12 2207 {*pcrel_opt_ldsi_udi_gpr}
> (nil))
>
> ;; Load and zero extend the variable c into r9, using r9 as the base register
> (insn 11 8 9 2 (parallel [
> (set (reg:DI 9 9)
> (zero_extend:DI (unspec:SI [
> (mem/c:SI (reg/f:DI 9 9 [128]) [1 c+0 S4 A32])
> (reg:DI 9 9)
> (const_int 3 [0x3])
> ] UNSPEC_PCREL_OPT_LD_RELOC)))
> (clobber (scratch:DI))
> ]) "foo02.c":5:16 2207 {*pcrel_opt_ldsi_udi_gpr}
> (nil))
>
> ;; Add a+b
> (insn:TI 9 11 12 2 (set (reg:SI 3 3 [125])
> (plus:SI (reg:SI 3 3 [orig:126 a ] [126])
> (reg:SI 10 10 [orig:127 b ] [127]))) "foo02.c":5:12 65 {*addsi3}
> (expr_list:REG_DEAD (reg:SI 10 10 [orig:127 b ] [127])
> (nil)))
>
> ;; Add (a+b)+c
> (insn:TI 12 9 18 2 (set (reg:SI 3 3 [122])
> (plus:SI (reg:SI 3 3 [125])
> (reg:SI 9 9 [orig:129 c ] [129]))) "foo02.c":5:16 65 {*addsi3}
> (expr_list:REG_DEAD (reg:SI 9 9 [orig:129 c ] [129])
> (nil)))
>
> ;; Sign extend the result
> (insn:TI 18 12 19 2 (set (reg/i:DI 3 3)
> (sign_extend:DI (reg:SI 3 3 [122]))) "foo02.c":6:1 31 {extendsidi2}
> (nil))
>
> ;; Return
> (insn 19 18 29 2 (use (reg/i:DI 3 3)) "foo02.c":6:1 -1
> (nil))
> (note 29 19 25 2 NOTE_INSN_EPILOGUE_BEG)
> (jump_insn 25 29 26 2 (simple_return) "foo02.c":6:1 866 {simple_return}
> (nil)
> -> simple_return)
>
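
In the dump after PCREL_OPT, each address load and the memory load that uses it carry the same const_int operand (1 for a, 2 for b, 3 for c). Reading that operand as a marker that ties the two instructions together is an inference from the dump, not something stated in the thread. Under that assumption, matching them up could look roughly like the sketch below; the types and names are hypothetical and it does not use GCC's RTL API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical roles standing in for the unspecs seen in the dump. */
enum toy_role { TOY_LD_ADDR, TOY_LD_RELOC, TOY_OTHER };

struct toy_marked {
    enum toy_role role;
    int marker;   /* the const_int operand; 0 for TOY_OTHER */
};

/* Return the index of the first TOY_LD_RELOC instruction after ADDR that
   carries the same marker as the address load at ADDR, or -1 if there is
   none.  */
int toy_find_partner(const struct toy_marked *insns, size_t n, size_t addr)
{
    assert(addr < n && insns[addr].role == TOY_LD_ADDR);

    for (size_t i = addr + 1; i < n; i++) {
        if (insns[i].role == TOY_LD_RELOC
            && insns[i].marker == insns[addr].marker)
            return (int) i;
    }
    return -1;
}
```

With the eight instructions of the dump in order (address loads marked 1, 2, 3, then memory loads marked 1, 2, 3, then the two adds), the address load of a pairs with the fourth instruction, b with the fifth, and c with the sixth.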