[PATCH 0/3] Power10 PCREL_OPT support

Bill Schmidt wschmidt@linux.ibm.com
Tue Aug 25 04:24:54 GMT 2020


On 8/24/20 11:01 PM, Michael Meissner wrote:
> On Sat, Aug 22, 2020 at 07:05:51PM -0500, Bill Schmidt wrote:
>> What is necessary in order to allow this optimization to occur
>> earlier is to make this hidden dependency explicit.  When the
>> relocation is inserted, we have to change the "pld" instruction to
>> have a specific clobber of (in this case) r5, which represents what
>> will happen if the linker makes the substitution.
>>
>> I agree that it's too fragile to force this to be the last pass, so
>> I think if Mike can look into introducing a clobber of the hard
>> register when performing the optimization, that would at least allow
>> us to move this anywhere after reload.
>>
>> I don't immediately see a solution that works prior to register
>> allocation because we basically are representing two potential
>> starting points of a live range, only one of which will survive in
>> the final code.  That is too ugly a problem to hand to the register
>> allocator.
> As I said in a private message, I have the appropriate clobbers and such
> already.

Great, thanks!  I had failed to understand this was there now -- excellent.

I started cobbling up some code to use the du-chains to simplify things 
a little (but only a little).  Dealing with intervening loads and stores 
still requires a little mess.  I'll send you something in the next day 
or two, I hope.

Bill

>
> Here is the program I used in my previous reply to Segher:
>
> 	extern int a, b, c;
>
> 	int sum (void)
> 	{
> 	  return a + b + c;
> 	}
>
>
> Here is the RTL before the PCREL_OPT pass from sched2:
>
> 	;; Load the address of a into r8
> 	(insn:TI 5 13 6 2 (set (reg/f:DI 8 8 [123])
> 		(symbol_ref:DI ("a") [flags 0xc0]  <var_decl 0x7ff100832480 a>)) "foo02.c":5:12 722 {*pcrel_extern_addr}
> 	     (expr_list:REG_EQUIV (symbol_ref:DI ("a") [flags 0xc0]  <var_decl 0x7ff100832480 a>)
> 		(nil)))
>
> 	;; Load the address of b into r10
> 	(insn 6 5 10 2 (set (reg/f:DI 10 10 [124])
> 		(symbol_ref:DI ("b") [flags 0xc0]  <var_decl 0x7ff100832510 b>)) "foo02.c":5:12 722 {*pcrel_extern_addr}
> 	     (expr_list:REG_EQUIV (symbol_ref:DI ("b") [flags 0xc0]  <var_decl 0x7ff100832510 b>)
> 		(nil)))
>
> 	;; Load the address of c into r9
> 	(insn 10 6 7 2 (set (reg/f:DI 9 9 [128])
> 		(symbol_ref:DI ("c") [flags 0xc0]  <var_decl 0x7ff1008325a0 c>)) "foo02.c":5:16 722 {*pcrel_extern_addr}
> 	     (expr_list:REG_EQUIV (symbol_ref:DI ("c") [flags 0xc0]  <var_decl 0x7ff1008325a0 c>)
> 		(nil)))
>
> 	;; Load a's value into r3, using r8 as the base register
> 	(insn:TI 7 10 8 2 (set (reg:DI 3 3)
> 		(zero_extend:DI (mem/c:SI (reg/f:DI 8 8 [123]) [1 a+0 S4 A32]))) "foo02.c":5:12 16 {zero_extendsidi2}
> 	     (expr_list:REG_DEAD (reg/f:DI 8 8 [123])
> 		(nil)))
>
> 	;; Load b's value into r10, using r10 as the base register
> 	(insn 8 7 11 2 (set (reg:DI 10 10)
> 		(zero_extend:DI (mem/c:SI (reg/f:DI 10 10 [124]) [1 b+0 S4 A32]))) "foo02.c":5:12 16 {zero_extendsidi2}
> 	     (nil))
>
> 	;; Load c's value into r9, using r9 as the base register
> 	(insn 11 8 9 2 (set (reg:DI 9 9)
> 		(zero_extend:DI (mem/c:SI (reg/f:DI 9 9 [128]) [1 c+0 S4 A32]))) "foo02.c":5:16 16 {zero_extendsidi2}
> 	     (nil))
>
> 	;; Add a+b
> 	(insn:TI 9 11 12 2 (set (reg:SI 3 3 [125])
> 		(plus:SI (reg:SI 3 3 [orig:126 a ] [126])
> 		    (reg:SI 10 10 [orig:127 b ] [127]))) "foo02.c":5:12 65 {*addsi3}
> 	     (expr_list:REG_DEAD (reg:SI 10 10 [orig:127 b ] [127])
> 		(nil)))
>
> 	;; Add (a+b)+c
> 	(insn:TI 12 9 18 2 (set (reg:SI 3 3 [122])
> 		(plus:SI (reg:SI 3 3 [125])
> 		    (reg:SI 9 9 [orig:129 c ] [129]))) "foo02.c":5:16 65 {*addsi3}
> 	     (expr_list:REG_DEAD (reg:SI 9 9 [orig:129 c ] [129])
> 		(nil)))
>
> 	;; Sign extend
> 	(insn:TI 18 12 19 2 (set (reg/i:DI 3 3)
> 		(sign_extend:DI (reg:SI 3 3 [122]))) "foo02.c":6:1 31 {extendsidi2}
> 	     (nil))
>
> 	;; Return
> 	(insn 19 18 29 2 (use (reg/i:DI 3 3)) "foo02.c":6:1 -1
> 	     (nil))
> 	(note 29 19 25 2 NOTE_INSN_EPILOGUE_BEG)
> 	(jump_insn 25 29 26 2 (simple_return) "foo02.c":6:1 866 {simple_return}
> 	     (nil)
> 	 -> simple_return)
>
>
> And here is the RTL after the PCREL_OPT:
>
> 	;; Load of address a into r8, a will be loaded into r3
> 	(insn:TI 5 13 6 2 (parallel [
> 		    (set (reg/f:DI 8 8 [123])
> 			(unspec:DI [
> 				(symbol_ref:DI ("a") [flags 0xc0]  <var_decl 0x7ff100832480 a>)
> 				(const_int 1 [0x1])
> 			    ] UNSPEC_PCREL_OPT_LD_ADDR))
> 		    (set (reg:DI 3 3)
> 			(unspec:DI [
> 				(const_int 0 [0])
> 			    ] UNSPEC_PCREL_OPT_LD_ADDR))
> 		]) "foo02.c":5:12 2198 {pcrel_opt_ld_addr}
> 	     (expr_list:REG_EQUIV (symbol_ref:DI ("a") [flags 0xc0]  <var_decl 0x7ff100832480 a>)
> 		(nil)))
>
> 	;; Load of address b into r10, which will be the same register b's value is loaded into
> 	(insn 6 5 10 2 (set (reg/f:DI 10 10 [124])
> 		(unspec:DI [
> 			(symbol_ref:DI ("b") [flags 0xc0]  <var_decl 0x7ff100832510 b>)
> 			(const_int 2 [0x2])
> 		    ] UNSPEC_PCREL_OPT_LD_ADDR_SAME_REG)) "foo02.c":5:12 2199 {pcrel_opt_ld_addr_same_reg}
> 	     (expr_list:REG_EQUIV (symbol_ref:DI ("b") [flags 0xc0]  <var_decl 0x7ff100832510 b>)
> 		(nil)))
>
> 	;; Load of address c into r9, which will be the same register c's value is loaded into
> 	(insn 10 6 7 2 (set (reg/f:DI 9 9 [128])
> 		(unspec:DI [
> 			(symbol_ref:DI ("c") [flags 0xc0]  <var_decl 0x7ff1008325a0 c>)
> 			(const_int 3 [0x3])
> 		    ] UNSPEC_PCREL_OPT_LD_ADDR_SAME_REG)) "foo02.c":5:16 2199 {pcrel_opt_ld_addr_same_reg}
> 	     (expr_list:REG_EQUIV (symbol_ref:DI ("c") [flags 0xc0]  <var_decl 0x7ff1008325a0 c>)
> 		(nil)))
>
> 	;; Load & zero extend the variable a into r3, using base register r8
> 	(insn:TI 7 10 8 2 (parallel [
> 		    (set (reg:DI 3 3)
> 			(zero_extend:DI (unspec:SI [
> 				    (mem/c:SI (reg/f:DI 8 8 [123]) [1 a+0 S4 A32])
> 				    (reg:DI 3 3)
> 				    (const_int 1 [0x1])
> 				] UNSPEC_PCREL_OPT_LD_RELOC)))
> 		    (clobber (reg/f:DI 8 8 [123]))
> 		]) "foo02.c":5:12 2207 {*pcrel_opt_ldsi_udi_gpr}
> 	     (expr_list:REG_DEAD (reg/f:DI 8 8 [123])
> 		(nil)))
>
> 	;; Load & zero extend the variable b into r10, using r10 as the base register
> 	(insn 8 7 11 2 (parallel [
> 		    (set (reg:DI 10 10)
> 			(zero_extend:DI (unspec:SI [
> 				    (mem/c:SI (reg/f:DI 10 10 [124]) [1 b+0 S4 A32])
> 				    (reg:DI 10 10)
> 				    (const_int 2 [0x2])
> 				] UNSPEC_PCREL_OPT_LD_RELOC)))
> 		    (clobber (scratch:DI))
> 		]) "foo02.c":5:12 2207 {*pcrel_opt_ldsi_udi_gpr}
> 	     (nil))
>
> 	;; Load and zero extend the variable c into r9, using r9 as the base register
> 	(insn 11 8 9 2 (parallel [
> 		    (set (reg:DI 9 9)
> 			(zero_extend:DI (unspec:SI [
> 				    (mem/c:SI (reg/f:DI 9 9 [128]) [1 c+0 S4 A32])
> 				    (reg:DI 9 9)
> 				    (const_int 3 [0x3])
> 				] UNSPEC_PCREL_OPT_LD_RELOC)))
> 		    (clobber (scratch:DI))
> 		]) "foo02.c":5:16 2207 {*pcrel_opt_ldsi_udi_gpr}
> 	     (nil))
>
> 	;; Add a+b
> 	(insn:TI 9 11 12 2 (set (reg:SI 3 3 [125])
> 		(plus:SI (reg:SI 3 3 [orig:126 a ] [126])
> 		    (reg:SI 10 10 [orig:127 b ] [127]))) "foo02.c":5:12 65 {*addsi3}
> 	     (expr_list:REG_DEAD (reg:SI 10 10 [orig:127 b ] [127])
> 		(nil)))
>
> 	;; Add (a+b)+c
> 	(insn:TI 12 9 18 2 (set (reg:SI 3 3 [122])
> 		(plus:SI (reg:SI 3 3 [125])
> 		    (reg:SI 9 9 [orig:129 c ] [129]))) "foo02.c":5:16 65 {*addsi3}
> 	     (expr_list:REG_DEAD (reg:SI 9 9 [orig:129 c ] [129])
> 		(nil)))
>
> 	;; Sign extend the result
> 	(insn:TI 18 12 19 2 (set (reg/i:DI 3 3)
> 		(sign_extend:DI (reg:SI 3 3 [122]))) "foo02.c":6:1 31 {extendsidi2}
> 	     (nil))
>
> 	;; Return
> 	(insn 19 18 29 2 (use (reg/i:DI 3 3)) "foo02.c":6:1 -1
> 	     (nil))
> 	(note 29 19 25 2 NOTE_INSN_EPILOGUE_BEG)
> 	(jump_insn 25 29 26 2 (simple_return) "foo02.c":6:1 866 {simple_return}
> 	     (nil)
> 	 -> simple_return)
>


More information about the Gcc-patches mailing list