This is the mail archive of the
mailing list for the GCC project.
Re: SH2A: "RTV/N Rn" implementation
On Tue, Jul 24, 2007 at 03:32:03PM +0530, Naveen H.S. wrote:
> Thanks for your valuable suggestion.
> We modified the epilogue as per your suggestions. The RTV/N Rn instruction
> was generated with R0 as the operand in most cases. The redundant
> transfer of register Rn to R0 before the epilogue is still generated,
> so RTV/N does not lead to any reduction in code size.
With this patch
--- sh.md (revision 126809)
+++ sh.md (working copy)
@@ -9375,15 +9375,26 @@
"sh_expand_prologue (); DONE;")
+ [(set (reg:SI R0_REG)
+ (match_operand:SI 0 "register_operand" "r"))
+ "rtv %0"
+ [(set_attr "type" "return")
+ (set_attr "needs_delay_slot" "yes")])
- emit_jump_insn (gen_return ());
+ if (HAVE_return_rtv)
+ emit_jump_insn (gen_return_rtv (gen_rtx_REG (SImode, R0_REG)));
+ emit_jump_insn (gen_return ());
[(use (match_operand 0 "register_operand" ""))]
and this testcase
#include <stdint.h>

int32_t rtvtest3 (int64_t a, int64_t b)
{
  return ((a * b) >> 32);
}
I get (-m2a -O2)
mov r5,r0 ! 58 movsi_ie/2 [length = 2]
mov r6,r3 ! 57 movsi_ie/2 [length = 2]
mulr r0,r3 ! 8 mul_r [length = 2]
mov r7,r0 ! 59 movsi_ie/2 [length = 2]
mulr r0,r4 ! 10 mul_r [length = 2]
dmulu.l r5,r7 ! 46 umulsidi3_i [length = 2]
mov.l r14,@-r15 ! 60 movsi_ie/10 [length = 4]
add r4,r3 ! 11 *addsi3_compact [length = 2]
sts mach,r1 ! 48 movsi_ie/7 [length = 2]
add r1,r3 ! 13 *addsi3_compact [length = 2]
mov r3,r0 ! 28 movsi_ie/2 [length = 2]
mov r15,r14 ! 61 movsi_ie/2 [length = 2]
mov r14,r15 ! 68 movsi_ie/2 [length = 2]
mov.l @r15+,r14 ! 69 movsi_ie/6 [length = 4]
rtv r3 ! 70 return_rtv [length = 4]
Insn 28 should have been deleted. You'll have to ask the dataflow people
why it isn't happening.
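For anyone reading the listing, here is a rough C sketch of the arithmetic it performs. It assumes r4/r6 hold the high words and r5/r7 the low words of a and b, which is what the listing suggests. The function names are made up for illustration, and unsigned types are used so that the wrap-around is well defined in C.

```c
#include <stdint.h>

/* Direct form: bits 32..63 of the 64-bit product, as in the testcase. */
uint32_t high_word_direct (uint64_t a, uint64_t b)
{
  return (uint32_t) ((a * b) >> 32);
}

/* Same value built from 32-bit pieces, mirroring the listing. */
uint32_t high_word_by_parts (uint64_t a, uint64_t b)
{
  uint32_t a_hi = (uint32_t) (a >> 32), a_lo = (uint32_t) a;
  uint32_t b_hi = (uint32_t) (b >> 32), b_lo = (uint32_t) b;

  /* Insns 8 and 10 (mulr): two 32x32->32 cross products, wrapping.  */
  uint32_t cross = a_lo * b_hi + a_hi * b_lo;

  /* Insns 46 and 48 (dmulu.l, sts mach): upper half of a_lo * b_lo.  */
  uint32_t carry = (uint32_t) (((uint64_t) a_lo * b_lo) >> 32);

  /* Insns 11 and 13 (add): the sum ends up in r3.  */
  return cross + carry;
}
```

The sum is already complete in r3 after insn 13, so "rtv r3" can return it directly and the copy to r0 in insn 28 does no useful work.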
> We masked this transfer when the return type of the function is INTEGER_TYPE
> in the function expand_value_return (rtx val) in gcc/stmt.c. This
> resulted in some regression failures. RTV/N Rn is generated only when
> the return type of the function is INTEGER_TYPE. How can we avoid the
> redundant move without any regression failures?
What "regression failures"?
> We tried to get the register Rn from the function expand_value_return
> (rtx val) in gcc/stmt.c. The register Rn can be used as the operand in
> "return_rtv". The Rn register obtained from the above function is a
> PSEUDO register. Kindly suggest a way to get a HARD register instead of a
> PSEUDO register.
true_regnum(), but I doubt that is the way to go.
Rask Ingemann Lambertsen