This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


[Bug rtl-optimization/71768] Missed trivial rematerialiation oppurtunity

From: "hubicka at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
To: gcc-bugs at gcc dot gnu dot org
Date: Tue, 05 Jul 2016 18:18:08 +0000
Subject: [Bug rtl-optimization/71768] Missed trivial rematerialiation oppurtunity
Auto-submitted: auto-generated
References: <bug-71768-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71768

Jan Hubicka <hubicka at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-07-05
                 CC|                            |vmakarov at redhat dot com
     Ever confirmed|0                           |1

--- Comment #1 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
I looked into this a bit since it seemed easy to fix, but I don't see how
exactly rematerialization triggers here.
The following testcase:
const float cst=1;
int t()
{
  float val = cst;
  asm("#%0"::"x"(val));
  e();
  asm("#%0"::"x"(val));
}
looks almost identical pre-IRA and is optimized as expected. The difference in
reloading seems to start here:
@@ -78,17 +78,14 @@
     Hard reg set forest:
       0:( 0-6 8-15 21-52)@0
         1:( 0-6 37-44)@12000
-        2:( 21-28 45-52)@16000
+      Spill a1(r88,l0)

The reason is that the allocno's class_cost is not greater than its memory_cost
in the vector case.  I guess that either class_cost or memory_cost is computed
incorrectly; both seem to come from record_reg_classes, which does quite a bit
of guesswork.
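
To make the comparison above concrete, here is a minimal toy sketch of the decision as this comment describes it: a value is sent to memory up front only when its register-class cost exceeds its memory cost. The struct and function names are hypothetical and the cost values are purely illustrative; this is not IRA's code, and the real costs come out of record_reg_classes.

```c
#include <stdbool.h>

/* Hypothetical toy model; these are NOT GCC's allocno structures. */
struct toy_allocno {
    int class_cost;   /* assumed cost of keeping the value in its register class */
    int memory_cost;  /* assumed cost of keeping the value in memory */
};

/* Assumed rule, read off the comment above: spill up front only when
   the register class is strictly more expensive than memory.  */
bool toy_spills_up_front(const struct toy_allocno *a)
{
    return a->class_cost > a->memory_cost;
}
```

Under this reading, an allocno with class_cost equal to memory_cost stays in a register, so a small error in either cost flips the outcome.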

Later LRA misses the equivalence in the second case:

@@ -10,12 +10,55 @@
 Can eliminate 16 to 6 (offset=8, prev_offset=0)
 Can eliminate 20 to 7 (offset=0, prev_offset=0)
 Can eliminate 20 to 6 (offset=-8, prev_offset=0)
-            alt=0: Bad operand -- refuse
-            alt=1: Bad operand -- refuse
-          alt=2,overall=0,losers=0,rld_nregs=0
-        Choosing alt 2 in insn 5:  (0) v  (1) vm {movv4si_internal}
-          alt=0,overall=0,losers=0,rld_nregs=0
+      Removing equiv init insn 5 (freq=1000)
+    5: r88:SF=[`*.LC0']
+      REG_EQUIV 1.0e+0
+deleting insn with uid = 5.
+Changing pseudo 88 in operand 0 of insn 6 on equiv 1.0e+0
+            0 Non-pseudo reload: reject+=2
+            0 Non input pseudo reload: reject++
+          alt=0,overall=15,losers=2,rld_nregs=1
         Choosing alt 0 in insn 6:  (0) x
+      Creating newreg=90, assigning class SSE_REGS to r90
+    6: {asm_operands;clobber fpsr:CCFP;clobber flags:CC;}
+      REG_UNUSED fpsr:CCFP
+      REG_UNUSED flags:CC
+    Inserting insn reload before:
+   18: r90:SF=[`*.LC0']

I wonder why LRA behaves this way. Even if the register is not spilled, it can
be rematerialized.
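
As a rough illustration of why rematerialization is attractive whenever a constant equivalence exists, the sketch below compares a store plus reload around the call against simply recomputing the value afterwards. All names and costs are hypothetical assumptions for the sake of the example; this is not LRA's actual cost model.

```c
#include <stdbool.h>

/* Hypothetical toy model; these are NOT LRA's data structures. */
struct toy_value {
    bool has_constant_equiv; /* e.g. the pseudo carries a constant equivalence */
    int store_cost;          /* assumed cost of saving the value before the call */
    int reload_cost;         /* assumed cost of loading it back afterwards */
    int remat_cost;          /* assumed cost of recomputing it from the constant */
};

enum toy_action { TOY_SPILL_AND_RELOAD, TOY_REMATERIALIZE };

/* Keeping a value across a call by spilling needs both a store and a reload. */
int toy_spill_total(const struct toy_value *v)
{
    return v->store_cost + v->reload_cost;
}

/* Prefer rematerialization when it is available and no more expensive. */
enum toy_action toy_choose(const struct toy_value *v)
{
    if (v->has_constant_equiv && v->remat_cost <= toy_spill_total(v))
        return TOY_REMATERIALIZE;
    return TOY_SPILL_AND_RELOAD;
}
```

In this toy model the choice does not depend on whether the register allocator spilled the value earlier, which is the point of the question above.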

References:
- [Bug rtl-optimization/71768] New: Missed trivial rematerialiation oppurtunity
  - From: hubicka at gcc dot gnu.org
