[Bug rtl-optimization/47764] The constant load instruction should be hoisted out of loop
ubizjak at gmail dot com
gcc-bugzilla@gcc.gnu.org
Thu Jan 24 07:25:00 GMT 2013
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47764
Uros Bizjak <ubizjak at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|arm-linux-androideabi       |
                 CC|                            |ubizjak at gmail dot com
          Component|target                      |rtl-optimization
      Known to fail|                            |4.7.0, 4.8.0
--- Comment #5 from Uros Bizjak <ubizjak at gmail dot com> 2013-01-24 07:25:04 UTC ---
This is a problem in rtl-optimization, in the gcse2 pass.
The following testcases also fail on x86_64 with 4.8 [1], which removes the
(!o,F) alternative.
With the following test, compiling with -O3 hoists the memory load out of the
loop:
--cut here--
volatile double y;
void
test ()
{
int z;
for (z = 0; z < 1000; z++)
y = 0.1;
}
--cut here--
_.210r.postreload:
15: L15:
8: NOTE_INSN_BASIC_BLOCK 3
23: xmm0:DF=[`*.LC0']
10: [`y']=xmm0:DF
REG_DEAD xmm0:DF
11: NOTE_INSN_DELETED
12: {flags:CCZ=cmp(ax:SI-0x1,0);ax:SI=ax:SI-0x1;}
13: pc={(flags:CCZ!=0)?L15:pc}
REG_BR_PROB 0x26ab
_.211r.gcse2:
26: xmm0:DF=[`*.LC0']
15: L15:
8: NOTE_INSN_BASIC_BLOCK 3
10: [`y']=xmm0:DF
REG_DEAD xmm0:DF
11: NOTE_INSN_DELETED
12: {flags:CCZ=cmp(ax:SI-0x1,0);ax:SI=ax:SI-0x1;}
13: pc={(flags:CCZ!=0)?L15:pc}
REG_BR_PROB 0x26ab
However, when the constant is changed to 0.0 (so that it can be loaded directly
into an %xmm register with the xorpd insn):
--cut here--
volatile double y;
void
test ()
{
int z;
for (z = 0; z < 1000; z++)
y = 0.0;
}
--cut here--
gcc -O3:
_.211r.gcse2:
15: L15:
8: NOTE_INSN_BASIC_BLOCK 3
10: xmm0:DF=0.0
23: [`y']=xmm0:DF
REG_DEAD xmm0:DF
11: NOTE_INSN_DELETED
12: {flags:CCZ=cmp(ax:SI-0x1,0);ax:SI=ax:SI-0x1;}
13: pc={(flags:CCZ!=0)?L15:pc}
REG_BR_PROB 0x26ab
The constant load remains inside the loop. It looks as if the gcse2 pass only
considers loads from memory, but I see no reason why a constant load should not
be considered as well. It looks like an oversight to me.
The same happens with:
--cut here--
volatile long long y;
void
test ()
{
int z;
for (z = 0; z < 1000; z++)
y = 0x123456789;
}
--cut here--
_.211r.gcse2:
15: L15:
8: NOTE_INSN_BASIC_BLOCK 3
23: dx:DI=0x123456789
24: [`y']=dx:DI
REG_DEAD dx:DI
11: NOTE_INSN_DELETED
12: {flags:CCZ=cmp(ax:SI-0x1,0);ax:SI=ax:SI-0x1;}
13: pc={(flags:CCZ!=0)?L15:pc}
REG_BR_PROB 0x26ab
resulting in:
.L3:
movabsq $4886718345, %rdx
subl $1, %eax
movq %rdx, y(%rip)
jne .L3
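For convenience, the three cases above can be collected into one file, so that a single gcse2 dump shows the hoisted and non-hoisted loads side by side. This is only a sketch: the names test_mem, test_zero, test_imm, the y_* variables and self_check are made up here and are not part of the original testcases; self_check only verifies the stored values and says nothing about hoisting.

```c
#include <assert.h>

/* One volatile destination per case, so the loops stay independent.  */
volatile double y_mem;     /* 0.1: loaded from the constant pool (*.LC0)  */
volatile double y_zero;    /* 0.0: can be materialized directly in %xmm  */
volatile long long y_imm;  /* 0x123456789: 64-bit immediate (movabsq)  */

void
test_mem (void)
{
  int z;
  for (z = 0; z < 1000; z++)
    y_mem = 0.1;
}

void
test_zero (void)
{
  int z;
  for (z = 0; z < 1000; z++)
    y_zero = 0.0;
}

void
test_imm (void)
{
  int z;
  for (z = 0; z < 1000; z++)
    y_imm = 0x123456789;
}

/* Hypothetical helper: runs all three loops and checks the final values.  */
void
self_check (void)
{
  test_mem ();
  test_zero ();
  test_imm ();
  assert (y_mem == 0.1);
  assert (y_zero == 0.0);
  assert (y_imm == 0x123456789LL);
}
```

Compiling this with something like gcc -O3 -S -fdump-rtl-gcse2 should give one dump covering all three loops; the exact dump file name and pass number depend on the compiler version.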
Reconfirmed as an rtl-optimization (gcse2 pass) problem.
[1] 4.8.0 20130124 (experimental) [trunk revision 195417]