[Bug rtl-optimization/96031] suboptimal codegen for store low 16-bits value
zhongyunde at tom dot com
gcc-bugzilla@gcc.gnu.org
Mon Jul 20 12:25:52 GMT 2020
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96031
--- Comment #3 from zhongyunde at tom dot com <zhongyunde at tom dot com> ---
I found a difference between the two cases in ivopts.
For the 2nd case, a UINT32-typed IV, sum, is chosen:
<bb 3> [local count: 955630224]:
# sum_15 = PHI <0(5), sum_9(6)>
# ivtmp.10_17 = PHI <ivtmp.10_3(5), ivtmp.10_4(6)>
_2 = (short unsigned int) sum_15;
_1 = _2;
_11 = (void *) ivtmp.10_17;
MEM[base: _11, offset: 0B] = _1;
sum_9 = step_8(D) + sum_15;
ivtmp.10_4 = ivtmp.10_17 + 2;
if (ivtmp.10_4 != _22)
goto <bb 6>; [89.00%]
For the 1st case, ivtmp.8 of type 'short unsigned int' is chosen, as your dump
showed, and there is no UINT32-typed candidate whose step is 'step'.
typedef unsigned int UINT32;
typedef unsigned short UINT16;
UINT16 array[12];
void foo (UINT32 len, UINT32 step)
{
  UINT32 index = 0;
  UINT32 sum = 0;
  for (index = 0; index < len; index++)
    {
      sum = index * step;
      array[index] = sum;
    }
}
I tried adding a UINT32 temporary, sum, as in the case above (the 3rd case),
then modified gcc to add a UINT32-typed candidate and adjusted the cost so that
this candidate is chosen (similar to what ivopts does for the 2nd case). With
that, the 'and w2, w2, 65535' insn is also optimized away.
However, this approach does not fit the way ivopts is implemented. Maybe we
need to extend ivopts to derive a UINT32 candidate from the 'short unsigned
int' IV struct?
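As a sanity check on why the 2nd-case form is a valid replacement, here is a
minimal sketch (not the original test cases; fill_mul, fill_acc and same_stores
are hypothetical names I made up for illustration). It assumes the 2nd case is
the accumulating form 'sum += step' suggested by the dump above. Since unsigned
arithmetic wraps modulo 2^32 and the store keeps only the low 16 bits, both
loops should write the same halfwords, so no explicit masking is needed before
the store.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t UINT32;
typedef uint16_t UINT16;

enum { N = 12 };

/* 1st-case shape: the stored value is recomputed as index * step.  */
static void fill_mul (UINT16 *out, UINT32 len, UINT32 step)
{
  for (UINT32 index = 0; index < len; index++)
    out[index] = (UINT16) (index * step);
}

/* Assumed 2nd-case shape: a UINT32 accumulator advanced by step.  */
static void fill_acc (UINT16 *out, UINT32 len, UINT32 step)
{
  UINT32 sum = 0;
  for (UINT32 index = 0; index < len; index++)
    {
      out[index] = (UINT16) sum;   /* only the low 16 bits are stored */
      sum += step;                 /* wraps modulo 2^32 */
    }
}

/* Returns 1 when both loops store identical halfwords.  */
static int same_stores (UINT32 len, UINT32 step)
{
  UINT16 a[N] = {0}, b[N] = {0};
  assert (len <= N);
  fill_mul (a, len, step);
  fill_acc (b, len, step);
  return memcmp (a, b, sizeof a) == 0;
}
```

Calling same_stores (12, step) for a few values of step, including ones larger
than 65535, should return 1 in every case.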
===== the gcc change to add a UINT32 type candidate variable =====
@@ -3389,7 +3389,7 @@ add_iv_candidate_for_bivs (struct ivopts_data *data)
   EXECUTE_IF_SET_IN_BITMAP (data->relevant, 0, i, bi)
     {
       iv = ver_info (data, i)->iv;
-      if (iv && iv->biv_p && !integer_zerop (iv->step))
+      if (iv && !integer_zerop (iv->step))
         add_iv_candidate_for_biv (data, iv);
     }
 }