[Bug rtl-optimization/96031] suboptimal codegen for store low 16-bits value
amker at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Sun Jul 19 14:32:35 GMT 2020
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96031
--- Comment #2 from bin cheng <amker at gcc dot gnu.org> ---
Interesting case. I see two issues in the generated asm: one is the unnecessary
bitwise AND, the other is that different registers are allocated for the
induction variable and the base address. However, it looks like neither issue
is caused by ivopts. Check the dump:
  <bb 4> [local count: 105119324]:
  _12 = (short unsigned int) step_8(D);
  ivtmp.10_11 = (unsigned long) &array;
  _18 = len_7(D) + 4294967294;
  _19 = (unsigned long) _18;
  _20 = _19 * 2;
  _21 = (unsigned long) &array;
  _22 = _21 + 2;
  _23 = _20 + _22;

  <bb 5> [local count: 955630224]:
  # ivtmp.8_15 = PHI <_12(4), ivtmp.8_5(6)>
  # ivtmp.10_16 = PHI <ivtmp.10_11(4), ivtmp.10_4(6)>
  _3 = ivtmp.8_15;
  _2 = (void *) ivtmp.10_16;
  MEM[base: _2, offset: 2B] = _3;
  ivtmp.8_5 = ivtmp.8_15 + _12;
  ivtmp.10_4 = ivtmp.10_16 + 2;
  if (ivtmp.10_4 != _23)
    goto <bb 6>; [89.00%]
  else
    goto <bb 8>; [11.00%]

  <bb 8> [local count: 105119324]:
  goto <bb 3>; [100.00%]

  <bb 6> [local count: 850510900]:
  goto <bb 5>; [100.00%]
As far as I can tell, the GIMPLE above is optimal.
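
For readers who prefer C to GIMPLE, the loop in bb 4/bb 5 can be hand-lowered
roughly as follows. This is only a sketch: the array size N, the element type
and the function name are assumptions (the dump does not show them), it assumes
a 16-bit unsigned short, and it is not the test case from the bug report.

```c
#include <stdint.h>

#define N 64                       /* assumed size; not shown in the dump */
unsigned short array[N];           /* assumed element type (16-bit) */

/* Hypothetical hand-lowering of <bb 4>/<bb 5>; requires 2 <= len <= N. */
void store_low16_lowered(unsigned int step, unsigned int len)
{
    unsigned short inc = (unsigned short) step;       /* _12 */
    uintptr_t p = (uintptr_t) &array;                 /* ivtmp.10_11 */
    uintptr_t end = (uintptr_t) &array + 2
                  + (uintptr_t) (len - 2u) * 2;       /* _23 = _20 + _22 */
    unsigned short v = inc;                           /* ivtmp.8_15 */

    do {
        *(unsigned short *) (p + 2) = v;              /* MEM[base: _2, offset: 2B] = _3 */
        v = (unsigned short) (v + inc);               /* ivtmp.8_5 */
        p += 2;                                       /* ivtmp.10_4 */
    } while (p != end);                               /* if (ivtmp.10_4 != _23) */
}
```

In this form the value stored is already a 16-bit quantity and the address is a
single pointer induction variable, which is why nothing at this level calls for
a masking AND or a second address register.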
The register allocation issue is introduced by RTL PRE. Apparently we should
not try to save the add-2 instruction in the last iteration at the cost of a
false dependence, which is more harmful.
As for ivopts, I can see a minor improvement: replacing the != exit condition
with <= would save the add-2 instruction that computes _22, which also happens
to "disable" the wrong PRE transformation.
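
The two exit tests can be compared in C along these lines. This is a sketch of
one possible reading of the change, with hypothetical function and buffer
names; it only shows that the two bounds give the same stores for
2 <= len <= N, and it does not reproduce the RTL PRE behaviour itself.

```c
#include <stdint.h>

#define N 64                                  /* assumed buffer size */
static unsigned short buf_ne[N], buf_le[N];   /* hypothetical test buffers */

/* Exit test as in the dump: bound is base + 2 + 2*(len-2). */
void fill_ne(unsigned short *a, unsigned int step, unsigned int len)
{
    unsigned short inc = (unsigned short) step, v = inc;
    uintptr_t p = (uintptr_t) a;
    uintptr_t end = p + 2 + (uintptr_t) (len - 2u) * 2;  /* needs the extra + 2 */
    do {
        *(unsigned short *) (p + 2) = v;
        v = (unsigned short) (v + inc);
        p += 2;
    } while (p != end);
}

/* Alternative exit test: bound is base + 2*(len-2), so no + 2 is computed. */
void fill_le(unsigned short *a, unsigned int step, unsigned int len)
{
    unsigned short inc = (unsigned short) step, v = inc;
    uintptr_t p = (uintptr_t) a;
    uintptr_t last = p + (uintptr_t) (len - 2u) * 2;
    do {
        *(unsigned short *) (p + 2) = v;
        v = (unsigned short) (v + inc);
        p += 2;
    } while (p <= last);
}
```

Both loops store to elements 1 through len-1; the second simply compares the
incremented pointer against a bound that is two bytes lower.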
Ah, I see it's already classified as rtl-optimization.
Thanks