[Bug ipa/103585] fatigue2 requires inlining of peridida to work well
hubicka at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Sun Jan 29 02:23:21 GMT 2023
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103585
--- Comment #15 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
We get 47s runtime with -O2 -flto and 53s with -O2
-fno-inline-functions-called-once.
The call sequence is:
<bb 81> [local count: 109362591]:
_1656 = (unsigned long) _45;
_1655 = _1656 + ivtmp.1182_2540;
_229 = (double *) _1655;
_1646 = (unsigned long) _35;
_1645 = _1646 + ivtmp.1182_2540;
_230 = (double *) _1645;
_1636 = (unsigned long) _55;
_1635 = _1636 + ivtmp.1182_2540;
_231 = (double *) _1635;
_1628 = (unsigned long) _17;
_1627 = _1628 + ivtmp.1182_2540;
_232 = (double *) _1627;
_1618 = (unsigned long) _64;
_1617 = _1618 + ivtmp.1182_2540;
_233 = (double *) _1617;
_234 = (double *) ivtmp.1181_2551;
_235 = (double *) ivtmp.1180_2575;
_236 = (double *) ivtmp.1178_2586;
_2607 = yield_stress;
perdida.constprop.isra (&dt, &lambda, &mu, _2607, &r_infinity, &b,
&x_infinity, &gamma, &eta, &plastic_strain_threshold, _229, _230, _231, _232,
_236, _233, _235, _234, &failure_threshold, &crack_closure_parameter);
It is not clear to me why lambda is not replaced. However, for dt the reason
seems to be:
! Disqualifying parameter number 0 - Dereferences in callers would happen much
more frequently.
I think this disqualification happens too early: if we SRA all the way down to
the original caller, we will avoid all dereferences completely.
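To make the point concrete, here is a minimal hand-written C sketch (not GCC
output). All function names are hypothetical, and the assumption that the leaf
dereferences dt only on a rarely taken path is mine, not something shown in the
dump. Converting only the leaf to by-value would move the load into its caller
and execute it on every call, which is what the heuristic objects to.
Converting the whole chain lets the original caller pass its local directly,
so no load remains anywhere.

```c
#include <assert.h>

/* Hypothetical leaf: reads the by-reference argument only on a rare path. */
static double leaf_by_ref(const double *dt, int rare, double x)
{
    if (rare)
        return x * *dt;   /* the only dereference of dt */
    return x;
}

/* Intermediate function that merely forwards the pointer. */
static double middle_by_ref(const double *dt, int rare, double x)
{
    return leaf_by_ref(dt, rare, x) + 1.0;
}

/* Original caller: dt is a local whose address is taken only for the call. */
static double top_by_ref(int rare, double x)
{
    double dt = 0.5;
    return middle_by_ref(&dt, rare, x);
}

/* The same chain after a by-hand "SRA all the way down": dt travels by value,
   so there is no pointer and no dereference at any level. */
static double leaf_by_val(double dt, int rare, double x)
{
    if (rare)
        return x * dt;
    return x;
}

static double middle_by_val(double dt, int rare, double x)
{
    return leaf_by_val(dt, rare, x) + 1.0;
}

static double top_by_val(int rare, double x)
{
    double dt = 0.5;
    return middle_by_val(dt, rare, x);
}

/* Both chains must compute the same values on both paths. */
static void check_same_results(void)
{
    assert(top_by_ref(0, 3.0) == top_by_val(0, 3.0));
    assert(top_by_ref(1, 3.0) == top_by_val(1, 3.0));
}
```

Calling check_same_results() from a test driver confirms the two chains agree;
the sketch only illustrates the shape of the transformation, not the cost model
IPA-SRA actually uses.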
Another place for improvement is non-LTO. Here IPA-SRA disables itself since it
does not have a cost model for cloning (that could also be improved).
The situation could be improved by ipa-modref, which may optimize away unused
parts of the array descriptors. However, ipa-modref gives up because perdida
contains Fortran I/O, and it then stops tracking the descriptors even if the
descriptors never escape to the I/O.
For this I need to finish the non-escaping analysis, i.e. distinguish between
arguments that do not escape in the sense that they are not saved in global
memory once the function returns, and arguments that do not escape in the sense
that they are never passed down to a callee function.
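The two notions of escaping can be illustrated with a small C sketch. The
function names are hypothetical, and opaque_io is only a stand-in for a routine
the analysis cannot see into (such as Fortran I/O); nothing here reflects how
ipa-modref represents these properties internally.

```c
#include <assert.h>

static const double *saved;   /* global memory that outlives any call */

/* Escapes in the first sense: the pointer is still reachable through
   global memory after the function returns. */
static void stores_globally(const double *p)
{
    saved = p;
}

/* Stand-in for an opaque routine such as an I/O call. */
static double opaque_io(const double *p)
{
    return *p;
}

/* Not saved anywhere, but escapes in the second sense: the pointer is
   passed down to a callee. */
static double passes_down(const double *p)
{
    return opaque_io(p);
}

/* Escapes in neither sense: the pointer is only dereferenced locally. */
static double purely_local(const double *p)
{
    return *p * 2.0;
}

/* Exercise all three cases on one local variable. */
static void demo(void)
{
    double x = 3.0;
    assert(purely_local(&x) == 6.0);
    assert(passes_down(&x) == 3.0);
    stores_globally(&x);
    assert(saved == &x);
    saved = 0;   /* do not keep a dangling pointer to a dead local */
}
```

An analysis that tracks only the first property could still follow an argument
like the one in passes_down, which is handed to a callee but never stored.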
We may also annotate Fortran I/O so that the compiler understands what it does.
So there is still a lot to do.