The following testcase: struct a { int a, b; }; static inline int t (struct a a) { if (!__builtin_constant_p (a.a)) __builtin_abort (); } static int __attribute__ ((noinline)) q (struct a a) { t (a); } int main (int argc) { struct a a; int i; for (i = 0; i < 10000; i++) { a.a = 1; a.b = 2; q (a); } return 0; } Fails when compiled with ./xgcc -B ./ -O3 t.c -S -fdump-ipa-cp -fno-ipa-sra -fno-early-inlining ipa-cp correctly identifies the inlining oppurtunity: Estimating effects for main/2, base_time: 1588. Estimating body: main/2 Known to be false: size:10 time:1588 - context independent values, size: 10, time_benefit: 1 good_cloning_opportunity_p (time: 1, size: 10, freq_sum: 0, single_call) -> evaluation: 0, threshold: 500 Not cloning for all contexts because !good_cloning_opportunity_p. Estimating effects for q/1, base_time: 13. Estimating body: q/1 Known to be false: size:5 time:13 - context independent values, size: 5, time_benefit: 0 Decided to specialize for all known contexts, code not going to grow. Estimating effects for t/0, base_time: 5. Estimating body: t/0 Known to be false: op0[offset: 0] changed size:4 time:2 - context independent values, size: 4, time_benefit: 3 Decided to specialize for all known contexts, code not going to grow. however because the ipa-cp adjustments are applied to the body of q before inlining we end up with: __attribute__((noinline)) q.constprop (struct a a) { struct a a; int _2; int _3; <bb 2>: a = a; _2 = a.a; _3 = __builtin_constant_p (_2); if (_3 == 0) goto <bb 3>; else goto <bb 4>; <bb 3>: __builtin_abort (); <bb 4>: return; } i.e. we lose the track that a.a==1 Modification phase of node q.constprop/5 Aggregate replacements: 0[0]=1, 0[32]=2 __attribute__((noinline)) q.constprop (struct a a) { <bb 3>: <bb 2>: t (a); return; } here t is not inlined yet and the clone of t is not modified because it is inline clone but it should be because it is inline clone of ipa-cp clone. I wonder if we should not apply the replacements only after inlining. Even if ipa-cp decided to not clone the callee, inliner might have inlined it and we ought to use the known constants when optimizing the function body.
Actually applying it just after inlining is probably going to miss the transform, too, because there will be extra copy of the whole aggregate. What about making the info about known constant values of aggregates available to GVN? It should be able to do the desired cleanup.
I suppose the dump is misleading in that the inline clone has Modification phase of node q.constprop/5 Aggregate replacements: 0[0]=1, 0[32]=2 applied to? __attribute__((noinline)) q.constprop (struct a a) { <bb 3>: <bb 2>: t (a); return; } instead would look like wrong-code. Ah, IPA-CP doesn't actually remove "const" parameters. So this report is reporting that IPA-CP thinks it does sth but in the end it's a no-op clone. I wonder why we don't simply add a.a = 1; a.b = 2; at the start of q.constprop. Or if we only know a partially then initialize those parts from the know constants?
On Wed, 16 Dec 2015, hubicka at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68930 > > Jan Hubicka <hubicka at gcc dot gnu.org> changed: > > What |Removed |Added > ---------------------------------------------------------------------------- > CC| |rguenther at suse dot de > > --- Comment #1 from Jan Hubicka <hubicka at gcc dot gnu.org> --- > Actually applying it just after inlining is probably going to miss the > transform, too, because there will be extra copy of the whole aggregate. What > about making the info about known constant values of aggregates available to > GVN? It should be able to do the desired cleanup. But how would you represent this info? I don't see how it's going to do better than the inliner inserting an extra initialization (of course that might stay around if there are no cleanup opportunities and actually make the code bigger & slower)
I agree with the analysis in the summary, the problem is that the IPA-CP transformation phase should be run on clone of t _before_ we inline it to q. From the ipa-cp dump it is apparent that does not happen. As far as the GVN idea is concerned, can GVN even work with aggregates? And even if yes, I assume this would mean adding some kind of asserts into the gimple stream, which would also require that we process the body of the clone before we inline it. Trying to track IPA-CP aggregate transformation info across inlining seems difficult and unnecessary to me. The modification of q is a no op because we only replace loads of known values with the known values in the transformation phase. And IPA-CP indeed does not remove _aggregate_ parameters, even when it knows the value of it all. I suppose it should.
> But how would you represent this info? I don't see how it's going to > do better than the inliner inserting an extra initialization (of course > that might stay around if there are no cleanup opportunities and actually > make the code bigger & slower) Just keep the existing transformation data used by ipcp_modif_dom_walker and feed it into GVN at entry block. Consider code: foo (struct a) { struct b=a; return b.a; } this is a common case of post inlining code. Now the way ipa-prop works if we figure out that b.a=1 it won't update the copy b=a because it is not access to the field it knows value of so no transformation happens. GVN will figure out that the return is equivalent to a.a and if it knew that a.a==1 it would optimize the code. It is quite important bug that we work hard to figure out all those constants and then we drop them to the floor :) Honza
For the Martin's answer. I poked about this a bit. One can add call to node->get_body to the end of cgraph_materialize_clone which after removing the associated assert will call to the ipa-prop transform method. It will also call inliner's transform which will in turn remove the cgraph and break. I suppose I can fix this by making IPA transforms function speicfic (they should be) and making inliner to only drop its transform method on functions non-inline functions and make ipa-cp add transforms only to the clones. It should improve compile time, too. Still we will get missed optimizations, so keeping this info for GVN seems like resonable thing to do.
On December 16, 2015 7:28:35 PM GMT+01:00, hubicka at ucw dot cz <gcc-bugzilla@gcc.gnu.org> wrote: >https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68930 > >--- Comment #6 from Jan Hubicka <hubicka at ucw dot cz> --- >For the Martin's answer. I poked about this a bit. >One can add call to node->get_body to the end of >cgraph_materialize_clone which >after removing the associated assert will call to the ipa-prop >transform >method. It will also call inliner's transform which will in turn >remove the >cgraph and break. > >I suppose I can fix this by making IPA transforms function speicfic >(they >should >be) and making inliner to only drop its transform method on functions >non-inline >functions and make ipa-cp add transforms only to the clones. It should >improve >compile time, too. > >Still we will get missed optimizations, so keeping this info for GVN >seems like >resonable thing to do. If the information is there in some way then I can try using it from GVN. just tell me where it is.
The information is consumed by ipcp_transform_function, so you can take a look. It is stored in ipcp_get_transformation_summary I think it is just matter of not freeing it (assuming it is freed after cloning is done) and using it from GVN
I have proposed a fix on the mailing list: https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619969.html
The master branch has been updated by Martin Jambor <jamborm@gcc.gnu.org>: https://gcc.gnu.org/g:d073e2d75d9ed492de9a8dc6970e5b69fae20e5a commit r14-3226-gd073e2d75d9ed492de9a8dc6970e5b69fae20e5a Author: Martin Jambor <mjambor@suse.cz> Date: Tue Aug 15 17:26:13 2023 +0200 Feed results of IPA-CP into tree value numbering PRs 68930 and 92497 show that when IPA-CP figures out constants in aggregate parameters or when passed by reference but the loads happen in an inlined function the information is lost. This happens even when the inlined function itself was known to have - or even cloned to have - such constants in incoming parameters because the transform phase of IPA passes is not run on them. See discussion in the bugs for reasons why. Honza suggested that we can plug the results of IPA-CP analysis into value numbering, so that FRE can figure out that some loads fetch known constants. This is what this patch attempts to do. The patch does not attempt to populate partial_defs with information from IPA-CP, this can be hopefully added as a follow-up. gcc/ChangeLog: 2023-08-11 Martin Jambor <mjambor@suse.cz> PR ipa/68930 PR ipa/92497 * ipa-prop.h (ipcp_get_aggregate_const): Declare. * ipa-prop.cc (ipcp_get_aggregate_const): New function. (ipcp_transform_function): Do not deallocate transformation info. * tree-ssa-sccvn.cc: Include alloc-pool.h, symbol-summary.h and ipa-prop.h. (vn_reference_lookup_2): When hitting default-def vuse, query IPA-CP transformation info for any known constants. gcc/testsuite/ChangeLog: 2023-06-07 Martin Jambor <mjambor@suse.cz> PR ipa/68930 PR ipa/92497 * gcc.dg/ipa/pr92497-1.c: New test. * gcc.dg/ipa/pr92497-2.c: Likewise.
Finally fixed.