This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug middle-end/67606] Missing optimization: load possible return value before early-out test
- From: "rguenth at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Thu, 17 Sep 2015 08:08:46 +0000
- Subject: [Bug middle-end/67606] Missing optimization: load possible return value before early-out test
- Auto-submitted: auto-generated
- References: <bug-67606-4 at http dot gcc dot gnu dot org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67606
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2015-09-17
                 CC|                            |matz at gcc dot gnu.org
          Component|c                           |middle-end
     Ever confirmed|0                           |1
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
So for the main part of this PR we actually expand from
<bb 7>:
# count_16 = PHI <count_1(6), 0(2)>
return count_16;
so it is a matter of coalescing and where we put that copy from zero.
We coalesce the following way:
Coalesce list: (1)count_1 & (15)count_15 [map: 0, 7] : Success -> 0
Coalesce list: (3)ivtmp.6_3 & (13)ivtmp.6_13 [map: 2, 6] : Success -> 2
Coalesce list: (1)count_1 & (11)count_11 [map: 0, 5] : Success -> 0
Coalesce list: (1)count_1 & (16)count_16 [map: 0, 8] : Success -> 0
Coalesce list: (2)ivtmp.6_2 & (3)ivtmp.6_3 [map: 1, 2] : Success -> 2
so 'count' is fully coalesced but of course the constant is still there
and we insert a copy on the 2->7 edge.
Inserting a value copy on edge BB3->BB4 : PART.0 = 0
Inserting a value copy on edge BB2->BB7 : PART.0 = 0
which also looks good (we use the correct partition for this). Note that
the zero init isn't partially redundant so GCSE isn't able to optimize
here and RTL code hoisting isn't very advanced.
I'm also sure the RA guys will say it's not the RA's job to do the
hoisting.
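To make the placement of the two value copies concrete, here is a rough
source-level picture of what we expand to. It is only a sketch: the
function name, the parameter "limit" and the loop predicate are
assumptions (the dump elides the loop body), and "part0" merely stands in
for PART.0.

```c
/* Hand-written picture of the expanded code; not compiler output.
   ASSUMPTION: the loop counts elements greater than 'limit'.  The real
   predicate is elided ("...") in the dump.  */
int count_above(const int *data, int length, int limit)
{
    int part0;                  /* stands in for PART.0 */
    const int *p, *end;

    if (length > 0)             /* BB2 */
        goto bb3;
    part0 = 0;                  /* value copy on edge BB2->BB7 */
    goto bb7;

bb3:                            /* BB3: loop setup (forwarder) */
    p = data;
    end = data + length;
    part0 = 0;                  /* value copy on edge BB3->BB4 */
bb4:                            /* BB4/BB5: conditional increment */
    if (*p > limit)
        part0 = part0 + 1;
    p++;                        /* BB6: ivtmp.6_3 = ivtmp.6_13 + 4 */
    if (p != end)
        goto bb4;
bb7:                            /* BB7: return count_16 */
    return part0;
}
```

The early-out path only loads the return value after the length test has
been taken, which is the missed optimization the PR is about.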
So with my usual GIMPLE hat on I'd say it would have been nice to help
things by placing the value copy more intelligently. We have a pass
that is supposed to help here - uncprop. We're faced with
<bb 2>:
if (length_4(D) > 0)
goto <bb 3>;
else
goto <bb 7>;
<bb 3>:
...
<bb 4>:
# count_15 = PHI <0(3), count_1(6)>
...
<bb 6>:
# count_1 = PHI <count_15(4), count_11(5)>
ivtmp.6_3 = ivtmp.6_13 + 4;
if (ivtmp.6_3 != _25)
goto <bb 4>;
else
goto <bb 7>;
<bb 7>:
# count_16 = PHI <count_1(6), 0(2)>
return count_16;
which might be a good enough pattern to detect (slight complication with
the forwarder BB3). Note that we'd increase register pressure throughout
BB3 and that for the whole thing to work we'd still need to be able to
make sure we can coalesce all of count and the register we init with zero.
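As a sketch of the desired result, the hoisted form would look roughly
like this at the source level. The function name, the parameter "limit"
and the loop predicate are assumptions (the dump elides the loop body);
"part0" stands in for the single coalesced partition.

```c
/* Hand-written picture of the desired code; not compiler output.
   ASSUMPTION: the loop counts elements greater than 'limit'.  The real
   predicate is elided ("...") in the dump.  */
int count_above_hoisted(const int *data, int length, int limit)
{
    int part0 = 0;              /* single zero init, hoisted into BB2 */
    const int *p, *end;

    if (length <= 0)            /* BB2: early out, value already loaded */
        goto bb7;

    p = data;                   /* BB3: part0 is now live across this */
    end = data + length;        /* block (the extra register pressure) */
bb4:                            /* BB4/BB5: conditional increment */
    if (*p > limit)
        part0 = part0 + 1;
    p++;                        /* BB6: ivtmp.6_3 = ivtmp.6_13 + 4 */
    if (p != end)
        goto bb4;
bb7:                            /* BB7: return count_16 */
    return part0;
}
```

This only pays off if the zero-initialized register and all of count end
up in the same partition; otherwise the hoisted init just becomes an
extra copy.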
Given the interaction with coalescing I wonder whether it makes sense to
do "uncprop" together with coalescing or to make emitting and placing
value-copies work on GIMPLE, exposing the partitions explicitly somehow.
Well, trying to improve uncprop for this particular testcase would work
and shouldn't be too hard (you'd extend it from detecting edge equivalences
to existing vars to do PHI value hoisting). There is also other code
trying to improve coalescing - rewrite_out_of_ssa has insert_backedge_copies
so we could also detect the pattern here and insert copies from the
common constant in the common dominator.
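A toy sketch of that last idea, in plain C rather than against GCC's
internal data structures: collect the edges whose PHI argument is the same
constant and take the nearest common dominator of their source blocks as
the place for the single copy. Everything here is hypothetical (the
struct, the function names, the hard-coded tables). The dominator table is
read off the dump above, assuming BB4 branches to BB5 and BB6 and BB5
falls through to BB6.

```c
#include <stddef.h>

/* Toy dominator tree for the CFG in the dump: idom[b] is the immediate
   dominator of block b.  BB2 is the entry and maps to itself; indices
   0 and 1 are unused.  */
enum { NBLOCKS = 8 };
static const int idom[NBLOCKS] = { 0, 1, 2, 2, 3, 4, 4, 2 };

/* A constant PHI argument arriving over the edge src_bb -> dest_bb.  */
struct const_phi_arg { int src_bb, dest_bb; long value; };

static const struct const_phi_arg phi_args[] = {
    { 3, 4, 0 },    /* count_15 = PHI <0(3), count_1(6)> */
    { 2, 7, 0 },    /* count_16 = PHI <count_1(6), 0(2)> */
};

/* Depth of block b in the dominator tree (the entry has depth 0).  */
static int dom_depth(int b)
{
    int depth = 0;
    while (idom[b] != b) {
        b = idom[b];
        depth++;
    }
    return depth;
}

/* Nearest common dominator: lift the deeper block, then walk both up.  */
static int nearest_common_dom(int a, int b)
{
    int da = dom_depth(a), db = dom_depth(b);
    while (da > db) { a = idom[a]; da--; }
    while (db > da) { b = idom[b]; db--; }
    while (a != b)  { a = idom[a]; b = idom[b]; }
    return a;
}

/* Block in which one copy of 'value' could feed every PHI edge that
   uses it, or -1 if fewer than two edges carry that constant.  */
static int hoist_block_for(long value)
{
    int block = -1, seen = 0;
    for (size_t i = 0; i < sizeof phi_args / sizeof phi_args[0]; i++) {
        if (phi_args[i].value != value)
            continue;
        block = seen ? nearest_common_dom(block, phi_args[i].src_bb)
                     : phi_args[i].src_bb;
        seen++;
    }
    return seen >= 2 ? block : -1;
}
```

For the testcase this picks BB2 for the constant 0, i.e. the copy would
go in front of the length test. A real implementation would still have to
weigh the added register pressure and check that the result coalesces.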