This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug debug/78685] -Og generates too many "<optimized out>"s
- From: "vries at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Mon, 02 Jul 2018 10:35:47 +0000
- Subject: [Bug debug/78685] -Og generates too many "<optimized out>"s
- Auto-submitted: auto-generated
- References: <bug-78685-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78685
Tom de Vries <vries at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Attachment #44333|0 |1
is obsolete| |
--- Comment #12 from Tom de Vries <vries at gcc dot gnu.org> ---
Created attachment 44343
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44343&action=edit
[debug] Add -fkeep-vars-live
(In reply to Richard Biener from comment #10)
> (In reply to Tom de Vries from comment #8)
> > Created attachment 44333 [details]
> > proof of concept patch
> >
> > I ran into the same problem with guality test-case pr54200.c, which fails
> > for Og.
> >
> > The relevant part of the test-case is:
> > ...
> > int __attribute__((noinline,noclone))
> > foo (int z, int x, int b)
> > {
> > if (x == 1)
> > {
> > bar ();
> > return z;
> > }
> > else
> > {
> > int a = (x + z) + b;
> > return a; /* { dg-final { gdb-test 20 "z" "3" } } */
> > }
> > }
> > ...
> >
> > The problem is that the '(x + z) + b' calculation has a temporary register
> > which gets allocated the register that holds 'z', so when we get to the
> > gdb-test line, z is no longer available.
> >
> > Using this patch I managed to print the correct value of z at the gdb-test
> > line.
> >
> > The patch uses clobbers in gimple to mark the out-of-scope point, purely
> > because that's similar to what was already done for local array variables,
> > and I thought that was the fastest path to getting a proof of concept
> > working. It's more accurate to model this as some sort of use in gimple, and
> > doing so may prevent gimple optimizations which wreck debug info, but
> > perhaps that's not necessary, I suppose that depends on which optimizations
> > are enabled in Og.
> >
> > Anyway, at expand we emit a use for the clobber which seems to do the trick.
>
> An interesting idea but I think this will cause excessive spilling and
> come at a compile-time cost. The idea of -Og was to have performance
> closer to -O1 here and with this we approach -O0 in assigning stack slots
> to all user variables?
>
Agreed, there will be a compile time cost. I've moved the functionality to an
orthogonal option -fkeep-vars-live (as suggested in comment 3), to keep Og as
is. Also, I agree there will be more spilling, but I'd hope that memory usage
still would be less than for O0.
> This is a general conflict of interest of course.
>
> Your patch shouldn't prevent any optimization at the GIMPLE level
> (like completely eliding local variables)
I've fixed that now, in the sense that it generates uses rather than clobbers
at gimple level.
> but definitely they'll keep
> things live at RTL level. Given for the testcase at hand the issue
> is "suboptimal" choice of register allocation I wonder if we can
> adjust the RA / DF to consider liveness to extend always up to the
> next sequence point.
>
> That is, it is reasonable to lose track of z in
>
> int a = (x + z) + b;
>
> but only after x + z is computed.
Right, that could maybe be a way to improve on Og.
But the trade-off I'm going for currently with "Og -fkeep-vars-live" is:
O0-like debug experience, slower compilation than Og, faster execution than
-O0.
> So a different approach would be
> to see this as an issue with the way debug consumers "step" now that
> we emit column debug information?
>
> That said, starting with GCC 8 we now have # DEBUG BEGIN_STMT
> markers:
>
> <bb 4> [local count: 856416479]:
> # DEBUG BEGIN_STMT
> _1 = x_4(D) + z_7(D);
> a_9 = _1 + b_8(D);
> # DEBUG a => a_9
> # DEBUG BEGIN_STMT
>
> they are also present on RTL as
>
> (debug_insn 21 20 0 (debug_marker) "t.c":11 -1
> (nil))
>
> which means we _could_ somehow use those for code-generation (in DF analysis
> to extend lifetimes somehow). Of course if we do we have to emit those
> always to not get code-gen differences -g vs. -g0.
>
> Maybe the RA could also be generally changed to use a not-LRU style
> preferencing of available registers to choose from.
>
> In the end nothing saves us from "spilling" things if we really want to
> be able to access all final values of registers up to the point they
> go out of scope...
>
> So another idea would be instead of like -O0 assigning the main location
> to the stack we spill no longer used vars during LRA? We'd still have to
> somehow know until when that spill slot needs to live (and afterwards can
> be re-used).
To stay in terms of the current patch, that could mean we mark artificial uses
as such, and then make the compiler handle them smarter than regular uses.
[ Also, this version of the patch adds insertion of nops to prevent empty debug
ranges, which allows me to get the value of 'a' at the 'return a' mentioned
above. ]