This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

[Bug debug/78685] -Og generates too many "<optimized out>"s

From: "vries at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
To: gcc-bugs at gcc dot gnu dot org
Date: Mon, 02 Jul 2018 10:35:47 +0000
Subject: [Bug debug/78685] -Og generates too many "<optimized out>"s
Auto-submitted: auto-generated
References: <bug-78685-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78685

Tom de Vries <vries at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #44333|0                           |1
        is obsolete|                            |

--- Comment #12 from Tom de Vries <vries at gcc dot gnu.org> ---
Created attachment 44343
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44343&action=edit
[debug] Add -fkeep-vars-live

(In reply to Richard Biener from comment #10)
> (In reply to Tom de Vries from comment #8)
> > Created attachment 44333 [details]
> > proof of concept patch
> > 
> > I ran into the same problem with guality test-case pr54200.c, which fails
> > for Og.
> > 
> > The relevant part of the test-case is:
> > ...
> > int __attribute__((noinline,noclone))
> > foo (int z, int x, int b)
> > {
> >   if (x == 1)
> >     {
> >       bar ();
> >       return z;
> >     }
> >   else
> >     {
> >       int a = (x + z) + b;
> >       return a; /* { dg-final { gdb-test 20 "z" "3" } } */
> >     }
> > }
> > ...
> > 
> > The problem is that the '(x + z) + b' calculation has a temporary register
> > which  gets allocated the register that holds 'z', so when we get to the
> > gdb-test line, z is no longer available.
> > 
> > Using this patch I managed to print the correct value of z at the gdb-test
> > line.
> > 
> > The patch uses clobbers in gimple to mark the out-of-scope point, purely
> > because that's similar to what was already done for local array variables,
> > and I thought that was the fastest path to getting a proof of concept
> > working. It's more accurate to model this as some sort of use in gimple, and
> > doing so may prevent gimple optimizations which wreck debug info, but
> > perhaps that's not necessary, I suppose that depends on which optimizations
> > are enabled in Og.
> > 
> > Anyway, at expand we emit a use for the clobber which seems to do the trick.
> 
> An interesting idea but I think this will cause excessive spilling and
> come at a compile-time cost.  The idea of -Og was to have performance
> closer to -O1 here and with this we approach -O0 in assigning stack slots
> to all user variables?
> 

Agreed, there will be a compile time cost. I've moved the functionality to an
orthogonal option -fkeep-vars-live (as suggested in comment 3), to keep Og as
is.  Also, I agree there will be more spilling, but I'd hope that memory usage
still would be less than for O0.

> This is a general conflict of interest of course.
> 
> Your patch shouldn't prevent any optimization at the GIMPLE level
> (like completely eliding local variables) 

I've fixed that now, in the sense that it generates uses rather than clobbers
at gimple level.

> but definitely they'll keep
> things live at RTL level.  Given for the testcase at hand the issue
> is "suboptimal" choice of register allocation I wonder if we can
> adjust the RA / DF to consider liveness to extend always up to the
> next sequence point.
> 
> That is, it is reasonable to lose track of z in
> 
>   int a = (x + z) + b;
> 
> but only after x + z is computed.

Right, that could maybe be a way to improve on Og.

But the trade-off I'm going for currently with "Og -fkeep-vars-live" is:
O0-like debug experience, slower compilation than Og, faster execution than
-O0.

> So a different approach would be
> to see this as an issue with the way debug consumers "step" now that
> we emit column debug information?
> 
> That said, starting with GCC 8 we now have # DEBUG BEGIN_STMT
> markers:
> 
>   <bb 4> [local count: 856416479]:
>   # DEBUG BEGIN_STMT
>   _1 = x_4(D) + z_7(D);
>   a_9 = _1 + b_8(D);
>   # DEBUG a => a_9
>   # DEBUG BEGIN_STMT
> 
> they are also present on RTL as
> 
> (debug_insn 21 20 0 (debug_marker) "t.c":11 -1
>      (nil))
> 
> which means we _could_ somehow use those for code-generation (in DF analysis
> to extend lifetimes somehow).  Of course if we do we have to emit those
> always to not get code-gen differences -g vs. -g0.
> 
> Maybe the RA could also be generally changed to use a not-LRU style
> preferencing of available registers to choose from.
> 
> In the end nothing saves us from "spilling" things if we really want to
> be able to access all final values of registers up to the point they
> go out of scope...
> 
> So another idea would be instead of like -O0 assigning the main location
> to the stack we spill no longer used vars during LRA?  We'd still have to
> somehow know until when that spill slot needs to live (and afterwards can
> be re-used).

To stay in terms of the current patch, that could mean we mark artificial uses
as such, and then make the compiler handle them smarter than regular uses.

[ Also, this version of the patch adds insertion of nops to prevent empty debug
ranges, which allows me to get the value of 'a' at the 'return a'  mentioned
above. ]

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]