Bug 43051 - [4.5 Regression] VTA causes a stack living parameter unavailable in most of the function
Summary: [4.5 Regression] VTA causes a stack living parameter unavailable in most of t...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: debug (show other bugs)
Version: 4.5.0
: P1 normal
Target Milestone: 4.5.0
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-debug
Depends on:
Blocks:
 
Reported: 2010-02-12 13:27 UTC by Jakub Jelinek
Modified: 2010-03-16 10:53 UTC (History)
3 users (show)

See Also:
Host:
Target: i686-linux
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments
gcc45-pr43051.patch (2.37 KB, patch)
2010-02-15 12:21 UTC, Jakub Jelinek
Details | Diff
additional patch (313 bytes, patch)
2010-02-15 16:13 UTC, Jakub Jelinek
Details | Diff
gcc45-pr43051.patch (3.54 KB, patch)
2010-02-22 17:10 UTC, Jakub Jelinek
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jakub Jelinek 2010-02-12 13:27:33 UTC
static void __attribute__((noinline))
foo (const char *x, long long y, int z)
{
  asm volatile ("" : : "r" (x), "r" ((int) y), "r" (z) : "memory");
}

typedef struct S
{
  struct S *n;
  int v;
} *P;

P
bar (P c, int v, P e)
{
  register int si asm ("esi"), di asm ("edi"), bx asm ("ebx");
  asm volatile ("" : "=r" (si), "=r" (di), "=r" (bx));
  while (c < e)
    {
      foo ("c", (unsigned int) c, 0);
      foo ("v", v, 0);
      foo ("e", (unsigned int) e, 0);
      if (c->v == v) return c;
      foo ("c", (unsigned int) c, 0);
      foo ("v", v, 0);
      foo ("e", (unsigned int) e, 0);
      c++;
    }
  asm volatile ("" : : "r" (si), "r" (di), "r" (bx));
  return 0;
}

at -g -O2 -m32 makes c unavailable in most of the function, while the c var is for the whole time living in the parameter stack slot.  GCC 4.4.x and earlier emit correct location info in this case.
Comment 1 Jakub Jelinek 2010-02-12 14:05:26 UTC
Slightly smaller testcase:
static void __attribute__((noinline))
foo (const char *x, long long y)
{
  asm volatile ("" : : "r" (x), "r" ((int) y), "r" (0) : "memory");
}

typedef struct S
{
  struct S *n;
  int v;
} *P;

P
bar (P c, int v, P e)
{
  register int si asm ("esi"), di asm ("edi"), bx asm ("ebx");
  asm volatile ("" : "=r" (si), "=r" (di), "=r" (bx));
  while (c < e)
    {
      foo ("c", (unsigned int) c);
      foo ("v", v);
      c++;
    }
  asm volatile ("" : : "r" (si), "r" (di), "r" (bx));
  return 0;
}
again -g -O2 -m32.
Comment 2 Jakub Jelinek 2010-02-12 16:44:45 UTC
So, when vt_find_locations visits the second time bb 4 in bar, it copies bb 4's out set (just mentioning the important items):
  name: c D.1235
    offset 0
      (value/s/u:SI 30:3972 @0x211f100/0x214cfa0)
 (value/s/u:SI 30:3972 @0x211f100/0x214cfa0)
    offset 0
      (reg:SI 0 ax)
      (mem/f/c/i:SI (value/s/u:SI 20:3962 @0x211f010/0x214cd50) [2 cD.1235+0 S4 A32])
 (value/s/u:SI 20:3962 @0x211f010/0x214cd50)
    offset 0
      (plus:SI (value/s/u:SI 19:19 @0x211eff8/0x214cdb0)
    (const_int 8 [0x8]))
 (value/s/u:SI 19:19 @0x211eff8/0x214cdb0)
    offset 0
      (value/s/u:SI 2:2 @0x211ec68/0x21338b0)
 (value/s/u:SI 2:2 @0x211ec68/0x21338b0)
    offset 0
      (reg/f:SI 6 bp)
      (value/s/u:SI 6:6 @0x211ecc8/0x21337e0)
      (value/s/u:SI 13:13 @0x211ee00/0x2133910)
      (value/s/u:SI 19:19 @0x211eff8/0x214cdb0)
and bb 3's out set:
  name: c D.1235
    offset 0
      (value/s/u:SI 9:9 @0x211ed10/0x2133720)
 (value/s/u:SI 9:9 @0x211ed10/0x2133720)
    offset 0
      (mem/f/c/i:SI (value/s/u:SI 8:3945 @0x211ecf8/0x2133750) [2 cD.1235+0 S4 A32])
 (value/s/u:SI 8:3945 @0x211ecf8/0x2133750)
    offset 0
      (plus:SI (value/s/u:SI 2:2 @0x211ec68/0x21338b0)
    (const_int 8 [0x8]))
 (value/s/u:SI 2:2 @0x211ec68/0x21338b0)
    offset 0
      (reg/f:SI 6 bp)
      (value/s/u:SI 6:6 @0x211ecc8/0x21337e0)
      (value/s/u:SI 13:13 @0x211ee00/0x2133910)
But as the VALUEs are different and some values are embedded inside of the MEMs/PLUSs, intersect_loc_chains doesn't figure out that value 9:9 is actually equivalent to value 30:3972.  Wonder whether intersect_loc_chains needs to do a for_each_rtx search, comparing stuff other than values right away and comparing values using recursive intersect_loc_chains.  And whether this wouldn't be too expensive.

Wonder if http://gcc.gnu.org/ml/gcc-patches/2010-01/msg01016.html could make a difference here (i.e. if the star canonicalization would help), but then
I haven't heard a response to my query in
http://gcc.gnu.org/ml/gcc-patches/2010-01/msg01490.html
Comment 3 Jakub Jelinek 2010-02-12 17:02:25 UTC
http://gcc.gnu.org/ml/gcc-patches/2010-01/msg01016.html (without the IMHO wrong swapping of the two traverses) doesn't help this case at all.
Comment 4 Jakub Jelinek 2010-02-15 12:21:16 UTC
Created attachment 19877 [details]
gcc45-pr43051.patch

Patch that fixes this issue and I'm going to bootstrap/regtest now.
Comment 5 Jakub Jelinek 2010-02-15 14:10:36 UTC
While the patch bootstrapped/regtested on i686-linux (all,obj-c++,lto but no ada) and resulted in quite substantial changes on .debug_info/.debug_loc - .debug_info on cc1 grew by ~ 11.1% from 16513941 to 18581214 bytes and
.debug_loc grew by 48.1% from 6749001 to 13013048 bytes (one would hope that is better debug info quality), it unfortunately didn't bootstrap on x86_64-linux (all,obj-c++,lto,ada), with an ICE while compiling 32-bit g-sehash.adb
in delete_slot_part - the gcc_assert (changed) line.
p debug_rtx (loc)
(mem/c:SI (value/s/u:SI 175:3747 @0x1f4c4b8/0x1f4d2e0) [22 %sfp+-332 S4 A32])
p debug_rtx (var->var_part[0].cur_loc)
(mem/c:SI (value/s/u:SI 148:3703 @0x1f4c170/0x1f4be30) [26 S4 A32])

I guess this is related to the patch, the two VALUEs probably either at some point or even currently point to the same thing.  The question is what should we do about it, easiest would be just not to assert changed is true, but just set it if the location list is empty.  Is that the only spot that needs changing though?  E.g. dataflow_set_remove_mem_locs does something similar...
Comment 6 Jakub Jelinek 2010-02-15 16:13:16 UTC
Created attachment 19879 [details]
additional patch

With this incremental patch it bootstrapped/regtested even on x86_64-linux.
On x86_64-linux cc1 there are almost no debuginfo differences, but as I said earlier on i686-linux it matters a lot:
number of dies with DW_AT_location attribute before/after:
readelf -wi obj598/gcc/cc1 | grep DW_AT_location | wc -l; readelf -wi obj600/gcc/cc1 | grep DW_AT_location | wc -l
158673
226078
number of location lists in .debug_loc before/after:
readelf -wo obj598/gcc/cc1 | grep '<End of' | wc -l; readelf -wo obj600/gcc/cc1 | grep '<End of' | wc -l
116508
182037
and number of ranges with a valid location before/after:
readelf -wo obj598/gcc/cc1 | grep DW_OP_ | wc -l; readelf -wo obj600/gcc/cc1 | grep DW_OP_ | wc -l
495513
980535
Comment 7 Jakub Jelinek 2010-02-22 17:10:39 UTC
Created attachment 19935 [details]
gcc45-pr43051.patch

Updated patch, which includes lxo's fix, guality testcase and also a fix for infinite recursion discovered in redhat/gcc-4_4-branch backport.

While this bootstraps/regtests on the trunk, on redhat/gcc-4_4-branch bootstrap fails due to endless alteration similar to PR42873 when compiling c-common.c (add_tlist) during profiledbootstrap.  Haven't tried yet trunk profiledbootstrap.  The alternation in this case is between:
(gdb) p debug_rtx (var1->var_part[0].loc_chain->next->loc)
(value/u:DI 90:90 @0x2a72678/0x28c5520)
$30 = void
(gdb) p debug_rtx (var1->var_part[0].loc_chain->loc)
(mem/f:DI (value/s/u:DI 56:56 @0x2a722c8/0x2904028) [161 S8 A64])
$31 = void
(gdb) p debug_rtx (var1->var_part[0].loc_chain->next->loc)
(value/u:DI 90:90 @0x2a72678/0x28c5520)
$32 = void
(gdb) p debug_rtx (var1->var_part[0].loc_chain->next->next->loc)
Cannot access memory at address 0x8
and
(gdb) p debug_rtx (var2->var_part[0].loc_chain->loc)
(mem/s/f:DI (value/s/u:DI 54:54 @0x2a722a8/0x2904078) [161 <variable>.next+0 S8 A64])
$33 = void
(gdb) p debug_rtx (var2->var_part[0].loc_chain->next->loc)
(mem/f:DI (value/s/u:DI 56:56 @0x2a722c8/0x2904028) [161 S8 A64])
$34 = void
(gdb) p debug_rtx (var2->var_part[0].loc_chain->next->next->loc)
(value/u:DI 90:90 @0x2a72678/0x28c5520)
Comment 8 Jakub Jelinek 2010-02-22 17:23:34 UTC
It is actually:
dataflow difference found: old and new follow:
 (value/s/u:DI 54:54 @0x42b02e8/0x42a75c8)
    offset 0
      (mem/f:DI (value/s/u:DI 56:56 @0x42b0308/0x42a7578) [161 S8 A64])
      (value/u:DI 90:90 @0x42b06b8/0x41569c0)
 (value/s/u:DI 54:54 @0x42b02e8/0x42a75c8)   
    offset 0
      (mem/s/f:DI (value/s/u:DI 54:54 @0x42b02e8/0x42a75c8) [161 <variable>.next+0 S8 A64])
      (mem/f:DI (value/s/u:DI 56:56 @0x42b0308/0x42a7578) [161 S8 A64])
      (value/u:DI 90:90 @0x42b06b8/0x41569c0)
alternating, it is quite weird that value 54 is in (mem (value 54)), guess need to find out in detail what's going on.
Comment 9 Jakub Jelinek 2010-02-22 18:17:40 UTC
Seems it is canonicalize_values_star that inserts the extra (mem (value 54:54))
every second run through bb 12 at the end of compute_bb_dataflow.
Comment 10 Jakub Jelinek 2010-02-22 19:21:33 UTC
The next variable is said to live in value 90 and (mem (value 54)), that's where
those two are equivalenced (canonicalize_values_star).  The equivalencing between value 90 and value 54
is done in val_resolve:
            /* Map incoming equivalences.  ??? Wouldn't it be nice if
             we just started sharing the location lists?  Maybe a
             circular list ending at the value itself or some
             such.  */
Comment 11 Jakub Jelinek 2010-03-16 10:51:17 UTC
Subject: Bug 43051

Author: jakub
Date: Tue Mar 16 10:50:42 2010
New Revision: 157476

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=157476
Log:
	PR debug/43051
	PR debug/43092
	* cselib.c (cselib_preserve_constants,
	cfa_base_preserved_val): New static variables.
	(preserve_only_constants): New function.
	(cselib_reset_table): If cfa_base_preserved_val is non-NULL, don't
	clear its REG_VALUES.  If cselib_preserve_constants, don't 
	empty the whole hash table, but preserve there VALUEs with constants,
	cfa_base_preserved_val and cfa_base_preserved_val plus constant.
	(cselib_preserve_cfa_base_value): New function.
	(cselib_invalidate_regno): Don't invalidate cfa_base_preserved_val.
	(cselib_init): Change argument to int bitfield.  Set
	cselib_preserve_constants to whether CSELIB_PRESERVE_CONSTANTS
	is in it.
	(cselib_finish): Clear cselib_preserve_constants and
	cfa_base_preserved_val.
	* cselib.h (enum cselib_record_what): New enum.
	(cselib_init): Change argument to int.
	(cselib_preserve_cfa_base_value): New prototype.
	* postreload.c (reload_cse_regs_1): Adjust cselib_init caller.
	* dse.c (dse_step1): Likewise.
	* cfgcleanup.c (thread_jump): Likewise.
	* sched-deps.c (sched_analyze): Likewise.
	* gcse.c (local_cprop_pass): Likewise.
	* simplify-rtx.c (simplify_replace_fn_rtx): Add argument to callback.
	If FN is non-NULL, call the callback always and whenever it returns
	non-NULL just return that.  Only do rtx_equal_p if FN is NULL.
	* rtl.h (simplify_replace_fn_rtx): Add argument to callback.
	* combine.c (propagate_for_debug_subst): Add old_rtx argument,
	compare from with old_rtx and if it isn't rtx_equal_p, return NULL.
	* Makefile.in (var-tracking.o): Depend on $(RECOG_H).
	* var-tracking.c: Include recog.h.
	(bb_stack_adjust_offset): Remove.
	(vt_stack_adjustments): Don't call it, instead just gather the
	adjustments using insn_stack_adjust_offset_pre_post on each bb insn.
	(adjust_stack_reference): Remove.
	(compute_cfa_pointer): New function.
	(hard_frame_pointer_adjustment, cfa_base_rtx): New static variables.
	(struct adjust_mem_data): New type.
	(adjust_mems, adjust_mem_uses, adjust_mem_stores, adjust_insn): New
	functions.
	(get_address_mode): New function.
	(replace_expr_with_values): Use it.
	(use_type): Don't do cselib_lookup for VAR_LOC_UNKNOWN_P.
	Use get_address_mode.  For cfa_base_rtx return MO_CLOBBER.
	(adjust_sets): Remove.
	(add_uses): Don't add extra MO_VAL_USE for cfa_base_rtx plus constant.
	Use get_address_mode.
	(get_adjusted_src): Remove.
	(add_stores): Don't call it.  Never reuse expr SET.  Don't add extra
	MO_VAL_USE for cfa_base_rtx plus constant.  Use get_address_mode.
	(add_with_sets): Don't call adjust_sets.
	(fp_setter, vt_init_cfa_base): New functions.
	(vt_initialize): Change return type to bool.  Move most of pool etc.
	initialization to the beginning of the function from end.  Pass
	CSELIB_RECORD_MEMORY | CSELIB_PRESERVE_CONSTANTS to cselib_init.
	If !frame_pointer_needed, call vt_stack_adjustment before mos
	vector is filled, call vt_init_cfa_base if argp/framep has been
	eliminated to sp.  If frame_pointer_needed and argp/framep has
	been eliminated to hard frame pointer, set
	hard_frame_pointer_adjustment and call vt_init_cfa_base after
	encountering fp setter in the prologue.  For MO_ADJUST, call
	log_op_type before pusing the op into mos vector, not afterwards.
	Call adjust_insn before cselib_process_insn/add_with_sets,
	call cancel_changes (0) afterwards.
	(variable_tracking_main_1): Adjust for vt_initialize calling
	vt_stack_adjustments and returning whether it succeeded or not.

	* gcc.dg/guality/pr43051-1.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/guality/pr43051-1.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/Makefile.in
    trunk/gcc/cfgcleanup.c
    trunk/gcc/combine.c
    trunk/gcc/cselib.c
    trunk/gcc/cselib.h
    trunk/gcc/dse.c
    trunk/gcc/gcse.c
    trunk/gcc/postreload.c
    trunk/gcc/rtl.h
    trunk/gcc/sched-deps.c
    trunk/gcc/simplify-rtx.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/var-tracking.c

Comment 12 Jakub Jelinek 2010-03-16 10:53:19 UTC
Fixed.