This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
[PATCH] Small x86 debug info improvement (PR debug/83157)
- From: Jakub Jelinek <jakub at redhat dot com>
- To: Richard Biener <rguenther at suse dot de>, Jeff Law <law at redhat dot com>, Alexandre Oliva <aoliva at redhat dot com>, Uros Bizjak <ubizjak at gmail dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Thu, 22 Mar 2018 22:49:45 +0100
- Subject: [PATCH] Small x86 debug info improvement (PR debug/83157)
- Reply-to: Jakub Jelinek <jakub at redhat dot com>
Hi!
The following patch fixes:
-FAIL: gcc.dg/guality/pr41616-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects execution test
-FAIL: gcc.dg/guality/pr41616-1.c -O3 -g execution test
on x86_64-linux and
-FAIL: gcc.dg/guality/pr41616-1.c -O3 -g execution test
on i686-linux. The problem can be illustrated with a short testcase:
void bar (int);
int
foo (int x)
{
bar (0);
int b = x > 0 ? -1 : 1;
bar (b);
return 0;
}
This testcase is simplified from pr41616-1.c. Before var-tracking we have:
53: {di:SI=0;clobber flags:CC;}
REG_UNUSED flags:CC
54: flags:CCNO=cmp(bx:SI,0)
REG_DEAD bx:SI
55: strict_low_part(di:QI)=flags:CCNO<=0
REG_DEAD flags:CCNO
38: di:SI=di:SI*0x2-0x1
17: debug b => di:SI
18: debug begin stmt marker
20: call [`bar'] argc:0
REG_DEAD di:SI
REG_CALL_DECL `bar'
where the:
xorl %edi, %edi
testl %ebx, %ebx
setle %dil
sequence is what the i386.md peephole2 creates - it first clears the
whole register, then performs the comparison, then stores the setcc
result into the low 8 bits of the previously cleared register (leaving
the upper bits zero), and finally the whole 32-bit register is used.
Unfortunately var-tracking.c doesn't handle the strict_low_part store,
so the location description says that b is well defined right at the
call insn (it lives there in the %edi register), but because %edi is a
call-clobbered register and isn't used after the call, it says that
from the end of the call insn - 1 onwards, b is <optimized away>.
cselib.c (cselib_record_sets) has minimal support for STRICT_LOW_PART:
/* A STRICT_LOW_PART can be ignored; we'll record the equivalence for
the low part after invalidating any knowledge about larger modes. */
if (GET_CODE (sets[i].dest) == STRICT_LOW_PART)
sets[i].dest = dest = XEXP (dest, 0);
but var-tracking.c add_stores wouldn't do anything anyway, because it
didn't recognize that case (fixed by the first hunk). That alone isn't
sufficient to fix this issue, because cselib_record_sets invalidates all
the registers set (which here invalidates the whole %rdi register) and at
the end establishes the VALUE corresponding to SET_SRC as live in the
destination hard register (%dil). The cselib.c part (unfortunately this
can't be done in var-tracking.c proper, because it needs to happen after
the whole register is invalidated; it is limited to users with a non-NULL
cselib_record_sets_hook, which is only var-tracking) checks for the above
situation, where we know the whole register is zero before the
instruction, and adds a permanent equivalence saying that the VALUE of
the whole new register (%edi) is equal to the zero extension of the VALUE
corresponding to SET_SRC, so if cselib/var-tracking looks up %dil, it
will find the exact SET_SRC's VALUE, and if it looks up %edi, it will
find (zero_extend:SI (value:QI)).
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
2018-03-22 Jakub Jelinek <jakub@redhat.com>
PR debug/83157
* var-tracking.c (add_stores): Handle STRICT_LOW_PART SET_DEST.
* cselib.c (cselib_record_sets): For STRICT_LOW_PART dest,
lookup if dest in some wider mode is known to be const0_rtx and
if so, record permanent equivalence for it to be ZERO_EXTEND of
the narrower mode destination.
--- gcc/var-tracking.c.jj 2018-02-19 22:57:06.141917771 +0100
+++ gcc/var-tracking.c 2018-03-22 16:51:29.264977212 +0100
@@ -5962,7 +5962,9 @@ add_stores (rtx loc, const_rtx expr, voi
mo.type = MO_CLOBBER;
mo.u.loc = loc;
if (GET_CODE (expr) == SET
- && SET_DEST (expr) == loc
+ && (SET_DEST (expr) == loc
+ || (GET_CODE (SET_DEST (expr)) == STRICT_LOW_PART
+ && XEXP (SET_DEST (expr), 0) == loc))
&& !unsuitable_loc (SET_SRC (expr))
&& find_use_val (loc, mode, cui))
{
--- gcc/cselib.c.jj 2018-01-04 00:43:17.210703103 +0100
+++ gcc/cselib.c 2018-03-22 17:45:52.042904883 +0100
@@ -2502,6 +2502,7 @@ cselib_record_sets (rtx_insn *insn)
rtx body = PATTERN (insn);
rtx cond = 0;
int n_sets_before_autoinc;
+ int n_strict_low_parts = 0;
struct cselib_record_autoinc_data data;
body = PATTERN (insn);
@@ -2556,6 +2557,7 @@ cselib_record_sets (rtx_insn *insn)
for (i = 0; i < n_sets; i++)
{
rtx dest = sets[i].dest;
+ rtx orig = dest;
/* A STRICT_LOW_PART can be ignored; we'll record the equivalence for
the low part after invalidating any knowledge about larger modes. */
@@ -2581,6 +2583,55 @@ cselib_record_sets (rtx_insn *insn)
else
sets[i].dest_addr_elt = 0;
}
+
+ /* Improve handling of STRICT_LOW_PART if the current value is known
+ to be const0_rtx, then the low bits will be set to dest and higher
+ bits will remain zero. Used in code like:
+
+ {di:SI=0;clobber flags:CC;}
+ flags:CCNO=cmp(bx:SI,0)
+ strict_low_part(di:QI)=flags:CCNO<=0
+
+ where we can note both that di:QI=flags:CCNO<=0 and
+ also that because di:SI is known to be 0 and strict_low_part(di:QI)
+ preserves the upper bits that di:SI=zero_extend(flags:CCNO<=0). */
+ scalar_int_mode mode;
+ if (dest != orig
+ && cselib_record_sets_hook
+ && REG_P (dest)
+ && HARD_REGISTER_P (dest)
+ && is_a <scalar_int_mode> (GET_MODE (dest), &mode)
+ && n_sets + n_strict_low_parts < MAX_SETS)
+ {
+ opt_scalar_int_mode wider_mode_iter;
+ FOR_EACH_WIDER_MODE (wider_mode_iter, mode)
+ {
+ scalar_int_mode wider_mode = wider_mode_iter.require ();
+ if (GET_MODE_PRECISION (wider_mode) > BITS_PER_WORD)
+ break;
+
+ rtx reg = gen_lowpart (wider_mode, dest);
+ if (!REG_P (reg))
+ break;
+
+ cselib_val *v = cselib_lookup (reg, wider_mode, 0, VOIDmode);
+ if (!v)
+ continue;
+
+ struct elt_loc_list *l;
+ for (l = v->locs; l; l = l->next)
+ if (l->loc == const0_rtx)
+ break;
+
+ if (!l)
+ continue;
+
+ sets[n_sets + n_strict_low_parts].dest = reg;
+ sets[n_sets + n_strict_low_parts].src = dest;
+ sets[n_sets + n_strict_low_parts++].src_elt = sets[i].src_elt;
+ break;
+ }
+ }
}
if (cselib_record_sets_hook)
@@ -2625,6 +2676,20 @@ cselib_record_sets (rtx_insn *insn)
|| (MEM_P (dest) && cselib_record_memory))
cselib_record_set (dest, sets[i].src_elt, sets[i].dest_addr_elt);
}
+
+ /* And deal with STRICT_LOW_PART. */
+ for (i = 0; i < n_strict_low_parts; i++)
+ {
+ if (! PRESERVED_VALUE_P (sets[n_sets + i].src_elt->val_rtx))
+ continue;
+ machine_mode dest_mode = GET_MODE (sets[n_sets + i].dest);
+ cselib_val *v
+ = cselib_lookup (sets[n_sets + i].dest, dest_mode, 1, VOIDmode);
+ cselib_preserve_value (v);
+ rtx r = gen_rtx_ZERO_EXTEND (dest_mode,
+ sets[n_sets + i].src_elt->val_rtx);
+ cselib_add_permanent_equiv (v, r, insn);
+ }
}
/* Return true if INSN in the prologue initializes hard_frame_pointer_rtx. */
Jakub