[PATCH] c-family: Improve MEM_REF printing for diagnostics [PR98597]

Martin Sebor msebor@gmail.com
Thu Jan 14 17:49:42 GMT 2021


On 1/14/21 12:43 AM, Richard Biener wrote:
> On Wed, 13 Jan 2021, Jakub Jelinek wrote:
> 
>> Hi!
>>
>> The following patch doesn't actually fix the print_mem_ref bugs, I've kept
>> it for now as broken as it was, but at least improves the cases where
>> we can unambiguously map back MEM[&something + off] into some particular
>> reference (e.g. something.foo[1].bar etc.).
>> In the distant past I think we were folding such MEM_REFs back to
>> COMPONENT_REFs and ARRAY_REFs, but we've stopped doing that.
> 
> Yeah, because it has different semantics - *(((int *)t + 3)
> accesses an int object while t.u.b accesses a 't' object from the TBAA
> perspective.
> 
>>   But for
>> diagnostics that is what the user actually want to see IMHO.
>> So on the attached testcase, instead of printing what is in left column
>> it prints what is in right column:
>> ((int*)t) + 3		t.u.b
>> ((int*)t) + 6		t.u.e.i
>> ((int*)t) + 8		t.v
>> s + 1			s[1]
> 
> so while that's "nice" in general, for TBAA diagnostics it might actually
> be misleading.
> 
> I wonder whether we absolutely need to print a C expression here.
> We could print, instead of *((int *)t + 3), "access to a memory
> object of type 'int' at offset 12 bytes from 't'", thus explain
> in plain english.
> 
> That said, *((int *)t + 3) is exactly what the access is,
> semantically.  There's the case of a mismatch of the access type
> and the TBAA type which we cannot write down in C terms but maybe
> we want to have a builtin for this?  __builtin_access (ptr, lvalue-type,
> tbaa-type)?
> 
>> Of course, print_mem_ref needs to be also fixed to avoid printing the
>> nonsense it is printing right now, t is a structure type, so it can't be
>> cast to int* in C and in C++ only using some user operator, and
>> the result of what it printed is a pointer, while the uninitialized reads
>> are int.
>>
>> I was hoping Martin would fix that, but given his comment in the PR I think
>> I'll fix it myself tomorrow.
>>
>> Anyway, this patch is useful on its own.  Bootstrapped/regtested on
>> x86_64-linux and i686-linux, ok for trunk?
> 
> In the light of Martins patch this is probably reasonable but still
> the general direction is wrong (which is why I didn't approve Martins
> original patch).  I'm also somewhat disappointed we're breaking this
> so late in the cycle.

So am I.  I didn't test this change as exhaustively as I could and
(in light of the poor test coverage) should have.  That's my bad.
FWIW, I did do it for the first patch (by instrumenting GCC and
formatting every MEM_REF it came across), but it didn't occur to
me to do it this time around.  I have now completed this testing
(it found one more ICE elsewhere that I'll fix soon).

That said, as I mentioned in the patch submission, most middle end
warnings don't format MEM_REFs.  -Wuninitialized is an outlier here.
Most middle end warnings about invalid accesses print the decl or
allocation call with either a type or size of the access, followed
by either an array index or a byte offset in to the target of
the access.  For consistency I'd like to converge most middle end
warnings on the same style (formatted by the same code).  When
that happens, the MEM_REF format should become much less important,
so I wouldn't invest too much effort into perfecting it.

Martin


More information about the Gcc-patches mailing list