This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: determining aggregate member from MEM_REF
- From: Jeff Law <law at redhat dot com>
- To: Richard Biener <richard dot guenther at gmail dot com>, Martin Sebor <msebor at gmail dot com>
- Cc: GCC Mailing List <gcc at gcc dot gnu dot org>
- Date: Mon, 26 Feb 2018 12:57:53 -0700
- Subject: Re: determining aggregate member from MEM_REF
- References: <108da35d-4146-2b3a-a667-692d41bcf8f6@gmail.com> <CAFiYyc02CumJYaSG5x7QvSzN+0wyVKXGY=uvyHB+76Z5QxwNCQ@mail.gmail.com> <930e0618-0ea8-573d-8b51-beafa7969565@gmail.com> <CAFiYyc3SeNaZEwVM4eFKcgD1CZF=JLSZBgMx=JPQ+mANrJO3bA@mail.gmail.com>
On 02/26/2018 05:08 AM, Richard Biener wrote:
> On Fri, Feb 16, 2018 at 8:07 PM, Martin Sebor <msebor@gmail.com> wrote:
>> Say I have a struct like this:
>>
>> struct A {
>> char a[4], b[5];
>> };
>>
>> then in
>>
>> extern struct A *a;
>>
>> memset (&a[0].a[0] + 14, 0, 3); // invalid
>>
>> memset (&a[1].b[0] + 1, 0, 3); // valid
>>
>> both references are the same:
>>
>> &MEM_REF[char*, (void *)a + 14];
>>
>> and there's no way to unambiguously tell which member each refers
>> to, or even to distinguish the valid one from the other. MEM_REF
>> makes the kind of analysis I'm interested in very difficult (or
>> impossible) to do reliably.
>
> Yes. Similar issues exist for the objsz pass (aka fortify stuff).
In fact, I think we have a long-standing regression in this space.
>
>> Being able to determine the member is useful in -Wrestrict where
>> rather than printing the offsets from the base object I'd like
>> to be able to print the offsets relative to the referenced
>> member. Beyond -Wrestrict, identifying the member is key in
>> detecting writes that span multiple members (e.g., strcpy).
>> Those could (for example) overwrite a member that's a pointer
>> to a function and cause code injection. As it is, GCC has no
>> way to do that because __builtin_object_size considers the
>> size of the entire enclosing object, not that of the member.
>> For the same reason: MEM_REF makes it impossible.
>
> We're first and foremost an optimizing compiler and not a
> static analysis tool. People seem to want some optimization
> to make static analysis easier but then they have to live with
> imperfect results. There's no easy way around this kind of
> issue.
True, but there is significant value in generating good diagnostics.
IMHO it's worth thinking about if/how we can get the refinements we want
on the diagnostic side without regressing on the code generation side.
jeff