This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFC PATCH] -fsanitize=pointer-overflow support (PR sanitizer/80998)


On Tue, 20 Jun 2017, Jakub Jelinek wrote:

> On Tue, Jun 20, 2017 at 09:41:43AM +0200, Richard Biener wrote:
> > On Mon, 19 Jun 2017, Jakub Jelinek wrote:
> > 
> > > Hi!
> > > 
> > > The following patch adds -fsanitize=pointer-overflow support,
> > > which adds instrumentation (included in -fsanitize=undefined) that checks
> > > that pointer arithmetics doesn't wrap.  If the offset on ptr p+ off when treating
> > > it as signed value is non-negative, we check whether the result is bigger
> > > (uintptr_t comparison) than ptr, if it is negative in ssizetype, we check
> > > whether the result is smaller than ptr, otherwise we check at runtime
> > > whether (ssizetype) off < 0 and do the check based on that.
> > > The patch checks both POINTER_PLUS_EXPR, as well as e.g. ADDR_EXPR of
> > > handled components, and even handled components themselves (exception
> > > is for constant offset when the base is an automatic non-VLA decl or
> > > decl that binds to current function where we can at compile time for
> > > sure guarantee it will fit).
> > 
> > Does this "properly" interact with any array-bound sanitizing we do?
> > Say, for
> > 
> >  &a->b[i].c.d
> > 
> > ?
> 
> It doesn't interact with it right now at all, and I think it shouldn't
> for the case you wrote, &a->b[i].c.d even if i is within bounds of the
> declared array the pointer still could point to something that wraps around.
> struct S { struct T { struct U { int d; } c; } b[0x1000000]; } *a;
> could still in a buggy program point to something that is actually much
> shorter like a = (struct S *) malloc (0x10000 * sizeof (int));
> or similar and there could be a wrap around.
> 
> What I have considered, but haven't implemented yet, is checking if there
> is UBSAN_BOUNDS sanitization and it is actually array or structure with
> array etc. (i.e. there is no pointer dereference) - for
> &q.b[i].c.d
> if the bounds check is present and we are sure about the object size
> (i.e. automatic variable or locally defined file scope var).
> 
> > > Martin has said he'll write the sanopt part of optimization
> > > (if UBSAN_PTR for some pointer is dominated by UBSAN_PTR for the same
> > > pointer and the offset is constant in both cases and equal or absolute value
> > > bigger and same sign in the dominating UBSAN_PTR, we can avoid the dominated
> > > check).  
> > > 
> > > For the cases where there is a dereference (i.e. not ADDR_EXPR of the
> > > handled component or POINTER_PLUS_EXPR), I wonder if we couldn't ignore
> > > say constant offsets in range <-4096, 4096> or something similar, hoping
> > > people don't have anything mapped at the page 0 and -pagesize in hosted
> > > env.  Thoughts on that?
> > 
> > Not sure what the problem is here?
> 
> It would be an attempt to avoid sanitizing int foo (int *p) { return p[10] + p[-5]; }
> (when the offset is constant and small and we dereference it).
> If there is no page mapped at NULL or at the highest page in the virtual
> address space, then the above will crash in case p + 10 or p - 5 wraps
> around.

Ah, so merely an optimization to avoid excessive instrumentation then,
yes, this might work (maybe make 4096 a --param configurable to be able
to disable it).

> > > I've bootstrapped/regtested the patch on x86_64-linux and i686-linux
> > > and additionally bootstrapped/regtested with bootstrap-ubsan on both too.
> > > The latter revealed a couple of issues I'd like to discuss:
> > > 
> > > 1) libcpp/symtab.c contains a couple of spots reduced into:
> > > #define DELETED ((char *) -1)
> > > void bar (char *);
> > > void
> > > foo (char *p)
> > > {
> > >   if (p && p != DELETED)
> > >     bar (p);
> > > }
> > > where we fold it early into if ((p p+ -1) <= (char *) -3)
> > > and as the instrumentation is done during ubsan pass, if p is NULL,
> > > we diagnose this as invalid pointer overflow from NULL to 0xffff*f.
> > > Shall we change the folder so that during GENERIC folding it
> > > actually does the addition and comparison in pointer_sized_int
> > > instead (my preference), or shall I move the UBSAN_PTR instrumentation
> > > earlier into the FEs (but then I still risk stuff is folded earlier)?
> > 
> > Aww, so we turn the pointer test into a range test ;)  That it uses
> > a pointer type rather than an unsigned integer type is a bug, probably
> > caused by pointers being TYPE_UNSIGNED.
> > 
> > Not sure if the folding itself is worthwhile to keep though, thus an
> > option would be to not generate range tests from pointers?
> 
> I'll have a look.  Maybe only do it during reassoc and not earlier.

It certainly looks somewhat premature in fold-const.c.

> > > 3) not really related to this patch, but something I also saw during the
> > > bootstrap-ubsan on i686-linux:
> > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147426384 - 2147475412 cannot be represented in type 'int'
> > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147426384 - 2147478324 cannot be represented in type 'int'
> > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147450216 - 2147451580 cannot be represented in type 'int'
> > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147450216 - 2147465664 cannot be represented in type 'int'
> > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147469348 - 2147451544 cannot be represented in type 'int'
> > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147482364 - 2147475376 cannot be represented in type 'int'
> > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147483624 - 2147475376 cannot be represented in type 'int'
> > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147483628 - 2147451544 cannot be represented in type 'int'
> > > ../../gcc/memory-block.cc:59:4: runtime error: signed integer overflow: -2147426384 - 2147475376 cannot be represented in type 'int'
> > > ../../gcc/memory-block.cc:59:4: runtime error: signed integer overflow: -2147450216 - 2147451544 cannot be represented in type 'int'
> > > The problem here is that we lower pointer subtraction, e.g.
> > > long foo (char *p, char *q) { return q - p; }
> > > as return (ptrdiff_t) ((ssizetype) q - (ssizetype) p);
> > > and even for a valid testcase where we have an array across
> > > the middle of the virtual address space, say the first one above
> > > is (char *) 0x8000dfb0 - (char *) 0x7fffdfd4 subtraction, even if
> > > there is 128KB array starting at 0x7fffd000, it will yield
> > > UB (not in the source, but in whatever the compiler lowered it into).
> > > So, shall we instead do the subtraction in sizetype and only then
> > > cast?  For sizeof (*ptr) > 1 I think we have some outstanding PR,
> > > and it is more difficult to find out in what types to compute it.
> > > Or do we want to introduce POINTER_DIFF_EXPR?
> > 
> > Just use uintptr_t for the difference computation (well, an unsigned
> > integer type of desired precision -- mind address-spaces), then cast
> > the result to signed.
> 
> Ok (of course, will handle this separately from the rest).

Yes.  Note I didn't look at the actual patch (yet).

Richard.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]