Bug 81801 - [PATCH] Difference of two pointers generates signed overflow
Summary: [PATCH] Difference of two pointers generates signed overflow
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 8.0
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Depends on:
Reported: 2017-08-10 15:15 UTC by Petr Malat
Modified: 2019-09-19 06:28 UTC (History)
3 users (show)

See Also:
Target: 32bit target
Known to work:
Known to fail:
Last reconfirmed:

Correction + test (1.26 KB, text/plain)
2017-08-10 15:15 UTC, Petr Malat

Note You need to log in before you can comment on or make changes to this bug.
Description Petr Malat 2017-08-10 15:15:46 UTC
Created attachment 41964 [details]
Correction + test

SSH compiled with -ftrapv aborts in strlcpy() function, which computes a length of a string by subtracting pointer to the string from the pointer to '\0' character terminating the string.

The problem arises only if the string spawns over the positive/negative boundary, for example assume a string of the length 23 on 32-bit platform located at 0x7FFFfff8:
   char *str = (char *)0x7FFFfff8;
   char *str_end = (char *)0x8000000f;
then the following:
   int len = str_end - str;
produces a signed overflow, which leads to abort() if compiled with -ftrapv.

The whole issue can be easily shown by compiling the following code:
  int ptrdiff(char *a, char *b) {
    return b - a;
but more illustrative is to use a pointer to a larger object:
  int ptrdiff(int *a, int *b) {
    return b - a;
which produces the following test.c.003t.original:
  ;; Function ptrdiff (null)
  ;; enabled by -tree-original
    return ((int) b - (int) a) /[ex] 4;
I have tracked this problem down to pointer_diff() function, which explicitly uses signed type - either ptrdiff_t or larger signed type matching the pointer size if needed. I assume this is not correct and the pointer difference should always use unsigned type matching the size of the pointer to make the operation well behaving even if it overflows. After changing the function to use unsigned (the patch is attached), I get the following test.c.003t.original:
  ;; Function ptrdiff (null)
  ;; enabled by -tree-original
    return (int) ((unsigned int) b - (unsigned int) a) /[ex] 4;
which looks better to me.

The problem with this change is that it leads to
FAIL: gcc.dg/tree-ssa/cmpexactdiv-2.c scan-tree-dump-not optimized "minus_expr"
where it doesn't optimize
  __PTRDIFF_TYPE__ l1 = b - a;
  __PTRDIFF_TYPE__ l2 = c - a;
  return l1 < l2;
  return b < c;
Which it could still do, but not because it knows the overflow can't occur but because if the difference of two pointers doesn't fit to __PTRDIFF_TYPE__ the behavior is undefined - so it's not possible the difference of two pointers would wrap around to a wrong sign in the correct program.
Comment 1 Andrew Pinski 2017-08-11 08:03:16 UTC
Actually it is worse than you think.  The problem is __PTRDIFF_TYPE__ is defined incorrectly if pointers can span across the sign bit (32bit).  There is whole thread about this on gcc@.  The C/C++ language defines ptrdiff_t as a type which can be used two take a difference of any two valid pointers and get a signed difference.

In the case here 0x8002000f-0x0001000f should actually return a signed integer that represents the space between those two pointers.
Comment 2 Andrew Pinski 2017-08-11 08:04:54 UTC
Related to PR 80998.
Comment 3 Petr Malat 2017-08-11 09:04:34 UTC
After reading the related part of the standard (actually, N1570 draft), I'm not so sure about this. The semantics section in 6.5.6 says pointers have to be from the same object and also the result must fit into the ptrdiff_t, which range is [PTRDIFF_MIN, PTRDIFF_MAX], which has to be equal or larger than [-65535, 65535].

So, theoretically it would be fine to use a shorter type for that, however there is still a problem if the calculation is done as it is now for larger objects as the difference is computed first at bytes and divided later by the object size, so the overflow can still happen even the result fits in [PTRDIFF_MIN, PTRDIFF_MAX].

One could solve it by dividing both pointers by the size before the subtraction, but that would make the operation more expensive.
Comment 4 joseph@codesourcery.com 2017-08-11 12:18:54 UTC
On Fri, 11 Aug 2017, oss at malat dot biz wrote:

> One could solve it by dividing both pointers by the size before the
> subtraction, but that would make the operation more expensive.

See also what I said in bug 67999 comment 24.

The combination (subtract to produce a ptrdiff_t result, with modulo 
arithmetic on overflow, then divide in ptrdiff_t) isn't correct for 
objects too large in bytes for ptrdiff_t.  But the following should be 
correct: right shift (logical) both pointers for the power-of-2 factor in 
what you are dividing by, subtract (modulo), multiply (modulo) by the (mod 
2^n) reciprocal of the odd part of what you are dividing by, interpret the 
result as signed.

It's less efficient than code sequences that only work for objects less 
than half the address space, and such large objects can never work if you 
do pointer subtraction of pointers to char that are too far apart to be 
represented in ptrdiff_t.  I think it makes most sense to disallow such 
large objects properly (including in malloc and mmap), but an option to 
support pointer subtraction in them would not be ridiculous.