Bug 96040 - [10/11 Regression] Compiled code causes SIGBUS at -O2
Summary: [10/11 Regression] Compiled code causes SIGBUS at -O2
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: ipa (show other bugs)
Version: 10.1.0
: P2 normal
Target Milestone: 10.2
Assignee: Martin Jambor
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2020-07-02 21:21 UTC by Joseph C. Sible
Modified: 2020-07-04 17:52 UTC (History)
4 users (show)

See Also:
Host:
Target: x86_64-linux-gnu
Build:
Known to work:
Known to fail:
Last reconfirmed: 2020-07-02 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Joseph C. Sible 2020-07-02 21:21:36 UTC
Consider this C code:

int puts(const char *);
int snprintf(char *, unsigned long, const char *, ...);
unsigned long strspn(const char *, const char *);

struct TValue {
  union {
    long long i;
    double n;
  } value_;
  unsigned char tt_;
};

static int tostringbuff (struct TValue *num, char *str) {
  int len;
  if (num->tt_ == 3) {
    len = snprintf(str,50,"%lld",num->value_.i);
  } else {
    len = snprintf(str,50,"%.14g",num->value_.n);
    if (str[strspn(str, "-0123456789")] == '\0') {
      str[len++] = '.';
      str[len++] = '0';
    }
  }
  return len;
}

void unused (int *buff, struct TValue *num) {
  char junk[50];
  *buff += tostringbuff(num, junk);
}

char space[400];

void addnum2buff (int *buff, struct TValue *num) __attribute__((__noinline__));
void addnum2buff (int *buff, struct TValue *num) {
  *buff += tostringbuff(num, space);
}

int main(void) {
    int buff = 0;
    struct TValue num;
    num.value_.n = 1.0;
    num.tt_ = 19;
    addnum2buff(&buff, &num);
    puts(space);
}

It's supposed to print "1.0". When compiled with "gcc -O2", it instead crashes with SIGBUS. This appears to be a regression, since it works fine on GCC 9.

The minimization is my own, but the bug was originally found in the wild by actboy168 compiling Lua 5.4.0 on Arch Linux: http://lua-users.org/lists/lua-l/2020-07/msg00001.html

https://godbolt.org/z/RMc3RX
Comment 1 Jakub Jelinek 2020-07-02 22:04:15 UTC
Started with r10-4651-gafeb887562af17ea235fbec650ff6d16c412682a
From quick look it seems like a disagreement on how many arguments tostringbuff.part.0.isra has.
Comment 2 Joseph C. Sible 2020-07-03 02:02:17 UTC
Andrew Gierth posted this to the Lua mailing list:

> I think I see what's happening here, but I don't think I have an account
> on the gcc bug tracker to post it there (feel free to forward this).
> It's not (I think) a confusion over how many arguments the specialized
> tostringbuff.part.0.isra function has, but rather over the type, or
> equivalently the register allocation, for the first parameter.
> 
> The callsite is putting num->value_.n into %rdi, and &space into %rsi,
> as if the function were declared as taking (long long v, char *buf)
> rather than (double v, char *buf) which would require that num->value_.n
> be placed in %xmm0 and &space into %rdi.
> 
> But the code of the function itself is assuming that the incoming values
> are in %xmm0 and %rdi - %xmm0 is not touched because it's already in the
> right place for the snprintf call, while %rdi is left as the first arg
> to snprintf. The value 0x3ff0000000000000 is of course 1.0 as a double,
> which naturally does not work well as a pointer so it blows up.

I agree with the gist of that. I think this is more of a calling convention issue than an argument count issue. It's as if the following pseudo-C were written:

static int tostringbuff.part.0.isra.0 (union { long long i; double n; } value_ /* rdi */, char *str /* rsi */);

void addnum2buff (int *buff, struct TValue *num) {
    if(num->tt_ == 3) {
        buff += snprintf(space,50,"%lld",num->value_.i);
    } else {
        buff += tostringbuff.part.0.isra.0(num->value_ /* rdi */, space /* rsi */);
    }
}

static int tostringbuff.part.0.isra.0 (double n /* xmm0 */, char *str /* rdi */) {
    int len;
    len = snprintf(str,50,"%.14g",n);
    if(str[strspn(str, "-0123456789")] == '\0') {
      str[len++] = '.';
      str[len++] = '0';
    }
    return len;
}

In English, the compiler couldn't make up its mind as to how tostringbuff.part.0.isra.0 was declared. At the call site, the first parameter was a union, but at the definition, the first parameter was a double. Mixing up a union and its element is fine when they only live in memory since they have the same address, but here it was in a register. Since the calling convention puts those types in different kinds of registers, what the function thought was a pointer was actually the double value, predictably causing a crash when it was dereferenced.
Comment 3 Richard Biener 2020-07-03 06:58:38 UTC
Indeed in the IL I see long long actual arguments passed but the function argument type is double.  It looks like somehow argument modification
of the calls fails or is incomplete.  Martin?
Comment 4 Martin Jambor 2020-07-03 09:04:55 UTC
I'll have a look
Comment 5 Martin Jambor 2020-07-03 10:32:42 UTC
IPA-split puts the double access to the union in the .part function
and keeps only the long int access in the "original" function.
IPA-SRA thinks it can work with that but the code in "transitive" call
parameter splitting apparently does not handle this case properly.

The easiest fix and probably the one most suitable for backporting is
to prevent splitting of such unions with the following:

--- a/gcc/ipa-sra.c
+++ b/gcc/ipa-sra.c
@@ -3271,7 +3271,9 @@ all_callee_accesses_present_p (isra_param_desc *param_desc,
        continue;
       param_access *pacc = find_param_access (param_desc, argacc->unit_offset,
                                              argacc->unit_size);
-      if (!pacc || !pacc->certain)
+      if (!pacc
+         || !pacc->certain
+         || !types_compatible_p (argacc->type, pacc->type))
        return false;
     }
   return true;


Alternatively, we can of course handle the type mismatch and insert
appropriate V_C_E:

diff --git a/gcc/ipa-param-manipulation.c b/gcc/ipa-param-manipulation.c
index 2cc4bc79dc1..de9bad78712 100644
--- a/gcc/ipa-param-manipulation.c
+++ b/gcc/ipa-param-manipulation.c
@@ -641,6 +641,12 @@ ipa_param_adjustments::modify_call (gcall *stmt,
            && trans_map[j].unit_offset == apm->unit_offset)
          {
            repl = trans_map[j].repl;
+           if (!useless_type_conversion_p (apm->type, TREE_TYPE (repl)))
+             {
+               repl = build1 (VIEW_CONVERT_EXPR, apm->type, repl);
+               repl = force_gimple_operand_gsi (&gsi, repl, true, NULL, true,
+                                                GSI_SAME_STMT);
+             }
            break;
          }
       if (repl)
Comment 6 Jakub Jelinek 2020-07-03 10:35:05 UTC
V_C_Es could be only used if both types have the same size, is that checked somewhere that the union e.g. doesn't have int and double members instead?
Comment 7 Martin Jambor 2020-07-03 11:34:04 UTC
Yes, IPA-SRA identifies accesses by both offset and size, so the situation would not have happened if the size was different.
Comment 8 Jakub Jelinek 2020-07-03 11:50:08 UTC
Ok.  Still, there can be problems in holding non-floating values in floating point parameters, because not all bits are necessarily preserved (e.g. padding bits inside of the floating point format).
Comment 9 Martin Jambor 2020-07-03 12:47:26 UTC
True. Richi expressed preference for avoiding the transform when there are type mismatches, so I'm currently bootstrapping that.  I guess we can always revisit the decision if we ever discover it would be really beneficial to perform the split.
Comment 10 CVS Commits 2020-07-03 15:41:35 UTC
The master branch has been updated by Martin Jambor <jamborm@gcc.gnu.org>:

https://gcc.gnu.org/g:b9a15a8325ba89b926e3c437b7961829a6b2fa2b

commit r11-1809-gb9a15a8325ba89b926e3c437b7961829a6b2fa2b
Author: Martin Jambor <mjambor@suse.cz>
Date:   Fri Jul 3 17:37:33 2020 +0200

    ipa-sra: Avoid transitive splits with type mismatches (PR 96040)
    
    PR 96040 revealed IPA-SRA, when checking whether an intended split is
    the same as the one in a called function does not also check if the
    types match and the transformation code does not handle any resulting
    type mismatches.  This patch simply avoids the the split in the case
    of mismatches, so that we do not have to be careful about invalid
    floating-point values being passed in floating point registers and
    related issues.
    
    gcc/ChangeLog:
    
    2020-07-03  Martin Jambor  <mjambor@suse.cz>
    
            PR ipa/96040
            * ipa-sra.c (all_callee_accesses_present_p): Do not accept type
            mismatched accesses.
    
    gcc/testsuite/ChangeLog:
    
    2020-07-03  Martin Jambor  <mjambor@suse.cz>
    
            PR ipa/96040
            * gcc.dg/ipa/pr96040.c: New test.
Comment 11 CVS Commits 2020-07-04 17:48:17 UTC
The releases/gcc-10 branch has been updated by Martin Jambor <jamborm@gcc.gnu.org>:

https://gcc.gnu.org/g:56a34e3e1cbb7d3b2f9298c14d4d3a3a030c7755

commit r10-8425-g56a34e3e1cbb7d3b2f9298c14d4d3a3a030c7755
Author: Martin Jambor <mjambor@suse.cz>
Date:   Sat Jul 4 19:46:52 2020 +0200

    ipa-sra: Avoid transitive splits with type mismatches (PR 96040)
    
    PR 96040 revealed IPA-SRA, when checking whether an intended split is
    the same as the one in a called function does not also check if the
    types match and the transformation code does not handle any resulting
    type mismatches.  This patch simply avoids the the split in the case
    of mismatches, so that we do not have to be careful about invalid
    floating-point values being passed in floating point registers and
    related issues.
    
    gcc/ChangeLog:
    
    2020-07-03  Martin Jambor  <mjambor@suse.cz>
    
            PR ipa/96040
            * ipa-sra.c (all_callee_accesses_present_p): Do not accept type
            mismatched accesses.
    
    gcc/testsuite/ChangeLog:
    
    2020-07-03  Martin Jambor  <mjambor@suse.cz>
    
            PR ipa/96040
            * gcc.dg/ipa/pr96040.c: New test.
    
    (cherry picked from commit b9a15a8325ba89b926e3c437b7961829a6b2fa2b)
Comment 12 Martin Jambor 2020-07-04 17:52:49 UTC
Fixed.