Bug 67442 - [5 Regression] GCC 5.2.0 on x86_64 creates invalid address on specific array index calculation through pointer
Summary: [5 Regression] GCC 5.2.0 on x86_64 creates invalid address on specific array ...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 5.2.0
: P3 normal
Target Milestone: 5.3
Assignee: Richard Biener
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2015-09-03 07:00 UTC by Jussi Judin
Modified: 2015-09-28 10:46 UTC (History)
2 users (show)

See Also:
Host:
Target: x86_64
Build:
Known to work: 6.0
Known to fail: 5.2.0
Last reconfirmed: 2015-09-03 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Jussi Judin 2015-09-03 07:00:10 UTC
GCC 5.2.0 and the current git version (ae436f3bdc66153005b7a260e0239521679f3965) create invalid address to access array index calculated in specific way on x86_64 platform:

short foo[100];

int main()
{
    short* bar = &foo[50];
    short i = 1;
    short j = 1;
    short value = bar[8 - i * 2 * j];
    return value;
}

$ ~/local/gcc-bin/bin/gcc -v -save-temps -Wall -Wextra -g3 -ggdb tst.c
$ ./a.out
Segmentation fault (core dumped)
$ gdb a.out core
Core was generated by `./a.out'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00000000004004d5 in main () at tst.c:8
8           short value = bar[8 - i * 2 * j];
(gdb) disassemble /m 0x00000000004004d5
8           short value = bar[8 - i * 2 * j];
   0x00000000004004ae <+24>:    movswq -0xa(%rbp),%rdx
   0x00000000004004b3 <+29>:    movswq -0xc(%rbp),%rax
   0x00000000004004b8 <+34>:    imul   %rax,%rdx
   0x00000000004004bc <+38>:    mov    %rdx,%rax
   0x00000000004004bf <+41>:    shl    $0x1e,%rax
   0x00000000004004c3 <+45>:    sub    %rdx,%rax
   0x00000000004004c6 <+48>:    shl    $0x2,%rax
   0x00000000004004ca <+52>:    lea    0x10(%rax),%rdx
   0x00000000004004ce <+56>:    mov    -0x8(%rbp),%rax
   0x00000000004004d2 <+60>:    add    %rdx,%rax
=> 0x00000000004004d5 <+63>:    movzwl (%rax),%eax
   0x00000000004004d8 <+66>:    mov    %ax,-0xe(%rbp)
(gdb) p &bar[8 - i * 2 * j]
$3 = (short *) 0x600970 <foo+112>
(gdb) p/x $rax
$2 = 0x100600970

So there is 0x100000000 added somewhere to the address that is used for the actual array access. I have seen value 0x500000000 added in the production code I encountered this issue. It seems to need following parts to trigger: reference to an array (&foo[50], foo could also be on stack, but it creates more nasty addresses), array index access through pointer by [const1 - var1 * const2 * var2], where const1 > 0 and const2 > 1. I just used values 8 and 2 to demonstrate that we don't access negative indexes, whereas the actual signal processing code I encountered this does.

If I assign this array index to a variable and use it to access the array through pointer, then GCC creates correct binary. Also if I use -m32 to compile for x86, this does not result in segmentation fault. So it looks like it's x86_64 specific issue.

The system I used to compile this GCC is x86_64 Ubuntu 14.04 based one. GCC returns following information about itself:

$ ~/local/gcc-bin/bin/gcc -v
Using built-in specs.
COLLECT_GCC=/home/ejusjud/local/gcc-bin/bin/gcc
COLLECT_LTO_WRAPPER=/local/ejusjud/gcc-bin/bin/../libexec/gcc/x86_64-pc-linux-gnu/6.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --prefix=/home/ejusjud/local/gcc-bin
Thread model: posix
gcc version 6.0.0 20150902 (experimental) (GCC) 


And about the tools and other libraries during the compilation:


GNU C11 (GCC) version 6.0.0 20150902 (experimental) (x86_64-pc-linux-gnu)
        compiled by GNU C version 6.0.0 20150902 (experimental), GMP version 5.1.3, MPFR version 3.1.2-p3, MPC version 1.0.1
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
GNU C11 (GCC) version 6.0.0 20150902 (experimental) (x86_64-pc-linux-gnu)
        compiled by GNU C version 6.0.0 20150902 (experimental), GMP version 5.1.3, MPFR version 3.1.2-p3, MPC version 1.0.1
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
Compiler executable checksum: 2b4ce747c000701038232915f15d53af
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-Wall' '-Wextra' '-g3' '-ggdb' '-mtune=generic' '-march=x86-64'
 as -v --64 -o tst.o tst.s
GNU assembler version 2.24 (x86_64-linux-gnu) using BFD version (GNU Binutils for Ubuntu) 2.24
Comment 1 Richard Biener 2015-09-03 08:24:52 UTC
Confirmed on x86_64 and the 5 branch with -O1 even:

main:
.LFB0:
        .cfi_startproc
        movabsq $foo+4294967408, %rax
        movswl  (%rax), %eax
        ret

.original has:

  short int value = *(bar + (sizetype) (((long unsigned int) i * (long unsigned int) j) * 4294967292 + 16));

works correctly with -fstrict-overflow (or -O2).  Some bug in folding somewhere.

4.8 and 4.9 have

  short int value = *(bar + (sizetype) (((long unsigned int) i * (long unsigned int) j) * 18446744073709551612 + 16));
Comment 2 Marek Polacek 2015-09-03 08:27:37 UTC
Seems this started with r210113 - wide_int merge.
Comment 3 Richard Biener 2015-09-03 08:44:20 UTC
tree built by

0x00000000009f4f2a in associate_trees (loc=0, t1=0x7ffff69ea230, 
    t2=0x7ffff69e9378, code=MULT_EXPR, type=0x7ffff68d09d8)
    at /space/rguenther/src/svn/trunk/gcc/fold-const.c:919
919           return build2_loc (loc, code, type, fold_convert_loc (loc, type, t1),
920                              fold_convert_loc (loc, type, t2));

#1  0x0000000000a1870e in fold_binary_loc (loc=0, code=MULT_EXPR, 
    type=0x7ffff68d09d8, op0=0x7ffff69ea208, op1=0x7ffff69fe0e0)
    at /space/rguenther/src/svn/trunk/gcc/fold-const.c:9565
#2  0x0000000000a2653c in fold_build2_stat_loc (loc=0, code=MULT_EXPR, 
    type=0x7ffff68d09d8, op0=0x7ffff69ea208, op1=0x7ffff69fe0e0)
    at /space/rguenther/src/svn/trunk/gcc/fold-const.c:12422
#3  0x0000000000a09f78 in extract_muldiv_1 (t=0x7ffff69ea1b8, 
    c=0x7ffff68e9318, code=MULT_EXPR, wide_type=0x7ffff68d09d8, 
    strict_overflow_p=0x7fffffffc747)
    at /space/rguenther/src/svn/trunk/gcc/fold-const.c:6153
#4  0x0000000000a088d9 in extract_muldiv (t=0x7ffff69ea1b8, c=0x7ffff68e9318, 
    code=MULT_EXPR, wide_type=0x7ffff68d09d8, strict_overflow_p=0x7fffffffc747)
    at /space/rguenther/src/svn/trunk/gcc/fold-const.c:5877

so extract_muldiv goes berzerk somewhere (ugh, how I hate this beast...)

Hmm, no, it looks wide-int related:

      /* If these are the same operation types, we can associate them
         assuming no overflow.  */
      if (tcode == code)
        {
          bool overflow_p = false;
          bool overflow_mul_p;
          signop sign = TYPE_SIGN (ctype);
          wide_int mul = wi::mul (op1, c, sign, &overflow_mul_p);
          overflow_p = TREE_OVERFLOW (c) | TREE_OVERFLOW (op1);
          if (overflow_mul_p
              && ((sign == UNSIGNED && tcode != MULT_EXPR) || sign == SIGNED))
            overflow_p = true;
          if (!overflow_p)
            return fold_build2 (tcode, ctype, fold_convert (ctype, op0),
                                wide_int_to_tree (ctype, mul));

so we have op1 == -2, c == 2, both 'int' but ctype is unsinged long.  What
wide_int_to_tree does now is _not_ sign-extend mul properly!?

Richard?
Comment 4 Richard Biener 2015-09-03 08:59:37 UTC
That is,

tree
wide_int_to_tree (tree type, const wide_int_ref &pcst)
{
...
  unsigned int prec = TYPE_PRECISION (type);
  signop sgn = TYPE_SIGN (type);
...
  wide_int cst = wide_int::from (pcst, prec, sgn);

is according to its docs "The value is extended from its precision according to the sign of the type", but that isn't what anyone would expect.  IIRC
the wide_int_ref doesn't have a "sign", so we can't do the correct thing.

So IMHO we should require pcst.get_precision () == prec and maybe offer an
overload

tree
wide_int_to_tree (tree type, const wide_int_ref &pcst, signop sgn);

that can specify the sign of pcst (or maybe require that always...).
Comment 5 Richard Biener 2015-09-03 09:02:21 UTC
I wonder how many bugs of this kind we have lurking in the tree... It works
fine when type has the same signedness as pcst of course, but as we don't
know pcst sign we can't check for that case...  The only way to get all
places that need auditing is to assert the precisions match :(
Comment 6 Richard Biener 2015-09-03 09:09:33 UTC
Ok, truncation also should work, thus

  gcc_assert (prec <= pcst.get_precision ());

(trying to see how far bootstrap goes with that)
Comment 7 Richard Biener 2015-09-03 09:11:48 UTC
Ok, so fold_convert_const_int_from_int makes sure to use a widest_int.  So
the fold code in question should do the same?  But then I suppose overflow
detection doesn't work anymore.

Un-assigning for now, I'm going to be on vacation starting from tomorrow.
Comment 8 Richard Biener 2015-09-03 09:27:56 UTC
/space/rguenther/src/svn/trunk2/gcc/genmatch.c:4640:1: internal compiler error:in wide_int_to_tree, at tree.c:1398
 }
 ^
0x12cfa95 wide_int_to_tree(tree_node*, generic_wide_int<wide_int_ref_storage<false> > const&)
        /space/rguenther/src/svn/trunk2/gcc/tree.c:1398
0x133cd51 output_constant
        /space/rguenther/src/svn/trunk2/gcc/varasm.c:4702

so we don't get very far.  Of course

      else if (TREE_CODE (exp) == INTEGER_CST)
        exp = wide_int_to_tree (saved_type, exp);

is just a stupid way to write fold_convert (saved_type, exp).  Or not, if
the special wide_int_to_tree behavior was desired.

Seems we get a bit further with the above "fixed".
Comment 9 Richard Biener 2015-09-03 10:04:08 UTC
Bootstrap finished with that assert and the fix (c,c++,fortran only, expanding that now).  I'm going to commit the varasm.c change if testing is fine.
Comment 10 Richard Biener 2015-09-03 10:38:32 UTC
Assert triggers through

/space/rguenther/src/svn/trunk2/gcc/testsuite/g++.dg/tree-ssa/pr49911.C:8:22: internal compiler error: in wide_int_to_tree, at tree.c:1398^M
0xf49a69 wide_int_to_tree(tree_node*, generic_wide_int<wide_int_ref_storage<false> > const&)^M
        /space/rguenther/src/svn/trunk2/gcc/tree.c:1398^M
0xcab4e4 set_min_and_max_values_for_integral_type(tree_node*, int, signop)^M
        /space/rguenther/src/svn/trunk2/gcc/stor-layout.c:2645^M
0x62c701 finish_enum_value_list(tree_node*)^M
        /space/rguenther/src/svn/trunk2/gcc/cp/decl.c:13267^M

-fstrict-enums related.  set_min_and_max_values_for_integral_type
doesn't handle differing sgn correctly.
Comment 11 Richard Biener 2015-09-15 14:21:19 UTC
Testing

@@ -6166,8 +6173,12 @@ extract_muldiv_1 (tree t, tree c, enum t
              && ((sign == UNSIGNED && tcode != MULT_EXPR) || sign == SIGNED))
            overflow_p = true;
          if (!overflow_p)
-           return fold_build2 (tcode, ctype, fold_convert (ctype, op0),
-                               wide_int_to_tree (ctype, mul));
+           {
+             mul = wide_int::from (mul, TYPE_PRECISION (ctype),
+                                   TYPE_SIGN (TREE_TYPE (op1)));
+             return fold_build2 (tcode, ctype, fold_convert (ctype, op0),
+                                 wide_int_to_tree (ctype, mul));
+           }
        }
 
       /* If these operations "cancel" each other, we have the main
Comment 12 Richard Biener 2015-09-16 07:25:49 UTC
Author: rguenth
Date: Wed Sep 16 07:25:15 2015
New Revision: 227818

URL: https://gcc.gnu.org/viewcvs?rev=227818&root=gcc&view=rev
Log:
2015-09-16  Richard Biener  <rguenther@suse.de>

	PR middle-end/67442
	* fold-const.c (extract_muldiv_1): Properly extend multiplication
	result before builting a tree via wide_int_to_tree.

	* gcc.dg/torture/pr67442.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.dg/torture/pr67442.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/fold-const.c
    trunk/gcc/testsuite/ChangeLog
Comment 13 Richard Biener 2015-09-16 07:26:01 UTC
Fixed on trunk sofar.
Comment 14 Richard Biener 2015-09-28 10:46:10 UTC
Fixed.
Comment 15 Richard Biener 2015-09-28 10:46:27 UTC
Author: rguenth
Date: Mon Sep 28 10:45:55 2015
New Revision: 228199

URL: https://gcc.gnu.org/viewcvs?rev=228199&root=gcc&view=rev
Log:
2015-09-28  Richard Biener  <rguenther@suse.de>

	Backport from mainline
	2015-08-03  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/66917
	* tree-vectorizer.h (struct dataref_aux): Add base_element_aligned
	field.
	(DR_VECT_AUX): New macro.
	(set_dr_misalignment): Adjust.
	(dr_misalignment): Likewise.
	* tree-vect-data-refs.c (vect_compute_data_ref_alignment):
	Compute whether the base is at least element aligned.
	* tree-vect-stmts.c (ensure_base_align): Adjust.
	(vectorizable_store): If the base is not element aligned
	preserve alignment of the original access if misalignment is unknown.
	(vectorizable_load): Likewise.

	2015-09-16  Richard Biener  <rguenther@suse.de>

	PR middle-end/67442
	* fold-const.c (extract_muldiv_1): Properly extend multiplication
	result before builting a tree via wide_int_to_tree.

	* gcc.dg/torture/pr67442.c: New testcase.

Added:
    branches/gcc-5-branch/gcc/testsuite/gcc.dg/torture/pr67442.c
Modified:
    branches/gcc-5-branch/gcc/ChangeLog
    branches/gcc-5-branch/gcc/fold-const.c
    branches/gcc-5-branch/gcc/testsuite/ChangeLog
    branches/gcc-5-branch/gcc/tree-vect-data-refs.c
    branches/gcc-5-branch/gcc/tree-vect-stmts.c
    branches/gcc-5-branch/gcc/tree-vectorizer.h