Bug 26587 - [4.1/4.2 Regression] strict aliasing incorrectly pre-loads an array element with loops
Summary: [4.1/4.2 Regression] strict aliasing incorrectly pre-loads an array element w...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 4.1.0
: P3 critical
Target Milestone: 4.1.1
Assignee: Richard Biener
URL:
Keywords: alias, wrong-code
Depends on:
Blocks:
 
Reported: 2006-03-07 01:47 UTC by Alexander Peslyak
Modified: 2006-03-07 16:49 UTC (History)
3 users (show)

See Also:
Host: i686-pc-linux-gnu
Target: i686-pc-linux-gnu
Build: i686-pc-linux-gnu
Known to work: 4.0.3
Known to fail: 4.1.0 4.2.0
Last reconfirmed: 2006-03-07 14:26:19


Attachments
testcase (572 bytes, text/plain)
2006-03-07 01:50 UTC, Alexander Peslyak
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Alexander Peslyak 2006-03-07 01:47:19 UTC
I _think_ I understand C strict aliasing - it's based on the type of expressions.  In the testcase below, the type of expressions is the same, which is why I think it's a compiler bug.

The bug is reproducible with gcc 4.1.0 on multiple platforms.  I've tried i386 (i686) and alpha myself and am able to reproduce the problem on both.  The original reporter of the problem with my program (the real one, not the testcase) was able to reproduce the problem on other architectures.

The problem does not occur with gcc 2.95.3 and 3.4.5.

Basically, in the program below, gcc 4.1.0 would pre-load BF_current.P[0] and BF_current.P[17] into registers or stack locations - however the loop modifies P[0] via ptr.

#include <stdio.h>
#include <string.h>

#define BF_N 16

typedef unsigned int BF_word;
typedef BF_word BF_key[BF_N + 2];

static struct {
        BF_word S[4][0x100];
        BF_key P;
} BF_current;

#define BF_ROUND(L, R, N) \
        tmp1 = L & 0xFF; \
        tmp2 = L >> 8; \
        tmp2 &= 0xFF; \
        tmp3 = L >> 16; \
        tmp3 &= 0xFF; \
        tmp4 = L >> 24; \
        tmp1 = BF_current.S[3][tmp1]; \
        tmp2 = BF_current.S[2][tmp2]; \
        tmp3 = BF_current.S[1][tmp3]; \
        tmp3 += BF_current.S[0][tmp4]; \
        tmp3 ^= tmp2; \
        R ^= BF_current.P[N + 1]; \
        tmp3 += tmp1; \
        R ^= tmp3;

#define BF_ENCRYPT \
        L ^= BF_current.P[0]; \
        { \
                int i; \
                for (i = 0; i < BF_N; i += 2) { \
                        BF_ROUND(L, R, i); \
                        BF_ROUND(R, L, i+1); \
                } \
        } \
        tmp4 = R; \
        R = L; \
        L = tmp4 ^ BF_current.P[BF_N + 1];

int main(void)
{
        BF_word L, R;
        BF_word tmp1, tmp2, tmp3, tmp4, *ptr;
        BF_word i, j;

        for (i = 0; i < 4; i++)
                for (j = 0; j < 0x100; j++)
                        BF_current.S[i][j] = (i + 0x12345678) * j;
        for (i = 0; i < BF_N + 2; i++)
                BF_current.P[i] = i * 0x98765432;

        L = R = 0;
        ptr = BF_current.P;
        do {
#ifndef WORKAROUND
                ptr += 2;
                BF_ENCRYPT;
                *(ptr - 2) = L;
                *(ptr - 1) = R;
#else
                BF_ENCRYPT;
                *ptr = L;
                *(ptr + 1) = R;
                ptr += 2;
#endif
        } while (ptr < &BF_current.P[BF_N + 2]);

        printf("%08x %08x\n", L, R);

        return 0;
}

host!user:~$ gcc gcc-4.1.0-aliasing-bug.c -o gcc-4.1.0-aliasing-bug -Wall -O2 -fno-strict-aliasing
host!user:~$ ./gcc-4.1.0-aliasing-bug 
8261e2e6 f1f1bc33
host!user:~$ gcc gcc-4.1.0-aliasing-bug.c -o gcc-4.1.0-aliasing-bug -Wall -O2 
host!user:~$ ./gcc-4.1.0-aliasing-bug 
46be060b df072f90
host!user:~$ gcc gcc-4.1.0-aliasing-bug.c -o gcc-4.1.0-aliasing-bug -Wall -O2 -DWORKAROUND
host!user:~$ ./gcc-4.1.0-aliasing-bug 
8261e2e6 f1f1bc33
host!user:~$ uname -mrs
Linux 2.4.32-ow1 alpha
host!user:~$ gcc -v
Using built-in specs.
Target: alphaev56-unknown-linux-gnu
Configured with: ./configure --prefix=/home/user/gcc-4.1.0
Thread model: posix
gcc version 4.1.0

(i386 gives the same results)
Comment 1 Alexander Peslyak 2006-03-07 01:50:23 UTC
Created attachment 10979 [details]
testcase
Comment 2 Andrew Pinski 2006-03-07 02:03:52 UTC
Hmm -O2 -fno-ivopts allows for it to work.
Comment 3 Andrew Pinski 2006-03-07 02:08:20 UTC
This is interesting because the mainline works with or without -fno-strict-aliasing.

Confirmed a regression, hopefully someone will reduce the testcase further.
Comment 4 Richard Biener 2006-03-07 10:02:02 UTC
Reduced testcase:

extern int printf(const char *format, ...);
typedef unsigned int BF_word;
typedef BF_word BF_key[16 + 2];
static struct {
    BF_key P;
} BF_current;
int main(void)
{
    BF_word L, R;
    BF_word tmp1, tmp2, tmp3, tmp4, *ptr;
    L = R = 0;
    ptr = BF_current.P;
    do {
        int i;
        ptr += 2;
        L ^= BF_current.P[0];
        for (i = 0; i < 16; i += 2) {
            tmp1 = L & 0xFF;
            tmp3 += tmp1;
            L ^= tmp3;;
        }
        *(ptr - 2) = L;
    } while (ptr < &BF_current.P[16 + 2]);
    printf("%08x %08x\n", L, R);
    return 0;
}

-O2 -fno-strict-aliasing:
0001e6d4 00000000

-O2:
0001c100 00000000

-fno-ivopts does not fix it.
Comment 5 Richard Biener 2006-03-07 10:12:47 UTC
Umm, that has uninitialized vars.  The following is ok wrt -Wall -Werror:

extern int printf(const char *, ...);
typedef unsigned int BF_word;
typedef BF_word BF_key[16 + 2];
static struct {
    BF_key P;
} BF_current;
int main(void)
{
    BF_word L;
    BF_word tmp4, *ptr;
    BF_word i;
    for (i = 0; i < 16 + 2; i++)
	BF_current.P[i] = i * 0x98765432;
    L = 0;
    ptr = BF_current.P;
    do {
	ptr += 2;
	L ^= BF_current.P[0];
	tmp4 = L >> 24;
	L = tmp4 ^ BF_current.P[16 + 1];
	*(ptr - 2) = L;
    } while (ptr < &BF_current.P[16 + 2]);
    printf("%08x\n", L);
    return 0;
}

and prints 1fdb9752 vs. 1fdb974d.
Comment 6 Richard Biener 2006-03-07 10:39:01 UTC
With -fno-ivopts -fno-tree-vrp tree-optimizers produce the same for both 4.1.0 and 4.2.0 until tree loop-invariant-motion, which does the wrong transformation
because of bogus alias information:

<L11>:;
  
  # TMT.6_23 = PHI <TMT.6_24(3), TMT.6_25(5)>;
  # ptr_3 = PHI <&BF_current.P[0](3), ptr_9(5)>;
  # L_2 = PHI <0(3), L_14(5)>;
<L7>:;
  ptr_9 = ptr_3 + 8B;
  #   VUSE <BF_current_21>;
  D.1297_10 = BF_current.P[0];
  L_11 = L_2 ^ D.1297_10;
  tmp4_12 = L_11 >> 24; 
  #   VUSE <BF_current_21>;
  D.1298_13 = BF_current.P[17];
  L_14 = tmp4_12 ^ D.1298_13;
  D.1299_15 = ptr_9 - 8B;
  #   TMT.6_25 = V_MAY_DEF <TMT.6_23>;
  *D.1299_15 = L_14;
  if (ptr_9 < &BF_current.P[18]) goto <L12>; else goto <L4>;

the store to *D.1299_15 kills BF_current_21, which we don't notice.
Comment 7 Richard Biener 2006-03-07 10:41:33 UTC
The problem is latent on the mainline with -fno-tree-salias.
Comment 8 Richard Biener 2006-03-07 10:43:24 UTC
With -O2 -fno-ivopts -fno-tree-vrp -fno-tree-salias -fno-strict-aliasing we correctly get

<L11>:;

  # BF_current_23 = PHI <BF_current_21(5), BF_current_24(7)>;
  # ptr_3 = PHI <&BF_current.P[0](5), ptr_9(7)>;
  # L_2 = PHI <0(5), L_14(7)>;
<L7>:;
  ptr_9 = ptr_3 + 8B;
  #   VUSE <BF_current_23>;
  D.1541_10 = BF_current.P[0];
  L_11 = L_2 ^ D.1541_10;
  tmp4_12 = L_11 >> 24;
  #   VUSE <BF_current_23>;
  D.1542_13 = BF_current.P[17];
  L_14 = tmp4_12 ^ D.1542_13;
  D.1543_15 = ptr_9 - 8B;
  #   BF_current_24 = V_MAY_DEF <BF_current_23>;
  *D.1543_15 = L_14;
  if (ptr_9 < &BF_current.P[18]) goto <L12>; else goto <L4>;
Comment 9 Richard Biener 2006-03-07 11:02:13 UTC
Pointed-to sets for pointers in main

ptr_8, points-to vars: { BF_current }
ptr_3, points-to vars: { BF_current }
ptr_9, points-to vars: { BF_current }
D.1543_15, is dereferenced, points-to anything
D.1544_16, points-to vars: { BF_current }

is wrong because of:

handle_ptr_arith (struct constraint_expr lhs, tree expr)
{
  tree op0, op1;
  struct constraint_expr base, offset;

  if (TREE_CODE (expr) != PLUS_EXPR)
    return false;

and

(gdb) call debug_tree(expr)
 <minus_expr 0x40194360
    type <pointer_type 0x402260b8
        type <integer_type 0x4021df74 BF_word sizes-gimplified public unsigned SI
            size <integer_cst 0x4018d3f0 constant invariant 32>
            unit size <integer_cst 0x4018d180 constant invariant 4>
            align 32 symtab 0 alias set -1 precision 32 min <integer_cst 0x4018d468 0> max <integer_cst 0x4018d450 4294967295>
            pointer_to_this <pointer_type 0x402260b8>>
        sizes-gimplified unsigned SI size <integer_cst 0x4018d3f0 32> unit size <integer_cst 0x4018d180 4>
        align 32 symtab 0 alias set -1>
   
    arg 0 <ssa_name 0x40228e38 type <pointer_type 0x402260b8>
        visited var <var_decl 0x40199210 ptr> def_stmt <modify_expr 0x40194264>
        version 9
        ptr-info 0x4021f740>
    arg 1 <integer_cst 0x402271c8 type <pointer_type 0x402260b8> constant invariant 8>>

I have a fix.
Comment 10 patchapp@dberlin.org 2006-03-07 12:20:18 UTC
Subject: Bug number PR tree-optimization/26587

A patch for this bug has been added to the patch tracker.
The mailing list url for the patch is http://gcc.gnu.org/ml/gcc-patches/2006-03/msg00357.html
Comment 11 Richard Biener 2006-03-07 14:26:19 UTC
Patches posted.
Comment 12 Richard Biener 2006-03-07 16:23:41 UTC
Subject: Bug 26587

Author: rguenth
Date: Tue Mar  7 16:23:38 2006
New Revision: 111808

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=111808
Log:
2006-03-07  Richard Guenther  <rguenther@suse.de>

	PR tree-optimization/26587
	* tree-ssa-structalias.c (handle_ptr_arith): Handle MINUS_EXPR.

	* gcc.dg/torture/pr26587.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.dg/torture/pr26587.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-structalias.c

Comment 13 Richard Biener 2006-03-07 16:26:18 UTC
Subject: Bug 26587

Author: rguenth
Date: Tue Mar  7 16:26:14 2006
New Revision: 111809

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=111809
Log:
2006-03-07  Richard Guenther  <rguenther@suse.de>

	PR tree-optimization/26587
	* tree-ssa-structalias.c (handle_ptr_arith): Handle MINUS_EXPR.

	* gcc.dg/torture/pr26587.c: New testcase.

Added:
    branches/gcc-4_1-branch/gcc/testsuite/gcc.dg/torture/pr26587.c
Modified:
    branches/gcc-4_1-branch/gcc/ChangeLog
    branches/gcc-4_1-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_1-branch/gcc/tree-ssa-structalias.c

Comment 14 Richard Biener 2006-03-07 16:49:27 UTC
Fixed.