Bug 43247 - [4.4 Regression] Incorrect optimization while declaring array[1]
Summary: [4.4 Regression] Incorrect optimization while declaring array[1]
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 4.4.3
Importance: P2 normal
Target Milestone: 4.5.0
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Duplicates: 43537
Depends on:
Blocks:
 
Reported: 2010-03-03 14:36 UTC by Olivier Goffart
Modified: 2012-03-13 13:04 UTC

See Also:
Host:
Target:
Build:
Known to work: 4.2.4, 4.5.0
Known to fail: 4.3.0, 4.3.4, 4.4.3
Last reconfirmed: 2010-03-03 16:08:00


Description Olivier Goffart 2010-03-03 14:36:54 UTC
This code should print "WORKS!!!" six times (for i = 3 through 8), but with -O2 g++ optimizes the condition away.
It works with -O1.

Tested with gcc 4.4.3 on Linux x86_64 (Arch Linux).


#include <stdio.h>
#include <string.h>
#include <stdlib.h>

struct QVectorTypedData
{
    int array[1];
};

int main(int , char **)
{
    QVectorTypedData *d;
    d = static_cast<QVectorTypedData *>(::malloc(sizeof(QVectorTypedData)+ (45-1) * sizeof(int)));
    memset(d->array, 0, 45 * sizeof(int));
    int *array = d->array;
    int count = 0;
    for (int i = 0; i < 10; i++) {
        fprintf(stderr, "%d %d %d %d\n", i, (i>=3), (i<=8), (i>=3) && (i <= 8));
        if (i >= 3 && i <= 8) {
            fprintf(stderr, "WORKS!!!\n");
        }
        array[i] = 4;
    }
    return 0;
}
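
A self-checking variant of the testcase can make the failure easier to detect than reading stderr. The sketch below is not part of the original report: it keeps the same allocation and loop, but counts how often the condition holds, so a caller can compare the result against the expected 6 (i = 3 through 8). The function name count_hits and the constant kElements are hypothetical, and whether this reduced form still triggers the miscompile on the affected GCC versions has not been verified.

```cpp
#include <cstdlib>
#include <cstring>

struct QVectorTypedData
{
    int array[1];   // same one-element trailing array as in the report
};

// Hypothetical helper (not in the report): runs the loop from the testcase
// and returns how many iterations satisfied (i >= 3 && i <= 8).
// Returns -1 if the allocation fails.
static int count_hits()
{
    const int kElements = 45;   // element count used by the report
    QVectorTypedData *d = static_cast<QVectorTypedData *>(
        std::malloc(sizeof(QVectorTypedData) + (kElements - 1) * sizeof(int)));
    if (!d)
        return -1;
    std::memset(d->array, 0, kElements * sizeof(int));

    int *array = d->array;
    int count = 0;
    for (int i = 0; i < 10; i++) {
        if (i >= 3 && i <= 8)
            ++count;
        array[i] = 4;   // store past array[0], but inside the malloc'ed block
    }
    std::free(d);
    return count;
}
```

Called from main, a return value other than 6 would suggest that the condition was folded incorrectly.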
Comment 1 Thiago Macieira 2010-03-03 14:41:44 UTC
Problem also happens on:

gcc 4.4.3 on linux 32-bit
gcc 4.4.1 on linux ARM (armel gnueabi)

Also reproducible with -O1 -ftree-vrp.
Comment 2 Thiago Macieira 2010-03-03 14:44:02 UTC
Also:
-O1 -ftree-vrp -fno-cprop-registers -fno-defer-pop -fno-guess-branch-probability -fno-if-conversion -fno-if-conversion2 -fno-ipa-pure-const -fno-ipa-reference -fno-merge-constants -fno-omit-frame-pointer -fno-split-wide-types -fno-tree-ch -fno-tree-copy-prop -fno-tree-copyrename -fno-tree-dce -fno-tree-dominator-opts -fno-tree-dse -fno-tree-fre -fno-tree-sink -fno-tree-sra -fno-tree-ter

However, if I add -fno-tree-ccp, the program starts to work as expected again.
Comment 3 Richard Biener 2010-03-03 16:08:00 UTC
Confirmed.
Comment 4 Andrew Pinski 2010-03-26 18:46:21 UTC
*** Bug 43537 has been marked as a duplicate of this bug. ***
Comment 5 H.J. Lu 2010-03-26 18:54:08 UTC
This is fixed by revision 151360:

http://gcc.gnu.org/ml/gcc-cvs/2009-09/msg00106.html

and was introduced by revision 118729:

http://gcc.gnu.org/ml/gcc-cvs/2006-11/msg00380.html
Comment 6 Thiago Macieira 2010-03-26 21:46:38 UTC
Is this fix going to be backported to the 4.4.x line?
Comment 7 Richard Biener 2010-03-27 10:45:51 UTC
I very much doubt that the cited revision fixed anything here.
Comment 8 H.J. Lu 2010-03-27 14:24:57 UTC
(In reply to comment #7)
> I very much doubt that the cited revision fixed anything here.
> 

If it is true, that only means that the bug is latent on trunk.
Comment 9 Richard Biener 2010-05-22 18:13:53 UTC
GCC 4.3.5 is being released, adjusting target milestone.
Comment 10 Thiago Macieira 2010-12-22 10:35:23 UTC
This is still not fixed. I can reproduce now with a different testcase, in 4.5.1. However, this time, the same code works fine in 4.4. The reason is again accessing an array out-of-bounds for elements that we know to be there. Pay attention to the way operator== is implemented in the following code.

If I compile it with -O1, it prints "true" as it should. If I compile it with -O2, it prints "false". If I compile it with -O1 -finline-small-functions -finline -findirect-inlining -fstrict-overflow and compare the disassembly with -O2 and a suitable list of -fno-*, the code is exactly identical, except for some instructions that should perform the copy of half of m1's data into m3. So in the end the comparison fails due to comparing to garbage.

=== code ===
#include <stdio.h>

template <int N, int M, typename T>
class QGenericMatrix
{
public:
    QGenericMatrix();
    QGenericMatrix(const QGenericMatrix<N, M, T>& other);
    explicit QGenericMatrix(const T *values);

    bool operator==(const QGenericMatrix<N, M, T>& other) const;
private:
    T m[N][M];    // Column-major order to match OpenGL.

    QGenericMatrix(int) {}       // Construct without initializing identity matrix
};

template <int N, int M, typename T>
QGenericMatrix<N, M, T>::QGenericMatrix(const QGenericMatrix<N, M, T>& other)
{
    for (int col = 0; col < N; ++col)
        for (int row = 0; row < M; ++row)
            m[col][row] = other.m[col][row];
}

template <int N, int M, typename T>
QGenericMatrix<N, M, T>::QGenericMatrix(const T *values)
{
    for (int col = 0; col < N; ++col)
        for (int row = 0; row < M; ++row)
            m[col][row] = values[row * N + col];
}

template <int N, int M, typename T>
bool QGenericMatrix<N, M, T>::operator==(const QGenericMatrix<N, M, T>& other) const
{
    for (int index = 0; index < N * M; ++index) {
        if (m[0][index] != other.m[0][index])
            return false;
    }
    return true;
}

typedef double qreal;
typedef QGenericMatrix<2, 2, qreal> QMatrix2x2;

int main(int , char**)
{
    qreal m1Data[] = {0.0, 0.0, 0.0, 0.0};
    QMatrix2x2 m1(m1Data);

    QMatrix2x2 m3 = m1;
    puts((m1 == m3) ? "true" : "false");
}
=== code ===

common args: -fno-exceptions -fno-rtti -fverbose-asm -march=core2 -mfpmath=sse (though x87 math also shows the same problem)

prints "true" with: -O1 -finline-small-functions -finline -findirect-inlining -fstrict-overflow

prints "false" with: -O2 -fno-align-functions -fno-align-jumps -fno-align-labels -fno-caller-saves -fno-tree-switch-conversion -fno-tree-vrp -fno-crossjumping -fno-cse-follow-jumps -fno-expensive-optimizations -fno-gcse -fno-ipa-cp -fno-ipa-sra -fno-optimize-register-move -fno-optimize-sibling-calls -fno-peephole2 -fno-regmove -fno-reorder-blocks -fno-reorder-functions -fno-rerun-cse-after-loop -fno-schedule-insns2 -fno-strict-aliasing -fno-strict-aliasing -fno-thread-jumps -fno-tree-builtin-call-dce -fno-tree-pre
Comment 11 Andrew Pinski 2010-12-22 17:48:55 UTC
>The reason is again accessing an array out-of-bounds for elements that we know to be there.

No that is undefined and different from the original testcase.
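
For comparison with the operator== in comment 10: m[0] has type T[M], so m[0][index] with index >= M indexes past the end of that row even though the storage of m is contiguous. A sketch of a comparison that stays within the bounds of each row follows. It is not part of the original comment or of Qt; the class name Matrix and the helper copies_compare_equal are hypothetical, and the class is reduced to the members the comparison needs.

```cpp
// Reduced stand-in for the QGenericMatrix in comment 10 (hypothetical name).
template <int N, int M, typename T>
class Matrix
{
public:
    // Same fill order as the testcase's constructor from a flat array.
    explicit Matrix(const T *values)
    {
        for (int col = 0; col < N; ++col)
            for (int row = 0; row < M; ++row)
                m[col][row] = values[row * N + col];
    }

    // Compares element by element with both indices inside their
    // declared bounds, instead of walking m[0] past its M elements.
    bool operator==(const Matrix &other) const
    {
        for (int col = 0; col < N; ++col)
            for (int row = 0; row < M; ++row)
                if (m[col][row] != other.m[col][row])
                    return false;
        return true;
    }

private:
    T m[N][M];
};

// Hypothetical helper: copies a matrix and reports whether the copy
// compares equal to the original.
static bool copies_compare_equal()
{
    double data[] = {0.0, 1.0, 2.0, 3.0};
    Matrix<2, 2, double> m1(data);
    Matrix<2, 2, double> m3 = m1;   // implicitly generated copy constructor
    return m1 == m3;
}
```

The nested loops cost nothing extra in principle, since N and M are compile-time constants, and they leave the optimizer no out-of-bounds index to reason about.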
Comment 12 Thiago Macieira 2010-12-22 19:55:38 UTC
(In reply to comment #11)
> >The reason is again accessing an array out-of-bounds for elements that we know to be there.
> 
> No that is undefined and different from the original testcase.

Ok. Shall I open a new report with the new information?
Comment 13 Richard Biener 2011-06-27 12:14:08 UTC
4.3 branch is being closed, moving to 4.4.7 target.
Comment 14 Jakub Jelinek 2012-03-13 13:04:33 UTC
Fixed in 4.5+, 4.4 is no longer supported.