59320 – ftree-vrp breaks simple loops

Bug 59320 - ftree-vrp breaks simple loops

Summary: ftree-vrp breaks simple loops

Status:	RESOLVED INVALID

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	regression (show other bugs)
Version:	4.8.2

Importance:	P3 normal
Target Milestone:	---
Assignee:	Not yet assigned to anyone

URL:
Keywords:

Depends on:
Blocks:

Reported:	2013-11-27 19:33 UTC by David Kaufmann
Modified:	2013-11-30 19:56 UTC (History)
CC List:	3 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:	2013-11-28 00:00:00

Attachments
File which triggers the bug by using a specific line drawing style. (1.02 KB, image/x-xfig) 2013-11-27 19:33 UTC, David Kaufmann	Details
preprocessor output (60.20 KB, text/plain) 2013-11-28 17:08 UTC, David Kaufmann	Details
gdb "backtrace full" when segfaulting (2.87 KB, text/x-log) 2013-11-28 18:54 UTC, David Kaufmann	Details
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description David Kaufmann 2013-11-27 19:33:46 UTC

Created attachment 31312 [details]
File which triggers the bug by using a specific line drawing style.

With -O1 there is no problem.

With "-O1 -ftree-vrp" and with "-O2" it fails to compile the following loop properly. instead of running the loop 4 times it runs until it segfaults ("il<nd" always returns 1, resulting in "0<4 => 1" as well as "30542<4 => 1" when showing "il<nd" in a printf.

With "-O2 -fno-tree-vrp" it runs properly.

--
int il, nd;
nd = 4;
for (il =0; il<nd; il ++) {
    if (fl[il] != 0.) {
        ....
    }
}
--
(Source: xfig 3.2.5c, w_drawprim.c:1401)
gcc version 4.8.2 20131017 (Red Hat 4.8.2-1) (GCC)

fedora 14 does not has that issue, but i do not know the specific gcc-version which was used then.

Comment 1 Andrew Pinski 2013-11-27 19:37:33 UTC

What is the size of fl ?

Comment 2 Andrew Pinski 2013-11-27 19:38:01 UTC

(In reply to Andrew Pinski from comment #1)
> What is the size of fl ?

What I mean are you going past the bounds of the array fl?

Comment 3 David Kaufmann 2013-11-27 20:03:09 UTC

(In reply to Andrew Pinski from comment #2)
> (In reply to Andrew Pinski from comment #1)
> > What is the size of fl ?
> 
> What I mean are you going past the bounds of the array fl?

yes, i am, but if the comparison would work it would not.

from the source file:
--
static int ndash_dot = 4;
static float dash_dot[4] = { 1., 0.5, 0., 0.5 };
int il, nd;
float *fl;
fl=dash_dot;
nd=ndash_dot;
--
there is no other position where this is modified.

this code produces the output following it:
--
for (il =0; il<nd; il ++) {
    printf("il: %i, nd: %i, il<nd: %i\n", il, nd, il<nd);
    if (fl[il] != 0.) {
        ....
    }
}
--
Output:
--
il: 0, nd: 4, il<nd: 1
il: 1, nd: 4, il<nd: 1
il: 2, nd: 4, il<nd: 1
il: 3, nd: 4, il<nd: 1
il: 4, nd: 4, il<nd: 1
il: 5, nd: 4, il<nd: 1
il: 6, nd: 4, il<nd: 1
il: 7, nd: 4, il<nd: 1
...
and so on.
--

Comment 4 Richard Biener 2013-11-28 11:33:30 UTC

Please attach a testcase or at least preprocessed source of w_drawprim.c.

Comment 5 David Kaufmann 2013-11-28 17:08:37 UTC

Created attachment 31320 [details]
preprocessor output

i hope this is the proper preprocessed source, i generated it with the "-E" parameter.

the whole command was:
"gcc -E -O2 -fno-strength-reduce -fno-strict-aliasing -I/usr/include/X11 -I/usr/include -I/usr/include/X11 -I. -I/usr/include -Dlinux -D__amd64__ -D_POSIX_C_SOURCE=199309L -D_POSIX_SOURCE -D_XOPEN_SOURCE -D_BSD_SOURCE -D_SVID_SOURCE -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -DFUNCPROTO=15 -DNARROWPROTO -DUSE_XPM -DXAW3D -DXAW3D1_5E -DUSE_JPEG -DNEWARROWTYPES w_drawprim.c -o w_drawprim.i"

the buggy function is on line 18136, but it does not seem to have been changed.

Comment 6 David Kaufmann 2013-11-28 18:33:21 UTC

(In reply to Richard Biener from comment #4)
> Please attach a testcase or at least preprocessed source of w_drawprim.c.
a testcase is the first attachment to this bug with either xfig version 3.2.5b or 3.2.5c.
debian and fedora have currently version 3.2.5b in their repository.

i could reproduce the error in fedora-f18.x86_64 and fedora-f19.x86_64.

currently i do not have another system which has been updated recently and is configured for compiling, maybe i'll find one in the next few days.

any input is welcome.

Comment 7 Joost VandeVondele 2013-11-28 18:33:36 UTC

(In reply to David Kaufmann from comment #5)
> Created attachment 31320 [details]
> preprocessor output
> the buggy function is on line 18136, but it does not seem to have been
> changed.

static unsigned char dash_list[16][2]
static int ndash_dot = 4;
nd=ndash_dot;
for (il =0; il<nd; il ++){
dash_list[op][il] = ...
}
so clearly il<nd == 1

Comment 8 David Kaufmann 2013-11-28 18:54:05 UTC

Created attachment 31323 [details]
gdb "backtrace full" when segfaulting

backtrace from gdb when opening xfig with provided testcase.

Comment 9 David Kaufmann 2013-11-28 19:01:42 UTC

(In reply to Joost VandeVondele from comment #7)
> (In reply to David Kaufmann from comment #5)
> > Created attachment 31320 [details]
> > preprocessor output
> > the buggy function is on line 18136, but it does not seem to have been
> > changed.
> 
> static unsigned char dash_list[16][2]
> static int ndash_dot = 4;
> nd=ndash_dot;
> for (il =0; il<nd; il ++){
> dash_list[op][il] = ...
> }
> so clearly il<nd == 1

i am not sure, that probably is another bug. (as dash_list[op]-size is only two, but dash_list[op][0], dash_list[op][1] and dash_list[op][3] get set.)

however, gdb says it crashes on line 1401:
1401:          if (fl[il] != 0.) {

from the backtrace it can be seen that il somehow is 441896, if i do a printf on il it shows il counting up from 0 up to approx 448000, then it segfaults.

Also, as this is happening only with -O2 (and with -O1 -ftree-vrp) but not with -O1, so i suspect some weird behavior in -ftree-vrp.

Comment 10 Joost VandeVondele 2013-11-28 19:10:23 UTC

(In reply to David Kaufmann from comment #9)
> (In reply to Joost VandeVondele from comment #7)
> > (In reply to David Kaufmann from comment #5)
> > > Created attachment 31320 [details]
> > > preprocessor output
> > > the buggy function is on line 18136, but it does not seem to have been
> > > changed.
> > 
> > static unsigned char dash_list[16][2]
> > static int ndash_dot = 4;
> > nd=ndash_dot;
> > for (il =0; il<nd; il ++){
> > dash_list[op][il] = ...
> > }
> > so clearly il<nd == 1
> 
> i am not sure, that probably is another bug. (as dash_list[op]-size is only
> two, but dash_list[op][0], dash_list[op][1] and dash_list[op][3] get set.)

no weird behavior of vrp. given the size of dash_list (2) it correctly establishes the range on the value, i.e. that il must be either 0 or 1 (any other value would trigger out-of-bounds, which can't happen in a valid C program). All the rest is a consequence of this. So, the bug is in the C program, not the compiler.

Comment 11 David Kaufmann 2013-11-29 01:40:40 UTC

(In reply to Joost VandeVondele from comment #10)
> no weird behavior of vrp. given the size of dash_list (2) it correctly
> establishes the range on the value, i.e. that il must be either 0 or 1 (any
> other value would trigger out-of-bounds, which can't happen in a valid C
> program). All the rest is a consequence of this. So, the bug is in the C
> program, not the compiler.
Okay, I tried to apply a temporary fix in the Code and the bug went away.

I just don't understand how it can happen, that if there is a imminent out-of-range happening somewhere in the loop and therefor il only may have the value 0 or 1 it compiles it to code, which then stops checking the condition and therefor running the loop until it gets out-of-bounds.

What surprises me is that a printf on "il < nd = (il<nd)" returns always 1, no matter the shown values.
(As stated before printf returns "0 < 4 = 1", "1 < 4 = 1", "2 < 4 = 1", "3 < 4 = 1", "4 < 4 = 1", "5 < 4 = 1", and so on until a value of il of approx. 440k.
Then it shows a sigsegv on reading something around fl[448054]. - remember, fl has only 4 elements...)

Anyway, thankfully now I know where to fix the bug. As -Wall does not show any warnings and the SIGSEGV is shown on the wrong line combined with the misleading printf-output I would never have found that in ages.

Thanks,
David

Comment 12 Jakub Jelinek 2013-11-29 08:10:15 UTC

That is simple, accessing out of bounds element is undefined behavior and the compiler may optimize based on the fact that undefined behavior does not happen.
As the upper bound is not constant, no warning is emitted on it, the compiler just assumes that in a valid program nd must be < 2 before entering the loop.

Comment 13 Manuel López-Ibáñez 2013-11-30 03:31:31 UTC

(In reply to Jakub Jelinek from comment #12)
> That is simple, accessing out of bounds element is undefined behavior and
> the compiler may optimize based on the fact that undefined behavior does not
> happen.
> As the upper bound is not constant, no warning is emitted on it, the
> compiler just assumes that in a valid program nd must be < 2 before entering
> the loop.

Will -fsanitize=undefined catch these? If so, perhaps the message shown before reporting a bug should mention trying this first.

Comment 14 Joost VandeVondele 2013-11-30 07:51:10 UTC

(In reply to Manuel López-Ibáñez from comment #13)
> Will -fsanitize=undefined catch these? If so, perhaps the message shown
> before reporting a bug should mention trying this first.

I guess -fsanitize=address would catch the out-of-bounds access, but I don't think the message should become too complex (it is already a bit narrow and C-centric, for example '-fno-strict-aliasing' means nothing for Fortran programs), and furthermore that option would only be available for a recent gcc.

Comment 15 Manuel López-Ibáñez 2013-11-30 19:56:57 UTC

(In reply to Joost VandeVondele from comment #14)
> (In reply to Manuel López-Ibáñez from comment #13)
> > Will -fsanitize=undefined catch these? If so, perhaps the message shown
> > before reporting a bug should mention trying this first.
> 
> I guess -fsanitize=address would catch the out-of-bounds access, but I don't
> think the message should become too complex (it is already a bit narrow and
> C-centric, for example '-fno-strict-aliasing' means nothing for Fortran
> programs), and furthermore that option would only be available for a recent
> gcc.

It would be fine to have per-language recommendations, that is,

For C/C++ do, try this.
For Fortran, try that.
For Ada, try ...

It is a waste of effort by everybody when bugs are reported for broken code. It is not like gcc has too few PRs. There are 3268 unconfirmed ones.