Seems to be an optimization bug for some templatized functions that I have (specifically vbox_blur in the precompiled output pen.ii, which is bzipped as requested). It's pretty severe. In this case, it crashes in release mode (I think because it is forgetting to decrement the iterator outpen as specified after the inner loop in vbox_blur (should be right above all of the test functions at the bottom of the file). Another bizarre thing, is if function calls are inserted into the loop (say print functions for debugging, though it has similar effects for some dummy functions I've tried), it can change whether or not it crashes and such. In the end it looks like an optimization bug, where it optimizes its way around some pretty important operations (pointer operations mostly it seems...). The program outputs patterns representing the result of the operations (the box blur is the one that's freaking out). The pattern printed out below the CORRECT OUTPUT line is what the AFTER BOX BLUR output should look like. You can also see the correct output for all of the substeps by compiling it in full debug mode with no optimizations. If you need some examples for the function call thing, just ask, I have plenty (I can send the normal source file with all the commented out print functions for debugging if need be). Here's GCC's compiler output: source='pen.cpp' object='pen.o' libtool=no \ depfile='.deps/pen.Po' tmpdepfile='.deps/pen.TPo' \ depmode=gcc3 /bin/sh ../config/depcomp \ g++ -DHAVE_CONFIG_H -I.. -I../ETL -I.. -I.. -v -save-temps -g -O2 -DNDEBUG -Wall -c -o pen.o `test -f 'pen.cpp' || echo './'`pen.cpp Reading specs from /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.2/specs Configured with: /var/tmp/portage/gcc-3.3.2-r5/work/gcc-3.3.2/configure --prefix=/usr --bindir=/usr/i686-pc-linux-gnu/gcc-bin/3.3 --includedir=/usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.2/include --datadir=/usr/share/gcc-data/i686-pc-linux-gnu/3.3 --mandir=/usr/share/gcc-data/i686-pc-linux-gnu/3.3/man --infodir=/usr/share/gcc-data/i686-pc-linux-gnu/3.3/info --enable-shared --host=i686-pc-linux-gnu --target=i686-pc-linux-gnu --with-system-zlib --enable-languages=c,c++,f77,objc --enable-threads=posix --enable-long-long --disable-checking --enable-cstdio=stdio --enable-clocale=generic --enable-__cxa_atexit --enable-version-specific-runtime-libs --with-gxx-include-dir=/usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.2/include/g++-v3 --with-local-prefix=/usr/local --enable-shared --enable-nls --without-included-gettext --disable-multilib Thread model: posix gcc version 3.3.2 20031218 (Gentoo Linux 3.3.2-r5, propolice-3.3-7) /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.2/cc1plus -E -D__GNUG__=3 -quiet -v -I.. -I../ETL -I.. -I.. -MD pen.d -MF .deps/pen.TPo -MP -MT pen.o -MQ pen.o -D__GNUC__=3 -D__GNUC_MINOR__=3 -D__GNUC_PATCHLEVEL__=2 -D_GNU_SOURCE -DHAVE_CONFIG_H -DNDEBUG pen.cpp -Wall -O2 pen.ii ignoring nonexistent directory "/usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.2/../../../../i686-pc-linux-gnu/include" ignoring duplicate directory ".." ignoring duplicate directory ".." #include "..." search starts here: #include <...> search starts here: .. ../ETL /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.2/include/g++-v3 /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.2/include/g++-v3/i686-pc-linux-gnu /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.2/include/g++-v3/backward /usr/local/include /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.2/include /usr/include End of search list. /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.2/cc1plus -fpreprocessed pen.ii -quiet -dumpbase pen.cpp -auxbase-strip pen.o -g -O2 -Wall -version -o pen.s GNU C++ version 3.3.2 20031218 (Gentoo Linux 3.3.2-r5, propolice-3.3-7) (i686-pc-linux-gnu) compiled by GNU C version 3.3.2 20031218 (Gentoo Linux 3.3.2-r5, propolice-3.3-7). GGC heuristics: --param ggc-min-expand=51 --param ggc-min-heapsize=40960 /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.2/../../../../i686-pc-linux-gnu/bin/as -V -Qy -o pen.o pen.s GNU assembler version 2.14.90.0.8 (i686-pc-linux-gnu) using BFD version 2.14.90.0.8 20040114 g++ -v -save-temps -g -O2 -DNDEBUG -Wall -o pen pen.o -lpthread Reading specs from /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.2/specs Configured with: /var/tmp/portage/gcc-3.3.2-r5/work/gcc-3.3.2/configure --prefix=/usr --bindir=/usr/i686-pc-linux-gnu/gcc-bin/3.3 --includedir=/usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.2/include --datadir=/usr/share/gcc-data/i686-pc-linux-gnu/3.3 --mandir=/usr/share/gcc-data/i686-pc-linux-gnu/3.3/man --infodir=/usr/share/gcc-data/i686-pc-linux-gnu/3.3/info --enable-shared --host=i686-pc-linux-gnu --target=i686-pc-linux-gnu --with-system-zlib --enable-languages=c,c++,f77,objc --enable-threads=posix --enable-long-long --disable-checking --enable-cstdio=stdio --enable-clocale=generic --enable-__cxa_atexit --enable-version-specific-runtime-libs --with-gxx-include-dir=/usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.2/include/g++-v3 --with-local-prefix=/usr/local --enable-shared --enable-nls --without-included-gettext --disable-multilib Thread model: posix gcc version 3.3.2 20031218 (Gentoo Linux 3.3.2-r5, propolice-3.3-7) /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.2/collect2 --eh-frame-hdr -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 -o pen /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.2/../../../crt1.o /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.2/../../../crti.o /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.2/crtbegin.o -L/usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.2 -L/usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.2/../../../../i686-pc-linux-gnu/lib -L/usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.2/../../.. pen.o -lpthread -lstdc++ -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.2/crtend.o /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.2/../../../crtn.o
Created attachment 6477 [details] The preprocessed file for compilation bzipped
My bad for putting the gcc output in the post. I also forgot to post my system specs (Athlon Tbird 1.2 Ghz, 512 M RAM, A7S333 Mobo running Gentoo linux, kernel 2.6.0). First bug ever, so please excuse the oops :).
Ok, I've narrowed it down farther. In both hbox_blur and vbox_blur if you call a function inside the inner loop and pass it x for hbox and y for vbox (printf for example), it the faulty optimization is circumvented. ex. line 32951 of pen.ii insert: printf("%d",y); That loop will then work fine. Hope this helps.
Can you try 3.4.0 and fix your code at the same time because there is some invalid C++ in there which will not compile for 3.4.0?
Sure I'll try, it isn't tested at all in gentoo, so it could have some issues, but it never hurts to try :).
Created attachment 6478 [details] smaller testcase (15 KB)
You are violating aliasing rules in the deferencing of the value returned here: operator const generic_pen_row_iterator<const value_type>&()const { return *reinterpret_cast<generic_pen_row_iterator<const value_type>*>(this); } So this is invalid, please fix your code.
Sorry about that. GCC 3.4 is taking forever to rebuild on my machine. You should be able to just get rid of that function. I'm just upkeeping this code, but I'm pretty sure it's not used.
You have other code which violates C++ aliasing rules, either use -fno-strict-aliasing as a work around or fix your code to avoid the aliasing rules problems. So this is not a bug in gcc still.
I was not speaking about compiler errors, the bugs I was referring to are incorrect optimizations in gcc 3.3.x. It compiles fine in gcc 3.3.2 and 3.3.3. I will gladly fix any non-standard C++, however, at the moment, I have no way of detecting which individual errors you are referring to (I am working on setting up a stable gcc 3.4 environment as we speak). As is, I don't think gcc 3.4 is particularly relevant, since, from what I've heard, the optimization system has been changed substantially. If I am incorrect please let me know. I will try and fix the errors (you will have to wait until I can detect them). However, 3.3 is the target. If I should just revert to 3.2 and wait for 3.4 to become stable on my distro, then I will do so. Thank you for your patience as I am unfamiliar with the protocols of gcc bug registration.
No what I am saying is that your code does things which are "undefined" by the C++ standard so the compiler is able to do anything (including erasing your hard drive, no gcc will not produce code to do that though). Casting to a different type and then accessing the data inside is declared as undefined (there are a few exceptions to that rule, like you access the data via a char pointer and there is no aliasing problems, you can cast to a type which adds const or violate and there is no aliasing problems but casting between different pointer types and then accessing the data for both of the type is undefined).
Created attachment 6482 [details] Further trimmed, and illustrated hack to fix optimization error This is a further trimmed version of the file. I've successfully compiled it with gcc 3.4 using MinGW. Otherwise, I'm not sure to which aliasing errors you are referring, and why exactly they invalidate this as a bug with respect to gcc.
Ok, I understand that. However, if it is undefined why does gcc not give an error? (sorry I missed your last post as I was adding the attachment). Also, the pen class was originally designed by someone else in such a way to allow it to work with non standard compiler packing structures (although that was not accomplished as you'd also need a byte stride for the x dimension to do that effectively). I.E. you could refer to a member of a struct which was defined to align on a 2 byte boundary and still be able to jump from one element to the other. Perhaps this is an unnecessary feature, and I suppose it could be modified to work on a strict element basis rather than byte alignment. However, Is any way to accomplish this besides the method used? The only thing I can think of to satisfy this constraint is to be more explicit (i.e. rather than ((char *&)p)+= stride; do p = (value_type*)((char*)p + stride); ). Would that not work as well? I'll try that and see.... Thanks for the help though. Please excuse me if I seem a bit irritable, this is the weirdest thing I have ever run into :).
Hmmmmm, ok, so I changed a bunch of the functions to use p = (pointer)((char*)p + stride); and it seems to have fixed the mis-optimization. (it could just be because the current function has to recurse another level, but I'm willing to bet that it just doesn't get confused). So my question now is why does gcc even allow the cast to (char*&)? If it will get so confused shouldn't it tell me? It's worth noting that the & part of the cast was a quick fix to get around compiler errors that I was getting about aliasing... it seemed to work at the time :P. So I'll certainly keep that in mind when I'm designing future stuff (I didn't write this really in the first place... so you know :P).
Because undefined means just that, it can be defined any which way the compiler wants it to be defined and differently at different times too. There can be diagnostic for undefined code but there are other bug reports about adding warnings for code like this. I will note that one way to work around this is to use -fno-strict-aliasing. Also the reason why it compiles fine with 3.3.3 and not 3.4.0 is because 3.4.0 is a stricter in following the C++ standard.
Subject: Re: Optimization (of templatized functions?) "adruab at voria dot com" <gcc-bugzilla@gcc.gnu.org> writes: | So my question now is why does gcc even allow the cast to (char*&)? | If it will get so confused shouldn't it tell me? GCC did not get confused, and it did not do any mis-compilation. -- Gaby
Created attachment 6483 [details] The code written correctly (a bit ugly though...) I was referring to gcc 3.4 (under mingw) when I was saying it wasn't producing any errors. It worked fine when I tried it. Yes, it probably was non-standard C++ code, but that doesn't change the fact it didn't give me an error (I would certainly prefer it to if it's going to cause a crash). Does compiling the last attachment give anyone else an error (bug15837-1.cc)? And pardon me for saying so, but if gcc does something different when using optimizations, I'd certainly say that's a mistake when optimizing. As it happens, from what you guys have been saying it probably should have given me an error saying it didn't know how to deal with it. The fact remains, that it didn't. So where the problem lies I don't know, but there IS a problem. I'm just trying to help out here, so please don't get all angry at me because I mis-interpreted what the exact problem was.
In some cases, gcc has to make assumptions (base on what the standard allows it to) for aggressive optimizations. Some of these assumptions can be checked, in which case the compiler will give you an error or warning message. In some other cases, checking whether your code allows to make such assumptions is just out of the picture, for example if the compiler would have to check where exactly each pointer was initialized and how you casted it around -- aliasing refers to the question whether you access each object only through pointers to this object's type, or if, for example, you access a double via pointers to integers. The latter is not allowed by the standard, but the compiler can't prove in some cases that you don't. If your code violates aliasing assumptions, there is a -Wstrict-aliasing switch (or similar) that can warn about some, though not all, invalid constructs in codes. W.
Heh, thanks, that makes sense. I've never actually run into any issue like that before, where problems are undiagnosable by the compiler :). So it working correctly using a p = (type*)((char*)p + stride) method is purely coincidental? Does it violate the same aliasing rules? I assume at this point that you are talking about aliasing as in relation to memory boundaries (in which case the other would still break the code). Can you point me to spot in the spec that discusses aliasing rules? Thanks for the help, I hope that this bug can help diagnose problems like this in the future. Adruab
Reopening to ...
Mark as a dup of bug 21920. *** This bug has been marked as a duplicate of 21920 ***