Created attachment 27147 [details] source code (generated with -E) compiling the attached "real-life" code I get c++ -v -O3 -std=c++0x -ftree-loop-if-convert-stores -c buggy.i Using built-in specs. COLLECT_GCC=c++ COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-apple-darwin11.3.0/4.7.0/lto-wrapper Target: x86_64-apple-darwin11.3.0 Configured with: ./configure --enable-languages=c,c++,fortran --disable-multilib --disable-bootstrap --enable-lto -disable-libitm Thread model: posix gcc version 4.7.0 20120205 (experimental) (GCC) COLLECT_GCC_OPTIONS='-mmacosx-version-min=10.7.3' '-v' '-O3' '-std=c++11' '-ftree-loop-if-convert-stores' '-c' '-shared-libgcc' '-mtune=core2' /usr/local/libexec/gcc/x86_64-apple-darwin11.3.0/4.7.0/cc1plus -fpreprocessed buggy.i -fPIC -quiet -dumpbase buggy.i -mmacosx-version-min=10.7.3 -mtune=core2 -auxbase buggy -O3 -std=c++11 -version -ftree-loop-if-convert-stores -o /var/folders/hd/vml6pgj48xjfkp006s6djxf80000gq/T//ccT2Uqbf.s GNU C++ (GCC) version 4.7.0 20120205 (experimental) (x86_64-apple-darwin11.3.0) compiled by GNU C version 4.7.0 20111201 (experimental), GMP version 4.3.1, MPFR version 2.4.1, MPC version 0.8.1 GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096 GNU C++ (GCC) version 4.7.0 20120205 (experimental) (x86_64-apple-darwin11.3.0) compiled by GNU C version 4.7.0 20111201 (experimental), GMP version 4.3.1, MPFR version 2.4.1, MPC version 0.8.1 GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096 Compiler executable checksum: 3c7081ac5da07d4d6f6ca789c3e35f66 unhandled expression in get_expr_operands(): <truth_not_expr 0x1050e00a0 type <boolean_type 0x141b1bb28 bool sizes-gimplified public unsigned type_6 QI size <integer_cst 0x141b0cba0 constant 8> unit size <integer_cst 0x141b0cbc0 constant 1> align 8 symtab 0 alias set 14 canonical type 0x141b1bb28 precision 1 min <integer_cst 0x141b0cfa0 0> max <integer_cst 0x141b0cfe0 1> pointer_to_this <pointer_type 0x1013bfe70> reference_to_this 
<reference_type 0x1013bff18>> arg 0 <lt_expr 0x104f9d900 type <boolean_type 0x141b1bb28 bool> arg 0 <ssa_name 0x105523550 type <real_type 0x141b1be70 float> visited var <var_decl 0x104fb0aa0 maxpix>def_stmt maxpix_135 = MEM[(struct SiStripTemplate *)templ_76(D) + 40B]; version 135> arg 1 <ssa_name 0x105523820 type <real_type 0x141b1be70 float> visited var <var_decl 0x105475b40 D.104278>def_stmt D.104278_144 = qscale_123 * D.104277_143; version 144> SiStripTemplateReco.cc:160:5> SiStripTemplateReco.cc:160:5> SiStripTemplateReco.cc: In function ‘int SiStripTemplateReco::StripTempReco1D(int, float, float, float, std::vector<float>&, SiStripTemplate&, float&, float&, float&, int&, int, float&)’: SiStripTemplateReco.cc:79:5: internal compiler error: in get_expr_operands, at tree-ssa-operands.c:1035 Please submit a full bug report, with preprocessed source if appropriate. See <http://gcc.gnu.org/bugs.html> for instructions. pb-d-128-141-131-26:bugs48 innocent$ c++ -v -O3 -std=c++0x -c buggy.i Using built-in specs. 
COLLECT_GCC=c++ COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-apple-darwin11.3.0/4.7.0/lto-wrapper Target: x86_64-apple-darwin11.3.0 Configured with: ./configure --enable-languages=c,c++,fortran --disable-multilib --disable-bootstrap --enable-lto -disable-libitm Thread model: posix gcc version 4.7.0 20120205 (experimental) (GCC) COLLECT_GCC_OPTIONS='-mmacosx-version-min=10.7.3' '-v' '-O3' '-std=c++11' '-c' '-shared-libgcc' '-mtune=core2' /usr/local/libexec/gcc/x86_64-apple-darwin11.3.0/4.7.0/cc1plus -fpreprocessed buggy.i -fPIC -quiet -dumpbase buggy.i -mmacosx-version-min=10.7.3 -mtune=core2 -auxbase buggy -O3 -std=c++11 -version -o /var/folders/hd/vml6pgj48xjfkp006s6djxf80000gq/T//ccLlzfNR.s GNU C++ (GCC) version 4.7.0 20120205 (experimental) (x86_64-apple-darwin11.3.0) compiled by GNU C version 4.7.0 20111201 (experimental), GMP version 4.3.1, MPFR version 2.4.1, MPC version 0.8.1 GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096 GNU C++ (GCC) version 4.7.0 20120205 (experimental) (x86_64-apple-darwin11.3.0) compiled by GNU C version 4.7.0 20111201 (experimental), GMP version 4.3.1, MPFR version 2.4.1, MPC version 0.8.1 GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096 Compiler executable checksum: 3c7081ac5da07d4d6f6ca789c3e35f66 COLLECT_GCC_OPTIONS='-mmacosx-version-min=10.7.3' '-v' '-O3' '-std=c++11' '-c' '-shared-libgcc' '-mtune=core2' as -arch x86_64 -force_cpusubtype_ALL -o buggy.o /var/folders/hd/vml6pgj48xjfkp006s6djxf80000gq/T//ccLlzfNR.s COMPILER_PATH=/usr/local/libexec/gcc/x86_64-apple-darwin11.3.0/4.7.0/:/usr/local/libexec/gcc/x86_64-apple-darwin11.3.0/4.7.0/:/usr/local/libexec/gcc/x86_64-apple-darwin11.3.0/:/usr/local/lib/gcc/x86_64-apple-darwin11.3.0/4.7.0/:/usr/local/lib/gcc/x86_64-apple-darwin11.3.0/ LIBRARY_PATH=/usr/local/lib/gcc/x86_64-apple-darwin11.3.0/4.7.0/:/usr/local/lib/gcc/x86_64-apple-darwin11.3.0/4.7.0/../../../:/usr/lib/ COLLECT_GCC_OPTIONS='-mmacosx-version-min=10.7.3' '-v' '-O3' 
'-std=c++11' '-c' '-shared-libgcc' '-mtune=core2' same on linux with 4.7.0 c++ -O3 -std=c++0x -ftree-loop-if-convert-stores -c buggy.i -v Using built-in specs. COLLECT_GCC=/afs/cern.ch/cms/slc5_amd64_gcc470/external/gcc/4.7.0/bin/c++ Target: x86_64-unknown-linux-gnu Configured with: ../configure --prefix=/build/da/build-BOOTSTRAP_slc5_amd64_gcc470/b/tmp/BUILDROOT/de8f21fa6a50f532872e71e7ff72a173/opt/cmssw/slc5_amd64_gcc470/external/gcc/4.7.0 --disable-multilib --disable-nls --enable-languages=c,c++,fortran --enable-gold=yes --enable-lto --with-gmp=/build/da/build-BOOTSTRAP_slc5_amd64_gcc470/b/tmp/BUILDROOT/de8f21fa6a50f532872e71e7ff72a173/opt/cmssw/slc5_amd64_gcc470/external/gcc/4.7.0 --with-mpfr=/build/da/build-BOOTSTRAP_slc5_amd64_gcc470/b/tmp/BUILDROOT/de8f21fa6a50f532872e71e7ff72a173/opt/cmssw/slc5_amd64_gcc470/external/gcc/4.7.0 --with-mpc=/build/da/build-BOOTSTRAP_slc5_amd64_gcc470/b/tmp/BUILDROOT/de8f21fa6a50f532872e71e7ff72a173/opt/cmssw/slc5_amd64_gcc470/external/gcc/4.7.0 --with-ppl=/build/da/build-BOOTSTRAP_slc5_amd64_gcc470/b/tmp/BUILDROOT/de8f21fa6a50f532872e71e7ff72a173/opt/cmssw/slc5_amd64_gcc470/external/gcc/4.7.0 --with-cloog=/build/da/build-BOOTSTRAP_slc5_amd64_gcc470/b/tmp/BUILDROOT/de8f21fa6a50f532872e71e7ff72a173/opt/cmssw/slc5_amd64_gcc470/external/gcc/4.7.0 --enable-cloog-backend=isl --enable-shared CC='gcc -fPIC' CXX='c++ -fPIC' CPP=cpp CXXCPP='c++ -E' Thread model: posix gcc version 4.7.0 20120302 (prerelease) (GCC) COLLECT_GCC_OPTIONS='-O3' '-std=c++11' '-ftree-loop-if-convert-stores' '-c' '-v' '-shared-libgcc' '-mtune=generic' '-march=x86-64' /afs/cern.ch/cms/slc5_amd64_gcc470/external/gcc/4.7.0/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/cc1plus -fpreprocessed buggy.i -quiet -dumpbase buggy.i -mtune=generic -march=x86-64 -auxbase buggy -O3 -std=c++11 -version -ftree-loop-if-convert-stores -o /tmp/innocent/ccEsHv8G.s GNU C++ (GCC) version 4.7.0 20120302 (prerelease) (x86_64-unknown-linux-gnu) compiled by GNU C version 4.7.0 20120302 
(prerelease), GMP version 5.0.2, MPFR version 3.0.1, MPC version 0.9 GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072 GNU C++ (GCC) version 4.7.0 20120302 (prerelease) (x86_64-unknown-linux-gnu) compiled by GNU C version 4.7.0 20120302 (prerelease), GMP version 5.0.2, MPFR version 3.0.1, MPC version 0.9 GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072 Compiler executable checksum: 69c1686c229166b95e41aad94dc9fb81 SiStripTemplateReco.cc: In function 'int SiStripTemplateReco::StripTempReco1D(int, float, float, float, std::vector<float>&, SiStripTemplate&, float&, float&, float&, int&, int, float&)': SiStripTemplateReco.cc:79:5: internal compiler error: in get_expr_operands, at tree-ssa-operands.c:1035 Please submit a full bug report, with preprocessed source if appropriate.
markus@x4 tmp % cat test.cpp
#include <vector>
int a, b;
void foo (std::vector<float> &cluster)
{
  int j;
  float xsum[0];
  for (; a ; ++j)
    {
      xsum[j] = cluster[j];
      if (xsum[j] > 0)
        xsum[j] = 0;
    }
  if (xsum[0])
    b = 0;
}

markus@x4 tmp % gcc test.cpp -std=c++0x -ftree-loop-if-convert-stores -O1
test.cpp: In function ‘void foo(std::vector<float>&)’:
test.cpp:3:6: internal compiler error: in get_expr_operands, at tree-ssa-operands.c:1035
Please submit a full bug report, with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.
Mine.
(gdb) call debug_gimple_stmt (stmt)
_ifc_.9_27 = !(D.11217_6 > 0.0) ? _ifc_.8_26 : _ifc_.7_20;

The negate is spurious. I have a patch.

If-conversion is also incredibly stupid, transforming

  if (cond)
    x = a;
  else
    x = b;

to

  x = cond ? a : x;
  x = !cond ? b : x;

and only DSE removes the first dead store. But we keep both conditionals, as nothing even tries to optimize them:

  D.1966_8 = *D.1965_7;
  _ifc_.3_12 = xsum[j_21];
  _ifc_.5_19 = D.1966_8 > 0.0 ? 0.0 : _ifc_.3_12;
  D.1983_25 = D.1966_8 > 0.0;
  D.1984_26 = ~D.1983_25;
  _ifc_.8_27 = D.1983_25 ? _ifc_.5_19 : D.1966_8;
  xsum[j_21] = _ifc_.8_27;

Reduced testcase:

int a, b;
float xsum[100];
void foo (float *cluster)
{
  int j;
  for (; a ; ++j)
    {
      xsum[j] = cluster[j];
      if (xsum[j] > 0)
        xsum[j] = 0;
    }
  if (xsum[0])
    b = 0;
}
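The transformation described above can be sketched in plain C. This is only an illustration of the three shapes involved (branchy store, the two predicated stores that if-conversion is said to emit, and the single select one would hope for); the function names are made up for this sketch and are not GCC internals.

```c
#include <assert.h>

/* Branchy form: the store to *x is guarded by control flow. */
void store_branchy(float *x, int cond, float a, float b)
{
    if (cond)
        *x = a;
    else
        *x = b;
}

/* Predicated form, as described in the comment: both stores are
   unconditional and each selects between the new value and the
   current contents of *x. */
void store_predicated(float *x, int cond, float a, float b)
{
    *x = cond ? a : *x;
    *x = !cond ? b : *x;
}

/* The single-select form with only one use of the condition. */
void store_single_select(float *x, int cond, float a, float b)
{
    *x = cond ? a : b;
}

/* Hypothetical helper: returns 1 when all three forms store the same
   value for both outcomes of the condition. */
int forms_agree(void)
{
    for (int cond = 0; cond <= 1; ++cond) {
        float p = -1.0f, q = -1.0f, r = -1.0f;
        store_branchy(&p, cond, 2.0f, 3.0f);
        store_predicated(&q, cond, 2.0f, 3.0f);
        store_single_select(&r, cond, 2.0f, 3.0f);
        assert(p == q && q == r);
        if (p != q || q != r)
            return 0;
    }
    return 1;
}
```

Calling forms_agree() from a small driver checks that the forms are interchangeable for these inputs; it says nothing about what any particular GCC version generates for them.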
Ready to test the patch. I have other code that produces the same ICE, in stl_algo.h:3264, but it is not easy to reproduce in a small example...
Created attachment 27151 [details] patch
Patch applied to latest trunk: success on both cases. Thanks.
v.

P.S. Optimizing the if-conversion to produce a single comparison would be appreciated as well.
On Fri, 13 Apr 2012, vincenzo.innocente at cern dot ch wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52969
>
> --- Comment #6 from vincenzo Innocente <vincenzo.innocente at cern dot ch> 2012-04-13 11:59:59 UTC ---
> patch applied to latest trunk.
> success on both cases.
> thanks.
> v.
>
> p.s. optimizing the if-conversion to produce a single comparison will be
> appreciated as well

It looks like RTL optimization is able to get rid of the redundant one for the testcase. Do you have a testcase where this obviously does not happen?

I do have a patch that should address this.

Richard.
Author: rguenth
Date: Fri Apr 13 12:22:16 2012
New Revision: 186416

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=186416
Log:
2012-04-13  Richard Guenther  <rguenther@suse.de>

    PR tree-optimization/52969
    * tree-if-conv.c (predicate_mem_writes): Properly gimplify
    the condition for the COND_EXPR and handle predicate negation
    by swapping the COND_EXPR arms.

    * gcc.dg/torture/pr52969.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.dg/torture/pr52969.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-if-conv.c
Author: rguenth
Date: Fri Apr 13 12:27:02 2012
New Revision: 186417

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=186417
Log:
2012-04-13  Richard Guenther  <rguenther@suse.de>

    PR tree-optimization/52969
    * tree-if-conv.c (predicate_mem_writes): Properly gimplify
    the condition for the COND_EXPR and handle predicate negation
    by swapping the COND_EXPR arms.

    * gcc.dg/torture/pr52969.c: New testcase.

Added:
    branches/gcc-4_7-branch/gcc/testsuite/gcc.dg/torture/pr52969.c
Modified:
    branches/gcc-4_7-branch/gcc/ChangeLog
    branches/gcc-4_7-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_7-branch/gcc/tree-if-conv.c
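For readers unfamiliar with the idea in the ChangeLog entry, "handling predicate negation by swapping the COND_EXPR arms" relies on the identity !c ? t : f == c ? f : t. The C sketch below illustrates only that identity; it is not the code from tree-if-conv.c, and all function names are invented for the example. It also shows why, for floating-point comparisons, rewriting !(x > 0) as (x <= 0) would not be equivalent when x is NaN. That is offered as background, not as the stated motivation for the patch.

```c
#include <assert.h>
#include <math.h>

/* Select with a negated predicate, the shape seen in the ICE dump. */
float select_negated(float x, float t, float f)
{
    return !(x > 0.0f) ? t : f;
}

/* Same select with the negation removed and the arms swapped. */
float select_swapped(float x, float t, float f)
{
    return (x > 0.0f) ? f : t;
}

/* Inverting the comparison instead; this differs when x is NaN,
   because both (NaN > 0) and (NaN <= 0) are false. */
float select_inverted(float x, float t, float f)
{
    return (x <= 0.0f) ? t : f;
}

/* Hypothetical helper: 1 if arm swapping matches negation for x. */
int swap_matches_negation(float x)
{
    return select_negated(x, 1.0f, 2.0f) == select_swapped(x, 1.0f, 2.0f);
}

/* Hypothetical helper: 1 if the inverted comparison disagrees on NaN. */
int inverted_compare_differs_on_nan(void)
{
    return select_negated(NAN, 1.0f, 2.0f) != select_inverted(NAN, 1.0f, 2.0f);
}

/* Checks the identity on a few representative inputs. */
void self_check(void)
{
    assert(swap_matches_negation(1.0f));
    assert(swap_matches_negation(-1.0f));
    assert(swap_matches_negation(0.0f));
    assert(swap_matches_negation(NAN));
}
```

The NaN checks assume IEEE semantics, so they are only meaningful when the file is built without -ffast-math or -Ofast.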
Fixed.
I do not have a clear case in hand with evidence of a "double" compare.
I will have a closer look at the "real life" code.

btw
I just noticed that your test case does not vectorize even if I rewrite it as

  for (; a ; ++j)
    xsum[j] = (cluster[j] > 0.) ? 0. : cluster[j];

any idea why?

c++ -O3 -c ifconv.cc -ftree-loop-if-convert-stores -ftree-vectorizer-verbose=9

Analyzing loop at ifconv.cc:10

10: ===== analyze_loop_nest =====
10: === vect_analyze_loop_form ===
10: not vectorized: control flow in loop.
10: bad loop form.
ifconv.cc:3: note: vectorized 0 loops in function.

pb-d-128-141-131-26:bugs48 innocent$ c++ -O3 -c ifconv.cc -ftree-loop-if-convert-stores -ftree-vectorizer-verbose=9

Analyzing loop at ifconv.cc:10

10: ===== analyze_loop_nest =====
10: === vect_analyze_loop_form ===
10: not vectorized: control flow in loop.
10: bad loop form.
ifconv.cc:3: note: vectorized 0 loops in function.

pb-d-128-141-131-26:bugs48 innocent$ c++ -Ofast -c ifconv.cc -ftree-loop-if-convert-stores -ftree-vectorizer-verbose=9

Analyzing loop at ifconv.cc:10

10: ===== analyze_loop_nest =====
10: === vect_analyze_loop_form ===
10: not vectorized: multiple exits.
10: bad loop form.
ifconv.cc:3: note: vectorized 0 loops in function.

pb-d-128-141-131-26:bugs48 innocent$ c++ -Ofast -c ifconv.cc -ftree-vectorizer-verbose=9

Analyzing loop at ifconv.cc:10

10: ===== analyze_loop_nest =====
10: === vect_analyze_loop_form ===
10: not vectorized: multiple exits.
10: bad loop form.
ifconv.cc:3: note: vectorized 0 loops in function.
(In reply to comment #11)
> I do not have a clear case in hand with evidence of "double" compare
> I will have a closer look to "real life" code.
>
> btw
> I just noticed that your test case does not vectorize even if
> I rewrite as
> for (; a ; ++j)
> xsum[j] = (cluster[j] > 0.) ? 0. : cluster[j];
>
> any idea why?

It's because of the loop exit condition, which we realize makes the loop execute at most once. If you rewrite it to

  for (; j < a; ++j)

then it works.
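A minimal sketch of the loop shape suggested in the reply, with a counted exit condition and a single select in the body. The function name, the bound parameter n (standing in for the global a), the explicit initialization of j, and the clamp against the array size are all assumptions added for this example; whether a given GCC version actually vectorizes it depends on the compiler and flags and is not claimed here.

```c
#include <assert.h>

#define XSUM_SIZE 100

float xsum[XSUM_SIZE];

/* Copies cluster[0..n-1] into xsum, replacing positive values by 0.
   The loop has one exit, controlled by the induction variable j,
   which is the form the reply says works. */
void clamp_positive(const float *cluster, int n)
{
    /* Keep the sketch memory-safe; not part of the original testcase. */
    if (n > XSUM_SIZE)
        n = XSUM_SIZE;

    for (int j = 0; j < n; ++j)
        xsum[j] = (cluster[j] > 0.0f) ? 0.0f : cluster[j];
}
```

To inspect what the vectorizer does with it, the file can be compiled with the same options used earlier in this report (for example -O3 -ftree-vectorizer-verbose=9 on GCC 4.7).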
Richard, please look at PR59275. I think your testcase can produce unoptimized code.