The following program, compiled with -O3, never returns: #include <string> #include <deque> #include <iterator> #include <iostream> #include <boost/iterator/filter_iterator.hpp> struct Foo { std::string bar, s = ""; char a = '\0'; }; int main() { const std::deque<Foo> foos(14, {""}); const std::string test {}; const auto p = [test] (const auto& foo) { return foo.bar == test; }; using boost::make_filter_iterator; const auto begin = make_filter_iterator(p, std::cbegin(foos), std::cend(foos)); const auto end = make_filter_iterator(p, std::cend(foos), std::cend(foos)); std::cout << std::distance(begin, end) << std::endl; } Observations: - GCC with optimisations -O2 or less returns as expected. - Clang (3.8) returns the correct answer with any optimisation level. - Changing std::deque to std::vector or std::list results in expected behaviour. - The 14 is critical; anything less and the problem disappears. - The sizeof(Foo) is important; removing s or a makes the problem go away. - Capturing test by reference, or just comparing to a constant expression (e.g. foo.bar == " ") results in normal behaviour. - There are no compiler warnings (with -Wall -Wextra -pedantic). - Valgrind reports no errors. - Use fsanitize=undefined and the problem goes away. From: http://stackoverflow.com/q/39424753/2970186
-fno-tree-slp-vectorize "fixes" the issue.
Created attachment 39596 [details] unreduced testcase
-fno-strict-aliasing also resolves.
@Jonathan Wakely: Why do you not consider this critical?
(In reply to Daniel Cooke from comment #4) > @Jonathan Wakely: Why do you not consider this critical? We don't use this field. So normal is just the default.
And if we did use it, it would be the severity as determined by the GCC project, not according to the bug reporter. Bug reporters tend to consider their own bugs of higher importance than everyone else's bugs :)
Well, the best solution would be to simply disable the field for users. This what other bugzillas do.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77551
I'm unable to reduce the test case... The deque_iterator has these members: _Elt_pointer _M_cur; _Elt_pointer _M_first; _Elt_pointer _M_last; _Map_pointer _M_node; the first 3 elements have alias set 12 while _M_node has alias set 13. when all 4 elements are assigned that becomes 2 vector statements, but all use alias set 12. but _M_node is read back in the loop using alias set 13. and thus it seems _M_node is constant. It can be seen first in expand. _M_node is read without non-vectorized: (insn 290 289 291 44 (set (reg/f:DI 117 [ prephitmp_166 ]) (mem/f/c:DI (plus:DI (reg/f:DI 82 virtual-stack-vars) (const_int -176 [0xffffffffffffff50])) [12 MEM[(struct _Deque_iterator *)&D.66017]._M_last+0 S8 A128])) -1 (nil)) (insn 291 290 292 44 (set (reg/f:DI 149 [ _215 ]) (mem/f/c:DI (plus:DI (reg/f:DI 82 virtual-stack-vars) (const_int -168 [0xffffffffffffff58])) [13 MEM[(struct _Deque_iterator *)&D.66017]._M_node+0 S8 A64])) /var/tmp/gcc_test/usr/local/include/c++/7.0.0/bits/stl_deque.h:173 -1 (nil)) _M_node is written using _M_last's alias set in insn 307: (insn 305 304 306 46 (set (mem/c:V2DI (plus:DI (reg/f:DI 82 virtual-stack-vars) (const_int -192 [0xffffffffffffff40])) [12 MEM[(struct Foo * *)&D.66017]+0 S16 A128]) (reg:V2DI 237)) /var/tmp/gcc_test/usr/local/include/c++/7.0.0/bits/stl_deque.h:174 -1 (nil)) (insn 306 305 307 46 (set (reg:V2DI 238) (vec_concat:V2DI (reg/f:DI 117 [ prephitmp_166 ]) (reg:DI 90 [ _25 ]))) /var/tmp/gcc_test/usr/local/include/c++/7.0.0/bits/stl_deque.h:258 -1 (nil)) (insn 307 306 308 46 (set (mem/c:V2DI (plus:DI (reg/f:DI 82 virtual-stack-vars) (const_int -176 [0xffffffffffffff50])) [12 MEM[(struct Foo * *)&D.66017 + 16B]+0 S16 A128]) (reg:V2DI 238)) /var/tmp/gcc_test/usr/local/include/c++/7.0.0/bits/stl_deque.h:174 -1 (nil))
(In reply to Bernd Edlinger from comment #9) > I'm unable to reduce the test case... > > > The deque_iterator has these members: > > _Elt_pointer _M_cur; > _Elt_pointer _M_first; > _Elt_pointer _M_last; > _Map_pointer _M_node; > > the first 3 elements have alias set 12 > while _M_node has alias set 13. > > when all 4 elements are assigned that becomes > 2 vector statements, but all use alias set 12. The vector stores should have become the aliasing set of the struct instead as suggested on the store widening patch.
(In reply to Bernd Edlinger from comment #9) > I'm unable to reduce the test case... Creduce is running and hopefully I will have a small testcase tomorrow morning. trippels@gcc2-power8 ~ % cat check.sh #!/bin/sh ulimit -t 2 clang++ -Wall -Werror -std=c++14 -O3 -fsanitize=undefined test.ii if ! test $? = 0; then exit 1 fi ./a.out 2>&1 | grep "runtime error" if test $? = 0; then exit 1 fi ./a.out if ! test $? = 0; then exit 1 fi ~/gcc_5/usr/local/bin/g++ -std=c++14 -O3 test.ii -w -Wfatal-errors if ! test $? = 0; then exit 1 fi ./a.out if ! test $? = 0; then exit 1 fi g++ -O2 test.ii -w -Wfatal-errors if ! test $? = 0; then exir 1 fi valgrind -q --error-exitcode=1 ./a.out if ! test $? = 0; then exit 1 fi g++ -O3 test.ii -w -Wfatal-errors if ! test $? = 0; then exit 1 fi ./a.out if ! test $? = 137; then exit 1 fi
(In reply to Markus Trippelsdorf from comment #11) > (In reply to Bernd Edlinger from comment #9) > > I'm unable to reduce the test case... > > Creduce is running and hopefully I will have a small testcase tomorrow > morning. > > good luck. here is a hacky patch, test case passes, but I don't know if it is going design-wise in the right direction: --- tree-vect-stmts.c 2016-09-11 19:53:01.841182370 +0200 +++ tree-vect-stmts.c 2016-09-11 19:34:15.691171637 +0200 @@ -5482,6 +5483,7 @@ gimple *new_stmt; int vf; vec_load_store_type vls_type; + bool alias_p = false; if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo) return false; @@ -5740,6 +5742,16 @@ first_stmt = GROUP_FIRST_ELEMENT (stmt_info); first_dr = STMT_VINFO_DATA_REF (vinfo_for_stmt (first_stmt)); group_size = GROUP_SIZE (vinfo_for_stmt (first_stmt)); + next_stmt = GROUP_NEXT_ELEMENT (vinfo_for_stmt (first_stmt)); + for (i = 1; i < group_size; i++) + { + struct data_reference *next_dr + = STMT_VINFO_DATA_REF (vinfo_for_stmt (next_stmt)); + if (get_alias_set (DR_REF (first_dr)) + != get_alias_set (DR_REF (next_dr))) + alias_p = true; + next_stmt = GROUP_NEXT_ELEMENT (vinfo_for_stmt (next_stmt)); + } GROUP_STORE_COUNT (vinfo_for_stmt (first_stmt))++; @@ -6174,8 +6188,10 @@ dataref_ptr, dataref_offset ? dataref_offset - : build_int_cst (reference_alias_ptr_type - (DR_REF (first_dr)), 0)); + : build_int_cst (alias_p + ? ptr_type_node + : reference_alias_ptr_type + (DR_REF (first_dr)), 0)); align = TYPE_ALIGN_UNIT (vectype); if (aligned_access_p (first_dr)) misalign = 0;
Created attachment 39606 [details] somewhat reduced testcase
The reduced testcase needs libstdc++.so from trunk. Otherwise you'll get undefined reference errors to _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE12_Alloc_hiderC1EPcOS3_ .
The patch should at least work (loads are similarly affected). At some point we want some infrastructure in the middle-end to chose a better fallback (as Andrew says, if possible we can use the alias-set of the containing aggregate for example).
(In reply to Richard Biener from comment #15) > The patch should at least work (loads are similarly affected). At some > point we want some infrastructure in the middle-end to chose a better > fallback (as Andrew says, if possible we can use the alias-set of the > containing aggregate for example). Good. Then I go ahead and look for a way how to fix all of the different load and store code path. If the whole group does not agree in the alias_set I would fall back to ptr_type_node, I think if I put that code in an extra function it can be changed later to the alias-set of the containing aggregate.
Author: edlinger Date: Wed Sep 21 13:03:59 2016 New Revision: 240313 URL: https://gcc.gnu.org/viewcvs?rev=240313&root=gcc&view=rev Log: gcc: 2016-09-21 Bernd Edlinger <bernd.edlinger@hotmail.de> PR tree-optimization/77550 * tree-vect-stmts.c (create_array_ref): Change parameters. (get_group_alias_ptr_type): New function. (vectorizable_store, vectorizable_load): Use get_group_alias_ptr_type. testsuite: 2016-09-21 Bernd Edlinger <bernd.edlinger@hotmail.de> PR tree-optimization/77550 * g++.dg/pr77550.C: New test. Added: trunk/gcc/testsuite/g++.dg/pr77550.C Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-vect-stmts.c
Author: edlinger Date: Mon Oct 24 20:21:06 2016 New Revision: 241495 URL: https://gcc.gnu.org/viewcvs?rev=241495&root=gcc&view=rev Log: 2016-10-24 Bernd Edlinger <bernd.edlinger@hotmail.de> Backport from mainline 2016-09-21 Bernd Edlinger <bernd.edlinger@hotmail.de> PR tree-optimization/77550 * tree-vect-stmts.c (create_array_ref): Change parameters. (get_group_alias_ptr_type): New function. (vectorizable_store, vectorizable_load): Use get_group_alias_ptr_type. testsuite: 2016-09-21 Bernd Edlinger <bernd.edlinger@hotmail.de> PR tree-optimization/77550 * g++.dg/pr77550.C: New test. Added: branches/gcc-6-branch/gcc/testsuite/g++.dg/pr77550.C Modified: branches/gcc-6-branch/gcc/ChangeLog branches/gcc-6-branch/gcc/testsuite/ChangeLog branches/gcc-6-branch/gcc/tree-vect-stmts.c
fixed