On Linux/x86-64, r215740 failed to profiledbootstrap when configured with --with-build-config=bootstrap-lto: /export/project/git/gcc-regression/gcc/gcc/genhooks.c: In function ‘emit_documentation’: /export/project/git/gcc-regression/gcc/gcc/genhooks.c:117:1: internal compiler error: in freqs_to_counts_path, at tree-ssa-threadupdate.c:981 emit_documentation (const char *in_fname) ^ 0x107902e freqs_to_counts_path /export/project/git/gcc-regression/gcc/gcc/tree-ssa-threadupdate.c:981 0x107902e ssa_fix_duplicate_block_edges(redirection_data*, ssa_local_info_t*) /export/project/git/gcc-regression/gcc/gcc/tree-ssa-threadupdate.c:1061 0x10791ea ssa_create_duplicates(redirection_data**, ssa_local_info_t*) /export/project/git/gcc-regression/gcc/gcc/tree-ssa-threadupdate.c:1275 0x1080149 void hash_table<redirection_data, xcallocator, false>::traverse_noresize<ssa_local_info_t*, &(ssa_create_duplicates(redirection_data**, ssa_local_info_t*))>(ssa_local_info_t*) /export/project/git/gcc-regression/gcc/gcc/hash-table.h:942 0x1080149 void hash_table<redirection_data, xcallocator, false>::traverse<ssa_local_info_t*, &(ssa_create_duplicates(redirection_data**, ssa_local_info_t*))>(ssa_local_info_t*) /export/project/git/gcc-regression/gcc/gcc/hash-table.h:964 0x10794b3 thread_block_1 /export/project/git/gcc-regression/gcc/gcc/tree-ssa-threadupdate.c:1515 0x107d556 thread_block /export/project/git/gcc-regression/gcc/gcc/tree-ssa-threadupdate.c:1559 0x107d556 thread_through_all_blocks(bool) /export/project/git/gcc-regression/gcc/gcc/tree-ssa-threadupdate.c:2279 0x114eecb finalize_jump_threads /export/project/git/gcc-regression/gcc/gcc/tree-vrp.c:9856 0x114eecb execute_vrp /export/project/git/gcc-regression/gcc/gcc/tree-vrp.c:10010 0x114eecb execute /export/project/git/gcc-regression/gcc/gcc/tree-vrp.c:10073 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <http://gcc.gnu.org/bugs.html> for instructions. make[4]: *** [/tmp/ccyuxuEg.ltrans0.ltrans.o] Error 1 lto-wrapper: fatal error: make returned 2 exit status compilation terminated. /bin/ld: lto-wrapper failed r215738 is OK. r215739 may be the cause.
Looks like a dup of https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63422, due to incoming profile insanities. Will loosen up the new assert. Teresa
r215830 doesn't solve the problem. Now I got /export/project/git/gcc-regression/gcc/gcc/c/c-parser.c:3898:1: error: verify_flow_info: Wrong probability of edge 238->241 -1819341072 c_parser_attributes (c_parser *parser) ^ /export/project/git/gcc-regression/gcc/gcc/c/c-parser.c:3898:1: internal compiler error: verify_flow_info failed 0x6b2baa verify_flow_info() /export/project/git/gcc-regression/gcc/gcc/cfghooks.c:260 0xdd9bc7 cleanup_tree_cfg_noloop /export/project/git/gcc-regression/gcc/gcc/tree-cfgcleanup.c:765 0xdd9bc7 cleanup_tree_cfg() /export/project/git/gcc-regression/gcc/gcc/tree-cfgcleanup.c:814 0xbf2b04 execute_function_todo /export/project/git/gcc-regression/gcc/gcc/passes.c:1704 0xbff5b7 do_per_function /export/project/git/gcc-regression/gcc/gcc/passes.c:1478 0xbff5b7 execute_todo /export/project/git/gcc-regression/gcc/gcc/passes.c:1808 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <http://gcc.gnu.org/bugs.html> for instructions. make[4]: *** [/tmp/ccxbDLKp.ltrans2.ltrans.o] Error 1
Presumably the fix for PR63422 (r215822) fixed the initial problem, but r215830 must have introduced this. Will take a look right now. Teresa On Fri, Oct 3, 2014 at 8:59 AM, hjl.tools at gmail dot com <gcc-bugzilla@gcc.gnu.org> wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63432 > > --- Comment #2 from H.J. Lu <hjl.tools at gmail dot com> --- > r215830 doesn't solve the problem. Now I got > > /export/project/git/gcc-regression/gcc/gcc/c/c-parser.c:3898:1: error: > verify_flow_info: Wrong probability of edge 238->241 -1819341072 > c_parser_attributes (c_parser *parser) > ^ > /export/project/git/gcc-regression/gcc/gcc/c/c-parser.c:3898:1: internal > compiler error: verify_flow_info failed > 0x6b2baa verify_flow_info() > /export/project/git/gcc-regression/gcc/gcc/cfghooks.c:260 > 0xdd9bc7 cleanup_tree_cfg_noloop > /export/project/git/gcc-regression/gcc/gcc/tree-cfgcleanup.c:765 > 0xdd9bc7 cleanup_tree_cfg() > /export/project/git/gcc-regression/gcc/gcc/tree-cfgcleanup.c:814 > 0xbf2b04 execute_function_todo > /export/project/git/gcc-regression/gcc/gcc/passes.c:1704 > 0xbff5b7 do_per_function > /export/project/git/gcc-regression/gcc/gcc/passes.c:1478 > 0xbff5b7 execute_todo > /export/project/git/gcc-regression/gcc/gcc/passes.c:1808 > Please submit a full bug report, > with preprocessed source if appropriate. > Please include the complete backtrace with any bug report. > See <http://gcc.gnu.org/bugs.html> for instructions. > make[4]: *** [/tmp/ccxbDLKp.ltrans2.ltrans.o] Error 1 > > -- > You are receiving this mail because: > You are on the CC list for the bug.
r215830 introduced: /* Scale up the frequency by REG_BR_PROB_BASE, to avoid rounding errors applying the edge probability when the frequencies are very small. */ epath->src->count = epath->src->frequency * REG_BR_PROB_BASE; FOR_EACH_EDGE (esucc, ei, epath->src->succs) esucc->count = apply_probability (esucc->src->count, esucc->probability); Can it guarantee that esucc->src->count has been scaled up by REG_BR_PROB_BASE from esucc->src->frequency?
On Fri, Oct 3, 2014 at 10:21 AM, hjl.tools at gmail dot com <gcc-bugzilla@gcc.gnu.org> wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63432 > > --- Comment #4 from H.J. Lu <hjl.tools at gmail dot com> --- > r215830 introduced: > > /* Scale up the frequency by REG_BR_PROB_BASE, to avoid rounding > errors applying the edge probability when the frequencies are very > small. */ > epath->src->count = epath->src->frequency * REG_BR_PROB_BASE; > FOR_EACH_EDGE (esucc, ei, epath->src->succs) > esucc->count = apply_probability (esucc->src->count, > esucc->probability); > > Can it guarantee that esucc->src->count has been scaled up by > REG_BR_PROB_BASE from esucc->src->frequency? I believe so. Since esucc are successor edges of epath->src, esucc->src == epath->src, and we have just set epath->src->count = epath->src->frequency * REG_BR_PROB_BASE. BTW, I am having a little trouble doing an LTO profiledbootstrap on my linux/x86_64 machine. Here's what I did: $ /usr/local/google/home/tejohnson/gcc_trunk_7/configure --prefix=/usr/local/google/home/tejohnson/gcc_trunk_7_validate_lto/bld-gcc/install --with-build-config=bootstrap-lto $ make -j8 profiledbootstrap I first got an error: --- /usr/local/google/home/tejohnson/gcc_trunk_7/gcc/gengtype.c: In function ‘type* create_user_defined_type(const char*, fileloc*)’: /usr/local/google/home/tejohnson/gcc_trunk_7/gcc/gengtype.c:583:29: error: ‘next’ may be used uninitialized in this function [-Werror=maybe-uninitialized] str += strspn (str, delim); ^ /usr/local/google/home/tejohnson/gcc_trunk_7/gcc/gengtype.c:636:13: note: ‘next’ was declared here char *next; ^ cc1plus: all warnings being treated as errors make[3]: *** [gengtype.o] Error 1 --- which I made a simple change in my gcc client to address. After restarting the "make -j8 profiledbootstrap" I got a bunch of other errors about undefined references in ltrans.o files. In case the first error left things in a bad state I have wiped everything out and restarted the process. Will let you know once I have reproduced. Thanks, Teresa > > -- > You are receiving this mail because: > You are on the CC list for the bug.
(In reply to Teresa Johnson from comment #5) > > BTW, I am having a little trouble doing an LTO profiledbootstrap on my > linux/x86_64 machine. Here's what I did: > > $ /usr/local/google/home/tejohnson/gcc_trunk_7/configure > --prefix=/usr/local/google/home/tejohnson/gcc_trunk_7_validate_lto/bld-gcc/ > install > --with-build-config=bootstrap-lto > $ make -j8 profiledbootstrap > You need to configure GCC with --disable-werror.
On Fri, Oct 3, 2014 at 10:33 AM, hjl.tools at gmail dot com <gcc-bugzilla@gcc.gnu.org> wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63432 > > --- Comment #6 from H.J. Lu <hjl.tools at gmail dot com> --- > (In reply to Teresa Johnson from comment #5) >> >> BTW, I am having a little trouble doing an LTO profiledbootstrap on my >> linux/x86_64 machine. Here's what I did: >> >> $ /usr/local/google/home/tejohnson/gcc_trunk_7/configure >> --prefix=/usr/local/google/home/tejohnson/gcc_trunk_7_validate_lto/bld-gcc/ >> install >> --with-build-config=bootstrap-lto >> $ make -j8 profiledbootstrap >> > > You need to configure GCC with --disable-werror. Did that, restarted the whole thing, still seeing undefined ref errors. Here is an example: /usr/local/google/home/tejohnson/gcc_trunk_7_validate_lto/bld-gcc/./prev-gcc/xg++ -B/usr/local/google/home/tejohnson/gcc_trunk_7_validate_lto/bld-gcc/./prev-gcc/ -B/usr/local/google/home/tejohnson/gcc_trunk_7_validate_lto/bld-gcc/install/x86_64-unknown-linux-gnu/bin/ -nostdinc++ -B/usr/local/google/home/tejohnson/gcc_trunk_7_validate_lto/bld-gcc/prev-x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs -B/usr/local/google/home/tejohnson/gcc_trunk_7_validate_lto/bld-gcc/prev-x86_64-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs -I/usr/local/google/home/tejohnson/gcc_trunk_7_validate_lto/bld-gcc/prev-x86_64-unknown-linux-gnu/libstdc++-v3/include/x86_64-unknown-linux-gnu -I/usr/local/google/home/tejohnson/gcc_trunk_7_validate_lto/bld-gcc/prev-x86_64-unknown-linux-gnu/libstdc++-v3/include -I/usr/local/google/home/tejohnson/gcc_trunk_7/libstdc++-v3/libsupc++ -L/usr/local/google/home/tejohnson/gcc_trunk_7_validate_lto/bld-gcc/prev-x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs -L/usr/local/google/home/tejohnson/gcc_trunk_7_validate_lto/bld-gcc/prev-x86_64-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs -g -O2 -flto=jobserver -frandom-seed=1 -fprofile-use -DIN_GCC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common -DHAVE_CONFIG_H -DGENERATOR_FILE -static-libstdc++ -static-libgcc -o build/genmodes \ build/genmodes.o build/errors.o .././libiberty/libiberty.a /usr/local/google/home/tejohnson/tmpdir/cc9adAmE.ltrans0.ltrans.o: In function `main': /usr/local/google/home/tejohnson/gcc_trunk_7/gcc/genchecksum.c:108: undefined reference to `md5_init_ctx' /usr/local/google/home/tejohnson/gcc_trunk_7/gcc/genchecksum.c:111: undefined reference to `md5_finish_ctx' /usr/local/google/home/tejohnson/tmpdir/cc9adAmE.ltrans0.ltrans.o: In function `dosum': /usr/local/google/home/tejohnson/gcc_trunk_7/gcc/genchecksum.c:40: undefined reference to `fopen_unlocked' /usr/local/google/home/tejohnson/gcc_trunk_7/gcc/genchecksum.c:81: undefined reference to `md5_process_block' /usr/local/google/home/tejohnson/gcc_trunk_7/gcc/genchecksum.c:86: undefined reference to `md5_process_bytes' /usr/local/google/home/tejohnson/gcc_trunk_7/gcc/genchecksum.c:43: undefined reference to `xstrerror' /usr/local/google/home/tejohnson/gcc_trunk_7/gcc/genchecksum.c:90: undefined reference to `xstrerror' /usr/local/google/home/tejohnson/gcc_trunk_7/gcc/genchecksum.c:50: undefined reference to `xstrerror' collect2: error: ld returned 1 exit status make[3]: *** [build/genchecksum] Error 1 Anything obvious that am I doing wrong that could cause this? I see that those functions should come from libiberty, and I see libiberty.a in the above link. In the meantime, if you have the gcda files and preprocessed source or whatever I need to reproduce the failure and can attach them to the bug that would be great. Thanks, Teresa > > -- > You are receiving this mail because: > You are on the CC list for the bug.
Bug is in static void recompute_probabilities (basic_block bb) { edge esucc; edge_iterator ei; FOR_EACH_EDGE (esucc, ei, bb->succs) { if (bb->count) esucc->probability = GCOV_COMPUTE_SCALE (esucc->count, bb->count); if (esucc->probability > REG_BR_PROB_BASE) { We got (gdb) whatis esucc->probability type = int (gdb) p esucc->count $7 = 2822215 (gdb) p bb->count $8 = 1 (gdb) p esucc->probability $9 = -1842621072 (gdb) When count is set from frequency scaled up by REG_BR_PROB_BASE, probability may overflow.
This works for me: diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c index d660fdb..c6f3956 100644 --- a/gcc/tree-ssa-threadupdate.c +++ b/gcc/tree-ssa-threadupdate.c @@ -874,7 +874,8 @@ recompute_probabilities (basic_block bb) if (bb->count) esucc->probability = GCOV_COMPUTE_SCALE (esucc->count, bb->count); - if (esucc->probability > REG_BR_PROB_BASE) + if (esucc->probability < 0 + || esucc->probability > REG_BR_PROB_BASE) { /* Can happen with missing/guessed probabilities, since we may determine that more is flowing along duplicated
On Fri, Oct 3, 2014 at 12:47 PM, hjl.tools at gmail dot com <gcc-bugzilla@gcc.gnu.org> wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63432 > > H.J. Lu <hjl.tools at gmail dot com> changed: > > What |Removed |Added > ---------------------------------------------------------------------------- > Status|UNCONFIRMED |NEW > Last reconfirmed| |2014-10-03 > Ever confirmed|0 |1 > > --- Comment #8 from H.J. Lu <hjl.tools at gmail dot com> --- > Bug is in > > static void > recompute_probabilities (basic_block bb) > { > edge esucc; > edge_iterator ei; > FOR_EACH_EDGE (esucc, ei, bb->succs) > { > if (bb->count) > esucc->probability = GCOV_COMPUTE_SCALE (esucc->count, > bb->count); > if (esucc->probability > REG_BR_PROB_BASE) > { > > We got > > (gdb) whatis esucc->probability > type = int > (gdb) p esucc->count > $7 = 2822215 > (gdb) p bb->count > $8 = 1 > (gdb) p esucc->probability > $9 = -1842621072 This points to a bigger problem, since esucc->count should be <= bb->count. Can you attach the input files and command line and I will take a closer look? Thanks, Teresa > (gdb) > > When count is set from frequency scaled up by REG_BR_PROB_BASE, > probability may overflow. > > -- > You are receiving this mail because: > You are on the CC list for the bug.
(In reply to Teresa Johnson from comment #10) > > This points to a bigger problem, since esucc->count should be <= > bb->count. Can you attach the input files and command line and I will > take a closer look? > Since the compressed input file is about 3MB, I can't upload it here.
Feel free to email it to me at tejohnson@google.com. Teresa On Fri, Oct 3, 2014 at 1:23 PM, hjl.tools at gmail dot com <gcc-bugzilla@gcc.gnu.org> wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63432 > > --- Comment #11 from H.J. Lu <hjl.tools at gmail dot com> --- > (In reply to Teresa Johnson from comment #10) >> >> This points to a bigger problem, since esucc->count should be <= >> bb->count. Can you attach the input files and command line and I will >> take a closer look? >> > > Since the compressed input file is about 3MB, I can't upload it here. > > -- > You are receiving this mail because: > You are on the CC list for the bug.
Thanks to H.J. for the test case, I have reproduced the issue. It exposed two separate problems. Cc'ing Honza and Jeff for input on the profile count and jump threading issues, respectively. The first is that we are calling my new freqs_to_counts_path in cases where the functions do have non-zero profile counts. I am invoking this when either the profile status is != PROFILE_READ, or when the entry block count is 0. The latter is the case where the read in profile had zero counts for the function, so we kept the guessed frequencies (see counts_to_freqs in predict.c). In some cases I am finding the the profile count on the entry block is 0, but the rest of the blocks (bb 2 and descendants) have non-zero profile counts. I thought the way to check for this case was to look at the entry block's count, and in fact that is what we seem to do in other places. I could also check the entry block successor counts, or I could simply check the blocks along the path (to see if they all have 0 counts but non-0 frequencies). I don't want to check all bbs in the function for every path. Honza, is there a good way to detect this case (function is marked PROFILE_READ but has all-0 bb counts and we kept the estimated frequencies)? Even when I skip freqs_to_counts_path in this case, we still get the insane edge probability. I dumped out some more verbose info while jump threading, and I am seeing a jump threading path that I don't understand. Since it violates my assumptions, the new profile update computations go haywire. Here is the path that we are attempting to thread: (119, 150) incoming edge; (150, 155) joiner; (13, 15) normal; Notice that the normal edge is not connected to the rest of the path. This path appears to be constructed during jump threading, as block 155 was created by an earlier threading opportunity. In fact, the earlier threadings that created bb 155 removed all predecessors of bb 13. We originally had bb 150 with successors bb 12 and bb 13 bb 12 with successor bb 13. Then we do: Threaded jump 12 --> 13 to 155 Threaded jump 150 --> 13 to 155 and bb 13 has no more predecessors. So I'm not sure what it means to have a jump threading path through 150 and 13? Jeff, is the above type of jump threading path expected and reasonable? If so, I need to redo some of the profile update code to handle it better. Thanks, Teresa On Fri, Oct 3, 2014 at 1:36 PM, Teresa Johnson <tejohnson@google.com> wrote: > Feel free to email it to me at tejohnson@google.com. > Teresa > > On Fri, Oct 3, 2014 at 1:23 PM, hjl.tools at gmail dot com > <gcc-bugzilla@gcc.gnu.org> wrote: >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63432 >> >> --- Comment #11 from H.J. Lu <hjl.tools at gmail dot com> --- >> (In reply to Teresa Johnson from comment #10) >>> >>> This points to a bigger problem, since esucc->count should be <= >>> bb->count. Can you attach the input files and command line and I will >>> take a closer look? >>> >> >> Since the compressed input file is about 3MB, I can't upload it here. >> >> -- >> You are receiving this mail because: >> You are on the CC list for the bug. > > > > -- > Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
On Fri, Oct 3, 2014 at 3:35 PM, Teresa Johnson <tejohnson@google.com> wrote: > Thanks to H.J. for the test case, I have reproduced the issue. It > exposed two separate problems. Cc'ing Honza and Jeff for input on the > profile count and jump threading issues, respectively. > > The first is that we are calling my new freqs_to_counts_path in cases > where the functions do have non-zero profile counts. I am invoking > this when either the profile status is != PROFILE_READ, or when the > entry block count is 0. The latter is the case where the read in > profile had zero counts for the function, so we kept the guessed > frequencies (see counts_to_freqs in predict.c). In some cases I am > finding the the profile count on the entry block is 0, but the rest of > the blocks (bb 2 and descendants) have non-zero profile counts. I > thought the way to check for this case was to look at the entry > block's count, and in fact that is what we seem to do in other places. > I could also check the entry block successor counts, or I could simply > check the blocks along the path (to see if they all have 0 counts but > non-0 frequencies). I don't want to check all bbs in the function for > every path. Honza, is there a good way to detect this case (function > is marked PROFILE_READ but has all-0 bb counts and we kept the > estimated frequencies)? > > Even when I skip freqs_to_counts_path in this case, we still get the > insane edge probability. I dumped out some more verbose info while > jump threading, and I am seeing a jump threading path that I don't > understand. Since it violates my assumptions, the new profile update > computations go haywire. Here is the path that we are attempting to > thread: > > (119, 150) incoming edge; (150, 155) joiner; (13, 15) normal; > > Notice that the normal edge is not connected to the rest of the path. > This path appears to be constructed during jump threading, as block > 155 was created by an earlier threading opportunity. In fact, the > earlier threadings that created bb 155 removed all predecessors of bb > 13. We originally had > > bb 150 with successors bb 12 and bb 13 > bb 12 with successor bb 13. > > Then we do: > Threaded jump 12 --> 13 to 155 > Threaded jump 150 --> 13 to 155 > and bb 13 has no more predecessors. So I'm not sure what it means to > have a jump threading path through 150 and 13? > > Jeff, is the above type of jump threading path expected and > reasonable? If so, I need to redo some of the profile update code to > handle it better. I see what happened - 155 is a duplicate of 13 from the earlier threading. So the path that was originally: (119, 150) incoming edge; (150, 13) joiner; (13, 15) normal; became (119, 150) incoming edge; (150, 155) joiner; (13, 15) normal; This happened naturally when we redirected the incoming edge 150->13 to 155 on the earlier jump threading. During the same threading we created duplicate bb 154 of bb 15, and redirect bb 155 to bb 154. This redirection does not affect the original 13->15 edge, and thus 13->15 remains on the above path. The 155->154 edge should be the new edge in the path instead of 13->15. But the edges recorded in other paths are not explicitly updated, so the normal edge was left alone. This works out ok from a correctness standpoint, since bb 13 still exists and has the same outgoing edges as it did before, which were duplicated onto bb 154. However, those outgoing edges have had their profile counts updated and it no longer works out to use those profile counts to update the counts along this new threading path. We really want to look at 155->154 for the profile to update. Probably we should be updating remaining jump threading paths when we perform jump threading. I need to think about the best way to do that. Teresa > > Thanks, > Teresa > > > On Fri, Oct 3, 2014 at 1:36 PM, Teresa Johnson <tejohnson@google.com> wrote: >> Feel free to email it to me at tejohnson@google.com. >> Teresa >> >> On Fri, Oct 3, 2014 at 1:23 PM, hjl.tools at gmail dot com >> <gcc-bugzilla@gcc.gnu.org> wrote: >>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63432 >>> >>> --- Comment #11 from H.J. Lu <hjl.tools at gmail dot com> --- >>> (In reply to Teresa Johnson from comment #10) >>>> >>>> This points to a bigger problem, since esucc->count should be <= >>>> bb->count. Can you attach the input files and command line and I will >>>> take a closer look? >>>> >>> >>> Since the compressed input file is about 3MB, I can't upload it here. >>> >>> -- >>> You are receiving this mail because: >>> You are on the CC list for the bug. >> >> >> >> -- >> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413 > > > > -- > Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
On Fri, Oct 3, 2014 at 11:15 PM, Teresa Johnson <tejohnson@google.com> wrote: > On Fri, Oct 3, 2014 at 3:35 PM, Teresa Johnson <tejohnson@google.com> wrote: >> Thanks to H.J. for the test case, I have reproduced the issue. It >> exposed two separate problems. Cc'ing Honza and Jeff for input on the >> profile count and jump threading issues, respectively. >> >> The first is that we are calling my new freqs_to_counts_path in cases >> where the functions do have non-zero profile counts. I am invoking >> this when either the profile status is != PROFILE_READ, or when the >> entry block count is 0. The latter is the case where the read in >> profile had zero counts for the function, so we kept the guessed >> frequencies (see counts_to_freqs in predict.c). In some cases I am >> finding the the profile count on the entry block is 0, but the rest of >> the blocks (bb 2 and descendants) have non-zero profile counts. I >> thought the way to check for this case was to look at the entry >> block's count, and in fact that is what we seem to do in other places. >> I could also check the entry block successor counts, or I could simply >> check the blocks along the path (to see if they all have 0 counts but >> non-0 frequencies). I don't want to check all bbs in the function for >> every path. Honza, is there a good way to detect this case (function >> is marked PROFILE_READ but has all-0 bb counts and we kept the >> estimated frequencies)? I am also seeing cases where the entry bb 0 has a non-zero count, but the rest of the bbs in the function have 0 counts and estimated frequencies. So I think the safest thing to do here, and what I have implemented, is if the profile status is PROFILE_READ, then check all the bbs/edges along the path - if and only if they all have 0 counts and there is at least one non-zero frequency do we do the freqs_to_counts. >> >> Even when I skip freqs_to_counts_path in this case, we still get the >> insane edge probability. I dumped out some more verbose info while >> jump threading, and I am seeing a jump threading path that I don't >> understand. Since it violates my assumptions, the new profile update >> computations go haywire. Here is the path that we are attempting to >> thread: >> >> (119, 150) incoming edge; (150, 155) joiner; (13, 15) normal; >> >> Notice that the normal edge is not connected to the rest of the path. >> This path appears to be constructed during jump threading, as block >> 155 was created by an earlier threading opportunity. In fact, the >> earlier threadings that created bb 155 removed all predecessors of bb >> 13. We originally had >> >> bb 150 with successors bb 12 and bb 13 >> bb 12 with successor bb 13. >> >> Then we do: >> Threaded jump 12 --> 13 to 155 >> Threaded jump 150 --> 13 to 155 >> and bb 13 has no more predecessors. So I'm not sure what it means to >> have a jump threading path through 150 and 13? >> >> Jeff, is the above type of jump threading path expected and >> reasonable? If so, I need to redo some of the profile update code to >> handle it better. > > I see what happened - 155 is a duplicate of 13 from the earlier > threading. So the path that was originally: > > (119, 150) incoming edge; (150, 13) joiner; (13, 15) normal; > > became > > (119, 150) incoming edge; (150, 155) joiner; (13, 15) normal; > > This happened naturally when we redirected the incoming edge 150->13 > to 155 on the earlier jump threading. During the same threading we > created duplicate bb 154 of bb 15, and redirect bb 155 to bb 154. This > redirection does not affect the original 13->15 edge, and thus 13->15 > remains on the above path. The 155->154 edge should be the new edge in > the path instead of 13->15. But the edges recorded in other paths are > not explicitly updated, so the normal edge was left alone. > > This works out ok from a correctness standpoint, since bb 13 still > exists and has the same outgoing edges as it did before, which were > duplicated onto bb 154. However, those outgoing edges have had their > profile counts updated and it no longer works out to use those profile > counts to update the counts along this new threading path. We really > want to look at 155->154 for the profile to update. > > Probably we should be updating remaining jump threading paths when we > perform jump threading. I need to think about the best way to do that. Looking at the code, I think it attempts to check for this case and prevent it but that code does not work in this case because of the order the paths are identified and subsequently iterated in the paths vec. In mark_threaded_blocks it looks at paths with joiners and if there are any edges along the path already part of another path it will skip that path. But in this case, we registered the paths in this order: Registering jump thread: (119, 150) incoming edge; (150, 13) joiner; (13, 15) normal; ... Registering jump thread: (150, 13) incoming edge; (13, 15) joiner; (15, 17) normal; For the first path, the path is attached to incoming edge 119->150. So when we walk edges in the 2nd path we don't see the first path. If we looked at the paths in the reverse order we would have seen the path attached to incoming edge 150->13 and skipped the 119->150 path. Note that we end up doing the actual threading in the reverse order - first we do the threading through 13 (the second path), then later the threading through 150 (the first path), leading to the issue I am seeing. Jeff, what is intended here - should we not be threading both of these paths? Thanks, Teresa
On Sat, Oct 4, 2014 at 8:34 AM, Teresa Johnson <tejohnson@google.com> wrote: > On Fri, Oct 3, 2014 at 11:15 PM, Teresa Johnson <tejohnson@google.com> wrote: >> On Fri, Oct 3, 2014 at 3:35 PM, Teresa Johnson <tejohnson@google.com> wrote: >>> Thanks to H.J. for the test case, I have reproduced the issue. It >>> exposed two separate problems. Cc'ing Honza and Jeff for input on the >>> profile count and jump threading issues, respectively. >>> >>> The first is that we are calling my new freqs_to_counts_path in cases >>> where the functions do have non-zero profile counts. I am invoking >>> this when either the profile status is != PROFILE_READ, or when the >>> entry block count is 0. The latter is the case where the read in >>> profile had zero counts for the function, so we kept the guessed >>> frequencies (see counts_to_freqs in predict.c). In some cases I am >>> finding the the profile count on the entry block is 0, but the rest of >>> the blocks (bb 2 and descendants) have non-zero profile counts. I >>> thought the way to check for this case was to look at the entry >>> block's count, and in fact that is what we seem to do in other places. >>> I could also check the entry block successor counts, or I could simply >>> check the blocks along the path (to see if they all have 0 counts but >>> non-0 frequencies). I don't want to check all bbs in the function for >>> every path. Honza, is there a good way to detect this case (function >>> is marked PROFILE_READ but has all-0 bb counts and we kept the >>> estimated frequencies)? > > I am also seeing cases where the entry bb 0 has a non-zero count, but > the rest of the bbs in the function have 0 counts and estimated > frequencies. So I think the safest thing to do here, and what I have > implemented, is if the profile status is PROFILE_READ, then check all > the bbs/edges along the path - if and only if they all have 0 counts > and there is at least one non-zero frequency do we do the > freqs_to_counts. > >>> >>> Even when I skip freqs_to_counts_path in this case, we still get the >>> insane edge probability. I dumped out some more verbose info while >>> jump threading, and I am seeing a jump threading path that I don't >>> understand. Since it violates my assumptions, the new profile update >>> computations go haywire. Here is the path that we are attempting to >>> thread: >>> >>> (119, 150) incoming edge; (150, 155) joiner; (13, 15) normal; >>> >>> Notice that the normal edge is not connected to the rest of the path. >>> This path appears to be constructed during jump threading, as block >>> 155 was created by an earlier threading opportunity. In fact, the >>> earlier threadings that created bb 155 removed all predecessors of bb >>> 13. We originally had >>> >>> bb 150 with successors bb 12 and bb 13 >>> bb 12 with successor bb 13. >>> >>> Then we do: >>> Threaded jump 12 --> 13 to 155 >>> Threaded jump 150 --> 13 to 155 >>> and bb 13 has no more predecessors. So I'm not sure what it means to >>> have a jump threading path through 150 and 13? >>> >>> Jeff, is the above type of jump threading path expected and >>> reasonable? If so, I need to redo some of the profile update code to >>> handle it better. >> >> I see what happened - 155 is a duplicate of 13 from the earlier >> threading. So the path that was originally: >> >> (119, 150) incoming edge; (150, 13) joiner; (13, 15) normal; >> >> became >> >> (119, 150) incoming edge; (150, 155) joiner; (13, 15) normal; >> >> This happened naturally when we redirected the incoming edge 150->13 >> to 155 on the earlier jump threading. During the same threading we >> created duplicate bb 154 of bb 15, and redirect bb 155 to bb 154. This >> redirection does not affect the original 13->15 edge, and thus 13->15 >> remains on the above path. The 155->154 edge should be the new edge in >> the path instead of 13->15. But the edges recorded in other paths are >> not explicitly updated, so the normal edge was left alone. >> >> This works out ok from a correctness standpoint, since bb 13 still >> exists and has the same outgoing edges as it did before, which were >> duplicated onto bb 154. However, those outgoing edges have had their >> profile counts updated and it no longer works out to use those profile >> counts to update the counts along this new threading path. We really >> want to look at 155->154 for the profile to update. >> >> Probably we should be updating remaining jump threading paths when we >> perform jump threading. I need to think about the best way to do that. > > Looking at the code, I think it attempts to check for this case and > prevent it but that code does not work in this case because of the > order the paths are identified and subsequently iterated in the paths > vec. In mark_threaded_blocks it looks at paths with joiners and if > there are any edges along the path already part of another path it > will skip that path. But in this case, we registered the paths in this > order: > > Registering jump thread: (119, 150) incoming edge; (150, 13) > joiner; (13, 15) normal; > ... > Registering jump thread: (150, 13) incoming edge; (13, 15) joiner; > (15, 17) normal; > > For the first path, the path is attached to incoming edge 119->150. So > when we walk edges in the 2nd path we don't see the first path. If we > looked at the paths in the reverse order we would have seen the path > attached to incoming edge 150->13 and skipped the 119->150 path. Note > that we end up doing the actual threading in the reverse order - first > we do the threading through 13 (the second path), then later the > threading through 150 (the first path), leading to the issue I am > seeing. > > Jeff, what is intended here - should we not be threading both of these paths? I have a patch to make the mark_threaded_blocks checking of paths work regardless of the ordering of paths in the vec. This fixes the failure. The other approach is whenever we finish threading a path, go through the vec of remaining paths and update the edges for any that have been affected by the threading and that should instead include the duplicated edges. Teresa > > Thanks, > Teresa > > -- > Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
I'm going to finish testing my patch, which passes regular x86_64-unknown-linux-gnu bootstrap + regression tests. I am still trying to get the lto profiledbootstrap to work. I found some workarounds for the undefined reference issue on Honza's blog, and I am trying to get that working. Then I will send to gcc-patches for review. In the meantime, H.J. do you want to try the patch? It is below. Thanks, Teresa 2014-10-06 Teresa Johnson <tejohnson@google.com> * tree-ssa-threadupdate.c (estimated_freqs_path): New function. (ssa_fix_duplicate_block_edges): Invoke it. (mark_threaded_blocks): Make two passes to avoid ordering dependences. Index: tree-ssa-threadupdate.c =================================================================== --- tree-ssa-threadupdate.c (revision 215830) +++ tree-ssa-threadupdate.c (working copy) @@ -959,6 +959,43 @@ update_joiner_offpath_counts (edge epath, basic_bl } +/* Check if the paths through RD all have estimated frequencies but zero + profile counts. This is more accurate than checking the entry block + for a zero profile count, since profile insanities sometimes creep in. */ + +static bool +estimated_freqs_path (struct redirection_data *rd) +{ + edge e = rd->incoming_edges->e; + vec<jump_thread_edge *> *path = THREAD_PATH (e); + edge ein; + edge_iterator ei; + bool non_zero_freq = false; + FOR_EACH_EDGE (ein, ei, e->dest->preds) + { + if (ein->count) + return false; + non_zero_freq |= ein->src->frequency != 0; + } + + for (unsigned int i = 1; i < path->length (); i++) + { + edge epath = (*path)[i]->e; + if (epath->src->count) + return false; + non_zero_freq |= epath->src->frequency != 0; + edge esucc; + FOR_EACH_EDGE (esucc, ei, epath->src->succs) + { + if (esucc->count) + return false; + non_zero_freq |= esucc->src->frequency != 0; + } + } + return non_zero_freq; +} + + /* Invoked for routines that have guessed frequencies and no profile counts to record the block and edge frequencies for paths through RD in the profile count fields of those blocks and edges. This is because @@ -1058,9 +1095,11 @@ ssa_fix_duplicate_block_edges (struct redirection_ data we first take a snapshot of the existing block and edge frequencies by copying them into the empty profile count fields. These counts are then used to do the incremental updates, and cleared at the end of this - routine. */ + routine. If the function is marked as having a profile, we still check + to see if the paths through RD are using estimated frequencies because + the routine had zero profile counts. */ bool do_freqs_to_counts = (profile_status_for_fn (cfun) != PROFILE_READ - || !ENTRY_BLOCK_PTR_FOR_FN (cfun)->count); + || estimated_freqs_path (rd)); if (do_freqs_to_counts) freqs_to_counts_path (rd); @@ -2077,35 +2116,48 @@ mark_threaded_blocks (bitmap threaded_blocks) /* Now iterate again, converting cases where we want to thread through a joiner block, but only if no other edge on the path - already has a jump thread attached to it. */ + already has a jump thread attached to it. We do this in two passes, + to avoid situations where the order in the paths vec can hide overlapping + threads (the path is recorded on the incoming edge, so we would miss + cases where the second path starts at a downstream edge on the same + path). First record all joiner paths, deleting any in the unexpected + case where there is already a path for that incoming edge. */ for (i = 0; i < paths.length (); i++) { vec<jump_thread_edge *> *path = paths[i]; if ((*path)[1]->type == EDGE_COPY_SRC_JOINER_BLOCK) + { + /* Attach the path to the starting edge if none is yet recorded. */ + if ((*path)[0]->e->aux == NULL) + (*path)[0]->e->aux = path; + else if (dump_file && (dump_flags & TDF_DETAILS)) + dump_jump_thread_path (dump_file, *path, false); + } + } + /* Second, look for paths that have any other jump thread attached to + them, and either finish converting them or cancel them. */ + for (i = 0; i < paths.length (); i++) + { + vec<jump_thread_edge *> *path = paths[i]; + edge e = (*path)[0]->e; + + if ((*path)[1]->type == EDGE_COPY_SRC_JOINER_BLOCK && e->aux == path) { unsigned int j; - - for (j = 0; j < path->length (); j++) + for (j = 1; j < path->length (); j++) if ((*path)[j]->e->aux != NULL) break; /* If we iterated through the entire path without exiting the loop, - then we are good to go, attach the path to the starting edge. */ + then we are good to go, record it. */ if (j == path->length ()) - { - edge e = (*path)[0]->e; - e->aux = path; - bitmap_set_bit (tmp, e->dest->index); - } + bitmap_set_bit (tmp, e->dest->index); else if (dump_file && (dump_flags & TDF_DETAILS)) - { - dump_jump_thread_path (dump_file, *path, false); - } + dump_jump_thread_path (dump_file, *path, false); } } - /* If optimizing for size, only thread through block if we don't have to duplicate it or it's an otherwise empty redirection block. */ if (optimize_function_for_size_p (cfun))
On Mon, Oct 6, 2014 at 10:15 PM, Teresa Johnson <tejohnson@google.com> wrote: > I'm going to finish testing my patch, which passes regular > x86_64-unknown-linux-gnu bootstrap + regression tests. I am still > trying to get the lto profiledbootstrap to work. I found some > workarounds for the undefined reference issue on Honza's blog, and I Nevermind, that was for something else, the issue was that I needed to upgrade my binutils from 2.22 to 2.24. Now the LTO profiledbootstrap is humming along. Once I get a round of testing in I will send for review. Thanks, Teresa > am trying to get that working. Then I will send to gcc-patches for > review. > > In the meantime, H.J. do you want to try the patch? It is below. > > Thanks, > Teresa > > 2014-10-06 Teresa Johnson <tejohnson@google.com> > > * tree-ssa-threadupdate.c (estimated_freqs_path): New function. > (ssa_fix_duplicate_block_edges): Invoke it. > (mark_threaded_blocks): Make two passes to avoid ordering dependences. > > Index: tree-ssa-threadupdate.c > =================================================================== > --- tree-ssa-threadupdate.c (revision 215830) > +++ tree-ssa-threadupdate.c (working copy) > @@ -959,6 +959,43 @@ update_joiner_offpath_counts (edge epath, basic_bl > } > > > +/* Check if the paths through RD all have estimated frequencies but zero > + profile counts. This is more accurate than checking the entry block > + for a zero profile count, since profile insanities sometimes creep in. */ > + > +static bool > +estimated_freqs_path (struct redirection_data *rd) > +{ > + edge e = rd->incoming_edges->e; > + vec<jump_thread_edge *> *path = THREAD_PATH (e); > + edge ein; > + edge_iterator ei; > + bool non_zero_freq = false; > + FOR_EACH_EDGE (ein, ei, e->dest->preds) > + { > + if (ein->count) > + return false; > + non_zero_freq |= ein->src->frequency != 0; > + } > + > + for (unsigned int i = 1; i < path->length (); i++) > + { > + edge epath = (*path)[i]->e; > + if (epath->src->count) > + return false; > + non_zero_freq |= epath->src->frequency != 0; > + edge esucc; > + FOR_EACH_EDGE (esucc, ei, epath->src->succs) > + { > + if (esucc->count) > + return false; > + non_zero_freq |= esucc->src->frequency != 0; > + } > + } > + return non_zero_freq; > +} > + > + > /* Invoked for routines that have guessed frequencies and no profile > counts to record the block and edge frequencies for paths through RD > in the profile count fields of those blocks and edges. This is because > @@ -1058,9 +1095,11 @@ ssa_fix_duplicate_block_edges (struct redirection_ > data we first take a snapshot of the existing block and edge frequencies > by copying them into the empty profile count fields. These counts are > then used to do the incremental updates, and cleared at the end of this > - routine. */ > + routine. If the function is marked as having a profile, we still check > + to see if the paths through RD are using estimated frequencies because > + the routine had zero profile counts. */ > bool do_freqs_to_counts = (profile_status_for_fn (cfun) != PROFILE_READ > - || !ENTRY_BLOCK_PTR_FOR_FN (cfun)->count); > + || estimated_freqs_path (rd)); > if (do_freqs_to_counts) > freqs_to_counts_path (rd); > > @@ -2077,35 +2116,48 @@ mark_threaded_blocks (bitmap threaded_blocks) > > /* Now iterate again, converting cases where we want to thread > through a joiner block, but only if no other edge on the path > - already has a jump thread attached to it. */ > + already has a jump thread attached to it. We do this in two passes, > + to avoid situations where the order in the paths vec can hide overlapping > + threads (the path is recorded on the incoming edge, so we would miss > + cases where the second path starts at a downstream edge on the same > + path). First record all joiner paths, deleting any in the unexpected > + case where there is already a path for that incoming edge. */ > for (i = 0; i < paths.length (); i++) > { > vec<jump_thread_edge *> *path = paths[i]; > > if ((*path)[1]->type == EDGE_COPY_SRC_JOINER_BLOCK) > + { > + /* Attach the path to the starting edge if none is yet recorded. */ > + if ((*path)[0]->e->aux == NULL) > + (*path)[0]->e->aux = path; > + else if (dump_file && (dump_flags & TDF_DETAILS)) > + dump_jump_thread_path (dump_file, *path, false); > + } > + } > + /* Second, look for paths that have any other jump thread attached to > + them, and either finish converting them or cancel them. */ > + for (i = 0; i < paths.length (); i++) > + { > + vec<jump_thread_edge *> *path = paths[i]; > + edge e = (*path)[0]->e; > + > + if ((*path)[1]->type == EDGE_COPY_SRC_JOINER_BLOCK && e->aux == path) > { > unsigned int j; > - > - for (j = 0; j < path->length (); j++) > + for (j = 1; j < path->length (); j++) > if ((*path)[j]->e->aux != NULL) > break; > > /* If we iterated through the entire path without exiting the loop, > - then we are good to go, attach the path to the starting edge. */ > + then we are good to go, record it. */ > if (j == path->length ()) > - { > - edge e = (*path)[0]->e; > - e->aux = path; > - bitmap_set_bit (tmp, e->dest->index); > - } > + bitmap_set_bit (tmp, e->dest->index); > else if (dump_file && (dump_flags & TDF_DETAILS)) > - { > - dump_jump_thread_path (dump_file, *path, false); > - } > + dump_jump_thread_path (dump_file, *path, false); > } > } > > - > /* If optimizing for size, only thread through block if we don't have > to duplicate it or it's an otherwise empty redirection block. */ > if (optimize_function_for_size_p (cfun)) > > -- > Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
On 10/04/14 09:34, Teresa Johnson wrote: > > Looking at the code, I think it attempts to check for this case and > prevent it but that code does not work in this case because of the > order the paths are identified and subsequently iterated in the paths > vec. In mark_threaded_blocks it looks at paths with joiners and if > there are any edges along the path already part of another path it > will skip that path. But in this case, we registered the paths in this > order: > > Registering jump thread: (119, 150) incoming edge; (150, 13) > joiner; (13, 15) normal; > ... > Registering jump thread: (150, 13) incoming edge; (13, 15) joiner; > (15, 17) normal; > > For the first path, the path is attached to incoming edge 119->150. So > when we walk edges in the 2nd path we don't see the first path. If we > looked at the paths in the reverse order we would have seen the path > attached to incoming edge 150->13 and skipped the 119->150 path. Note > that we end up doing the actual threading in the reverse order - first > we do the threading through 13 (the second path), then later the > threading through 150 (the first path), leading to the issue I am > seeing. > > Jeff, what is intended here - should we not be threading both of these paths? The code in mark_threaded_jumps looks for cases where there is a jump threading path with a joiner block is subsumed by a jump threading path without a joiner block and it will suppress the jump threading path with the joiner. We had more suppressed paths in the past, but they mostly got ripped out as the implementation solidified. In the case shown above, we have two paths with joiner blocks so the suppression code in mark_threaded_blocks does not fire and we attempt to realize both jump threading paths. From a correctness standpoint the code should handle this case; however, there may be strong reasons to suppress one path or the other from a code bloat and/or performance standpoint. I typically have to draw out the CFG and the intermediate steps to gain clarity into the desired result and there's not enough info here to do that. My gut tells me suppression of one jump thread path is desirable, we aren't going to have anything good to guide which path to suppress, the vast majority of the time it won't matter which path is suppressed. So I'm comfortable with suppressing either path. jeff
On 10/04/14 13:29, Teresa Johnson wrote: >> Jeff, what is intended here - should we not be threading both of these paths? > > I have a patch to make the mark_threaded_blocks checking of paths work > regardless of the ordering of paths in the vec. This fixes the > failure. This seems like a better solution. It'll decrease unnecessary block copying. > > The other approach is whenever we finish threading a path, go through > the vec of remaining paths and update the edges for any that have been > affected by the threading and that should instead include the > duplicated edges. That'd probably work too, but I suspect there's not much, if any, benefit to keeping both paths. Jeff
On Tue, Oct 7, 2014 at 11:08 AM, Jeff Law <law@redhat.com> wrote: > On 10/04/14 13:29, Teresa Johnson wrote: >>> >>> Jeff, what is intended here - should we not be threading both of these >>> paths? >> >> >> I have a patch to make the mark_threaded_blocks checking of paths work >> regardless of the ordering of paths in the vec. This fixes the >> failure. > > This seems like a better solution. It'll decrease unnecessary block > copying. Ok, great. The patch I posted had a bug I introduced when cleaning it up that was caught by my LTO profiledbootstrap in that I wasn't resetting the e->aux pointer back to NULL when canceling the jump thread. I am kicking off the testing again and should hopefully have a final patch to send to gcc-patches later today. > >> >> The other approach is whenever we finish threading a path, go through >> the vec of remaining paths and update the edges for any that have been >> affected by the threading and that should instead include the >> duplicated edges. > > That'd probably work too, but I suspect there's not much, if any, benefit to > keeping both paths. Yeah, in this case the extra threading worked from a correctness standpoint but doesn't make sense from a logical or performance standpoint. > I typically have to draw out the CFG and the intermediate steps to gain clarity into the desired result and there's not enough info here to do that. I'm glad I'm not the only one who has to draw it out every time! Thanks, Teresa > > Jeff
Author: tejohnson Date: Thu Oct 9 04:38:24 2014 New Revision: 216024 URL: https://gcc.gnu.org/viewcvs?rev=216024&root=gcc&view=rev Log: 2014-10-07 Teresa Johnson <tejohnson@google.com> PR bootstrap/63432. * tree-ssa-threadupdate.c (estimated_freqs_path): New function. (ssa_fix_duplicate_block_edges): Invoke it. (mark_threaded_blocks): Make two passes to avoid ordering dependences. Modified: trunk/gcc/ChangeLog trunk/gcc/tree-ssa-threadupdate.c
With r216039, I still got ../../src-trunk/gcc/cp/parser.c: In function ‘cp_parser_declaration_seq_opt’: ../../src-trunk/gcc/cp/parser.c:11049:0: error: verify_flow_info: Wrong probability of edge 53->54 -1192228368 cp_parser_declaration_seq_opt (cp_parser* parser) ^ ../../src-trunk/gcc/cp/parser.c:11049:0: internal compiler error: verify_flow_info failed 0x6b2e3a verify_flow_info() ../../src-trunk/gcc/cfghooks.c:260 0xddcd27 cleanup_tree_cfg_noloop ../../src-trunk/gcc/tree-cfgcleanup.c:765 0xddcd27 cleanup_tree_cfg() ../../src-trunk/gcc/tree-cfgcleanup.c:814 0xbf5804 execute_function_todo ../../src-trunk/gcc/passes.c:1704 0xc02287 do_per_function ../../src-trunk/gcc/passes.c:1478 0xc02287 execute_todo ../../src-trunk/gcc/passes.c:1808 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <http://gcc.gnu.org/bugs.html> for instructions. make[7]: *** [/tmp/cchv0APR.ltrans3.ltrans.o] Error 1 make[7]: *** Waiting for unfinished jobs.... lto-wrapper: fatal error: make returned 2 exit status compilation terminated. /usr/local/bin/ld: lto-wrapper failed collect2: error: ld returned 1 exit status make[6]: *** [cc1plus] Error 1
On Thu, Oct 9, 2014 at 12:52 PM, hjl.tools at gmail dot com <gcc-bugzilla@gcc.gnu.org> wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63432 > > --- Comment #23 from H.J. Lu <hjl.tools at gmail dot com> --- > With r216039, I still got > > ../../src-trunk/gcc/cp/parser.c: In function ‘cp_parser_declaration_seq_opt’: > ../../src-trunk/gcc/cp/parser.c:11049:0: error: verify_flow_info: Wrong > probability of edge 53->54 -1192228368 > cp_parser_declaration_seq_opt (cp_parser* parser) > ^ > ../../src-trunk/gcc/cp/parser.c:11049:0: internal compiler error: > verify_flow_info failed > 0x6b2e3a verify_flow_info() > ../../src-trunk/gcc/cfghooks.c:260 > 0xddcd27 cleanup_tree_cfg_noloop > ../../src-trunk/gcc/tree-cfgcleanup.c:765 > 0xddcd27 cleanup_tree_cfg() > ../../src-trunk/gcc/tree-cfgcleanup.c:814 > 0xbf5804 execute_function_todo > ../../src-trunk/gcc/passes.c:1704 > 0xc02287 do_per_function > ../../src-trunk/gcc/passes.c:1478 > 0xc02287 execute_todo > ../../src-trunk/gcc/passes.c:1808 > Please submit a full bug report, > with preprocessed source if appropriate. > Please include the complete backtrace with any bug report. > See <http://gcc.gnu.org/bugs.html> for instructions. > make[7]: *** [/tmp/cchv0APR.ltrans3.ltrans.o] Error 1 > make[7]: *** Waiting for unfinished jobs.... > lto-wrapper: fatal error: make returned 2 exit status > compilation terminated. > /usr/local/bin/ld: lto-wrapper failed > collect2: error: ld returned 1 exit status > make[6]: *** [cc1plus] Error 1 Arg, looks very similar so maybe another instance of the duplicate threading is slipping through? My own lto profiled bootstrap succeeded with my patch. I will try updating to r216039 and redo it to see if I can provoke the same failure. Teresa > > -- > You are receiving this mail because: > You are on the CC list for the bug.
Unfortunately I can't reproduce this failure. Here's what I did: In my gcc source: % svn update -r r216039 In my build directory: % ~/gcc_trunk_7/configure --prefix=/usr/local/google/home/tejohnson/gcc_trunk_7_validate_lto/bld-gcc/install --disable-werror --with-build-config=bootstrap-lto --enable-checking % make -j8 profiledbootstrap I might need you to email me the input files again. Sorry about that. Teresa On Thu, Oct 9, 2014 at 1:37 PM, Teresa Johnson <tejohnson@google.com> wrote: > On Thu, Oct 9, 2014 at 12:52 PM, hjl.tools at gmail dot com > <gcc-bugzilla@gcc.gnu.org> wrote: >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63432 >> >> --- Comment #23 from H.J. Lu <hjl.tools at gmail dot com> --- >> With r216039, I still got >> >> ../../src-trunk/gcc/cp/parser.c: In function ‘cp_parser_declaration_seq_opt’: >> ../../src-trunk/gcc/cp/parser.c:11049:0: error: verify_flow_info: Wrong >> probability of edge 53->54 -1192228368 >> cp_parser_declaration_seq_opt (cp_parser* parser) >> ^ >> ../../src-trunk/gcc/cp/parser.c:11049:0: internal compiler error: >> verify_flow_info failed >> 0x6b2e3a verify_flow_info() >> ../../src-trunk/gcc/cfghooks.c:260 >> 0xddcd27 cleanup_tree_cfg_noloop >> ../../src-trunk/gcc/tree-cfgcleanup.c:765 >> 0xddcd27 cleanup_tree_cfg() >> ../../src-trunk/gcc/tree-cfgcleanup.c:814 >> 0xbf5804 execute_function_todo >> ../../src-trunk/gcc/passes.c:1704 >> 0xc02287 do_per_function >> ../../src-trunk/gcc/passes.c:1478 >> 0xc02287 execute_todo >> ../../src-trunk/gcc/passes.c:1808 >> Please submit a full bug report, >> with preprocessed source if appropriate. >> Please include the complete backtrace with any bug report. >> See <http://gcc.gnu.org/bugs.html> for instructions. >> make[7]: *** [/tmp/cchv0APR.ltrans3.ltrans.o] Error 1 >> make[7]: *** Waiting for unfinished jobs.... >> lto-wrapper: fatal error: make returned 2 exit status >> compilation terminated. >> /usr/local/bin/ld: lto-wrapper failed >> collect2: error: ld returned 1 exit status >> make[6]: *** [cc1plus] Error 1 > > Arg, looks very similar so maybe another instance of the duplicate > threading is slipping through? My own lto profiled bootstrap succeeded > with my patch. I will try updating to r216039 and redo it to see if I > can provoke the same failure. > > Teresa > >> >> -- >> You are receiving this mail because: >> You are on the CC list for the bug. > > > > -- > Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
I will find a testcase when it fails again.
(In reply to Teresa Johnson from comment #24) > Arg, looks very similar so maybe another instance of the duplicate > threading is slipping through? My own lto profiled bootstrap succeeded > with my patch. I will try updating to r216039 and redo it to see if I > can provoke the same failure. > I sent you another testcase against r216150.
On Mon, Oct 13, 2014 at 8:53 AM, hjl.tools at gmail dot com <gcc-bugzilla@gcc.gnu.org> wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63432 > > --- Comment #27 from H.J. Lu <hjl.tools at gmail dot com> --- > (In reply to Teresa Johnson from comment #24) > >> Arg, looks very similar so maybe another instance of the duplicate >> threading is slipping through? My own lto profiled bootstrap succeeded >> with my patch. I will try updating to r216039 and redo it to see if I >> can provoke the same failure. >> > > I sent you another testcase against r216150. Thanks for the testcase, I reproduced it. It is a case of garbage in / garbage out. The fre2 pass is introducing some big profile count insanities, leading to the probability insanity being introduced when we try to use the counts to compute the new probability in recompute_probabilities. There is already handling for really large probabilities due to this issue, and we need to add the same thing for negative probabilities - essentially the patch you had originally suggested for the first problem which wasn't necessary for that one since that was an actually jump threading induced issue. Will test that and send for review. > > -- > You are receiving this mail because: > You are on the CC list for the bug.
On Mon, Oct 13, 2014 at 2:32 PM, Teresa Johnson <tejohnson@google.com> wrote: > On Mon, Oct 13, 2014 at 8:53 AM, hjl.tools at gmail dot com > <gcc-bugzilla@gcc.gnu.org> wrote: >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63432 >> >> --- Comment #27 from H.J. Lu <hjl.tools at gmail dot com> --- >> (In reply to Teresa Johnson from comment #24) >> >>> Arg, looks very similar so maybe another instance of the duplicate >>> threading is slipping through? My own lto profiled bootstrap succeeded >>> with my patch. I will try updating to r216039 and redo it to see if I >>> can provoke the same failure. >>> >> >> I sent you another testcase against r216150. > > Thanks for the testcase, I reproduced it. It is a case of garbage in / > garbage out. The fre2 pass is introducing some big profile count > insanities, leading to the probability insanity being introduced when > we try to use the counts to compute the new probability in > recompute_probabilities. There is already handling for really large > probabilities due to this issue, and we need to add the same thing for > negative probabilities - essentially the patch you had originally > suggested for the first problem which wasn't necessary for that one > since that was an actually jump threading induced issue. Actually, I traced the initial profile insanity back to inlining. FRE merely propagated it further. I have a better overflow test done before the scaling, running it through LTO profiledbootstrap then will send for review. Teresa > > Will test that and send for review. > >> >> -- >> You are receiving this mail because: >> You are on the CC list for the bug. > > > > -- > Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
Author: tejohnson Date: Wed Oct 15 15:45:59 2014 New Revision: 216269 URL: https://gcc.gnu.org/viewcvs?rev=216269&root=gcc&view=rev Log: 2014-10-15 Teresa Johnson <tejohnson@google.com> PR bootstrap/63432 * tree-ssa-threadupdate.c (recompute_probabilities): Better overflow checking. Modified: trunk/gcc/ChangeLog trunk/gcc/tree-ssa-threadupdate.c
Assuming fixed (this bug is horrid to follow - please open new bugs for new issues, not hijack old ones for same symptoms).