Bug 105282

Summary: [11 Regression] V_INDIR overflow causes ICE on -O0 -flto in stream_out_histogram_value, at value-prof.cc:340
Product: gcc Reporter: Sergei Trofimovich <slyfox>
Component: gcov-profile Assignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED FIXED    
Severity: normal CC: dimhen, herrtimson, marxin
Priority: P2 Keywords: ice-on-valid-code
Version: 12.0   
Target Milestone: 11.4   
Host: Target:
Build: Known to work:
Known to fail: Last reconfirmed: 2022-04-19 00:00:00
Attachments: 0001-gcov-profile-Allow-negative-counts-of-indirect-calls.patch

Description Sergei Trofimovich 2022-04-15 06:41:01 UTC
The bug was initially reported by John Helmert III in https://bugs.gentoo.org/838094 where python-3.10.4 failed to build with -flto -O0. Here is a single-file minimal reproducer:

#include <stddef.h>

typedef void (*cb_t)(void);
#define F(__fn) static void __fn(void) {}

F(f00);F(f01);F(f02);F(f03);F(f04);F(f05);F(f06);F(f07);F(f08);F(f09);
F(f10);F(f11);F(f12);F(f13);F(f14);F(f15);F(f16);F(f17);F(f18);F(f19);
F(f20);F(f21);F(f22);F(f23);F(f24);F(f25);F(f26);F(f27);F(f28);F(f29);
F(f30);F(f31);F(f32);F(f33);F(f34);F(f35);F(f36);F(f37);F(f38);F(f39);
F(f40);F(f41);F(f42);F(f43);F(f44);F(f45);F(f46);F(f47);F(f48);F(f49);

static void f(int i) {
    /* Needs to be bigger than gcc's GCOV_TOPN_MAXIMUM_TRACKED_VALUES == 32
     * to overflow GCOV_COUNTER_V_INDIR counter type.
     */
    static const cb_t fs[] = {
        &f00,&f01,&f02,&f03,&f04,&f05,&f06,&f07,&f08,&f09,
        &f10,&f11,&f12,&f13,&f14,&f15,&f16,&f17,&f18,&f19,
        &f20,&f21,&f22,&f23,&f24,&f25,&f26,&f27,&f28,&f29,
        &f30,&f31,&f32,&f33,&f34,&f35,&f36,&f37,&f38,&f39,
        &f40,&f41,&f42,&f43,&f44,&f45,&f46,&f47,&f48,&f49,
    };
    size_t sz = sizeof (fs) / sizeof (fs[0]);
    fs[i % sz]();
}

int l(int argc, char * argv[]);

int main(int argc, char *argv[]) {
    if (argc == 1)
      for (unsigned int i = 0; i < 25; i++)
        f(i);
    if (argc == 2)
      for (unsigned int i = 25; i < 50; i++)
        f(i);
}

Crashing:

$ gcc -flto -O0 a.c -fprofile-generate -o a
$ ./a # populate first 25 buckets
$ ./a 1 # populate 25 more buckets, cause overflow
$ gcc -flto -O0 a.c -fprofile-use

during IPA pass: modref
a.c:36:1: internal compiler error: in stream_out_histogram_value, at value-prof.cc:340
   36 | }
      | ^
0x8351fb stream_out_histogram_value(output_block*, histogram_value_t*)
        ../../gcc-12-20220410/gcc/value-prof.cc:340
0x1c848c0 output_gimple_stmt
        ../../gcc-12-20220410/gcc/gimple-streamer-out.cc:192
0x1c848c0 output_bb(output_block*, basic_block_def*, function*)
        ../../gcc-12-20220410/gcc/gimple-streamer-out.cc:227
0xdc91ad output_function
        ../../gcc-12-20220410/gcc/lto-streamer-out.cc:2453
0xdc91ad lto_output()
        ../../gcc-12-20220410/gcc/lto-streamer-out.cc:2796
0xe57b11 write_lto
        ../../gcc-12-20220410/gcc/passes.cc:2762
0xe57b11 ipa_write_summaries_1
        ../../gcc-12-20220410/gcc/passes.cc:2826
0xe57b11 ipa_write_summaries()
        ../../gcc-12-20220410/gcc/passes.cc:2882
0xaac060 ipa_passes
        ../../gcc-12-20220410/gcc/cgraphunit.cc:2209
0xaac060 symbol_table::compile()
        ../../gcc-12-20220410/gcc/cgraphunit.cc:2282
0xaaea77 symbol_table::compile()
        ../../gcc-12-20220410/gcc/cgraphunit.cc:2262
0xaaea77 symbol_table::finalize_compilation_unit()
        ../../gcc-12-20220410/gcc/cgraphunit.cc:2530
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

$ gcc -v
Using built-in specs.
COLLECT_GCC=/<<NIX>>/gcc-debug-12.0.0/bin/gcc
COLLECT_LTO_WRAPPER=/<<NIX>>/gcc-debug-12.0.0/libexec/gcc/x86_64-unknown-linux-gnu/12.0.1/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with:
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 12.0.1 20220410 (experimental) (GCC)

I think it happens due to an overly restrictive gcc_assert() in gcc/value-prof.cc:

 void
 stream_out_histogram_value (struct output_block *ob, histogram_value hist)
 {
  unsigned int i;

  ...

  for (i = 0; i < hist->n_counters; i++)
    {
      /* When user uses an unsigned type with a big value, constant converted
         to gcov_type (a signed type) can be negative.  */
      gcov_type value = hist->hvalue.counters[i];
      if (hist->type == HIST_TYPE_TOPN_VALUES
          || hist->type == HIST_TYPE_IOR)
        /* Note that the IOR counter tracks pointer values and these can have
           sign bit set.  */
        ;
      else
        gcc_assert (value >= 0);

      streamer_write_gcov_count (ob, value);
    }
  ...
 }


Note how it implies that all entries of HIST_TYPE_INDIR_CALL are expected to be non-negative. That does not hold when merging two histograms overflows the tracked-value limit in libgcc/libgcov-merge.c, where the total is negated:

/* ...

   We use -TOTAL for situation when merging dropped some values.
   The information is used for -fprofile-reproducible flag.
   */

void
__gcov_merge_topn (gcov_type *counters, unsigned n_counters)
{

          ...
          full |= gcov_topn_add_value (counters + GCOV_TOPN_MEM_COUNTERS * i,
                                       value, count, 0, 0);
        }

      if (full)
        *total = -(*total);
    }
}
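
To make the overflow concrete, here is a minimal sketch of the merge behaviour described above. It is not libgcov's code: struct topn_model, topn_model_add, topn_model_merge, topn_model_run and TOPN_MAX_TRACKED are hypothetical names, and the array-based layout is an assumption made for brevity. It only models the rule that the total is negated once a value had to be dropped.

```c
#include <stddef.h>

/* Hypothetical stand-in for GCOV_TOPN_MAXIMUM_TRACKED_VALUES.  */
#define TOPN_MAX_TRACKED 32

/* Simplified, hypothetical model of one TOPN histogram; libgcov's real
   in-memory layout differs.  */
struct topn_model
{
  long long total;                    /* overall count, negated on overflow */
  size_t n;                           /* buckets in use */
  long long value[TOPN_MAX_TRACKED];  /* tracked values (callee ids) */
  long long count[TOPN_MAX_TRACKED];  /* hits per tracked value */
};

/* Record COUNT hits of VALUE in H.  Return 1 if VALUE had to be dropped
   because every bucket is already taken, 0 otherwise.  */
static int
topn_model_add (struct topn_model *h, long long value, long long count)
{
  for (size_t i = 0; i < h->n; i++)
    if (h->value[i] == value)
      {
        h->count[i] += count;
        return 0;
      }
  if (h->n == TOPN_MAX_TRACKED)
    return 1;
  h->value[h->n] = value;
  h->count[h->n] = count;
  h->n++;
  return 0;
}

/* Merge SRC into DST.  A negative total on either side means values were
   already dropped earlier, so the merged total stays negative as well.  */
static void
topn_model_merge (struct topn_model *dst, const struct topn_model *src)
{
  int full = dst->total < 0 || src->total < 0;
  long long dst_abs = dst->total < 0 ? -dst->total : dst->total;
  long long src_abs = src->total < 0 ? -src->total : src->total;

  dst->total = dst_abs + src_abs;
  for (size_t i = 0; i < src->n; i++)
    full |= topn_model_add (dst, src->value[i], src->count[i]);
  if (full)
    dst->total = -dst->total;
}

/* Build the profile of one run of the reproducer: COUNT distinct callees
   starting at FIRST, each called once.  */
static struct topn_model
topn_model_run (long long first, size_t count)
{
  struct topn_model h = { 0 };
  for (size_t i = 0; i < count; i++)
    {
      topn_model_add (&h, first + (long long) i, 1);
      h.total++;
    }
  return h;
}
```

Merging topn_model_run (0, 25) with topn_model_run (25, 25) in this model leaves total == -50 and n == 32, the same leading pair as in the gcov-dump output in comment 1.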
Comment 1 Sergei Trofimovich 2022-04-15 06:47:13 UTC
Relevant bit of counters dump for completeness (after merge):

$ gcov-dump -l a.gcda
...
a.gcda:    01a90000: 528:COUNTERS indirect_call 66 counts
a.gcda:                   0: -50 32 1456173180 1 1792104613 1 918340114 1
a.gcda:                   8: 1406444659 1 263798468 1 1664310260 1 1063174467 1
a.gcda:                  16: 1596551981 1 54847898 1 533075953 1 1135316294 1
a.gcda:                  24: 601636648 1 2142348703 1 450479102 1 1186224457 1
a.gcda:                  32: 416313568 1 1153296983 1 617240633 1 2024260238 1
a.gcda:                  40: 1680162021 1 944285266 1 1480528956 1 72519307 1
a.gcda:                  48: 1631250666 1 1029141085 1 941945699 1 1682532820 1
a.gcda:                  56: 71228346 1 1481851149 1 1154596710 1 414983633 1
a.gcda:                  64: 2026608575 1
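
Reading the dump above: the 66 counters appear to be laid out as the overall total, the number of tracked values, and then that many (value, count) pairs (2 + 2 * 32 = 66). A small sketch that decodes such a record, assuming that layout; struct topn_summary and topn_summarize are hypothetical helpers, not part of gcov-dump:

```c
#include <stddef.h>

/* Hypothetical summary of one TOPN record as printed by gcov-dump.  */
struct topn_summary
{
  long long total;    /* absolute value of the overall count */
  int dropped;        /* 1 if the total was stored negated */
  long long n_pairs;  /* number of (value, count) pairs that follow */
  long long tracked;  /* sum of the per-value counts */
};

/* Decode LEN counters laid out as: total, n, then n (value, count) pairs.
   Return 0 on success, -1 if LEN does not match that layout.  */
static int
topn_summarize (const long long *c, size_t len, struct topn_summary *out)
{
  if (len < 2 || c[1] < 0)
    return -1;
  if ((len - 2) % 2 != 0 || (unsigned long long) c[1] != (len - 2) / 2)
    return -1;

  size_t n = (size_t) c[1];
  out->dropped = c[0] < 0;
  out->total = c[0] < 0 ? -c[0] : c[0];
  out->n_pairs = c[1];
  out->tracked = 0;
  for (size_t i = 0; i < n; i++)
    out->tracked += c[2 + 2 * i + 1];  /* second element of pair i */
  return 0;
}
```

Fed the 66 counters above, this would report a total of 50 with the dropped flag set, 32 pairs, and a tracked sum of 32, i.e. 18 calls went to targets that were not retained.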
Comment 2 Sergei Trofimovich 2022-04-15 07:50:28 UTC
Proposed the fix as https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg283031.html
Comment 3 Sergei Trofimovich 2022-04-16 14:23:02 UTC
Created attachment 52819
0001-gcov-profile-Allow-negative-counts-of-indirect-calls.patch
Comment 4 Martin Liška 2022-04-19 09:45:45 UTC
(In reply to Sergei Trofimovich from comment #2)
> Proposed the fix as
> https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg283031.html

Next time, please use our official mailing list:
https://gcc.gnu.org/pipermail/gcc-patches/2022-April/593287.html
Comment 5 GCC Commits 2022-04-19 19:07:38 UTC
The master branch has been updated by Sergei Trofimovich <slyfox@gcc.gnu.org>:

https://gcc.gnu.org/g:90a29845bfe7d6002e6c2fd49a97820b00fbc4a3

commit r12-8199-g90a29845bfe7d6002e6c2fd49a97820b00fbc4a3
Author: Sergei Trofimovich <siarheit@google.com>
Date:   Fri Apr 15 08:35:27 2022 +0100

    gcov-profile: Allow negative counts of indirect calls [PR105282]
    
    TOPN metrics are histograms that contain overall count and per-bucket
    count. Overall count can be negative when two profiles merge and some
    of per-bucket metrics are discarded.
    
    Noticed as an ICE on python PGO build where gcc crashes as:
    
        during IPA pass: modref
        a.c:36:1: ICE: in stream_out_histogram_value, at value-prof.cc:340
           36 | }
              | ^
        stream_out_histogram_value(output_block*, histogram_value_t*)
                gcc/value-prof.cc:340
    
    gcc/ChangeLog:
    
            PR gcov-profile/105282
            * value-prof.cc (stream_out_histogram_value): Allow negative counts
            on HIST_TYPE_INDIR_CALL.
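
The ChangeLog entry presumably amounts to adding HIST_TYPE_INDIR_CALL to the set of histogram types whose counters may be negative. A self-contained sketch of that predicate, assuming the shape of the check quoted in the description; the model_* names are hypothetical and this is not the committed diff:

```c
/* Hypothetical mirror of a few histogram kinds; GCC's real enum has more
   members and different names.  */
enum model_hist_type
{
  MODEL_HIST_TYPE_INTERVAL,
  MODEL_HIST_TYPE_TOPN_VALUES,
  MODEL_HIST_TYPE_INDIR_CALL,
  MODEL_HIST_TYPE_IOR
};

/* Return 1 if counters of TYPE may legitimately be negative: TOPN-style
   histograms negate their total after a lossy merge, and IOR counters
   track pointer values that can have the sign bit set.  */
static int
model_allows_negative (enum model_hist_type type)
{
  return type == MODEL_HIST_TYPE_TOPN_VALUES
         || type == MODEL_HIST_TYPE_INDIR_CALL
         || type == MODEL_HIST_TYPE_IOR;
}

/* Return 1 if VALUE would pass the streaming sanity check for TYPE.  */
static int
model_counter_ok (enum model_hist_type type, long long value)
{
  return value >= 0 || model_allows_negative (type);
}
```

With this predicate the -50 total from comment 1 is accepted for an indirect-call histogram, while a negative counter on an interval histogram would still be rejected.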
Comment 6 Sergei Trofimovich 2022-04-19 19:24:51 UTC
Should be fixed on gcc-12.
Comment 7 Sergei Trofimovich 2022-04-19 19:30:21 UTC
(In reply to Martin Liška from comment #4)
> (In reply to Sergei Trofimovich from comment #2)
> > Proposed the fix as
> > https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg283031.html
> 
> Next time, please use our official mailing list:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-April/593287.html

Sounds good!

Proposed an identical patch as a gcc-11 backport: https://gcc.gnu.org/pipermail/gcc-patches/2022-April/593382.html
Comment 8 Richard Biener 2022-04-21 07:51:42 UTC
GCC 11.3 is being released, retargeting bugs to GCC 11.4.
Comment 9 GCC Commits 2022-04-21 09:27:46 UTC
The releases/gcc-11 branch has been updated by Martin Liska <marxin@gcc.gnu.org>:

https://gcc.gnu.org/g:7b879564ec2bda6b5441fbaf231d70ec6359db01

commit r11-9896-g7b879564ec2bda6b5441fbaf231d70ec6359db01
Author: Sergei Trofimovich <siarheit@google.com>
Date:   Fri Apr 15 08:35:27 2022 +0100

    gcov-profile: Allow negative counts of indirect calls [PR105282]
    
    TOPN metrics are histograms that contain overall count and per-bucket
    count. Overall count can be negative when two profiles merge and some
    of per-bucket metrics are discarded.
    
    Noticed as an ICE on python PGO build where gcc crashes as:
    
        during IPA pass: modref
        a.c:36:1: ICE: in stream_out_histogram_value, at value-prof.cc:340
           36 | }
              | ^
        stream_out_histogram_value(output_block*, histogram_value_t*)
                gcc/value-prof.c:340
    
    gcc/ChangeLog:
    
            PR gcov-profile/105282
            * value-prof.c (stream_out_histogram_value): Allow negative counts
            on HIST_TYPE_INDIR_CALL.
    
    (cherry picked from commit 90a29845bfe7d6002e6c2fd49a97820b00fbc4a3)
Comment 10 Martin Liška 2022-04-21 09:28:24 UTC
Cherry-picked after 11.3 got released. Closing now.