Bug 112962 - [14 Regression] ICE: SIGSEGV in operator()<rtx_def*, rtx_def*> (recog.h:431) with -fexceptions -mssse3 and __builtin_ia32_pabsd128()
Summary: [14 Regression] ICE: SIGSEGV in operator()<rtx_def*, rtx_def*> (recog.h:431) ...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 14.0
: P3 normal
Target Milestone: 14.0
Assignee: Jakub Jelinek
URL:
Keywords: ice-on-valid-code
Depends on:
Blocks:
 
Reported: 2023-12-11 14:59 UTC by Zdenek Sojka
Modified: 2023-12-20 11:02 UTC (History)
2 users (show)

See Also:
Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu
Build:
Known to work: 13.2.1
Known to fail: 14.0
Last reconfirmed: 2023-12-12 00:00:00


Attachments
reduced testcase (105 bytes, text/plain)
2023-12-11 14:59 UTC, Zdenek Sojka
Details
Proposed patch (648 bytes, patch)
2023-12-12 09:05 UTC, Uroš Bizjak
Details | Diff
gcc14-pr112962-1.patch (1.14 KB, patch)
2023-12-12 11:24 UTC, Jakub Jelinek
Details | Diff
gcc14-pr112962-2.patch (699 bytes, patch)
2023-12-12 11:26 UTC, Jakub Jelinek
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Zdenek Sojka 2023-12-11 14:59:44 UTC
Created attachment 56850 [details]
reduced testcase

Compiler output:
$ x86_64-pc-linux-gnu-gcc -fexceptions -mssse3 testcase.c -wrapper valgrind,-q
==15604== Jump to the invalid address stated on the next line
==15604==    at 0x0: ???
==15604==    by 0x19A93F0: operator()<rtx_def*, rtx_def*> (recog.h:431)
==15604==    by 0x19A93F0: ix86_expand_args_builtin(builtin_description const*, tree_node*, rtx_def*) (i386-expand.cc:11848)
==15604==    by 0x10B1BCF: expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool) (expr.cc:12305)
==15604==    by 0xF73D28: expand_expr (expr.h:313)
==15604==    by 0xF73D28: expand_call_stmt (cfgexpand.cc:2832)
==15604==    by 0xF73D28: expand_gimple_stmt_1 (cfgexpand.cc:3894)
==15604==    by 0xF73D28: expand_gimple_stmt(gimple*) (cfgexpand.cc:4058)
==15604==    by 0xF7A0AE: expand_gimple_basic_block(basic_block_def*, bool) (cfgexpand.cc:6114)
==15604==    by 0xF7BD87: (anonymous namespace)::pass_expand::execute(function*) (cfgexpand.cc:6849)
==15604==    by 0x13B774A: execute_one_pass(opt_pass*) (passes.cc:2646)
==15604==    by 0x13B803F: execute_pass_list_1(opt_pass*) (passes.cc:2755)
==15604==    by 0x13B8078: execute_pass_list(function*, opt_pass*) (passes.cc:2766)
==15604==    by 0xFBCA35: expand (cgraphunit.cc:1842)
==15604==    by 0xFBCA35: cgraph_node::expand() (cgraphunit.cc:1795)
==15604==    by 0xFBD949: output_in_order (cgraphunit.cc:2192)
==15604==    by 0xFBD949: symbol_table::compile() [clone .part.0] (cgraphunit.cc:2396)
==15604==    by 0xFC08F7: compile (cgraphunit.cc:2312)
==15604==    by 0xFC08F7: symbol_table::finalize_compilation_unit() (cgraphunit.cc:2584)
==15604==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==15604==
during RTL pass: expand
testcase.c: In function 'foo':
testcase.c:6:3: internal compiler error: Segmentation fault
    6 |   __builtin_ia32_pabsd128 ((V){});
      |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0x14f8def crash_signal
        /repo/gcc-trunk/gcc/toplev.cc:316
0x0 subreg_lowpart_offset(machine_mode, machine_mode)
        /repo/gcc-trunk/gcc/machmode.h:566
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-6412-20231211150807-g2505a8b41d3-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --disable-bootstrap --with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld --with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-r14-6412-20231211150807-g2505a8b41d3-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.0 20231211 (experimental) (GCC)
Comment 1 Uroš Bizjak 2023-12-12 09:05:18 UTC
Created attachment 56862 [details]
Proposed patch

Patch in testing.
Comment 2 Jakub Jelinek 2023-12-12 09:36:38 UTC
Started with r14-1145-g1ede03e2d0437ea9c2f7453fcbe263505b4e0def
Comment 3 Jakub Jelinek 2023-12-12 09:38:00 UTC
I was thinking whether it wouldn't be better to expand x86 const or pure builtins when lhs is ignored to nothing in the expanders.
Comment 4 Uroš Bizjak 2023-12-12 09:38:46 UTC
(In reply to Jakub Jelinek from comment #3)
> I was thinking whether it wouldn't be better to expand x86 const or pure
> builtins when lhs is ignored to nothing in the expanders.

Yes, this could be a better solution.
Comment 5 Jakub Jelinek 2023-12-12 09:39:47 UTC
With -O1 or higher there is some DCE which will remove them (unless disabled), but the above ICE is with -O0...
Comment 6 Uroš Bizjak 2023-12-12 09:54:48 UTC
(In reply to Jakub Jelinek from comment #3)
> I was thinking whether it wouldn't be better to expand x86 const or pure
> builtins when lhs is ignored to nothing in the expanders.

Something like this?

--cut here--
diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index a53d69d5400..0f3d6108d77 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -13032,6 +13032,9 @@ ix86_expand_builtin (tree exp, rtx target, rtx subtarget,
   unsigned int fcode = DECL_MD_FUNCTION_CODE (fndecl);
   HOST_WIDE_INT bisa, bisa2;
 
+  if (ignore && (TREE_READONLY (fndecl) || DECL_PURE_P (fndecl)))
+    return const0_rtx;
+
   /* For CPU builtins that can be folded, fold first and expand the fold.  */
   switch (fcode)
     {
@@ -14401,9 +14404,6 @@ rdseed_step:
       return target;
 
     case IX86_BUILTIN_READ_FLAGS:
-      if (ignore)
-       return const0_rtx;
-
       emit_insn (gen_pushfl ());
 
       if (optimize
--cut here--
Comment 7 Jakub Jelinek 2023-12-12 10:03:36 UTC
On the other side, maybe we want some of the diagnostics that is only done at expansion time (argument xyz must be such and such immediate).  Though, at -O2 it isn't diagnosed anyway because it is DCEd.
Anyway, with just -O0 -mssse3 this works fine because of expr.cc:
11097	  /* If we are going to ignore this result, we need only do something
11098	     if there is a side-effect somewhere in the expression.  If there
11099	     is, short-circuit the most common cases here.  Note that we must
11100	     not call expand_expr with anything but const0_rtx in case this
11101	     is an initial expansion of a size that contains a PLACEHOLDER_EXPR.  */
11102	
11103	  if (ignore)
11104	    {
11105	      if (! TREE_SIDE_EFFECTS (exp))
11106		return const0_rtx;
but with -fexceptions (and probably because we incorrectly don't mark the builtins nothrow?) this doesn't happen.
I was thinking about something like
--- gcc/i386-expand.cc.jj	2023-12-07 08:31:59.855850982 +0100
+++ gcc/i386-expand.cc	2023-12-12 11:02:54.524733315 +0100
@@ -11842,6 +11842,12 @@ ix86_expand_args_builtin (const struct b
       xops[i] = op;
     }
 
+  if (icode == CODE_FOR_nothing)
+    {
+      gcc_assert (target == const0_rtx);
+      return const0_rtx;
+    }
+
   switch (nargs)
     {
     case 1:
but of course, your choice...
Comment 8 Jakub Jelinek 2023-12-12 10:08:25 UTC
Of course, yet another option is:
--- gcc/config/i386/i386.cc	2023-12-12 08:54:39.821148670 +0100
+++ gcc/config/i386/i386.cc	2023-12-12 11:07:03.795286363 +0100
@@ -19377,7 +19377,10 @@ ix86_gimple_fold_builtin (gimple_stmt_it
     do_shift:
       gcc_assert (n_args >= 2);
       if (!gimple_call_lhs (stmt))
-	break;
+	{
+	  gsi_replace (gsi, gimple_build_nop (), false);
+	  return true;
+	}
       arg0 = gimple_call_arg (stmt, 0);
       arg1 = gimple_call_arg (stmt, 1);
       elems = TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg0));
@@ -19523,7 +19526,10 @@ ix86_gimple_fold_builtin (gimple_stmt_it
     case IX86_BUILTIN_PABSD256_MASK:
       gcc_assert (n_args >= 1);
       if (!gimple_call_lhs (stmt))
-	break;
+	{
+	  gsi_replace (gsi, gimple_build_nop (), false);
+	  return true;
+	}
       arg0 = gimple_call_arg (stmt, 0);
       elems = TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg0));
       /* For masked ABS, only optimize if the mask is all ones.  */
(and I wonder why all the gsi_replace calls in that function are with false, IMHO they should use true).
Comment 9 Uroš Bizjak 2023-12-12 10:12:36 UTC
(In reply to Jakub Jelinek from comment #8)
> Of course, yet another option is:

This goes out of my (limited) area of expertise, so if my proposed (trivial) patch is papering over some other issue, I'll happily leave the solution to you.
Comment 10 Uroš Bizjak 2023-12-12 10:15:52 UTC
(In reply to Jakub Jelinek from comment #7)

> but with -fexceptions (and probably because we incorrectly don't mark the
> builtins nothrow?) this doesn't happen.

Maybe we should finally fix the above nothrow issue?
Comment 11 Jakub Jelinek 2023-12-12 10:36:01 UTC
Sure.  But we need to find out first what builtins might actually throw.
Perhaps with -fnon-call-exceptions those which read/store (vector loads/stores, masked loads/stores, scatters/gathers?) memory can?  Unsure if we handle it though...
Comment 12 Hongtao Liu 2023-12-12 11:06:20 UTC
(In reply to Jakub Jelinek from comment #8)
> Of course, yet another option is:
> --- gcc/config/i386/i386.cc	2023-12-12 08:54:39.821148670 +0100
> +++ gcc/config/i386/i386.cc	2023-12-12 11:07:03.795286363 +0100
> @@ -19377,7 +19377,10 @@ ix86_gimple_fold_builtin (gimple_stmt_it
>      do_shift:
>        gcc_assert (n_args >= 2);
>        if (!gimple_call_lhs (stmt))
> -	break;
> +	{
> +	  gsi_replace (gsi, gimple_build_nop (), false);
> +	  return true;
> +	}
>        arg0 = gimple_call_arg (stmt, 0);
>        arg1 = gimple_call_arg (stmt, 1);
>        elems = TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg0));
> @@ -19523,7 +19526,10 @@ ix86_gimple_fold_builtin (gimple_stmt_it
>      case IX86_BUILTIN_PABSD256_MASK:
>        gcc_assert (n_args >= 1);
>        if (!gimple_call_lhs (stmt))
> -	break;
> +	{
> +	  gsi_replace (gsi, gimple_build_nop (), false);
> +	  return true;
> +	}
>        arg0 = gimple_call_arg (stmt, 0);
>        elems = TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg0));
>        /* For masked ABS, only optimize if the mask is all ones.  */
> (and I wonder why all the gsi_replace calls in that function are with false,
> IMHO they should use true).

I prefer this solution, that's what we did for blendvps case.
I don't know either, just follow what we did before (with false) when folding builtins.
Comment 13 Hongtao Liu 2023-12-12 11:08:11 UTC
> I prefer this solution, that's what we did for blendvps case.
> I don't know either, just follow what we did before (with false) when
> folding builtins.
I mean when I was working on r14-1145-g1ede03e2d0437ea9c2f7453fcbe263505b4e0def, I'm just follow the existed code.
Comment 14 Jakub Jelinek 2023-12-12 11:24:54 UTC
Created attachment 56863 [details]
gcc14-pr112962-1.patch

Patch I'm going to test.
Comment 15 Jakub Jelinek 2023-12-12 11:26:48 UTC
Created attachment 56864 [details]
gcc14-pr112962-2.patch

And another one for nothrow/leaf.  I'm too lazy to figure out the exact details
for -fnon-call-exceptions, will defer that to somebody who cares about those and has time to figure out all the details.
I think e.g. aarch64 doesn't set nothrow on builtins with -fnon-call-exceptions if they might raise floating point exceptions or read/write memory.
Comment 16 Andrew Pinski 2023-12-12 22:59:30 UTC
(In reply to Jakub Jelinek from comment #15)
> I think e.g. aarch64 doesn't set nothrow on builtins with
> -fnon-call-exceptions if they might raise floating point exceptions or
> read/write memory.

That is correct, see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107209#c7 for a change related to aarch64's builtins and non-call exceptions and folding there.
Comment 17 GCC Commits 2023-12-13 10:34:46 UTC
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:02c30fdad2f46a1f7b4e30d0eff0ac275cd108a5

commit r14-6485-g02c30fdad2f46a1f7b4e30d0eff0ac275cd108a5
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Dec 13 11:34:12 2023 +0100

    i386: Fix ICE on __builtin_ia32_pabsd128 without lhs [PR112962]
    
    The following patch fixes ICE on the testcase in similar way to how
    other folded builtins are handled in ix86_gimple_fold_builtin when
    they don't have a lhs; these builtins are const or pure, so normally
    DCE would remove them later, but with -O0 that isn't guaranteed to
    happen, and during expansion if they are marked TREE_SIDE_EFFECTS
    it might still be attempted to be expanded.
    This removes them right away during the folding.
    
    Initially I wanted to also change all gsi_replace last args in that function
    to true, but Andrew pointed to PR107209, so I've kept them as is.
    
    2023-12-13  Jakub Jelinek  <jakub@redhat.com>
    
            PR target/112962
            * config/i386/i386.cc (ix86_gimple_fold_builtin): For shifts
            and abs without lhs replace with nop.
    
            * gcc.target/i386/pr112962.c: New test.
Comment 18 Jakub Jelinek 2023-12-13 11:10:46 UTC
Fixed.
Comment 19 GCC Commits 2023-12-20 11:02:53 UTC
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:96e0b513717e25405aee36851d5164aab0d0403a

commit r14-6743-g96e0b513717e25405aee36851d5164aab0d0403a
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Dec 20 12:01:57 2023 +0100

    i386: Make most MD builtins nothrow, leaf [PR112962]
    
    The following patch makes most of x86 MD builtins nothrow,leaf
    (like most middle-end builtins are).  For -fnon-call-exceptions it
    doesn't nothrow, better might be to still add it if the builtins
    don't read or write memory and can't raise floating point exceptions,
    but we don't have such information readily available, so the patch
    uses just !flag_non_call_exceptions for now.
    Not sure if we shouldn't have some exceptions for the leaf attribute,
    e.g. wonder about EMMS/FEMMS and the various xsave/xrstor etc. builtins,
    pedantically none of those builtins do anything that leaf functions
    are forbidden to do (having callbacks, calling functions from current TU,
    longjump into the current TU), but sometimes non-leaf is also used on
    really complex functions to prevent some unwanted optimizations.
    That said, haven't run into any problems as is with the patch.
    
    2023-12-20  Jakub Jelinek  <jakub@redhat.com>
    
            PR target/112962
            * config/i386/i386-builtins.cc (ix86_builtins): Increase by one
            element.
            (def_builtin): If not -fnon-call-exceptions, set TREE_NOTHROW on
            the builtin FUNCTION_DECL.  Add leaf attribute to DECL_ATTRIBUTES.
            (ix86_add_new_builtins): Likewise.