Bug 84844 - [8 Regression] ICE in extract_constrain_insn_cached, at recog.c:2217 (error: insn does not satisfy its constraints)
Summary: [8 Regression] ICE in extract_constrain_insn_cached, at recog.c:2217 (error: ...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 8.0
: P1 normal
Target Milestone: 8.0
Assignee: Not yet assigned to anyone
URL:
Keywords: ice-on-valid-code
Depends on:
Blocks:
 
Reported: 2018-03-13 09:55 UTC by Arseny Solokha
Modified: 2018-03-14 08:57 UTC (History)
2 users (show)

See Also:
Host:
Target: x86_64-pc-linux-gnu
Build:
Known to work:
Known to fail:
Last reconfirmed: 2018-03-13 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Arseny Solokha 2018-03-13 09:55:32 UTC
gcc-8.0.0-alpha20180311 snapshot (r258438) ICEs when compiling the following snippet w/ -march=bdver1 (=bdver2) -O1 (-O2, -O3, -Ofast, -Og, -Os) -fschedule-insns -fselective-scheduling:

double
vl (int *hm, int ue, int jy)
{
  *hm = ue;
  return jy;
}

% x86_64-pc-linux-gnu-gcc-8.0.0-alpha20180311 -march=bdver1 -O1 -fschedule-insns -fselective-scheduling -c iy3oz1yv.c
iy3oz1yv.c: In function 'vl':
iy3oz1yv.c:6:1: error: insn does not satisfy its constraints:
 }
 ^
(insn 19 0 0 (set (reg:DF 91)
        (float:DF (reg:SI 1 dx [ jy ]))) 205 {*floatsidf2_mixed}
     (expr_list:REG_DEAD (reg/v:SI 90 [ jy ])
        (nil)))
during RTL pass: sched1
iy3oz1yv.c:6:1: internal compiler error: in extract_constrain_insn_cached, at recog.c:2217
0x64a621 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
	/var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180311/work/gcc-8-20180311/gcc/rtl-error.c:108
0x64a647 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
	/var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180311/work/gcc-8-20180311/gcc/rtl-error.c:118
0x648bc4 extract_constrain_insn_cached(rtx_insn*)
	/var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180311/work/gcc-8-20180311/gcc/recog.c:2217
0x118653d get_attr_type(rtx_insn*)
	/var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180311/work/gcc-8-20180311/gcc/config/i386/i386.md:2241
0x11b6f86 internal_dfa_insn_code_bdver1(rtx_insn*)
	/var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180311/work/gcc-8-20180311/gcc/config/i386/i386.md:15271
0x11a29a8 dfa_insn_code
	/var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180311/work/build/gcc/insn-automata.c:150079
0x11a29a8 state_transition(void*, rtx_def*)
	/var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180311/work/build/gcc/insn-automata.c:150094
0xc5f1fd estimate_insn_cost
	/var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180311/work/gcc-8-20180311/gcc/sel-sched.c:4291
0xc6b0ab get_expr_cost
	/var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180311/work/gcc-8-20180311/gcc/sel-sched.c:4322
0xc6b0ab choose_best_insn
	/var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180311/work/gcc-8-20180311/gcc/sel-sched.c:4351
0xc6b0ab find_best_expr
	/var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180311/work/gcc-8-20180311/gcc/sel-sched.c:4401
0xc6b0ab fill_insns
	/var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180311/work/gcc-8-20180311/gcc/sel-sched.c:5544
0xc6b0ab schedule_on_fences
	/var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180311/work/gcc-8-20180311/gcc/sel-sched.c:7361
0xc6b0ab sel_sched_region_2
	/var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180311/work/gcc-8-20180311/gcc/sel-sched.c:7499
0xc6d227 sel_sched_region_1
	/var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180311/work/gcc-8-20180311/gcc/sel-sched.c:7541
0xc6d227 sel_sched_region(int)
	/var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180311/work/gcc-8-20180311/gcc/sel-sched.c:7642
0xc6e288 run_selective_scheduling()
	/var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180311/work/gcc-8-20180311/gcc/sel-sched.c:7718
0xc4ddfd rest_of_handle_sched
	/var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180311/work/gcc-8-20180311/gcc/sched-rgn.c:3715
0xc4ddfd execute
	/var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180311/work/gcc-8-20180311/gcc/sched-rgn.c:3825
Comment 1 Jakub Jelinek 2018-03-13 11:17:02 UTC
Started with r247036.
Comment 2 Jakub Jelinek 2018-03-13 11:50:06 UTC
The problem is that if a constraint returns NO_REGS before RA (i.e. in non-strict mode) then that alternative is ignored, and e.g. get_attr_type used by sched1 needs to compute alternative.  As the instruction uses nonimmediate_operand and the only alternative with "r" input has "Yc" output which is NO_REGS for this -march=, if the operand is a register, there is no alternative left.

The following patch fixes this for me:
--- gcc/config/i386/i386.md.jj	2018-03-13 09:05:22.480987994 +0100
+++ gcc/config/i386/i386.md	2018-03-13 12:44:21.372058947 +0100
@@ -5325,16 +5325,17 @@
 })
 
 (define_insn "*float<SWI48:mode><MODEF:mode>2_mixed"
-  [(set (match_operand:MODEF 0 "register_operand" "=f,Yc,v")
+  [(set (match_operand:MODEF 0 "register_operand" "=f,Yc,v,!???v")
 	(float:MODEF
-	  (match_operand:SWI48 1 "nonimmediate_operand" "m,r,m")))]
+	  (match_operand:SWI48 1 "nonimmediate_operand" "m,r,m,r")))]
   "SSE_FLOAT_MODE_P (<MODEF:MODE>mode) && TARGET_SSE_MATH"
   "@
    fild%Z1\t%1
    %vcvtsi2<MODEF:ssemodesuffix><SWI48:rex64suffix>\t{%1, %d0|%d0, %1}
+   %vcvtsi2<MODEF:ssemodesuffix><SWI48:rex64suffix>\t{%1, %d0|%d0, %1}
    %vcvtsi2<MODEF:ssemodesuffix><SWI48:rex64suffix>\t{%1, %d0|%d0, %1}"
-  [(set_attr "type" "fmov,sseicvt,sseicvt")
-   (set_attr "prefix" "orig,maybe_vex,maybe_vex")
+  [(set_attr "type" "fmov,sseicvt,sseicvt,sseicvt")
+   (set_attr "prefix" "orig,maybe_vex,maybe_vex,maybe_vex")
    (set_attr "mode" "<MODEF:MODE>")
    (set (attr "prefix_rex")
      (if_then_else
@@ -5342,11 +5343,11 @@
 	    (match_test "<SWI48:MODE>mode == DImode"))
        (const_string "1")
        (const_string "*")))
-   (set_attr "unit" "i387,*,*")
-   (set_attr "athlon_decode" "*,double,direct")
-   (set_attr "amdfam10_decode" "*,vector,double")
-   (set_attr "bdver1_decode" "*,double,direct")
-   (set_attr "znver1_decode" "double,*,*")
+   (set_attr "unit" "i387,*,*,*")
+   (set_attr "athlon_decode" "*,double,direct,double")
+   (set_attr "amdfam10_decode" "*,vector,double,vector")
+   (set_attr "bdver1_decode" "*,double,direct,double")
+   (set_attr "znver1_decode" "double,*,*,*")
    (set_attr "fp_int_src" "true")
    (set (attr "enabled")
      (cond [(eq_attr "alternative" "0")

The CPU does support this insn, just it is very undesirable for tuning.
So, the !??? should tell reload to try very hard not to use that alternative.
Comment 3 Jakub Jelinek 2018-03-13 11:53:17 UTC
Or shall I got for ?*v instead, or ???*v, or !???*v ?
Comment 4 Uroš Bizjak 2018-03-13 12:24:40 UTC
(In reply to Jakub Jelinek from comment #3)
> Or shall I got for ?*v instead, or ???*v, or !???*v ?

Or we can revert PR78090 fix and use preferred_for_speed attribute again. This will allow the compiler to emit reg->xmm for cold parts.

BTW: we shouldn't just grep assembly dumps for unoptimal insns, as was done and reported in PR78090. There is no harm to emit reg->xmm form of the insn in the cold parts.

Based on this, I prefer the revert of PR78090 patch, perhaps with a run of 462.libquantum to assess the performance impact.
Comment 5 Jakub Jelinek 2018-03-13 12:36:09 UTC
(In reply to Uroš Bizjak from comment #4)
> (In reply to Jakub Jelinek from comment #3)
> > Or shall I got for ?*v instead, or ???*v, or !???*v ?
> 
> Or we can revert PR78090 fix and use preferred_for_speed attribute again.
> This will allow the compiler to emit reg->xmm for cold parts.
> 
> BTW: we shouldn't just grep assembly dumps for unoptimal insns, as was done
> and reported in PR78090. There is no harm to emit reg->xmm form of the insn
> in the cold parts.
> 
> Based on this, I prefer the revert of PR78090 patch, perhaps with a run of
> 462.libquantum to assess the performance impact.

I can test the reversion, but don't have any setup for SPEC testing nor AMD HW,
so that would need to be done by somebody else.
Comment 6 Uroš Bizjak 2018-03-13 13:27:01 UTC
(In reply to Jakub Jelinek from comment #2)
> The problem is that if a constraint returns NO_REGS before RA (i.e. in
> non-strict mode) then that alternative is ignored, and e.g. get_attr_type
> used by sched1 needs to compute alternative.  As the instruction uses
> nonimmediate_operand and the only alternative with "r" input has "Yc" output
> which is NO_REGS for this -march=, if the operand is a register, there is no
> alternative left.

OTOH, if the above is the problem - we have quite some constructs that declare insn pattern with (match_operand "nonimmediate_operand" "m"), please see the comment above floathi<mode>2. The comment also says that LRA will fix up this during register allocation. Also, -fschedule-insns without -fselective-scheduling is able to compile the testcase without errors, so it looks that -fselective-scheduling should be made more robust.
Comment 7 Jakub Jelinek 2018-03-13 13:46:22 UTC
I'm afraid I don't see what sel-sched could do.
It asks the DFA about cost estimate, and that ICEs.
And it ICEs because to determine the estimated cost it needs to estimate which alternative is going to be used and for that it needs to compute get_attr_type, which contains extract_constrain_insn_cached if there are multiple alternatives with different types.

I assume
(match_operand "nonimmediate_operand" "m")
is not really a problem if there is just one alternative, or even if there are multiple alternatives, but all of them have the same type attribute.  It is just the case where type is different (or some other attribute that needs to be queried before reload), which implies extract_constrain_insn_cached and that ignores alternatives that have NO_REGS, or memory constraint on REG.  For most constraints in !strict mode, it will win if reg_class_for_constraint (cn) != NO_REGS, or, if it is NO_REGS, if constraint_satisfied_p, or a couple of other cases.
Comment 8 Jakub Jelinek 2018-03-14 08:49:45 UTC
Author: jakub
Date: Wed Mar 14 08:48:40 2018
New Revision: 258515

URL: https://gcc.gnu.org/viewcvs?rev=258515&root=gcc&view=rev
Log:
	PR target/84844
	Revert
	2017-04-20  Uros Bizjak  <ubizjak@gmail.com>

	PR target/78090
	* config/i386/constraints.md (Yc): New register constraint.
	* config/i386/i386.md (*float<SWI48:mode><MODEF:mode>2_mixed):
	Use Yc constraint for alternative 2 of operand 0.  Remove
	preferred_for_speed attribute.

	* gcc.target/i386/pr84844.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/i386/pr84844.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/constraints.md
    trunk/gcc/config/i386/i386.md
    trunk/gcc/testsuite/ChangeLog
Comment 9 Jakub Jelinek 2018-03-14 08:57:08 UTC
Fixed.