gcc-9.0.0-alpha20190407 snapshot (r270192) ICEs when compiling the following testcase w/ -march=bdver1 (=bdver2) -mfpmath=387 -O1 (-O2, -O3, -Ofast) -fschedule-insns -fselective-scheduling: void qj (int b9, int r9, int k4, int k0, int e7) { (void) b9; (void) r9; (void) k4; while (!!k0 == e7 * 1.1) { } } % x86_64-pc-linux-gnu-gcc-9.0.0-alpha20190407 -march=bdver1 -mfpmath=387 -O1 -fschedule-insns -fselective-scheduling -c nhzbpwxv.c nhzbpwxv.c: In function 'qj': nhzbpwxv.c:11:1: error: insn does not satisfy its constraints: 11 | } | ^ (insn 39 0 0 (set (reg:DF 95) (float:DF (reg:SI 36 r8 [ e7 ]))) 172 {*floatsidf2} (expr_list:REG_DEAD (reg:SI 98) (nil))) during RTL pass: sched1 nhzbpwxv.c:11:1: internal compiler error: in extract_constrain_insn_cached, at recog.c:2223 0x66a363 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*) /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190407/work/gcc-9-20190407/gcc/rtl-error.c:108 0x66a389 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*) /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190407/work/gcc-9-20190407/gcc/rtl-error.c:118 0x6685a6 extract_constrain_insn_cached(rtx_insn*) /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190407/work/gcc-9-20190407/gcc/recog.c:2223 0x12b020f get_attr_type(rtx_insn*) /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190407/work/gcc-9-20190407/gcc/config/i386/i386.md:2288 0x12d6805 internal_dfa_insn_code_bdver1(rtx_insn*) /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190407/work/gcc-9-20190407/gcc/config/i386/i386.md:15343 0x12c4100 dfa_insn_code /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190407/work/build/gcc/insn-automata.c:158875 0x12c4100 state_transition(void*, rtx_def*) /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190407/work/build/gcc/insn-automata.c:158890 0xd0b74a estimate_insn_cost /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190407/work/gcc-9-20190407/gcc/sel-sched.c:4293 0xd17aab get_expr_cost /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190407/work/gcc-9-20190407/gcc/sel-sched.c:4324 0xd17aab choose_best_insn /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190407/work/gcc-9-20190407/gcc/sel-sched.c:4353 0xd17aab find_best_expr /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190407/work/gcc-9-20190407/gcc/sel-sched.c:4403 0xd17aab fill_insns /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190407/work/gcc-9-20190407/gcc/sel-sched.c:5550 0xd17aab schedule_on_fences /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190407/work/gcc-9-20190407/gcc/sel-sched.c:7368 0xd17aab sel_sched_region_2 /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190407/work/gcc-9-20190407/gcc/sel-sched.c:7506 0xd185e8 sel_sched_region_1 /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190407/work/gcc-9-20190407/gcc/sel-sched.c:7548 0xd1a111 sel_sched_region(int) /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190407/work/gcc-9-20190407/gcc/sel-sched.c:7649 0xd1a111 sel_sched_region(int) /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190407/work/gcc-9-20190407/gcc/sel-sched.c:7634 0xd1acc8 run_selective_scheduling() /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190407/work/gcc-9-20190407/gcc/sel-sched.c:7735 0xcf915d rest_of_handle_sched /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190407/work/gcc-9-20190407/gcc/sched-rgn.c:3717 0xcf915d execute /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190407/work/gcc-9-20190407/gcc/sched-rgn.c:3827
The pattern of the offending instruction is defined as (define_insn "*float<SWI48:mode><MODEF:mode>2" [(set (match_operand:MODEF 0 "register_operand" "=f,v,v") (float:MODEF (match_operand:SWI48 1 "nonimmediate_operand" "m,r,m")))] since RA is nowadays able to reload input operand of alternative 0: (insn 17 35 15 2 (set (reg:DF 95) (float:DF (reg:SI 98))) "pr90007.c":8:21 172 {*floatsidf2} (expr_list:REG_DEAD (reg:SI 98) (nil))) via memory: (insn 17 35 37 2 (set (reg:DF 8 st [95]) (float:DF (mem/c:SI (plus:DI (reg/f:DI 7 sp) (const_int -4 [0xfffffffffffffffc])) [1 %sfp+-4 S4 A32]))) "pr90007.c":8:21 172 {*floatsidf2} (nil)) sel-sched is not prepared for this, uses: --cut here-- /* Estimate the cost of issuing INSN on DFA state STATE. */ static int estimate_insn_cost (rtx_insn *insn, state_t state) { static state_t temp = NULL; int cost; if (!temp) temp = xmalloc (dfa_state_size); memcpy (temp, state, dfa_state_size); cost = state_transition (temp, insn); if (cost < 0) return 0; else if (cost == 0) return 1; return cost; } --cut here-- that calls state_transition, which tries to calculate and verify constraints via the following call sequence: #4 0x000000000064ea7f in extract_constrain_insn_cached (insn=insn@entry=0x7fffea669940) at ../../git/gcc/gcc/recog.c:2223 #5 0x0000000001217c4f in get_attr_type (insn=insn@entry=0x7fffea669940) at ../../git/gcc/gcc/config/i386/i386.md:2288 #6 0x000000000124994c in internal_dfa_insn_code_bdver1 (insn=0x7fffea669940) at ../../git/gcc/gcc/config/i386/i386.md:15343 #7 0x0000000001233169 in dfa_insn_code (insn=0x27) at insn-automata.c:158875 #8 state_transition (state=0x21a7af0, insn=insn@entry=0x7fffea669940) at insn-automata.c:27818 and crashes due to unmet constraints.
We have a pseudo:SI<-hardreg:SI assignment followed by pseudo:DF<-float(pseudo:SI) conversion, and we substitute the latter through the former, creating a pseudo:DF<-float(hardreg:SI) insn that fails in recog. I'm not exactly sure why RA would reject reloading the operand when it's a hardreg, but happily reload when it's a pseudo. Am I missing something obvious, or are such constraints written down somewhere?
Why does sel-sched try to propagate hard registers into insns before RA? The whole point of the combiner changes was not to do that, so that the RA can do better job.
Well, often sel-sched just does not discriminate hardregs and pseudos when checking if renaming/substitution may be applied. Sure, as a matter of efficiency we should probably disallow substitution through such mixed pseudo=hardreg assignments. Nevertheless, if it's not only a matter of optimization, but also of internal consistency, then I'd like to understand it better. Hence the question in comment #2.
It is not a matter of efficiency, but primarily that RA can't do anything in many cases after propagating hard registers into instructions. This PR is just one of the many cases.
CCing Segher and Vlad on this if they want to comment on that further.
(In reply to Jakub Jelinek from comment #3) > Why does sel-sched try to propagate hard registers into insns before RA? > The whole point of the combiner changes was not to do that, so that the RA > can do better job. That, and *correctness*. Propagating hard registers can lead to things that cannot be reloaded. Even in the simple case here you cannot necessarily replace the hard reg with a pseudo and end up with valid code.
The problem of sel-sched propagating hard-regs is likely older, thus P2 since this is also not from a real-world program that newly fails to build.
GCC 9.1 has been released.
GCC 9.2 has been released.
(In reply to Alexander Monakov from comment #4) > Well, often sel-sched just does not discriminate hardregs and pseudos when > checking if renaming/substitution may be applied. Sure, as a matter of > efficiency we should probably disallow substitution through such mixed > pseudo=hardreg assignments. > > Nevertheless, if it's not only a matter of optimization, but also of > internal consistency, then I'd like to understand it better. Hence the > question in comment #2. Alexander, it is a legitimate question. LRA changed actively during many years and LRA now can reload hard register as pseudo register into memory. Unfortunately, recog.c was not adapted to these changes. I think the following patch will reflect the LRA changes: Index: recog.c =================================================================== --- recog.c (revision 278413) +++ recog.c (working copy) @@ -2757,10 +2757,9 @@ constrain_operands (int strict, alternat /* Before reload, accept what reload can turn into a mem. */ || (strict < 0 && CONSTANT_P (op)) - /* Before reload, accept a pseudo, + /* Before reload, accept a pseudo or hard register, since LRA can turn it into a mem. */ - || (strict < 0 && targetm.lra_p () && REG_P (op) - && REGNO (op) >= FIRST_PSEUDO_REGISTER) + || (strict < 0 && targetm.lra_p () && REG_P (op)) /* During reload, accept a pseudo */ || (reload_in_progress && REG_P (op) && REGNO (op) >= FIRST_PSEUDO_REGISTER))) I've checked the patch. It solves the problem. I'll submit it for approval after testing, probably tomorrow.
Author: vmakarov Date: Wed Nov 27 14:24:47 2019 New Revision: 278770 URL: https://gcc.gnu.org/viewcvs?rev=278770&root=gcc&view=rev Log: 2019-11-27 Vladimir Makarov <vmakarov@redhat.com> PR rtl-optimization/90007 * recog.c (constrain_operands): Permit hard registers too for memory when LRA is used. 2019-11-27 Vladimir Makarov <vmakarov@redhat.com> PR rtl-optimization/90007 * gcc.target/i386/pr90007.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/pr90007.c Modified: trunk/gcc/ChangeLog trunk/gcc/recog.c trunk/gcc/testsuite/ChangeLog
Does that work? You cannot put all hard registers in memory I think? Or do we require that and it is just not documented?
(In reply to Segher Boessenkool from comment #13) > Does that work? You cannot put all hard registers in memory I think? > Or do we require that and it is just not documented? It depends on insns. For example, if insn only requires memory operand but we have hard register instead, we spill it into memory. Generally speaking I think we can "invent" an architecture where such feature may result in problems. But the same possible for other aspects of LRA/reload work. In practice, I believe it is not a problem for real architectures.
GCC 9.3.0 has been released, adjusting target milestone.
GCC 9.4 is being released, retargeting bugs to GCC 9.5.
Fixed in GCC 10.
.