Reduced testcase: -Ofast -mavx -mno-avx2 -mtune=bdver2 float *a, *b; int c, d, e, f; void foo (void) { for (; c; c++) a[c] = 0; if (!d) for (; c < f; c++) b[c] = (double) e / b[c]; } There are various differences of the kind: - vcvtps2pd 16(%rsp), %xmm3 + vmovaps 16(%rsp), %xmm7 + vcvtps2pd %xmm7, %xmm3
With diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md index bac9d66..1a7d386 100644 --- a/gcc/config/i386/constraints.md +++ b/gcc/config/i386/constraints.md @@ -161,7 +161,7 @@ "@internal GOT memory operand." (match_operand 0 "GOT_memory_operand")) -(define_constraint "Bm" +(define_memory_constraint "Bm" "@internal Vector memory operand." (match_operand 0 "vector_memory_operand")) I got - vmovaps 16(%rsp), %xmm7 - vcvtps2pd %xmm7, %xmm3 + vcvtps2pd 16(%rsp), %xmm3 But we can't use define_memory_constraint: https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00125.html
The related LRA code is case CT_MEMORY: if (MEM_P (op) && satisfies_memory_constraint_p (op, cn)) win = true; else if (spilled_pseudo_p (op)) win = true; /* If we didn't already win, we can reload constants via force_const_mem or put the pseudo value into memory, or make other memory by reloading the address like for 'o'. */ if (CONST_POOL_OK_P (mode, op) || MEM_P (op) || REG_P (op)) badop = false; /* If this operand accepts a register, and if the register class has at least one allocatable register, then this operand can be reloaded. */ if (winreg && !no_regs_p) badop = false;
Maybe we really need to have two types of memory constraints, ones which can be worst case always satisfied by reloading their address into an address register and another ones which can be worst case always satisfied by loading the memory into a temporary register (for loads) or storing it from a temporary register. Or consider the constraint as CT_MEMORY only if the operand satisfies the constraint predicate and as CT_FIXED_FORM (or whatever is the default) otherwise? Only normal CT_MEMORY, for Bm as long as it satisfies memory_operand, the exact address form doesn't really matters, but what matters is something inherent in the memory (and ISA flags).
Created attachment 37364 [details] A patch Does it make sense?
Created attachment 37372 [details] An updated patch This patch also changes LRA to skip bad MEM. Although there are no regressions on x86-64, there could be latent issues. Someone should exam IRA/LRA to see if it is OK.
Created attachment 37378 [details] A followup patch to call constraint_satisfied_p to check memory operand
(In reply to Jakub Jelinek from comment #3) > Maybe we really need to have two types of memory > constraints, ones which can be worst case always satisfied by reloading > their address into an address register and another ones which can be worst > case always satisfied by loading the memory into a temporary register (for > loads) or storing it from a temporary register. > > Or consider the constraint as CT_MEMORY only if the operand satisfies the > constraint predicate and as CT_FIXED_FORM (or whatever is the default) > otherwise? Only normal CT_MEMORY, for Bm as long as it satisfies > memory_operand, the exact address form doesn't really matters, but what > matters is something inherent in the memory (and ISA flags). I think it is a good idea. We need memory constraint which prevents memory address reloading. So I've started to implement the new type of constraint.
Author: vmakarov Date: Fri Jan 29 18:47:17 2016 New Revision: 232993 URL: https://gcc.gnu.org/viewcvs?rev=232993&root=gcc&view=rev Log: 2016-01-29 Vladimir Makarov <vmakarov@redhat.com> PR target/69299 * config/i386/constraints.md (Bm): Describe as special memory constraint. * doc/md.texi (DEFINE_SPECIAL_MEMORY_CONSTRAINT): Describe it. * genoutput.c (main): Process DEFINE_SPECIAL_MEMORY_CONSTRAINT. * genpreds.c (struct constraint_data): Add is_special_memory. (have_special_memory_constraints, special_memory_start): New static vars. (special_memory_end): Ditto. (add_constraint): Add new arg is_special_memory. Add code to process its true value. Update have_special_memory_constraints. (process_define_constraint): Pass the new arg. (process_define_register_constraint): Ditto. (choose_enum_order): Process special memory. (write_tm_preds_h): Generate enum const CT_SPECIAL_MEMORY and function insn_extra_special_memory_constraint. (main): Process DEFINE_SPECIAL_MEMORY_CONSTRAINT. * gensupport.c (process_rtx): Process DEFINE_SPECIAL_MEMORY_CONSTRAINT. * ira-costs.c (record_reg_classes): Process CT_SPECIAL_MEMORY. * ira-lives.c (single_reg_class): Use insn_extra_special_memory_constraint. * ira.c (ira_setup_alts): Process CT_SPECIAL_MEMORY. * lra-constraints.c (process_alt_operands): Ditto. (curr_insn_transform): Use insn_extra_special_memory_constraint. * recog.c (asm_operand_ok, preprocess_constraints): Process CT_SPECIAL_MEMORY. * reload.c (find_reloads): Ditto. * rtl.def (DEFINE_SPECIFAL_MEMORY_CONSTRAINT): New. * stmt.c (parse_input_constraint): Use insn_extra_special_memory_constraint. Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/constraints.md trunk/gcc/doc/md.texi trunk/gcc/genoutput.c trunk/gcc/genpreds.c trunk/gcc/gensupport.c trunk/gcc/ira-costs.c trunk/gcc/ira-lives.c trunk/gcc/ira.c trunk/gcc/lra-constraints.c trunk/gcc/recog.c trunk/gcc/reload.c trunk/gcc/rtl.def trunk/gcc/stmt.c
Fixed by Vlad's commit on the trunk.