The following test case from both glibc and openlibm: typedef union { float value; /* FIXME: Assumes 32 bit int. */ unsigned int word; } ieee_float_shape_type; /* Get a 32 bit int from a float. */ #define GET_FLOAT_WORD(i,d) \ do { \ ieee_float_shape_type gf_u; \ gf_u.value = (d); \ (i) = gf_u.word; \ } while (0) int foo(float d) { int i; GET_FLOAT_WORD(i, d); return i; } Uses the stack to do the conversion: foo: stfs 1,-16(1) ori 2,2,0 lwa 3,-16(1) LLVM does what I expect: foo: xscvdpspn 0, 1 xxsldwi 0, 0, 0, 3 mfvsrwz 3, 0 extsw 3, 3
Confirmed. Still better is xscvdpspn 0,1 mfvsrd 3,0 srdi 3,3,32
sradi, that is.
Tracked this back to 210824, and in particular this change: @@ -860,10 +897,15 @@ } } - /* If the alternative actually allows memory, make - things a bit cheaper since we won't need an extra - insn to load it. */ - if (op_class != NO_REGS) + if (op_class == NO_REGS) + /* Although we don't need insn to reload from + memory, still accessing memory is usually more + expensive than a register. */ + pp->mem_cost = frequency; + else + /* If the alternative actually allows memory, make + things a bit cheaper since we won't need an + extra insn to load it. */ pp->mem_cost = ((out_p ? ira_memory_move_cost[mode][op_class][0] : 0) + (in_p ? ira_memory_move_cost[mode][op_class][1] : 0) Without this change, you get this from IRA costs: a0 (r157,l0) best GENERAL_REGS, allocno GENERAL_REGS a0(r157,l0) costs: BASE_REGS:4000,4000 GENERAL_REGS:4000,4000 LINK_REGS:16000,16000 CTR_REGS:16000,16000 LINK_OR_CTR_REGS:16000,16000 SPEC_OR_GEN_REGS:16000,16000 MEM:4000,4000 and we pick the pattern for xscvdpspn. With the change, you get a0 (r157,l0) best NO_REGS, allocno NO_REGS a0(r157,l0) costs: BASE_REGS:3000,3000 GENERAL_REGS:3000,3000 LINK_REGS:15000,15000 CTR_REGS:15000,15000 LINK_OR_CTR_REGS:15000,15000 SPEC_OR_GEN_REGS:15000,15000 MEM:2000,2000 And this gets done via memory instead of the register moves.
(In reply to acsawdey from comment #3) > Tracked this back to 210824, and in particular this change: > > @@ -860,10 +897,15 @@ > } > } > > - /* If the alternative actually allows memory, make > - things a bit cheaper since we won't need an extra > - insn to load it. */ > - if (op_class != NO_REGS) > + if (op_class == NO_REGS) > + /* Although we don't need insn to reload from > + memory, still accessing memory is usually more > + expensive than a register. */ > + pp->mem_cost = frequency; > + else > + /* If the alternative actually allows memory, make > + things a bit cheaper since we won't need an > + extra insn to load it. */ > pp->mem_cost > = ((out_p ? ira_memory_move_cost[mode][op_class][0] : > 0) > + (in_p ? ira_memory_move_cost[mode][op_class][1] : > 0) So looking into things, there are the following 2 insns using pseudo 157: (insn 2 4 3 2 (set (reg/v:SF 157 [ d ]) (reg:SF 33 1 [ d ])) test.c:9 466 {movsf_hardfloat} (expr_list:REG_DEAD (reg:SF 33 1 [ d ]) (expr_list:REG_EQUIV (mem/c:SF (plus:DI (reg/f:DI 67 ap) (const_int 48 [0x30])) [1 d+0 S4 A64]) (nil)))) (insn 11 6 12 2 (set (reg/i:DI 3 3) (sign_extend:DI (subreg:SI (reg/v:SF 157 [ d ]) 0))) test.c:16 40 {extendsidi2} (expr_list:REG_DEAD (reg/v:SF 157 [ d ]) (nil))) For the alternatives of the 2 insns that accept memory, the new code is now setting a mem_cost of 1000 (frequency), whereas before it wouldn't have been modified for those alternatives. This seems to ignore the fact that the given insn alternatives will be a store and load. Vlad, Should the new code somehow be taking into account memory_move_cost (i.e. memory_move_cost * frequency)?
GCC 4.9 branch is being closed
*** Bug 78823 has been marked as a duplicate of this bug. ***
I noticed this bug and created PR 78823. Since that's the same bug as this, I'll take it.
Author: meissner Date: Thu Jan 5 00:43:53 2017 New Revision: 244084 URL: https://gcc.gnu.org/viewcvs?rev=244084&root=gcc&view=rev Log: [gcc] 2017-01-04 Michael Meissner <meissner@linux.vnet.ibm.com> PR target/71977 PR target/70568 PR target/78823 * config/rs6000/predicates.md (sf_subreg_operand): New predicate. (altivec_register_operand): Do not return true if the operand contains a SUBREG mixing SImode and SFmode. (vsx_register_operand): Likewise. (vsx_reg_sfsubreg_ok): New predicate. (vfloat_operand): Do not return true if the operand contains a SUBREG mixing SImode and SFmode. (vint_operand): Likewise. (vlogical_operand): Likewise. (gpc_reg_operand): Likewise. (int_reg_operand): Likewise. * config/rs6000/rs6000-protos.h (valid_sf_si_move): Add declaration. * config/rs6000/rs6000.c (valid_sf_si_move): New function to determine if a MOVSI or MOVSF operation contains SUBREGs that mix SImode and SFmode. (rs6000_emit_move_si_sf_subreg): New helper function. (rs6000_emit_move): Call rs6000_emit_move_si_sf_subreg to possbily fixup SUBREGs involving SImode and SFmode. * config/rs6000/vsx.md (SFBOOL_*): New constants that are operand numbers for the new peephole2 optimization. (peephole2 for SFmode unions): New peephole2 to optimize cases in the GLIBC math library that do AND/IOR/XOR operations on single precision floating point. * config/rs6000/rs6000.h (TARGET_NO_SF_SUBREG): New internal target macros to say whether we need to avoid SUBREGs mixing SImode and SFmode. (TARGET_ALLOW_SF_SUBREG): Likewise. * config/rs6000/rs6000.md (UNSPEC_SF_FROM_SI): New unspecs. (UNSPEC_SI_FROM_SF): Likewise. (iorxor): Change spacing. (and_ior_xor): New iterator for AND, IOR, and XOR. (movsi_from_sf): New insns for SImode/SFmode SUBREG support. (movdi_from_sf_zero_ext): Likewise. (mov<mode>_hardfloat, FMOVE32 iterator): Use register_operand instead of gpc_reg_operand. Add SImode/SFmode SUBREG support. (movsf_from_si): New insn for SImode/SFmode SUBREG support. (fma<mode>4): Use gpc_reg_operand instead of register_operand. (fms<mode>4): Likewise. (fnma<mode>4): Likewise. (fnms<mode>4): Likewise. (nfma<mode>4): Likewise. (nfms<mode>4): Likewise. [gcc/testsuite] 2017-01-04 Michael Meissner <meissner@linux.vnet.ibm.com> PR target/71977 PR target/70568 PR target/78823 * gcc.target/powerpc/pr71977-1.c: New tests to check whether on 64-bit VSX systems with direct move, whether we optimize common code sequences in the GLIBC math library for float math functions. * gcc.target/powerpc/pr71977-2.c: Likewise. Added: trunk/gcc/testsuite/gcc.target/powerpc/pr71977-1.c trunk/gcc/testsuite/gcc.target/powerpc/pr71977-2.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/rs6000/predicates.md trunk/gcc/config/rs6000/rs6000-protos.h trunk/gcc/config/rs6000/rs6000.c trunk/gcc/config/rs6000/rs6000.h trunk/gcc/config/rs6000/rs6000.md trunk/gcc/config/rs6000/vsx.md trunk/gcc/testsuite/ChangeLog
Author: meissner Date: Tue Jan 10 17:44:17 2017 New Revision: 244279 URL: https://gcc.gnu.org/viewcvs?rev=244279&root=gcc&view=rev Log: [gcc] 2017-01-10 Michael Meissner <meissner@linux.vnet.ibm.com> Back port from trunk 2017-01-04 Michael Meissner <meissner@linux.vnet.ibm.com> PR target/71977 PR target/70568 PR target/78823 * config/rs6000/predicates.md (sf_subreg_operand): New predicate. (altivec_register_operand): Do not return true if the operand contains a SUBREG mixing SImode and SFmode. (vsx_register_operand): Likewise. (vsx_reg_sfsubreg_ok): New predicate. (vfloat_operand): Do not return true if the operand contains a SUBREG mixing SImode and SFmode. (vint_operand): Likewise. (vlogical_operand): Likewise. (gpc_reg_operand): Likewise. (int_reg_operand): Likewise. * config/rs6000/rs6000-protos.h (valid_sf_si_move): Add declaration. * config/rs6000/rs6000.c (valid_sf_si_move): New function to determine if a MOVSI or MOVSF operation contains SUBREGs that mix SImode and SFmode. (rs6000_emit_move_si_sf_subreg): New helper function. (rs6000_emit_move): Call rs6000_emit_move_si_sf_subreg to possbily fixup SUBREGs involving SImode and SFmode. * config/rs6000/vsx.md (SFBOOL_*): New constants that are operand numbers for the new peephole2 optimization. (peephole2 for SFmode unions): New peephole2 to optimize cases in the GLIBC math library that do AND/IOR/XOR operations on single precision floating point. * config/rs6000/rs6000.h (TARGET_NO_SF_SUBREG): New internal target macros to say whether we need to avoid SUBREGs mixing SImode and SFmode. (TARGET_ALLOW_SF_SUBREG): Likewise. * config/rs6000/rs6000.md (UNSPEC_SF_FROM_SI): New unspecs. (UNSPEC_SI_FROM_SF): Likewise. (iorxor): Change spacing. (and_ior_xor): New iterator for AND, IOR, and XOR. (movsi_from_sf): New insns for SImode/SFmode SUBREG support. (movdi_from_sf_zero_ext): Likewise. (mov<mode>_hardfloat, FMOVE32 iterator): Use register_operand instead of gpc_reg_operand. Add SImode/SFmode SUBREG support. (movsf_from_si): New insn for SImode/SFmode SUBREG support. (fma<mode>4): Use gpc_reg_operand instead of register_operand. (fms<mode>4): Likewise. (fnma<mode>4): Likewise. (fnms<mode>4): Likewise. (nfma<mode>4): Likewise. (nfms<mode>4): Likewise. * config/rs6000/rs6000.h (TARGET_DIRECT_MOVE_64BIT): Define macro used by the above patch in a limited form. * config/rs6000/vsx.md (peephole2 for SFmode unions): Use DFmode for doing direct moves instead of DImode, since GCC 6.x does not support DImode in Altivec registers. [gcc/testsuite] 2017-01-10 Michael Meissner <meissner@linux.vnet.ibm.com> Back port from trunk 2017-01-04 Michael Meissner <meissner@linux.vnet.ibm.com> PR target/71977 PR target/70568 PR target/78823 * gcc.target/powerpc/pr71977-1.c: New tests to check whether on 64-bit VSX systems with direct move, whether we optimize common code sequences in the GLIBC math library for float math functions. * gcc.target/powerpc/pr71977-2.c: Likewise. Added: branches/ibm/gcc-6-branch/gcc/testsuite/gcc.target/powerpc/pr71977-1.c - copied unchanged from r244232, trunk/gcc/testsuite/gcc.target/powerpc/pr71977-1.c branches/ibm/gcc-6-branch/gcc/testsuite/gcc.target/powerpc/pr71977-2.c - copied unchanged from r244232, trunk/gcc/testsuite/gcc.target/powerpc/pr71977-2.c Modified: branches/ibm/gcc-6-branch/gcc/ChangeLog.ibm branches/ibm/gcc-6-branch/gcc/config/rs6000/predicates.md branches/ibm/gcc-6-branch/gcc/config/rs6000/rs6000-protos.h branches/ibm/gcc-6-branch/gcc/config/rs6000/rs6000.c branches/ibm/gcc-6-branch/gcc/config/rs6000/rs6000.h branches/ibm/gcc-6-branch/gcc/config/rs6000/rs6000.md branches/ibm/gcc-6-branch/gcc/config/rs6000/vsx.md branches/ibm/gcc-6-branch/gcc/testsuite/ChangeLog.ibm
Fixed in trunk (GCC 7) and the IBM Advance Toolchain branch for GCC 6. See PR 71977 for more details.