gcc is using a divide and multiply where it could be using shift/ands:

    unsigned long f1(unsigned long x, unsigned long n)
    {
        return x % (1UL << n);
    }

produces:

    f1:
        li 9,1
        slw 9,9,4
        divwu 0,3,9
        mullw 0,0,9
        subf 3,0,3
        blr
powerpc-linux-gcc -m32 -O2 -S with patch applied now generates

    f1:
        li 9,1
        slw 9,9,4
        addi 9,9,-1
        and 3,3,9
        blr
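For readers who want to see the identity behind the two instruction sequences above, here is a minimal C sketch. It assumes unsigned operands and a shift count smaller than the width of unsigned long. The names mod_by_div, mod_by_mask and check_mod_forms are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <limits.h>

/* Form from the report: remainder by a variable power of two. */
unsigned long mod_by_div(unsigned long x, unsigned long n)
{
    return x % (1UL << n);
}

/* Masked form: build 2^n, subtract one, AND with x.
   This mirrors the li / slw / addi / and sequence shown above. */
unsigned long mod_by_mask(unsigned long x, unsigned long n)
{
    return x & ((1UL << n) - 1);
}

/* Hypothetical helper: checks that both forms agree for one input.
   The shift is only defined when n is less than the bit width. */
void check_mod_forms(unsigned long x, unsigned long n)
{
    assert(n < sizeof(unsigned long) * CHAR_BIT);
    assert(mod_by_div(x, n) == mod_by_mask(x, n));
}
```

The identity holds because 1UL << n has exactly one bit set, so the remainder is just the low n bits of x. It does not carry over unchanged to signed operands, where the remainder takes the sign of the dividend.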
Subject: Bug number pr26026

A patch for this bug has been added to the patch tracker. The mailing list URL for the patch is http://gcc.gnu.org/ml/gcc-patches/2006-03/msg01468.html
Subject: Bug 26026

Author: amodra
Date: Tue Apr 18 23:45:47 2006
New Revision: 113060

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=113060

Log:
    PR rtl-optimization/26026
    * fold-const.c (fold_binary): Optimize div and mod where the divisor
    is a known power of two shifted left a variable amount.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/fold-const.c
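The ChangeLog entry covers both div and mod with a divisor of the form "power of two shifted left by a variable amount". The sketch below spells out the arithmetic this relies on for unsigned operands. It is not the fold-const.c code. The constant 4 and all names (EXAMPLE_C, EXAMPLE_LOG2_C, div_plain, div_shift, mod_plain, mod_mask, check_forms) are assumptions chosen for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical known power of two and its base-2 logarithm. */
#define EXAMPLE_C      4UL
#define EXAMPLE_LOG2_C 2UL

/* Division by (C << n), written the straightforward way. */
unsigned long div_plain(unsigned long x, unsigned long n)
{
    return x / (EXAMPLE_C << n);
}

/* Same quotient as a single right shift by n + log2(C). */
unsigned long div_shift(unsigned long x, unsigned long n)
{
    return x >> (EXAMPLE_LOG2_C + n);
}

/* Remainder by (C << n), written the straightforward way. */
unsigned long mod_plain(unsigned long x, unsigned long n)
{
    return x % (EXAMPLE_C << n);
}

/* Same remainder as a mask with (C << n) - 1. */
unsigned long mod_mask(unsigned long x, unsigned long n)
{
    return x & ((EXAMPLE_C << n) - 1);
}

/* Hypothetical helper: compares the forms for one input. The total
   shift must stay below the bit width so that C << n is nonzero. */
void check_forms(unsigned long x, unsigned long n)
{
    assert(EXAMPLE_LOG2_C + n < sizeof(unsigned long) * CHAR_BIT);
    assert(div_plain(x, n) == div_shift(x, n));
    assert(mod_plain(x, n) == mod_mask(x, n));
}
```

For example, with x = 100 and n = 1 the divisor is 8, so both division forms give 12 and both remainder forms give 4.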
Patch applied to mainline.
Just wanted to catch any follow-ups should they arise in the future. Thanks all! DaveK.
Subject: Bug 26026 Author: bergner Date: Thu Oct 19 04:05:34 2006 New Revision: 117877 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=117877 Log: * doc/invoke.texi: Add cpu_type power6x (RS/6000 and PowerPC Options): Add -mmfpgpr. * recog.c (store_data_bypass_p): Add support to allow IN_INSN to be a PARALLEL containing sets. Return false when out_pat is not a PARALLEL insn. * config/rs6000/aix52.h (ASM_CPU_SPEC): Add power6x. * config.gcc: Add cpu_type power6x. * configure.ac: Add test for mf{t,f}gpr instructions. (HAVE_AS_MFPGPR): New. * config.in: Regenerate. * configure: Regenerate. * config/rs6000/linux64.h (PROCESSOR_DEFAULT): Default to POWER6. (PROCESSOR_DEFAULT64): Likewise. * config/rs6000/rs6000.md (define_attr "type"): Add insert_dword, shift,trap,var_shift_rotate,cntlz,exts, var_delayed_compare, mffgpr and mftgpr attributes. (define_attr "cpu"): Add power6. Add power6x. Change instruction sequences to use new attributes. (floatsidf2,fix_truncdfsi2): use TARGET_MFPGPR. (fix_truncdfsi2_internal_mfpgpr): New. (floatsidf_ppc64_mfpgpr): New. (floatsidf_ppc64): Added !TARGET_MFPGPR condition. (movdf_hardfloat64_mfpgpr,movdi_internal64_mfpgpr): New. (movdf_hardfloat64): Added !TARGET_MFPGPR condition. (movdi_internal64): Added !TARGET_MFPGPR and related conditions. (fix_truncdfsi2): Use gpc_reg_operand constraint. * config/rs6000/{6xx.md,power4.md,8540.md,603.md,mpc.md, 7xx.md,rios2.md,7450.md,440.md,rios1.md,rs64.md,power5.md,40x.md}: Add descriptions for insert_dword, shift,trap,var_shift_rotate, cntlz,exts and var_delayed_compare. * config/rs6000/rs6000-c.c (rs6000_cpu_cpp_builtins): Define _ARCH_PWR6X, if features enabled. * config/rs6000/rs6000.opt (mmfpgpr): New. * config/rs6000/rs6000.c (rs6000_align_branch_targets): New variable. (cached_can_issue_more): New variable. (processor_costs): Add power6_cost. (rs6000_sched_init): New function. (is_dispatch_slot_restricted): Deleted. (set_to_load_agen): New function. 
(is_load_insn,is_store_insn): New functions. (adjacent_mem_locations): New function. (insn_must_be_first_in_group): New function. (insn_must_be_last_in_group): New function. (rs6000_sched_reorder): New function. (rs6000_sched_reorder2): New function. (TARGET_SCHED_INIT,TARGET_SCHED_REORDER, TARGET_SCHED_REORDER2): Define. (processor_target_table): Use PROCESSOR_POWER6 for power6. Add power6x. Add MASK_MFPGPR for power6x. (POWERPC_MASKS): Add MASK_MFPGPR. (rs6000_override_options): Set rs6000_always_hint to false for power6. Set rs6000_align_branch_targets. Replace rs6000_sched_groups check with rs6000_align_branch_targets. Use PROCESSOR_POWER6. (last_scheduled_insn): New variable. (load_store_pendulum): New variable. (rs6000_variable_issue): Set last_scheduled_insn and cached_can_issue_more. (rs6000_adjust_cost): Add power6 cost adjustments. (rs6000_adjust_priority): Replace is_dispatch_slot_restricted with insn_must_be_first_in_group. Add power6 priority adjustments. (rs6000_issue_rate): Add CPU_POWER6. Add CPU_POWER6X. (insn_terminates_group_p): Use insn_must_be_{first,last}_in_group. * config/rs6000/rs6000.h (processor_type): Add PROCESSOR_POWER6. (TARGET_MFPGPR): New. (SECONDARY_MEMORY_NEEDED): Use TARGET_MFPGPR. (ASM_CPU_SPEC): Add power6x. Pass -mpower5 when cpu=power5. Pass -mpower5 when cpu=power5+. Pass -mpower6 when cpu=power6. (SECONDARY_MEMORY_NEEDED): Added mode!=DFmode and mode!=DImode conditions. * config/rs6000/power6.md: New file. PR rtl-optimization/26026 Backport from mainline 2006-04-19 Alan Modra <amodra@bigpond.net.au> * fold-const.c (fold_binary): Optimize div and mod where the divisor is a known power of two shifted left a variable amount. 
Added: branches/ibm/gcc-4_1-branch/gcc/config/rs6000/power6.md Modified: branches/ibm/gcc-4_1-branch/gcc/ChangeLog branches/ibm/gcc-4_1-branch/gcc/config.gcc branches/ibm/gcc-4_1-branch/gcc/config.in branches/ibm/gcc-4_1-branch/gcc/config/rs6000/40x.md branches/ibm/gcc-4_1-branch/gcc/config/rs6000/440.md branches/ibm/gcc-4_1-branch/gcc/config/rs6000/603.md branches/ibm/gcc-4_1-branch/gcc/config/rs6000/6xx.md branches/ibm/gcc-4_1-branch/gcc/config/rs6000/7450.md branches/ibm/gcc-4_1-branch/gcc/config/rs6000/7xx.md branches/ibm/gcc-4_1-branch/gcc/config/rs6000/8540.md branches/ibm/gcc-4_1-branch/gcc/config/rs6000/aix52.h branches/ibm/gcc-4_1-branch/gcc/config/rs6000/linux64.h branches/ibm/gcc-4_1-branch/gcc/config/rs6000/mpc.md branches/ibm/gcc-4_1-branch/gcc/config/rs6000/power4.md branches/ibm/gcc-4_1-branch/gcc/config/rs6000/power5.md branches/ibm/gcc-4_1-branch/gcc/config/rs6000/rios1.md branches/ibm/gcc-4_1-branch/gcc/config/rs6000/rios2.md branches/ibm/gcc-4_1-branch/gcc/config/rs6000/rs6000-c.c branches/ibm/gcc-4_1-branch/gcc/config/rs6000/rs6000.c branches/ibm/gcc-4_1-branch/gcc/config/rs6000/rs6000.h branches/ibm/gcc-4_1-branch/gcc/config/rs6000/rs6000.md branches/ibm/gcc-4_1-branch/gcc/config/rs6000/rs6000.opt branches/ibm/gcc-4_1-branch/gcc/config/rs6000/rs64.md branches/ibm/gcc-4_1-branch/gcc/configure branches/ibm/gcc-4_1-branch/gcc/configure.ac branches/ibm/gcc-4_1-branch/gcc/doc/invoke.texi branches/ibm/gcc-4_1-branch/gcc/fold-const.c branches/ibm/gcc-4_1-branch/gcc/recog.c
Subject: Bug 26026 Author: bergner Date: Fri Jun 22 17:56:14 2007 New Revision: 125952 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=125952 Log: Reassociation rewrite backport from mainline. 2006-03-22 Jeff Law <law@redhat.com> * loop-unroll.c (analyze_iv_to_split_insn): Handle iv_analyze_result returning false. 2006-04-20 Jeff Law <law@redhat.com> * tree-ssa-reassoc.c (negate_value): Avoid num_imm_uses when checking for zero or one use. (reassociate_bb): Similarly. 2006-04-19 Alan Modra <amodra@bigpond.net.au> PR rtl-optimization/26026 * fold-const.c (fold_binary): Optimize div and mod where the divisor is a known power of two shifted left a variable amount. 2006-01-06 Jeff Law <law@redhat.com> * tree-cfg.c (bsi_replace): Rename final argument from PRESERVE_EH_INFO to UPDATE_EH_INFO. Fix typo in last change (stmt -> orig_stmt). * tree-eh.c (verify_eh_throw_stmt_node): New function. (bsi_remove): Add new argument. Remove EH information if requested. (verify_eh_throw_table_statements): New function. (bsi_remove): Add new argument REMOVE_EH_INFO. All callers updated. * tree-optimize.c (execute_free_cfg_annotations): Verify the EH throw statement table after removing annotations. * except.h (verify_eh_throw_table_statements): Prototype. * tree-flow.h (bsi_remove): Update prototype. * tree-vrp.c (remove_range_assertions): Add new argument to bsi_remove call. * tree-ssa-loop-im.c (move_computations_stmt): Likewise. * tree-complex.c (expand_complex_div_wide): Likewise. * tree-ssa-threadupdate.c (remove_ctrl_stmt_and_useless_edges): Likewise * tree-tailcall.c (eliminate_tailcall): Likewise. * tree-ssa-dse.c (dse_optimize_stmt): Likewise. * tree-ssa-loop-ivopts.c (remove_statement): Likewise. * tree-nrv.c (tree_nrv): Likewise. * tree-vectorizer.c (slpeel_make_loop_iterate_ntimes): Likewise. * tree-if-conv.c (tree_if_convert_cond_expr): Likewise. (combine_blocks): Likewise. * tree-ssa-phiopt.c (replace_phi_edge_with_variable): Likewise. 
* tree-cfgcleanup.c (cleanup_ctrl_expr_graph): Likewise. (cleanup_control_flow): Likewise. (remove_forwarder_block): Likewise. * tree-ssa-pre.c (remove_dead_inserted_code): Likewise. * tree-sra.c (sra_replace): Likewise. * tree-ssa-forwprop.c (forward_propagate_into_cond): Likewise. (forward_propagate_single_use_vars): Likewise. * tree-ssa-dce.c (remove_dead_stmt): Likewise. * tree-inline.c (expand_call_inline): Likewise. * tree-vect-transform.c (vect_transform_loop): Likewise. * tree-outof-ssa.c (rewrite_trees): Likewise. * tree-cfg.c (make_goto_expr_edges): Likewise. (cleanup_dead_labels): Likewise. (tree_merge_blocks, remove_bb, disband_implicit_edges): Likewise. (bsi_move_before, bsi_move_after): Likewise. (bsi_move_to_bb_end, try_redirect_by_replacing_jump): Likewise (tree_redirect_edge_and_branch, tree_split_block): Likewise. 2006-01-04 Jeff Law <law@redhat.com> * tree-cfg.c (bsi_replace): Remove the original statement from the EH throw statement table. 2005-12-19 Roger Sayle <roger@eyesopen.com> * combine.c (try_combine): Improve splitting of binary operators by taking advantage of reassociative transformations. 2005-12-12 Jeff Law <law@redhat.com> * tree-ssa-dom.c (simplify_rhs_and_lookup_avail_expr): Remove reassociation code. * passes.c (init_optimization_passes): Run reassociation again after loop optimizations. 2005-12-12 Daniel Berlin <dberlin@dberlin.org> * tree-ssa-dom.c (thread_across_edge): Canonicalize condition if necessary. (optimize_stmt): Ditto. (canonicalize_comparison): New function. * tree-ssa-operands.c (swap_tree_operands): Make external. (get_expr_operands): Stop auto-canonicalization. * tree-ssa-reassoc.c: Rewrite. (init_optimization_passes): * tree-flow.h (swap_tree_operands): Prototype. * Makefile.in (tree-ssa-reassoc.o): Update dependencies. * gcc.dg/tree-ssa/ssa-pre-2.c: Update due to reassociation changes. * gcc.dg/tree-ssa/reassoc-1.c: Likewise. * gcc.dg/tree-ssa/reassoc-2.c: Likewise. * gcc.dg/tree-ssa/reassoc-3.c: Likewise. 
* gcc.dg/tree-ssa/reassoc-4.c: Likewise. * gcc.dg/tree-ssa/reassoc-5.c: New. * gcc.dg/tree-ssa/reassoc-6.c: New. * gcc.dg/tree-ssa/reassoc-7.c: New. * gcc.dg/tree-ssa/reassoc-8.c: New. * gcc.dg/tree-ssa/reassoc-9.c: New. * gcc.dg/tree-ssa/reassoc-10.c: New. * gcc.dg/tree-ssa/reassoc-11.c: New. Added: branches/ibm/gcc-4_1-branch/gcc/testsuite/gcc.dg/tree-ssa/reassoc-10.c branches/ibm/gcc-4_1-branch/gcc/testsuite/gcc.dg/tree-ssa/reassoc-11.c branches/ibm/gcc-4_1-branch/gcc/testsuite/gcc.dg/tree-ssa/reassoc-5.c branches/ibm/gcc-4_1-branch/gcc/testsuite/gcc.dg/tree-ssa/reassoc-6.c branches/ibm/gcc-4_1-branch/gcc/testsuite/gcc.dg/tree-ssa/reassoc-7.c branches/ibm/gcc-4_1-branch/gcc/testsuite/gcc.dg/tree-ssa/reassoc-8.c branches/ibm/gcc-4_1-branch/gcc/testsuite/gcc.dg/tree-ssa/reassoc-9.c Modified: branches/ibm/gcc-4_1-branch/gcc/ChangeLog branches/ibm/gcc-4_1-branch/gcc/Makefile.in branches/ibm/gcc-4_1-branch/gcc/combine.c branches/ibm/gcc-4_1-branch/gcc/except.h branches/ibm/gcc-4_1-branch/gcc/loop-unroll.c branches/ibm/gcc-4_1-branch/gcc/passes.c branches/ibm/gcc-4_1-branch/gcc/testsuite/gcc.dg/tree-ssa/reassoc-1.c branches/ibm/gcc-4_1-branch/gcc/testsuite/gcc.dg/tree-ssa/reassoc-2.c branches/ibm/gcc-4_1-branch/gcc/testsuite/gcc.dg/tree-ssa/reassoc-3.c branches/ibm/gcc-4_1-branch/gcc/testsuite/gcc.dg/tree-ssa/reassoc-4.c branches/ibm/gcc-4_1-branch/gcc/testsuite/gcc.dg/tree-ssa/ssa-pre-2.c branches/ibm/gcc-4_1-branch/gcc/tree-cfg.c branches/ibm/gcc-4_1-branch/gcc/tree-cfgcleanup.c branches/ibm/gcc-4_1-branch/gcc/tree-complex.c branches/ibm/gcc-4_1-branch/gcc/tree-eh.c branches/ibm/gcc-4_1-branch/gcc/tree-flow.h branches/ibm/gcc-4_1-branch/gcc/tree-if-conv.c branches/ibm/gcc-4_1-branch/gcc/tree-inline.c branches/ibm/gcc-4_1-branch/gcc/tree-nrv.c branches/ibm/gcc-4_1-branch/gcc/tree-optimize.c branches/ibm/gcc-4_1-branch/gcc/tree-outof-ssa.c branches/ibm/gcc-4_1-branch/gcc/tree-sra.c branches/ibm/gcc-4_1-branch/gcc/tree-ssa-dce.c 
branches/ibm/gcc-4_1-branch/gcc/tree-ssa-dom.c branches/ibm/gcc-4_1-branch/gcc/tree-ssa-dse.c branches/ibm/gcc-4_1-branch/gcc/tree-ssa-forwprop.c branches/ibm/gcc-4_1-branch/gcc/tree-ssa-loop-im.c branches/ibm/gcc-4_1-branch/gcc/tree-ssa-loop-ivopts.c branches/ibm/gcc-4_1-branch/gcc/tree-ssa-operands.c branches/ibm/gcc-4_1-branch/gcc/tree-ssa-phiopt.c branches/ibm/gcc-4_1-branch/gcc/tree-ssa-pre.c branches/ibm/gcc-4_1-branch/gcc/tree-ssa-reassoc.c branches/ibm/gcc-4_1-branch/gcc/tree-ssa-threadupdate.c branches/ibm/gcc-4_1-branch/gcc/tree-tailcall.c branches/ibm/gcc-4_1-branch/gcc/tree-vect-transform.c branches/ibm/gcc-4_1-branch/gcc/tree-vectorizer.c branches/ibm/gcc-4_1-branch/gcc/tree-vrp.c