Bug 26026 - power of 2 mod missing optimisation
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization
Version: 4.2.0
Importance: P3 enhancement
Target Milestone: 4.2.0
Assignee: Alan Modra
URL: http://gcc.gnu.org/ml/gcc-patches/200...
Keywords: missed-optimization, patch
Depends on:
Blocks: spec
Reported: 2006-01-30 07:02 UTC by Anton Blanchard
Modified: 2006-04-21 12:10 UTC

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2006-01-31 02:45:41


Description Anton Blanchard 2006-01-30 07:02:57 UTC
gcc uses a divide, multiply and subtract where it could use a shift and an AND:

unsigned long f1(unsigned long x, unsigned long n)
{
        return x % (1UL << n);
}

produces:

f1:
        li 9,1
        slw 9,9,4
        divwu 0,3,9
        mullw 0,0,9
        subf 3,0,3
        blr
Comment 1 Alan Modra 2006-02-03 12:37:05 UTC
With the patch applied, powerpc-linux-gcc -m32 -O2 -S now generates:
f1:
        li 9,1
        slw 9,9,4
        addi 9,9,-1
        and 3,3,9
        blr
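
The new sequence relies on the identity x % 2^n == x & (2^n - 1) for unsigned x. A minimal sketch of that equivalence in C follows; the names f1_mod, f1_mask and check_f1 are hypothetical and are not part of the patch, and the shift count is assumed to be smaller than the width of unsigned long, since larger counts are undefined in C.

```c
#include <assert.h>
#include <limits.h>

/* Original form from the report: remainder by a variable power of two. */
unsigned long f1_mod(unsigned long x, unsigned long n)
{
        return x % (1UL << n);
}

/* Equivalent form: keep only the low n bits of x. */
unsigned long f1_mask(unsigned long x, unsigned long n)
{
        return x & ((1UL << n) - 1);
}

/* Hypothetical self-check: compare both forms for every valid shift count. */
void check_f1(unsigned long x)
{
        unsigned long bits = sizeof(unsigned long) * CHAR_BIT;
        unsigned long n;

        for (n = 0; n < bits; n++)
                assert(f1_mod(x, n) == f1_mask(x, n));
}
```

Calling check_f1 on a few values (for example 0, 12345 and ~0UL) exercises every shift count from 0 up to the type width minus one.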
Comment 2 patchapp@dberlin.org 2006-03-24 03:38:46 UTC
Subject: Bug number pr26026

A patch for this bug has been added to the patch tracker.
The mailing list url for the patch is http://gcc.gnu.org/ml/gcc-patches/2006-03/msg01468.html
Comment 3 Alan Modra 2006-04-18 23:45:55 UTC
Subject: Bug 26026

Author: amodra
Date: Tue Apr 18 23:45:47 2006
New Revision: 113060

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=113060
Log:
	PR rtl-optimization/26026
	* fold-const.c (fold_binary): Optimize div and mod where the divisor
	is a known power of two shifted left a variable amount.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/fold-const.c
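
The log entry covers both div and mod with a divisor of the form (power of two) << n. A rough C sketch of what such a rewrite amounts to for unsigned operands is below. It is an illustration only, not the fold-const.c code: POW2, LOG2_POW2 and all function names are hypothetical, and the shift is assumed not to overflow the type.

```c
#include <assert.h>
#include <limits.h>

/* Illustrative constant divisor base: 4 == 1 << 2 (both names hypothetical). */
#define POW2      4UL
#define LOG2_POW2 2UL

/* Division by (POW2 << n), and the shift it can be reduced to. */
unsigned long div_orig(unsigned long x, unsigned long n)
{
        return x / (POW2 << n);
}

unsigned long div_shift(unsigned long x, unsigned long n)
{
        return x >> (LOG2_POW2 + n);
}

/* Remainder by (POW2 << n), and the mask it can be reduced to. */
unsigned long mod_orig(unsigned long x, unsigned long n)
{
        return x % (POW2 << n);
}

unsigned long mod_mask(unsigned long x, unsigned long n)
{
        return x & ((POW2 << n) - 1);
}

/* Hypothetical self-check over every n for which POW2 << n does not overflow. */
void check_pow2_shift(unsigned long x)
{
        unsigned long bits = sizeof(unsigned long) * CHAR_BIT;
        unsigned long n;

        for (n = 0; LOG2_POW2 + n < bits; n++) {
                assert(div_orig(x, n) == div_shift(x, n));
                assert(mod_orig(x, n) == mod_mask(x, n));
        }
}
```

The sketch is restricted to unsigned operands because signed division rounds toward zero while an arithmetic right shift rounds toward negative infinity, so the two differ for negative dividends.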

Comment 4 Alan Modra 2006-04-18 23:46:37 UTC
Patch applied to mainline.
Comment 5 Dave Korn 2006-04-21 12:10:07 UTC
Just wanted to catch any follow-ups should they arise in future.  Thanks all!
  DaveK.
Comment 6 Peter Bergner 2006-10-19 04:05:47 UTC
Subject: Bug 26026

Author: bergner
Date: Thu Oct 19 04:05:34 2006
New Revision: 117877

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=117877
Log:
	* doc/invoke.texi: Add cpu_type power6x
	(RS/6000 and PowerPC Options): Add -mmfpgpr.
	* recog.c (store_data_bypass_p): Add support to allow IN_INSN to
	be a PARALLEL containing sets.	Return false when out_pat is not
	a PARALLEL insn.
	* config/rs6000/aix52.h (ASM_CPU_SPEC): Add power6x.
	* config.gcc: Add cpu_type power6x.
	* configure.ac: Add test for mf{t,f}gpr instructions.
	(HAVE_AS_MFPGPR): New.
	* config.in: Regenerate.
	* configure: Regenerate.
	* config/rs6000/linux64.h (PROCESSOR_DEFAULT): Default to POWER6.
	(PROCESSOR_DEFAULT64): Likewise.
	* config/rs6000/rs6000.md (define_attr "type"): Add insert_dword,
	shift,trap,var_shift_rotate,cntlz,exts, var_delayed_compare, mffgpr
	and mftgpr attributes.
	(define_attr "cpu"): Add power6. Add power6x.
	Change instruction sequences to use new attributes.
	(floatsidf2,fix_truncdfsi2): use TARGET_MFPGPR.
	(fix_truncdfsi2_internal_mfpgpr): New.
	(floatsidf_ppc64_mfpgpr): New.
	(floatsidf_ppc64): Added !TARGET_MFPGPR condition.
	(movdf_hardfloat64_mfpgpr,movdi_internal64_mfpgpr): New.
	(movdf_hardfloat64): Added !TARGET_MFPGPR condition.
	(movdi_internal64): Added !TARGET_MFPGPR and related conditions.
	(fix_truncdfsi2): Use gpc_reg_operand constraint.
	* config/rs6000/{6xx.md,power4.md,8540.md,603.md,mpc.md,
	7xx.md,rios2.md,7450.md,440.md,rios1.md,rs64.md,power5.md,40x.md}:
	Add descriptions for insert_dword, shift,trap,var_shift_rotate,
	cntlz,exts and var_delayed_compare.
	* config/rs6000/rs6000-c.c (rs6000_cpu_cpp_builtins): Define
	_ARCH_PWR6X, if features enabled.
	* config/rs6000/rs6000.opt (mmfpgpr): New.
	* config/rs6000/rs6000.c (rs6000_align_branch_targets): New variable.
	(cached_can_issue_more): New variable.
	(processor_costs): Add power6_cost.
	(rs6000_sched_init): New function.
	(is_dispatch_slot_restricted): Deleted.
	(set_to_load_agen): New function.
	(is_load_insn,is_store_insn): New functions.
	(adjacent_mem_locations): New function.
	(insn_must_be_first_in_group): New function.
	(insn_must_be_last_in_group): New function.
	(rs6000_sched_reorder): New function.
	(rs6000_sched_reorder2): New function.
	(TARGET_SCHED_INIT,TARGET_SCHED_REORDER,
	TARGET_SCHED_REORDER2): Define.
	(processor_target_table): Use PROCESSOR_POWER6 for power6.
	Add power6x. Add MASK_MFPGPR for power6x.
	(POWERPC_MASKS): Add MASK_MFPGPR.
	(rs6000_override_options): Set rs6000_always_hint to false
	for power6.  Set rs6000_align_branch_targets. Replace
	rs6000_sched_groups check with rs6000_align_branch_targets.
	Use PROCESSOR_POWER6.
	(last_scheduled_insn): New variable.
	(load_store_pendulum): New variable.
	(rs6000_variable_issue): Set last_scheduled_insn and
	cached_can_issue_more.
	(rs6000_adjust_cost): Add power6 cost adjustments.
	(rs6000_adjust_priority): Replace is_dispatch_slot_restricted
	with insn_must_be_first_in_group. Add power6 priority adjustments.
	(rs6000_issue_rate): Add CPU_POWER6. Add CPU_POWER6X.
	(insn_terminates_group_p): Use insn_must_be_{first,last}_in_group.
	* config/rs6000/rs6000.h (processor_type): Add PROCESSOR_POWER6.
	(TARGET_MFPGPR): New.
	(SECONDARY_MEMORY_NEEDED): Use TARGET_MFPGPR.
	(ASM_CPU_SPEC): Add power6x. Pass -mpower5 when cpu=power5.
	Pass -mpower5 when cpu=power5+.  Pass -mpower6 when cpu=power6.
	(SECONDARY_MEMORY_NEEDED): Added mode!=DFmode and mode!=DImode
	conditions.
	* config/rs6000/power6.md: New file.

	PR rtl-optimization/26026
	Backport from mainline
	2006-04-19  Alan Modra	<amodra@bigpond.net.au>
	* fold-const.c (fold_binary): Optimize div and mod where the divisor
	is a known power of two shifted left a variable amount.

Added:
    branches/ibm/gcc-4_1-branch/gcc/config/rs6000/power6.md
Modified:
    branches/ibm/gcc-4_1-branch/gcc/ChangeLog
    branches/ibm/gcc-4_1-branch/gcc/config.gcc
    branches/ibm/gcc-4_1-branch/gcc/config.in
    branches/ibm/gcc-4_1-branch/gcc/config/rs6000/40x.md
    branches/ibm/gcc-4_1-branch/gcc/config/rs6000/440.md
    branches/ibm/gcc-4_1-branch/gcc/config/rs6000/603.md
    branches/ibm/gcc-4_1-branch/gcc/config/rs6000/6xx.md
    branches/ibm/gcc-4_1-branch/gcc/config/rs6000/7450.md
    branches/ibm/gcc-4_1-branch/gcc/config/rs6000/7xx.md
    branches/ibm/gcc-4_1-branch/gcc/config/rs6000/8540.md
    branches/ibm/gcc-4_1-branch/gcc/config/rs6000/aix52.h
    branches/ibm/gcc-4_1-branch/gcc/config/rs6000/linux64.h
    branches/ibm/gcc-4_1-branch/gcc/config/rs6000/mpc.md
    branches/ibm/gcc-4_1-branch/gcc/config/rs6000/power4.md
    branches/ibm/gcc-4_1-branch/gcc/config/rs6000/power5.md
    branches/ibm/gcc-4_1-branch/gcc/config/rs6000/rios1.md
    branches/ibm/gcc-4_1-branch/gcc/config/rs6000/rios2.md
    branches/ibm/gcc-4_1-branch/gcc/config/rs6000/rs6000-c.c
    branches/ibm/gcc-4_1-branch/gcc/config/rs6000/rs6000.c
    branches/ibm/gcc-4_1-branch/gcc/config/rs6000/rs6000.h
    branches/ibm/gcc-4_1-branch/gcc/config/rs6000/rs6000.md
    branches/ibm/gcc-4_1-branch/gcc/config/rs6000/rs6000.opt
    branches/ibm/gcc-4_1-branch/gcc/config/rs6000/rs64.md
    branches/ibm/gcc-4_1-branch/gcc/configure
    branches/ibm/gcc-4_1-branch/gcc/configure.ac
    branches/ibm/gcc-4_1-branch/gcc/doc/invoke.texi
    branches/ibm/gcc-4_1-branch/gcc/fold-const.c
    branches/ibm/gcc-4_1-branch/gcc/recog.c

Comment 7 Peter Bergner 2007-06-22 17:56:33 UTC
Subject: Bug 26026

Author: bergner
Date: Fri Jun 22 17:56:14 2007
New Revision: 125952

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=125952
Log:
Reassociation rewrite backport from mainline.

	2006-03-22  Jeff Law  <law@redhat.com>
	* loop-unroll.c (analyze_iv_to_split_insn): Handle
	iv_analyze_result returning false.

	2006-04-20  Jeff Law  <law@redhat.com>
	* tree-ssa-reassoc.c (negate_value): Avoid num_imm_uses when
	checking for zero or one use.
	(reassociate_bb): Similarly.

	2006-04-19  Alan Modra  <amodra@bigpond.net.au>
	PR rtl-optimization/26026
	* fold-const.c (fold_binary): Optimize div and mod where the divisor
	is a known power of two shifted left a variable amount.

	2006-01-06  Jeff Law  <law@redhat.com>
	* tree-cfg.c (bsi_replace): Rename final argument from
	PRESERVE_EH_INFO to UPDATE_EH_INFO.  Fix typo in last
	change (stmt -> orig_stmt).
	* tree-eh.c (verify_eh_throw_stmt_node): New function.
	(bsi_remove): Add new argument.  Remove EH information
	if requested.
	(verify_eh_throw_table_statements): New function.
	(bsi_remove): Add new argument REMOVE_EH_INFO.  All callers
	updated.
	* tree-optimize.c (execute_free_cfg_annotations): Verify
	the EH throw statement table after removing annotations.
	* except.h (verify_eh_throw_table_statements): Prototype.
	* tree-flow.h (bsi_remove): Update prototype.
	* tree-vrp.c (remove_range_assertions): Add new argument to
	bsi_remove call.
	* tree-ssa-loop-im.c (move_computations_stmt): Likewise.
	* tree-complex.c (expand_complex_div_wide): Likewise.
	* tree-ssa-threadupdate.c (remove_ctrl_stmt_and_useless_edges): Likewise
	* tree-tailcall.c (eliminate_tailcall): Likewise.
	* tree-ssa-dse.c (dse_optimize_stmt): Likewise.
	* tree-ssa-loop-ivopts.c (remove_statement): Likewise.
	* tree-nrv.c (tree_nrv): Likewise.
	* tree-vectorizer.c (slpeel_make_loop_iterate_ntimes): Likewise.
	* tree-if-conv.c (tree_if_convert_cond_expr): Likewise.
	(combine_blocks): Likewise.
	* tree-ssa-phiopt.c (replace_phi_edge_with_variable): Likewise.
	* tree-cfgcleanup.c (cleanup_ctrl_expr_graph): Likewise.
	(cleanup_control_flow): Likewise.
	(remove_forwarder_block): Likewise.
	* tree-ssa-pre.c (remove_dead_inserted_code): Likewise.
	* tree-sra.c (sra_replace): Likewise.
	* tree-ssa-forwprop.c (forward_propagate_into_cond): Likewise.
	(forward_propagate_single_use_vars): Likewise.
	* tree-ssa-dce.c (remove_dead_stmt): Likewise.
	* tree-inline.c (expand_call_inline): Likewise.
	* tree-vect-transform.c (vect_transform_loop): Likewise.
	* tree-outof-ssa.c (rewrite_trees): Likewise.
	* tree-cfg.c (make_goto_expr_edges): Likewise.
	(cleanup_dead_labels): Likewise.
	(tree_merge_blocks, remove_bb, disband_implicit_edges): Likewise.
	(bsi_move_before, bsi_move_after): Likewise.
	(bsi_move_to_bb_end, try_redirect_by_replacing_jump): Likewise
	(tree_redirect_edge_and_branch, tree_split_block): Likewise.

	2006-01-04  Jeff Law  <law@redhat.com>
	* tree-cfg.c (bsi_replace): Remove the original statement
	from the EH throw statement table.

	2005-12-19  Roger Sayle  <roger@eyesopen.com>
	* combine.c (try_combine): Improve splitting of binary operators
	by taking advantage of reassociative transformations.

	2005-12-12  Jeff Law  <law@redhat.com>
	* tree-ssa-dom.c (simplify_rhs_and_lookup_avail_expr): Remove
	reassociation code.
	* passes.c (init_optimization_passes): Run reassociation again
	after loop optimizations.

	2005-12-12  Daniel Berlin  <dberlin@dberlin.org>
	* tree-ssa-dom.c (thread_across_edge): Canonicalize condition
	if necessary.
	(optimize_stmt): Ditto.
	(canonicalize_comparison): New function.
	* tree-ssa-operands.c (swap_tree_operands): Make external.
	(get_expr_operands): Stop auto-canonicalization.
	* tree-ssa-reassoc.c: Rewrite.
	(init_optimization_passes): 
	* tree-flow.h (swap_tree_operands): Prototype.
	* Makefile.in (tree-ssa-reassoc.o): Update dependencies.

	* gcc.dg/tree-ssa/ssa-pre-2.c: Update due to reassociation changes.
	* gcc.dg/tree-ssa/reassoc-1.c: Likewise.
	* gcc.dg/tree-ssa/reassoc-2.c: Likewise.
	* gcc.dg/tree-ssa/reassoc-3.c: Likewise.
	* gcc.dg/tree-ssa/reassoc-4.c: Likewise.
	* gcc.dg/tree-ssa/reassoc-5.c: New.
	* gcc.dg/tree-ssa/reassoc-6.c: New.
	* gcc.dg/tree-ssa/reassoc-7.c: New.
	* gcc.dg/tree-ssa/reassoc-8.c: New.
	* gcc.dg/tree-ssa/reassoc-9.c: New.
	* gcc.dg/tree-ssa/reassoc-10.c: New.
	* gcc.dg/tree-ssa/reassoc-11.c: New.

Added:
    branches/ibm/gcc-4_1-branch/gcc/testsuite/gcc.dg/tree-ssa/reassoc-10.c
    branches/ibm/gcc-4_1-branch/gcc/testsuite/gcc.dg/tree-ssa/reassoc-11.c
    branches/ibm/gcc-4_1-branch/gcc/testsuite/gcc.dg/tree-ssa/reassoc-5.c
    branches/ibm/gcc-4_1-branch/gcc/testsuite/gcc.dg/tree-ssa/reassoc-6.c
    branches/ibm/gcc-4_1-branch/gcc/testsuite/gcc.dg/tree-ssa/reassoc-7.c
    branches/ibm/gcc-4_1-branch/gcc/testsuite/gcc.dg/tree-ssa/reassoc-8.c
    branches/ibm/gcc-4_1-branch/gcc/testsuite/gcc.dg/tree-ssa/reassoc-9.c
Modified:
    branches/ibm/gcc-4_1-branch/gcc/ChangeLog
    branches/ibm/gcc-4_1-branch/gcc/Makefile.in
    branches/ibm/gcc-4_1-branch/gcc/combine.c
    branches/ibm/gcc-4_1-branch/gcc/except.h
    branches/ibm/gcc-4_1-branch/gcc/loop-unroll.c
    branches/ibm/gcc-4_1-branch/gcc/passes.c
    branches/ibm/gcc-4_1-branch/gcc/testsuite/gcc.dg/tree-ssa/reassoc-1.c
    branches/ibm/gcc-4_1-branch/gcc/testsuite/gcc.dg/tree-ssa/reassoc-2.c
    branches/ibm/gcc-4_1-branch/gcc/testsuite/gcc.dg/tree-ssa/reassoc-3.c
    branches/ibm/gcc-4_1-branch/gcc/testsuite/gcc.dg/tree-ssa/reassoc-4.c
    branches/ibm/gcc-4_1-branch/gcc/testsuite/gcc.dg/tree-ssa/ssa-pre-2.c
    branches/ibm/gcc-4_1-branch/gcc/tree-cfg.c
    branches/ibm/gcc-4_1-branch/gcc/tree-cfgcleanup.c
    branches/ibm/gcc-4_1-branch/gcc/tree-complex.c
    branches/ibm/gcc-4_1-branch/gcc/tree-eh.c
    branches/ibm/gcc-4_1-branch/gcc/tree-flow.h
    branches/ibm/gcc-4_1-branch/gcc/tree-if-conv.c
    branches/ibm/gcc-4_1-branch/gcc/tree-inline.c
    branches/ibm/gcc-4_1-branch/gcc/tree-nrv.c
    branches/ibm/gcc-4_1-branch/gcc/tree-optimize.c
    branches/ibm/gcc-4_1-branch/gcc/tree-outof-ssa.c
    branches/ibm/gcc-4_1-branch/gcc/tree-sra.c
    branches/ibm/gcc-4_1-branch/gcc/tree-ssa-dce.c
    branches/ibm/gcc-4_1-branch/gcc/tree-ssa-dom.c
    branches/ibm/gcc-4_1-branch/gcc/tree-ssa-dse.c
    branches/ibm/gcc-4_1-branch/gcc/tree-ssa-forwprop.c
    branches/ibm/gcc-4_1-branch/gcc/tree-ssa-loop-im.c
    branches/ibm/gcc-4_1-branch/gcc/tree-ssa-loop-ivopts.c
    branches/ibm/gcc-4_1-branch/gcc/tree-ssa-operands.c
    branches/ibm/gcc-4_1-branch/gcc/tree-ssa-phiopt.c
    branches/ibm/gcc-4_1-branch/gcc/tree-ssa-pre.c
    branches/ibm/gcc-4_1-branch/gcc/tree-ssa-reassoc.c
    branches/ibm/gcc-4_1-branch/gcc/tree-ssa-threadupdate.c
    branches/ibm/gcc-4_1-branch/gcc/tree-tailcall.c
    branches/ibm/gcc-4_1-branch/gcc/tree-vect-transform.c
    branches/ibm/gcc-4_1-branch/gcc/tree-vectorizer.c
    branches/ibm/gcc-4_1-branch/gcc/tree-vrp.c