Bug 43920 - Choosing conditional execution over conditional branches for code size in some cases.
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 4.7.0
Importance: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2010-04-28 08:18 UTC by Carrot
Modified: 2011-09-19 06:17 UTC

See Also:
Host: i686-linux
Target: arm-eabi
Build: i686-linux
Known to work:
Known to fail: 4.5.0
Last reconfirmed: 2010-04-28 09:55:09


Attachments
test case (206 bytes, text/x-csrc)
2010-04-28 08:18 UTC, Carrot

Description Carrot 2010-04-28 08:18:00 UTC
When the attached source code is compiled with -march=armv7-a -mthumb -Os, gcc generates the following instructions for "if (start == -1 || end == -1)":

        ...
        cmp     r4, #-1
        ite     ne
        movne   r3, #0
        moveq   r3, #1
        cmp     r0, #-1
        it      eq
        orreq   r3, r3, #1
        cbnz    r3, .L4
        ...

A simplified code sequence is:

        ...
        cmp r4, #-1
        bne .L4
        cmp r0, #-1
        bne .L4
        ...

The if statement is trivially translated into the following gimple statements:

  D.2530 = start == -1;
  D.2531 = end == -1;
  D.2532 = D.2530 || D.2531;
  if (D.2532 != 0) goto <D.2533>; else goto <D.2534>;

These statements are then expanded into rtl insns without further optimization.
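
For illustration, the two shapes of the condition can be written out in C. This is only a sketch: the attached test case is not reproduced here, so the function names and the sample values are hypothetical. The first function mirrors the gimple above (both comparisons evaluated, results OR-ed, a single branch); the second uses one compare-and-branch per condition.

```c
#include <assert.h>

/* Hypothetical stand-ins; not taken from the attached test case. */

/* Flag form, mirroring the gimple: evaluate both comparisons,
   combine the results, then branch once on the combined flag. */
static int taken_flag_form(int start, int end)
{
    int d0 = (start == -1);
    int d1 = (end == -1);
    int d2 = d0 | d1;
    if (d2 != 0)
        return 1;
    return 0;
}

/* Branch form: one compare-and-branch per condition, as in a
   short-circuit lowering of (start == -1 || end == -1). */
static int taken_branch_form(int start, int end)
{
    if (start == -1)
        goto taken;
    if (end == -1)
        goto taken;
    return 0;
taken:
    return 1;
}

/* Check that both forms agree on a few sample inputs. */
static void check_forms_agree(void)
{
    static const int samples[] = { -2, -1, 0, 1 };
    const int n = (int)(sizeof samples / sizeof samples[0]);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            assert(taken_flag_form(samples[i], samples[j])
                   == taken_branch_form(samples[i], samples[j]));
}
```

Both functions compute the same result; the difference is only in how much work is done before the branch is decided.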
Comment 1 Carrot 2010-04-28 08:18:39 UTC
Created attachment 20504
test case
Comment 2 Carrot 2010-04-28 09:01:16 UTC
The expected sequence should be:

       ...
       cmp r4, #-1
       beq  .L4
       cmp r0, #-1
       beq  .L4
       ...

When the options are changed to -march=armv5te -mthumb -Os, gcc generates the expected code. In that case gcc still generates the same gimple code, but expands it to different rtl insns. So we should use the same expand logic as thumb1 in this case.

Comment 3 Ramana Radhakrishnan 2010-04-28 09:55:09 UTC
Confirmed, though it isn't as simple as an "expand"-time problem alone.
Comment 4 Carrot 2010-04-29 02:23:53 UTC
This is good not only for code size but also for performance. On any path to any successor block, the same number of taken branches is executed, but fewer ALU instructions are executed.

It may be difficult to calculate the tradeoff for an arbitrary condition expression. But for a condition composed of a series of || or && operations, such as
    if (A || B || C || ...)
or
    if (A && B && C && ...)
this simplification is always beneficial.
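
A rough sketch of what the branch-per-condition lowering looks like for such chains, written in C with explicit gotos. The function names are hypothetical and the three-operand length is arbitrary; this only illustrates the shape, not what gcc emits.

```c
#include <assert.h>
#include <stdbool.h>

/* if (a || b || c): each true condition jumps straight to the
   "then" part; falling through all tests reaches the "else" part. */
static int or_chain(bool a, bool b, bool c)
{
    if (a) goto then_part;
    if (b) goto then_part;
    if (c) goto then_part;
    return 0;               /* else part */
then_part:
    return 1;
}

/* if (a && b && c): each false condition jumps straight to the
   "else" part; falling through all tests reaches the "then" part. */
static int and_chain(bool a, bool b, bool c)
{
    if (!a) goto else_part;
    if (!b) goto else_part;
    if (!c) goto else_part;
    return 1;               /* then part */
else_part:
    return 0;
}

/* Exhaustively compare against the plain C operators. */
static void check_chains(void)
{
    for (int m = 0; m < 8; m++) {
        bool a = (m & 1) != 0, b = (m & 2) != 0, c = (m & 4) != 0;
        assert(or_chain(a, b, c) == (a || b || c));
        assert(and_chain(a, b, c) == (a && b && c));
    }
}
```

In both chains no intermediate boolean value is materialized in a register; every condition costs one compare and one conditional branch.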
Comment 5 Tom de Vries 2011-04-05 10:04:48 UTC
Author: vries
Date: Tue Apr  5 10:04:44 2011
New Revision: 171976

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=171976
Log:
2011-04-05  Tom de Vries  <tom@codesourcery.com>

	PR target/43920
	* cfgcleanup.c (flow_find_cross_jump): Don't count USE or CLOBBER as
	insn.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cfgcleanup.c
Comment 6 Tom de Vries 2011-04-05 10:12:17 UTC
Author: vries
Date: Tue Apr  5 10:12:14 2011
New Revision: 171977

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=171977
Log:
2011-04-05  Tom de Vries  <tom@codesourcery.com>

	PR target/43920
	* function.c (emit_use_return_register_into_block): New function.
	(thread_prologue_and_epilogue_insns): Use
	emit_use_return_register_into_block.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/function.c
Comment 7 Tom de Vries 2011-04-05 10:33:16 UTC
Author: vries
Date: Tue Apr  5 10:33:13 2011
New Revision: 171978

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=171978
Log:
2011-04-05  Tom de Vries  <tom@codesourcery.com>

	PR target/43920
	* config/arm/arm.h (BRANCH_COST): Set to 1 for Thumb-2 when optimizing
	for size.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/arm/arm.h
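
As a rough model of why this setting matters: a lower branch cost makes it less attractive to combine conditions into a flag value before branching. The C below is a hypothetical sketch, not GCC source. Only the value 1 for Thumb-2 when optimizing for size comes from the log above; the other cost value and the threshold of 2 are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not GCC source. */
enum {
    THUMB2_SIZE_COST = 1,   /* from the log: Thumb-2 when optimizing for size */
    PLACEHOLDER_COST = 4    /* assumption: stand-in for every other case */
};

/* Model of a target's branch cost query. */
static int model_branch_cost(bool thumb2, bool optimize_for_size)
{
    if (thumb2 && optimize_for_size)
        return THUMB2_SIZE_COST;
    return PLACEHOLDER_COST;
}

/* Assumed heuristic: prefer the flag-combining form only when
   branches are considered expensive (threshold 2 is an assumption). */
static bool model_prefers_flag_form(int branch_cost)
{
    return branch_cost >= 2;
}
```

Under these assumptions, model_prefers_flag_form(model_branch_cost(true, true)) is false, which corresponds to keeping one compare-and-branch per condition for Thumb-2 at -Os.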
Comment 8 Tom de Vries 2011-04-05 13:01:55 UTC
Author: vries
Date: Tue Apr  5 13:01:50 2011
New Revision: 171986

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=171986
Log:
2011-04-05  Tom de Vries  <tom@codesourcery.com>

	PR target/43920
	* gcc.target/arm/pr43920-1.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/arm/pr43920-1.c
Modified:
    trunk/gcc/testsuite/ChangeLog
Comment 9 Ramana Radhakrishnan 2011-04-06 09:41:10 UTC
Author: ramana
Date: Wed Apr  6 09:41:07 2011
New Revision: 172031

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=172031
Log:
Fix commit for PR target/43920

Modified:
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.target/arm/pr43920-1.c
Comment 10 Tom de Vries 2011-04-07 08:10:38 UTC
Author: vries
Date: Thu Apr  7 08:10:34 2011
New Revision: 172090

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=172090
Log:
2011-04-07  Tom de Vries  <tom@codesourcery.com>

	PR target/43920
	* cfgcleanup.c (equal_different_set_p, can_replace_by, merge_dir): New
	function.
	(old_insns_match_p): Change return type.  Replace return false/true with
	return dir_none/dir_both.  Use can_replace_by.
	(flow_find_cross_jump): Add dir_p parameter.  Init replacement direction
	from dir_p.  Register replacement direction in dir, last_dir and
	afterlast_dir.	Handle new return type of old_insns_match_p using
	merge_dir.  Return replacement direction in dir_p.
	(flow_find_head_matching_sequence, outgoing_edges_match): Handle new
	return type of old_insns_match_p.
	(try_crossjump_to_edge): Add argument to call to flow_find_cross_jump.
	* ifcvt.c ( cond_exec_process_if_block): Add argument to call to
	flow_find_cross_jump.
	* basic-block.h (enum replace_direction): New type.
	(flow_find_cross_jump): Add parameter to declaration.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/basic-block.h
    trunk/gcc/cfgcleanup.c
    trunk/gcc/ifcvt.c
Comment 11 Tom de Vries 2011-04-07 08:35:27 UTC
Author: vries
Date: Thu Apr  7 08:35:23 2011
New Revision: 172091

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=172091
Log:
2011-04-07  Tom de Vries  <tom@codesourcery.com>

	PR target/43920
	* cfgcleanup.c (walk_to_nondebug_insn): New function.
	(flow_find_cross_jump): Use walk_to_nondebug_insn.  Recalculate bb1 and
	bb2.
	(try_crossjump_to_edge): Handle case that newpos1 or newpos2 is not src1
	or src2.  Redirect edges to the last basic block.  Update frequency and
	count on multiple basic blocks in case of fallthru.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cfgcleanup.c
Comment 12 Tom de Vries 2011-04-07 09:28:15 UTC
Author: vries
Date: Thu Apr  7 09:28:11 2011
New Revision: 172093

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=172093
Log:
2011-04-07  Tom de Vries  <tom@codesourcery.com>

	PR target/43920
	* lib/scanasm.exp (object-size): New proc.
	* gcc.target/arm/pr43920-2.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/arm/pr43920-2.c
Modified:
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/lib/scanasm.exp
Comment 13 Tom de Vries 2011-04-07 09:48:42 UTC
Author: vries
Date: Thu Apr  7 09:48:39 2011
New Revision: 172094

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=172094
Log:
2011-04-07  Tom de Vries  <tom@codesourcery.com>

	PR target/43920
	* cfgcleanup.c (try_crossjump_to_edge): Add dir parameter.  Pass dir to
	flow_find_cross_jump.  Swap variables to implement backward replacement.
	(try_crossjump_bb): Add argument to try_crossjump_to_edge.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cfgcleanup.c
Comment 14 Tom de Vries 2011-07-11 16:38:12 UTC
Patches and test cases are checked in to trunk. Marking bug fixed.
Comment 15 jye2 2011-09-19 06:17:54 UTC
Author: jye2
Date: Mon Sep 19 06:17:45 2011
New Revision: 178953

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=178953
Log:
2011-09-19  chengbin  <bin.cheng@arm.com>

	Backport r174035 from mainline
	2011-05-22  Tom de Vries  <tom@codesourcery.com>

	PR middle-end/48689
	* fold-const.c (fold_checksum_tree): Guard TREE_CHAIN use with
	CODE_CONTAINS_STRUCT (TS_COMMON).

	Backport r172297 from mainline
	2011-04-11  Chung-Lin Tang  <cltang@codesourcery.com>
		Richard Earnshaw  <rearnsha@arm.com>

	PR target/48250
	* config/arm/arm.c (arm_legitimize_reload_address): Update cases
	to use sign-magnitude offsets. Reject unsupported unaligned
	cases. Add detailed description in comments.
	* config/arm/arm.md (reload_outdf): Disable for ARM mode; change
	condition from TARGET_32BIT to TARGET_ARM.

	Backport r171978 from mainline
	2011-04-05  Tom de Vries  <tom@codesourcery.com>

	PR target/43920
	* config/arm/arm.h (BRANCH_COST): Set to 1 for Thumb-2 when optimizing
	for size.

	Backport r171632 from mainline
	2011-03-28  Richard Sandiford  <richard.sandiford@linaro.org>

	* builtins.c (expand_builtin_memset_args): Use gen_int_mode
	instead of GEN_INT.

	Backport r171379 from mainline
	2011-03-23  Chung-Lin Tang  <cltang@codesourcery.com>

	PR target/46934
	* config/arm/arm.md (casesi): Use the gen_int_mode() function
	to subtract lower bound instead of GEN_INT().

	Backport r171251 from mainline 
	2011-03-21  Daniel Jacobowitz  <dan@codesourcery.com>

	* config/arm/unwind-arm.c (__gnu_unwind_pr_common): Correct test
	for barrier handlers.

	Backport r171096 from mainline
	2011-03-17  Chung-Lin Tang  <cltang@codesourcery.com>

	PR target/43872
	* config/arm/arm.c (arm_get_frame_offsets): Adjust early
	return condition with !cfun->calls_alloca.


Modified:
    branches/ARM/embedded-4_6-branch/gcc/ChangeLog.arm
    branches/ARM/embedded-4_6-branch/gcc/builtins.c
    branches/ARM/embedded-4_6-branch/gcc/config/arm/arm.c
    branches/ARM/embedded-4_6-branch/gcc/config/arm/arm.h
    branches/ARM/embedded-4_6-branch/gcc/config/arm/arm.md
    branches/ARM/embedded-4_6-branch/gcc/config/arm/unwind-arm.c
    branches/ARM/embedded-4_6-branch/gcc/fold-const.c
    branches/ARM/embedded-4_6-branch/gcc/testsuite/gcc.target/arm/pr40887.c
    branches/ARM/embedded-4_6-branch/gcc/testsuite/gcc.target/arm/pr42575.c
    branches/ARM/embedded-4_6-branch/gcc/testsuite/gcc.target/arm/pr43698.c
    branches/ARM/embedded-4_6-branch/gcc/testsuite/gcc.target/arm/pr44788.c
    branches/ARM/embedded-4_6-branch/gcc/testsuite/gcc.target/arm/sync-1.c