If-conversion currently lacks the ability to merge two basic blocks that are identical except for an input register that can be set up using a conditional move, and possibly some different local registers. Cross-jumping lacks the ability to recognize that two blocks are equivalent when there are different local registers, and (for -Os) to find opportunities where setting up input registers will allow to do cross jumping. The problem has been discussed here: http://gcc.gnu.org/ml/gcc-patches/2004-01/msg03281.html http://gcc.gnu.org/ml/gcc-patches/2005-01/msg00672.html A patch against 4.0 20050218 has been posted here: http://gcc.gnu.org/ml/gcc-patches/2005-02/msg01066.html
Confirmed based on RTH's comments to the patch.
regression testing of the patch in gcc 4.0 20050218 was successful.
While merging the patch in a 3.4.3 based branch, I noticed three small issues: - The rest_of_handle_life prototype in passes.c is not necessary. It is a vestige from a previous (more intrusive) attempt to provide life information to if_convert. - In passes.c:rest_of_handle_if_conversion, at the end of the flag_expensive_optimizations code, EXIT_BLOCK_PTR->global_live_at_start has to be cleared for compatibility with this patch: 2004-05-17 J"orn Rennecke <joern.rennecke@superh.com> * cse.c (trivially_dead_nonlocal_regs): New variable. (note_dead_set): New function. (delete_trivially_dead_insns): If life info is available, update it. For completeness, we might also clear ENTRY_BLOCK_PTR->global_live_at_end. - config/pa/pa.c:pa_commutative_p is missing the parameter list: (rtx x, int outer_code)
The recog.c / recog.h part of the patch has been committed as part of another patch: http://gcc.gnu.org/ml/gcc-patches/2005-03/msg00133.html
(In reply to comment #4) > The recog.c / recog.h part of the patch has been committed as part > of another patch: > http://gcc.gnu.org/ml/gcc-patches/2005-03/msg00133.html Yes this also caused a regression on ppc-darwin (it might also be reproducable with ppc-elf and ppc- linux).
Subject: Re: If-conversion can't match equivalent code, and cross-jumping only works for literal matches pinskia at gcc dot gnu dot org wrote: >------- Additional Comments From pinskia at gcc dot gnu dot org 2005-03-03 02:44 ------- >(In reply to comment #4) > > >>The recog.c / recog.h part of the patch has been committed as part >>of another patch: >>http://gcc.gnu.org/ml/gcc-patches/2005-03/msg00133.html >> >> > >Yes this also caused a regression on ppc-darwin (it might also be reproducable with ppc-elf and ppc- >linux). > > > I've tried ppc-eabisim, but can't reproduce a failure there building gcc or newlib. powerpc-apple-darwin7.4.0 wants lots of include files which I don't have. Could you send me a preprocessed file so that I can test my patch against that?
Created attachment 8403 [details] incremental patch to prevent invalid moves during cross-jump
PR and patch predate gcc 4.0 branch.
Created attachment 8677 [details] Incremental patch to fix problem found by valgrind.
For the purposes of PR20070, it is sufficient that PR21767 is fixed on mainline.
An updated patch is here: http://gcc.gnu.org/ml/gcc-patches/2005-07/msg00482.html
Created attachment 9231 [details] testcase The jave/parse.c from the current sh-elf-4_1-branch is miscompiled at -march=i686 -O2 when the most recently posted patch is applied. This testcase shows the pruned-down function; the result of not_accessible_p is ignored.
Created attachment 9232 [details] incremental patch The problem was that at the end of the insn processing loop, cancel_changes cancelled any changes that had not been committed while processing an equivalence. The wider issue here is that we potentially built up a large number of changes when processing lots of insns. This is fixed in the attached patch by changing struct_equiv_improve_checkpoint so that whenever it commits to the current set of changes, it confirms or cancels them.
Created attachment 9245 [details] Incremental patch for input conflict problem sh-elf regression testing in the sh-elf-4_1-branch revealed another problem: When struct_equiv_block_eq is run a second time, and struct_equiv_improve_checkpoint resolves an input conflict, check_input_conflict is reset at the end of struct_equiv_block_eq. This causes the final phase not to resolve input conflicts. Fixed in attched patch with the addition of the had_input_conflict field. Visual inspection of the code also showed that there was a potential problem when an input conflict is resolved in struct_equiv_improve_checkpoint, but the checkpoint is not actually taken; then confirmed inputs could be mixed with ones that are rolled back later. Fixed with a small code rearrangent.
An updated patch is here: http://gcc.gnu.org/ml/gcc-patches/2005-07/msg00843.html
Subject: Bug 20070 CVSROOT: /cvs/gcc Module name: gcc Branch: sh-elf-4_1-branch Changes by: amylaar@gcc.gnu.org 2005-07-12 13:29:58 Modified files: gcc : ChangeLog basic-block.h cfgcleanup.c ifcvt.c recog.c recog.h Log message: PR rtl-optimization/20070 http://gcc.gnu.org/ml/gcc-patches/2005-07/msg00843.html * basic-block.h (STRUCT_EQUIV_START, STRUCT_EQUIV_RERUN): Define. (STRUCT_EQUIV_FINAL, STRUCT_EQUIV_MAX_LOCAL): Likewise. (struct struct_equiv_checkpoint, struct equiv_info): Likewise. (struct_equiv_block_eq): Declare. * cfgcleanup.c (reload.h, expr.h): #include. (IMPOSSIBLE_MOVE_FACTOR): Define. (flow_find_cross_jump): Remove. (assign_reg_reg_set, struct_equiv, struct_equiv_set): New functions. (struct_equiv_dst_mem, struct_equiv_make_checkpoint): Likewise. (struct_equiv_improve_checkpoint): Likewise. (struct_equiv_restore_checkpoint, struct_equiv_death): Likewise. (struct_equiv_block_eq): Likewise. (find_dying_inputs, resolve_input_conflict): Likewise. (try_crossjump_to_edge): Use struct_equiv_block_eq instead of flow_find_cross_jump. * ifcvt.c (noce_try_complex_cmove): New function. (noce_process_if_block): Call it. For flag_expensive_optimizations, update cond_exec_changed_p on success. (rest_of_handle_if_conversion): For flag_expensive_optimizations, provide if_convert with register lifeness info. Back out this change: 2005-03-07 Kazu Hirata <kazu@cs.umass.edu> * recog.c (verify_changes): Make it static. * recog.h: Remove the corresponding prototype. Patches: http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&only_with_tag=sh-elf-4_1-branch&r1=2.8142.2.17&r2=2.8142.2.18 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/basic-block.h.diff?cvsroot=gcc&only_with_tag=sh-elf-4_1-branch&r1=1.246.2.2&r2=1.246.2.3 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/cfgcleanup.c.diff?cvsroot=gcc&only_with_tag=sh-elf-4_1-branch&r1=1.143.2.1&r2=1.143.2.2 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ifcvt.c.diff?cvsroot=gcc&only_with_tag=sh-elf-4_1-branch&r1=1.184.2.1&r2=1.184.2.2 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/recog.c.diff?cvsroot=gcc&only_with_tag=sh-elf-4_1-branch&r1=1.221.4.1&r2=1.221.4.2 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/recog.h.diff?cvsroot=gcc&only_with_tag=sh-elf-4_1-branch&r1=1.55.4.1&r2=1.55.4.2
Created attachment 9280 [details] Don't reorder inputs unless there is a checkpoint. In reply #14 I wrote: > Visual inspection of the code also showed that there > was a potential problem when an input conflict is resolved in > struct_equiv_improve_checkpoint, but the checkpoint is not actually taken; > then confirmed inputs could be mixed with ones that are rolled back later. > Fixed with a small code rearrange[me]nt. But somehow that bit got lost. Here it is again, along with a change to resolve_input_conflict to make sure it doesn't leave the inputs in a different order if it can't resolve the conflict.
An updated patch is here: http://gcc.gnu.org/ml/gcc-patches/2005-07/msg01041.html
Subject: Bug 20070 CVSROOT: /cvs/gcc Module name: gcc Branch: sh-elf-4_1-branch Changes by: amylaar@gcc.gnu.org 2005-07-15 14:25:24 Modified files: gcc : cfgcleanup.c ChangeLog Log message: Update this patch: 2005-07-14 J"orn Rennecke <joern.rennecke@st.com> PR rtl-optimization/20070 http://gcc.gnu.org/ml/gcc-patches/2005-07/msg01041.html * basic-block.h (STRUCT_EQUIV_START, STRUCT_EQUIV_RERUN): Define. (STRUCT_EQUIV_FINAL, STRUCT_EQUIV_MAX_LOCAL): Likewise. (struct struct_equiv_checkpoint, struct equiv_info): Likewise. (struct_equiv_block_eq): Declare. * cfgcleanup.c (reload.h, expr.h): #include. (IMPOSSIBLE_MOVE_FACTOR): Define. (flow_find_cross_jump): Remove. (assign_reg_reg_set, struct_equiv, struct_equiv_set): New functions. (struct_equiv_dst_mem, struct_equiv_make_checkpoint): Likewise. (struct_equiv_improve_checkpoint): Likewise. (struct_equiv_restore_checkpoint, struct_equiv_death): Likewise. (struct_equiv_block_eq): Likewise. (find_dying_inputs, resolve_input_conflict): Likewise. (try_crossjump_to_edge): Use struct_equiv_block_eq instead of flow_find_cross_jump. * ifcvt.c (noce_try_complex_cmove): New function. (noce_process_if_block): Call it. For flag_expensive_optimizations, update cond_exec_changed_p on success. (rest_of_handle_if_conversion): For flag_expensive_optimizations, provide if_convert with register lifeness info. Back out this change: 2005-03-07 Kazu Hirata <kazu@cs.umass.edu> * recog.c (verify_changes): Make it static. * recog.h: Remove the corresponding prototype. Patches: http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/cfgcleanup.c.diff?cvsroot=gcc&only_with_tag=sh-elf-4_1-branch&r1=1.143.2.3&r2=1.143.2.4 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&only_with_tag=sh-elf-4_1-branch&r1=2.8142.2.22&r2=2.8142.2.23
Created attachment 9371 [details] Incremental patch to properly check for exactly one input Regression testing of sh64-elf in mainline from 20050722 found a regression of gcc.c-torture/execute/20000715-1.c at -O2. The attached one-line incremental patch fixes this.
For Bug 21803 we could use similar infrastructure to what is proposed for this bug.
Subject: Bug 20070 Author: amylaar Date: Wed Dec 7 13:31:41 2005 New Revision: 108164 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=108164 Log: 2005-12-07 J"orn Rennecke <joern.rennecke@st.com> Preparation for PR rtl-optimization/20070 / part1 * basic-block.h (insns_match_p, flow_find_cross_jump): Declare. * cfgcleanup.c (condjump_equiv_p): New function, broken out of outgoing_edges_match. (outgoing_edges_match): Use condjump_equiv_p. (merge_memattrs, insns_match_p, flow_find_cross_jump): Move from here into.. * struct-equiv.c: New file. (death_notes_match_p) New function, broken out of insns_match_p. * Makefile.in (OBJS-common): Add struct-equiv.o. (struct-equiv.o): New target. Added: trunk/gcc/struct-equiv.c - copied, changed from r107723, trunk/gcc/cfgcleanup.c Modified: trunk/gcc/ChangeLog trunk/gcc/Makefile.in trunk/gcc/basic-block.h trunk/gcc/cfgcleanup.c
Subject: Bug 20070 Author: amylaar Date: Tue Dec 13 13:04:18 2005 New Revision: 108480 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=108480 Log: PR rtl-optimization/20070 / part1 * flow.c (update_life_info): If PROP_POST_REGSTACK is set, call count_or_remove_death_notes with kill == -1. (mark_set_1): Don't add REG_DEAD / REG_UNUSED notes for stack registers if PROP_POST_REGSTACK is set. (mark_used_reg): Likewise. (count_or_remove_death_notes): If kill is -1, don't remove REG_DEAD / REG_UNUSED notes for stack regs. * cfgcleanup.c (condjump_equiv_p): Change parameters and processing to match rtx_equiv_p machinery. Change caller. (outgoing_edges_match): Likewise. (try_crossjump_to_edge): Use struct_equiv_block_eq instead of flow_find_cross_jump. * basic-block.h (PROP_POST_REGSTACK, STRUCT_EQUIV_START): Define. (STRUCT_EQUIV_RERUN, STRUCT_EQUIV_FINAL): Likewise. (STRUCT_EQUIV_NEED_FULL_BLOCK, STRUCT_EQUIV_MATCH_JUMPS): Likewise. (STRUCT_EQUIV_MAX_LOCAL): Likewise. (struct struct_equiv_checkpoint, struct equiv_info): Likewise. (insns_match_p): Update prototype. (flow_find_cross_jump): Remove prototype. (struct_equiv_block_eq, struct_equiv_init): Declare. (rtx_equiv_p, condjump_equiv_p): Likewise. * struct-equiv.c: Include reload.h. (IMPOSSIBLE_MOVE_FACTOR): Define. (assign_reg_reg_set, struct_equiv_make_checkpoint): New functions. (struct_equiv_improve_checkpoint): Likewise. (struct_equiv_restore_checkpoint, rtx_equiv_p): Likewise. (set_dest_equiv_p, set_dest_addr_equiv_p, struct_equiv_init): Likewise. (struct_equiv_merge, find_dying_input): Likewise. (resolve_input_conflict, note_local_live): Likewise. (death_notes_match_p): Change parameters and processing to match rtx_equiv_p machinery. Change caller. (insns_match_p): Likewise. (flow_find_cross_jump): Replace with: (struct_equiv_block_eq). Back out this change: 2005-03-07 Kazu Hirata <kazu@cs.umass.edu> * recog.c (verify_changes): Make it static. * recog.h: Remove the corresponding prototype. Modified: trunk/gcc/ChangeLog trunk/gcc/basic-block.h trunk/gcc/cfgcleanup.c trunk/gcc/flow.c trunk/gcc/recog.c trunk/gcc/recog.h trunk/gcc/struct-equiv.c
Subject: Bug 20070 Author: amylaar Date: Mon Dec 19 14:36:59 2005 New Revision: 108792 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=108792 Log: * cfgcleanup.c: Temporarily revert patches for PR 20070 till Bernd comes back. Modified: trunk/gcc/ChangeLog trunk/gcc/cfgcleanup.c
can we close this?
(In reply to comment #25) > can we close this? > No, we still have to actually re-enable the cross-jumping code, and we haven't even gotten to the review of the if-conversion part.
We've been using this patch: http://gcc.gnu.org/ml/gcc-patches/2006-02/msg01488.html successfully in a 4.1.1 based compiler. It is not optimal, though: the check for equality of the regsets in struct_equiv_init is unnecessary, and it can also lead skipping possible optimizations. The frequent live updates are also not necessary. This optimization can work with register life info that is inexact, as long as it is conservative, i.e. it may flag registers as live that are not, but it most not pretend that live registers are dead. Note also that re-integration of the if-conversion improvements was not within the scope of this patch, i.e. this is still an outstanding issue.
Created attachment 17995 [details] Patch agains r148322, works pre-RA only Joern's original ifcvt.c patch only dealt with pre-reload if-conversion. The subsequent changes to make struct-equiv work for crossjumping and after reload, made the code too complicated IMHO. So I've gone back to the roots of the patch. I've simplified things a bit -- mostly by using the DF machinery. This new attached patch is far from complete though. The struct-equiv code should use rtx_equal_p_cb, but the rtx_equal_p_cb needs to be modified first (to be more like for_each_rtx: 3-state and passing around a pointer to auxiliary data). The local_reg_p stuff should probably go into df-problems.c as a _p function. And so on. But the patch does work. I wanted to let folks now that this bug is not yet forgotten!
I am testing Steven's Crossjumping patch attached here. With CoreMark we see a 1% increase in performance when using Os. Other proprietary tests show ~0.5% decrease in code size. The path however does not fix PR30905.
Comment on attachment 17995 [details] Patch agains r148322, works pre-RA only The ifcvt.c part of the patch does not work anymore, because it doesn't handle DEBUG_INSNs properly. I am working on an update.
Created attachment 19525 [details] Patch agains r155731, works pre-RA only Bootstrapped&tested on ia64-unknown-linux-gnu.
I will re-test on our port and report my findings, cheers!
*** Bug 43864 has been marked as a duplicate of this bug. ***