On Linux/Intel64, I got FAIL: gfortran.dg/array_constructor_9.f90 -O3 -fomit-frame-pointer -funroll-loops execution test with revision 131671. Revision 131622 is OK.
Confirmed and it is a wierd regression: the test pass with '-O3' alone, but not with '-O3 -funroll-loops'. In addition if I replace the line if (values (last) .ne. j + k * k) call abort by if (values (last) .ne. j + k * k) print *, i, j, k, values (last) the test pass without printing anything. It looks as a middle-end regression between revs. 131660 and 131671.
Fails also at '-O2 -funroll-loops', but not if I add return 10 print *, i, j, k, values (last) before end subroutine test
The failure comes in the first call of 'test()' at i=order, j=2. The values tested in the k loop seems to be: 5.0000000 10.0000000 3.0000000 6.0000000 11.000000 4.0000000 7.0000000 12.000000 5.0000000 instead of 3.0000000 6.0000000 11.000000 18.000000 27.000000 38.000000 51.000000 66.000000 83.000000 The outputs of -ftree-dump-optimized are identical for revs. 131628 and 131676.
Worked with rev. 131669, fails with 131671. Rev. 131670 seems the most likely, added Kenneth Zadeck to the cc list.
The test and its variants pass on powerpc-apple-darwin9 rev. 131676.
I need a more info to reproduce this bug. I bootstrapped and regression tested on x86_64-unknown-linux-gnu with suse 10.3 and using --enable-languages=c,c++,fortran --disable-multilib before committing the patch and got === gfortran Summary === # of expected passes 23538 # of expected failures 4 # of unsupported tests 18 i am not doubting that the failure is related to this patch. Given all of rest of the info, it smells like this patch is responsible, but i do not get the failure on my config.
> I need a more info to reproduce this bug. I have tried to give all the info I have been able to gather on my own. My config is: Configured with: ../gcc-4.3-work/configure --prefix=/opt/gcc/gcc4.3w --mandir=/opt/gcc/gcc4.3w/share/man --infodir=/opt/gcc/gcc4.3w/share/info --build=i686-apple-darwin9 --enable-languages=c,c++,fortran,objc,obj-c++,java --with-gmp=/sw --with-libiconv-prefix=/usr --with-system-zlib --x-includes=/usr/X11R6/include --x-libraries=/usr/X11R6/lib As far as I can tell, the bug appears after the tree optimization, but at this point I don't know what I should dump. Having looked at the test-suite results, the problem appears on 32 bit x86 platforms. From > --disable-multilib I infer that you cannot try with -m32, isn't it?
Subject: Re: [4.3 Regression] gfortran.dg/array_constructor_9.f90 dominiq at lps dot ens dot fr wrote: > ------- Comment #7 from dominiq at lps dot ens dot fr 2008-01-20 14:39 ------- > >> I need a more info to reproduce this bug. >> > > I have tried to give all the info I have been able to gather on my own. My > config is: > > Configured with: ../gcc-4.3-work/configure --prefix=/opt/gcc/gcc4.3w > --mandir=/opt/gcc/gcc4.3w/share/man --infodir=/opt/gcc/gcc4.3w/share/info > --build=i686-apple-darwin9 --enable-languages=c,c++,fortran,objc,obj-c++,java > --with-gmp=/sw --with-libiconv-prefix=/usr --with-system-zlib > --x-includes=/usr/X11R6/include --x-libraries=/usr/X11R6/lib > > As far as I can tell, the bug appears after the tree optimization, but at this > point I don't know what I should dump. Having looked at the test-suite results, > the problem appears on 32 bit x86 platforms. From > > >> --disable-multilib >> > > I infer that you cannot try with -m32, isn't it? > > > the first comment of the bug says linux/intel64. your config string looks like you are building on a mac "darwin" box. That would be the difference. I build on a real linux box that cannot run darwin. could you please send me two tar files: one tar file from the release with out this patch containing the test case case with the "-da" option and one from the release with the patch with the same option. This option will produce a large number of dump files and from those dumps i will fix the bug. Thanks in advance. kenny
> you are building on a mac "darwin" box Yes indeed, but the bug is also present for i686-pc-linux-gnu, see for instance: http://gcc.gnu.org/ml/gcc-testresults/2008-01/msg00914.html
Subject: Re: [4.3 Regression] gfortran.dg/array_constructor_9.f90 dominiq at lps dot ens dot fr wrote: > ------- Comment #9 from dominiq at lps dot ens dot fr 2008-01-20 15:30 ------- > >> you are building on a mac "darwin" box >> > > Yes indeed, but the bug is also present for i686-pc-linux-gnu, see for > instance: > > http://gcc.gnu.org/ml/gcc-testresults/2008-01/msg00914.html > > > i will build this on a 32 bit box. that is my problem. sorry, thanks. kenny
I have put the results of the compilation with -da with the patch at http://www.lps.ens.fr/~dominiq/gcc/tmp_fresh.tar.bz2 All the files will be in a directory tmp_fresh. Do you still need the same without the patch? It will take some time to reverse the patch and to do the rebuilding.
Subject: Re: [4.3 Regression] gfortran.dg/array_constructor_9.f90 dominiq at lps dot ens dot fr wrote: > ------- Comment #11 from dominiq at lps dot ens dot fr 2008-01-20 15:47 ------- > I have put the results of the compilation with -da with the patch at > > http://www.lps.ens.fr/~dominiq/gcc/tmp_fresh.tar.bz2 > > All the files will be in a directory tmp_fresh. Do you still need the same > without the patch? It will take some time to reverse the patch and to do the > rebuilding. > > > let me try to build a 32 bit compiler. that appears to be the problem. it will be easier if i can get it on my machine. kenny
It happens for me on Linux/Intel64 with -m32: http://gcc.gnu.org/ml/gcc-testresults/2008-01/msg00907.html My configuration is configure flags: --enable-clocale=gnu --with-system-zlib --enable-decimal-float=bid --with-demangler-in-ld --enable-shared --enable-threads=posix --enable-haifa --enable-checking=assert --prefix=/usr/gcc-4.3 --with-local-prefix=/usr/local
confirmed on my machine, i will have my best people work on it. kenny
There appears to be an design inconsistency in the way that we have specified the various dataflow problems with respect to the eq notes. I hate eq notes. In the rd patch that just went in where we trim the solution based on the lr info, it seems to also trim the bits associated with the eq notes because these are not carried around in the lr problem. This screws ups the use-def chains that are used in fwprop, see and web (in this particular case, the problem is in web, but that is only a matter of bad luck). I can either blow away the rd patch for 34400, or we can teach lr to respect the DF_EQ_NOTES patch or just have lr always propagate the eq notes as if they were real uses. The latter actually sounds like the only correct solution but at this late point of stage 3, we may need to sacrifice correctness for stability. Kenny
I favor blowing away the RD patch. Having the LR problem (and probably the LIVE problem too?) always propagate the REG_EQ* notes as if they were real uses sounds like a terrible idea to me. They are not real uses and should not be treated as such.
Subject: Re: [4.3 Regression] gfortran.dg/array_constructor_9.f90 steven at gcc dot gnu dot org wrote: > ------- Comment #16 from steven at gcc dot gnu dot org 2008-01-20 23:22 ------- > I favor blowing away the RD patch. > > Having the LR problem (and probably the LIVE problem too?) always propagate the > REG_EQ* notes as if they were real uses sounds like a terrible idea to me. > They are not real uses and should not be treated as such. > > > Unless the powers that be need this fixed asap, i want to wait til i get a consensus from the other df maintainers. but i do see blowing it away as a reasonable answer. it is a shame, it actually does provide a good speedup. Kenny
I agree with Steven. The trimming could be adjusted to not trim register that are mentioned by REG_EQ* notes, but I'm not sure it would be a good idea, either.
Subject: Re: [4.3 Regression] gfortran.dg/array_constructor_9.f90 bonzini at gnu dot org wrote: > ------- Comment #18 from bonzini at gnu dot org 2008-01-21 08:04 ------- > I agree with Steven. The trimming could be adjusted to not trim register that > are mentioned by REG_EQ* notes, but I'm not sure it would be a good idea, > either. > > I am testing a patch that add eq_notes to the lr problem if you specify DF_EQ_NOTES. If you and steven still want me to abandon this, i will, but in my mind, it seems that if the pass wants chains built with eq_notes, then the rest of the df information should match this decision. Kenny
Subject: Re: [4.3 Regression] gfortran.dg/array_constructor_9.f90 > I am testing a patch that add eq_notes to the lr problem if you specify > DF_EQ_NOTES. > If you and steven still want me to abandon this, i will, but in my mind, > it seems that if the pass wants chains built with eq_notes, then the > rest of the df information should match this decision. Chains is one thing, liveness is another. EQ notes are there as "reminders of useful information", I don't want them to affect register allocation (via liveness) if the information was not used.
Subject: Re: [4.3 Regression] gfortran.dg/array_constructor_9.f90 bonzini at gnu dot org wrote: > ------- Comment #20 from bonzini at gnu dot org 2008-01-21 13:21 ------- > Subject: Re: [4.3 Regression] gfortran.dg/array_constructor_9.f90 > > > >> I am testing a patch that add eq_notes to the lr problem if you specify >> DF_EQ_NOTES. >> If you and steven still want me to abandon this, i will, but in my mind, >> it seems that if the pass wants chains built with eq_notes, then the >> rest of the df information should match this decision. >> > > Chains is one thing, liveness is another. EQ notes are there as > "reminders of useful information", I don't want them to affect register > allocation (via liveness) if the information was not used. > > > I understand that, that is why if the pass does not specify DF_EQ_NOTES, the lr (and the rest of the info) stays as it is now. But if you are building chains out of them, then you are treating them as live anyway. kenny
Subject: Re: [4.3 Regression] gfortran.dg/array_constructor_9.f90 On 21 Jan 2008 13:25:23 -0000, zadeck at naturalbridge dot com <gcc- > I understand that, that is why if the pass does not specify DF_EQ_NOTES, > the lr (and the rest of the info) stays as it is now. But if you are > building chains out of them, then you are treating them as live anyway. This is not really true. You are treating the registers as "reaching", i.e. even though the register is not live at the point of the REG_EQ* note, it _could_ be live holding the value expected for the REG_EQ* note. This difference between live registers and reaching definitions is also why I feel so uncomfortable about the RD patch. After trimming, DEFs that reach a certain point in the program don't show up in the reaching definitions anymore. You've already found one example how this can be confusing (the loop IV problem). It also breaks my new const/copy prop pass, which expects definitions to reach points where the register of the reaching def is not live. What is confusing about all this, is that in GCC there is nothing to prevent you from inserting a new SET of a register between a previous set of the register and a use in an REG_EQ* note, if the register is dead at the point of the REG_EQ* note. E.g. it is perfectly OK to have a situation like this: (insn 1 (set (reg1) (...))) (insn 2 (...) REG_DEAD (reg1)) (insn 3 (...) REG_EQUAL (... ((reg1) ...))) and nothing in the compiler will object if you would insert a new insn like so: (insn 1 (set (reg1) (...))) (insn 2 (...) REG_DEAD (reg1)) (insn 4 (set (reg1) (...))) (insn 3 (...) REG_EQUAL (... ((reg1) ...))) even though this would probably result in wrong code. The way GCC avoids this kind of situation, is by never re-using registers this way, unless the new definition sets reg1 to the same value (gcse PRE does this, for example). The effect of including REG_* notes in the LR and LIVE problems would be to extend the live ranges of pseudos into regions where they are not actually live. What it comes down to, is that it seems that liveness is not a good condition to trim reaching definitions. Trimming LIVE with LR makes sense because the result of the problem does not change, and the nature of the problems are the same (i.e. compute two meanings of liveness for registers). Live registers and reaching definitions, on the other hand, are dissimilar problems. For LIVE, the effect of trimming is to not compute partial availability in regions where a register is not live. The similar condition for reaching definitions is not liveness, but absence of uses. To trim reaching definitions, one should really be looking at the last reachable use of a definition, and trim from there. I don't know what problem computes the last use of a register, but it may well be a problem that is equivalent to the LR problem, but considering *all* uses including REG_EQ* uses, instead of only real uses. But I haven't thought about this much yet, I must admit.
> The similar > condition for reaching definitions is not liveness, but absence of > uses. To trim reaching definitions, one should really be looking at > the last reachable use of a definition, and trim from there. Just to clarify, this trimming still changes the result of the reaching definitions problem, but should not change the resulting DEF-USE chains. For the reaching definitions problem itself, I don't see any way to trim it.
Side note for the record: the polyhedron test mdbx.f90 also fails in 32 bit mode with rev. 131689 on i686-apple-darwin9.
Can we revert the offending patch and revisit this issue during stage1 please? Thanks. Richard.
Subject: Re: [4.3 Regression] gfortran.dg/array_constructor_9.f90 Agreed. I think we'd better revert the patch now.
Subject: Re: [4.3 Regression] gfortran.dg/array_constructor_9.f90 bonzini at gnu dot org wrote: > ------- Comment #26 from bonzini at gnu dot org 2008-01-21 14:54 ------- > Subject: Re: [4.3 Regression] gfortran.dg/array_constructor_9.f90 > > Agreed. I think we'd better revert the patch now. > > > i give. i will prepare a revert this afternoon. i need to go run some errands now. kenny
Subject: Re: [4.3 Regression] gfortran.dg/array_constructor_9.f90 zadeck at naturalbridge dot com wrote: > ------- Comment #27 from zadeck at naturalbridge dot com 2008-01-21 15:36 ------- > Subject: Re: [4.3 Regression] gfortran.dg/array_constructor_9.f90 > > bonzini at gnu dot org wrote: > >> ------- Comment #26 from bonzini at gnu dot org 2008-01-21 14:54 ------- >> Subject: Re: [4.3 Regression] gfortran.dg/array_constructor_9.f90 >> >> Agreed. I think we'd better revert the patch now. >> >> >> >> > i give. > > i will prepare a revert this afternoon. i need to go run some errands now. > > kenny > > > This patch reverses most of the rd2 patch that speeds up pr3400 and pr26854. Unfortunately that patch caused some regressions that are very difficult to resolved. These were reported in pr34884. This patch reverses most of rd2.diff. It leaves in the fixes to the comments and one fix to a latent bug. The patch has been tested on ia-64, ppc-32, x86-64 and x86-32. 2008-01-22 Kenneth Zadeck <zadeck@naturalbridge.com> PR rtl-optimization/26854 PR rtl-optimization/34400 PR rtl-optimization/34884 * ddg.c (create_ddg_dep_from_intra_loop_link): Use DF_RD->gen. * df.h (df_changeable_flags.DF_RD_NO_TRIM): Deleted (df_rd_bb_info.expanded_lr_out): Deleted * loop_invariant.c (find_defs): Deleted DF_RD_NO_TRIM flag. * loop_iv.c (iv_analysis_loop_init): Ditto. * df-problems.c (df_rd_free_bb_info, df_rd_alloc, df_rd_confluence_n, df_rd_bb_local_compute, df_rd_transfer_function, df_rd_free): Removed code to allocate, initialize or free expanded_lr_out. (df_rd_bb_local_compute_process_def): Restructured to make more understandable. (df_rd_confluence_n): Removed code to no apply invalidate_by_call sets if the sets are being trimmed. Ok to commit. Index: ddg.c =================================================================== --- ddg.c (revision 131716) +++ ddg.c (working copy) @@ -184,13 +184,12 @@ create_ddg_dep_from_intra_loop_link (ddg { int regno = REGNO (SET_DEST (set)); struct df_ref *first_def; - struct df_ref *last_def; + struct df_rd_bb_info *bb_info = DF_RD_BB_INFO (g->bb); first_def = df_bb_regno_first_def_find (g->bb, regno); gcc_assert (first_def); - last_def = df_bb_regno_last_def_find (g->bb, regno); - if (first_def == last_def) + if (bitmap_bit_p (bb_info->gen, first_def->id)) return; } } Index: df.h =================================================================== --- df.h (revision 131716) +++ df.h (working copy) @@ -405,27 +405,20 @@ enum df_changeable_flags DF_LR_RUN_DCE = 1 << 0, /* Run DCE. */ DF_NO_HARD_REGS = 1 << 1, /* Skip hard registers in RD and CHAIN Building. */ - /* Do not trim the solution using the LR result. This can make the - solution take much longer and take more memory. This is - necessary for the loop optimizations, but has a very small time - and space penalty because the loop optimizations process only a - single loop at a time. Any pass that looks at the entire - function should not set this flag. */ - DF_RD_NO_TRIM = 1 << 2, - DF_EQ_NOTES = 1 << 3, /* Build chains with uses present in EQUIV/EQUAL notes. */ - DF_NO_REGS_EVER_LIVE = 1 << 4, /* Do not compute the regs_ever_live. */ + DF_EQ_NOTES = 1 << 2, /* Build chains with uses present in EQUIV/EQUAL notes. */ + DF_NO_REGS_EVER_LIVE = 1 << 3, /* Do not compute the regs_ever_live. */ /* Cause df_insn_rescan df_notes_rescan and df_insn_delete, to return immediately. This is used by passes that know how to update the scanning them selves. */ - DF_NO_INSN_RESCAN = 1 << 5, + DF_NO_INSN_RESCAN = 1 << 4, /* Cause df_insn_rescan df_notes_rescan and df_insn_delete, to return after marking the insn for later processing. This allows all rescans to be batched. */ - DF_DEFER_INSN_RESCAN = 1 << 6, + DF_DEFER_INSN_RESCAN = 1 << 5, - DF_VERIFY_SCHEDULED = 1 << 7 + DF_VERIFY_SCHEDULED = 1 << 6 }; /* Two of these structures are inline in df, one for the uses and one @@ -719,37 +712,9 @@ struct df_rd_bb_info /* Local sets to describe the basic blocks. */ bitmap kill; bitmap sparse_kill; + bitmap gen; /* The set of defs generated in this block. */ - /* Expanded version of the DF_LT->out bitmap to match the positions - of gen, in and out here. Only allocated if DF_RD_NO_TRIM is - false. */ - bitmap expanded_lr_out; - - /* The set of defs generated in this block. This is not set unless - the def reaches the end of the block. */ - bitmap gen; - - /* The results of the dataflow problem. - - If DF_RD_NO_TRIM is not set, these sets are SOMEWHAT trimmed by - the output of the DF_LR problem. The out set is precisely - trimmed during propagation which means that the result is also - trimmed when the propagation terminates. The in set is not - explicitly trimmed, because this is expensive (adding about 5% to - the cost of a bootstrap). However since the out sets are trimmed - and the in sets are built from the out of the pred, the in set is - MOSTLY trimmed. - - The counter case happens at a branch where the variable V is in - DF_LR->in the true branch but not the false branch. If V is - defined before the branch, RD will propagate that into the - DF_RD_in sets of both branches. When the block is processed, the - DF_RD->out set will have V trimmed out of it but it will still be - left in DF_RD->in. - - If this not a problem for the current optimizers since they were - designed before any trimming was available. This can be fixed by - checking the DF_LR->in set directly. */ + /* The results of the dataflow problem. */ bitmap in; /* At the top of the block. */ bitmap out; /* At the bottom of the block. */ }; Index: loop-invariant.c =================================================================== --- loop-invariant.c (revision 131716) +++ loop-invariant.c (working copy) @@ -639,7 +639,6 @@ find_defs (struct loop *loop, basic_bloc df_remove_problem (df_chain); df_process_deferred_rescans (); df_chain_add_problem (DF_UD_CHAIN); - df_set_flags (DF_RD_NO_TRIM); df_set_blocks (blocks); df_analyze (); Index: loop-iv.c =================================================================== --- loop-iv.c (revision 131716) +++ loop-iv.c (working copy) @@ -278,7 +278,6 @@ iv_analysis_loop_init (struct loop *loop df_remove_problem (df_chain); df_process_deferred_rescans (); df_chain_add_problem (DF_UD_CHAIN); - df_set_flags (DF_RD_NO_TRIM); df_set_blocks (blocks); df_analyze (); if (dump_file) Index: df-problems.c =================================================================== --- df-problems.c (revision 131716) +++ df-problems.c (working copy) @@ -245,8 +245,6 @@ df_rd_free_bb_info (basic_block bb ATTRI struct df_rd_bb_info *bb_info = (struct df_rd_bb_info *) vbb_info; if (bb_info) { - if (bb_info->expanded_lr_out) - BITMAP_FREE (bb_info->expanded_lr_out); BITMAP_FREE (bb_info->kill); BITMAP_FREE (bb_info->sparse_kill); BITMAP_FREE (bb_info->gen); @@ -300,8 +298,6 @@ df_rd_alloc (bitmap all_blocks) struct df_rd_bb_info *bb_info = df_rd_get_bb_info (bb_index); if (bb_info) { - if (bb_info->expanded_lr_out) - bitmap_clear (bb_info->expanded_lr_out); bitmap_clear (bb_info->kill); bitmap_clear (bb_info->sparse_kill); bitmap_clear (bb_info->gen); @@ -310,10 +306,6 @@ df_rd_alloc (bitmap all_blocks) { bb_info = (struct df_rd_bb_info *) pool_alloc (df_rd->block_pool); df_rd_set_bb_info (bb_index, bb_info); - if (df->changeable_flags & DF_RD_NO_TRIM) - bb_info->expanded_lr_out = NULL; - else - bb_info->expanded_lr_out = BITMAP_ALLOC (&problem_data->rd_bitmaps); bb_info->kill = BITMAP_ALLOC (&problem_data->rd_bitmaps); bb_info->sparse_kill = BITMAP_ALLOC (&problem_data->rd_bitmaps); bb_info->gen = BITMAP_ALLOC (&problem_data->rd_bitmaps); @@ -328,53 +320,56 @@ df_rd_alloc (bitmap all_blocks) /* Process a list of DEFs for df_rd_bb_local_compute. */ static void -df_rd_bb_local_compute_process_def (struct df_rd_bb_info *bb_info, +df_rd_bb_local_compute_process_def (struct df_rd_bb_info *bb_info, struct df_ref **def_rec, enum df_ref_flags top_flag) { - for (; *def_rec; def_rec++) + while (*def_rec) { struct df_ref *def = *def_rec; - unsigned int regno = DF_REF_REGNO (def); - - /* This makes sure we do the artificial defs in the right order - since they are all in the same list. */ - if (top_flag != (DF_REF_FLAGS (def) & DF_REF_AT_TOP)) - continue; - - /* Skip over the hard regs if we do not care about them. */ - if ((df->changeable_flags & DF_NO_HARD_REGS) && - (regno < FIRST_PSEUDO_REGISTER)) - continue; - - /* Only the last def(s) for a regno in the block has any - effect. */ - if (bitmap_bit_p (seen_in_block, regno)) - continue; - - /* The first def for regno in insn gets to knock out the - defs from other instructions. */ - if ((!bitmap_bit_p (seen_in_insn, regno)) - /* If the def is to only part of the reg, it does - not kill the other defs that reach here. */ - && (!(DF_REF_FLAGS (def) & - (DF_REF_PARTIAL | DF_REF_CONDITIONAL | DF_REF_MAY_CLOBBER)))) + if (top_flag == (DF_REF_FLAGS (def) & DF_REF_AT_TOP)) { + unsigned int regno = DF_REF_REGNO (def); unsigned int begin = DF_DEFS_BEGIN (regno); unsigned int n_defs = DF_DEFS_COUNT (regno); - if (n_defs > DF_SPARSE_THRESHOLD) - bitmap_set_bit (bb_info->sparse_kill, regno); - else - bitmap_set_range (bb_info->kill, begin, n_defs); - bitmap_clear_range(bb_info->gen, begin, n_defs); + + if ((!(df->changeable_flags & DF_NO_HARD_REGS)) + || (regno >= FIRST_PSEUDO_REGISTER)) + { + /* Only the last def(s) for a regno in the block has any + effect. */ + if (!bitmap_bit_p (seen_in_block, regno)) + { + /* The first def for regno in insn gets to knock out the + defs from other instructions. */ + if ((!bitmap_bit_p (seen_in_insn, regno)) + /* If the def is to only part of the reg, it does + not kill the other defs that reach here. */ + && (!(DF_REF_FLAGS (def) & + (DF_REF_PARTIAL | DF_REF_CONDITIONAL | DF_REF_MAY_CLOBBER)))) + { + if (n_defs > DF_SPARSE_THRESHOLD) + { + bitmap_set_bit (bb_info->sparse_kill, regno); + bitmap_clear_range(bb_info->gen, begin, n_defs); + } + else + { + bitmap_set_range (bb_info->kill, begin, n_defs); + bitmap_clear_range (bb_info->gen, begin, n_defs); + } + } + + bitmap_set_bit (seen_in_insn, regno); + /* All defs for regno in the instruction may be put into + the gen set. */ + if (!(DF_REF_FLAGS (def) + & (DF_REF_MUST_CLOBBER | DF_REF_MAY_CLOBBER))) + bitmap_set_bit (bb_info->gen, DF_REF_ID (def)); + } + } } - - bitmap_set_bit (seen_in_insn, regno); - /* All defs for regno in the instruction may be put into - the gen set. */ - if (!(DF_REF_FLAGS (def) - & (DF_REF_MUST_CLOBBER | DF_REF_MAY_CLOBBER))) - bitmap_set_bit (bb_info->gen, DF_REF_ID (def)); + def_rec++; } } @@ -385,28 +380,14 @@ df_rd_bb_local_compute (unsigned int bb_ { basic_block bb = BASIC_BLOCK (bb_index); struct df_rd_bb_info *bb_info = df_rd_get_bb_info (bb_index); - struct df_lr_bb_info *lr_bb_info = df_lr_get_bb_info (bb_index); rtx insn; bitmap_clear (seen_in_block); bitmap_clear (seen_in_insn); - if (!(df->changeable_flags & DF_RD_NO_TRIM)) - { - unsigned int regno; - bitmap_iterator bi; - int first_reg = (df->changeable_flags & DF_NO_HARD_REGS) ? FIRST_PSEUDO_REGISTER : 0; - EXECUTE_IF_SET_IN_BITMAP (lr_bb_info->out, first_reg, regno, bi) - { - unsigned int begin = DF_DEFS_BEGIN (regno); - unsigned int n_defs = DF_DEFS_COUNT (regno); - bitmap_set_range (bb_info->expanded_lr_out, begin, n_defs); - } - } - /* Artificials are only hard regs. */ if (!(df->changeable_flags & DF_NO_HARD_REGS)) - df_rd_bb_local_compute_process_def (bb_info, + df_rd_bb_local_compute_process_def (bb_info, df_get_artificial_defs (bb_index), 0); @@ -504,10 +485,7 @@ df_rd_confluence_n (edge e) if (e->flags & EDGE_FAKE) return; - /* If we are trimming the solution, the invalidated_by_call code in - the lr problem makes this unnecessary. However, if we do not - trim, we must take this into account. */ - if ((df->changeable_flags & DF_RD_NO_TRIM) && e->flags & EDGE_EH) + if (e->flags & EDGE_EH) { struct df_rd_problem_data *problem_data = (struct df_rd_problem_data *) df_rd->problem_data; @@ -515,7 +493,7 @@ df_rd_confluence_n (edge e) bitmap dense_invalidated = problem_data->dense_invalidated_by_call; bitmap_iterator bi; unsigned int regno; - bitmap tmp = BITMAP_ALLOC (&problem_data->rd_bitmaps); + bitmap tmp = BITMAP_ALLOC (&df_bitmap_obstack); bitmap_copy (tmp, op2); bitmap_and_compl_into (tmp, dense_invalidated); @@ -547,13 +525,13 @@ df_rd_transfer_function (int bb_index) bitmap gen = bb_info->gen; bitmap kill = bb_info->kill; bitmap sparse_kill = bb_info->sparse_kill; - bool changed = false; - if ((df->changeable_flags & DF_RD_NO_TRIM) && bitmap_empty_p (sparse_kill)) - changed = bitmap_ior_and_compl (out, gen, in, kill); + if (bitmap_empty_p (sparse_kill)) + return bitmap_ior_and_compl (out, gen, in, kill); else { struct df_rd_problem_data *problem_data; + bool changed = false; bitmap tmp; /* Note that TMP is _not_ a temporary bitmap if we end up replacing @@ -570,8 +548,6 @@ df_rd_transfer_function (int bb_index) } bitmap_and_compl_into (tmp, kill); bitmap_ior_into (tmp, gen); - if (!(df->changeable_flags & DF_RD_NO_TRIM)) - bitmap_and_into (tmp, bb_info->expanded_lr_out); changed = !bitmap_equal_p (tmp, out); if (changed) { @@ -579,10 +555,9 @@ df_rd_transfer_function (int bb_index) bb_info->out = tmp; } else - BITMAP_FREE (tmp); + BITMAP_FREE (tmp); + return changed; } - - return changed; } @@ -602,8 +577,6 @@ df_rd_free (void) struct df_rd_bb_info *bb_info = df_rd_get_bb_info (i); if (bb_info) { - if (bb_info->expanded_lr_out) - BITMAP_FREE (bb_info->expanded_lr_out); BITMAP_FREE (bb_info->kill); BITMAP_FREE (bb_info->sparse_kill); BITMAP_FREE (bb_info->gen);
Subject: Bug 34884 Author: zadeck Date: Tue Jan 22 13:57:01 2008 New Revision: 131719 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=131719 Log: 2008-01-22 Kenneth Zadeck <zadeck@naturalbridge.com> PR rtl-optimization/26854 PR rtl-optimization/34400 PR rtl-optimization/34884 * ddg.c (create_ddg_dep_from_intra_loop_link): Use DF_RD->gen. * df.h (df_changeable_flags.DF_RD_NO_TRIM): Deleted (df_rd_bb_info.expanded_lr_out): Deleted * loop_invariant.c (find_defs): Deleted DF_RD_NO_TRIM flag. * loop_iv.c (iv_analysis_loop_init): Ditto. * df-problems.c (df_rd_free_bb_info, df_rd_alloc, df_rd_confluence_n, df_rd_bb_local_compute, df_rd_transfer_function, df_rd_free): Removed code to allocate, initialize or free expanded_lr_out. (df_rd_bb_local_compute_process_def): Restructured to make more understandable. (df_rd_confluence_n): Removed code to no apply invalidate_by_call sets if the sets are being trimmed. Modified: trunk/gcc/ChangeLog trunk/gcc/ddg.c trunk/gcc/df-problems.c trunk/gcc/df.h trunk/gcc/loop-invariant.c trunk/gcc/loop-iv.c
Subject: Re: [4.3 Regression] gfortran.dg/array_constructor_9.f90 Kenneth Zadeck wrote: > zadeck at naturalbridge dot com wrote: > >> ------- Comment #27 from zadeck at naturalbridge dot com 2008-01-21 15:36 ------- >> Subject: Re: [4.3 Regression] gfortran.dg/array_constructor_9.f90 >> >> bonzini at gnu dot org wrote: >> >> >>> ------- Comment #26 from bonzini at gnu dot org 2008-01-21 14:54 ------- >>> Subject: Re: [4.3 Regression] gfortran.dg/array_constructor_9.f90 >>> >>> Agreed. I think we'd better revert the patch now. >>> >>> >>> >>> >>> >> i give. >> >> i will prepare a revert this afternoon. i need to go run some errands now. >> >> kenny >> >> >> >> > This patch reverses most of the rd2 patch that speeds up pr3400 and pr26854. > Unfortunately that patch caused some regressions that are very difficult > to resolved. These were reported in pr34884. > > This patch reverses most of rd2.diff. It leaves in the fixes to the > comments and one fix to a latent bug. The patch has been tested on > ia-64, ppc-32, x86-64 and x86-32. > > 2008-01-22 Kenneth Zadeck <zadeck@naturalbridge.com> > > PR rtl-optimization/26854 > PR rtl-optimization/34400 > PR rtl-optimization/34884 > * ddg.c (create_ddg_dep_from_intra_loop_link): Use > DF_RD->gen. > * df.h (df_changeable_flags.DF_RD_NO_TRIM): Deleted > (df_rd_bb_info.expanded_lr_out): Deleted > * loop_invariant.c (find_defs): Deleted DF_RD_NO_TRIM flag. > * loop_iv.c (iv_analysis_loop_init): Ditto. * df-problems.c > (df_rd_free_bb_info, df_rd_alloc, df_rd_confluence_n, > df_rd_bb_local_compute, df_rd_transfer_function, df_rd_free): > Removed code to allocate, initialize or free expanded_lr_out. > (df_rd_bb_local_compute_process_def): Restructured to make more > understandable. > (df_rd_confluence_n): Removed code to no apply invalidate_by_call > sets if the sets are being trimmed. > > > Ok to commit. > committed as revision 131719 kenny
resolved with the patch referenced in comment 30.