Bug 34884 - [4.3 Regression] gfortran.dg/array_constructor_9.f90
Summary: [4.3 Regression] gfortran.dg/array_constructor_9.f90
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 4.3.0
: P1 normal
Target Milestone: 4.3.0
Assignee: Kenneth Zadeck
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2008-01-20 08:08 UTC by H.J. Lu
Modified: 2008-01-22 14:35 UTC (History)
7 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2008-01-21 08:04:41


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description H.J. Lu 2008-01-20 08:08:49 UTC
On Linux/Intel64, I got

FAIL: gfortran.dg/array_constructor_9.f90  -O3 -fomit-frame-pointer -funroll-loops  execution test

with revision 131671. Revision 131622 is OK.
Comment 1 Dominique d'Humieres 2008-01-20 10:35:24 UTC
Confirmed and it is a wierd regression: the test pass with '-O3' alone, but not with '-O3 -funroll-loops'. In addition if I replace the line

          if (values (last) .ne. j + k * k) call abort

by

          if (values (last) .ne. j + k * k) print *, i, j, k, values (last)

the test pass without printing anything.

It looks as a middle-end regression between revs. 131660 and 131671.

Comment 2 Dominique d'Humieres 2008-01-20 10:52:15 UTC
Fails also at '-O2 -funroll-loops', but not if I add

    return
10  print *, i, j, k, values (last)

before

  end subroutine test

Comment 3 Dominique d'Humieres 2008-01-20 11:31:05 UTC
The failure comes in the first call of 'test()' at i=order, j=2. The values tested in the k loop seems to be:

5.0000000      10.0000000       3.0000000       6.0000000       11.000000       4.0000000       7.0000000       12.000000       5.0000000

instead of

3.0000000       6.0000000       11.000000       18.000000       27.000000       38.000000       51.000000       66.000000       83.000000

The outputs of -ftree-dump-optimized are identical for revs. 131628 and 131676.

Comment 4 Dominique d'Humieres 2008-01-20 12:04:30 UTC
Worked with rev. 131669, fails with 131671. Rev. 131670 seems the most likely, added Kenneth Zadeck to the cc list.


Comment 5 Dominique d'Humieres 2008-01-20 13:19:36 UTC
The test and its variants pass on powerpc-apple-darwin9 rev. 131676.

Comment 6 Kenneth Zadeck 2008-01-20 13:53:43 UTC
I need a more info to reproduce this bug.  I bootstrapped and regression tested on x86_64-unknown-linux-gnu with suse 10.3 and using --enable-languages=c,c++,fortran  --disable-multilib before committing the patch and got 


		=== gfortran Summary ===

# of expected passes		23538
# of expected failures		4
# of unsupported tests		18

i am not doubting that the failure is related to this patch.  Given all of rest of the info, it smells like this patch is responsible, but i do not get the failure on my config.
Comment 7 Dominique d'Humieres 2008-01-20 14:39:25 UTC
> I need a more info to reproduce this bug.

I have tried to give all the info I have been able to gather on my own. My config is:

Configured with: ../gcc-4.3-work/configure --prefix=/opt/gcc/gcc4.3w --mandir=/opt/gcc/gcc4.3w/share/man --infodir=/opt/gcc/gcc4.3w/share/info --build=i686-apple-darwin9 --enable-languages=c,c++,fortran,objc,obj-c++,java --with-gmp=/sw --with-libiconv-prefix=/usr --with-system-zlib --x-includes=/usr/X11R6/include --x-libraries=/usr/X11R6/lib

As far as I can tell, the bug appears after the tree optimization, but at this point I don't know what I should dump. Having looked at the test-suite results, the problem appears on 32 bit x86 platforms. From

> --disable-multilib

I infer that you cannot try with -m32, isn't it?


Comment 8 Kenneth Zadeck 2008-01-20 15:24:15 UTC
Subject: Re:  [4.3 Regression] gfortran.dg/array_constructor_9.f90

dominiq at lps dot ens dot fr wrote:
> ------- Comment #7 from dominiq at lps dot ens dot fr  2008-01-20 14:39 -------
>   
>> I need a more info to reproduce this bug.
>>     
>
> I have tried to give all the info I have been able to gather on my own. My
> config is:
>
> Configured with: ../gcc-4.3-work/configure --prefix=/opt/gcc/gcc4.3w
> --mandir=/opt/gcc/gcc4.3w/share/man --infodir=/opt/gcc/gcc4.3w/share/info
> --build=i686-apple-darwin9 --enable-languages=c,c++,fortran,objc,obj-c++,java
> --with-gmp=/sw --with-libiconv-prefix=/usr --with-system-zlib
> --x-includes=/usr/X11R6/include --x-libraries=/usr/X11R6/lib
>
> As far as I can tell, the bug appears after the tree optimization, but at this
> point I don't know what I should dump. Having looked at the test-suite results,
> the problem appears on 32 bit x86 platforms. From
>
>   
>> --disable-multilib
>>     
>
> I infer that you cannot try with -m32, isn't it?
>
>
>   
the first comment of the bug says linux/intel64. 

your config string looks like you are building on a mac "darwin" box. 
That would be the difference. I build on a real linux box that cannot
run darwin. 

could you please send me two tar files:
one tar file from the release with out this patch containing the test
case case with the "-da" option and one from the release with the patch
with the same option.  This option will produce a large number of dump
files and from those dumps i will fix the bug. 

Thanks in advance.

kenny
Comment 9 Dominique d'Humieres 2008-01-20 15:30:52 UTC
> you are building on a mac "darwin" box

Yes indeed, but the bug is also present for i686-pc-linux-gnu, see for instance:

http://gcc.gnu.org/ml/gcc-testresults/2008-01/msg00914.html

Comment 10 Kenneth Zadeck 2008-01-20 15:39:25 UTC
Subject: Re:  [4.3 Regression] gfortran.dg/array_constructor_9.f90

dominiq at lps dot ens dot fr wrote:
> ------- Comment #9 from dominiq at lps dot ens dot fr  2008-01-20 15:30 -------
>   
>> you are building on a mac "darwin" box
>>     
>
> Yes indeed, but the bug is also present for i686-pc-linux-gnu, see for
> instance:
>
> http://gcc.gnu.org/ml/gcc-testresults/2008-01/msg00914.html
>
>
>   
i will build this on a 32 bit box.  that is my problem. sorry, thanks.

kenny
Comment 11 Dominique d'Humieres 2008-01-20 15:47:27 UTC
I have put the results of the compilation with -da with the patch at

http://www.lps.ens.fr/~dominiq/gcc/tmp_fresh.tar.bz2

All the files will be in a directory tmp_fresh.  Do you still need the same without the patch? It will take some time to reverse the patch and to do the rebuilding.

Comment 12 Kenneth Zadeck 2008-01-20 15:52:49 UTC
Subject: Re:  [4.3 Regression] gfortran.dg/array_constructor_9.f90

dominiq at lps dot ens dot fr wrote:
> ------- Comment #11 from dominiq at lps dot ens dot fr  2008-01-20 15:47 -------
> I have put the results of the compilation with -da with the patch at
>
> http://www.lps.ens.fr/~dominiq/gcc/tmp_fresh.tar.bz2
>
> All the files will be in a directory tmp_fresh.  Do you still need the same
> without the patch? It will take some time to reverse the patch and to do the
> rebuilding.
>
>
>   
let me try to build a 32 bit compiler.  that appears to be the problem.
it will be easier if i can get it on my machine. 

kenny
Comment 13 H.J. Lu 2008-01-20 15:57:29 UTC
It happens for me on Linux/Intel64 with -m32:

http://gcc.gnu.org/ml/gcc-testresults/2008-01/msg00907.html

My configuration is

configure flags: --enable-clocale=gnu --with-system-zlib --enable-decimal-float=bid --with-demangler-in-ld --enable-shared --enable-threads=posix --enable-haifa --enable-checking=assert --prefix=/usr/gcc-4.3 --with-local-prefix=/usr/local
Comment 14 Kenneth Zadeck 2008-01-20 18:30:04 UTC
confirmed on my machine,
i will have my best people work on it.

kenny
Comment 15 Kenneth Zadeck 2008-01-20 23:15:26 UTC
There appears to be an design inconsistency in the way that we have specified the various dataflow problems with respect to the eq notes.  

I hate eq notes.

In the rd patch that just went in where we trim the solution based on the lr info, it seems to also trim the bits associated with the eq notes because these are not carried around in the lr problem.  

This screws ups the use-def chains that are used in fwprop, see and web (in this particular case, the problem is in web, but that is only a matter of bad luck).

I can either blow away the rd patch for 34400, or we can teach lr to respect the  DF_EQ_NOTES patch or just have lr always propagate the eq notes as if they were real uses.  

The latter actually sounds like the only correct solution but at this late point of stage 3, we may need to sacrifice correctness for stability.

Kenny
Comment 16 Steven Bosscher 2008-01-20 23:22:37 UTC
I favor blowing away the RD patch.

Having the LR problem (and probably the LIVE problem too?) always propagate the REG_EQ* notes as if they were real uses sounds like a terrible idea to me.  They are not real uses and should not be treated as such.
Comment 17 Kenneth Zadeck 2008-01-20 23:27:59 UTC
Subject: Re:  [4.3 Regression] gfortran.dg/array_constructor_9.f90

steven at gcc dot gnu dot org wrote:
> ------- Comment #16 from steven at gcc dot gnu dot org  2008-01-20 23:22 -------
> I favor blowing away the RD patch.
>
> Having the LR problem (and probably the LIVE problem too?) always propagate the
> REG_EQ* notes as if they were real uses sounds like a terrible idea to me. 
> They are not real uses and should not be treated as such.
>
>
>   
Unless the powers that be need this fixed asap, i want to wait til i get
a consensus from the other df maintainers.   but i do see blowing it
away as a reasonable answer.  it is a shame, it actually does provide a
good speedup.

Kenny
Comment 18 Paolo Bonzini 2008-01-21 08:04:41 UTC
I agree with Steven.  The trimming could be adjusted to not trim register that are mentioned by REG_EQ* notes, but I'm not sure it would be a good idea, either.
Comment 19 Kenneth Zadeck 2008-01-21 13:02:26 UTC
Subject: Re:  [4.3 Regression] gfortran.dg/array_constructor_9.f90

bonzini at gnu dot org wrote:
> ------- Comment #18 from bonzini at gnu dot org  2008-01-21 08:04 -------
> I agree with Steven.  The trimming could be adjusted to not trim register that
> are mentioned by REG_EQ* notes, but I'm not sure it would be a good idea,
> either.
>
>   
I am testing a patch that add eq_notes to the lr problem if you specify
DF_EQ_NOTES.
If you and steven still want me to abandon this, i will, but in my mind,
it seems that if the pass wants chains built with eq_notes, then the
rest of the df information should match this decision.

Kenny
Comment 20 Paolo Bonzini 2008-01-21 13:21:45 UTC
Subject: Re:  [4.3 Regression] gfortran.dg/array_constructor_9.f90


> I am testing a patch that add eq_notes to the lr problem if you specify
> DF_EQ_NOTES.
> If you and steven still want me to abandon this, i will, but in my mind,
> it seems that if the pass wants chains built with eq_notes, then the
> rest of the df information should match this decision.

Chains is one thing, liveness is another.  EQ notes are there as 
"reminders of useful information", I don't want them to affect register 
allocation (via liveness) if the information was not used.
Comment 21 Kenneth Zadeck 2008-01-21 13:25:22 UTC
Subject: Re:  [4.3 Regression] gfortran.dg/array_constructor_9.f90

bonzini at gnu dot org wrote:
> ------- Comment #20 from bonzini at gnu dot org  2008-01-21 13:21 -------
> Subject: Re:  [4.3 Regression] gfortran.dg/array_constructor_9.f90
>
>
>   
>> I am testing a patch that add eq_notes to the lr problem if you specify
>> DF_EQ_NOTES.
>> If you and steven still want me to abandon this, i will, but in my mind,
>> it seems that if the pass wants chains built with eq_notes, then the
>> rest of the df information should match this decision.
>>     
>
> Chains is one thing, liveness is another.  EQ notes are there as 
> "reminders of useful information", I don't want them to affect register 
> allocation (via liveness) if the information was not used.
>
>
>   
I understand that, that is why if the pass does not specify DF_EQ_NOTES,
the lr (and the rest of the info) stays as it is now.  But if you are
building chains out of them, then you are treating them as live anyway.

kenny
Comment 22 stevenb.gcc@gmail.com 2008-01-21 14:29:10 UTC
Subject: Re:  [4.3 Regression] gfortran.dg/array_constructor_9.f90

On 21 Jan 2008 13:25:23 -0000, zadeck at naturalbridge dot com <gcc-
> I understand that, that is why if the pass does not specify DF_EQ_NOTES,
> the lr (and the rest of the info) stays as it is now.  But if you are
> building chains out of them, then you are treating them as live anyway.

This is not really true.  You are treating the registers as
"reaching", i.e. even though the register is not live at the point of
the REG_EQ* note, it _could_ be live holding the value expected for
the REG_EQ* note.

This difference between live registers and reaching definitions is
also why I feel so uncomfortable about the RD patch.  After trimming,
DEFs that reach a certain point in the program don't show up in the
reaching definitions anymore.  You've already found one example how
this can be confusing (the loop IV problem). It also breaks my new
const/copy prop pass, which expects definitions to reach points where
the register of the reaching def is not live.

What is confusing about all this, is that in GCC there is nothing to
prevent you from inserting a new SET of a register between a previous
set of the register and a use in an REG_EQ* note, if the register is
dead at the point of the REG_EQ* note. E.g. it is perfectly OK to have
a situation like this:

(insn 1 (set (reg1) (...)))
(insn 2 (...) REG_DEAD (reg1))
(insn 3 (...) REG_EQUAL (... ((reg1) ...)))

and nothing in the compiler will object if you would insert a new insn like so:

(insn 1 (set (reg1) (...)))
(insn 2 (...) REG_DEAD (reg1))
(insn 4 (set (reg1) (...)))
(insn 3 (...) REG_EQUAL (... ((reg1) ...)))

even though this would probably result in wrong code. The way GCC
avoids this kind of situation, is by never re-using registers this
way, unless the new definition sets reg1 to the same value (gcse PRE
does this, for example).

The effect of including REG_* notes in the LR and LIVE problems would
be to extend the live ranges of pseudos into regions where they are
not actually live.

What it comes down to, is that it seems that liveness is not a good
condition to trim reaching definitions.  Trimming LIVE with LR makes
sense because the result of the problem does not change, and the
nature of the problems are the same (i.e. compute two meanings of
liveness for registers). Live registers and reaching definitions, on
the other hand, are dissimilar problems.

For LIVE, the effect of trimming is to not compute partial
availability in regions where a register is not live. The similar
condition for reaching definitions is not liveness, but absence of
uses. To trim reaching definitions, one should really be looking at
the last reachable use of a definition, and trim from there.

I don't know what problem computes the last use of a register, but it
may well be a problem that is equivalent to the LR problem, but
considering *all* uses including REG_EQ* uses, instead of only real
uses.  But I haven't thought about this much yet, I must admit.
Comment 23 Steven Bosscher 2008-01-21 14:38:15 UTC
> The similar
> condition for reaching definitions is not liveness, but absence of
> uses. To trim reaching definitions, one should really be looking at
> the last reachable use of a definition, and trim from there.

Just to clarify, this trimming still changes the result of the reaching definitions problem, but should not change the resulting DEF-USE chains.

For the reaching definitions problem itself, I don't see any way to trim it.
Comment 24 Dominique d'Humieres 2008-01-21 14:39:27 UTC
Side note for the record: the polyhedron test mdbx.f90 also fails in 32 bit mode with rev. 131689 on i686-apple-darwin9.

Comment 25 Richard Biener 2008-01-21 14:40:39 UTC
Can we revert the offending patch and revisit this issue during stage1 please?

Thanks.
Richard.
Comment 26 Paolo Bonzini 2008-01-21 14:54:25 UTC
Subject: Re:  [4.3 Regression] gfortran.dg/array_constructor_9.f90

Agreed.  I think we'd better revert the patch now.
Comment 27 Kenneth Zadeck 2008-01-21 15:36:24 UTC
Subject: Re:  [4.3 Regression] gfortran.dg/array_constructor_9.f90

bonzini at gnu dot org wrote:
> ------- Comment #26 from bonzini at gnu dot org  2008-01-21 14:54 -------
> Subject: Re:  [4.3 Regression] gfortran.dg/array_constructor_9.f90
>
> Agreed.  I think we'd better revert the patch now.
>
>
>   
i give.

i will prepare a revert this afternoon.  i need to go run some errands now.

kenny
Comment 28 Kenneth Zadeck 2008-01-22 13:35:37 UTC
Subject: Re:  [4.3 Regression] gfortran.dg/array_constructor_9.f90

zadeck at naturalbridge dot com wrote:
> ------- Comment #27 from zadeck at naturalbridge dot com  2008-01-21 15:36 -------
> Subject: Re:  [4.3 Regression] gfortran.dg/array_constructor_9.f90
>
> bonzini at gnu dot org wrote:
>   
>> ------- Comment #26 from bonzini at gnu dot org  2008-01-21 14:54 -------
>> Subject: Re:  [4.3 Regression] gfortran.dg/array_constructor_9.f90
>>
>> Agreed.  I think we'd better revert the patch now.
>>
>>
>>   
>>     
> i give.
>
> i will prepare a revert this afternoon.  i need to go run some errands now.
>
> kenny
>
>
>   
This patch reverses most of the rd2 patch that speeds up pr3400 and pr26854.
Unfortunately that patch caused some regressions that are very difficult
to resolved.  These were reported in pr34884.

This patch reverses most of rd2.diff.  It leaves in the fixes to the
comments and one fix to a latent bug.  The patch has been tested on
ia-64, ppc-32, x86-64 and x86-32. 

2008-01-22  Kenneth Zadeck <zadeck@naturalbridge.com>

    PR rtl-optimization/26854
    PR rtl-optimization/34400
    PR rtl-optimization/34884
    * ddg.c (create_ddg_dep_from_intra_loop_link): Use
    DF_RD->gen.
    * df.h (df_changeable_flags.DF_RD_NO_TRIM): Deleted
    (df_rd_bb_info.expanded_lr_out): Deleted
    * loop_invariant.c (find_defs): Deleted DF_RD_NO_TRIM flag.
    * loop_iv.c (iv_analysis_loop_init): Ditto.  * df-problems.c
    (df_rd_free_bb_info, df_rd_alloc, df_rd_confluence_n,
    df_rd_bb_local_compute, df_rd_transfer_function, df_rd_free):
    Removed code to allocate, initialize or free expanded_lr_out.
    (df_rd_bb_local_compute_process_def): Restructured to make more
    understandable.
    (df_rd_confluence_n): Removed code to no apply invalidate_by_call
    sets if the sets are being trimmed.


Ok to commit.
Index: ddg.c
===================================================================
--- ddg.c	(revision 131716)
+++ ddg.c	(working copy)
@@ -184,13 +184,12 @@ create_ddg_dep_from_intra_loop_link (ddg
         {
           int regno = REGNO (SET_DEST (set));
           struct df_ref *first_def;
-          struct df_ref *last_def;
+          struct df_rd_bb_info *bb_info = DF_RD_BB_INFO (g->bb);
 
           first_def = df_bb_regno_first_def_find (g->bb, regno);
           gcc_assert (first_def);
 
-          last_def = df_bb_regno_last_def_find (g->bb, regno);
-          if (first_def == last_def)
+          if (bitmap_bit_p (bb_info->gen, first_def->id))
             return;
         }
     }
Index: df.h
===================================================================
--- df.h	(revision 131716)
+++ df.h	(working copy)
@@ -405,27 +405,20 @@ enum df_changeable_flags 
   DF_LR_RUN_DCE           = 1 << 0, /* Run DCE.  */
   DF_NO_HARD_REGS         = 1 << 1, /* Skip hard registers in RD and CHAIN Building.  */
 
-  /* Do not trim the solution using the LR result.  This can make the
-     solution take much longer and take more memory.  This is
-     necessary for the loop optimizations, but has a very small time
-     and space penalty because the loop optimizations process only a
-     single loop at a time.  Any pass that looks at the entire
-     function should not set this flag.  */
-  DF_RD_NO_TRIM           = 1 << 2,
-  DF_EQ_NOTES             = 1 << 3, /* Build chains with uses present in EQUIV/EQUAL notes. */
-  DF_NO_REGS_EVER_LIVE    = 1 << 4, /* Do not compute the regs_ever_live.  */
+  DF_EQ_NOTES             = 1 << 2, /* Build chains with uses present in EQUIV/EQUAL notes. */
+  DF_NO_REGS_EVER_LIVE    = 1 << 3, /* Do not compute the regs_ever_live.  */
 
   /* Cause df_insn_rescan df_notes_rescan and df_insn_delete, to
   return immediately.  This is used by passes that know how to update
   the scanning them selves.  */
-  DF_NO_INSN_RESCAN       = 1 << 5,
+  DF_NO_INSN_RESCAN       = 1 << 4,
 
   /* Cause df_insn_rescan df_notes_rescan and df_insn_delete, to
   return after marking the insn for later processing.  This allows all
   rescans to be batched.  */
-  DF_DEFER_INSN_RESCAN    = 1 << 6,
+  DF_DEFER_INSN_RESCAN    = 1 << 5,
 
-  DF_VERIFY_SCHEDULED     = 1 << 7
+  DF_VERIFY_SCHEDULED     = 1 << 6
 };
 
 /* Two of these structures are inline in df, one for the uses and one
@@ -719,37 +712,9 @@ struct df_rd_bb_info 
   /* Local sets to describe the basic blocks.   */
   bitmap kill;  
   bitmap sparse_kill;
+  bitmap gen;   /* The set of defs generated in this block.  */
 
-  /* Expanded version of the DF_LT->out bitmap to match the positions
-     of gen, in and out here.  Only allocated if DF_RD_NO_TRIM is
-     false.  */
-  bitmap expanded_lr_out;
-
-  /* The set of defs generated in this block.  This is not set unless
-     the def reaches the end of the block.  */
-  bitmap gen;
-
-  /* The results of the dataflow problem.  
-
-     If DF_RD_NO_TRIM is not set, these sets are SOMEWHAT trimmed by
-     the output of the DF_LR problem.  The out set is precisely
-     trimmed during propagation which means that the result is also
-     trimmed when the propagation terminates.  The in set is not
-     explicitly trimmed, because this is expensive (adding about 5% to
-     the cost of a bootstrap).  However since the out sets are trimmed
-     and the in sets are built from the out of the pred, the in set is
-     MOSTLY trimmed.
-
-     The counter case happens at a branch where the variable V is in
-     DF_LR->in the true branch but not the false branch.  If V is
-     defined before the branch, RD will propagate that into the
-     DF_RD_in sets of both branches.  When the block is processed, the
-     DF_RD->out set will have V trimmed out of it but it will still be
-     left in DF_RD->in.  
-
-     If this not a problem for the current optimizers since they were
-     designed before any trimming was available.  This can be fixed by
-     checking the DF_LR->in set directly.  */
+  /* The results of the dataflow problem.  */
   bitmap in;    /* At the top of the block.  */
   bitmap out;   /* At the bottom of the block.  */
 };
Index: loop-invariant.c
===================================================================
--- loop-invariant.c	(revision 131716)
+++ loop-invariant.c	(working copy)
@@ -639,7 +639,6 @@ find_defs (struct loop *loop, basic_bloc
   df_remove_problem (df_chain);
   df_process_deferred_rescans ();
   df_chain_add_problem (DF_UD_CHAIN);
-  df_set_flags (DF_RD_NO_TRIM);
   df_set_blocks (blocks);
   df_analyze ();
 
Index: loop-iv.c
===================================================================
--- loop-iv.c	(revision 131716)
+++ loop-iv.c	(working copy)
@@ -278,7 +278,6 @@ iv_analysis_loop_init (struct loop *loop
   df_remove_problem (df_chain);
   df_process_deferred_rescans ();
   df_chain_add_problem (DF_UD_CHAIN);
-  df_set_flags (DF_RD_NO_TRIM);
   df_set_blocks (blocks);
   df_analyze ();
   if (dump_file)
Index: df-problems.c
===================================================================
--- df-problems.c	(revision 131716)
+++ df-problems.c	(working copy)
@@ -245,8 +245,6 @@ df_rd_free_bb_info (basic_block bb ATTRI
   struct df_rd_bb_info *bb_info = (struct df_rd_bb_info *) vbb_info;
   if (bb_info)
     {
-      if (bb_info->expanded_lr_out)
-	BITMAP_FREE (bb_info->expanded_lr_out);
       BITMAP_FREE (bb_info->kill);
       BITMAP_FREE (bb_info->sparse_kill);
       BITMAP_FREE (bb_info->gen);
@@ -300,8 +298,6 @@ df_rd_alloc (bitmap all_blocks)
       struct df_rd_bb_info *bb_info = df_rd_get_bb_info (bb_index);
       if (bb_info)
 	{ 
-	  if (bb_info->expanded_lr_out)
-	    bitmap_clear (bb_info->expanded_lr_out);
 	  bitmap_clear (bb_info->kill);
 	  bitmap_clear (bb_info->sparse_kill);
 	  bitmap_clear (bb_info->gen);
@@ -310,10 +306,6 @@ df_rd_alloc (bitmap all_blocks)
 	{ 
 	  bb_info = (struct df_rd_bb_info *) pool_alloc (df_rd->block_pool);
 	  df_rd_set_bb_info (bb_index, bb_info);
-	  if (df->changeable_flags & DF_RD_NO_TRIM)
-	    bb_info->expanded_lr_out = NULL;
-	  else
-	    bb_info->expanded_lr_out = BITMAP_ALLOC (&problem_data->rd_bitmaps);
 	  bb_info->kill = BITMAP_ALLOC (&problem_data->rd_bitmaps);
 	  bb_info->sparse_kill = BITMAP_ALLOC (&problem_data->rd_bitmaps);
 	  bb_info->gen = BITMAP_ALLOC (&problem_data->rd_bitmaps);
@@ -328,53 +320,56 @@ df_rd_alloc (bitmap all_blocks)
 /* Process a list of DEFs for df_rd_bb_local_compute.  */
 
 static void
-df_rd_bb_local_compute_process_def (struct df_rd_bb_info *bb_info,
+df_rd_bb_local_compute_process_def (struct df_rd_bb_info *bb_info, 
 				    struct df_ref **def_rec,
 				    enum df_ref_flags top_flag)
 {
-  for (; *def_rec; def_rec++)
+  while (*def_rec)
     {
       struct df_ref *def = *def_rec;
-      unsigned int regno = DF_REF_REGNO (def);
-
-      /* This makes sure we do the artificial defs in the right order
-	 since they are all in the same list.  */
-      if (top_flag != (DF_REF_FLAGS (def) & DF_REF_AT_TOP))
-	continue;
-
-      /* Skip over the hard regs if we do not care about them.  */
-      if ((df->changeable_flags & DF_NO_HARD_REGS) && 
-	  (regno < FIRST_PSEUDO_REGISTER))
-	continue;
-
-      /* Only the last def(s) for a regno in the block has any
-	 effect.  */ 
-      if (bitmap_bit_p (seen_in_block, regno))
-	continue;
-
-      /* The first def for regno in insn gets to knock out the
-	 defs from other instructions.  */
-      if ((!bitmap_bit_p (seen_in_insn, regno))
-	  /* If the def is to only part of the reg, it does
-	     not kill the other defs that reach here.  */
-	  && (!(DF_REF_FLAGS (def) & 
-		(DF_REF_PARTIAL | DF_REF_CONDITIONAL | DF_REF_MAY_CLOBBER))))
+      if (top_flag == (DF_REF_FLAGS (def) & DF_REF_AT_TOP))
 	{
+	  unsigned int regno = DF_REF_REGNO (def);
 	  unsigned int begin = DF_DEFS_BEGIN (regno);
 	  unsigned int n_defs = DF_DEFS_COUNT (regno);
-	  if (n_defs > DF_SPARSE_THRESHOLD)
-	    bitmap_set_bit (bb_info->sparse_kill, regno);
-	  else
-	    bitmap_set_range (bb_info->kill, begin, n_defs);
-	  bitmap_clear_range(bb_info->gen, begin, n_defs);
+	  
+	  if ((!(df->changeable_flags & DF_NO_HARD_REGS))
+	      || (regno >= FIRST_PSEUDO_REGISTER))
+	    {
+	      /* Only the last def(s) for a regno in the block has any
+		 effect.  */ 
+	      if (!bitmap_bit_p (seen_in_block, regno))
+		{
+		  /* The first def for regno in insn gets to knock out the
+		     defs from other instructions.  */
+		  if ((!bitmap_bit_p (seen_in_insn, regno))
+		      /* If the def is to only part of the reg, it does
+			 not kill the other defs that reach here.  */
+		      && (!(DF_REF_FLAGS (def) & 
+			    (DF_REF_PARTIAL | DF_REF_CONDITIONAL | DF_REF_MAY_CLOBBER))))
+		    {
+		      if (n_defs > DF_SPARSE_THRESHOLD)
+			{
+			  bitmap_set_bit (bb_info->sparse_kill, regno);
+			  bitmap_clear_range(bb_info->gen, begin, n_defs);
+			}
+		      else
+			{
+			  bitmap_set_range (bb_info->kill, begin, n_defs);
+			  bitmap_clear_range (bb_info->gen, begin, n_defs);
+			}
+		    }
+		  
+		  bitmap_set_bit (seen_in_insn, regno);
+		  /* All defs for regno in the instruction may be put into
+		     the gen set.  */
+		  if (!(DF_REF_FLAGS (def) 
+			& (DF_REF_MUST_CLOBBER | DF_REF_MAY_CLOBBER)))
+		    bitmap_set_bit (bb_info->gen, DF_REF_ID (def));
+		}
+	    }
 	}
-      
-      bitmap_set_bit (seen_in_insn, regno);
-      /* All defs for regno in the instruction may be put into
-	 the gen set.  */
-      if (!(DF_REF_FLAGS (def) 
-	    & (DF_REF_MUST_CLOBBER | DF_REF_MAY_CLOBBER)))
-	bitmap_set_bit (bb_info->gen, DF_REF_ID (def));
+      def_rec++;
     }
 }
 
@@ -385,28 +380,14 @@ df_rd_bb_local_compute (unsigned int bb_
 {
   basic_block bb = BASIC_BLOCK (bb_index);
   struct df_rd_bb_info *bb_info = df_rd_get_bb_info (bb_index);
-  struct df_lr_bb_info *lr_bb_info = df_lr_get_bb_info (bb_index);
   rtx insn;
 
   bitmap_clear (seen_in_block);
   bitmap_clear (seen_in_insn);
 
-  if (!(df->changeable_flags & DF_RD_NO_TRIM))
-    {
-      unsigned int regno;
-      bitmap_iterator bi;
-      int first_reg = (df->changeable_flags & DF_NO_HARD_REGS) ? FIRST_PSEUDO_REGISTER : 0;
-      EXECUTE_IF_SET_IN_BITMAP (lr_bb_info->out, first_reg, regno, bi)
-	{
-	  unsigned int begin = DF_DEFS_BEGIN (regno);
-	  unsigned int n_defs = DF_DEFS_COUNT (regno);
-	  bitmap_set_range (bb_info->expanded_lr_out, begin, n_defs);
-	}
-    }
-
   /* Artificials are only hard regs.  */
   if (!(df->changeable_flags & DF_NO_HARD_REGS))
-    df_rd_bb_local_compute_process_def (bb_info,
+    df_rd_bb_local_compute_process_def (bb_info, 
 					df_get_artificial_defs (bb_index),
 					0);
 
@@ -504,10 +485,7 @@ df_rd_confluence_n (edge e)
   if (e->flags & EDGE_FAKE) 
     return;
 
-  /* If we are trimming the solution, the invalidated_by_call code in
-     the lr problem makes this unnecessary.  However, if we do not
-     trim, we must take this into account.  */
-  if ((df->changeable_flags & DF_RD_NO_TRIM) && e->flags & EDGE_EH)
+  if (e->flags & EDGE_EH)
     {
       struct df_rd_problem_data *problem_data
 	= (struct df_rd_problem_data *) df_rd->problem_data;
@@ -515,7 +493,7 @@ df_rd_confluence_n (edge e)
       bitmap dense_invalidated = problem_data->dense_invalidated_by_call;
       bitmap_iterator bi;
       unsigned int regno;
-      bitmap tmp = BITMAP_ALLOC (&problem_data->rd_bitmaps);
+      bitmap tmp = BITMAP_ALLOC (&df_bitmap_obstack);
 
       bitmap_copy (tmp, op2);
       bitmap_and_compl_into (tmp, dense_invalidated);
@@ -547,13 +525,13 @@ df_rd_transfer_function (int bb_index)
   bitmap gen = bb_info->gen;
   bitmap kill = bb_info->kill;
   bitmap sparse_kill = bb_info->sparse_kill;
-  bool changed = false;
 
-  if ((df->changeable_flags & DF_RD_NO_TRIM) && bitmap_empty_p (sparse_kill))
-    changed = bitmap_ior_and_compl (out, gen, in, kill);
+  if (bitmap_empty_p (sparse_kill))
+    return  bitmap_ior_and_compl (out, gen, in, kill);
   else 
     {
       struct df_rd_problem_data *problem_data;
+      bool changed = false;
       bitmap tmp;
 
       /* Note that TMP is _not_ a temporary bitmap if we end up replacing
@@ -570,8 +548,6 @@ df_rd_transfer_function (int bb_index)
 	}
       bitmap_and_compl_into (tmp, kill);
       bitmap_ior_into (tmp, gen);
-      if (!(df->changeable_flags & DF_RD_NO_TRIM))
-	bitmap_and_into (tmp, bb_info->expanded_lr_out);
       changed = !bitmap_equal_p (tmp, out);
       if (changed)
 	{
@@ -579,10 +555,9 @@ df_rd_transfer_function (int bb_index)
 	  bb_info->out = tmp;
 	}
       else 
-	BITMAP_FREE (tmp);
+	  BITMAP_FREE (tmp);
+      return changed;
     }
-
-  return changed;
 }
 
 
@@ -602,8 +577,6 @@ df_rd_free (void)
 	  struct df_rd_bb_info *bb_info = df_rd_get_bb_info (i);
 	  if (bb_info)
 	    {
-	      if (bb_info->expanded_lr_out)
-		BITMAP_FREE (bb_info->expanded_lr_out);
 	      BITMAP_FREE (bb_info->kill);
 	      BITMAP_FREE (bb_info->sparse_kill);
 	      BITMAP_FREE (bb_info->gen);
Comment 29 zadeck@gcc.gnu.org 2008-01-22 13:57:50 UTC
Subject: Bug 34884

Author: zadeck
Date: Tue Jan 22 13:57:01 2008
New Revision: 131719

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=131719
Log:
2008-01-22  Kenneth Zadeck <zadeck@naturalbridge.com>

	PR rtl-optimization/26854
	PR rtl-optimization/34400
	PR rtl-optimization/34884
	* ddg.c (create_ddg_dep_from_intra_loop_link): Use
	DF_RD->gen.
	* df.h (df_changeable_flags.DF_RD_NO_TRIM): Deleted
	(df_rd_bb_info.expanded_lr_out): Deleted
	* loop_invariant.c (find_defs): Deleted DF_RD_NO_TRIM flag.
	* loop_iv.c (iv_analysis_loop_init): Ditto.  * df-problems.c
	(df_rd_free_bb_info, df_rd_alloc, df_rd_confluence_n,
	df_rd_bb_local_compute, df_rd_transfer_function, df_rd_free):
	Removed code to allocate, initialize or free expanded_lr_out.
	(df_rd_bb_local_compute_process_def): Restructured to make more
	understandable.
	(df_rd_confluence_n): Removed code to no apply invalidate_by_call
	sets if the sets are being trimmed.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/ddg.c
    trunk/gcc/df-problems.c
    trunk/gcc/df.h
    trunk/gcc/loop-invariant.c
    trunk/gcc/loop-iv.c

Comment 30 Kenneth Zadeck 2008-01-22 13:58:06 UTC
Subject: Re:  [4.3 Regression] gfortran.dg/array_constructor_9.f90

Kenneth Zadeck wrote:
> zadeck at naturalbridge dot com wrote:
>   
>> ------- Comment #27 from zadeck at naturalbridge dot com  2008-01-21 15:36 -------
>> Subject: Re:  [4.3 Regression] gfortran.dg/array_constructor_9.f90
>>
>> bonzini at gnu dot org wrote:
>>   
>>     
>>> ------- Comment #26 from bonzini at gnu dot org  2008-01-21 14:54 -------
>>> Subject: Re:  [4.3 Regression] gfortran.dg/array_constructor_9.f90
>>>
>>> Agreed.  I think we'd better revert the patch now.
>>>
>>>
>>>   
>>>     
>>>       
>> i give.
>>
>> i will prepare a revert this afternoon.  i need to go run some errands now.
>>
>> kenny
>>
>>
>>   
>>     
> This patch reverses most of the rd2 patch that speeds up pr3400 and pr26854.
> Unfortunately that patch caused some regressions that are very difficult
> to resolved.  These were reported in pr34884.
>
> This patch reverses most of rd2.diff.  It leaves in the fixes to the
> comments and one fix to a latent bug.  The patch has been tested on
> ia-64, ppc-32, x86-64 and x86-32. 
>
> 2008-01-22  Kenneth Zadeck <zadeck@naturalbridge.com>
>
>     PR rtl-optimization/26854
>     PR rtl-optimization/34400
>     PR rtl-optimization/34884
>     * ddg.c (create_ddg_dep_from_intra_loop_link): Use
>     DF_RD->gen.
>     * df.h (df_changeable_flags.DF_RD_NO_TRIM): Deleted
>     (df_rd_bb_info.expanded_lr_out): Deleted
>     * loop_invariant.c (find_defs): Deleted DF_RD_NO_TRIM flag.
>     * loop_iv.c (iv_analysis_loop_init): Ditto.  * df-problems.c
>     (df_rd_free_bb_info, df_rd_alloc, df_rd_confluence_n,
>     df_rd_bb_local_compute, df_rd_transfer_function, df_rd_free):
>     Removed code to allocate, initialize or free expanded_lr_out.
>     (df_rd_bb_local_compute_process_def): Restructured to make more
>     understandable.
>     (df_rd_confluence_n): Removed code to no apply invalidate_by_call
>     sets if the sets are being trimmed.
>
>
> Ok to commit.
>   
committed as revision 131719

kenny
Comment 31 Kenneth Zadeck 2008-01-22 14:35:58 UTC
resolved with the patch referenced in comment 30.