Bug 45605 - Missed devirtualization
Summary: Missed devirtualization
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 4.6.0
: P3 normal
Target Milestone: ---
Assignee: Jan Hubicka
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-09-09 01:14 UTC by Jan Hubicka
Modified: 2019-12-19 19:38 UTC (History)
4 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2010-09-16 16:00:08


Attachments
WIP patch. Still misses some of Richi's earlier comments. (2.74 KB, patch)
2010-09-09 11:36 UTC, Jan Hubicka
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jan Hubicka 2010-09-09 01:14:16 UTC
The following testcase taken from testsuite (and many others):
// PR 14535
// { dg-do run }
// { dg-options "-O -finline" }
//
// Original test case failure required that Raiser constructor be inlined.

extern "C" void abort(); 
bool destructor_called = false; 
 
struct B { 
    virtual void Run(){}; 
}; 
 
struct D : public B { 
    virtual void Run() 
      { 
        struct O { 
            ~O() { destructor_called = true; }; 
        } o; 
         
        struct Raiser { 
            Raiser()  throw( int ) {throw 1;}; 
        } raiser; 
      }; 
}; 
 
int main() { 
    try { 
      D d; 
      static_cast<B&>(d).Run(); 
    } catch (...) {} 
 
    if (!destructor_called) 
      abort (); 
} 

leads to following in .optimized dump:
<bb 2>:
  MEM[(struct B *)&d]._vptr.B = &_ZTV1B[2];
  d.D.2108._vptr.B = &_ZTV1D[2];
  D.2210_2 = _ZTV1D[2];
  OBJ_TYPE_REF(D.2210_2;&d.D.2108->0) (&d.D.2108);

Obviously OBJ_TYPE_REF can be optimized here since we know the
virtual table.  Why it is missed?

I run into it because my new folding code is actually smart enough to lead to:
int (*__vtbl_ptr_type) (void)

void D::<T400> (struct D *)

D.2210_2 = Run;

The error comes because we pick from initializer of VTBL the following ADDR_EXPR:
 <addr_expr 0x7ffff7ec8b10
    type <pointer_type 0x7ffff6b7db28 __vtbl_ptr_type
        type <function_type 0x7ffff6b7da80 type <integer_type 0x7ffff7edb498 int>
            type_6 QI
            size <integer_cst 0x7ffff7ec54d8 constant 8>
            unit size <integer_cst 0x7ffff7ec5500 constant 1>
            align 8 symtab 0 alias set -1 canonical type 0x7ffff6b7da80
            pointer_to_this <pointer_type 0x7ffff6b7db28 __vtbl_ptr_type>>
        unsigned type_6 DI
        size <integer_cst 0x7ffff7ec57d0 constant 64>
        unit size <integer_cst 0x7ffff7ec57f8 constant 8>
        align 64 symtab 0 alias set -1 canonical type 0x7ffff6b7db28
        pointer_to_this <pointer_type 0x7ffff6b7dc78>>
    constant
    arg 0 <function_decl 0x7ffff6b8e500 Run
        type <method_type 0x7ffff6b9b9d8 type <void_type 0x7ffff7edbe70 void>
            type_6 QI size <integer_cst 0x7ffff7ec54d8 8> unit size <integer_cst 0x7ffff7ec5500 1>
            align 8 symtab 0 alias set -1 canonical type 0x7ffff6b9b9d8 method basetype <record_type 0x7ffff6b9b7e0 D>
            arg-types <tree_list 0x7ffff6b80af0 value <pointer_type 0x7ffff6b9ba80>
                chain <tree_list 0x7ffff7eed398 value <void_type 0x7ffff7edbe70 void>>>
            pointer_to_this <pointer_type 0x7ffff6bab150>>
        addressable volatile asm_written used public static weak autoinline virtual decl_5 QI file /home/jh/a.C line 15 col 18 align 16 context <record_type 0x7ffff6b9b7e0 D> initial <error_mark 0x7ffff7ecdb28>
        arguments <parm_decl 0x7ffff7ece990 this type <pointer_type 0x7ffff6b9bb28>
            readonly unsigned DI file /home/jh/a.C line 15 col 22 size <integer_cst 0x7ffff7ec57d0 64> unit size <integer_cst 0x7ffff7ec57f8 8>
            align 64 context <function_decl 0x7ffff6b8e500 Run>
            (reg/f:DI 62 [ this ]) arg-type <pointer_type 0x7ffff6b9bb28>
            incoming-rtl (reg:DI 5 di [ this ])>
        result <result_decl 0x7ffff7ed0380 D.2112 type <void_type 0x7ffff7edbe70 void>
            ignored VOID file /home/jh/a.C line 16 col 7
            align 8 context <function_decl 0x7ffff6b8e500 Run>>
        full-name "virtual void D::Run()"
        pending-inline-info 0x7ffff6ba42a0
        (mem:QI (symbol_ref/i:DI ("_ZN1D3RunEv") [flags 0x1] <function_decl 0x7ffff6b8e500 Run>) [0 S1 A8])>>

Note the differece in between type of address and the method RUN.  This is what makes verifier unhappy.  How to fix this?  Should FE put there some NOP_EXPR somewhere? Still devirtualizing should realize this case a lot earlier.

Honza
Comment 1 Andrew Pinski 2010-09-09 01:17:14 UTC
I think this is the same issue as PR 19816.
Comment 2 rguenther@suse.de 2010-09-09 08:43:37 UTC
Subject: Re:   New: Missed devirtualization

On Thu, 9 Sep 2010, hubicka at gcc dot gnu dot org wrote:

> The following testcase taken from testsuite (and many others):
> // PR 14535
> // { dg-do run }
> // { dg-options "-O -finline" }
> //
> // Original test case failure required that Raiser constructor be inlined.
> 
> extern "C" void abort(); 
> bool destructor_called = false; 
> 
> struct B { 
>     virtual void Run(){}; 
> }; 
> 
> struct D : public B { 
>     virtual void Run() 
>       { 
>         struct O { 
>             ~O() { destructor_called = true; }; 
>         } o; 
> 
>         struct Raiser { 
>             Raiser()  throw( int ) {throw 1;}; 
>         } raiser; 
>       }; 
> }; 
> 
> int main() { 
>     try { 
>       D d; 
>       static_cast<B&>(d).Run(); 
>     } catch (...) {} 
> 
>     if (!destructor_called) 
>       abort (); 
> } 
> 
> leads to following in .optimized dump:
> <bb 2>:
>   MEM[(struct B *)&d]._vptr.B = &_ZTV1B[2];
>   d.D.2108._vptr.B = &_ZTV1D[2];
>   D.2210_2 = _ZTV1D[2];
>   OBJ_TYPE_REF(D.2210_2;&d.D.2108->0) (&d.D.2108);
> 
> Obviously OBJ_TYPE_REF can be optimized here since we know the
> virtual table.  Why it is missed?
> 
> I run into it because my new folding code is actually smart enough to lead to:
> int (*__vtbl_ptr_type) (void)
> 
> void D::<T400> (struct D *)
> 
> D.2210_2 = Run;
> 
> The error comes because we pick from initializer of VTBL the following
> ADDR_EXPR:
>  <addr_expr 0x7ffff7ec8b10
>     type <pointer_type 0x7ffff6b7db28 __vtbl_ptr_type
>         type <function_type 0x7ffff6b7da80 type <integer_type 0x7ffff7edb498
> int>
>             type_6 QI
>             size <integer_cst 0x7ffff7ec54d8 constant 8>
>             unit size <integer_cst 0x7ffff7ec5500 constant 1>
>             align 8 symtab 0 alias set -1 canonical type 0x7ffff6b7da80
>             pointer_to_this <pointer_type 0x7ffff6b7db28 __vtbl_ptr_type>>
>         unsigned type_6 DI
>         size <integer_cst 0x7ffff7ec57d0 constant 64>
>         unit size <integer_cst 0x7ffff7ec57f8 constant 8>
>         align 64 symtab 0 alias set -1 canonical type 0x7ffff6b7db28
>         pointer_to_this <pointer_type 0x7ffff6b7dc78>>
>     constant
>     arg 0 <function_decl 0x7ffff6b8e500 Run
>         type <method_type 0x7ffff6b9b9d8 type <void_type 0x7ffff7edbe70 void>
>             type_6 QI size <integer_cst 0x7ffff7ec54d8 8> unit size
> <integer_cst 0x7ffff7ec5500 1>
>             align 8 symtab 0 alias set -1 canonical type 0x7ffff6b9b9d8 method
> basetype <record_type 0x7ffff6b9b7e0 D>
>             arg-types <tree_list 0x7ffff6b80af0 value <pointer_type
> 0x7ffff6b9ba80>
>                 chain <tree_list 0x7ffff7eed398 value <void_type 0x7ffff7edbe70
> void>>>
>             pointer_to_this <pointer_type 0x7ffff6bab150>>
>         addressable volatile asm_written used public static weak autoinline
> virtual decl_5 QI file /home/jh/a.C line 15 col 18 align 16 context
> <record_type 0x7ffff6b9b7e0 D> initial <error_mark 0x7ffff7ecdb28>
>         arguments <parm_decl 0x7ffff7ece990 this type <pointer_type
> 0x7ffff6b9bb28>
>             readonly unsigned DI file /home/jh/a.C line 15 col 22 size
> <integer_cst 0x7ffff7ec57d0 64> unit size <integer_cst 0x7ffff7ec57f8 8>
>             align 64 context <function_decl 0x7ffff6b8e500 Run>
>             (reg/f:DI 62 [ this ]) arg-type <pointer_type 0x7ffff6b9bb28>
>             incoming-rtl (reg:DI 5 di [ this ])>
>         result <result_decl 0x7ffff7ed0380 D.2112 type <void_type
> 0x7ffff7edbe70 void>
>             ignored VOID file /home/jh/a.C line 16 col 7
>             align 8 context <function_decl 0x7ffff6b8e500 Run>>
>         full-name "virtual void D::Run()"
>         pending-inline-info 0x7ffff6ba42a0
>         (mem:QI (symbol_ref/i:DI ("_ZN1D3RunEv") [flags 0x1] <function_decl
> 0x7ffff6b8e500 Run>) [0 S1 A8])>>
> 
> Note the differece in between type of address and the method RUN.  This is what
> makes verifier unhappy.  How to fix this?  Should FE put there some NOP_EXPR
> somewhere? Still devirtualizing should realize this case a lot earlier.

Well - it's easy.  Whoever does substitution needs to verify it can
do so.  Which of course is tricky with the stupid way we do
substitute_and_fold ...

Richard.
Comment 3 Jan Hubicka 2010-09-09 11:02:37 UTC
Hmm, is it?
C equivalent IMO is:
int a(void);

typedef void (*ptr) (void);

static const ptr array[1]={(ptr)a};
ptr t;
main()
{
  t=array[0];
}

Here we have ctor represented as follows:
 <nop_expr 0x7ffff6b53f30
    type <pointer_type 0x7ffff6b383f0
        type <function_type 0x7ffff7eee690 type <void_type 0x7ffff7ed9e70 void>
            QI
            size <integer_cst 0x7ffff7ec54d8 constant 8>
            unit size <integer_cst 0x7ffff7ec5500 constant 1>
            align 8 symtab 0 alias set -1 canonical type 0x7ffff7eee690
            arg-types <tree_list 0x7ffff7ec5eb0 value <void_type 0x7ffff7ed9e70 void>>
            pointer_to_this <pointer_type 0x7ffff6b383f0>>
        unsigned DI
        size <integer_cst 0x7ffff7ec57d0 constant 64>
        unit size <integer_cst 0x7ffff7ec57f8 constant 8>
        align 64 symtab 0 alias set -1 canonical type 0x7ffff6b383f0
        pointer_to_this <pointer_type 0x7ffff6b38738>>
    constant
    arg 0 <addr_expr 0x7ffff6b53f00
        type <pointer_type 0x7ffff6b38bd0 type <function_type 0x7ffff7eee888>
            unsigned DI size <integer_cst 0x7ffff7ec57d0 64> unit size <integer_cst 0x7ffff7ec57f8 8>
            align 64 symtab 0 alias set -1 canonical type 0x7ffff6b38bd0>
        constant
        arg 0 <function_decl 0x7ffff6b37f00 a type <function_type 0x7ffff7eee888>
            addressable used public external decl_5 QI file t.c line 1 col 5 align 8 chain <var_decl 0x7ffff6b550a0 t>>
        t.c:5:33>
    t.c:5:28>

this gets into canonicalize_constructor_val and it does STRIP_NOPS so we
return addr_expr with different type than array_ref from fold_const_aggregate_ref (this code is copied from old implementation, the STRIP_NOPS there seemed bid odd to me originally) and this lands in ccp_fold that is happy and CCP re-inserts the nop later.

I guess C++ FE is wrong to produce type mismatch in addr_expr?  That should be easy to fix.

I wonder about other users of fold_const_aggregate_ref. Shall they also work on re-inserting the conversions to avoid possible type mismatch?

Obviously more worms...
Honza
Comment 4 Jan Hubicka 2010-09-09 11:33:05 UTC
testing the following fix:
Index: class.c
===================================================================
--- class.c     (revision 163947)
+++ class.c     (working copy)
@@ -7797,7 +7797,7 @@ build_vtbl_initializer (tree binfo,
            {
              fn = abort_fndecl;
              if (abort_fndecl_addr == NULL)
-               abort_fndecl_addr = build1 (ADDR_EXPR, vfunc_ptr_type_node, fn);
+               fold_convert (vfunc_ptr_type_node, build_fold_addr_expr (fn));
              init = abort_fndecl_addr;
            }
          else
@@ -7810,7 +7810,7 @@ build_vtbl_initializer (tree binfo,
                }
              /* Take the address of the function, considering it to be of an
                 appropriate generic type.  */
-             init = build1 (ADDR_EXPR, vfunc_ptr_type_node, fn);
+             init = fold_convert (vfunc_ptr_type_node, build_fold_addr_expr (fn));
            }
this solves the ICE, but won't get call devirtualized. We end up with:
  MEM[(struct B *)&d]._vptr.B = &_ZTV1B[2];
  d.D.2108._vptr.B = &_ZTV1D[2];
  D.2210_2 = (int (*__vtbl_ptr_type) (void)) Run;
  OBJ_TYPE_REF(D.2210_2;&d.D.2108->0) (&d.D.2108);
It seems that forw-prop should be told that OBJ_TYPE_REF does not care about nops on the operand and also fold_ccp can be told to call fold_obj_type_ref on the substituted constant.  Martin, can you take a look, please?

I am attaching current WIP version of my constructor folding that makes D.2210_2 = (int (*__vtbl_ptr_type) (void)) Run; to happen.

Honza
Comment 5 Jan Hubicka 2010-09-09 11:36:07 UTC
Created attachment 21750 [details]
WIP patch. Still misses some of Richi's earlier comments.
Comment 6 Jan Hubicka 2010-09-10 09:42:33 UTC
Subject: Bug 45605

Author: hubicka
Date: Fri Sep 10 09:42:20 2010
New Revision: 164148

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=164148
Log:

	PR tree-optimization/45605
	* cp/class.c (build_vtbl_initializer): Avoid wrong type conversion in ADDR_EXPR


Modified:
    trunk/gcc/cp/ChangeLog
    trunk/gcc/cp/class.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/g++.dg/inherit/covariant7.C

Comment 7 Martin Jambor 2010-09-15 18:42:55 UTC
Well, it turns out that fold_stmt_1 is never called on that statement
(neither with -O -finline or -O2 or -O3).  Where is it supposed to be
called from?
Comment 8 rguenther@suse.de 2010-09-15 19:09:59 UTC
Subject: Re:  Missed devirtualization

On Wed, 15 Sep 2010, jamborm at gcc dot gnu dot org wrote:

> ------- Comment #7 from jamborm at gcc dot gnu dot org  2010-09-15 18:42 -------
> Well, it turns out that fold_stmt_1 is never called on that statement
> (neither with -O -finline or -O2 or -O3).  Where is it supposed to be
> called from?

We fold a stmt only if it is propagated to (by ccp, copyprop, forwprop,
dom or by inlining).
Comment 9 Jan Hubicka 2010-09-15 22:39:59 UTC
Subject: Re:  Missed devirtualization

> We fold a stmt only if it is propagated to (by ccp, copyprop, forwprop,
> dom or by inlining).
Well, since fold_stmt is stornger than what fe does, I guess we should fold each stmt at least once.

Honza
Comment 10 rguenther@suse.de 2010-09-16 08:50:56 UTC
Subject: Re:  Missed devirtualization

On Wed, 15 Sep 2010, hubicka at ucw dot cz wrote:

> ------- Comment #9 from hubicka at ucw dot cz  2010-09-15 22:39 -------
> Subject: Re:  Missed devirtualization
> 
> > We fold a stmt only if it is propagated to (by ccp, copyprop, forwprop,
> > dom or by inlining).
> Well, since fold_stmt is stornger than what fe does, I guess we should fold
> each stmt at least once.

No we shouldn't.  We do fold some stmts during gimplification.

Richard.
Comment 11 Jan Hubicka 2010-09-16 12:25:30 UTC
Hmm, so do you have any idea where folding should be added for this particular case?

It always seemed to me that it would make sense to add verifier that all statements are folded (locally, not by looking at SSA graph) at optimization pass boundaries. Tried it in the past and found a lot of issues that got fixed, but we never got into agreeing on any policy here.

Missing optimizations just because we are stupid enough to forget call fold when updating something seems bad IMO.
Comment 12 rguenther@suse.de 2010-09-16 12:31:37 UTC
Subject: Re:  Missed devirtualization

On Thu, 16 Sep 2010, hubicka at gcc dot gnu dot org wrote:

> ------- Comment #11 from hubicka at gcc dot gnu dot org  2010-09-16 12:25 -------
> Hmm, so do you have any idea where folding should be added for this particular
> case?
> 
> It always seemed to me that it would make sense to add verifier that all
> statements are folded (locally, not by looking at SSA graph) at optimization
> pass boundaries. Tried it in the past and found a lot of issues that got fixed,
> but we never got into agreeing on any policy here.
> 
> Missing optimizations just because we are stupid enough to forget call fold
> when updating something seems bad IMO.

I'm lost in this PR - for what testcase what statement needs folding
(and what pending patches do I need to apply to see that)?
Comment 13 Jan Hubicka 2010-09-16 12:48:55 UTC
Subject: Re:  Missed devirtualization

> I'm lost in this PR - for what testcase what statement needs folding
> (and what pending patches do I need to apply to see that)?
PR is tracking missed optimization in the testcase in comment 0.

There are two issues



1) OBJ_TYPE_REF folding should handle it.  For that we seem to need to
evern call fold on 
  OBJ_TYPE_REF(D.2210_2;&d.D.2108->0) (&d.D.2108);
this you can see on mainline

2) generic folding should work it out the hard way. I.e. for:
  MEM[(struct B *)&d]._vptr.B = &_ZTV1B[2];
  d.D.2108._vptr.B = &_ZTV1D[2];
  D.2210_2 = _ZTV1D[2];
  OBJ_TYPE_REF(D.2210_2;&d.D.2108->0) (&d.D.2108);
there is nothing that prevents us to resolve _ZTV1D[2] into pointer to Run (by
looking into initializer of vtable variable) and then take OBJ_TYPE_REF away
since it is pointless when first operand is known function.

With patch http://gcc.gnu.org/ml/gcc-patches/2010-09/msg01190.html we closer in
a way that .vpr1 dump has:
  MEM[(struct B *)&d]._vptr.B = &_ZTV1B[2];
  d.D.2078._vptr.B = &_ZTV1D[2];
  D.2179_1 = &_ZTV1D[2];
  D.2180_2 = (int (*__vtbl_ptr_type) (void)) Run;
  OBJ_TYPE_REF(D.2180_2;&d.D.2078->0) (&d.D.2078);

Somewhere I have patch that adds OBJ_TYPE_REF folding into CCP (so when first
argument is function pointer, we just fold into direct call). I will update it
and submit after http://gcc.gnu.org/ml/gcc-patches/2010-09/msg01190.html
is resolved.  Then still there is problem that resolution comes late
since we need FRE to fold
 d.D.2078._vptr.B = &_ZTV1D[2];
  D.2179_1 =  d.D.2078._vptr.B
and FRE is not run early (and we should devirutalize everything early
to get inlining)



1) is priority IMO (at moment we make amazingly little devirtualization at
Mozilla, about 20 calls).
2) is just side effect of my attempt to get folding working that I run into
while looking into kernel poor man C vtables (and ours targhooks).

Honza
Comment 14 rguenther@suse.de 2010-09-16 12:51:02 UTC
Subject: Re:  Missed devirtualization

On Thu, 16 Sep 2010, hubicka at ucw dot cz wrote:

> ------- Comment #13 from hubicka at ucw dot cz  2010-09-16 12:48 -------
> Subject: Re:  Missed devirtualization
> 
> > I'm lost in this PR - for what testcase what statement needs folding
> > (and what pending patches do I need to apply to see that)?
> PR is tracking missed optimization in the testcase in comment 0.
> 
> There are two issues
> 
> 
> 
> 1) OBJ_TYPE_REF folding should handle it.  For that we seem to need to
> evern call fold on 
>   OBJ_TYPE_REF(D.2210_2;&d.D.2108->0) (&d.D.2108);
> this you can see on mainline
> 
> 2) generic folding should work it out the hard way. I.e. for:
>   MEM[(struct B *)&d]._vptr.B = &_ZTV1B[2];
>   d.D.2108._vptr.B = &_ZTV1D[2];
>   D.2210_2 = _ZTV1D[2];
>   OBJ_TYPE_REF(D.2210_2;&d.D.2108->0) (&d.D.2108);
> there is nothing that prevents us to resolve _ZTV1D[2] into pointer to Run (by
> looking into initializer of vtable variable) and then take OBJ_TYPE_REF away
> since it is pointless when first operand is known function.
> 
> With patch http://gcc.gnu.org/ml/gcc-patches/2010-09/msg01190.html we closer in
> a way that .vpr1 dump has:
>   MEM[(struct B *)&d]._vptr.B = &_ZTV1B[2];
>   d.D.2078._vptr.B = &_ZTV1D[2];
>   D.2179_1 = &_ZTV1D[2];
>   D.2180_2 = (int (*__vtbl_ptr_type) (void)) Run;
>   OBJ_TYPE_REF(D.2180_2;&d.D.2078->0) (&d.D.2078);
> 
> Somewhere I have patch that adds OBJ_TYPE_REF folding into CCP (so when first
> argument is function pointer, we just fold into direct call). I will update it
> and submit after http://gcc.gnu.org/ml/gcc-patches/2010-09/msg01190.html
> is resolved.  Then still there is problem that resolution comes late
> since we need FRE to fold
>  d.D.2078._vptr.B = &_ZTV1D[2];
>   D.2179_1 =  d.D.2078._vptr.B
> and FRE is not run early (and we should devirutalize everything early
> to get inlining)
> 
> 
> 
> 1) is priority IMO (at moment we make amazingly little devirtualization at
> Mozilla, about 20 calls).
> 2) is just side effect of my attempt to get folding working that I run into
> while looking into kernel poor man C vtables (and ours targhooks).

1) should be fixed by using fold_stmt in gimplify_call instead of
just fold_call_expr.
Comment 15 Richard Biener 2010-09-16 14:55:03 UTC
Like

Index: gimplify.c
===================================================================
--- gimplify.c  (revision 164333)
+++ gimplify.c  (working copy)
@@ -2477,10 +2477,13 @@ gimplify_call_expr (tree *expr_p, gimple
      gimplify_modify_expr.  */
   if (!want_value)
     {
+      gimple_stmt_iterator gsi;
       /* The CALL_EXPR in *EXPR_P is already in GIMPLE form, so all we
         have to do is replicate it as a GIMPLE_CALL tuple.  */
       call = gimple_build_call_from_tree (*expr_p);
       gimplify_seq_add_stmt (pre_p, call);
+      gsi = gsi_last (*pre_p);
+      fold_stmt (&gsi);
       *expr_p = NULL_TREE;
     }
 

but gimple_fold_obj_type_ref_known_binfo returns NULL.
Comment 16 Martin Jambor 2010-09-16 16:00:08 UTC
(In reply to comment #15)
> Like
> 
> Index: gimplify.c
> ===================================================================
> --- gimplify.c  (revision 164333)
> +++ gimplify.c  (working copy)
> @@ -2477,10 +2477,13 @@ gimplify_call_expr (tree *expr_p, gimple
>       gimplify_modify_expr.  */
>    if (!want_value)
>      {
> +      gimple_stmt_iterator gsi;
>        /* The CALL_EXPR in *EXPR_P is already in GIMPLE form, so all we
>          have to do is replicate it as a GIMPLE_CALL tuple.  */
>        call = gimple_build_call_from_tree (*expr_p);
>        gimplify_seq_add_stmt (pre_p, call);
> +      gsi = gsi_last (*pre_p);
> +      fold_stmt (&gsi);
>        *expr_p = NULL_TREE;
>      }
> 

Will this also work also for GIMPLE_CALLs with a LHS or do I have to
add something like the above also elsewhere?

> 
> but gimple_fold_obj_type_ref_known_binfo returns NULL.
> 

cgraph_function_flags_ready needs to be added to the conjunction
(!node->analyzed && !node->in_other_partition) and then it is folded.
I'll prepare a patch tomorrow.

Thanks,  Martin
Comment 17 rguenther@suse.de 2010-09-16 16:06:51 UTC
Subject: Re:  Missed devirtualization

On Thu, 16 Sep 2010, jamborm at gcc dot gnu dot org wrote:

> 
> 
> ------- Comment #16 from jamborm at gcc dot gnu dot org  2010-09-16 16:00 -------
> (In reply to comment #15)
> > Like
> > 
> > Index: gimplify.c
> > ===================================================================
> > --- gimplify.c  (revision 164333)
> > +++ gimplify.c  (working copy)
> > @@ -2477,10 +2477,13 @@ gimplify_call_expr (tree *expr_p, gimple
> >       gimplify_modify_expr.  */
> >    if (!want_value)
> >      {
> > +      gimple_stmt_iterator gsi;
> >        /* The CALL_EXPR in *EXPR_P is already in GIMPLE form, so all we
> >          have to do is replicate it as a GIMPLE_CALL tuple.  */
> >        call = gimple_build_call_from_tree (*expr_p);
> >        gimplify_seq_add_stmt (pre_p, call);
> > +      gsi = gsi_last (*pre_p);
> > +      fold_stmt (&gsi);
> >        *expr_p = NULL_TREE;
> >      }
> > 
> 
> Will this also work also for GIMPLE_CALLs with a LHS or do I have to
> add something like the above also elsewhere?

Yes.

> > 
> > but gimple_fold_obj_type_ref_known_binfo returns NULL.
> > 
> 
> cgraph_function_flags_ready needs to be added to the conjunction
> (!node->analyzed && !node->in_other_partition) and then it is folded.
> I'll prepare a patch tomorrow.
Comment 18 Martin Jambor 2010-09-17 14:46:03 UTC
Honza submitted a patch (http://gcc.gnu.org/ml/gcc-patches/2010-09/msg01369.html) so I guess it is his PR now :-)
Comment 19 Jan Hubicka 2010-09-18 21:25:45 UTC
Subject: Bug 45605

Author: hubicka
Date: Sat Sep 18 21:25:29 2010
New Revision: 164402

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=164402
Log:

	PR tree-optimization/45605
	* cgraphunit.c (cgraph_analyze_functions): Allocate bitmap obstack.
	* gimple-fold.c (static_object_in_other_unit_p): New function.
	(canonicalize_constructor_val): Use it.
	(get_symbol_constant_value): Be reaqdy for canonicalize_constructor_val
	returning NULL.
	(gimple_fold_obj_type_ref_known_binfo): Use static_object_in_other_unit_p.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cgraphunit.c
    trunk/gcc/gimple-fold.c

Comment 20 Jan Hubicka 2010-09-20 15:48:57 UTC
Subject: Bug 45605

Author: hubicka
Date: Mon Sep 20 15:48:42 2010
New Revision: 164438

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=164438
Log:

	PR tree-optimize/45605
	* cgraph.h (const_value_known_p): Declare.
	(varpool_decide_const_value_known): Remove.
	* tree-ssa-ccp.c (get_base_constructor): Use it.
	* lto-cgraph.c (compute_ltrans_boundary): Likewise.
	* expr.c (string_constant): Likewise.
	* tree-ssa-loop-ivcanon.c (constant_after_peeling): Likewise.
	* ipa.c (ipa_discover_readonly_nonaddressable_var,
	function_and_variable_visibility): Likewise.
	* gimplify.c (gimplify_call_expr): Likewise.
	* gimple-fold.c (get_symbol_constant_value): Likewise.
	* varpool.c (varpool_decide_const_value_known): Replace by...
	(const_value_known_p): ... this one; handle other kinds of DECLs
	too and work for automatic vars.
	(varpool_finalize_decl): Use const_value_known_p.

	* lto.c (lto_promote_cross_file_statics): Use const_value_known_p.

	* g++.dg/tree-ssa/pr45605.C: New testcase.

Added:
    trunk/gcc/testsuite/g++.dg/tree-ssa/pr45605.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cgraph.h
    trunk/gcc/expr.c
    trunk/gcc/gimple-fold.c
    trunk/gcc/gimplify.c
    trunk/gcc/ipa.c
    trunk/gcc/lto-cgraph.c
    trunk/gcc/lto/ChangeLog
    trunk/gcc/lto/lto.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-ccp.c
    trunk/gcc/tree-ssa-loop-ivcanon.c
    trunk/gcc/varpool.c

Comment 21 Jan Hubicka 2010-09-20 15:53:51 UTC
OK, we now fold the testcase using obj_type_ref folding.  We still should do it via vtable lookup and we don't but that is for other PR I guess.