Bug 64896 - [4.9/5 Regression] ICE in get_address_mode, at rtlanal.c:5442
Summary: [4.9/5 Regression] ICE in get_address_mode, at rtlanal.c:5442
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: ipa (show other bugs)
Version: 5.0
: P1 normal
Target Milestone: 5.0
Assignee: Jan Hubicka
URL:
Keywords:
: 61207 (view as bug list)
Depends on:
Blocks:
 
Reported: 2015-02-01 19:40 UTC by Markus Trippelsdorf
Modified: 2015-03-10 19:21 UTC (History)
3 users (show)

See Also:
Host:
Target: x86_64-*-*
Build:
Known to work: 4.8.4
Known to fail: 4.9.2
Last reconfirmed: 2015-02-02 00:00:00


Attachments
unreduced testcase (943.59 KB, application/x-xz)
2015-02-02 11:52 UTC, Markus Trippelsdorf
Details
gcc5-pr64896.patch (749 bytes, patch)
2015-02-06 12:50 UTC, Jakub Jelinek
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Markus Trippelsdorf 2015-02-01 19:40:04 UTC
qtwebkit-5.4 fails to build:

 % g++ -c -w -O2 -std=c++0x -march=amdfam10 SVGAllInOne.ii
svg/SVGSVGElement.cpp: In member function ‘WebCore::FloatRect WebCore::SVGSVGElement::viewport() const’:
svg/SVGSVGElement.cpp:60:5443: internal compiler error: in get_address_mode, at rtlanal.c:5442
0xc411f1 get_address_mode(rtx_def*)
        ../../gcc/gcc/rtlanal.c:5442
0x98fb82 adjust_address_1(rtx_def*, machine_mode, long, int, int, int, long)
        ../../gcc/gcc/emit-rtl.c:2227
0x9c600f store_field
        ../../gcc/gcc/expr.c:6715
0x9cbbc0 expand_assignment(tree_node*, tree_node*, bool)
        ../../gcc/gcc/expr.c:5000
0x8cda6e expand_gimple_stmt_1
        ../../gcc/gcc/cfgexpand.c:3385
0x8cda6e expand_gimple_stmt
        ../../gcc/gcc/cfgexpand.c:3481
0x8d4807 expand_gimple_basic_block
        ../../gcc/gcc/cfgexpand.c:5397
0x8d6027 execute
        ../../gcc/gcc/cfgexpand.c:6006
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.

Adding -fno-ipa-icf fixes the issue. The testcase is very large: ~10MB.
Reduced testcase should be ready tomorrow.
Comment 1 Markus Trippelsdorf 2015-02-02 11:52:05 UTC
Created attachment 34644 [details]
unreduced testcase

I'm having a hard time reducing the testcase.
Unreduced testcase attached.
Comment 2 Jakub Jelinek 2015-02-02 15:00:22 UTC
Started with r220230.  Supposedly some thunk related ICF stuff.
Comment 3 Markus Trippelsdorf 2015-02-02 15:59:48 UTC
markus@x4 tmp % cat SVGAllInOne.ii
class A
{
  int m_x, m_y;
};
class B
{
  A m_location;
  int m_size;
};
class C
{
public:
  virtual B m_fn1 () const;
};
class D
{
  B m_fn2 () const;
  int m_fn3 () const;
  C *m_fn4 () const;
};
int
D::m_fn3 () const
{
  m_fn4 ()->m_fn1 ();
}
B
D::m_fn2 () const
{
  return B ();
}
class F : C
{
  B
  m_fn1 () const
  {
    return B ();
  }
};

markus@x4 tmp % /var/tmp/gcc_test/usr/local/bin/g++ -c -O2 SVGAllInOne.ii
SVGAllInOne.ii: In member function ‘B D::m_fn2() const’:
SVGAllInOne.ii:27:1: internal compiler error: in expand_expr_addr_expr_1, at expr.c:7735
 D::m_fn2 () const
 ^
0x9c3db7 expand_expr_addr_expr_1
        ../../gcc/gcc/expr.c:7735
0x9b81b9 expand_expr_addr_expr
        ../../gcc/gcc/expr.c:7849
0x9b81b9 expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
        ../../gcc/gcc/expr.c:10723
0x9b8910 expand_expr
        ../../gcc/gcc/expr.h:254
0x9b8910 expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
        ../../gcc/gcc/expr.c:9903
0x9cbb46 expand_expr
        ../../gcc/gcc/expr.h:254
0x9cbb46 expand_assignment(tree_node*, tree_node*, bool)
        ../../gcc/gcc/expr.c:5085
0x8cd8fe expand_gimple_stmt_1
        ../../gcc/gcc/cfgexpand.c:3385
0x8cd8fe expand_gimple_stmt
        ../../gcc/gcc/cfgexpand.c:3481
0x8d4697 expand_gimple_basic_block
        ../../gcc/gcc/cfgexpand.c:5397
0x8d5eb7 execute
        ../../gcc/gcc/cfgexpand.c:6006
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
Comment 4 Jan Hubicka 2015-02-02 18:11:49 UTC
Hmm, mine I suppose ;)
Comment 5 Jan Hubicka 2015-02-04 23:21:08 UTC
Hmm, an RTL expansion issue. We optimize m_fn2 as

B D::m_fn2() const (const struct D * const this)                                
{                                                                               
;;   basic block 2, loop depth 0                                                
;;    pred:       ENTRY                                                         
  MEM[(struct B *)&<retval>] = 0;                                               
  MEM[(struct B *)&<retval> + 4B] = 0;                                          
  MEM[(struct B *)&<retval> + 8B] = 0;                                          
  return <retval>;                                                              
;;    succ:       EXIT                                                          
                                                                                
}                                                                               


and we want to store

 <result_decl 0x7ffff7ff82d0 D.2359
    type <record_type 0x7ffff6c55000 B type_5 type_6 BLK
        size <integer_cst 0x7ffff6c50bd0 constant 96>
        unit size <integer_cst 0x7ffff6c50c00 constant 12>
        align 32 symtab 0 alias set 5 canonical type 0x7ffff6c55000
        fields <field_decl 0x7ffff6c438e8 m_location type <record_type 0x7ffff6c44dc8 A>
            private nonlocal decl_3 DI file /aux/hubicka/t.ii line 7 col 5
            size <integer_cst 0x7ffff6ad6e58 constant 64>
            unit size <integer_cst 0x7ffff6ad6e70 constant 8>
            align 32 offset_align 128
            offset <integer_cst 0x7ffff6ad6e88 constant 0>
            bit offset <integer_cst 0x7ffff6ad6ed0 constant 0> context <record_type 0x7ffff6c55000 B> chain <field_decl 0x7ffff6c43980 m_size>> context <translation_unit_decl 0x7ffff7ff81e0 D.1>
        full-name "class B"
        X() X(constX&) this=(X&) n_parents=0 use_template=0 interface-unknown
        pointer_to_this <pointer_type 0x7ffff6c5a738> chain <type_decl 0x7ffff6c437b8 B>>
    used ignored regdecl BLK file /aux/hubicka/t.ii line 27 col 13 size <integer_cst 0x7ffff6c50bd0 96> unit size <integer_cst 0x7ffff6c50c00 12>
    align 32 context <function_decl 0x7ffff6c561b0 m_fn2>
    (parallel:BLK [
        (expr_list:REG_DEP_TRUE (reg:DI 87 [ <retval> ])
            (const_int 0 [0]))
        (expr_list:REG_DEP_TRUE (reg:SI 88 [ <retval>+8 ])
            (const_int 8 [0x8]))
    ])>

into its DECL_RTL but somehow we end up not doing that correctly.

Without -fipa-icf we produce:

B D::m_fn2() const (const struct D * const this)                                
{                                                                               
  struct B D.2398;                                                              
                                                                                
  <bb 2>:                                                                       
  D.2398.m_location.m_x = 0;                                                    
  D.2398.m_location.m_y = 0;                                                    
  D.2398.m_size = 0;                                                            
  return D.2398;                                                                
                                                                                
}                                                                               

that looks equivalent but gets compiled well.

We decide to unify m_fn1 and m_fn2 as:

virtual B F::m_fn1() const (const struct F * const this)                        
{                                                                               
  struct B D.2396;                                                              
                                                                                
  <bb 2>:                                                                       
  D.2396.m_location.m_x = 0;                                                    
  D.2396.m_location.m_y = 0;                                                    
  D.2396.m_size = 0;                                                            
  return D.2396;                                                                
                                                                                
}                                                                               
                                                                                
                                                                                
B D::m_fn2() const (const struct D * const this)                                
{                                                                               
  <bb 2>:                                                                       
  <retval> = F::m_fn1 (this_2(D)); [tail call]                                  
  return <retval>;                                                              
                                                                                
}                                                                               

which eventually gets inlined back of course (I will teach ICF to skip thunk creation when inline sequence is shorter).

The inliner produces:

B D::m_fn2() const (const struct D * const this)                                
{                                                                               
  struct B D.2413;                                                              
                                                                                
  <bb 2>:                                                                       
  D.2413.m_location.m_x = 0;                                                    
  D.2413.m_location.m_y = 0;                                                    
  D.2413.m_size = 0;                                                            
  <retval> = D.2413;                                                            
  return <retval>;                                                              
                                                                                
}                                                                               

I suppose the extra <retval> store is the problem.

Implementing wrapper by hand gives me:
B D::m_fn5() const (const struct D * const this)                                
{                                                                               
  struct B D.2406;                                                              
                                                                                
  <bb 2>:                                                                       
  D.2406 = D::m_fn2 (this_2(D));                                                
  return D.2406;                                                                
                                                                                
}                                                                               

which is bit uncool by adding extra copy and I also think it won't fly for DECL_BY_REFERENCE stuff.

Jakub/Richi can we get the direct return to work or shall we extend thunk generation to introduce a temporary? If so under what conditions?

thunk expansion already does:
          if (DECL_BY_REFERENCE (resdecl))                                      
            {                                                                   
              restmp = gimple_fold_indirect_ref (resdecl);                      
              if (!restmp)                                                      
                restmp = build2 (MEM_REF,                                       
                                 TREE_TYPE (TREE_TYPE (DECL_RESULT (alias))),   
                                 resdecl,                                       
                                 build_int_cst (TREE_TYPE                       
                                   (DECL_RESULT (alias)), 0));                  
            }                                                                   
          else if (!is_gimple_reg_type (restype))                               
            {                                                                   
              restmp = resdecl;                                                 
                                                                                
              if (TREE_CODE (restmp) == VAR_DECL)                               
                add_local_decl (cfun, restmp);                                  
              BLOCK_VARS (DECL_INITIAL (current_function_decl)) = restmp;       
            }                                                                   
          else                                                                  
            restmp = create_tmp_reg (restype, "retval");                        

I suppose we may want to add case for !DECL_BY_REFERENCE that needs temporary "retval", too.
Comment 6 Jakub Jelinek 2015-02-06 12:50:32 UTC
Created attachment 34685 [details]
gcc5-pr64896.patch

I think if !aggregate_value_p, we really should be using a temporary var rather than RESULT_DECL.

That said, if it doesn't generate optimal code, we should optimize it, but it wouldn't be IPA's or expand_thunk's task, such problem would affect all similar user written code.  But, on this exact testcase with the patch I get identical assembly to -fno-ipa-icf.  The
        movl    $0, -24(%rsp)
        movl    $0, -20(%rsp)
        xorl    %edx, %edx
        movq    -24(%rsp), %rax
is of course not optimal, xorl %eax, %eax; xorl %edx, %edx would do too, but it is a matter of some RTL optimization of late GIMPLE to improve this.

But, related to this, I've noticed that:
1) pass_nrv doesn't seem to work very well on x86_64, apparently the thing is that the temporaries usually have DECL_ALIGN bumped by LOCAL_DECL_ALIGNMENT to 128 bits, while RESULT_DECL typically does not that "optimization", so the nrv pass gives up.  Wonder at least for the case where the decl isn't addressable why would we care about DECL_ALIGN of the temporary (rather than just TYPE_ALIGN).
2) even on i386 where tree nrv usually works, I see on testcase like:
struct A
{
  int m_x, m_y;
};
struct Q
{
  struct A m_location;
  int m_size;
  long m_foo;
};
struct Q foo ();
struct Q bar ()
{
  struct Q x = foo ();
  return x;
}
(in C, so that C++ nrv doesn't trigger) unnecessary stack adjustments
Comment 7 Jan Hubicka 2015-02-06 17:22:04 UTC
Hmm, OK, lets go for extra temporary and hope that copy propagation/NRV will do the trick.

I wonder if we don't want to drop the extra alignment on x86_64 local vars. Vectorizer knows how to upgrade and I doubdt the memcpy/memset expanders care too much given that the offset in frame can often be determined.
Comment 8 Jakub Jelinek 2015-02-06 20:47:52 UTC
Author: jakub
Date: Fri Feb  6 20:47:20 2015
New Revision: 220489

URL: https://gcc.gnu.org/viewcvs?rev=220489&root=gcc&view=rev
Log:
	PR ipa/64896
	* cgraphunit.c (cgraph_node::expand_thunk): If
	restype is not is_gimple_reg_type nor the thunk_fndecl
	returns aggregate_value_p, set restmp to a temporary variable
	instead of resdecl.

	* g++.dg/ipa/pr64896.C: New test.

Added:
    trunk/gcc/testsuite/g++.dg/ipa/pr64896.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cgraphunit.c
    trunk/gcc/testsuite/ChangeLog
Comment 9 Jakub Jelinek 2015-02-06 20:50:42 UTC
Fixed.
Comment 10 Yvan Roux 2015-03-08 20:39:55 UTC
*** Bug 61207 has been marked as a duplicate of this bug. ***
Comment 11 Yvan Roux 2015-03-10 19:21:01 UTC
Author: yroux
Date: Tue Mar 10 19:20:30 2015
New Revision: 221333

URL: https://gcc.gnu.org/viewcvs?rev=221333&root=gcc&view=rev
Log:
gcc/
2015-03-10  Yvan Roux  <yvan.roux@linaro.org>

	Backport from trunk r220489.
	2015-02-06  Jakub Jelinek  <jakub@redhat.com>

	PR ipa/64896
	* cgraphunit.c (cgraph_node::expand_thunk): If
	restype is not is_gimple_reg_type nor the thunk_fndecl
	returns aggregate_value_p, set restmp to a temporary variable
	instead of resdecl.

gcc/testsuite/
2015-03-10  Yvan Roux  <yvan.roux@linaro.org>

	Backport from trunk r220489.
	2015-02-06  Jakub Jelinek  <jakub@redhat.com>

	PR ipa/64896
	* g++.dg/ipa/pr64896.C: New test.


Added:
    branches/gcc-4_9-branch/gcc/testsuite/g++.dg/ipa/pr64896.C
Modified:
    branches/gcc-4_9-branch/gcc/ChangeLog
    branches/gcc-4_9-branch/gcc/cgraphunit.c
    branches/gcc-4_9-branch/gcc/testsuite/ChangeLog