Summary: | [tree-ssa] SRA does not work for classes that use inheritance with an empty base | ||
---|---|---|---|
Product: | gcc | Reporter: | Dan Nicolaescu <dann> |
Component: | c++ | Assignee: | Richard Biener <rguenth> |
Status: | RESOLVED FIXED | ||
Severity: | enhancement | CC: | gcc-bugs, rguenth, rth |
Priority: | P2 | Keywords: | missed-optimization |
Version: | tree-ssa | ||
Target Milestone: | 4.7.0 | ||
Host: | Target: | ||
Build: | Known to work: | ||
Known to fail: | Last reconfirmed: | 2009-09-26 15:30:16 | |
Bug Depends on: | |||
Bug Blocks: | 22501 |
Description
Dan Nicolaescu
2004-01-31 21:15:34 UTC
Confirmed, the problem here is more complicated:

    Cannot scalarize variable param because it must live in memory
    Cannot scalarize variable local because it must live in memory
    Cannot scalarize variable MT.3 because it must live in memory

The problem here is that:

    void copystruct1(teststruct) (param)
    {
      struct { double d; char f1; } * local.0;
      struct { double d; char f1; } * param.1;
      char T.2;

      {
        struct teststruct local;

        param.f1 = 0;
        local.0 = (struct { double d; char f1; } *)&local;
        param.1 = (struct { double d; char f1; } *)&param;
        *local.0 = *param.1;
        {
          T.2 = local.f1;
          if (T.2 != 0)
            {
              {
                link_error ();
              }
            }
          else
            {
            }
        }
      }
    }

Which means it is not using the right struct assignment, or something weird is going on.

This:
> Cannot scalarize variable param because it must live in memory
happens because for "param" is_gimple_non_addressable returns true
because "param" has the TREE_ADDRESSABLE bit set.
Mine. Happens because the C++ front end is amazingly stupid about how it emits access to members that are not located in virtual bases. Fixing this requires rearranging how offsetof is handled.

Mine as I am fixing offsetof and some of the C++ front-end also.

This is a C++ front-end issue.

Testing a patch.

Well, no I'm not. This is a different problem than I thought. This is the case of the C++ front end wanting to perform a block copy between two structures, but *without* copying the trailing padding of the structure. It decides that the best way to do this is to cast the two structures to an internal type that doesn't include the padding. Presumably this is to handle cases in which someone else inherits from "teststruct" and reuses the tail padding, as inheritance is allowed to do. I almost think that just using __builtin_memcpy would be a better representation, but that wouldn't have any effect on the scalarizability of this test case. At least not yet.

Exactly. teststruct is a non-POD (because it derives from base), so you can't touch its padding because it could contain data.

For the record, the cast-vs-memcpy thing has been fixed.
http://gcc.gnu.org/ml/gcc-patches/2004-06/msg02647.html

Not working on it.

In principle this blocks optimization of tramp3d domain operations (were it not for structure aliasing fixing most of the problems).
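The tail-padding argument in this thread can be checked with a small layout probe. This is only a sketch: `base` and `teststruct` are reconstructed from the dumps in this report, while `derived`, `offset_of_f2` and `tail_padding_reused` are hypothetical names added for illustration. Whether the padding is actually reused depends on the ABI, so the code reports the layout instead of assuming it.

```cpp
#include <cassert>
#include <cstddef>

struct base {};

// Shape of the type in this report: an empty base plus a double and a char.
struct teststruct : base {
  double d;
  char f1;
};

// Hypothetical further-derived class. An ABI that reuses tail padding may
// place f2 inside the bytes that sizeof(teststruct) still counts as padding.
struct derived : teststruct {
  char f2;
};

// Byte offset of derived::f2 from the start of a derived object.
std::size_t offset_of_f2() {
  derived obj;
  const char* start = reinterpret_cast<const char*>(&obj);
  const char* member = reinterpret_cast<const char*>(&obj.f2);
  assert(member >= start);
  return static_cast<std::size_t>(member - start);
}

// True when f2 lives inside the footprint of the teststruct subobject, in
// which case copying sizeof(teststruct) bytes into that subobject would
// overwrite f2.
bool tail_padding_reused() {
  return offset_of_f2() < sizeof(teststruct);
}
```

If `tail_padding_reused()` returns true on a given target, a block copy of the base subobject has to stop before the padding, which would be consistent with the 9-byte copy that shows up in the dumps in this report.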
We now get

    <bb 2>:
      # param_2 = V_MAY_DEF <param_1>;
      param.f1 = 0;
      # param_6 = V_MAY_DEF <param_2>;
      # SFT.0_7 = V_MAY_DEF <SFT.0_3>;
      # NONLOCAL.6_8 = V_MAY_DEF <NONLOCAL.6_5>;
      # NONLOCAL.12_13 = V_MAY_DEF <NONLOCAL.12_12>;
      # NONLOCAL.18_16 = V_MAY_DEF <NONLOCAL.18_15>;
      # NONLOCAL.24_19 = V_MAY_DEF <NONLOCAL.24_18>;
      # NONLOCAL.30_22 = V_MAY_DEF <NONLOCAL.30_21>;
      # NONLOCAL.36_25 = V_MAY_DEF <NONLOCAL.36_24>;
      __builtin_memcpy (&local, &param, 9);
      # VUSE <SFT.0_7>;
      D.2668_4 = local.f1;
      if (D.2668_4 != 0) goto <L0>; else goto <L1>;

    <L0>:;
      # param_9 = V_MAY_DEF <param_6>;
      # SFT.0_10 = V_MAY_DEF <SFT.0_7>;
      # NONLOCAL.6_11 = V_MAY_DEF <NONLOCAL.6_8>;
      # NONLOCAL.12_14 = V_MAY_DEF <NONLOCAL.12_13>;
      # NONLOCAL.18_17 = V_MAY_DEF <NONLOCAL.18_16>;
      # NONLOCAL.24_20 = V_MAY_DEF <NONLOCAL.24_19>;
      # NONLOCAL.30_23 = V_MAY_DEF <NONLOCAL.30_22>;
      # NONLOCAL.36_26 = V_MAY_DEF <NONLOCAL.36_25>;
      link_error ();

    <L1>:;
      return;

(In reply to comment #13)
Uh, there should only ever be one non-local var per function, and referenced vars should be reset at the end of each function, so why do you have multiple NONLOCALs here?

On trunk we get

    <bb 2>:
      # .MEM_3 = VDEF <.MEM_2(D)>
      param.f1 = 0;
      # .MEM_4 = VDEF <.MEM_3>
      memcpy (&local, &param, 9);
      # VUSE <.MEM_4>
      D.1743_1 = local.f1;
      if (D.1743_1 != 0)
        goto <bb 3>;
      else
        goto <bb 4>;

    <bb 3>:
      # .MEM_5 = VDEF <.MEM_4>
      link_error ();

    <bb 4>:
      return;

which shows this is an issue of value-numbering not looking through memcpy. SRA obviously does not work here because both vars have their address taken. The issue in the summary should have been fixed by the SRA rewrite. I'm looking at the VN issue.

We can also expand

    __builtin_memcpy (&local, &param, 9);

to multiple copies based on src/dest alignment and size (similar to store_by_pieces).

I have a patch, queued for 4.7.

Author: rguenth
Date: Tue Mar 15 13:37:23 2011
New Revision: 170994

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=170994
Log:
2011-03-15  Richard Guenther  <rguenther@suse.de>

    PR tree-optimization/13954
    * tree-ssa-sccvn.c (vn_reference_lookup_3): Look through memcpy
    and friends.

    * g++.dg/tree-ssa/pr13954.C: New testcase.

Added:
    trunk/gcc/testsuite/g++.dg/tree-ssa/pr13954.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-sccvn.c

Fixed.
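For reference, the dumps in this report appear to correspond to source of roughly the following shape. This is a reconstruction from the GIMPLE, not the committed pr13954.C testcase, and the names `data_size`, `copy_by_assignment` and `copy_by_memcpy` are hypothetical. The second function spells out the partial copy by hand, on the assumption that the copied byte count is the offset of the last member plus its size. Both functions return the value that the dumps compare against zero before calling link_error, so a compiler that forwards the store to `param.f1` through the copy can fold the result to 0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct base {};

struct teststruct : base {
  double d;
  char f1;
};

// Bytes occupied by the members themselves, excluding tail padding.
// Assumption: f1 is the last member, so the data ends right after it.
constexpr std::size_t data_size = offsetof(teststruct, f1) + sizeof(char);

// Source-level form: store to param.f1, copy the struct, read local.f1.
char copy_by_assignment(teststruct param) {
  teststruct local;
  param.f1 = 0;
  local = param;
  return local.f1;
}

// Hand-written equivalent of the lowered copy: only the data bytes are
// copied, so a load of local.f1 must be resolved by looking through memcpy.
char copy_by_memcpy(teststruct param) {
  assert(data_size <= sizeof(teststruct));
  teststruct local;
  param.f1 = 0;
  std::memcpy(&local, &param, data_size);
  return local.f1;
}
```

Comparing the optimized dumps of the two functions is one way to see whether a given compiler version performs the look-through described in the ChangeLog entry above.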