This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PR lto/77458] Avoid ICE in offloading with differing _FloatN, _FloatNx types (was: Advice sought for debugging a lto1 ICE (was: Implement C _FloatN, _FloatNx types [version 6]))


Hi!

(CCing Bernd and Jakub -- for your information, or: "amusement" -- as
you've discussed offloading preload_common_nodes issues before...)

Got to look into this some more, yesterday:

On Thu, 08 Sep 2016 13:43:30 +0200, I wrote:
> On Wed, 7 Sep 2016 14:23:18 +0200, Richard Biener <richard.guenther@gmail.com> wrote:
> > On Wed, Sep 7, 2016 at 1:52 PM, Thomas Schwinge <thomas@codesourcery.com> wrote:
> > > On Fri, 19 Aug 2016 11:05:59 +0000, Joseph Myers <joseph@codesourcery.com> wrote:
> > >> On Fri, 19 Aug 2016, Richard Biener wrote:
> > >> > Can you quickly verify if LTO works with the new types?  I don't see anything
> > >> > that would prevent it but having new global trees and backends initializing them
> > >> > might come up with surprises (see tree-streamer.c:preload_common_nodes)
> > >>
> > >> Well, the execution tests are in gcc.dg/torture, which is run with various
> > >> options including -flto (and I've checked the testsuite logs to confirm
> > >> these tests are indeed run with such options).  Is there something else
> > >> you think should be tested?
> > >
> > > As I noted in <https://gcc.gnu.org/PR77458>:
> > >
> > >     As of the PR32187 commit r239625 "Implement C _FloatN, _FloatNx types", nvptx
> > >     offloading is broken, ICEs in LTO stream-in.  Probably some kind of data-type
> > >     mismatch that is not visible with Intel MIC offloading (using the same data
> > >     types) but explodes with nvptx.  I'm having a look.

> > [...] preload_common_nodes.  This is carefully crafted to _not_ diverge by
> > frontend (!) it wasn't even designed to cope with global trees being present
> > or not dependent on target (well, because the target is always the
> > same! mind you!)
> 
> Scary.  ;-/
> 
> > Now -- in theory it should deal with NULLs just fine (resulting in
> > error_mark_node), but it can diverge when there are additional
> > compount types (like vectors, complex
> > or array or record types) whose element types are not in the set of
> > global trees.
> > The complex _FloatN types would be such a case given they appear before their
> > components.  That mixes up the ordering at least.
> 
> ACK, but that's also an issue for "regular" float/complex float, which
> also is in "reverse" order.  But that's "fixed" by the recursion in
> gcc/tree-streamer.c:record_common_node for "TREE_CODE (node) ==
> COMPLEX_TYPE".  This likewise seems to work for the _FloatN types.

So, that mechanism does work, but what's going wrong is the following:
with differing target vs. offload target, we potentially (and actually
do, in the case of x86_64 vs. nvptx) have different sets of _FloatN and
_FloatNx types.  That is, for nvptx, a few of these don't exist (so,
NULL_TREE), and so it follows their complex variants also don't exist,
and thus the recursion that I just mentioned for complex types is no
longer done in lockstep in the x86_64 cc1 vs. the nvptx lto1, hence we
get an offset (of two, in this specific case), and consequently streaming
explodes, for example, as soon as it hits a forward-reference (due to
looking for tree 185 (x86_64 cc1 view; as encoded in the stream) when it
should be looking for tree 183 (nvptx lto1 view).

> (I've
> put "fixed" in quotes -- doesn't that recursion mean that we're thus
> putting "complex float", "float", [...], "float" (again) into the cache?
> Anyway that's for later...)

Maybe it would make sense to do this tree streaming in two passes: first
build a set of what we actually need, and then stream that, without
duplicates.  (Or, is also for these "pickled" trees their order relevant,
so that one tree may only refer to other earlier but not later ones?
Anyway, we could still remember the set of trees already streamed, and
avoid the double streaming I described?)

So I now understand that due to the stream format, the integer tree IDs
(cache->next_id) have to match in all cc1/lto1s (etc.), so we'll have to
make that work for x86_64 target vs. nvptx offload target being
different.  (I'm pondering some ideas about how to rework that integer
tree ID generation.)

(I have not digested completely yet what the implications are for the
skipping we're doing for some trees in preload_common_nodes), but here is
a first patch that at least gets us rid of the immediate problem.  (I'll
fix the TODO by adding a "#define TI_COMPLEX_FLOATN_NX_TYPE_LAST" to
gcc/tree-core.h, OK?)  Will such a patch be OK for trunk, at least for
now?

commit d2045bd028104be2ede5ae1d5d7b9395e67a8180
Author: Thomas Schwinge <thomas@codesourcery.com>
Date:   Thu Sep 15 21:56:16 2016 +0200

    [PR lto/77458] Avoid ICE in offloading with differing _FloatN, _FloatNx types
    
    	gcc/
    	PR lto/77458
    	* tree-streamer.c (preload_common_nodes): Skip _FloatN and
    	_FloatNx types if offloading.
---
 gcc/tree-streamer.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git gcc/tree-streamer.c gcc/tree-streamer.c
index 7ea7096..061d831 100644
--- gcc/tree-streamer.c
+++ gcc/tree-streamer.c
@@ -333,7 +333,12 @@ preload_common_nodes (struct streamer_tree_cache_d *cache)
 	&& (!lto_stream_offload_p
 	    || (i != TI_VA_LIST_TYPE
 		&& i != TI_VA_LIST_GPR_COUNTER_FIELD
-		&& i != TI_VA_LIST_FPR_COUNTER_FIELD)))
+		&& i != TI_VA_LIST_FPR_COUNTER_FIELD
+		/* Likewise for the _FloatN and _FloatNx types.  */
+		&& (i < TI_FLOATN_NX_TYPE_FIRST
+		    || i > TI_FLOATN_NX_TYPE_LAST)
+		&& (i < TI_COMPLEX_FLOATN_NX_TYPE_FIRST
+		    || i > /* TODO */ TI_COMPLEX_FLOAT128X_TYPE))))
       record_common_node (cache, global_trees[i]);
 }
 


Grüße
 Thomas


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]