This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug tree-optimization/44423] [4.5/4.6 Regression] Massive performance regression in SSE code due to SRA



------- Comment #11 from jamborm at gcc dot gnu dot org  2010-06-09 09:02 -------
(In reply to comment #10)
> (In reply to comment #9)
> > (In reply to comment #8)
> > > I don't think you need flow-sensitivity.
> > > 
> > > Basically when you have only aggregate uses (as in this case)
> > 
> > Vectors are considered scalars in GCC.  That is why the solutions
> > described above work.
> > 
> > > then you only want to scalarize if the destination of the use is
> > > scalarized as well (to be able to copyprop out the aggregate copy).
> > 
> > Well, that is what I thought until someone filed PR 43846.
> 
> Hm, yes.  But there you know that
> 
>   D.2464.m[0] = D.2473_20;
>   D.2464.m[1] = D.2472_19;
>   D.2464.m[2] = D.2471_18;
>   *b_1(D) = D.2464;
> 
> D.2464 will be dead after scalarization.

If D.2464 was larger than just m, that would not necessarily be the
case and we would still want to avoid the extra copies.
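
As an illustration only, here is a minimal C sketch of the source-level
shape I have in mind.  It is not the testcase from either PR; the names
rec, tag and update are made up, and tag merely stands for "larger than
just m".  Whether SRA keeps the temporary here depends on the GCC
version and flags, so the sketch shows the access pattern, not a
guaranteed outcome.

```c
/* Hypothetical aggregate that is larger than just the array m. */
struct rec {
    float m[3];
    int tag;        /* extra data that is never accessed element-wise */
};

/* Copies *src into a temporary, overwrites only the m elements and
   stores the whole temporary with one aggregate assignment. */
static void update(struct rec *b, const struct rec *src,
                   float x, float y, float z)
{
    struct rec tmp = *src;
    tmp.m[0] = x;
    tmp.m[1] = y;
    tmp.m[2] = z;
    *b = tmp;       /* aggregate copy, as in "*b_1(D) = D.2464" */
}
```

Here the temporary carries data (tag) that has no scalar replacement, so
the aggregate copy at the end still needs it.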

However, it is true that it would make sense to take
grp_assignment_read into account only if the whole access subtree
would end up with grp_unscalarized_data set to zero.  But that would
require quite a rewrite of analyze_access_subtree and would not help
in this case, because grp_unscalarized_data is zero here: the union is
covered by scalar replacements.

The real issue is that

>  In the particular case of the
> current bug the aggregate remains live because of the load from va.v
> which we cannot scalarize(*).

we determine this very late, in sra_modify_assign (right after the big
comment) and in the most general form this can be determined only when
we already have the whole access tree (so if we wanted to do this
during analysis, we would have to scan the function body twice).
Nevertheless, for scalar accesses that have scalar sub-accesses this
is always true and easily detected, so we can simply disallow them,
as I wrote in comment #7.  And disallow them always, since otherwise
it would be easy to _add_ stuff to the function that is causing
trouble now so that any heuristic gets confused and decides to
produce replacements.
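
To make "scalar accesses that have scalar sub-accesses" concrete, here
is a minimal C sketch, assuming a union of a vector and a float array.
It is not the testcase from this PR; the names v4sf, vec4 and
load_scaled are made up, and the vector type relies on the
vector_size extension of GCC (also accepted by Clang).

```c
/* GCC vector extension: four floats that GCC treats as one scalar. */
typedef float v4sf __attribute__((vector_size(16)));

/* Hypothetical union: the access v covers the same bytes as the
   accesses f[0..3], so a scalar access (v) has scalar sub-accesses
   (the individual f[i]). */
union vec4 {
    v4sf v;
    float f[4];
};

/* Writes the elements one by one, then loads the whole vector. */
static v4sf load_scaled(const float *p, float k)
{
    union vec4 va;
    for (int i = 0; i < 4; i++)
        va.f[i] = p[i] * k;     /* element-wise (sub-access) stores */
    return va.v;                /* whole-vector load from va.v */
}
```

The load from va.v at the end is the kind of use that keeps the
aggregate live even if every f[i] gets its own replacement.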

I'll submit a patch in a while.

> 
> (*) we can scalarize this particular case if you manage to build a
> proper constructor from the elements - but that's probably a bit
> involved.
> 

Well, I don't think I want to implement that... but I am curious:
would that actually lead to better (or even different) code if I
placed something like that into the loop?  I also thought that in
gimple, constructors could only have invariants in them.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44423

