This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: [PATCH] add optional split pass before CSE2


Hi Steven,

Thanks for the reply, and sorry I wasn't detailed enough in my previous
posts.

I see that I need to reevaluate my approach.  I have been carrying this
patch forward internally since 4.1.  A lot has changed, so the
optimizations between expand and cse2 might not be as important as they
were before.  So simply doing it at expand time might be better now.
Or, even if some RTL passes are still effective, I understand that it is
preferable to fix certain problems in GIMPLE rather than RTL.


* Steven Bosscher <stevenb.gcc@gmail.com> [2009-05-04 14:30]:
> This explanation doesn't help me, at least, understand why this new
> split pass is necessary. Questions I'm left with:
> 
> (1) What is it that makes it impossible to split up everything when
> expanding to RTL?

It is possible.  Previously, it led to worse performance.

The problem I was originally fixing is with aggregates and the fact that
the SPU architecture has only aligned 16-byte loads and stores.  (For
non-aggregates, our ABI guarantees that each scalar occupies an entire
16 bytes.)

When loading or storing a member of an aggregate we have to convert the
access to a 16-byte load/store and change the alias set to account for
any adjacent members that are accessed incidentally as part of those 16
bytes.  The current implementation simply changes the alias set to 0.
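
To make the adjacent-member problem concrete, here is a minimal sketch in
plain C of what a 4-byte member access turns into when only aligned
16-byte accesses exist.  It is a model, not the SPU backend code: memory
is a byte array, addresses are offsets into it, and the names
load_u32_member and store_u32_member are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define QUAD 16  /* every load/store moves one aligned 16-byte quadword */

/* Hypothetical model: MEM is a byte array and ADDR is an offset into it.
   Neither helper is a GCC or SPU API. */

/* Load a 4-byte member at ADDR using only an aligned quadword load. */
uint32_t load_u32_member(const unsigned char *mem, size_t addr)
{
    size_t base = addr & ~(size_t)(QUAD - 1);   /* round down to 16 */
    size_t shift = addr - base;                 /* member offset in quad */
    unsigned char quad[QUAD];
    uint32_t value;

    assert(shift + sizeof value <= QUAD);       /* must not straddle quads */
    memcpy(quad, mem + base, QUAD);             /* the aligned 16-byte load */
    memcpy(&value, quad + shift, sizeof value); /* extract the member */
    return value;
}

/* Store a 4-byte member at ADDR as a read-modify-write of its quadword. */
void store_u32_member(unsigned char *mem, size_t addr, uint32_t value)
{
    size_t base = addr & ~(size_t)(QUAD - 1);
    size_t shift = addr - base;
    unsigned char quad[QUAD];

    assert(shift + sizeof value <= QUAD);
    memcpy(quad, mem + base, QUAD);             /* reads the neighbours too */
    memcpy(quad + shift, &value, sizeof value); /* insert the new member */
    memcpy(mem + base, quad, QUAD);             /* rewrites the neighbours */
}
```

For a layout like struct { int a; float b; } at offset 0, a store to b at
offset 4 also reads and rewrites the bytes of a, so the access can no
longer carry the alias set of b alone.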


> (2) Why can't (or aren't) the higher-level optimization handled on GIMPLE?

Previously, we did not do it at expand time because many of the RTL
optimizations were still essential to getting good performance.  For
example, GCSE and the old loop optimizations did profitable load/store
motion, which did not work as well when some alias sets were changed to
0.
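
As a rough illustration of the kind of load motion I mean (a made-up
example, not taken from any benchmark; struct params and both function
names are hypothetical): in the first loop the load of p->scale is
invariant, and type-based alias information lets the optimizer see that
the int stores to out[] cannot modify a float member.  If that load is
given alias set 0 it conflicts with every store, and I would expect the
hoisting shown by hand in the second function to be blocked.

```c
struct params { float scale; };  /* hypothetical aggregate */

/* As written in the source: p->scale is loaded inside the loop. */
void scale_reload(int *out, const int *in, int n, const struct params *p)
{
    for (int i = 0; i < n; i++)
        out[i] = (int)(in[i] * p->scale);  /* float load, int store */
}

/* What load motion would produce: the invariant load is hoisted.
   This is only valid if the stores to out[] cannot alias p->scale. */
void scale_hoisted(int *out, const int *in, int n, const struct params *p)
{
    float scale = p->scale;                /* loaded once, before the loop */
    for (int i = 0; i < n; i++)
        out[i] = (int)(in[i] * scale);
}
```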

> (3) What does "correct alias information" mean and why is it
> "incorrect" if you split earlier?

Sorry, "correct" was the wrong word to use.  When an aggregate member
load/store is changed to a 16-byte load/store, its alias set is changed
to 0, so we can no longer benefit from the aliasing information.

> (4) What other solutions have you considered?

Previously, I tried expanding at the start and splitting at split1.  I
also tried adding an additional CSE pass after split1.

> (5) Have you looked if the alias-export infrastructure would be a
> solution for you?

Hmm, based on my understanding of what it does, I don't think it will
help.  I have to change the alias set to account for adjacent members,
so I don't think the alias info provided by alias-export makes a big
difference, though it might allow me to be more accurate rather than
just changing to alias set 0.

> 
> ...
> 
> 
> This idea of splitting before CSE2 is another step away from proper
> instruction selection in GCC. It is a really big step in the wrong
> direction IMNSHO.

I am inclined to agree.  I will reevaluate doing everything at expand
time and see what performance issues occur.

Thanks,
Trevor

