This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Question about "instruction merge" pass when optimizing for size


On 08/20/2015 01:07 AM, sarah@hederstierna.com wrote:

________________________________________
From: Jeff Law <law@redhat.com>
More important is to determine *why* we're getting these patterns.  In
the IRA/LRA world, they should be a lot less common.

Yes, I agree this phenomenon seems more common after introducing LRA.

Though I was thinking that such a pass may still be relevant.

Thinking hypothetically of an architecture, let's call it cortex-X,
assume this specific target type has an opcode for ADD with 5 operands.

Optimal code for

a = a + b + c + d

would be

addx Ra,Ra,Rb,Rc,Rd

Where in the optimization process do we introduce the merging into this target-specific instruction?
Can the more generic IRA/LRA handle this?
Lots of passes could be involved. It's better to work with a real example on a real target for this kind of discussion.

Assuming sensible three address code comes out of the gimple with non-overlapping lifetimes, then I'd expect this to be primarily a combiner issue.
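
To make the "three address code" point concrete, here is a minimal sketch in C. The function names are made up for illustration, and the second function only approximates by hand what such a decomposition looks like; it is not actual GCC output. Each temporary is defined once and is dead after its next use, which is the non-overlapping-lifetime shape a combiner would want to see before trying to merge the adds.

```c
/* Source form: a single expression over four inputs. */
int sum4(int a, int b, int c, int d)
{
    return a + b + c + d;
}

/* Hand-written approximation of three-address form (hypothetical,
   for illustration only): one operation per statement, and each
   temporary is used exactly once by the statement that follows. */
int sum4_three_address(int a, int b, int c, int d)
{
    int t1 = a + b;   /* t1 is live only until the next line */
    int t2 = t1 + c;  /* t1 is dead here; t2 is live */
    int t3 = t2 + d;  /* t2 is dead here; t3 is live */
    return t3;
}
```

Building this with `-Os -S`, optionally with `-fdump-rtl-combine`, is one way to see what the combiner did with the chain on a given target.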


And maybe patterns can appear across different BBs, or somewhere that the normal optimizers have a hard time finding or figuring out?

Sorry if I'm ignorant; I don't know the internals of the different optimizers, but I'm trying to learn and understand how to move forward on the code size issue we currently have.
(I also tried to file some bugs on it: Bug 61578 and Bug 67213.)
Unfortunately, 61578 has multiple testcases. Each should be its own bug that can be addressed and tracked individually.

Peeking at the last testcase in c#19 is interesting. Presumably the typecasting is necessary to avoid doing multiple comparisons and the assumption is that the casts will be NOPs at the RTL level.

That assumption seems to be fine through IRA. The allocation seems sane, except there's a reload needed for thumb1_addsi3_addgeu to ensure operand 1 and operand 0 match due to the matching constraint.

That points to two issues.

1. Is IRA correctly tracking the need for those two operands to be the same and accounting for that in its cost model?

2. In the case where IRA still generates code that needs a reload, why was the old reload code able to eliminate the copy when LRA can't?
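
For readers unfamiliar with matching constraints, the same idea can be sketched from C using GNU extended asm. This is only an analogy: the constraint on thumb1_addsi3_addgeu lives in the machine description, not in inline asm, and the helper name below is hypothetical. The digit constraint "0" says the input must be placed in the same register as operand 0, so if the two values were allocated to different registers a copy has to be inserted first, which is the kind of reload discussed above.

```c
/* Hypothetical helper, assuming a GNU C compatible compiler.
   The asm template is empty, so no instruction is emitted; the
   constraints alone force 'in' and 'out' to share one register. */
int tie_to_input(int in)
{
    int out;
    __asm__ ("" : "=r" (out) : "0" (in));  /* "0": same register as operand 0 */
    return out;  /* holds the value of 'in', since they share a register */
}
```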


67213 is probably a costing issue somewhere. Since Richi is already involved, I'll let the two of you dig into the details.

jeff

