This is the mail archive of the
mailing list for the GCC project.
Re: Question about "instruction merge" pass when optimizing for size
- From: Jeff Law <law at redhat dot com>
- To: "sarah at hederstierna dot com" <fredrik at hederstierna dot com>, DJ Delorie <dj at redhat dot com>
- Cc: "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
- Date: Thu, 20 Aug 2015 09:36:48 -0600
- Subject: Re: Question about "instruction merge" pass when optimizing for size
- Authentication-results: sourceware.org; auth=none
- References: <CE36BD26828FA5408B9F87E4DD2ACB0B0135774D9F16 at MBXVS01 dot HMC dot local> <4ad27b0cf3dd4b12a4a2a4530ce2f15a at DAG03 dot HMC dot local> <201508192038 dot t7JKcVV7000807 at greed dot delorie dot com> <55D4F00A dot 2 at redhat dot com> <1440054622429 dot 92418 at hederstierna dot com>
On 08/20/2015 01:07 AM, email@example.com wrote:
Lots of passes could be involved. It's better to work with a real
example on a real target for this kind of discussion.
From: Jeff Law <firstname.lastname@example.org>
More important is to determine *why* we're getting these patterns. In
the IRA/LRA world, they should be a lot less common.
Yes, I agree this phenomenon seems more common after introducing LRA.
Though I was thinking that such a pass may still be relevant.
Thinking hypothetically of an architecture, let's call it cortex-X,
assume this specific target type has an opcode for ADD with 5 operands.
Optimal code for
a = a + b + c + d
Where in the optimization process do we introduce the merging into this target-specific instruction?
Can the more generic IRA/LRA handle this?
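To make the question concrete, here is a minimal sketch of the two forms involved. It is plain C standing in for the intermediate representation, not actual GIMPLE or RTL, and the function names are hypothetical. The second function mimics three-address code: a chain of two-input additions with short-lived temporaries, which is the shape a multi-operand ADD on the hypothetical cortex-X would have to be matched against.

```c
/* Illustrative only: plain C used as a stand-in for three-address code.
   Function names are hypothetical and not part of GCC. */

/* The expression as written in the source. */
int sum4_direct(int a, int b, int c, int d)
{
    return a + b + c + d;
}

/* The same computation as a chain of two-input additions.
   Each temporary is defined once and used once, so the
   lifetimes of t1, t2 and t3 do not overlap. */
int sum4_three_address(int a, int b, int c, int d)
{
    int t1 = a + b;   /* t1 is dead after the next statement */
    int t2 = t1 + c;  /* t2 is dead after the next statement */
    int t3 = t2 + d;
    return t3;
}
```

Both functions compute the same value; the point is only that a single instruction covering all three additions would have to be formed from the chain in the second function.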
Assuming sensible three address code comes out of the gimple with
non-overlapping lifetimes, then I'd expect this to be primarily a
Unfortunately, 61578 has multiple testcases. Each should be its own bug
that can be addressed and tracked individually.
And maybe patterns can appear across different BBs, or somewhere that the normal optimizers find hard to detect or figure out?
Sorry if I'm ignorant; I don't know the internals of the different optimizers, but I'm trying to learn and understand how to move forward on this issue we currently have with code size.
(I also tried to file some bugs on it: Bug 61578 and Bug 67213.)
Peeking at the last testcase in c#19 is interesting. Presumably the
typecasting is necessary to avoid doing multiple comparisons and the
assumption is that the casts will be NOPs at the RTL level.
That assumption seems to be fine through IRA. The allocation seems
sane, except there's a reload needed for thumb1_addsi3_addgeu to ensure
operand 1 and operand 0 match due to the matching constraint.
That points to two issues.
1. Is IRA correctly tracking the need for those two operands to be the
same and accounting for that in its cost model?
2. In the case where IRA still generates code that needs a reload, why
was the old reload code able to eliminate the copy when LRA can't?
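As a rough illustration of the first point, here is a toy model of what a matching constraint costs. None of this is GCC's real data structure or cost function; the struct, field names and cost values are assumptions made up for the sketch. It only shows that an allocation which puts operand 0 and operand 1 in different registers implies one extra copy, which a cost model would need to charge for.

```c
/* Toy model, not GCC internals: all names and costs are hypothetical. */

struct tied_insn {
    int dest_reg;  /* hard register chosen for operand 0 */
    int src1_reg;  /* hard register chosen for operand 1, tied to operand 0 */
};

/* Number of extra register copies needed to satisfy the tie. */
int copies_for_tie(const struct tied_insn *insn)
{
    return insn->dest_reg == insn->src1_reg ? 0 : 1;
}

/* Cost of the instruction under a given allocation: the base cost
   plus the cost of any copy forced by the matching constraint. */
int allocation_cost(const struct tied_insn *insn, int base_cost, int copy_cost)
{
    return base_cost + copy_cost * copies_for_tie(insn);
}
```

If the allocator's cost estimate leaves out the copy term, the two allocations look equally good even though one of them needs a reload.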
67213 is probably a costing issue somewhere. Since Richi is already
involved, I'll let the two of you dig into the details.