This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Resubmit/ping: peephole2 vs cond-exec vs df
- From: Bernd Schmidt <bernds at codesourcery dot com>
- To: Richard Henderson <rth at redhat dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 29 Jun 2010 02:26:24 +0200
- Subject: Re: Resubmit/ping: peephole2 vs cond-exec vs df
- References: <4BD075C9.1080502@codesourcery.com> <4C0D0656.4060903@codesourcery.com> <4C293731.4080500@redhat.com>
On 06/29/2010 01:58 AM, Richard Henderson wrote:
>> + /* If an insn has RTX_FRAME_RELATED_P set, peephole substitution would lose
>> + the REG_FRAME_RELATED_EXPR that is attached. */
>> + if (RTX_FRAME_RELATED_P (insn))
>> + {
>> + /* Let the buffer drain first. */
>> + if (peep2_current_count > 0)
>> + return false;
>> + df_simulate_one_insn_forwards (bb, insn, live);
>> + return true;
>
> You've changed the logic so that frame-related insns are allowed to be
> peepholed, so long as they begin a match. Previously we skipped these
> insns entirely, which I think is more correct.
I'm not sure this is true. It never adds a RTX_FRAME_RELATED_P insn to
the buffer, it just returns true so that we process the next insn?
I admit the buffer logic is a bit convoluted, and I'd welcome
suggestions how to make it clearer. An explicit state machine?
>> + /* If we did not fill an empty buffer, it signals the end of the
>> + block. */
>> + if (peep2_current_count == 0)
>> break;
>> +
>> + /* The buffer filled to the current maximum, so try to match. */
>
> Why do you wait until you've completely filled the buffer before trying
> a match? Doesn't this lead to useless work? Or does this on average
> pay off because of less calls into peep2_recog (although more calls into
> df scanning insns)?
To ensure that we always match the longest possible peephole if there
are similar ones. A target can order them in order of descending
length, but that won't always work if we start to match against
partially filled buffers. It's 2am so I may be blind but I don't see
how it causes more calls into df_simulate or other useless work? There
should be exactly one such call per insn as it is added to the buffer
(plus any updates necessary after we match something and have to
regenerate life information). Life information is stored in the buffer
so we don't have to do anything when advancing the buffer start.
It turns out that this doesn't completely solve the problem I was hoping
to solve (ARM peepholes for ldm/stm generation - there are several for
sequences of different lengths, and I was hoping to avoid having
patterns that match previous combinations), so I guess I could drop this
part. Conceptually, I still think it's better to do it this way.
Looking at the patch again now, I think the saved_live thing may be a
remnant from an earlier attempt to also regenerate life information
forwards after a substitution. That probably needs to go, I'll have a
look again in the morning.
Thanks for reviewing this.
Bernd