This is the mail archive of the
mailing list for the GCC project.
Re: Tweak BB analysis for dr_analyze_innermost
- From: Richard Sandiford <richard dot sandiford at linaro dot org>
- To: Richard Biener <richard dot guenther at gmail dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Thu, 29 Jun 2017 11:32:06 +0100
- Subject: Re: Tweak BB analysis for dr_analyze_innermost
- Authentication-results: sourceware.org; auth=none
- References: <email@example.com> <CAFiYyc2TZOn=ERAPfWOJyAhybLCoHLG9ET=wBb7d+BFOaZo39g@mail.gmail.com>
Richard Biener <firstname.lastname@example.org> writes:
> On Wed, Jun 28, 2017 at 3:36 PM, Richard Sandiford
> <email@example.com> wrote:
>> dr_analyze_innermost had a "struct loop *nest" parameter that acted
>> like a boolean. This was added in r179161, with the idea that a
>> null nest selected BB-level analysis rather than loop analysis.
>> The handling seemed strange though. If the DR was part of a loop,
>> we still tried to express the base and offset values as IVs, potentially
>> giving a nonzero step. If that failed for any reason, we'd revert to
>> using the original base and offset, just as we would if we hadn't asked
>> for an IV in the first place.
>> It seems more natural to use the !in_loop handling whenever nest is null
>> and always set the step to zero. This actually enables one more SLP
>> opportunity in bb-slp-pr65935.c.
>> I checked out r179161 and tried the patch there. The test case added
>> in that revision still passes, so I don't think there was any particular
>> need to check simple_iv.
>> Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install?
> I have a few additional comments for consideration. I remembered code
> in vect_compute_data_ref_alignment explicitely looking at DR_STEP in
> BB mode:
> /* Similarly we can only use base and misalignment information relative to
> an innermost loop if the misalignment stays the same throughout the
> execution of the loop. As above, this is the case if the stride of
> the dataref evenly divides by the vector size. */
> tree step = DR_STEP (dr);
> unsigned vf = loop ? LOOP_VINFO_VECT_FACTOR (loop_vinfo) : 1;
> if (tree_fits_shwi_p (step)
> && ((tree_to_shwi (step) * vf)
> % GET_MODE_SIZE (TYPE_MODE (vectype)) != 0))
> if (dump_enabled_p ())
> dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> "step doesn't divide the vector-size.\n");
> misalign = NULL_TREE;
> so I guess with the change we may end up with worse (misalign is
> always NULL?) or bogus (with DR_STEP == 0 the test is always false)
> alignment analysis results for BB vectorization? I guess worse
> or we had the issue before if DR_STEP was just not analyzable.
DR_STEP will always be zero for bb vectorisation, so the results
shouldn't get worse. But the value that was previously a nonzero
step is now part of the base or offset instead (the choice between
the two being the same as it would be for get_inner_reference).
We still take the alignments of those into account, so I think
we should be safe (at least after DR_BASE_ALIGNMENT).
Like you say, we previously had the same situation for bases
that weren't simple IVs, or for bases that were simple IVs but
had an invariant rather than constant step.
> The testcase that now gets vectorized tried to use the asm to prevent
> vectorization but that of course doesn't work for BB vectorization.
> Using volatile stores might. I guess the chance that we miscompile
> both loops in a way that we still pass the testcase (even the compare
> loop could be BB vectorized I guess) is unlikely.
Putting an extra volatile asm between the real and imag stores stops
the vectorisation, if you'd prefer that.
> The patch is ok but I guess we need to keep an eye on BB vectorization
> results for targets w/o unaligned vector loads/stores for the above
> alignment issue?