This is the mail archive of the
mailing list for the GCC project.
Re: PR81136: ICE from inconsistent DR_MISALIGNMENTs
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Richard Biener <richard dot guenther at gmail dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>, Richard Sandiford <richard dot sandiford at linaro dot org>
- Date: Mon, 26 Jun 2017 13:36:20 +0200
- Subject: Re: PR81136: ICE from inconsistent DR_MISALIGNMENTs
- Authentication-results: sourceware.org; auth=none
- References: <email@example.com> <CAFiYyc015ZVKVk0qegFFBhgzkOjSuoC6=iznQzju41uyiW9XtQ@mail.gmail.com> <firstname.lastname@example.org> <CAFiYyc0hAx8hFTiMRjM7f=e=B2g2o4=nqM790RThuj8yxngzpQ@mail.gmail.com> <email@example.com>
On Mon, Jun 26, 2017 at 1:14 PM, Richard Sandiford
> Richard Biener <firstname.lastname@example.org> writes:
>> On Fri, Jun 23, 2017 at 2:05 PM, Richard Sandiford
>> <email@example.com> wrote:
>>> Richard Biener <firstname.lastname@example.org> writes:
>>>> On Thu, Jun 22, 2017 at 1:30 PM, Richard Sandiford
>>>> <email@example.com> wrote:
>>>>> The test case triggered this assert in vect_update_misalignment_for_peel:
>>>>> gcc_assert (DR_MISALIGNMENT (dr) / dr_size ==
>>>>> DR_MISALIGNMENT (dr_peel) / dr_peel_size);
>>>>> We knew that the two DRs had the same misalignment at runtime, but when
>>>>> considered in isolation, one data reference guaranteed a higher compile-time
>>>>> base alignment than the other.
>>>>> In the test case this looks like a missed opportunity. Both references
>>>>> are unconditional, so it should be possible to use the highest of the
>>>>> available base alignment guarantees when analyzing each reference.
>>>>> The patch does this.
>>>>> However, as the comment in the patch says, the base alignment guarantees
>>>>> provided by a conditional reference only apply if the reference occurs
>>>>> at least once. In this case it would be legitimate for two references
>>>>> to have the same runtime misalignment and for one reference to provide a
>>>>> stronger compile-time guarantee than the other about what the misalignment
>>>>> actually is. The patch therefore relaxes the assert to handle that case.
>>>> Hmm, but you don't actually check whether a reference occurs only
>>>> do you? You just seem to say that for masked loads/stores the reference
>>>> is conditional (I believe that's not true). But for a loop like
>>>> for (;;)
>>>> if (a[i])
>>>> sum += b[j];
>>>> you still assume b[j] executes unconditionally?
>>> Maybe the documentation isn't clear enough, but DR_IS_CONDITIONAL
>>> was supposed to mean "even if the containing statement executes
>>> and runs to completion, the reference might not actually occur".
>>> The example above isn't conditional in that sense because the
>>> reference to b[j] does occur if the store is reached and completes.
>>> Masked loads and stores are conditional in that sense though.
>>> The reference only occurs if the mask is nonzero; the memory
>>> isn't touched otherwise. The functions are used to if-convert
>>> things like:
>>> for (...)
>>> a[i] = b[i] ? c[i] : d[i];
>>> where there's no guarantee that it's safe to access c[i] when !b[i]
>>> (or d[i] when b[i]). No reference occurs for an all-false mask.
>> But as you touch generic data-ref code here you should apply more
>> sensible semantics to DR_IS_CONDITIONAL than just marking
>> masked loads/stores but not DRs occuring inside BBs only executed
>> conditionally ...
> I don't see why that's more sensible though. If a statement is only
> conditionally executed in a loop, it's up to the consumer to decide
> what to do about that. The conditions under which the statement
> is reached are a control-flow issue and tree-data-ref.c doesn't
> have any special information about it.
> Masked loads and stores are special because the DR_REFs created by
> tree-data-ref.c are artificial: they didn't exist as MEM_REFs in the
> original DR_STMT. And AIUI they didn't exist as MEM_REFs precisely
> because they're not guaranteed to happen, even if the load or store
> statement itself is executed. So in this case the DR_IS_CONDITIONAL
> is reflecting something that tree-data-ref.c itself has done.
> How about calling it DR_IS_CONDITIONAL_IN_STMT to avoid the
> general-sounding name?
That sounds better and avoids the ambiguity.
>>>> The vectorizer of course only sees unconditionally executed stmts.
>>>> So - I'd simply not add this DR_IS_CONDITIONAL. Did you run into
>>>> any real-world (testsuite) issues without this?
>>> Dropping DR_IS_CONDITIONAL would cause us to make invalid alignment
>>> assumptions in silly corner cases. I could add a scan test for it,
>>> for targets with masked loads and stores. It wouldn't trigger
>>> an execution failure though because we assume that targets with
>>> masked loads and stores allow unaligned accesses:
>>> /* For now assume all conditional loads/stores support unaligned
>>> access without any special code. */
>>> if (is_gimple_call (stmt)
>>> && gimple_call_internal_p (stmt)
>>> && (gimple_call_internal_fn (stmt) == IFN_MASK_LOAD
>>> || gimple_call_internal_fn (stmt) == IFN_MASK_STORE))
>>> return dr_unaligned_supported;
>>> So the worst that would happen is that we'd supposedly peel for
>>> alignment, but actually misalign everything instead, and so make
>>> things slower rather than quicker.
>>>> Note that the assert is to prevent bogus information. Iff we aligned
>>>> DR with base alignment 8 and misalign 3 then if another same-align
>>>> DR has base alignment 16 we can't simply zero its DR_MISALIGNMENT
>>>> as it still can be 8 after aligning DR.
>>>> So I think it's wrong to put DRs with differing base-alignment into
>>>> the same-align-refs chain, those should get their DR_MISALIGNMENT
>>>> updated independenlty after peeling.
>>> DR_MISALIGNMENT is relative to the vector alignment rather than
>>> the base alignment though. So:
>> We seem to use it that way, yes (looking at set_ptr_info_alignment
>> uses). So why not fix the assert then by capping the alignment/misalignment
>> we compute at this value as well? (and document this in the header
>> around DR_MISALIGNMENT)
>> Ideally we'd do alignment analysis independent of the vector size
>> though (for those stupid targets with multiple vector sizes to consider...).
>>> a) when looking for references *A1 and *A2 with the same alignment,
>>> we simply have to prove that A1 % vecalign == A2 % vecalign.
>>> This doesn't require any knowledge about the base alignment.
>>> If we break the addresses down as:
>>> A1 = BASE1 + REST1, REST1 = INIT1 + OFFSET1 + X * STEP1
>>> A2 = BASE2 + REST2, REST2 = INIT2 + OFFSET2 + X * STEP2
>>> and can prove that BASE1 == BASE2, the alignment of that base
>>> isn't important. We simply need to prove that REST1 % vecalign
>>> == REST2 % vecalign for all X.
>>> b) In the assert, we've peeled the loop so that DR_PEEL is guaranteed
>>> to be vector-aligned. If DR_PEEL is A1 in the example above, we have
>>> A1 % vecalign == 0, so A2 % vecalign must be 0 too. This again doesn't
>>> rely on the base alignment being known.
>>> What a high base alignment for DR_PEEL gives us is the ability to know
>>> at compile how many iterations need to be peeled to make DR_PEEL aligned.
>>> But the points above apply regardless of whether we know that value at
>>> compile time or not.
>>> In examples like the test case, we would have known at compile time that
>>> VF-1 iterations would need to be peeled if we'd picked the store as the
>>> DR_PEEL, but would have treated the number of peels as variable if we'd
>>> picked the load. The value calculated at runtime would still have been
>>> VF-1, it's just that the code wouldn't have been as efficient.
>>> One of the benefits of pooling the alignments for unconditional references
>>> is that it no longer matters which DR we pick: the number of peels will
>>> be a compile-time constant both ways.
>>>> I'd rather not mix fixing this with the improvement to eventuall use a
>>>> larger align for the other DR if possible.
>> So can you fix the ICE with capping base alignment / DR_MISALIGNMENT?
> I don't think the problem is the lack of a cap. In the test case we
> see that:
> 1. B is known at compile time to be X * vecsize + Y when considered in
> isolation, because the base alignment derived from its DR_REF >= vecsize.
> So DR_MISALIGNMENT (B) == Y.
> 2. A's misalignment wrt vecsize is not known at compile time when
> considered in isolation, because no useful base alignment can be
> derived from its DR_REF. (The DR_REF is to a plain int rather than
> to a structure with a high alignment.) So DR_MISALIGNMENT (A) == -1.
> 3. A and B when considered as a pair trivially have the same misalignment
> wrt vecsize, for the reasons above.
> Each of these results is individually correct. The problem is that the
> assert is conflating two things: it's saying that if we know two datarefs
> have the same misaligment, we must either be able to calculate a
> compile-time misalignment for both datarefs in isolation, or we must
> fail to calculate a compile-time misalignment for both datarefs in
> isolation. That isn't true: it's valid to have situations in which the
> compile-time misalignment is known for one dataref in isolation but not
> for the other.
True. So the assert should then become
gcc_assert (! known_alignment_for_access_p (dr)
|| DR_MISALIGNMENT (dr) / dr_size ==
DR_MISALIGNMENT (dr_peel) / dr_peel_size);