This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: PR81136: ICE from inconsistent DR_MISALIGNMENTs


On Mon, Jun 26, 2017 at 1:50 PM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> Richard Biener <richard.guenther@gmail.com> writes:
>> On Mon, Jun 26, 2017 at 1:14 PM, Richard Sandiford
>> <richard.sandiford@linaro.org> wrote:
>>> Richard Biener <richard.guenther@gmail.com> writes:
>>>> On Fri, Jun 23, 2017 at 2:05 PM, Richard Sandiford
>>>> <richard.sandiford@linaro.org> wrote:
>>>>> Richard Biener <richard.guenther@gmail.com> writes:
>>>>>> On Thu, Jun 22, 2017 at 1:30 PM, Richard Sandiford
>>>>>> <richard.sandiford@linaro.org> wrote:
>>>>>>> The test case triggered this assert in vect_update_misalignment_for_peel:
>>>>>>>
>>>>>>>           gcc_assert (DR_MISALIGNMENT (dr) / dr_size ==
>>>>>>>                       DR_MISALIGNMENT (dr_peel) / dr_peel_size);
>>>>>>>
>>>>>>> We knew that the two DRs had the same misalignment at runtime, but when
>>>>>>> considered in isolation, one data reference guaranteed a higher compile-time
>>>>>>> base alignment than the other.
>>>>>>>
>>>>>>> In the test case this looks like a missed opportunity.  Both references
>>>>>>> are unconditional, so it should be possible to use the highest of the
>>>>>>> available base alignment guarantees when analyzing each reference.
>>>>>>> The patch does this.
>>>>>>>
>>>>>>> However, as the comment in the patch says, the base alignment guarantees
>>>>>>> provided by a conditional reference only apply if the reference occurs
>>>>>>> at least once.  In this case it would be legitimate for two references
>>>>>>> to have the same runtime misalignment and for one reference to provide a
>>>>>>> stronger compile-time guarantee than the other about what the misalignment
>>>>>>> actually is.  The patch therefore relaxes the assert to handle that case.
>>>>>>
>>>>>> Hmm, but you don't actually check whether a reference occurs only
>>>> conditional,
>>>>>> do you?  You just seem to say that for masked loads/stores the reference
>>>>>> is conditional (I believe that's not true).  But for a loop like
>>>>>>
>>>>>>  for (;;)
>>>>>>    if (a[i])
>>>>>>      sum += b[j];
>>>>>>
>>>>>> you still assume b[j] executes unconditionally?
>>>>>
>>>>> Maybe the documentation isn't clear enough, but DR_IS_CONDITIONAL
>>>>> was supposed to mean "even if the containing statement executes
>>>>> and runs to completion, the reference might not actually occur".
>>>>> The example above isn't conditional in that sense because the
>>>>> reference to b[j] does occur if the store is reached and completes.
>>>>>
>>>>> Masked loads and stores are conditional in that sense though.
>>>>> The reference only occurs if the mask is nonzero; the memory
>>>>> isn't touched otherwise.  The functions are used to if-convert
>>>>> things like:
>>>>>
>>>>>    for (...)
>>>>>      a[i] = b[i] ? c[i] : d[i];
>>>>>
>>>>> where there's no guarantee that it's safe to access c[i] when !b[i]
>>>>> (or d[i] when b[i]).  No reference occurs for an all-false mask.
>>>>
>>>> But as you touch generic data-ref code here you should apply more
>>>> sensible semantics to DR_IS_CONDITIONAL than just marking
>>>> masked loads/stores but not DRs occuring inside BBs only executed
>>>> conditionally ...
>>>
>>> I don't see why that's more sensible though.  If a statement is only
>>> conditionally executed in a loop, it's up to the consumer to decide
>>> what to do about that.  The conditions under which the statement
>>> is reached are a control-flow issue and tree-data-ref.c doesn't
>>> have any special information about it.
>>>
>>> Masked loads and stores are special because the DR_REFs created by
>>> tree-data-ref.c are artificial: they didn't exist as MEM_REFs in the
>>> original DR_STMT.  And AIUI they didn't exist as MEM_REFs precisely
>>> because they're not guaranteed to happen, even if the load or store
>>> statement itself is executed.  So in this case the DR_IS_CONDITIONAL
>>> is reflecting something that tree-data-ref.c itself has done.
>>>
>>> How about calling it DR_IS_CONDITIONAL_IN_STMT to avoid the
>>> general-sounding name?
>>
>> That sounds better and avoids the ambiguity.
>
> OK.
>
>>>>>> The vectorizer of course only sees unconditionally executed stmts.
>>>>>>
>>>>>> So - I'd simply not add this DR_IS_CONDITIONAL.  Did you run into
>>>>>> any real-world (testsuite) issues without this?
>>>>>
>>>>> Dropping DR_IS_CONDITIONAL would cause us to make invalid alignment
>>>>> assumptions in silly corner cases.  I could add a scan test for it,
>>>>> for targets with masked loads and stores.  It wouldn't trigger
>>>>> an execution failure though because we assume that targets with
>>>>> masked loads and stores allow unaligned accesses:
>>>>>
>>>>>   /* For now assume all conditional loads/stores support unaligned
>>>>>      access without any special code.  */
>>>>>   if (is_gimple_call (stmt)
>>>>>       && gimple_call_internal_p (stmt)
>>>>>       && (gimple_call_internal_fn (stmt) == IFN_MASK_LOAD
>>>>>           || gimple_call_internal_fn (stmt) == IFN_MASK_STORE))
>>>>>     return dr_unaligned_supported;
>>>>>
>>>>> So the worst that would happen is that we'd supposedly peel for
>>>>> alignment, but actually misalign everything instead, and so make
>>>>> things slower rather than quicker.
>>>>>
>>>>>> Note that the assert is to prevent bogus information.  Iff we aligned
>>>>>> DR with base alignment 8 and misalign 3 then if another same-align
>>>>>> DR has base alignment 16 we can't simply zero its DR_MISALIGNMENT
>>>>>> as it still can be 8 after aligning DR.
>>>>>>
>>>>>> So I think it's wrong to put DRs with differing base-alignment into
>>>>>> the same-align-refs chain, those should get their DR_MISALIGNMENT
>>>>>> updated independenlty after peeling.
>>>>>
>>>>> DR_MISALIGNMENT is relative to the vector alignment rather than
>>>>> the base alignment though.  So:
>>>>
>>>> We seem to use it that way, yes (looking at set_ptr_info_alignment
>>>> uses).  So why not fix the assert then by capping the alignment/misalignment
>>>> we compute at this value as well?  (and document this in the header
>>>> around DR_MISALIGNMENT)
>>>>
>>>> Ideally we'd do alignment analysis independent of the vector size
>>>> though (for those stupid targets with multiple vector sizes to consider...).
>>>>
>>>>> a) when looking for references *A1 and *A2 with the same alignment,
>>>>>    we simply have to prove that A1 % vecalign == A2 % vecalign.
>>>>>    This doesn't require any knowledge about the base alignment.
>>>>>    If we break the addresses down as:
>>>>>
>>>>>       A1 = BASE1 + REST1,  REST1 = INIT1 + OFFSET1 + X * STEP1
>>>>>       A2 = BASE2 + REST2,  REST2 = INIT2 + OFFSET2 + X * STEP2
>>>>>
>>>>>    and can prove that BASE1 == BASE2, the alignment of that base
>>>>>    isn't important.  We simply need to prove that REST1 % vecalign
>>>>>    == REST2 % vecalign for all X.
>>>>>
>>>>> b) In the assert, we've peeled the loop so that DR_PEEL is guaranteed
>>>>>    to be vector-aligned.  If DR_PEEL is A1 in the example above, we have
>>>>>    A1 % vecalign == 0, so A2 % vecalign must be 0 too.  This again doesn't
>>>>>    rely on the base alignment being known.
>>>>>
>>>>> What a high base alignment for DR_PEEL gives us is the ability to know
>>>>> at compile how many iterations need to be peeled to make DR_PEEL aligned.
>>>>> But the points above apply regardless of whether we know that value at
>>>>> compile time or not.
>>>>>
>>>>> In examples like the test case, we would have known at compile time that
>>>>> VF-1 iterations would need to be peeled if we'd picked the store as the
>>>>> DR_PEEL, but would have treated the number of peels as variable if we'd
>>>>> picked the load.  The value calculated at runtime would still have been
>>>>> VF-1, it's just that the code wouldn't have been as efficient.
>>>>>
>>>>> One of the benefits of pooling the alignments for unconditional references
>>>>> is that it no longer matters which DR we pick: the number of peels will
>>>>> be a compile-time constant both ways.
>>>>>
>>>>> Thanks,
>>>>> Richard
>>>>>
>>>>>> I'd rather not mix fixing this with the improvement to eventuall use a
>>>>>> larger align for the other DR if possible.
>>>>
>>>> ^^^
>>>>
>>>> So can you fix the ICE with capping base alignment / DR_MISALIGNMENT?
>>>
>>> I don't think the problem is the lack of a cap.  In the test case we
>>> see that:
>>>
>>> 1. B is known at compile time to be X * vecsize + Y when considered in
>>>    isolation, because the base alignment derived from its DR_REF >= vecsize.
>>>    So DR_MISALIGNMENT (B) == Y.
>>>
>>> 2. A's misalignment wrt vecsize is not known at compile time when
>>>    considered in isolation, because no useful base alignment can be
>>>    derived from its DR_REF.  (The DR_REF is to a plain int rather than
>>>    to a structure with a high alignment.)  So DR_MISALIGNMENT (A) == -1.
>>>
>>> 3. A and B when considered as a pair trivially have the same misalignment
>>>    wrt vecsize, for the reasons above.
>>>
>>> Each of these results is individually correct.  The problem is that the
>>> assert is conflating two things: it's saying that if we know two datarefs
>>> have the same misaligment, we must either be able to calculate a
>>> compile-time misalignment for both datarefs in isolation, or we must
>>> fail to calculate a compile-time misalignment for both datarefs in
>>> isolation.  That isn't true: it's valid to have situations in which the
>>> compile-time misalignment is known for one dataref in isolation but not
>>> for the other.
>>
>> True.  So the assert should then become
>>
>>       gcc_assert (! known_alignment_for_access_p (dr)
>>                   || DR_MISALIGNMENT (dr) / dr_size ==
>>                     DR_MISALIGNMENT (dr_peel) / dr_peel_size);
>>
>> ?
>
> I think it would need to be:
>
>       gcc_assert (!known_alignment_for_access_p (dr)
>                   || !known_alignment_for_access_p (dr_peel)
>                   || (DR_MISALIGNMENT (dr) / dr_size
>                       == DR_MISALIGNMENT (dr_peel) / dr_peel_size));

I think for !known_alignment_for_access_p (dr_peel) the assert doesn't make
any sense (DR_MISALIGNMENT is -1), so yes, you are right.

> But yeah, that would work too.  The idea with the assert in the patch was
> that for unconditional references we probably *do* want to try to compute
> the same compile-time misalignment, but for optimisation reasons rather
> than correctness.  Maybe that's more properly a gcc_checking_assert
> though, since nothing goes wrong if it fails.  So perhaps:

We shouldn't have asserts for optimization reasons, even with checking
IMHO.

>
>      gcc_checking_assert (DR_IS_CONDITIONAL_IN_STMT (dr)
>                           || DR_IS_CONDITIONAL_IN_STMT (dr_peel)
>                           || (known_alignment_for_access_p (dr)
>                               == known_alignment_for_access_p (dr_peel)));
>
> as a follow-on assert.
>
> Should I split it into two patches, one to change the gcc_assert and
> another to add the optimisation?

Yes please.

Thanks,
Richard.

> Thanks,
> Richard


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]