This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

RE: PATCH: Update classification of aggregates with __m256 ([AVX]: Update x86-64 psABI for aggregates with __m256)


Hi HJ,
>>
>> -\item If the size of an object is larger than two \eightbytes, or
>> +\item If the size of an object is larger than four \eightbytes, or
>> + it contains unaligned fields, it has class MEMORY.
>>
>> I think this may be confusing to the reader. It maybe better to
state:
>>
>> "If the size of an object is larger than four eightbytes, or it
contains unaligned fields, it has
>class MEMORY.
>
>I can't tell the difference between my wording and yours.

None for this sentence, but it would be better if written together with
the two sentences below.

>> The post merger clean up ensures that, for the processors that do not
support the __m256 type, if
>the size of an object is larger than two eightbytes and the first
eightbyte is not SSE or any other
>eightbyte is not SSEUP, it still has class MEMORY.
>>
>> This in turn ensures that for processors that do support the __m256
type, if the size of an
>object is four eightbytes and the first eightbyte is SSE and all other
eightbytes are SSEUP, it can
>be passed in a register."
>

One has to read much forward in the docs to figure out that nothing has
changed for processors that do not support the __m256 type. Stating
these points separately and in some cases leaving them to implication
will confuse readers and its best to clarify upfront.

>
>> -  \item If SSEUP is not preceeded by SSE, it is converted to SSE.
>> +  \item If the size of the aggregate exceeds two \eightbytes and the
first
>> +    \eightbyte isn't SSE or any other \eightbyte isn't SSEUP, the
whole
>> +    argument is passed in memory.
>> +  \item If SSEUP is not preceded by SSE or SSEUP, it is converted to
SSE.
>>
>> Again it may be better to say:
>> "Otherwise, if SSEUP is not preceded by SSE or SSEUP and the size of
the aggregate does not
>exceed four eightbytes, it is converted to SSE."
>
>The post merger clean up rules are in strict order. You added "and the
>size of the aggregate does not exceed four eightbytes". I don't think
it is
>necessary since it is covered by "If the size of the aggregate
>exceeds two \eightbytes and the first \eightbyte isn't SSE or any other
>\eightbyte isn't SSEUP, the whole argument is passed in memory."
>You don't need to check the size of the aggregate for the last
>rule.

I don't believe we say any place in the document that all these things
are checked in strict order and also the related things checked in order
are separated over the document and not listed together.

>>> -\item If the class is SSE, the next available SSE register is used,
the
>>> +\item If the class is SSE, the next available vector register is
used,
>>> the
>>>     registers are taken in the order from \reg{xmm0} to \reg{xmm7}.
>>>
>>> -\item If the class is SSE, the next available SSE register of the
>>> +\item If the class is SSE, the next available vector register of
the
>>>     sequence \reg{xmm0}, \reg{xmm1} is used.
>>>
>>
>> Maybe better to say \reg{xmm0} to \reg{xmm7} or \reg{ymm0} to
\reg{ymm7}.
>
>You can have mixed xmmN/ymmN. xmmN refers to vector register N,
>which can be either 128bit or 256bit. Adding "\reg{ymm0} to \reg{ymm7}"
>doesn't make it easier to understand.
>

Ok, I have no strong preference on this.

>>> Here are a few comments on the patch?
>>>
>>> -------------
>>> @@ -5331,14 +5352,22 @@ construct_container (enum machine_mode m
>>>           break;
>>>         case X86_64_SSE_CLASS:
>>>           if (i < n - 1 && regclass[i + 1] == X86_64_SSEUP_CLASS)
>>> -           tmpmode = TImode;
>>> +           {
>>> +             if (regclass[i + 2] == X86_64_SSEUP_CLASS
>>> +                 || regclass[i + 3] == X86_64_SSEUP_CLASS)
>>> +               tmpmode = OImode;
>>> +             else
>>> +               tmpmode = TImode;
>>> +           }
>>>
>>
>>
>> I would think a check for n is needed here. If n is 2, regclass[i +
2] and regclass[i + 3] would
>not be valid, right?
>>
>
>You are right. Let me think about it.

Ok.

>Thanks.
>
>
>--
>H.J.



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]