[EXTERNAL] Re: [PATCH][WIP] PR tree-optimization/101808 Boolean comparison simplification

Jeff Law jeffreyalaw@gmail.com
Tue Nov 23 20:02:37 GMT 2021



On 11/23/2021 12:42 PM, Navid Rahimi wrote:
> In the case of x86_64, this is the code:
>
> src_1(bool, bool):
>          cmp     dil, sil
>          setb    al
>          ret
>
> tgt_1(bool, bool):
>          xor     edi, 1
>          mov     eax, edi
>          and     eax, esi
>          ret
>
>
> Let's look at the latency of src_1:
> cmp: latency 1 (page 663, table C-17).
> setb: latency 2. The Intel manual doesn't report a latency for setb, but the closest instruction, setbe, has a latency of 2.
>
> But for tgt_1:
> xor: latency 1.
> mov: latency 1. (But it seems x86_64 optimizes this instruction, so it is effectively latency 0 in this case. The Zero-Latency MOV Instructions section explains it [1].)
> and: latency 1.
>
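> So even if you consider setb as latency 1, the two are equal (1 + 1 = 2 cycles for src_1 versus 1 + 0 + 1 = 2 for tgt_1, counting the mov as zero latency). But if setb has latency 2, tgt_1 wins by one cycle (3 versus 2).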
>
> [1] https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf
But these are target issues you've raised -- those should be handled in 
the RTL pipeline and are not a significant concern for gimple.

In gimple your primary goal should be to reduce the number of 
expressions that are evaluated.  This patch does the opposite.
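
To make the expression count concrete, here is a minimal C++ sketch. The function bodies are an assumption reconstructed from the quoted assembly (they are not given in this message), and check_equivalence is a hypothetical helper added only for illustration. The gimple shown in the comments is approximate.

```cpp
#include <cassert>

// Assumed source form: a single comparison.
// In gimple this is roughly one statement:  _1 = a < b;
bool src_1(bool a, bool b) { return a < b; }

// Assumed target form: a negation followed by an AND.
// In gimple this is roughly two statements: _1 = !a;  _2 = _1 & b;
bool tgt_1(bool a, bool b) { return !a & b; }

// Hypothetical helper: checks that both forms agree on all four inputs.
void check_equivalence() {
    for (int a = 0; a <= 1; ++a) {
        for (int b = 0; b <= 1; ++b) {
            assert(src_1(a != 0, b != 0) == tgt_1(a != 0, b != 0));
        }
    }
}
```

Both functions compute the same value, but the second needs two expressions where the first needs one, which is the count that matters at the gimple level.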

jeff


