[EXTERNAL] Re: [PATCH][WIP] PR tree-optimization/101808 Boolean comparison simplification

Jeff Law jeffreyalaw@gmail.com
Tue Nov 23 20:02:37 GMT 2021

On 11/23/2021 12:42 PM, Navid Rahimi wrote:
> In the case of x86_64, this is the code:
> src_1(bool, bool):
>          cmp     dil, sil
>          setb    al
>          ret
> tgt_1(bool, bool):
>          xor     edi, 1
>          mov     eax, edi
>          and     eax, esi
>          ret
> Let's look at the latency of src_1:
> cmp: latency of 1 (page 663, table C-17).
> setb: latency of 2. The Intel instruction manual does not report a latency for setb, but the closest listed instruction, setbe, has a latency of 2.
> But for tgt_1:
> xor: latency 1.
> mov: latency 1. (But it seems x86_64 optimizes this instruction away, so it is effectively latency 0 in this case. The "Zero-Latency MOV Instructions" section explains this [1].)
> and: latency 1.
> So even if you count setb as latency 1, the two sequences are equal. But if it is latency 2, tgt_1 should be a one-cycle latency win.
> [1] https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf
But these are target issues you've raised -- those should be handled in 
the RTL pipeline and are not a significant concern for gimple.

In gimple your primary goal should be to reduce the number of 
expressions that are evaluated.  This patch does the opposite.
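
To make the expression count concrete, here is a minimal C sketch of the two forms. The function bodies are an assumption reconstructed from the quoted assembly (an unsigned "below" compare for src_1, an xor with 1 followed by an and for tgt_1); they are not copied from the patch itself.

```c
#include <stdbool.h>

/* Assumed source form: one comparison, so a single expression
   is evaluated at the GIMPLE level. */
bool src_1(bool x, bool y)
{
    return x < y;
}

/* Assumed transformed form: a negation followed by a bitwise AND,
   so two expressions are evaluated for the same result. */
bool tgt_1(bool x, bool y)
{
    return !x & y;
}
```

Both functions agree on all four input combinations, so the rewrite preserves the result but replaces one expression with two.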


