[EXTERNAL] Re: [PATCH][WIP] PR tree-optimization/101808 Boolean comparison simplification
Tue Nov 23 20:08:58 GMT 2021
> In gimple your primary goal should be to reduce the number of
> expressions that are evaluated. This patch does the opposite.
That is actually a really good point, in my opinion. I am hesitant about this patch and wanted to hear the gcc-patches list's opinion on it. Doing something like this at the IR level is a little counterintuitive to me. I will take a look at LLVM in my spare time to see where they perform that transformation and what the rationale behind it was.
From: Jeff Law <firstname.lastname@example.org>
Sent: Tuesday, November 23, 2021 12:02
To: Navid Rahimi; Navid Rahimi via Gcc-patches
Subject: Re: [EXTERNAL] Re: [PATCH][WIP] PR tree-optimization/101808 Boolean comparison simplification
On 11/23/2021 12:42 PM, Navid Rahimi wrote:
> In the case of x86_64, this is the code:
> src_1(bool, bool):
> cmp dil, sil
> setb al
> tgt_1(bool, bool):
> xor edi, 1
> mov eax, edi
> and eax, esi
> Let's look at the latency of src_1:
> cmp: latency of 1: (page 663, table C-17)
> setb: latency of 2. The Intel manual does not report a latency for setb, but the closest instruction, setbe, has a latency of 2.
> But for tgt_1:
> xor: latency 1.
> mov: latency 1. (But it seems x86_64 optimizes this instruction, so it is effectively latency 0 in this case. The Zero-Latency MOV Instructions section explains this.)
> and: latency 1.
> So even if you count setb as latency 1, the two sequences are equal. But if setb has latency 2, tgt_1 should be a 1-cycle latency win.
> 1) https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf
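For reference, the latency argument quoted above can be tallied as a small C sketch. It only adds up the per-instruction numbers given in the message (nothing is measured here), and it assumes each sequence is a single dependency chain, which is how the listing reads. The function names are made up for illustration.

```c
#include <assert.h>

/* Per-instruction latencies in cycles, as quoted in the message above. */
enum { LAT_CMP = 1, LAT_XOR = 1, LAT_AND = 1 };

/* src_1: cmp -> setb, assumed to be one dependency chain. */
static int src_1_latency(int lat_setb)
{
    return LAT_CMP + lat_setb;
}

/* tgt_1: xor -> mov -> and, assumed to be one dependency chain. */
static int tgt_1_latency(int lat_mov)
{
    return LAT_XOR + lat_mov + LAT_AND;
}

/* Checks the two claims from the message, plus the case where mov is not free. */
static void check_latency_claims(void)
{
    /* setb = 1, mov eliminated (latency 0): the sequences tie. */
    assert(src_1_latency(1) == tgt_1_latency(0));
    /* setb = 2, mov eliminated: tgt_1 is one cycle shorter. */
    assert(src_1_latency(2) - tgt_1_latency(0) == 1);
    /* setb = 2, mov not eliminated (latency 1): the sequences tie again. */
    assert(src_1_latency(2) == tgt_1_latency(1));
}
```

So the claimed one-cycle win depends on both setb costing 2 cycles and the mov being eliminated.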
But these are target issues you've raised -- those should be handled in
the RTL pipeline and are not a significant concern for gimple.
In gimple your primary goal should be to reduce the number of
expressions that are evaluated. This patch does the opposite.
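To make the expression-count point concrete, here is a minimal C sketch of the two forms. The function bodies are an assumption inferred from the assembly quoted earlier (cmp/setb reads as an unsigned `a < b`, and xor/and reads as `!a & b`); the PR's actual testcase is not shown in this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed source form: a single comparison expression. */
static bool src_1(bool a, bool b)
{
    return a < b;
}

/* Assumed rewritten form: two expressions, a negation and then an AND. */
static bool tgt_1(bool a, bool b)
{
    return !a & b;
}

/* The two forms agree on all four boolean inputs. */
static void check_all_inputs(void)
{
    for (int a = 0; a <= 1; a++)
        for (int b = 0; b <= 1; b++)
            assert(src_1(a, b) == tgt_1(a, b));
}
```

The two functions are equivalent, but the rewritten form evaluates two expressions where the original evaluates one, which is the objection raised above.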