[EXTERNAL] Re: [PATCH] PR tree-optimization/102232 Adding a missing pattern to match.pd

Navid Rahimi navidrahimi@microsoft.com
Tue Nov 23 03:10:28 GMT 2021

Thanks Jeff for this too.

Best wishes,

From: Jeff Law <jeffreyalaw@gmail.com>
Sent: Monday, November 22, 2021 19:09
To: Richard Biener; Navid Rahimi
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [EXTERNAL] Re: [PATCH] PR tree-optimization/102232 Adding a missing pattern to match.pd

[You don't often get email from jeffreyalaw@gmail.com. Learn why this is important at http://aka.ms/LearnAboutSenderIdentification.]

On 11/10/2021 1:35 AM, Richard Biener via Gcc-patches wrote:
> On Tue, Nov 9, 2021 at 5:25 PM Navid Rahimi <navidrahimi@microsoft.com> wrote:
>> Hi Richard,
>> Thank you so much for your detailed feedback. I am attaching another version of the patch which does include all the changes you mentioned.
>> Bellow you can see my response to your feedbacks:
>>> the canonical order of the plus is (plus:s (trunc_div ...) integer_onep) as
>>> constants come last - you can then remove the 'c'
>> Fixed. I was not aware of the canonical order.
>>> you can use INTEGRAL_TYPE_P (type).
>> Fixed. Didn't know about "type" either.
>>> this test is not necessary
>> Fixed.
>>> But should we also optimize x * (2 + y/x) - y -> 2*x - y % x?  So
>>> it looks like a conflict with the x * (1 + b) <-> x + x * b transform
>>> (fold_plusminus_mult)?  That said, the special case of one
>>> definitely makes the result cheaper (one less operation).
>> For this special case, it does remove operation indeed. But I was not able to reproduce it for any other constant [1]. If it was possible to do it with other constants I would've changed the pattern to have be more general like "x * (C + y/x) - y -> C*x - y % x". Basically anything other than 1 wasn't a win. Regarding the "x * (1 + b) <-> x + x * b" transformation, it appears to me when there is a "- y" at the end "x * (1 + b)", there is opportunity to optimize. Without that "- y" I was not able to make any operation more performant. Either direction, looks like same amount of computation.
>> 1) https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcompiler-explorer.com%2Fz%2FdWsq7Tzf4&data=04%7C01%7Cnavidrahimi%40microsoft.com%7C1ddd355d86ee4a5d645808d9ae2e9e73%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637732337521412439%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=K4ecZcigYkh8%2BSiqDegRxm72gkL%2FnsbuDDp5nsM3ZqA%3D&reserved=0
>>> Please move the pattern next to the most relatest which I think is
>> Fixed.
>>> the return value of 1 is an unreliable way to fail, please instead call
>>> __builtin_abort ();
>> Fixed.
>>> do we really need -O3 for this?  I think that -O should be enough here?
>> We don't specifically need that. But I realized that the optimization can happen in two different level at the compiler. It seems if you spread the computation over multiple statement like:
>>    int c = a/b;
>>    int y = b * (1 + c);
>>    return y - a;
>> instead of :
>>    return b * (1 + a / b) - a;
>> Then you have to have at least -O1 to have it optimized. Granted, I am not doing that in the testcase. In the new patch I am changing it to -O. Let me know if you have any suggestions.
> -O is fine, generally at -O0 we shouldn't expect such transforms to
> happen (but they still do, of course).
> The patch looks OK now.
I've pushed this to the trunk for Navid.


More information about the Gcc-patches mailing list