This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Factor out division by squares and remove division around comparisons (1/2)


On 09/16/2017 03:39 PM, Bernhard Reutner-Fischer wrote:
> On 15 September 2017 18:50:26 CEST, Jeff Law <law@redhat.com> wrote:
>> On 09/13/2017 03:20 PM, Wilco Dijkstra wrote:
>>> Jeff Law wrote:
>>>> On 09/06/2017 03:55 AM, Jackson Woodruff wrote:
>>>>> On 08/30/2017 01:46 PM, Richard Biener wrote:
>>>
>>>>>>    rdivtmp = 1 / (y*C);
>>>>>>    tem = x *rdivtmp;
>>>>>>    tem2= z * rdivtmp;
>>>>>>
>>>>>> instead of
>>>>>>
>>>>>>    rdivtmp = 1/y;
>>>>>>    tem = x * 1/C * rdivtmp;
>>>>>>    tem2 = z * 1/C * rdivtmp;
>>>>>
>>>>> Ideally we would be able to CSE that into
>>>>>
>>>>> rdivtmp = 1/y * 1/C;
>>>>> tem = x * rdivtmp;
>>>>> tem2 = z * rdivtmp;
>>>> So why is your sequence significantly better than Richi's desired
>>>> seqeuence?  They both seem to need 3 mults and a division (which in
>> both
>>>> cases might be a reciprocal estimation).    In Richi's sequence we
>> have
>>>> to mult and feed the result as an operand into the reciprocal insn. 
>> In
>>>> yours we feed the result of the reciprocal into the multiply.
>>>
>>> Basically this stuff happens a lot in real code, which is exactly why
>> I proposed it.
>>> I even provided counts of how many divisions each transformation
>> avoids:
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71026. 
>> I don't doubt that it happens in real code.  There's a reason why MIPS
>> IV added recip and rsqrt 20 years ago.  Our ability to exploit them has
>> always been fairly limited though.
>>
>> What I'm trying to avoid is a transformation where the two forms are
>> roughly equal in terms of what they expose vs what they hide.
>>
>> If pulling the 1/c out is consistently better, then that's obviously
>> good.  If it's essentially a toss-up because of the other interactions
>> with CSE, then we need to think about it more deeply.
>>
>> I _suspect_ that pulling the 1/c out is generally better, but something
>> more concrete than my intuition would be helpful.
>>
>>
>>
>>
>>>
>>> Note this transformation is a canonicalization - if you can't merge a
>>> constant somehow, moving it out to the LHS can expose more
>>> opportunities, like in (C1 * x) / (C2 * y) -> (C1 * x * C2) / y ->
>> (C3 * x) / y.
>>> Same for negation as it behaves like * -1.
>> FWIW, I generally agree.
>>
>>>
>>> The key here is that it is at least an order of magnitude worse if
>> you have
>>> to execute an extra division than if you have an extra multiply.
>> No doubt.  I'll trade a division for a multiply any day.  Similarly 1/C
>> is just a constant, so I consider it essentially free.
>>
>>
>>>
>>>> ISTM in the end if  y*C or 1/(y*C) is a CSE, then Richi's sequence
>> wins.
>>>>  Similarly if 1/y is a CSE, then yours wins.  Is there some reason
>> to
>>>> believe that one is a more likely CSE than the other?
>>>
>>> The idea is that 1/y is much more likely a CSE than 1/(y*C).
>> Do we have anything other than intuition to back that up?
>>>
>>> We could make the pattern only fire in single use cases and see
>> whether
>>> that makes a difference. It would be easy to test old vs new vs
>> single-use
>>> new and count how many divisions we end up with. We could add 1/ (y *
>> C)
>>> to the reciprocal phase if it is unacceptable as a canonicalization,
>> but then
>>> you won't be able to optimize (C1 * x * C2) / y.
>> We could.  I think the question would then become does restricting to
>> the single-use case kill too many opportunities.
>>
>> Sigh.  I think the 4 of us could go round and round on this forever in
>> the pursuit of the perfect code.  Though in reality I'm happy with a
>> clear improvement.
> 
> 
> Btw reminds me a little bit of https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28417
> which IIRC never was applied. Maybe someone could talk Denys (a colleague of yours nowadays, Jeff) into submitting a real patch? ;)
Denys has been a Red Hatter for a long time (approaching 10 years now I
think).  He's not in the compiler group, but is in the same organization
as the compiler group.

Tege hasn't worked on GCC regularly in years and Denys hasn't ever
really engaged the GCC community all that well.  I wouldn't expect 28417
to move forward without something other than Tege and Denys pushing on it.


jeff


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]