Bug 85758 - questionable bitwise folding (missing single use check?)
Summary: questionable bitwise folding (missing single use check?)
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: unknown
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-05-12 10:51 UTC by Alexander Monakov
Modified: 2018-08-27 14:26 UTC (History)
0 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Alexander Monakov 2018-05-12 10:51:17 UTC
The following should be translated as-is:

void f(int a, int b);
void g(int a, int b, int m, int s)
{
    m &= s;
    a += m;
    m ^= s;
    b += m;
    f(a, b);
}

However instead of and/add/xor/add we get mov/not/and/and/add/add:

        movl    %edx, %eax
        notl    %edx
        andl    %ecx, %eax
        andl    %edx, %ecx
        addl    %eax, %edi
        addl    %ecx, %esi
        jmp     f

This is because forwprop applies an identity to m = (m & s) ^ s:

g (int a, int b, int m, int s)
{
  <bb 2> :
  m_3 = m_1(D) & s_2(D);
  a_5 = a_4(D) + m_3;
  m_6 = m_3 ^ s_2(D);
  b_8 = b_7(D) + m_6;
  f (a_5, b_8);
  return;
}

gimple_simplified to _11 = ~m_1(D);
m_6 = s_2(D) & _11;
g (int a, int b, int m, int s)
{
  int _11;

  <bb 2> :
  m_3 = m_1(D) & s_2(D);
  a_5 = m_3 + a_4(D);
  _11 = ~m_1(D);
  m_6 = s_2(D) & _11;
  b_8 = m_6 + b_7(D);
  f (a_5, b_8);
  return;
}

However since m_3 is used, this is more costly. Shouldn't this folding check for single use of the intermediate expr? From a quick look, this is probably match.pd:/* Fold (X & Y) ^ Y and (X ^ Y) & Y as ~X & Y.  */
Comment 1 Marc Glisse 2018-05-12 12:41:54 UTC
Direct translation would be (from clang):

	andl	%ecx, %edx
	addl	%edx, %edi
	xorl	%ecx, %edx
	addl	%edx, %esi

With -mbmi, I get

	andn	%ecx, %edx, %eax
	andl	%ecx, %edx
	addl	%eax, %esi
	addl	%edx, %edi

which is comparable to the original (andn has a longer encoding, but the operations are less dependent)

We could use opi:cs in the transformation if that really helps...
Comment 2 Richard Biener 2018-05-14 08:16:40 UTC
(In reply to Alexander Monakov from comment #0)
> However since m_3 is used, this is more costly. Shouldn't this folding check
> for single use of the intermediate expr? From a quick look, this is probably
> match.pd:/* Fold (X & Y) ^ Y and (X ^ Y) & Y as ~X & Y.  */

Just as a note I'd consider ~X & Y cheaper canonicalization-wise because it
has one unary and one binary op vs two binary ones.

Yes, this is probably a candidate for :s

With the current :s semantics :s should eventually be applied automagically by
genmatch.  Note that given match.pd patterns cannot rely on immediate uses
being up-to-date all this is fragile heuristics anyways...
Comment 3 Alexander Monakov 2018-08-27 14:09:21 UTC
Author: amonakov
Date: Mon Aug 27 14:08:50 2018
New Revision: 263887

URL: https://gcc.gnu.org/viewcvs?rev=263887&root=gcc&view=rev
Log:
match.pd: add single-use check for (x & y) ^ y -> ~x & y (PR 85758)

	PR tree-optimization/85758
	* match.pd ((X & Y) ^ Y): Add :s qualifier to inner expression.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/match.pd
Comment 4 Alexander Monakov 2018-08-27 14:26:00 UTC
Fixed.