Bug 19792 - Missed optimizations due to signedness in the way
Summary: Missed optimizations due to signedness in the way
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: unknown
: P2 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization, TREE
Depends on: 15459
Blocks: 19721 19986
  Show dependency treegraph
 
Reported: 2005-02-06 13:50 UTC by Kazu Hirata
Modified: 2023-12-30 01:17 UTC (History)
6 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2019-03-04 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Kazu Hirata 2005-02-06 13:50:36 UTC
Consider:

extern unsigned char size_lookup[257];

int
foo (unsigned int t)
{
  return (size_lookup [(int) t] == size_lookup[t]);
}

int
bar (unsigned int t)
{
  int a = t;
  return a == t;
}

Both functions should return 1, and in fact that's what the RTL optimizers
notice, but the tree optimizers don't.

This is somewhat related to PR 19790.
Comment 1 Andrew Pinski 2005-02-06 16:17:02 UTC
Confirmed, For bar, my tree combiner fixes the missed optimization.
Not for foo.
Comment 2 Kazu Hirata 2005-02-13 14:17:08 UTC
The testcase in this PR was reduced from ggc-page.c:ggc_alloc_typed_stat.
Comment 3 Kazu Hirata 2005-03-16 14:59:51 UTC
FYI, here is the tailc dump that I get with the current mainline.

;; Function foo (foo)

foo (t)
{
  unsigned char D.1137;
  unsigned int t.1;
  unsigned char D.1135;
  int t.0;
  int D.1133;

<bb 0>:
  t.0_2 = (int) t_1;
  D.1135_4 = size_lookup[t.0_2];
  D.1137_6 = size_lookup[t_1];
  D.1133_7 = D.1135_4 == D.1137_6;
  return D.1133_7;

}



;; Function bar (bar)

bar (t)
{
  int a;
  unsigned int a.2;
  int D.1142;

<bb 0>:
  a_2 = (int) t_1;
  a.2_3 = (unsigned int) a_2;
  D.1142_4 = t_1 == a.2_3;
  return D.1142_4;

}
Comment 4 Richard Biener 2006-05-04 13:57:08 UTC
Subject: Bug 19792

Author: rguenth
Date: Thu May  4 13:56:52 2006
New Revision: 113527

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=113527
Log:
2006-05-04  Richard Guenther  <rguenther@suse.de>

	PR tree-optimization/14287
	PR tree-optimization/14844
	PR tree-optimization/19792
	PR tree-optimization/21608
	PR tree-optimization/27090
	* tree-ssa-pre.c (try_combine_conversion): New function.
	(compute_avail): After constructing the value-handle
	expression, use try_combine_conversion to combine NOP_EXPRs
	with previous value-handle expressions and use the result if it
	is available.

	* gcc.dg/tree-ssa/ssa-fre-1.c: New testcase.
	* gcc.dg/tree-ssa/ssa-fre-2.c: Likewise.
	* gcc.dg/tree-ssa/ssa-fre-3.c: Likewise.
	* gcc.dg/tree-ssa/ssa-fre-4.c: Likewise.
	* gcc.dg/tree-ssa/ssa-fre-5.c: Likewise.

Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-1.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-2.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-3.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-4.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-5.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-pre.c

Comment 5 Richard Biener 2006-05-04 15:04:21 UTC
bar is now fixed.


;; Function foo (foo)

Analyzing Edge Insertions.
foo (t)
{
<bb 2>:
  return size_lookup[(int) t] == size_lookup[t];

}



;; Function bar (bar)

Analyzing Edge Insertions.
bar (t)
{
<bb 2>:
  return 1;

}
Comment 6 Andrew Pinski 2006-05-05 09:17:47 UTC
the issue for foo might be considered a different harder issue to resolve.  I think we need to decide what type is the argument for the ARRAY_REF, is it the type which the ARRAY has for its bounds or some other type?
Comment 7 Steven Bosscher 2019-03-04 10:26:59 UTC
Still an issue as of "g++ (Compiler-Explorer-Build) 9.0.1 20190303 (experimental)" on x86-64 at -O2:

_Z3fooj:
        movslq  %edi, %rax
        movl    %edi, %edi
        movzbl  size_lookup(%rdi), %edx
        cmpb    %dl, size_lookup(%rax)
        sete    %al
        movzbl  %al, %eax
        ret
_Z3barj:
        movl    $1, %eax
        ret



FWIW "clang version 7.0.0 (tags/RELEASE_700/final 342594)" at -O2:

_Z3fooj:                                # @_Z3fooj
        movslq  %edi, %rax
        movb    size_lookup(%rax), %cl
        movl    %eax, %edx
        xorl    %eax, %eax
        cmpb    size_lookup(%rdx), %cl
        sete    %al
        retq
_Z3barj:                                # @_Z3barj
        movl    $1, %eax
        retq



and "icc (ICC) 19.0.1.144 20181018":

_Z3fooj:
        xorl      %eax, %eax                                    #6.36
        movslq    %edi, %rdi                                    #6.11
        movb      size_lookup(%rdi), %dl                        #6.11
        movl      %edi, %edi                                    #6.36
        cmpb      size_lookup(%rdi), %dl                        #6.36
        sete      %al                                           #6.36
        ret                                                     #6.36
_Z3barj:
        movl      $1, %eax                                      #13.15
        ret
Comment 8 Richard Biener 2019-03-04 11:02:04 UTC
I think for foo the "solution" is to only consider indices with well-defined
behavior, that is, for size_lookup[] indices in [0,257] which means we
should be safely able to truncate the index to unsigned char (for the
purpose of analysis only!).

Note we're not value-numbering (int) t the same as t but even
size_lookup[(short) t] should result in the same value to be loaded
unless undefined behavior is invoked.

Of course extern unsigned char size_lookup[257]; might be considered
extern unsigned char size_lookup[]; for QOI reasons and existing broken
code, so...  but just change the testcase to non-extern size_lookup.

The solution might ly in value-numbering which could, when value-numbering
the two loads lookup (and insert!) VNs of (shortest-allowed-type)index.
The danger is of course that the shortest-allowed-type might differ
depentent on the shape of the lookup where otherwise we'd compute the
same VN.

I'm quite sure we can't simply trust TYPE_DOMAIN of TREE_TYPE of
TREE_OPERAND (array-ref, 0) when constraining operand 1.