Consider: extern unsigned char size_lookup[257]; int foo (unsigned int t) { return (size_lookup [(int) t] == size_lookup[t]); } int bar (unsigned int t) { int a = t; return a == t; } Both functions should return 1, and in fact that's what the RTL optimizers notice, but the tree optimizers don't. This is somewhat related to PR 19790.
Confirmed, For bar, my tree combiner fixes the missed optimization. Not for foo.
The testcase in this PR was reduced from ggc-page.c:ggc_alloc_typed_stat.
FYI, here is the tailc dump that I get with the current mainline. ;; Function foo (foo) foo (t) { unsigned char D.1137; unsigned int t.1; unsigned char D.1135; int t.0; int D.1133; <bb 0>: t.0_2 = (int) t_1; D.1135_4 = size_lookup[t.0_2]; D.1137_6 = size_lookup[t_1]; D.1133_7 = D.1135_4 == D.1137_6; return D.1133_7; } ;; Function bar (bar) bar (t) { int a; unsigned int a.2; int D.1142; <bb 0>: a_2 = (int) t_1; a.2_3 = (unsigned int) a_2; D.1142_4 = t_1 == a.2_3; return D.1142_4; }
Subject: Bug 19792 Author: rguenth Date: Thu May 4 13:56:52 2006 New Revision: 113527 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=113527 Log: 2006-05-04 Richard Guenther <rguenther@suse.de> PR tree-optimization/14287 PR tree-optimization/14844 PR tree-optimization/19792 PR tree-optimization/21608 PR tree-optimization/27090 * tree-ssa-pre.c (try_combine_conversion): New function. (compute_avail): After constructing the value-handle expression, use try_combine_conversion to combine NOP_EXPRs with previous value-handle expressions and use the result if it is available. * gcc.dg/tree-ssa/ssa-fre-1.c: New testcase. * gcc.dg/tree-ssa/ssa-fre-2.c: Likewise. * gcc.dg/tree-ssa/ssa-fre-3.c: Likewise. * gcc.dg/tree-ssa/ssa-fre-4.c: Likewise. * gcc.dg/tree-ssa/ssa-fre-5.c: Likewise. Added: trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-1.c trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-2.c trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-3.c trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-4.c trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-5.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-ssa-pre.c
bar is now fixed. ;; Function foo (foo) Analyzing Edge Insertions. foo (t) { <bb 2>: return size_lookup[(int) t] == size_lookup[t]; } ;; Function bar (bar) Analyzing Edge Insertions. bar (t) { <bb 2>: return 1; }
the issue for foo might be considered a different harder issue to resolve. I think we need to decide what type is the argument for the ARRAY_REF, is it the type which the ARRAY has for its bounds or some other type?
Still an issue as of "g++ (Compiler-Explorer-Build) 9.0.1 20190303 (experimental)" on x86-64 at -O2: _Z3fooj: movslq %edi, %rax movl %edi, %edi movzbl size_lookup(%rdi), %edx cmpb %dl, size_lookup(%rax) sete %al movzbl %al, %eax ret _Z3barj: movl $1, %eax ret FWIW "clang version 7.0.0 (tags/RELEASE_700/final 342594)" at -O2: _Z3fooj: # @_Z3fooj movslq %edi, %rax movb size_lookup(%rax), %cl movl %eax, %edx xorl %eax, %eax cmpb size_lookup(%rdx), %cl sete %al retq _Z3barj: # @_Z3barj movl $1, %eax retq and "icc (ICC) 19.0.1.144 20181018": _Z3fooj: xorl %eax, %eax #6.36 movslq %edi, %rdi #6.11 movb size_lookup(%rdi), %dl #6.11 movl %edi, %edi #6.36 cmpb size_lookup(%rdi), %dl #6.36 sete %al #6.36 ret #6.36 _Z3barj: movl $1, %eax #13.15 ret
I think for foo the "solution" is to only consider indices with well-defined behavior, that is, for size_lookup[] indices in [0,257] which means we should be safely able to truncate the index to unsigned char (for the purpose of analysis only!). Note we're not value-numbering (int) t the same as t but even size_lookup[(short) t] should result in the same value to be loaded unless undefined behavior is invoked. Of course extern unsigned char size_lookup[257]; might be considered extern unsigned char size_lookup[]; for QOI reasons and existing broken code, so... but just change the testcase to non-extern size_lookup. The solution might ly in value-numbering which could, when value-numbering the two loads lookup (and insert!) VNs of (shortest-allowed-type)index. The danger is of course that the shortest-allowed-type might differ depentent on the shape of the lookup where otherwise we'd compute the same VN. I'm quite sure we can't simply trust TYPE_DOMAIN of TREE_TYPE of TREE_OPERAND (array-ref, 0) when constraining operand 1.