Bug 21463 - min/max and references
Summary: min/max and references
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 4.1.0
: P2 enhancement
Target Milestone: 4.3.0
Assignee: Richard Biener
URL: http://gcc.gnu.org/ml/gcc-patches/200...
Keywords: missed-optimization
Depends on: 21462 19431
Blocks:
  Show dependency treegraph
 
Reported: 2005-05-09 08:36 UTC by tbp
Modified: 2007-04-18 12:46 UTC (History)
3 users (show)

See Also:
Host: x86*
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2007-02-14 16:18:43


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description tbp 2005-05-09 08:36:49 UTC
I don't quite know how to characterize this one, so i'll let it up to those in
the know to fix the summary/description.
Primo, AFAIK, gcc has always struggled to get this right but 4.1 is setting a
new record; so i'll qualify that as a regression vs 3.3/3.4.
Secundo, i haven't found any related bugreports but one about PPC.
So here we go.

Consider this testcase:
#include <algorithm>

template<class T> static inline T max(const T a, const T b) { return a<b ? b : a; }
template<class T> static inline T min(const T a, const T b) { return a<b ? a : b; }

template<class T> static inline const T &ref_max(const T &a, const T &b) {
return a<b ? b : a; }
template<class T> static inline const T &ref_min(const T &a, const T &b) {
return a<b ? a : b; }

template<class T> struct foo_t {
	T a0, a1;
	T bar(const T b, const T c) {
		return max(min(a0, c), min(max(a1, c), b));
	}
	T bar_ref(const T b, const T c) {
		return ref_max(ref_min(a0, c), ref_min(ref_max(a1, c), b));
	}
	T bar_stl(const T b, const T c) {
		return std::max(std::min(a0, c), std::min(std::max(a1, c), b));
	}
};
template struct foo_t<int>;
template struct foo_t<float>;

int main() { return 0; }

With g++-4120050501 -O3 i get such creative & entertaining code as:
0000000000400610 <foo_t<float>::bar_ref(float, float)>:
  400610:       ucomiss 0x4(%rdi),%xmm1
  400614:       lea    0x4(%rdi),%rax
  400618:       lea    0xfffffffffffffff8(%rsp),%rdx
  40061d:       movss  %xmm0,0xfffffffffffffffc(%rsp)
  400623:       movss  %xmm1,0xfffffffffffffff8(%rsp)
  400629:       movaps %xmm1,%xmm2
  40062c:       cmova  %rdx,%rax
  400630:       movss  (%rax),%xmm1
  400634:       ucomiss %xmm1,%xmm0
  400637:       ja     400641 <foo_t<float>::bar_ref(float, float)+0x31>
  400639:       lea    0xfffffffffffffffc(%rsp),%rax
  40063e:       movaps %xmm0,%xmm1
  400641:       ucomiss (%rdi),%xmm2
  400644:       cmova  %rdi,%rdx
  400648:       movss  (%rdx),%xmm0
  40064c:       ucomiss %xmm0,%xmm1
  40064f:       jbe    400655 <foo_t<float>::bar_ref(float, float)+0x45>
  400651:       movss  (%rax),%xmm0
  400655:       repz retq

Compare that to what g++-3.4.4-20050314 gives (g++ 3.3.6 is similar):
0000000000400610 <foo_t<float>::bar_ref(float, float)>:
  400610:       ucomiss (%rdi),%xmm1
  400613:       lea    0xfffffffffffffffc(%rsp),%rsi
  400618:       mov    %rdi,%rcx
  40061b:       lea    0x4(%rdi),%rdx
  40061f:       movss  %xmm0,0xfffffffffffffff8(%rsp)
  400625:       lea    0xfffffffffffffff8(%rsp),%rax
  40062a:       movss  %xmm1,0xfffffffffffffffc(%rsp)
  400630:       cmovbe %rsi,%rcx
  400634:       ucomiss 0x4(%rdi),%xmm1
  400638:       cmova  %rsi,%rdx
  40063c:       ucomiss (%rdx),%xmm0
  40063f:       cmova  %rdx,%rax
  400643:       movss  (%rax),%xmm0
  400647:       ucomiss (%rcx),%xmm0
  40064a:       cmovbe %rcx,%rax
  40064e:       movss  (%rax),%xmm0
  400652:       retq
Certainly not optimal, and in fact quite ugly, but at least there's no branch.

Happens as soon as references are used and enough min/max are piled up, and that
means that unless you use your own min/max instead of the STL version, you're
doomed.
Comment 1 tbp 2005-05-09 08:50:29 UTC
I forgot to say that using ternary operators or if/else while changing the
codegen slightly doesn't make much difference.
Comment 2 Andrew Pinski 2005-05-09 21:33:44 UTC

*** This bug has been marked as a duplicate of 21462 ***
Comment 3 Andrew Pinski 2005-05-09 21:34:42 UTC
Actually I take that back.
Comment 4 Andrew Pinski 2005-05-09 21:36:00 UTC
Confirmed, the problem here is slightly different from 21462 but closely related.
Comment 5 tbp 2005-05-09 22:20:24 UTC
Additional note, using -fno-gcse slightly reduce the cruft (this one is my new
pet peeve :).
Comment 6 Andrew Pinski 2005-09-24 17:38:10 UTC
On the mainline we get:
_ZN5foo_tIfE7bar_refEff:
.LFB661:
        ucomiss 4(%rdi), %xmm1
        leaq    -8(%rsp), %rdx
        leaq    4(%rdi), %rax
        movss   %xmm0, -4(%rsp)
        movss   %xmm1, -8(%rsp)
        cmova   %rdx, %rax
        ucomiss (%rdi), %xmm1
        movss   (%rax), %xmm0
        minss   -4(%rsp), %xmm0
        cmova   %rdi, %rdx
        movss   (%rdx), %xmm1
        maxss   %xmm1, %xmm0
        ret

But on the tree level, there are a couple of issues.
The first is that, we have 
a_1 = &b_2->c;
temp_3 = *a_1;
...
use a_1
Which really should be converted to
temp_3 = b_2->c;
Which will help.
Comment 7 Andrew Pinski 2006-02-05 21:09:34 UTC
The problem here is actually not I would I had orginally thought but instead the following issue:
struct f
{
  int i;
};

int g(int i, int c, struct f *ff)
{
  int *t;
  if (c)
   t = &i;
  else
   t = &ff->i;
  return *t;
}

We don't change the *t into the i and ff->i which causes the min/max not be reconized.  This is related to PR 19431 which is a similar issue.
Comment 8 Richard Biener 2007-02-14 16:18:43 UTC
The patch I have also fixes this one.
Comment 9 Richard Biener 2007-04-18 12:45:29 UTC
Subject: Bug 21463

Author: rguenth
Date: Wed Apr 18 12:45:09 2007
New Revision: 123946

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=123946
Log:
2007-04-18  Richard Guenther  <rguenther@suse.de>

	PR tree-optimization/19431
	PR tree-optimization/21463
	* tree-pass.h (pass_phiprop): Declare.
	* passes.c (init_optimization_passes): New phiprop pass.
	* tree-ssa-forwprop.c (struct phiprop_d): New structure.
	(phivn_valid_p): New helper function.
	(phiprop_insert_phi): Likewise.
	(propagate_with_phi): Likewise.
	(tree_ssa_phiprop): New propagator propagating loads
	through phi nodes if profitable.

	* gcc.c-torture/execute/20070212-1.c: New testcase.
	* gcc.c-torture/execute/20070212-2.c: Likewise.
	* gcc.c-torture/execute/20070212-3.c: Likewise.
	* gcc.dg/tree-ssa/pr19431.c: Likewise.
	* gcc.dg/tree-ssa/pr21463.c: Likewise.
	* g++.dg/tree-ssa/pr21463.C: Likewise.
	* g++.dg/tree-ssa/pr30738.C: Likewise.

Added:
    trunk/gcc/testsuite/g++.dg/tree-ssa/pr21463.C
    trunk/gcc/testsuite/g++.dg/tree-ssa/pr30738.C
    trunk/gcc/testsuite/gcc.c-torture/execute/20070212-1.c
    trunk/gcc/testsuite/gcc.c-torture/execute/20070212-2.c
    trunk/gcc/testsuite/gcc.c-torture/execute/20070212-3.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/pr19431.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/pr21463.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/passes.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-pass.h
    trunk/gcc/tree-ssa-forwprop.c

Comment 10 Richard Biener 2007-04-18 12:46:25 UTC
Fixed.