Bug 82692 - [8 Regression] Ordered comparisons used for unordered built-ins
Summary: [8 Regression] Ordered comparisons used for unordered built-ins
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 8.0
: P3 normal
Target Milestone: 8.0
Assignee: Uroš Bizjak
URL:
Keywords: wrong-code
: 82733 (view as bug list)
Depends on:
Blocks:
 
Reported: 2017-10-23 22:23 UTC by Joseph S. Myers
Modified: 2017-10-27 18:23 UTC (History)
3 users (show)

See Also:
Host:
Target: x86
Build:
Known to work:
Known to fail:
Last reconfirmed: 2017-10-24 00:00:00


Attachments
WIP patch (3.99 KB, patch)
2017-10-26 11:34 UTC, Uroš Bizjak
Details | Diff
WIP 2 patch (4.77 KB, patch)
2017-10-26 20:36 UTC, Uroš Bizjak
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Joseph S. Myers 2017-10-23 22:23:00 UTC
This was probably introduced by the fix for bug 52451.

With GCC mainline (tested with trunk r254025), some uses of unordered __builtin_* wrongly use ordered comparison instructions on x86_64, so resulting in spurious "invalid" exceptions for quiet NaN operands.  The following test aborts when built with -O2 for x86_64-linux-gnu, but succeeds with GCC 7.  This affects glibc's y0/y1/yn wrappers, so resulting in glibc test failures.

#include <fenv.h>

extern void abort (void);
extern void exit (int);

double __attribute__ ((noinline, noclone))
foo (double x)
{
  if (__builtin_islessequal (x, 0.0) || __builtin_isgreater (x, 1.0))
    return x + x;
  return x * x;
}

int
main (void)
{
  volatile double x = foo (__builtin_nan (""));
  if (fetestexcept (FE_INVALID))
    abort ();
  exit (0);
}
Comment 1 Uroš Bizjak 2017-10-24 06:24:53 UTC
Happens in combine:

(insn 7 6 8 2 (set (reg:CCFPU 17 flags)
        (compare:CCFPU (reg:DF 95)
            (reg/v:DF 91 [ x ]))) "pr82692.c":9 52 {*cmpiudf}
     (expr_list:REG_DEAD (reg:DF 95)
        (expr_list:REG_EQUAL (compare:CCFPU (const_double:DF 0.0 [0x0.0p+0])
                (reg/v:DF 91 [ x ]))
            (nil))))
(insn 8 7 9 2 (set (reg:QI 94)
        (unlt:QI (reg:CCFPU 17 flags)
            (const_int 0 [0]))) "pr82692.c":9 664 {*setcc_qi}
     (expr_list:REG_DEAD (reg:CCFPU 17 flags)
        (nil)))
(insn 9 8 10 2 (set (reg:CCZ 17 flags)
        (compare:CCZ (reg:QI 94)
            (const_int 0 [0]))) "pr82692.c":9 1 {*cmpqi_ccno_1}
     (expr_list:REG_DEAD (reg:QI 94)
        (nil)))

combines to:

Trying 8 -> 9:
Failed to match this instruction:
(set (reg:CCFPU 17 flags)
    (reg:CCFPU 17 flags))
Successfully matched this instruction:
(set (pc)
    (if_then_else (ge (reg:CCFPU 17 flags)
            (const_int 0 [0]))
        (label_ref:DI 34)
        (pc)))

Trying 7 -> 9:
Successfully matched this instruction:
(set (reg:CCFP 17 flags)
    (compare:CCFP (reg:DF 95)
        (reg/v:DF 91 [ x ])))
Successfully matched this instruction:
(set (pc)
    (if_then_else (ge (reg:CCFP 17 flags)
            (const_int 0 [0]))
        (label_ref:DI 34)
        (pc)))
Comment 2 Uroš Bizjak 2017-10-24 08:26:00 UTC
This is the problem in the combine pass. It is substituting non-trapping compare with trapping via SELECT_CC_MODE. This particular problem happens in line 6788:

--cut here--
      /* If this machine has CC modes other than CCmode, check to see if we
	 need to use a different CC mode here.  */
      if (GET_MODE_CLASS (GET_MODE (op0)) == MODE_CC)
	compare_mode = GET_MODE (op0);
      else if (inner_compare
	       && GET_MODE_CLASS (GET_MODE (inner_compare)) == MODE_CC
	       && new_code == old_code
	       && op0 == XEXP (inner_compare, 0)
	       && op1 == XEXP (inner_compare, 1))
	compare_mode = GET_MODE (inner_compare);
      else
	compare_mode = SELECT_CC_MODE (new_code, op0, op1);
--cut here--

New compare mode should NOT change trapping of the compare insn.
Comment 3 Segher Boessenkool 2017-10-24 11:15:40 UTC
The combine output you showed is _not_ succeeding though?  "matched"
just means the rtx was recog()'ed; not that it was actually replaced.
Comment 4 Uroš Bizjak 2017-10-24 13:48:48 UTC
(In reply to Segher Boessenkool from comment #3)
> The combine output you showed is _not_ succeeding though?  "matched"
> just means the rtx was recog()'ed; not that it was actually replaced.

FAOD, this is the sequence before combine:

--cut here--
(insn 7 6 8 2 (set (reg:CCFPU 17 flags)
        (compare:CCFPU (reg:DF 95)
            (reg/v:DF 91 [ x ]))) "pr82692.c":9 52 {*cmpiudf}
     (expr_list:REG_DEAD (reg:DF 95)
        (expr_list:REG_EQUAL (compare:CCFPU (const_double:DF 0.0 [0x0.0p+0])
                (reg/v:DF 91 [ x ]))
            (nil))))
(insn 8 7 9 2 (set (reg:QI 94)
        (unlt:QI (reg:CCFPU 17 flags)
            (const_int 0 [0]))) "pr82692.c":9 664 {*setcc_qi}
     (expr_list:REG_DEAD (reg:CCFPU 17 flags)
        (nil)))
(insn 9 8 10 2 (set (reg:CCZ 17 flags)
        (compare:CCZ (reg:QI 94)
            (const_int 0 [0]))) "pr82692.c":9 1 {*cmpqi_ccno_1}
     (expr_list:REG_DEAD (reg:QI 94)
        (nil)))
(jump_insn 10 9 32 2 (set (pc)
        (if_then_else (eq (reg:CCZ 17 flags)
                (const_int 0 [0]))
            (label_ref:DI 34)
            (pc))) "pr82692.c":9 668 {*jcc}
     (expr_list:REG_DEAD (reg:CCZ 17 flags)
        (int_list:REG_BR_PROB 182536112 (nil)))
 -> 34)
--cut here--

which is converted by combine pass to:

--cut here--
(note 7 6 8 2 NOTE_INSN_DELETED)
(note 8 7 9 2 NOTE_INSN_DELETED)
(insn 9 8 10 2 (set (reg:CCFP 17 flags)
        (compare:CCFP (reg:DF 95)
            (reg/v:DF 91 [ x ]))) "pr82692.c":9 50 {*cmpidf}
     (expr_list:REG_DEAD (reg:DF 95)
        (nil)))
(jump_insn 10 9 32 2 (set (pc)
        (if_then_else (ge (reg:CCFP 17 flags)
                (const_int 0 [0]))
            (label_ref:DI 34)
            (pc))) "pr82692.c":9 668 {*jcc}
     (expr_list:REG_DEAD (reg:CCZ 17 flags)
        (int_list:REG_BR_PROB 182536112 (nil)))
 -> 34)
--cut here--

UNLT compare (CCFPUmode) has been converted to GE compare (CCFPmode). This is not correct as far as traps are concerned, since UNLT doesn't trap on qNaN, while GE does.
Comment 5 Segher Boessenkool 2017-10-24 16:33:07 UTC
The combination 8 -> 9 (where the GE is introduced) does not call
SELECT_CC_MODE at all.
Comment 6 Uroš Bizjak 2017-10-24 17:40:51 UTC
(In reply to Segher Boessenkool from comment #5)
> The combination 8 -> 9 (where the GE is introduced) does not call
> SELECT_CC_MODE at all.

The problematic conversion is 7 -> 9, *after* 8 -> 9 is performed.

Please see this gdb session:

(gdb) b ix86_cc_mode

Breakpoint 1, ix86_cc_mode (code=GE, op0=0x7fffeff09ab0, op1=0x7fffeff09960) at /home/uros/gcc-svn/trunk/gcc/config/i386/i386.c:21718
21718   {
(gdb) bt
#0  ix86_cc_mode (code=GE, op0=0x7fffeff09ab0, op1=0x7fffeff09960) at /home/uros/gcc-svn/trunk/gcc/config/i386/i386.c:21718
#1  0x00000000012eb45d in simplify_set (x=x@entry=0x7fffeff09c60) at /home/uros/gcc-svn/trunk/gcc/combine.c:6788
#2  0x00000000012ecb48 in combine_simplify_rtx(rtx_def*, machine_mode, int, int) () at /home/uros/gcc-svn/trunk/gcc/combine.c:6293
#3  0x00000000012eee32 in subst(rtx_def*, rtx_def*, rtx_def*, int, int, int) () at /home/uros/gcc-svn/trunk/gcc/combine.c:5573
#4  0x00000000012f1e42 in try_combine(rtx_insn*, rtx_insn*, rtx_insn*, rtx_insn*, int*, rtx_insn*) () at /home/uros/gcc-svn/trunk/gcc/combine.c:3332
#5  0x00000000012f7d91 in combine_instructions (nregs=<optimized out>, f=<optimized out>) at /home/uros/gcc-svn/trunk/gcc/combine.c:1301

(gdb) f4
#4  0x00000000012f1e42 in try_combine(rtx_insn*, rtx_insn*, rtx_insn*, rtx_insn*, int*, rtx_insn*) () at /home/uros/gcc-svn/trunk/gcc/combine.c:3332
3332          newpat = subst (PATTERN (i3), i2dest, i2src, 0, 0,

(gdb) p debug_rtx (i3)
(insn 9 8 10 2 (set (reg:CCFPU 17 flags)
        (compare:CCFPU (reg:DF 95)
            (reg/v:DF 91 [ x ]))) "pr82692.c":9 2147483647 {NOOP_MOVE}
     (nil))

(gdb) b combine.c:3333
Breakpoint 2 at 0x12f1dfb: file /home/uros/gcc-svn/trunk/gcc/combine.c, line 3333.

(gdb) c
Continuing.

Breakpoint 2, try_combine(rtx_insn*, rtx_insn*, rtx_insn*, rtx_insn*, int*, rtx_insn*) () at /home/uros/gcc-svn/trunk/gcc/combine.c:3334
3334                          || ((i0_feeds_i2_n || (i0_feeds_i1_n && i1_feeds_i2_n))

(gdb) p debug_rtx (newpat)
(set (reg:CCFP 17 flags)
    (compare:CCFP (reg:DF 95)
        (reg/v:DF 91 [ x ])))

And also changes mode of the conditional jump.
Comment 7 Uroš Bizjak 2017-10-24 18:12:50 UTC
To illustrate the problem, following patch fixes the failure:

--cut here--
Index: combine.c
===================================================================
--- combine.c   (revision 254011)
+++ combine.c   (working copy)
@@ -6784,7 +6784,7 @@ simplify_set (rtx x)
               && op0 == XEXP (inner_compare, 0)
               && op1 == XEXP (inner_compare, 1))
        compare_mode = GET_MODE (inner_compare);
-      else
+      else if (!FLOAT_MODE_P (GET_MODE (op0)))
        compare_mode = SELECT_CC_MODE (new_code, op0, op1);
 
       /* If the mode changed, we have to change SET_DEST, the mode in the
--cut here--

The patch avoids mode changes for floating-point operands.

(It will work for x86, since it has all comparisons trapping and non-trapping).
Comment 8 Segher Boessenkool 2017-10-24 18:48:26 UTC
Maybe you can handle this in can_change_dest_mode?  That will catch
the similar cases, too.
Comment 9 Uroš Bizjak 2017-10-24 19:32:07 UTC
(In reply to Segher Boessenkool from comment #8)
> Maybe you can handle this in can_change_dest_mode?  That will catch
> the similar cases, too.

No, because we only have to prevent CCmode changes that apply to FP operands. can_change_dest_mode only looks at mode changes, but CCFPmode and CCFPUmode are x86 specific.

I have looked at other SELECT_CC_MODE changes, and they deal with propagation of compares into arithmetic operations. This is the only place that can change CCmode of FP compares.
Comment 10 Segher Boessenkool 2017-10-24 21:19:30 UTC
So should combine use targetm.cc_modes_compatible here?
Comment 11 Uroš Bizjak 2017-10-25 06:40:21 UTC
(In reply to Segher Boessenkool from comment #10)
> So should combine use targetm.cc_modes_compatible here?

Yes. The trappines of FP compares is distinguished by their mode, so
I guess something along the following patch should work:

--cut here--
diff --git a/gcc/combine.c b/gcc/combine.c
index d71e50fdefb5..0220be2e484e 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -6787,6 +6787,14 @@ simplify_set (rtx x)
       else
        compare_mode = SELECT_CC_MODE (new_code, op0, op1);
 
+      /* Do not change compare mode of a floating point compare to
+        an incompatible mode.  Targets distingush trapping and
+        non-traping compares by their compare mode, and SELECT_CC_MODE
+        could return different mode for a new_code.  */
+      if (FLOAT_MODE_P (GET_MODE (op0))
+         && !targetm.cc_modes_compatible (compare_mode, GET_MODE (dest)))
+       compare_mode = GET_MODE (dest);
+
       /* If the mode changed, we have to change SET_DEST, the mode in the
         compare, and the mode in the place SET_DEST is used.  If SET_DEST is
         a hard register, just build new versions with the proper mode.  If it
--cut here--
Comment 12 Segher Boessenkool 2017-10-25 10:45:16 UTC
But why only do this for FLOAT_MODE_P?  Either the logic here isn't
correct, or cc_modes_compatible isn't the correct hook (we'll need
a new hook then?), or determining ordered/unordered by CC mode does
not work (does not fit into how RTL works).
Comment 13 Uroš Bizjak 2017-10-25 11:57:28 UTC
(In reply to Segher Boessenkool from comment #12)
> But why only do this for FLOAT_MODE_P?  Either the logic here isn't
> correct, or cc_modes_compatible isn't the correct hook (we'll need
> a new hook then?), or determining ordered/unordered by CC mode does
> not work (does not fit into how RTL works).

Non-FP compares can select different mode depending on their operands (e.g. CCmode to CCZmode when one operand is zero) without secondary effects. But when reversing the condition from UNGE -> LT, SELECT_CC_MODE will return trapping mode (LT), whereas original, non-reversed mode (UNGE) was non-trapping. Please see how targets depend mode of their FP compares on the condition code.

The solution here is to keep the original comparison mode for FP compares (as was proposed in the first version of the patch): when qNaN is encountered at this point, an exception has to be generated, no matter how the condition was changed.
Comment 14 Uroš Bizjak 2017-10-25 11:59:23 UTC
(In reply to Uroš Bizjak from comment #13)
> (In reply to Segher Boessenkool from comment #12)
> > But why only do this for FLOAT_MODE_P?  Either the logic here isn't
> > correct, or cc_modes_compatible isn't the correct hook (we'll need
> > a new hook then?), or determining ordered/unordered by CC mode does
> > not work (does not fit into how RTL works).
> 
> Non-FP compares can select different mode depending on their operands (e.g.
> CCmode to CCZmode when one operand is zero) without secondary effects. But
> when reversing the condition from UNGE -> LT, SELECT_CC_MODE will return
> trapping mode (LT), whereas original, non-reversed mode (UNGE) was
> non-trapping. Please see how targets depend mode of their FP compares on the
> condition code.
> 
> The solution here is to keep the original comparison mode for FP compares
> (as was proposed in the first version of the patch): when qNaN is
> encountered at this point, an exception has to be generated, no matter how
> the condition was changed.

... an exception should not be generated ... in the above case, when UNGE is reversed to LT.
Comment 15 Segher Boessenkool 2017-10-25 15:59:01 UTC
My point is that doing this only for FLOAT_MODE_P makes no real sense.
If we can describe ordered comparisons with special CC modes, we should
do tests with those modes only here.
Comment 16 Uroš Bizjak 2017-10-25 18:32:21 UTC
(In reply to Segher Boessenkool from comment #15)
> My point is that doing this only for FLOAT_MODE_P makes no real sense.
> If we can describe ordered comparisons with special CC modes, we should
> do tests with those modes only here.

I really can't see how we can use CC_MODES_COMPATIBLE check without harming integer compares. Not being an expert in this part of the compiler, I'm out of ideas what to do here - do you perhaps have a particular solution in mind?
Comment 17 Segher Boessenkool 2017-10-25 22:22:23 UTC
So we have a compare:CCFPU, the resulting flags is used in a GE
only, and ix86_cc_mode thinks the best mode to use for that is CCFP.

Which is fine, except compare:CCFPU is a different instruction, and GE
for the resulting insn means a different thing?!

How is this supposed to work?  How can generic code know this?

Everything worked fine, except the compare insn did not do the side
effect of setting a status flag.  Perhaps an unspec (or even an
unspec_volatile) should have been used for the compare?
Comment 18 Uroš Bizjak 2017-10-26 11:29:30 UTC
*** Bug 82733 has been marked as a duplicate of this bug. ***
Comment 19 Uroš Bizjak 2017-10-26 11:34:53 UTC
Created attachment 42479 [details]
WIP patch

This is a WIP patch that tags all unorederd compares with:

(unspec [(const_int 0)] UNSPEC_NOTRAP)

WIP patch, because there is SSE subst pattern that involves CCFPU mode. This mode will be removed, and I have to figure out what the subst does...
Comment 20 Uroš Bizjak 2017-10-26 20:36:34 UTC
Created attachment 42481 [details]
WIP 2 patch
Comment 21 uros 2017-10-27 18:13:45 UTC
Author: uros
Date: Fri Oct 27 18:13:14 2017
New Revision: 254167

URL: https://gcc.gnu.org/viewcvs?rev=254167&root=gcc&view=rev
Log:
	PR target/82692
	* config/i386/i386-modes.def (CCFPU): Remove definition.
	* config/i386/i386.c (put_condition_mode): Remove CCFPU mode handling.
	(ix86_cc_modes_compatible): Ditto.
	(ix86_expand_carry_flag_compare): Ditto.
	(ix86_expand_int_movcc): Ditto.
	(ix86_expand_int_addcc): Ditto.
	(ix86_reverse_condition): Ditto.
	(ix86_unordered_fp_compare): Rename from ix86_fp_compare_mode.
	Return true/false for unordered/ordered fp comparisons.
	(ix86_cc_mode): Always return CCFPmode for float mode comparisons.
	(ix86_prepare_fp_compare_args): Update for rename.
	(ix86_expand_fp_compare): Update for rename.  Generate unordered
	compare RTXes wrapped with UNSPEC_NOTRAP unspec.
	(ix86_expand_sse_compare_and_jump): Ditto.
	* config/i386/predicates.md (fcmov_comparison_operator):
	Remove CCFPU mode handling.
	(ix86_comparison_operator): Ditto.
	(ix86_carry_flag_operator): Ditto.
	* config/i386/i386.md (UNSPEC_NOTRAP): New unspec.
	(*cmpu<mode>_i387): Wrap compare RTX with UNSPEC_NOTRAP unspec.
	(*cmpu<mode>_cc_i387): Ditto.
	(FPCMP): Remove mode iterator.
	(unord): Remove mode attribute.
	(unord_subst): New define_subst transformation
	(unord): New define_subst attribute.
	(unordered): Ditto.
	(*cmpi<unord><MODEF:mode>): Rewrite using unord_subst transformation.
	(*cmpi<unord>xf_i387): Ditto.
	* config/i386/sse.md (<sse>_<unord>comi<round_saeonly_name>): Merge
	from <sse>_comi<round_saeonly_name> and <sse>_ucomi<round_saeonly_name>
	using unord_subst transformation.
	* config/i386/subst.md (SUBST_A): Remove CCFP and CCFPU modes.
	(round_saeonly): Also handle CCFP mode.
	* reg-stack.c (subst_stack_regs_pat): Handle UNSPEC_NOTRAP unspec.
	Remove UNSPEC_SAHF unspec handling.

testsuite/ChangeLog:

	PR target/82692
	* gcc.dg/torture/pr82692.c: New test.


Added:
    trunk/gcc/testsuite/gcc.target/i386/pr82692.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386-modes.def
    trunk/gcc/config/i386/i386.c
    trunk/gcc/config/i386/i386.md
    trunk/gcc/config/i386/predicates.md
    trunk/gcc/config/i386/sse.md
    trunk/gcc/config/i386/subst.md
    trunk/gcc/reg-stack.c
    trunk/gcc/testsuite/ChangeLog
Comment 22 Uroš Bizjak 2017-10-27 18:23:47 UTC
Fixed.