We found this problem on ARM but I believe the problem affect other platforms as well. In the function distribute_and_simplify() of combine.c, there is not check for floating point expressions. Sometimes it incorrectly optimizes a floating point RTL. -----bug.c---- static const double one=1.0; double f(double x) { /* This is incorrectly transformed to x + x*x */ return x*(one+x); } --------- $ arm-eabi-gcc -O2 -S -march=armv7-a -mfloat-abi=hard -mfpu=neon -fno-unsafe-math-optimizations -g0 bug.c $ cat bug.s .arch armv7-a .eabi_attribute 27, 3 .eabi_attribute 28, 1 .fpu neon .eabi_attribute 20, 1 .eabi_attribute 21, 1 .eabi_attribute 23, 3 .eabi_attribute 24, 1 .eabi_attribute 25, 1 .eabi_attribute 26, 1 .eabi_attribute 30, 2 .eabi_attribute 18, 4 .file "bug.c" .text .align 2 .global f .type f, %function f: @ args = 0, pretend = 0, frame = 0 @ frame_needed = 0, uses_anonymous_args = 0 @ link register save eliminated. fmacd d0, d0, d0 bx lr .size f, .-f .ident "GCC: (GNU) 4.5.0 20091005 (experimental)" The expression x*(1.0 + x) above is optmized into x + x * x using the fmacd instruction, which multiplies and accumlates. The following is part combine pass dump, note how instruction 13 is modified.: --- ;; Function f (f) starting the processing of deferred insns ending the processing of deferred insns df_analyze called insn_cost 2: 4 insn_cost 6: 4 insn_cost 7: 4 insn_cost 8: 4 insn_cost 13: 4 insn_cost 16: 0 deferring deletion of insn with uid = 2. modifying insn i2 7 r138:DF=s0:DF+r139:DF REG_DEAD: r139:DF deferring rescan insn with uid = 7. modifying insn i3 8 r137:DF=r138:DF*s0:DF REG_DEAD: r138:DF deferring rescan insn with uid = 8. deferring deletion of insn with uid = 7. deferring deletion of insn with uid = 6. modifying insn i3 8 r137:DF=s0:DF*s0:DF+s0:DF deferring rescan insn with uid = 8. deferring deletion of insn with uid = 8. modifying insn i3 13 s0:DF=s0:DF*s0:DF+s0:DF deferring rescan insn with uid = 13. (note 4 0 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK) (note 2 4 3 2 NOTE_INSN_DELETED) (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG) (note 6 3 7 2 NOTE_INSN_DELETED) (note 7 6 8 2 NOTE_INSN_DELETED) (note 8 7 13 2 NOTE_INSN_DELETED) (insn 13 8 16 2 bug.c:8 (set (reg/i:DF 63 s0) (plus:DF (mult:DF (reg:DF 63 s0 [ x ]) (reg:DF 63 s0 [ x ])) (reg:DF 63 s0 [ x ]))) 610 {*muldf3adddf_vfp} (nil)) (insn 16 13 0 2 bug.c:8 (use (reg/i:DF 63 s0)) -1 (nil)) starting the processing of deferred insns deleting insn with uid = 2. deleting insn with uid = 6. deleting insn with uid = 7. deleting insn with uid = 8. rescanning insn with uid = 13. deleting insn with uid = 13. ending the processing of deferred insns ;; Combiner totals: 10 attempts, 10 substitutions (3 requiring new space), ;; 3 successes. --- The problem happens in this part of distribute_and_simplify_rtx (): tmp = apply_distributive_law (simplify_gen_binary (inner_code, mode, new_op0, new_op1)); if (GET_CODE (tmp) != outer_code && rtx_cost (tmp, SET, optimize_this_for_speed_p) < rtx_cost (x, SET, optimize_this_for_speed_p)) return tmp; It synthesizes a new expression by distributing one of the sub-expressions and then call apply_distribute_law. In the test-case above, apply_distribute_law detects a floating point RTL expression and returns immediately but the simplified expression generated has a lower RTL-cost than the original expression. Hence distribute_and_simplify_rtx returns the simplified expression, eventhough it should not unless -funsafe-math-optimizations is given. I think the fix is to add this test at the entrance of the function distribute_and_simplify_rtx /* Distributivity is not true for floating point as it can change the value. So we don't do it unless -funsafe-math-optimizations. */ if (FLOAT_MODE_P (GET_MODE (x)) && ! flag_unsafe_math_optimizations) return NULL_RTX;
Subject: Bug 41574 Author: dougkwan Date: Mon Oct 5 09:08:46 2009 New Revision: 152443 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=152443 Log: 2009-10-05 Doug Kwan <dougkwan@google.com> PR rtl-optimization/41574 Index: combine.c (distribute_and_simplify_rtx): Quit if RTX mode is floating point and we are not doing unsafe math optimizations. Modified: trunk/gcc/ChangeLog trunk/gcc/combine.c
Fixed.
The ChangeLog entry is wrong.
> The ChangeLog entry is wrong. And folks from Google shouldn't feel entitled to break a freeze imposed by other folks from Google even if, yes, it is annoyingly long. :-)
Subject: Re: Distribute floating point expressions causes bad code. I am aware of the fact the stage one has ended but this is a bug fix, not an experimental new feature. Did I break a code freeze? If so, I am sorry and can back out the fix until the tree is reopen. 2009/10/5 ebotcazou at gcc dot gnu dot org <gcc-bugzilla@gcc.gnu.org>: > > > ------- Comment #4 from ebotcazou at gcc dot gnu dot org 2009-10-05 10:00 ------- >> The ChangeLog entry is wrong. > > And folks from Google shouldn't feel entitled to break a freeze imposed by > other folks from Google even if, yes, it is annoyingly long. :-) > > > -- > > ebotcazou at gcc dot gnu dot org changed: > > What |Removed |Added > ---------------------------------------------------------------------------- > CC| |ebotcazou at gcc dot gnu dot > | |org > > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41574 > > ------- You are receiving this mail because: ------- > You reported the bug, or are watching the reporter. >
Subject: Re: Distribute floating point expressions causes bad code. Just saw Diego's e-mail about the me breaking the freeze. Sorry I should have checked that before checking thing in. It was just my bad. 2009/10/5 Doug Kwan (關振德) <dougkwan@google.com>: > I am aware of the fact the stage one has ended but this is a bug fix, > not an experimental new feature. Did I break a code freeze? If so, I > am sorry and can back out the fix until the tree is reopen. > > 2009/10/5 ebotcazou at gcc dot gnu dot org <gcc-bugzilla@gcc.gnu.org>: >> >> >> ------- Comment #4 from ebotcazou at gcc dot gnu dot org 2009-10-05 10:00 ------- >>> The ChangeLog entry is wrong. >> >> And folks from Google shouldn't feel entitled to break a freeze imposed by >> other folks from Google even if, yes, it is annoyingly long. :-) >> >> >> -- >> >> ebotcazou at gcc dot gnu dot org changed: >> >> What |Removed |Added >> ---------------------------------------------------------------------------- >> CC| |ebotcazou at gcc dot gnu dot >> | |org >> >> >> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41574 >> >> ------- You are receiving this mail because: ------- >> You reported the bug, or are watching the reporter. >> >
Subject: Bug 41574 Author: dougkwan Date: Thu Oct 8 22:16:58 2009 New Revision: 152580 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=152580 Log: 2009-10-08 Doug Kwan <dougkwan@google.com> PR rtl-optimization/41574 * gcc.dg/pr41574.c: New test. Added: trunk/gcc/testsuite/gcc.dg/pr41574.c Modified: trunk/gcc/testsuite/ChangeLog
This is fixed in trunk but at least gcc-4.4.0, where this bug was found, is still broken.
(In reply to comment #8) > This is fixed in trunk but at least gcc-4.4.0, where this bug was found, is > still broken. > I have no approval rights but can you test & ask to backport this to 4.4 branch after the freeze for the 4.4.2 release is lifted ? Ramana
(In reply to comment #9) > (In reply to comment #8) > > This is fixed in trunk but at least gcc-4.4.0, where this bug was found, is > > still broken. > > > > I have no approval rights but can you test & ask to backport this to 4.4 branch > after the freeze for the 4.4.2 release is lifted ? Sorry about the late reply. Yes, I can prepare a fix for 4.4.2 -Doug
Subject: Bug 41574 Author: ramana Date: Fri Dec 11 11:21:33 2009 New Revision: 155157 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=155157 Log: Fix PR41574 on 4.4 branch. 2009-12-11 Ramana Radhakrishnan <ramana.radhakrishnan@arm.com> 2009-10-05 Doug Kwan <dougkwan@google.com> PR rtl-optimization/41574 * combine.c (distribute_and_simplify_rtx): Quit if RTX mode is floating point and we are not doing unsafe math optimizations. 2009-12-11 Ramana Radhakrishnan <ramana.radhakrishnan@arm.com> 2009-10-08 Doug Kwan <dougkwan@google.com> PR rtl-optimization/41574 * gcc.dg/pr41574.c: New test. Added: branches/gcc-4_4-branch/gcc/testsuite/gcc.dg/pr41574.c - copied unchanged from r152580, trunk/gcc/testsuite/gcc.dg/pr41574.c Modified: branches/gcc-4_4-branch/gcc/ChangeLog branches/gcc-4_4-branch/gcc/combine.c branches/gcc-4_4-branch/gcc/testsuite/ChangeLog
Fixed .