Bug 47617 - SSE rounding mode works -g, not -O3
Summary: SSE rounding mode works -g, not -O3
Status: RESOLVED DUPLICATE of bug 34678
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 4.3.2
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-02-05 22:27 UTC by cck0011
Modified: 2011-02-12 18:20 UTC

See Also:
Host:
Target: i?86-*-linux
Build:
Known to work:
Known to fail:
Last reconfirmed: 2011-02-07 11:52:28


Attachments
generated .i file (12.70 KB, application/octet-stream)
2011-02-05 22:27 UTC, cck0011
source file (2.52 KB, text/x-csrc)
2011-02-08 01:37 UTC, cck0011

Description cck0011 2011-02-05 22:27:34 UTC
Created attachment 23252 [details]
generated .i file

Hi folks,

  I'm working with SSE intrinsics and think I have a rounding problem. When I change the rounding mode with _MM_SET_ROUNDING_MODE, the results change as expected when I compile with "-g", but not when I compile with "-O3".

  What am I missing?

thanks!

Using built-in specs.
Target: i386-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --enable-plugin --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile --enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib --with-cpu=generic --build=i386-redhat-linux
Thread model: posix
gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) (GCC)
COLLECT_GCC_OPTIONS='-O3' '-Wall' '-o' 'round' '-msse' '-mmmx' '-save-temps' '-v' '-mtune=generic'
 /usr/libexec/gcc/i386-redhat-linux/4.3.2/cc1 -E -quiet -v round.c -msse -mmmx -mtune=generic -Wall -O3 -fpch-preprocess -o round.i
ignoring nonexistent directory "/usr/lib/gcc/i386-redhat-linux/4.3.2/include-fixed"
ignoring nonexistent directory "/usr/lib/gcc/i386-redhat-linux/4.3.2/../../../../i386-redhat-linux/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/include
 /usr/lib/gcc/i386-redhat-linux/4.3.2/include
 /usr/include
End of search list.
COLLECT_GCC_OPTIONS='-O3' '-Wall' '-o' 'round' '-msse' '-mmmx' '-save-temps' '-v' '-mtune=generic'
 /usr/libexec/gcc/i386-redhat-linux/4.3.2/cc1 -fpreprocessed round.i -quiet -dumpbase round.c -msse -mmmx -mtune=generic -auxbase round -O3 -Wall -version -o round.s
GNU C (GCC) version 4.3.2 20081105 (Red Hat 4.3.2-7) (i386-redhat-linux)
        compiled by GNU C version 4.3.2 20081105 (Red Hat 4.3.2-7), GMP version 4.2.2, MPFR version 2.3.2.
GGC heuristics: --param ggc-min-expand=55 --param ggc-min-heapsize=48000
Compiler executable checksum: 3bee52601079f736b7b63b762646f4ba
round.c: In function ‘test_sse1_feature’:
round.c:150: warning: unused variable ‘sig’
round.c:150: warning: unused variable ‘extensions’
round.c:149: warning: ‘edx’ may be used uninitialized in this function
COLLECT_GCC_OPTIONS='-O3' '-Wall' '-o' 'round' '-msse' '-mmmx' '-save-temps' '-v' '-mtune=generic'
 as -V -Qy -o round.o round.s
GNU assembler version 2.18.50.0.9 (i386-redhat-linux) using BFD version version 2.18.50.0.9-8.fc10 20080822
COMPILER_PATH=/usr/libexec/gcc/i386-redhat-linux/4.3.2/:/usr/libexec/gcc/i386-redhat-linux/4.3.2/:/usr/libexec/gcc/i386-redhat-linux/:/usr/lib/gcc/i386-redhat-linux/4.3.2/:/usr/lib/gcc/i386-redhat-linux/:/usr/libexec/gcc/i386-redhat-linux/4.3.2/:/usr/libexec/gcc/i386-redhat-linux/:/usr/lib/gcc/i386-redhat-linux/4.3.2/:/usr/lib/gcc/i386-redhat-linux/
LIBRARY_PATH=/usr/lib/gcc/i386-redhat-linux/4.3.2/:/usr/lib/gcc/i386-redhat-linux/4.3.2/:/usr/lib/gcc/i386-redhat-linux/4.3.2/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-O3' '-Wall' '-o' 'round' '-msse' '-mmmx' '-save-temps' '-v' '-mtune=generic'
 /usr/libexec/gcc/i386-redhat-linux/4.3.2/collect2 --eh-frame-hdr --build-id -m elf_i386 --hash-style=gnu -dynamic-linker /lib/ld-linux.so.2 -o round /usr/lib/gcc/i386-redhat-linux/4.3.2/../../../crt1.o /usr/lib/gcc/i386-redhat-linux/4.3.2/../../../crti.o /usr/lib/gcc/i386-redhat-linux/4.3.2/crtbegin.o -L/usr/lib/gcc/i386-redhat-linux/4.3.2 -L/usr/lib/gcc/i386-redhat-linux/4.3.2 -L/usr/lib/gcc/i386-redhat-linux/4.3.2/../../.. round.o -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/i386-redhat-linux/4.3.2/crtend.o /usr/lib/gcc/i386-redhat-linux/4.3.2/../../../crtn.o
Comment 1 Andrew Pinski 2011-02-06 02:13:37 UTC
I think you need to use -frounding-math.  GCC assumes by default the rounding mode is round-to-nearest.  See http://gcc.gnu.org/onlinedocs/gcc-4.5.2/gcc/Optimize-Options.html#index-frounding_002dmath-819 .
Comment 2 cck0011 2011-02-06 16:25:55 UTC
(In reply to comment #1)
> I think you need to use -frounding-math.  GCC assumes by default the rounding
> mode is round-to-nearest.  See
> http://gcc.gnu.org/onlinedocs/gcc-4.5.2/gcc/Optimize-Options.html#index-frounding_002dmath-819
> .

Hi Andrew, 

  thanks for writing. I tried -frounding-math and the result is still the same. Adding/removing -mfpmath=sse doesn't change it either. Is there any additional information I can provide?

Thanks!
Comment 3 Richard Biener 2011-02-07 11:52:28 UTC
Can you provide the non-preprocessed source?  I have difficulty compiling the
preprocessed file with newer releases.
Comment 4 cck0011 2011-02-08 01:37:58 UTC
Created attachment 23273 [details]
source file

Here's the source code. Rename to round.c.
Comment 5 cck0011 2011-02-08 01:46:18 UTC
(In reply to comment #4)
> Created attachment 23273 [details]
> source file
> 
> Here's the source code. Rename to round.c.

Hi Richard,

  here's the source code. Rename to round.c. 

  I think I must be doing something wrong here. Someone would have noticed that results from _mm_cvtps_pi16 weren't changing when _MM_SET_ROUNDING_MODE() was called. :-) I'm puzzled by it working with -g, but not with -O3.

  Any additional information I can provide?

  Thanks!
Comment 6 Andrew Pinski 2011-02-08 01:58:40 UTC
This is the same problem as recorded in PR 34678.  We are optimizing all of the _mm_cvtps_pi16 calls down to a single one because we don't see that the rounding mode has changed.  To get the correct values each time, do the following:

void test_rounding(void)
{
 __m128 source = {-1.1, 0.0, 1.1, 1.5};
 __m64 dest;
 unsigned int initial_mode;

 initial_mode = _MM_GET_ROUNDING_MODE();
 print_rounding_mode("initial rounding mode", initial_mode);
 
 /* now set the rounding mode to each value to see the result  */
 
 asm("":"+X"(source)); // force source to be different but the same
 _MM_SET_ROUNDING_MODE(_MM_ROUND_NEAREST);

 dest = _mm_cvtps_pi16(source);
 _mm_empty();
 print_round_results("with _MM_ROUND_NEAREST ", source, dest);

 asm("":"+X"(source)); // force source to be different but the same
 _MM_SET_ROUNDING_MODE(_MM_ROUND_DOWN);

 dest = _mm_cvtps_pi16(source);
 _mm_empty();
 print_round_results("with _MM_ROUND_DOWN ", source, dest);
 
 asm("":"+X"(source)); // force source to be different but the same
 _MM_SET_ROUNDING_MODE(_MM_ROUND_UP);

 dest = _mm_cvtps_pi16(source);
 _mm_empty();
 print_round_results("with _MM_ROUND_UP ", source, dest);

 asm("":"+X"(source)); // force source to be different but the same
 _MM_SET_ROUNDING_MODE(_MM_ROUND_TOWARD_ZERO);

 dest = _mm_cvtps_pi16(source);
 _mm_empty();
 print_round_results("with _MM_ROUND_TOWARD_ZERO ", source, dest);

 /* restore initial rounding mode  */
  _MM_SET_ROUNDING_MODE(initial_mode);
  
}

*** This bug has been marked as a duplicate of bug 34678 ***
Comment 7 Richard Biener 2011-02-08 11:35:32 UTC
Well, this case is slightly different as we simply have const/pure builtins
that depend not only on their arguments but also on the FP state.  Thus we'd
need to drop the attributes from these functions for -frounding-math.  Not that
it would help a lot, given PR34678 ...
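
To illustrate the point about const/pure functions, here is a minimal sketch. It is an analogy only, not GCC's builtin code: the names fake_rounding_up and to_int are hypothetical, and the global merely stands in for the FP state. Because to_int reads that global, it must not be declared __attribute__((const)); with that attribute GCC would be allowed to assume that two calls with the same argument return the same value and to merge them.

```c
/* Hypothetical stand-in for hidden floating-point state (not the real MXCSR). */
static int fake_rounding_up = 0;

/*
 * Converts x to int, rounding up when fake_rounding_up is set and
 * truncating toward zero otherwise.  The result depends on a global,
 * so this function is deliberately NOT marked __attribute__((const)).
 */
static int to_int(double x)
{
    int truncated = (int)x;

    if (fake_rounding_up && x > (double)truncated)
        return truncated + 1;
    return truncated;
}
```

Calling to_int(1.5) before and after changing fake_rounding_up gives different answers, which is exactly the dependency a const attribute would hide from the optimizer.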
Comment 8 cck0011 2011-02-09 02:08:23 UTC
Hi folks,

  First, thanks for working on this.

  Second, I read the link and I _think_ I understand it. Let me paraphrase it back to you and you can tell me if I've got the point:

  There is an optimizer that extracts common expressions and evaluates them once instead of every time they occur. (What's the name of that so I can call it by the right name?) In my code it finds the expression:

  dest = _mm_cvtps_pi16(source);

  several times. Since it doesn't see source changing, this expression is evaluated only once. The change of rounding mode made by _MM_SET_ROUNDING_MODE(...) isn't recognized as something that could change the value of the _mm_cvtps_pi16(...) expression, so the optimization is still applied. Recognizing that change of rounding mode and reacting to it is what's at the heart of bug 34678, and that's why this is a duplicate.

  The work-arounds are:

1) insert 'asm("":"+X"(source));' before changing rounding mode to make the compiler re-evaluate expressions that use source.

2) do _MM_SET_ROUNDING_MODE(...) before any divisions or integer conversions that might get optimized out. The scope of the optimization is a function body and any inlined code. So do _MM_SET_ROUNDING_MODE early within that scope. 

  Is my understanding correct? 

  A few more questions:

  Will this bug exist on non-X86 processors?

  What does the 'asm("":"+X"(source));' expression do ?

  Will this syntax work for non-X86 processors?

  To be correct, should I compile with -frounding-math ?


Thanks!
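
On two of the questions above: the optimization described is commonly called common subexpression elimination (CSE), and the effect of an empty asm statement can be sketched as follows. This assumes GCC-style extended asm (Clang accepts it too); the helper name launder_int is hypothetical. The asm template is empty, so no instructions are emitted. The "+r" read-write operand only tells the compiler that the value may have been modified, so results computed from the old value cannot be reused. The "r" (general register) constraint is not specific to x86, whereas constraints that name SSE registers are.

```c
/*
 * Returns v unchanged at run time, but hides that fact from the optimizer:
 * after the asm, the compiler must treat v as a new, unknown value.
 */
static int launder_int(int v)
{
    __asm__ volatile("" : "+r"(v));
    return v;
}
```

Passing a value through launder_int between two otherwise identical computations keeps the compiler from treating them as the same expression.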
Comment 9 cck0011 2011-02-12 18:20:03 UTC
Hi folks,

  I tried asm("":"+X"(source)); as shown, but I get an error: inconsistent operand constraints in an ‘asm’.

  The info pages make it look like this should work, but the Inline Assembly Howto doesn't mention the X constraint. If the compiler should agree with the info pages, I'm doing something wrong. What am I missing?

thanks
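
A possible variant of the work-around from comment 6, offered as a sketch under assumptions rather than a confirmed fix: it swaps the "X" constraint for "x", which the GCC manual lists as "any SSE register" for x86, so it applies only to x86 targets with SSE enabled. The helper name convert_with_mode is hypothetical, and it uses the scalar _mm_cvtss_si32 instead of _mm_cvtps_pi16 to keep the example short.

```c
#include <xmmintrin.h>

/*
 * Hypothetical helper: convert f to int under the given MXCSR rounding
 * mode, then restore the previous mode.
 */
static int convert_with_mode(float f, unsigned int mode)
{
    unsigned int saved_mode = _MM_GET_ROUNDING_MODE();
    int result;

    _MM_SET_ROUNDING_MODE(mode);

    /* Empty asm: "+x" keeps f in an SSE register and marks it as possibly
       modified, so an earlier conversion of f cannot be reused here. */
    __asm__ volatile("" : "+x"(f));
    result = _mm_cvtss_si32(_mm_set_ss(f));

    /* Pin the result before the rounding mode is restored. */
    __asm__ volatile("" : "+r"(result));

    _MM_SET_ROUNDING_MODE(saved_mode);
    return result;
}
```

With this shape, convert_with_mode(1.5f, _MM_ROUND_DOWN) and convert_with_mode(1.5f, _MM_ROUND_UP) are separate conversions, each performed while its own rounding mode is in effect.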