Bug 44214 - Compiler does not optimize vector divide with -freciprocal-math (or -ffast-math)
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization
Version: 4.5.0
Importance: P3 enhancement
Target Milestone: 4.8.0
Assignee: Bill Schmidt
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-05-20 17:49 UTC by Michael Meissner
Modified: 2012-04-20 14:21 UTC

See Also:
Host: powerpc64-unknown-linux-gnu
Target: powerpc64-unknown-linux-gnu
Build: powerpc64-unknown-linux-gnu
Known to work:
Known to fail:
Last reconfirmed: 2010-05-21 09:47:56


Attachments
Example program that shows the issue on powerpc. (223 bytes, text/plain)
2010-05-20 18:02 UTC, Michael Meissner

Description Michael Meissner 2010-05-20 17:49:05 UTC
If you have code that divides by a constant and can be auto-vectorized, the compiler does not convert the division into multiplication by the reciprocal when -freciprocal-math (or -ffast-math) is given; it emits the division instead.

The bug is in fold-const.c near line 11254, where the code for handling REAL_CST should be cloned to handle VECTOR_CST (and presumably COMPLEX_CST also).
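
For readers unfamiliar with the transformation, here is a minimal scalar sketch of what is being asked for. The function names are hypothetical and this is not code from fold-const.c. It only shows the two forms side by side: the division is rounded once, while the reciprocal form rounds 1/c and then the product, which is why the rewrite is normally gated on -freciprocal-math.

```c
/* Hypothetical illustration only; not taken from GCC. */

/* Direct form: a single, correctly rounded division. */
static double div_by_const(double x, double c)
{
    return x / c;
}

/* Reciprocal form: 1/c is rounded first, then the product is rounded.
   For most divisors this can differ from x / c in the last bit. */
static double mul_by_recip(double x, double c)
{
    double r = 1.0 / c;
    return x * r;
}
```

With a power-of-two divisor such as 4.0 the reciprocal 0.25 is exact, so both functions return the same value; for a divisor such as 3.0 that is not guaranteed.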
Comment 1 Michael Meissner 2010-05-20 18:00:00 UTC
Actually, on looking at it further, I was wrong in the initial claim.  Auto-vectorization now handles division by a constant.  Explicit vector types, as on PowerPC (and probably SPU), do show the problem.
Comment 2 Michael Meissner 2010-05-20 18:02:10 UTC
Created attachment 20712
Example program that shows the issue on powerpc.

Compile with -mcpu=power7 on powerpc.
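
The attachment itself is not reproduced here. As a rough idea of the kind of code involved, the following sketch uses the generic GCC vector extension rather than any PowerPC-specific type. The type name v4sf, the function names and the divisor are all assumptions made for illustration, not the contents of the attached file.

```c
/* Hypothetical test case; not the attached program. */

/* Four floats in one 16-byte vector (GCC vector extension). */
typedef float v4sf __attribute__((vector_size(16)));

/* Explicit vector division by a constant: the form that the report
   says was left as a divide instruction. */
v4sf vec_div_by_const(v4sf x)
{
    const v4sf c = { 4.0f, 4.0f, 4.0f, 4.0f };
    return x / c;
}

/* The form the requested folding would produce: multiply by the
   element-wise reciprocal of the constant. */
v4sf vec_mul_by_recip(v4sf x)
{
    const v4sf r = { 0.25f, 0.25f, 0.25f, 0.25f };
    return x * r;
}
```

The divisor 4.0f is chosen so that both functions give identical results. Comparing the assembly of the two functions (for example with -O2 -S, plus -mcpu=power7 on powerpc) is one way to check whether the divide has been replaced.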
Comment 3 Richard Biener 2010-05-21 09:47:56 UTC
The fold code should probably simply use fold_binary to do the constant
folding (which should already handle 1/x for vector and complex x; there
is a build_one_cst to build the constant 1 for any type).  The exact
result check would need to use mpc (and I'm not sure it's correct for
-frounding-math anyway).
Comment 4 Bill Schmidt 2012-04-19 15:20:21 UTC
I'll take this one.
Comment 5 Bill Schmidt 2012-04-20 14:19:23 UTC
Author: wschmidt
Date: Fri Apr 20 14:19:13 2012
New Revision: 186625

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=186625
Log:
gcc:

2012-04-20  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	PR rtl-optimization/44214
	* fold-const.c (exact_inverse): New function.
	(fold_binary_loc): Fold vector and complex division by constant into
	multiply by reciprocal with flag_reciprocal_math; fold vector division
	by constant into multiply by reciprocal with exact inverse.

gcc/testsuite:

2012-04-20  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	PR rtl-optimization/44214
	* gcc.dg/pr44214-1.c: New test.
	* gcc.dg/pr44214-2.c: Likewise.
	* gcc.dg/pr44214-3.c: Likewise.



Added:
    trunk/gcc/testsuite/gcc.dg/pr44214-1.c
    trunk/gcc/testsuite/gcc.dg/pr44214-2.c
    trunk/gcc/testsuite/gcc.dg/pr44214-3.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/fold-const.c
    trunk/gcc/testsuite/ChangeLog
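
The log above distinguishes folding under flag_reciprocal_math from folding when the constant has an exact inverse. The sketch below shows one way to test for an exact inverse on plain C doubles. It is a hypothetical helper, not the exact_inverse function added to fold-const.c, and it assumes binary floating point, where 1/c is exactly representable only if |c| is a power of two and 1/c does not overflow.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper, not GCC's exact_inverse.  Returns true and
   stores 1/c in *inv when the reciprocal is exactly representable. */
static bool reciprocal_is_exact(double c, double *inv)
{
    int e;

    if (c == 0.0 || !isfinite(c))
        return false;

    /* frexp yields a mantissa with magnitude in [0.5, 1); a magnitude
       of exactly 0.5 means |c| is a power of two. */
    if (fabs(frexp(c, &e)) != 0.5)
        return false;

    double r = 1.0 / c;
    if (!isfinite(r))   /* 1/c overflowed, e.g. for tiny subnormal c */
        return false;

    *inv = r;
    return true;
}
```

A division by 4.0 or -0.5 could then be rewritten as a multiplication with no change in the result, whereas a division by 3.0 would still need -freciprocal-math.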
Comment 6 Bill Schmidt 2012-04-20 14:21:09 UTC
Fixed.