44214 – Compiler does not optimize vector divide with -freciprocal-math (or -ffast-math)

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	rtl-optimization
Version:	4.5.0

Importance:	P3 enhancement
Target Milestone:	4.8.0
Assignee:	Bill Schmidt

URL:
Keywords:

Depends on:
Blocks:

Reported:	2010-05-20 17:49 UTC by Michael Meissner
Modified:	2012-04-20 14:21 UTC
CC List:	3 users

See Also:
Host:	powerpc64-unknown-linux-gnu
Target:	powerpc64-unknown-linux-gnu
Build:	powerpc64-unknown-linux-gnu
Known to work:
Known to fail:
Last reconfirmed:	2010-05-21 09:47:56

Attachments
Example program that shows the issue on powerpc. (223 bytes, text/plain) 2010-05-20 18:02 UTC, Michael Meissner

Description Michael Meissner 2010-05-20 17:49:05 UTC

If you have code that divides by a constant and that the compiler can auto-vectorize, the compiler does not convert the division into a multiplication by the reciprocal when -freciprocal-math (or -ffast-math) is given; it performs the division instead.

The bug is in fold-const.c near line 11254, where the code for handling REAL_CST should be cloned to handle VECTOR_CST (and presumably COMPLEX_CST also).

Comment 1 Michael Meissner 2010-05-20 18:00:00 UTC

Actually, on looking at it further, I was wrong in the initial claim.  Auto-vectorization now handles division by a constant.  Code that uses explicit vector types, as on PowerPC (and probably SPU), does show the problem.

Comment 2 Michael Meissner 2010-05-20 18:02:10 UTC

Created attachment 20712
Example program that shows the issue on powerpc.

Compile with -mcpu=power7 on powerpc.
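
The attachment itself is not reproduced on this page.  As a rough illustration of the kind of code being discussed, here is a minimal sketch that assumes GCC's generic vector extension.  The names v4sf and scale_down are hypothetical and are not taken from the attachment.

```c
/* Hypothetical 4 x float vector type built with the GCC vector
   extension; not taken from the attached test case.  */
typedef float v4sf __attribute__ ((vector_size (16)));

/* Explicit vector division by a constant vector.  With
   -freciprocal-math one would hope for a multiply by
   { 0.25, 0.25, 0.25, 0.25 } instead of a vector divide.  */
v4sf
scale_down (v4sf a)
{
  v4sf four = { 4.0f, 4.0f, 4.0f, 4.0f };
  return a / four;
}
```

Comparing the assembly produced with and without -freciprocal-math (for example via -S) should show whether the divide is still emitted.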

Comment 3 Richard Biener 2010-05-21 09:47:56 UTC

The fold code should probably simply use fold_binary to do the constant
folding, which should already handle 1/x for vector and complex x (there
is a build_one_cst to build the constant 1 for any type).  The exact
result check would need to use mpc (and I'm not sure it's correct for
-frounding-math anyway).
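
To make the "exact result check" concrete for the plain scalar case: a reciprocal can be computed exactly when the divisor is a power of two and 1/x is still a normal number.  The following is only a sketch under that assumption.  The helper reciprocal_is_exact is hypothetical, is not the code in fold-const.c, and does not address complex values or -frounding-math.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper, not GCC's implementation.  Returns true and
   stores 1/x in *inv when x is a finite, nonzero power of two whose
   reciprocal is a normal double; otherwise returns false and leaves
   *inv untouched.  */
bool
reciprocal_is_exact (double x, double *inv)
{
  int exponent;
  double r;

  if (x == 0.0 || !isfinite (x))
    return false;

  /* frexp yields a mantissa with magnitude in [0.5, 1); it is exactly
     0.5 only for powers of two.  */
  if (fabs (frexp (x, &exponent)) != 0.5)
    return false;

  r = 1.0 / x;
  /* Conservatively reject reciprocals that overflow or go subnormal.  */
  if (!isnormal (r))
    return false;

  *inv = r;
  return true;
}
```

With this check, a divisor such as 4.0 qualifies (reciprocal 0.25), while 3.0 does not, so x / 3.0 could only become a multiply under -freciprocal-math.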

Comment 4 Bill Schmidt 2012-04-19 15:20:21 UTC

I'll take this one.

Comment 5 Bill Schmidt 2012-04-20 14:19:23 UTC

Author: wschmidt
Date: Fri Apr 20 14:19:13 2012
New Revision: 186625

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=186625
Log:
gcc:

2012-04-20  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	PR rtl-optimization/44214
	* fold-const.c (exact_inverse): New function.
	(fold_binary_loc): Fold vector and complex division by constant into
	multiply by reciprocal with flag_reciprocal_math; fold vector division
	by constant into multiply by reciprocal with exact inverse.

gcc/testsuite:

2012-04-20  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	PR rtl-optimization/44214
	* gcc.dg/pr44214-1.c: New test.
	* gcc.dg/pr44214-2.c: Likewise.
	* gcc.dg/pr44214-3.c: Likewise.



Added:
    trunk/gcc/testsuite/gcc.dg/pr44214-1.c
    trunk/gcc/testsuite/gcc.dg/pr44214-2.c
    trunk/gcc/testsuite/gcc.dg/pr44214-3.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/fold-const.c
    trunk/gcc/testsuite/ChangeLog

Comment 6 Bill Schmidt 2012-04-20 14:21:09 UTC

Fixed.