Motivated by PR57071. In numerical code, it is not unlike to find code of the form "(-1.0) raised to the power of k", in Fortran: (-1.0)**k. That translates into: __builtin_powif (-1.0e+0, k) which stays that way even with -O3. Expected: It gets optimized to "1.0 - 2.0 * (K%2)" Fortran test case: real function f(k) integer, value :: k f = (-1.0)**k end
Modulo is an integer only operation, so I'd say you want (k & 1) ? -1.0 : 1.0 instead, converting (k & 1) into floating point and multiplying and subtracting it or (k & 1) << 1 into floating point and subtracting it is is likely going to be more expensive, though of course it should be benchmarked.
As James pointed out at c.l.f, the algorithm fails if k < 0 - and suggests: 1.0 - (float) ((k & 1) << 1)
(In reply to comment #1) > Modulo is an integer only operation, so I'd say you want > (k & 1) ? -1.0 : 1.0 > instead, converting (k & 1) into floating point and multiplying and subtracting > it or (k & 1) << 1 into floating point and subtracting it is is likely going to > be more expensive, though of course it should be benchmarked. You're right, it is much faster: g25@linux-fd1f:~/Krempel/P2> cat a.c #include <stdio.h> volatile float b; int main() { int i; for (i=0; i<1000000000; i++) b = (i & 1) ? -1.0 : 1.0; printf("%f\n",b); } ig25@linux-fd1f:~/Krempel/P2> gcc -O a.c && time ./a.out -1.000000 real 0m1.948s user 0m1.946s sys 0m0.000s ig25@linux-fd1f:~/Krempel/P2> cat b.c #include <stdio.h> volatile float b; int main() { int i; for (i=0; i<1000000000; i++) b = 1.0 - ((i&1)<<1); printf("%f\n",b); } ig25@linux-fd1f:~/Krempel/P2> gcc -O b.c && time ./a.out -1.000000 real 0m7.059s user 0m7.053s sys 0m0.001s
Created attachment 29968 [details] patch that fails The patch in the attachment fails with the test case program main integer :: i real(8) :: a a = -1.0_8 read (*,*) i print *,(-1.0_8)**i,a**i end program main gfortran -O foo.f90 foo.f90: In function 'MAIN__': foo.f90:6:0: internal compiler error: in gimplify_modify_expr, at gimplify.c:4941 print *,(-1.0_8)**i,a**i ^ 0x7fbc17 gimplify_modify_expr ../../trunk/gcc/gimplify.c:4941 0x7f0211 gimplify_expr(tree_node**, gimple_statement_d**, gimple_statement_d**, bool (*)(tree_node*), int) ../../trunk/gcc/gimplify.c:7156 I wonder why.
(In reply to comment #4) > patch that fails The Fortran patch of the attachments looks fine, except for: + one = gfc_copy_expr (op1); + gfc_free_expr (op1); + gfc_free_expr (op2); + *e = *one; I would simply use "op1" directly instead of copying it and then freeing the original expression. Otherwise, I am okay with that patch (with a test case). * * * For the middle-end patch: Richard things that the problem is probably in the COND_EXPR re-gimplification - which means the problem is elsewhere.
Created attachment 30011 [details] Patch that should work, but doesn't. Here's the actual patch that fails, without the Fortran front end part.
The problem is the following in gimplify.c: In gimplify_cond_expr, one has: /* If this COND_EXPR has a value, copy the values into a temporary within the arms. */ if (!VOID_TYPE_P (type)) ... /* If either an rvalue is ok or we do not require an lvalue, create the temporary. But we cannot do that if the type is addressable. */ if (((fallback & fb_rvalue) || !(fallback & fb_lvalue)) && !TREE_ADDRESSABLE (type)) ... tmp = create_tmp_var (type, "iftmp"); result = tmp; This is then used in the then_ and else_ branches, e.g. <modify_expr 0x7ffff67f2e38 type <real_type 0x7ffff66f5e70 real(kind=4) SF arg 0 <var_decl 0x7ffff67ea850 iftmp.1 type <real_type ... real(kind=4)> arg 1 <real_cst 0x7ffff67f32a0 type <real_type ... real(kind=4)> constant 1.0e+0>> That's passed to gimplify_modify_expr, where one runs into the assert: if (gimplify_ctxp->into_ssa && is_gimple_reg (*to_p)) { /* We should have got an SSA name from the start. */ gcc_assert (TREE_CODE (*to_p) == SSA_NAME); } which fails as the TREE_CODE of "iftmp.1" is VAR_DECL.
For completeness, gimplify_ctxp->into_ssa is 0 for the -O0 optimization and it gets set to 1 in gimple_regimplify_operands. Thus, it is not surprising that it works with the -O0 example ("(-1.0)**k") but not with the -O1 example ("a=-1;a**k"). If one ignores the assert, one runs into: control flow within basic block 2.
After some discussion with Jakub and Richard - and after doing some code reads, I think it is best do handle this in tree-ssa-math-opts.c (search for gimple_expand_builtin_powi and BUILT_IN_POWI). For the code insertion, asan.c's build_check_stmt can serve as template.
Created attachment 30142 [details] Another patch that doesn't work... This time with the right PR... I tried to follow the advice in comment#9, but I hit a wall (again). With the attached patch, I hit foo.f90:1:0: error: SSA_NAME_DEF_STMT is wrong program main ^ Expected definition statement: _16 = 1.0e+0; Actual definition statement: _16 = _24; so I suspect I need PHI nodes here. Is this the right approach in general? Am I missing something simple here?
Created attachment 30228 [details] Yet another patch that doesn't work This one fails with program main ^ _24 = i_1 & 1; power_6.f90:4:0: internal compiler error: verify_gimple failed 0x98ca4c verify_gimple_in_cfg(function*) ../../trunk/gcc/tree-cfg.c:4792 0x8c4d87 execute_function_todo ../../trunk/gcc/passes.c:1969 0x8c56e7 execute_todo ../../trunk/gcc/passes.c:2002 We have a few alternatives now: 1. This doesn't get fixed at all 2. This doesn't get fixed by me 3. This gets fixed in the Fortran front end 4. Somebody points out where my patch is going wrong, and helps me understant a bit more of the basics of GIMPLE.
Author: burnus Date: Thu May 30 21:32:53 2013 New Revision: 199461 URL: http://gcc.gnu.org/viewcvs?rev=199461&root=gcc&view=rev Log: 2013-05-30 Tobias Burnus <burnus@net-b.de> Thomas Koenig <tkoenig@gcc.gnu.org> PR middle-end/57073 * tree-ssa-math-opts.c (execute_cse_sincos): Optimize powi (-1.0, k) to (k & 1) ? -1.0 : 1.0. 2013-05-30 Tobias Burnus <burnus@net-b.de> PR middle-end/57073 * gfortran.dg/power_6.f90: New. Added: trunk/gcc/testsuite/gfortran.dg/power_6.f90 Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-ssa-math-opts.c
FIXED on the trunk (for 4.9). Thanks, Thomas and Jakub, for the help and the suggestions.
The patch at http://gcc.gnu.org/ml/gcc-cvs/2013-05/msg01050.html is wrong; the optimization is only done when, for 2.0**i, i is a constant integer. power_6.f90 also fails as a result. Tobias, could you revert this patch?
(In reply to Thomas Koenig from comment #14) > The patch at http://gcc.gnu.org/ml/gcc-cvs/2013-05/msg01050.html > is wrong I realized this myself about 6 hours ago, see http://gcc.gnu.org/ml/gcc-patches/2013-05/msg01884.html - I was hoping to get some feedback - otherwise, I was intending to simply revert it.
Now it should be FIXED again. Author: burnus Date: Sat Jun 1 08:39:59 2013 New Revision: 199575 URL: http://gcc.gnu.org/viewcvs?rev=199575&root=gcc&view=rev Log: 2013-06-01 Tobias Burnus <burnus@net-b.de> Partially reverted: 2013-05-31 Tobias Burnus <burnus@net-b.de> PR middle-end/57073 * tree-ssa-math-opts.c (execute_cse_sincos): Move check further up. Modified: trunk/gcc/ChangeLog trunk/gcc/tree-ssa-math-opts.c