This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH][GCC] Complex division improvements in GCC


On 8/29/19 9:20 AM, Elen Kalda wrote:
> Hi all,
> 
> Advice and help needed! 
> 
> This patch makes changes to the the complex division in GCC. The algorithm 
> used is same as in https://gcc.gnu.org/ml/gcc-patches/2019-08/msg01629.html - 
> same problems, same improvement in robustness, same loss in accuracy. 
> 
> Since Baudin adds another underflow check, two more basic blocks get added 
> during the cplxlower1 pass. 
> 
> No problems with bootstrap on aarch64-none-linux-gnu. Unsurprisingly, there
> are regressions in gcc/testsuite/gcc.dg/torture/builtin-math-7.c. As in the 
> patch linked above, the regressions in that test are due to the loss in 
> accuracy.
> 
> To evaluate the performance, the same test which generates 360 000 000 random 
> numbers was used. Doing one less division results in a nice 11.32% 
> improvement in performance:
> 
>             | CPU time
> --------------------
> smiths  | 7 290 996
> b1div   | 6 465 590
> 
> That implementation works (in a sense that it produces an expected result), 
> but it could be made more efficient and clean. As an example, the cplxlower1
> pass assigns one variable to another variable, which seems redundant:
> 
> [...]
> 
>  <bb 3> [local count: 1063004407]:
>   # i_19 = PHI <0(2), i_15(7)>
>   _9 = REALPART_EXPR <x[i_19]>;
>   _7 = IMAGPART_EXPR <x[i_19]>;
>   _1 = COMPLEX_EXPR <_9, _7>;
>   _18 = REALPART_EXPR <y[i_19]>;
>   _17 = IMAGPART_EXPR <y[i_19]>;
>   _2 = COMPLEX_EXPR <_18, _17>;
>   _16 = ABS_EXPR <_18>;
>   _21 = ABS_EXPR <_17>;
>   _22 = _16 > _21;
>   if (_22 != 0)
>     goto <bb 9>; [50.00%]
>   else
>     goto <bb 10>; [50.00%]
> 
>   <bb 9> [local count: 531502204]:
>   _23 = _17 / _18;
>   _24 = _17 * _23;
>   _25 = _18 + _24;
>   _26 = 1.0e+0 / _25;
>   _27 = _23 == 0.0;
>   if (_27 != 0)
>     goto <bb 12>; [50.00%]
>   else
>     goto <bb 13>; [50.00%]
> 
>   <bb 12> [local count: 265751102]:
>   _28 = _7 / _18;
>   _29 = _17 * _28;
>   _30 = _9 + _29;
>   _31 = _26 * _30;
>   _32 = _9 / _18;
>   _33 = _17 * _32;
>   _34 = _7 - _33;
>   _35 = _26 * _34;
>   _83 = _31; <----------------------- could these extra assignments be avoided?
>   _84 = _35; <-----------------------|
>   goto <bb 11>; [100.00%]
I wouldn't really worry about them at this point.  They'll almost
certainly get propagated away as they're simple copies.


> 
> 2019-08-29  Elen Kalda  <elen.kalda@arm.com>
> 
>     * fold-const.c (fold_negate_const): Make the fold_negate_const function 
>     non-static
>     (const_binop): Implement Baudin's algorithm for complex division
>     * fold-const.h (fold_negate_const): Add a fold_negate_const function 
>     declaration
>     * tree-complex.c (complex_div_internal_wide): New function to aid with the 
>     wide complex division
>     (expand_complex_div_wide): Implement Baudin's algorithm for complex 
>     division
I'd like Joseph to chime in when he can.

jeff


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]