This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: autopar reduction and OMP_ATOMIC gimplify/expand changes - final patch
- From: Razya Ladelsky <RAZYA at il dot ibm dot com>
- To: "Diego Novillo" <dnovillo at google dot com>
- Cc: gcc-patches at gcc dot gnu dot org, "Zdenek Dvorak" <rakdver at kam dot mff dot cuni dot cz>
- Date: Wed, 31 Oct 2007 10:35:26 +0200
- Subject: Re: autopar reduction and OMP_ATOMIC gimplify/expand changes - final patch
> You are planning to include _all_ that gimple code in a comment as an
> example of the transformation? That's way too detailed and hard to
> follow. It needs to be much more succinct.
I wanted to show the reduction code at every stage where it is involved.
I hope this version is clearer.
Is it still too long?
Thanks,
Razya
source code:

parloop
{
  int sum = 1;

  for (i = 0; i < N/1000; i++)
    {
      x[i] = i + 3;
      sum += x[i];
    }
}
gimple-like code:
header_bb:
  # sum_29 = PHI <sum_11(5), 1(3)>
  # i_28 = PHI <i_12(5), 0(3)>
  D.1795_8 = i_28 + 3;
  x[i_28] = D.1795_8;
  sum_11 = D.1795_8 + sum_29;
  i_12 = i_28 + 1;
  if (N_6(D) > i_12)
    goto header_bb;

exit_bb:
  # sum_21 = PHI <sum_11(4)>
  printf (&"%d"[0], sum_21);
after reduction transformation (only relevant parts):
parloop
{
....
# Two new variables are created for each reduction:
"reduction" holds the neutral element for the particular operation,
e.g. 0 for PLUS_EXPR, 1 for MULT_EXPR, etc.
"reduction_initial" holds the initial value given by the user.
It is saved and used after the parallel computation is done. #
reduction_initial.24_46 = 1;
reduction.23_45 = 0;
.paral_data_store.32.reduction.23 = reduction.23_45;
#pragma omp parallel num_threads(4)
reduction.28_48 = .paral_data_load.33_51->reduction.23;
#pragma omp for schedule(static)
# sum.27_29 = PHI <sum.27_11, reduction.28_48>
sum.27_11 = D.1827_8 + sum.27_29;
OMP_CONTINUE
# Adding this reduction phi is done at create_phi_for_local_result() #
# reduction.23_58 = PHI <sum.27_11(5), 0(23)>
OMP_RETURN
# Creating the atomic operation is done at
create_call_for_reduction_1() #
#pragma omp atomic_load
D.1839_59 = *&.paral_data_load.33_51->reduction.23;
D.1840_60 = reduction.23_58 + D.1839_59;
#pragma omp atomic_store (D.1840_60);
OMP_RETURN
# Collecting the result after the threads join is done in
create_loads_for_reductions().
A new variable "reduction_final" is created. It holds the value computed
by the threads, which is combined with the initial value to produce the
final result. #
reduction_final.34_53 = .paral_data_load.33_52->reduction.23;
sum_37 = reduction_initial.24_46 + reduction_final.34_53;
sum_43 = D.1795_41 + sum_37;
exit_bb:
# sum_21 = PHI <sum_43, sum_26>
printf (&"%d"[0], sum_21);
...
}
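
For readers who find the GIMPLE hard to follow, the same three stages can be modelled in plain C. This is only a sequential sketch of the arithmetic, not the code the patch generates: the names reduce_sum and partial_sum and the chunking scheme are assumptions made for illustration. In the real transformation the chunks run in separate threads and the accumulation into the shared slot goes through the atomic load/store pair.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the patch): models what one thread does.
   The local accumulator starts from the neutral element of PLUS_EXPR,
   which is 0, just like "reduction" in the listing above.  */
static int
partial_sum (const int *x, size_t begin, size_t end)
{
  int local = 0;
  for (size_t i = begin; i < end; i++)
    local += x[i];
  return local;
}

/* Sequential model of the parallelized reduction.  Each chunk stands in
   for one thread; "shared" stands in for the reduction field of the
   shared data structure.  */
int
reduce_sum (const int *x, size_t n, int initial, size_t num_threads)
{
  assert (num_threads > 0);

  int reduction_initial = initial;  /* user's initial value, kept aside */
  int shared = 0;                   /* shared slot, set to the neutral element */
  size_t chunk = (n + num_threads - 1) / num_threads;

  for (size_t t = 0; t < num_threads; t++)
    {
      size_t begin = t * chunk;
      if (begin > n)
        begin = n;
      size_t end = (n - begin < chunk) ? n : begin + chunk;

      /* In the generated code this update is the atomic load/store pair.  */
      shared += partial_sum (x, begin, end);
    }

  /* After the join: combine the threads' result with the initial value.  */
  int reduction_final = shared;
  return reduction_initial + reduction_final;
}
```

Because the local accumulators start from the neutral element, the user's initial value is added exactly once, after the join, so the result does not depend on the number of chunks.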