This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Altering OpenMP emitted code
- From: Amittai Aviram <amittai dot aviram at yale dot edu>
- To: gcc at gcc dot gnu dot org
- Date: Sat, 11 Feb 2012 17:20:24 -0500
- Subject: Altering OpenMP emitted code
[I've just posted this query to gcc-help, but it occurred to me that this list might be more appropriate. I am sorry for the duplication for people who subscribe to both lists.]
Hi! I'm reaching the point of exhaustion in trying to understand GCC code, so I need help. I want to change the code that GCC emits when the source code has an OpenMP reduction clause.
WHAT GCC DOES NOW
Suppose your source code looks like this, a minimal example:
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    omp_set_num_threads(4);
    int x = 42;
    #pragma omp parallel reduction(+:x)
    {
        x++;
    }
    printf("x = %d\n", x);
    return EXIT_SUCCESS;
}
GCC creates a separate, outlined function, "main.omp_fn.0", for the OpenMP parallel block. Within main.omp_fn.0, in order to represent the reduction clause, GCC uses a temporary stack variable (let's call it x_prime), initialized to 0, in place of the original x. Near the end of main.omp_fn.0, it then adds the current value of x_prime to the original x using an atomic instruction, such as the LOCK ADD instruction on x86. Here's the assembly code for x86_64 Ubuntu Linux, with labels and some dot-directives removed:
main.omp_fn.0:
    pushq %rbp
    movq  %rsp, %rbp
    movq  %rdi, -24(%rbp)
    movl  $0, -4(%rbp)
    addl  $1, -4(%rbp)
    movq  -24(%rbp), %rax
    movl  -4(%rbp), %edx
    lock addl %edx, (%rax)
    leave
    ret
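For readers who prefer C to assembly, the listing corresponds roughly to the hand-written sketch below. This is an approximation, not GCC's actual intermediate code: the names struct omp_data, main_omp_fn_0, and simulate_threads are hypothetical, the struct layout is only inferred from the assembly (pointer arriving in %rdi, x at offset 0), and the GCC builtin __sync_fetch_and_add stands in for the LOCK ADD. The simulated threads run one after another so that the result is deterministic.

```c
/* Hypothetical stand-in for the data block passed to the outlined
   function; the assembly suggests the shared x sits at offset 0. */
struct omp_data {
    int x;
};

/* Hand-written approximation of main.omp_fn.0 (illustrative name). */
static void main_omp_fn_0(struct omp_data *data)
{
    int x_prime = 0;                          /* movl $0, -4(%rbp)      */
    x_prime++;                                /* addl $1, -4(%rbp)      */
    __sync_fetch_and_add(&data->x, x_prime);  /* lock addl %edx, (%rax) */
}

/* Runs the region body once per simulated thread, sequentially.
   A real OpenMP runtime would run these calls on separate threads. */
static int simulate_threads(int initial, int num_threads)
{
    struct omp_data data = { initial };
    for (int i = 0; i < num_threads; i++)
        main_omp_fn_0(&data);
    return data.x;
}
```

With the values from the example, simulate_threads(42, 4) returns 46, which is what the original program should print if the runtime really provides four threads.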
HOW I WOULD LIKE TO CHANGE GCC'S BEHAVIOR
I want to replace the LOCK ADD instruction with a call to my own function (let's say "omp_reduction"). I will need to pass to omp_reduction the following parameters:
-- An enumerator value dependent on the operator originally used in the reduction--here, say, "OP_PLUS" for the original + operator.
-- The address of (original) x
-- The address of x_prime
-- An enumerator value for the type of x and x_prime
So the signature of omp_reduction would be:
    void omp_reduction(enum op_type op, void * var, void * tmp, enum operand_type type);
If x were, say, a 32-bit integer, the call written in C would look like this:
    omp_reduction(OP_PLUS, &x, &x_prime, INT32);
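To make the intended interface concrete, here is a minimal sketch of what such a function could look like. Only the signature and the enumerators OP_PLUS and INT32 come from the description above; OP_TIMES, FLOAT64, the spin lock, and the whole function body are assumptions made for illustration, not the intended implementation.

```c
#include <stdatomic.h>
#include <stdint.h>

/* OP_PLUS and INT32 appear in the post; OP_TIMES and FLOAT64 are
   hypothetical additions for illustration. */
enum op_type { OP_PLUS, OP_TIMES };
enum operand_type { INT32, FLOAT64 };

/* Assumed serialization scheme: one global spin lock around every merge. */
static atomic_flag reduction_lock = ATOMIC_FLAG_INIT;

void omp_reduction(enum op_type op, void *var, void *tmp,
                   enum operand_type type)
{
    /* Spin until this thread owns the lock. */
    while (atomic_flag_test_and_set(&reduction_lock))
        ;

    switch (type) {
    case INT32: {
        int32_t *dst = var;
        const int32_t *src = tmp;
        if (op == OP_PLUS)
            *dst += *src;
        else if (op == OP_TIMES)
            *dst *= *src;
        break;
    }
    case FLOAT64: {
        double *dst = var;
        const double *src = tmp;
        if (op == OP_PLUS)
            *dst += *src;
        else if (op == OP_TIMES)
            *dst *= *src;
        break;
    }
    }

    /* Release the lock so the next thread can merge its partial result. */
    atomic_flag_clear(&reduction_lock);
}
```

Each thread would call this once at the end of the parallel block, passing the address of the shared variable and of its own private copy, in place of the LOCK ADD.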
WHERE I AM NOW (LOST)
I think the atomic instruction at the end (e.g., LOCK ADD) is represented by the gimple_reduction_merge field of type gimple_seq in the tree_omp_clause structure defined in tree.h:
struct GTY(()) tree_omp_clause {
    /* (.. Other fields ...) */

    /* The gimplification of OMP_CLAUSE_REDUCTION_{INIT,MERGE} for omp-low's
       usage. */
    gimple_seq gimple_reduction_init;
    gimple_seq gimple_reduction_merge;

    tree GTY ((length ("omp_clause_num_ops[OMP_CLAUSE_CODE ((tree)&%h)]"))) ops[1];
};
But I do not understand how GCC assigns or uses this field, or how I can alter GCC's behavior with respect to it. I cannot seem to find the relevant source code in gcc/gcc.
I'd really appreciate help or guidance. Thanks!
Amittai Aviram
PhD Student in Computer Science
Yale University
646 483 2639
amittai.aviram@yale.edu
http://www.amittai.com