This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug libgomp/65589] New: OpenMP 3.1 produces random results for simple array copy


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65589

            Bug ID: 65589
           Summary: OpenMP 3.1 produces random results for simple array
                    copy
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: blocker
          Priority: P3
         Component: libgomp
          Assignee: unassigned at gcc dot gnu.org
          Reporter: felix.ospald at gmx dot de
                CC: jakub at gcc dot gnu.org

/*
gcc --version
gcc (SUSE Linux) 4.8.1 20130909 [gcc-4_8-branch revision 202388]

echo | cpp -fopenmp -dM | grep -i open
#define _OPENMP 201107

cat /proc/version
Linux version 3.11.10-17-desktop (geeko@buildhost) (gcc version 4.8.1 20130909
[gcc-4_8-branch revision 202388] (SUSE Linux) ) #1 SMP PREEMPT Mon Jun 16
15:28:13 UTC 2014 (fba7c1f)

cat /proc/meminfo
MemTotal:       529410640 kB

lsb_release -a
LSB Version:    core-2.0-noarch:core-3.2-noarch:core-4.0-noarch:core-2.0-x86_64:core-3.2-x86_64:core-4.0-x86_64:desktop-4.0-amd64:desktop-4.0-noarch:graphics-2.0-amd64:graphics-2.0-noarch:graphics-3.2-amd64:graphics-3.2-noarch:graphics-4.0-amd64:graphics-4.0-noarch
Distributor ID: openSUSE project
Description:    openSUSE 13.1 (Bottle) (x86_64)
Release:        13.1
Codename:       Bottle

lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                64
On-line CPU(s) list:   0-63
Thread(s) per core:    2
Core(s) per socket:    8
Socket(s):             4
NUMA node(s):          4
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 45
Model name:            Intel(R) Xeon(R) CPU E5-4640 0 @ 2.40GHz
Stepping:              7
CPU MHz:               2712.000
BogoMIPS:              4815.64
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              20480K
NUMA node0 CPU(s):     0-7,32-39
NUMA node1 CPU(s):     8-15,40-47
NUMA node2 CPU(s):     16-23,48-55
NUMA node3 CPU(s):     24-31,56-63

Running this program results in an output like

loop 0
loop 1
loop 2
loop 3
loop 4
loop 5
loop 6
loop 7
loop 8
fault index=716440 value=0.5

So it seems that the value at index 716440 is never set to 1.
Sometimes several attempts (Ctrl+C and run again) and sometimes several
hundred loops are required before the error occurs.
The error rarely occurs when run with fewer than 32 threads (and never on 1 core).
The "rand()" function is called because it seems to make the error occur more
often (but the error also occurs without this line).
In general, code which produces a different runtime on each thread seems to make
the error occur more often. The "rand()" function seems to have internal
thread locking, which causes different delays for each thread.
The error was reproduced on two different machines (so bad memory is unlikely;
however, both run the same OS and gcc versions).
The following things do not have any influence:
- gcc optimization switch -O0
- gcc -march switch (native or x86-64)
- #pragma omp flush (at various places)
- schedule static/dynamic
- catching exceptions inside the loop

I have no clue what is going on. Any help is much appreciated.
*/

#include <iostream>
#include <cmath>
#include <stdlib.h>
#include <omp.h>

int main(int argc, char* argv[])
{
    int num_threads = omp_get_num_procs();

    if (argc > 1) {
        num_threads = atoi(argv[1]);
    }

    omp_set_dynamic(0);
    omp_set_nested(0);
    omp_set_num_threads(num_threads);
    std::cout << "num_threads=" << omp_get_max_threads() << std::endl;

    const int n = 512*512*4;
    double* phi0 = new double[n];
    double* sigma0 = new double[n];

    for (int iter = 0;; iter++)
    {
        std::cout << "loop " << iter << std::endl;

        for (int k = 0; k < n; k++)
        {
            phi0[k] = 1;
            sigma0[k] = 0.5;
        }

        #pragma omp parallel for schedule(static)
        for (int i = 0; i < n; i++)
        {
            //#pragma omp critical
            rand();

            sigma0[i] = phi0[i];
        }

        for (int j = 0; j < n; j++)
        { 
            if (sigma0[j] != 1) {
                std::cout << "fault index=" << j << " value=" << sigma0[j] << std::endl;
                return 1;
            }
        }
    }

    return 0;
}
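
One way to narrow this down further might be a bounded variant of the loop that
drops rand() (and with it any shared library state) and records which thread
wrote each element. The following is only a sketch under assumptions, not part
of the original report: the helper name run_copy_check, the owner array and the
small thread-local delay loop are all hypothetical, and the delay loop merely
imitates the uneven per-thread runtime described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper (not from the original report): runs the array copy
// `iters` times and returns the first index j with sigma0[j] != 1, or -1 if
// every pass was clean. After the call, owner[j] holds the id of the thread
// that executed iteration j in the last pass, or -1 if no thread did.
long run_copy_check(int n, int iters, std::vector<int>& owner)
{
    assert(n > 0 && iters > 0);
    std::vector<double> phi0(n), sigma0(n);
    owner.assign(n, -1);

    for (int iter = 0; iter < iters; iter++) {
        // Same initialisation as the reproducer, plus a reset of the owner map.
        for (int k = 0; k < n; k++) {
            phi0[k] = 1;
            sigma0[k] = 0.5;
            owner[k] = -1;
        }

        #pragma omp parallel for schedule(static)
        for (int i = 0; i < n; i++) {
            int tid = 0;
#ifdef _OPENMP
            tid = omp_get_thread_num();
#endif
            // Assumption: a thread-local LCG step stands in for rand() so that
            // threads take slightly different time without sharing any state.
            unsigned x = 12345u + 977u * static_cast<unsigned>(tid);
            for (int s = 0; s <= tid % 4; s++) {
                x = x * 1664525u + 1013904223u;
            }
            volatile unsigned sink = x;  // keeps the delay loop from being optimised away
            (void)sink;

            owner[i] = tid;       // remember which thread ran this iteration
            sigma0[i] = phi0[i];  // the copy under test
        }

        // Serial verification, as in the reproducer.
        for (int j = 0; j < n; j++) {
            if (sigma0[j] != 1) {
                return j;
            }
        }
    }
    return -1;
}
```

If a fault is reported at index j, then owner[j] == -1 would suggest that the
iteration was never executed at all, while owner[j] >= 0 would suggest that the
iteration ran on that thread but its store to sigma0[j] was not observed by the
checking loop. Called as run_copy_check(512*512*4, 1000, owner), it uses the
same array size as the program above but terminates after a fixed number of
passes.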


The CMakeLists.txt:

cmake_minimum_required(VERSION 2.8)
SET(CMAKE_BUILD_TYPE Release)
SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall -march=native -O2 -fopenmp")
#SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall -march=x86-64 -mtune=generic -O0 -fopenmp -fstack-check -fbounds-check")
SET(SOURCES main.cpp)
ADD_EXECUTABLE(main ${SOURCES})

