Bug 36081

Summary: gcc optimizations and threads (pthread)
Product: gcc
Reporter: snes2002
Component: driver
Assignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED INVALID
Severity: normal
CC: gcc-bugs
Priority: P3
Version: 4.2.1
Target Milestone: ---
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:
Attachments: gcc output, c source, preprocessed file, assembler listing

Description snes2002 2008-04-29 18:51:39 UTC
gcc version 4.2.1 (Debian 4.2.1-3)
System: debian-installation from knoppix 5.1.1

When optimizations are turned on (-O, -O1, -O2, -O3),
two or more threads don't recognize when the value of a global variable changes.
It doesn't matter if it's a global variable or a pointer to a local variable.
The change of its value by thread no 1 will not be recognized by thread no 2.
Comment 1 snes2002 2008-04-29 18:54:23 UTC
Created attachment 15547 [details]
gcc output
Comment 2 Andrew Pinski 2008-04-29 18:54:52 UTC
This is correct behavior; you need to either use volatile or use locks.
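
A minimal sketch of the lock-based option, for illustration only. The attached source is not reproduced in this thread, so the flag `done` and the functions `set_done`, `is_done`, `waiter`, and `run_flag_demo` are hypothetical names. Every access to the flag goes through a mutex, so the second thread sees the change at any optimization level:

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical shared flag; the name is illustrative, not from the attachment. */
static pthread_mutex_t done_lock = PTHREAD_MUTEX_INITIALIZER;
static int done = 0;

/* Set the flag while holding the lock. */
static void set_done(void)
{
    pthread_mutex_lock(&done_lock);
    done = 1;
    pthread_mutex_unlock(&done_lock);
}

/* Read the flag while holding the lock, so each call reads memory again. */
static int is_done(void)
{
    int value;
    pthread_mutex_lock(&done_lock);
    value = done;
    pthread_mutex_unlock(&done_lock);
    return value;
}

/* Thread 2: polls until thread 1 sets the flag, then reports its poll count. */
static void *waiter(void *arg)
{
    long polls = 0;
    while (!is_done())
        polls++;
    *(long *)arg = polls;
    return NULL;
}

/* Thread 1: starts the waiter, sets the flag, and joins it.
   Returns 1 once the waiter has seen the change and exited. */
int run_flag_demo(void)
{
    pthread_t thread;
    long polls = -1;
    int rc = pthread_create(&thread, NULL, waiter, &polls);
    assert(rc == 0);
    set_done();
    rc = pthread_join(thread, NULL);
    assert(rc == 0);
    (void)rc;
    return polls >= 0;
}
```

Calling `run_flag_demo()` from `main` and building with `gcc -O2 -pthread` should return 1 rather than hang. This version still polls; it only shows that the locked access makes the update visible.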
Comment 3 snes2002 2008-04-29 18:55:29 UTC
Created attachment 15548 [details]
c source
Comment 4 snes2002 2008-04-29 18:56:36 UTC
Created attachment 15549 [details]
preprocessed file
Comment 5 snes2002 2008-04-29 18:57:23 UTC
Created attachment 15550 [details]
assembler listing
Comment 6 snes2002 2008-04-29 18:59:25 UTC
This C code works with optimizations turned off.
With the Borland compiler it also works with optimizations turned on.
Comment 7 snes2002 2008-04-29 19:06:07 UTC
Subject: Re:  gcc optimizations and threads (pthread)

Thanks for the quick reply.

But this code works without optimizations, even with complex constructs.
For the solution to my problem I can't use locks.
I don't want to serialize the threads; I want to use both processors of my dual-core.
And for syncing the threads, that's the easiest way.

It works fine without optimizations and also works with the Borland C++ compiler with all optimizations.


Comment 8 Andrew Pinski 2008-04-29 19:11:53 UTC
>And for syncing the threads, that's the easiest way.

Use mutexes then, they are designed exactly for this.

>I don't want to serialize the threads, I want to use both processors of my dual-core.

You are serializing them by doing a busy loop.  In fact using a busy loop will just make the CPUs spin without doing anything except for checking a condition.  Using a mutex allows for other stuff to happen while that thread is waiting for more work to do.
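
The difference between spinning and blocking can be sketched with a condition variable. This is a minimal illustration, not the reporter's code: the struct `mailbox`, the functions `consumer` and `run_mailbox_demo`, and the doubling step are all hypothetical. `pthread_cond_wait` releases the mutex and puts the thread to sleep until it is signalled, so the waiting thread does not spin on the CPU while it waits.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical hand-off slot shared by two threads. */
struct mailbox {
    pthread_mutex_t lock;
    pthread_cond_t  ready;
    int has_work;   /* predicate protected by lock */
    int value;      /* input handed to the consumer */
    int result;     /* output written by the consumer */
};

/* Consumer: sleeps until work arrives instead of busy-looping. */
static void *consumer(void *arg)
{
    struct mailbox *m = arg;
    pthread_mutex_lock(&m->lock);
    while (!m->has_work)                 /* re-check guards against spurious wakeups */
        pthread_cond_wait(&m->ready, &m->lock);
    m->result = m->value * 2;            /* stand-in for real work */
    pthread_mutex_unlock(&m->lock);
    return NULL;
}

/* Producer: hands one value to the consumer and returns its result. */
int run_mailbox_demo(int value)
{
    struct mailbox m;
    pthread_t thread;
    int rc;

    m.has_work = 0;
    m.value = 0;
    m.result = 0;
    rc = pthread_mutex_init(&m.lock, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&m.ready, NULL);
    assert(rc == 0);
    rc = pthread_create(&thread, NULL, consumer, &m);
    assert(rc == 0);

    pthread_mutex_lock(&m.lock);
    m.value = value;
    m.has_work = 1;
    pthread_cond_signal(&m.ready);       /* wake the sleeping consumer */
    pthread_mutex_unlock(&m.lock);

    rc = pthread_join(thread, NULL);
    assert(rc == 0);
    pthread_cond_destroy(&m.ready);
    pthread_mutex_destroy(&m.lock);
    (void)rc;
    return m.result;
}
```

The `while` loop around `pthread_cond_wait` matters: if the producer signals before the consumer starts waiting, the consumer sees `has_work` already set and never blocks.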

I guess you need to read up on threaded programming more, because right now you are causing extra work to happen on a thread for no reason.

-- Pinski
Comment 9 snes2002 2008-04-30 12:26:02 UTC
Subject: Re:  gcc optimizations and threads (pthread)

Hello,

I read about mutexes, cond_wait and cond_signal.
When I use these instead of a busy loop, there is no performance gain at all for my problem (generating the first 50000 prime numbers).
I tested the execution times of all variants:

Single-thread (compiled with -O2): 17.5 s
Double-thread with busy loop (compiled with -O0): 11 s
Double-thread with mutexes (compiled with -O2): 47.1 s !!!

But all this is not the problem or the reason why I'm reporting this.

The problem is that the same code (whatever it does) works when compiled with "gcc -O0", "bcc32 -O0" (under Wine), or "bcc32 -O2".
But when compiled with "gcc -O2" it doesn't work.
In this case the second thread doesn't re-read or dereference the current pointer.
The value seems to be cached in a register rather than read from memory each time.
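
The register caching described here can be sketched as follows. This is an assumption-laden illustration, not the attached source: the flag `stop`, both function names, and the `limit` parameter (added only so the sketch always terminates) are hypothetical. Because nothing inside the loop writes `stop`, an optimizing compiler may load it once before the loop, which is equivalent to the second function:

```c
/* Hypothetical flag that another thread is expected to set; a plain int,
   with no volatile qualifier and no lock around it. */
int stop = 0;

/* The loop as it might be written in the source. The limit parameter is
   an addition for this sketch so the function always returns. */
long spin_as_written(long limit)
{
    long n = 0;
    while (!stop && n < limit)
        n++;
    return n;
}

/* What the optimizer may effectively produce: stop is loaded once into
   a register before the loop and never refreshed. */
long spin_as_optimized(long limit)
{
    long n = 0;
    int cached_stop = stop;   /* single load from memory */
    while (!cached_stop && n < limit)
        n++;
    return n;
}
```

In a single thread the two functions always return the same result, which is why the compiler is free to make this transformation. Declaring the flag `volatile` forces the load back into the loop, though it does not by itself provide atomicity or memory ordering between threads.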

If there are additional exit-conditions in loops (because of speed-optimization)
they won't be recognized.



Comment 10 snes2002 2008-05-03 10:26:26 UTC
This is definitely a bug.
The same source code must do the same thing, no matter which optimization options
are enabled or disabled.
Comment 11 Richard Biener 2008-05-03 11:03:48 UTC
No, it need not.  If your program is bogus then it is bogus.