Bug 39217 - g++4.3.3 OpenMP (aka omp) for loop hangs
Status: RESOLVED INVALID
Alias: None
Product: gcc
Classification: Unclassified
Component: libgomp
Version: 4.3.3
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Reported: 2009-02-17 18:55 UTC by Andrew Leaver-Fay
Modified: 2012-01-09 04:12 UTC

Description Andrew Leaver-Fay 2009-02-17 18:55:59 UTC
I'm testing some autoparallelization on my Intel Mac ("mactel") running g++ 4.3.3, which I downloaded from MacPorts.

In every other run, I see my code hang, though I cannot find an error in gdb. Through a long series of cout-debugging steps, I have traced it to the beginning of an omp loop. The code does not hang on a Linux box running g++ 4.1.0.

The code in question looks like this:

	utility::vector1< DOF_Node * > dof_nodes;
	//dof_nodes.reserve( min_map.size() );
	for ( MinimizerMap::iterator iter = min_map.begin(), iter_e = min_map.end();
				iter != iter_e; ++iter ) {
		dof_nodes.push_back( *iter );
	}
	std::cout << "s" << std::flush;
	
	int const ndofs = dof_nodes.size();
	#pragma omp parallel for
	for ( int ii = 1; ii <= ndofs; ++ii ) {
		std::cout << "*" << std::flush;
		DOF_Node & dof_node( *dof_nodes[ ii ] );
		std::cout << "(" << ii << "," << dof_nodes[ ii ] << ")" << std::flush;

		// loop through atoms first moved by this torsion
		for ( DOF_Node::AtomIDs::const_iterator it1=dof_node.atoms().begin(),
				it1e = dof_node.atoms().end();
				it1 != it1e; ++it1 ) {
			id::AtomID const & atom_id( *it1 );

			scorefxn.eval_atom_derivative( atom_id, pose, min_map.domain_map(), dof_node.F1(), dof_node.F2() );
		} // atom1
	} // tor
	//std::cout << "B" << std::flush;

Sample output from a hang (notice that it ends with "s"):

SEADb0x1SEADf0SEADb0x1SEADf0*(0x2c,0x83933c0)SEADb0x1SEADf0SEADb0x1SEADf0SEADb0x1SEADf0*(0x2d,0x8397aa0)SEADb0x1SEADf0SEADb0x1SEADf0*(0x2e,0x8393360)SEADb0x1SEADf0SEADb0x1SEADf0SEADb0x1SEADf0*(0x2f,0x83c67d0)SEADb0x1SEADf0SEADb0x1SEADf0SEADb0x1SEADf0*(0x30,0x83c6750)SEADb0x1SEADf0*(0x31,0x83c66d0)SEADb0x1SEADf0SEADb0x1SEADf0*(0x32,0x83c6650)SEADb0x1SEADf0SEADb0x1SEADf0SEADb0x1SEADf0*(0x33,0x83c65d0)SEADb0x1SEADf0SEADb0x1SEADf0s

"SEAD" comes from the function called within this loop: Scorefunction Evaluate Atom Derivative. I output SEAD twice, once with a "b" at the beginning of the function and once with an "f" when it has finished.

(It's unclear to me why the integer ii is getting written out in hex.)
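One plausible explanation, offered as a guess since the code that prints the SEAD markers is not shown: std::hex and std::showbase are sticky stream manipulators, so if any earlier code sets them on std::cout and never resets them, every later integer written to std::cout comes out in hex. The "0x1" after each "SEADb" is consistent with that. The sketch below uses hypothetical function names and a std::ostringstream in place of std::cout to show the effect and one way to avoid it.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a callee that prints a value in hex and
// never restores the stream's formatting flags.
void print_flag_in_hex(std::ostream & out, int flag) {
	out << std::hex << std::showbase << flag;  // both settings persist on `out`
}

// Same output, but the caller's formatting flags are saved and restored.
void print_flag_in_hex_restoring(std::ostream & out, int flag) {
	std::ios_base::fmtflags const saved = out.flags();
	out << std::hex << std::showbase << flag;
	out.flags(saved);
}

// What a later `out << 44` looks like after the leaky callee has run.
std::string demo_leaky() {
	std::ostringstream out;
	print_flag_in_hex(out, 1);
	out << "," << 44;
	return out.str();
}

// What a later `out << 44` looks like after the restoring callee has run.
std::string demo_restoring() {
	std::ostringstream out;
	print_flag_in_hex_restoring(out, 1);
	// The stream is back to its default decimal base at this point.
	assert((out.flags() & std::ios_base::basefield) == std::ios_base::dec);
	out << "," << 44;
	return out.str();
}
```

In the loop above, writing `std::dec << ii` would force decimal output regardless of what earlier code did to the stream. Pointers such as `dof_nodes[ ii ]` are normally printed in hex either way.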

The code hangs in several places besides this loop.  I have not found any similar bug reports for g++ 4.3.3 on the Mac.

It is certainly possible that there is a bug somewhere else in the code that results in the hang here.  I'm at a loss, though, as to how to detect it.  If anyone has even the mildest suggestion of where to begin, I'd be glad to hear it.
Comment 1 Andrew Pinski 2009-02-18 21:55:33 UTC
We need a preprocessed source or at least a self-contained example.  It might be that you are not using the correct barriers or atomics when updating a global variable.
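As a minimal sketch of what "correct atomics" means here: when every iteration of an `omp parallel for` loop updates the same variable, the update has to be protected. The functions below are hypothetical and are not taken from the reporter's program. Whether `eval_atom_derivative` touches shared state at all cannot be determined from the report. The first version guards the shared update with `#pragma omp atomic`; the second lets OpenMP keep a private partial sum per thread via a `reduction` clause.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Sums contributions into one shared total. Without the atomic directive,
// concurrent `total += ...` from several threads would be a data race.
double sum_with_atomic(std::vector<double> const & contributions) {
	assert(contributions.size() <= static_cast<std::size_t>(std::numeric_limits<int>::max()));
	int const n = static_cast<int>(contributions.size());
	double total = 0.0;
	#pragma omp parallel for
	for (int ii = 0; ii < n; ++ii) {
		#pragma omp atomic
		total += contributions[ii];
	}
	return total;
}

// Same result, but each thread accumulates privately and OpenMP combines
// the partial sums at the end of the loop.
double sum_with_reduction(std::vector<double> const & contributions) {
	assert(contributions.size() <= static_cast<std::size_t>(std::numeric_limits<int>::max()));
	int const n = static_cast<int>(contributions.size());
	double total = 0.0;
	#pragma omp parallel for reduction(+:total)
	for (int ii = 0; ii < n; ++ii) {
		total += contributions[ii];
	}
	return total;
}
```

Compiled without -fopenmp, the pragmas are ignored and both functions run serially, which makes it easy to compare serial and parallel results on the same input.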
Comment 2 Andrew Pinski 2012-01-08 21:33:08 UTC
No testcase in over 2 years so closing as invalid.
Comment 3 Andrew Leaver-Fay 2012-01-09 04:12:16 UTC
Hi,

I'm working with a large program and I am not sure if there are global
variables that are being read without my knowledge.  I knew to look
for global variables, but, not finding any, I didn't know what else I
could do.  I was hoping for something along the lines of "oh, pop open
your program in gdb and type X and then when the craziness appears,
you can check the stacks of the miscreant threads" or "start dumping X
messages to the screen at points Y and Z and you should see one thread
fail."

Debugging multithreaded programs is tricky.
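The "dump messages at points Y and Z" approach could look roughly like the sketch below. It assumes hypothetical helper names (`current_thread_id`, `trace`, `run_traced_loop`) that do not exist in the program discussed here. Each message is tagged with the OpenMP thread number and written inside a named critical section, so lines from different threads do not interleave.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>
#ifdef _OPENMP
#include <omp.h>
#endif

// Thread number for log lines; always 0 when compiled without OpenMP.
int current_thread_id() {
#ifdef _OPENMP
	return omp_get_thread_num();
#else
	return 0;
#endif
}

// Builds the whole line first, then writes it inside one critical section
// so that output from different threads stays on separate lines.
void trace(std::ostream & out, char const * where, int iteration) {
	std::ostringstream line;
	line << "[t" << current_thread_id() << "] " << where << " ii=" << iteration << '\n';
	#pragma omp critical(trace_output)
	{
		out << line.str() << std::flush;
	}
}

// Hypothetical loop instrumented with enter/leave markers; returns the
// number of iterations that ran to completion.
int run_traced_loop(std::ostream & out, int n) {
	assert(n >= 0);
	int completed = 0;
	#pragma omp parallel for reduction(+:completed)
	for (int ii = 1; ii <= n; ++ii) {
		trace(out, "enter", ii);
		// ... the real per-iteration work would go here ...
		trace(out, "leave", ii);
		completed += 1;
	}
	return completed;
}
```

An iteration that logs "enter" without a matching "leave" points at the thread and loop index that got stuck. On the gdb side, interrupting the hung process and running `thread apply all bt` prints a backtrace for every thread, which shows where each one is waiting.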

Sadly, I never got this program to work with OpenMP.  I've given up.
Thank you for at least getting back to me.

Best,
Andrew