This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug tree-optimization/46032] openmp inhibits loop vectorization
- From: "vries at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Tue, 26 May 2015 16:38:58 +0000
- Subject: [Bug tree-optimization/46032] openmp inhibits loop vectorization
- Auto-submitted: auto-generated
- References: <bug-46032-4 at http dot gcc dot gnu dot org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46032
--- Comment #16 from vries at gcc dot gnu.org ---
(In reply to Richard Biener from comment #12)
> (In reply to vries from comment #11)
> > The ipa-pta solution no longer works. In 4.6, we had:
> > ...
> > # USE = anything
> > # CLB = anything
> > GOMP_parallel_startD.1048 (main._omp_fn.0D.1472, &.omp_data_o.1D.1484, 0);
> > # USE = anything
> > # CLB = anything
> > main._omp_fn.0D.1472 (&.omp_data_o.1D.1484);
> > # USE = anything
> > # CLB = anything
> > GOMP_parallel_endD.1049 ();
> > ...
> >
> > On trunk, we have now:
> > ...
> > # USE = anything
> > # CLB = anything
> > GOMP_parallelD.1345 (main._omp_fn.0D.1844, &.omp_data_o.1D.1856, 0, 0);
> > ...
> >
> > So there's no longer a path in the call graph from main to main._omp_fn.
> > Perhaps a dummy body for GOMP_parallel could fix that.
>
> Hm? The IPA PTA "solution" was to tell IPA PTA that the call to
> GOMP_parallel
[ GOMP_parallel_start ]
> doesn't make .omp_data_o escape.
>
Right, for 4.6, adding fnspec ".rw" to GOMP_parallel_start has this effect in
ipa-pta:
...
D.1505_14 = { ESCAPED NONLOCAL pData }
D.1509_18 = { ESCAPED NONLOCAL results }
-->
D.1505_14 = { pData }
D.1509_18 = { results }
...
where _14 and _18 are the omp_data_i-relative loads in the split-off function:
...
# VUSE <.MEMD.1514_20>
# PT = nonlocal
D.1505_14 = .omp_data_iD.1474_13(D)->pDataD.1477;
# VUSE <.MEMD.1514_20>
D.1506_15 = *D.1505_14[idxD.1495_1];
...
# VUSE <.MEMD.1514_20>
# PT = nonlocal
D.1509_18 = .omp_data_iD.1474_13(D)->resultsD.1479;
# .MEMD.1514_22 = VDEF <.MEMD.1514_20>
*D.1509_18[idxD.1495_1] = D.1508_17;
...
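
For readers who do not have the dump at hand, the two loads above correspond
roughly to the hand-written C sketch below. This is not compiler output. The
struct layout, the element type (double), the trip-count field n and the
"+ 1.0" computation are assumptions made for illustration; only the field
names pData and results and the index idx are taken from the dump.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the compiler-generated .omp_data_s record.
   Only the field names pData and results come from the dump. */
struct omp_data_s {
    double *pData;
    double *results;
    size_t n;            /* assumed trip count, not present in the dump */
};

/* Hand-written analogue of the split-off function main._omp_fn.0. */
void main_omp_fn(struct omp_data_s *omp_data_i)
{
    assert(omp_data_i != NULL);
    for (size_t idx = 0; idx < omp_data_i->n; idx++) {
        double *p = omp_data_i->pData;     /* plays the role of D.1505_14 */
        double *r = omp_data_i->results;   /* plays the role of D.1509_18 */
        r[idx] = p[idx] + 1.0;             /* assumed computation */
    }
}
```

As long as both pointers have points-to sets that include ESCAPED and
NONLOCAL, the store through r has to be treated as possibly aliasing the load
through p. With the narrowed sets { pData } and { results } the two accesses
are known to be disjoint.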
> The attached patch doesn't work because it only patches GOMP_parallel_start,
> not GOMP_parallel.
>
[ GOMP_parallel_start is no longer around on trunk. ] Applying the 4.6 patch on
trunk (and dropping the loop in the hunk for intra_create_variable_infos that
does not apply cleanly anymore) and applying fnspec ".rw" to GOMP_parallel
gives us in ipa-pta:
...
_17 = { }
_21 = { }
...
where _17 and _21 are the omp_data_i-relative loads in the split-off function:
...
# VUSE <.MEM_4>
# PT = nonlocal escaped
_17 = MEM[(struct .omp_data_s.0D.1713 &).omp_data_i_16(D) clique 1 base 1].pDataD.1719;
# VUSE <.MEM_4>
_18 = *_17[idx_1];
# VUSE <.MEM_4>
# PT = nonlocal escaped
_21 = MEM[(struct .omp_data_s.0D.1713 &).omp_data_i_16(D) clique 1 base 1].resultsD.1721;
# .MEM_22 = VDEF <.MEM_4>
*_21[idx_1] = _20;
...
It is reasonable to assume that we can no longer relate these loads in the
split-off function back to pData and results in the donor function, because
the donor function no longer contains a direct call to main._omp_fn.
On 4.6, that direct call to main._omp_fn still existed. On trunk, it no longer
does.
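
The difference between the two call shapes can be sketched in plain C. This is
an illustration under stated assumptions, not the code GCC generates: the
stub_* functions are hypothetical stand-ins for the libgomp entry points (so
the sketch builds without libgomp), and the int element type, the fixed size
N and the "* 2" computation are made up.

```c
#include <assert.h>
#include <stddef.h>

enum { N = 4 };                      /* assumed size, for illustration only */

struct omp_data_s {                  /* hypothetical stand-in for .omp_data_s */
    const int *pData;
    int *results;
};

typedef void (*omp_fn_t)(void *);

/* Hand-written analogue of main._omp_fn.0. */
static void main_omp_fn(void *arg)
{
    struct omp_data_s *omp_data_i = arg;
    assert(omp_data_i != NULL);
    for (size_t idx = 0; idx < N; idx++)
        omp_data_i->results[idx] = omp_data_i->pData[idx] * 2;
}

/* Hypothetical stubs mirroring the argument lists seen in the dumps.
   The real entry points live in libgomp and have no visible body here. */
static void stub_parallel_start(omp_fn_t fn, void *data, unsigned num_threads)
{
    (void)fn; (void)data; (void)num_threads;
}

static void stub_parallel_end(void)
{
}

static void stub_parallel(omp_fn_t fn, void *data, unsigned num_threads,
                          unsigned flags)
{
    (void)num_threads; (void)flags;
    fn(data);                        /* the only call, hidden inside the callee */
}

/* 4.6 shape: the donor function calls the split-off function directly. */
void run_4_6_shape(const int *in, int *out)
{
    struct omp_data_s omp_data_o = { in, out };
    stub_parallel_start(main_omp_fn, &omp_data_o, 0);
    main_omp_fn(&omp_data_o);        /* direct call edge in the donor */
    stub_parallel_end();
}

/* Trunk shape: the split-off function is only passed as a pointer. */
void run_trunk_shape(const int *in, int *out)
{
    struct omp_data_s omp_data_o = { in, out };
    stub_parallel(main_omp_fn, &omp_data_o, 0, 0);
}
```

In run_4_6_shape the argument &omp_data_o is visibly bound to the parameter of
main_omp_fn in the donor function itself. In run_trunk_shape that binding only
exists inside the callee, which for the real GOMP_parallel has no body
available to the analysis. The stub body here gives a rough idea of what the
dummy body suggested in comment #11 might look like.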