45472 – [4.7 Regression] [Middle-end volatile semantics] ICE: in move_op_ascend, at sel-sched.c:6124 with -fselective-scheduling2

Bug 45472 - [4.7 Regression] [Middle-end volatile semantics] ICE: in move_op_ascend, at sel-sched.c:6124 with -fselective-scheduling2

Summary: [4.7 Regression] [Middle-end volatile semantics] ICE: in move_op_ascend, at s...

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	middle-end (show other bugs)
Version:	4.6.0

Importance:	P2 normal
Target Milestone:	4.8.0
Assignee:	Not yet assigned to anyone

URL:
Keywords:

Depends on:
Blocks:

Reported:	2010-08-31 23:57 UTC by Zdenek Sojka
Modified:	2019-06-16 18:18 UTC (History)
CC List:	10 users (show)

See Also:	66660
Host:	x86_64-pc-linux-gnu
Target:	x86_64-pc-linux-gnu, powerpc--*
Build:
Known to work:	4.8.0
Known to fail:	4.7.4
Last reconfirmed:	2011-12-09 00:00:00

Attachments
reduced testcase (136 bytes, text/plain) 2010-08-31 23:58 UTC, Zdenek Sojka	Details
scheduler patch (459 bytes, patch) 2013-02-26 09:21 UTC, Andrey Belevantsev	Details \| Diff
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Zdenek Sojka 2010-08-31 23:57:14 UTC

Command line:
$ gcc -O1 -fschedule-insns2 -fselective-scheduling2 testcase.c

Compiler output:
$ gcc -O1 -fschedule-insns2 -fselective-scheduling2 testcase.c
testcase.c: In function 'foo':
testcase.c:18:1: internal compiler error: in move_op_ascend, at sel-sched.c:6124
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.

Tested revisions:
r163636 - crash
r153685 - crash
4.4 r149995 - OK

Comment 1 Zdenek Sojka 2010-08-31 23:58:25 UTC

Created attachment 21630 [details]
reduced testcase

$ gcc -O1 -fschedule-insns2 -fselective-scheduling2 pr45472.c

Comment 2 H.J. Lu 2010-09-01 03:03:31 UTC

It is caused by revision 147282:

http://gcc.gnu.org/ml/gcc-cvs/2009-05/msg00256.html

Comment 3 Andrey Belevantsev 2010-09-20 13:05:09 UTC

We have the code like this:

if (...)
{
 17 cx:DI=[`s2']  //comes from s2.vl += s1.vl;
 ...
}
27 dx:DI=[`s2']   //comes from s1 = s2; 

When the scheduler tries to move insn 27 before if (...), it also unifies its right-hand sides, as they seem equal.  The scheduler wants to get:

27 dx:DI=[`s2']
if (...)
{
  cx = dx
  ...
  dx:DI=[`s2'] // bookkeeping
} 

The insn 17 has its MEM with volatile bit set, the insn 27 has it unset.  It so happens that when gathering the available insn set and when moving the actually selected insn, insns 17 and 27 got merged in the different order.  First we don't have the volatile bit on the resulting insn, thus we believe the load does not trap and we move it up through a jump before the 'if'.  Second we have the bit and thus insn traps and we don't move it, then we hit the consistency assert in the scheduler.

Now, I'm happy to implement the correct merging of the may_trap_p bit in the scheduler which would "fix" this.  However, looking at the original C code it looks like both MEMs should have their volatile bit set.  I can only say that the original bits seem to come from expand, the addresses got propagated by fwprop but this doesn't seem to be the issue.

Comment 4 Alexander Monakov 2010-09-20 14:49:38 UTC

A small testcase to illustrate the problem with volatile fields.

//---8<---
struct vv {volatile long a, b;} vv1, vv2;

int foo()
{
  vv1 = vv2;
}
//---8<---

gcc/cc1 -O2 -frename-registers -fschedule-insns2 vol.c

    movq    vv2+8(%rip), %rax
    movq    vv2(%rip), %rdx
    movq    %rax, vv1+8(%rip)
    movq    %rdx, vv1(%rip)

The compiler reorders accesses to volatile fields.  As Andrey said, /v bits are missing on MEMs even in the .expand dump.

Comment 5 Paolo Bonzini 2010-09-20 16:01:50 UTC

Looks like a problem in expand.  CCing Matz.

Comment 6 Andrey Belevantsev 2010-10-18 10:57:23 UTC

Anybody familiar with the expand/tree level can take a look on this?  I can spare some time later this week if folks are busy.

Comment 7 Michael Matz 2010-10-18 11:46:57 UTC

Well, the problem is not only in expand.  As you say we start with this
code:

...
  vv1 = vv2;
...

That is already a non-volatile statement on the gimple level.  Making the
whole vv1/vv2 object volatile at the top-level of course changes this
into a volatile statement:

...
  vv1 ={v} vv2;
...

The problem with this is that at the gimple level the volatility is not
represented, and hence could already result in invalid transformations.
This extends into expand, so in a way the wrong MEM expressions are
a result of that.

Assuming we'd extend expand to expand such block moves, where parts of the
block are volatile into a series of volatile accesses, we could fix the bug
at hand.  That would still leave the problem of non-volatility on the gimple
level.

This all points to a deeper problem IMO, one we need to decide how to solve
before tackling the details of this bug, namely: how are non-volatile objects
containing volatile subobjects handled?  I don't know if the language
standard says anything about this.

We have several possibilities:
1) make all objects containing volatile subobjects volatile itself
2) make instructions touching such objects volatile
3) make instructions touching the volatile components volatile

The first has the problem that volatile objects with aggregate type
implicitely contain only volatile subobjects, that is for such object:
  struct {int a; volatile int b;} half_volatile;
with solution (1) this would be equivalent to
  volatile struct {int a; volatile int b;} half_volatile;
and hence halt_volatile.a would be volatile too, probably unlike the author
intended.

The other two solutions are more work, especially because the half-volatility
wouldn't be reflected in the objects type.

Independend of what we decide for the middle-end, I'd like to hear the opinion
of the C frontend people about this issue.

Comment 8 Paolo Bonzini 2010-10-18 12:20:39 UTC

Would it make sense to make the statement volatile even if only some subcomponents (or all subcomponents) are volatile?

I like (2); if I understand it correctly, in this case vv1 and vv2 would not be volatile, but you'd still have

   vv1 ={v} vv2;

in the GIMPLE source.  It should be possible to use a bit on {ARRAY,RECORD,UNION,QUAL_UNION}_TYPE to cache this, e.g.

#define TYPE_HAS_VOLATILE_PARTS(T) \
   (AGGREGATE_TYPE_P (T) \
    ? TYPE_UNSIGNED (T) \
    : TYPE_VOLATILE (T))

#define AGGREGATE_TYPE_CHECK(T) \
   TREE_CHECK4(T, ARRAY_TYPE, RECORD_TYPE, UNION_TYPE, QUAL_UNION_TYPE)

#define SET_TYPE_HAS_VOLATILE_PARTS(T, V) \
   (TYPE_UNSIGNED (AGGREGATE_TYPE_CHECK (T)) = (V))

Separately, expand would of course need to be taught about expanding accesses to volatile subcomponents as mem/v.  If this approach was feasible, it would have the advantage of splitting the task in two parts, one for GIMPLE (including possibly the verifier) and one for expand.

Comment 9 Richard Biener 2010-10-18 15:42:44 UTC

(In reply to comment #8)
> Would it make sense to make the statement volatile even if only some
> subcomponents (or all subcomponents) are volatile?
> 
> I like (2); if I understand it correctly, in this case vv1 and vv2 would not be
> volatile, but you'd still have
> 
>    vv1 ={v} vv2;
> 
> in the GIMPLE source.  It should be possible to use a bit on
> {ARRAY,RECORD,UNION,QUAL_UNION}_TYPE to cache this, e.g.
> 
> #define TYPE_HAS_VOLATILE_PARTS(T) \
>    (AGGREGATE_TYPE_P (T) \
>     ? TYPE_UNSIGNED (T) \
>     : TYPE_VOLATILE (T))
> 
> #define AGGREGATE_TYPE_CHECK(T) \
>    TREE_CHECK4(T, ARRAY_TYPE, RECORD_TYPE, UNION_TYPE, QUAL_UNION_TYPE)
> 
> #define SET_TYPE_HAS_VOLATILE_PARTS(T, V) \
>    (TYPE_UNSIGNED (AGGREGATE_TYPE_CHECK (T)) = (V))
> 
> Separately, expand would of course need to be taught about expanding accesses
> to volatile subcomponents as mem/v.  If this approach was feasible, it would
> have the advantage of splitting the task in two parts, one for GIMPLE
> (including possibly the verifier) and one for expand.

If we want to treat vv1 = vv2 as volatile then the frontends can now simply
emit MEM_REF <&vv1> = MEM_REF <&vv2> with TREE_THIS_VOLATILE set and things
should work.  That leaves it up to the frontend on how to deal with this.

The much harder question is how to expand vv1 = vv2 "correctly".  Thus,
we need to define what happens and document it.

Also consider memcpy (&vv1, &vv2) and eventually the compiler optimizing
that to vv1 = vv2 (note the lack of {v} here).

Comment 10 Michael Matz 2010-10-18 15:58:26 UTC

One idea we had was that this is all frontends business anyway, and hence
it should (if it so desires) simply create volatile MEM_REFs for references
to half-volatile objects.  That alone would result in the copy statement
being marked volatile, and would also (I guess, haven't checked) do the right
thing in expand.

So, if we (the frontend) decide that accesses to objects containing volatile
subobjects should itself be regarded as volatile, then generating the right
kind of MEM_REF would already provide that.

Comment 11 jsm-csl@polyomino.org.uk 2010-10-18 16:41:03 UTC

On Mon, 18 Oct 2010, rguenth at gcc dot gnu.org wrote:

> Also consider memcpy (&vv1, &vv2) and eventually the compiler optimizing
> that to vv1 = vv2 (note the lack of {v} here).

Using memcpy when any part of the source or destination is an object 
defined with volatile type has undefined behavior (at runtime).

Comment 12 Paolo Bonzini 2010-10-18 17:12:59 UTC

It would be nice if for

    struct a {
        char a,b,c,d;
        volatile int e;
    };

    struct a v1, v2;
    ...
    v1 = v2;

the compiler emitted only _two_ memory accesses, one for a/b/c/d and one for e.  I'm not sure a MEM_REF(&vv1) = MEM_REF(&vv2) with volatile arguments would achieve this.

So, even though it's just an optimization, it would be better if the IR was designed to allow it.  Would this be the case if the front-end emitted a volatile MODIFY_EXPR?

Comment 13 Richard Biener 2010-12-16 13:03:14 UTC

GCC 4.5.2 is being released, adjusting target milestone.

Comment 14 Andrey Belevantsev 2011-01-13 10:04:18 UTC

Do we want at least the patch properly merging the volatile bits in the scheduler for 4.6?  Or is this better be s plain ICE instead of a silent miscompile?

Comment 15 Andrey Belevantsev 2011-04-08 06:42:22 UTC

Just to remind the middle-end folks that stage 1 is a fine time for deciding on middle-end volatile semantics, before we forget about this for another 6 months :)

Comment 16 Richard Biener 2011-04-28 14:51:25 UTC

GCC 4.5.3 is being released, adjusting target milestone.

Comment 17 Andrew Pinski 2011-12-10 03:57:18 UTC

By the way I think we could get cases where the user wrote volatile in one case and non-volatile in another so fixing up the merging is still a good idea.

Comment 18 Andrey Belevantsev 2012-01-19 09:28:56 UTC

(In reply to comment #17)
> By the way I think we could get cases where the user wrote volatile in one case
> and non-volatile in another so fixing up the merging is still a good idea.

Sure, but the comment 14 still applies -- I think that unless the middle-end will have the proper volatile semantics, it will be wrong to hide this issue in the scheduler, so I'm leaving this as it is for now.

Comment 19 Jason Merrill 2012-02-16 19:41:29 UTC

It seems to me that volatile reads/writes should get their own gimple statements, not be part of a larger block move.  So instead of

  vv1 = vv2;

we should have

  vv1.a ={v} vv2.a;
  vv1.b ={v} vv2.b;

I agree with Paolo's comment in #12 that we want to copy the non-volatile parts as a block when possible.  It seems like breaking a simple struct assignment into these separate statements would be best done in the gimplifier so that front ends don't need to get this right independently.

Out of curiousity, what is the use case for a non-volatile struct object with volatile members?

Comment 20 Zdenek Sojka 2012-02-16 20:14:54 UTC

I can think of two use-cases from threaded environment:
- using the volatile member as a semaphore for the structure
- when one needs to assure some data will be written in certain order (eg. first write data, then the 'data valid' flag), while other members of the structure don't need to be volatile (data used by only one thread, or data that are only read; marking the whole struct volatile would prevent some optimisations)

Other cases:
- when debugging and you want to prevent optimisations of certain variable
- when the structure is allocated on a memory-mapped IO address, and only some parts of that IO need to be marked as volatile
- in OOP, where the variable is a member, and would otherwise be a global volatile variable

Comment 21 Richard Biener 2012-02-20 11:43:01 UTC

(In reply to comment #19)
> It seems to me that volatile reads/writes should get their own gimple
> statements, not be part of a larger block move.  So instead of
> 
>   vv1 = vv2;
> 
> we should have
> 
>   vv1.a ={v} vv2.a;
>   vv1.b ={v} vv2.b;
> 
> I agree with Paolo's comment in #12 that we want to copy the non-volatile parts
> as a block when possible.  It seems like breaking a simple struct assignment
> into these separate statements would be best done in the gimplifier so that
> front ends don't need to get this right independently.
> 
> Out of curiousity, what is the use case for a non-volatile struct object with
> volatile members?

There is no valid use-case for this.  So I think we should just declare this
issue a non-issue (middle-end wise).  If the C or C++ standards say that
vv1 = vv2 should behave as if the copy was elementwise then the frontends
need changing.  Certainly not the gimplifier - that's not the kitchen-sink
for things you don't want to properly describe in GENERIC to the middle-end ;)

Comment 22 Eric Botcazou 2012-02-27 08:45:52 UTC

> If the C or C++ standards say that vv1 = vv2 should behave as if the copy was
> elementwise then the frontends need changing.  Certainly not the gimplifier -
> that's not the kitchen-sink for things you don't want to properly describe in 
> GENERIC to the middle-end ;)

Yes, I think that we wouldn't want this in Ada for example.

Comment 23 Richard Biener 2012-07-02 11:19:03 UTC

The 4.5 branch is being closed, adjusting target milestone.

Comment 24 David Edelsohn 2013-01-05 21:12:05 UTC

This ICE has started to appear for target POWER when compiling testsuite/gcc.dg/tree-prof/pr50907.c

$ /tmp/20130104/gcc/xgcc -B/tmp/20130104/gcc/ /nasfarm/dje/src/src/gcc/testsuite/gcc.dg/tree-prof/pr50907.c  -fno-diagnostics-show-caret   -O -freorder-blocks-and-partition -fschedule-insns -fselective-scheduling -fpic -pthread -fprofile-generate -D_PROFILE_GENERATE  -lm   -o /tmp/20130104/gcc/testsuite/gcc/pr50907.x01

/nasfarm/dje/src/src/gcc/testsuite/gcc.dg/tree-prof/pr45354.c: In function 'test_ifelse2':
/nasfarm/dje/src/src/gcc/testsuite/gcc.dg/tree-prof/pr45354.c:23:1: internal compiler error: in move_op_ascend, at sel-sched.c:6153

Comment 25 Zdenek Sojka 2013-01-05 21:43:55 UTC

(In reply to comment #24)
> This ICE has started to appear for target POWER when compiling
> testsuite/gcc.dg/tree-prof/pr50907.c
> 

That testcase (or pr45354.c which it includes) doesn't contain any 'volatile' keyword, so it is likely a different bug. I would recommend you opening a separate PR for that. This PR is about treating 'volatile' on struct members.

Comment 26 Andrey Belevantsev 2013-02-26 09:21:14 UTC

Created attachment 29539 [details]
scheduler patch

I'm testing the attached patch to fix the ICE in the scheduler.  Then the issue/non-issue of the front-end of not presenting the proper volatile bits to the middle-end can be discussed separately.

Comment 27 Andrey Belevantsev 2013-02-27 08:56:15 UTC

Author: abel
Date: Wed Feb 27 08:56:08 2013
New Revision: 196308

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=196308
Log:
        PR middle-end/45472

	gcc/
        * sel-sched-ir.c (merge_expr): Also change vinsn of merged expr
        when the may_trap_p bit of the exprs being merged differs.

        Reorder tests for speculativeness in the logical and operator.

	testsuite/
	* gcc.dg/45472.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/pr45472.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/sel-sched-ir.c
    trunk/gcc/testsuite/ChangeLog

Comment 28 Andrey Belevantsev 2013-02-27 09:04:31 UTC

Fixed on trunk.

Comment 29 Andrey Belevantsev 2013-04-03 05:59:53 UTC

Ported to 4.7 and 4.6, though no bugzilla commit made.  Do I close the bug or do we want to fix the front-end to produce the proper volatile bits?

Comment 30 Jakub Jelinek 2013-04-12 15:15:58 UTC

GCC 4.6.4 has been released and the branch has been closed.

Comment 31 Richard Biener 2014-06-12 12:57:52 UTC

Fixed for 4.8.0