61118 – [12/13/14/15 Regression] Indirect call generated for pthread_cleanup_push with constant cleanup function

Bug 61118 - [12/13/14/15 Regression] Indirect call generated for pthread_cleanup_push with constant cleanup function

Summary: [12/13/14/15 Regression] Indirect call generated for pthread_cleanup_push wit...

Status:	REOPENED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	middle-end (show other bugs)
Version:	4.9.0

Importance:	P2 normal
Target Milestone:	12.5
Assignee:	Jeffrey A. Law

URL:
Keywords:	diagnostic

Depends on:
Blocks:

Reported:	2014-05-08 20:45 UTC by Tavian Barnes
Modified:	2024-07-19 12:56 UTC (History)
CC List:	9 users (show)

See Also:	https://sourceware.org/bugzilla/show_bug.cgi?id=21513 48968 65041 21161 54561 83162
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:	2017-11-21 00:00:00

Attachments
Source file (290 bytes, text/x-csrc) 2014-05-08 20:45 UTC, Tavian Barnes	Details
Preprocessed source file (5.02 KB, text/plain) 2014-05-08 20:46 UTC, Tavian Barnes	Details
Run "gunzip t.i.gz; gcc -O2 -S -Wclobbered t.i" to reproduce the false positives (431.67 KB, application/gzip) 2024-04-29 04:26 UTC, Paul Eggert	Details
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Tavian Barnes 2014-05-08 20:45:07 UTC

Created attachment 32762 [details]
Source file

On gcc 4.9.0 (but not on 4.8.2 or earlier), the attached code gives two spurious warnings:

$ gcc -std=gnu99 -Wclobbered -O2 -c clobber_warning.c -pthread -save-temps
clobber_warning.c: In function ‘dmnsn_future_wait’:
clobber_warning.c:23:54: warning: variable ‘__cancel_routine’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Wclobbered]
     pthread_cleanup_push(cleanup_fn, &future->mutex);
                                                      ^
clobber_warning.c:23:103: warning: variable ‘__cancel_arg’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Wclobbered]
     pthread_cleanup_push(cleanup_fn, &future->mutex);

Those variables come from the expansion of the pthread_cleanup_{push,pop} macros, which look like this (reformatted a little):

do {
  __pthread_unwind_buf_t __cancel_buf;
  void (*__cancel_routine) (void *) = (cleanup_fn);
  void *__cancel_arg = (&future->mutex);
  int __not_first_call = __sigsetjmp ((struct __jmp_buf_tag *) (void *) __cancel_buf.__cancel_jmp_buf, 0);
  if (__builtin_expect ((__not_first_call), 0)) {
    __cancel_routine (__cancel_arg);
    __pthread_unwind_next (&__cancel_buf);
  }
  __pthread_register_cancel (&__cancel_buf);
  do {;
    pthread_cond_wait(&future->cond, &future->mutex);
    do { } while (0);
  } while (0);
  __pthread_unregister_cancel (&__cancel_buf);
  if (0)
    __cancel_routine (__cancel_arg);
} while (0);

The __cancel_routine and __cancel_arg variables are never modified, so I don't see how they can be clobbered.

Comment 1 Tavian Barnes 2014-05-08 20:46:11 UTC

Created attachment 32763 [details]
Preprocessed source file

Comment 2 Daniel Pouzzner 2015-04-02 05:48:30 UTC

Alas still there in 4.9.1 as bundled with Ubuntu 14.10.

Comment 3 Arthur LAMBERT 2015-09-10 10:16:55 UTC

Same problem for me after updating my gcc version from 4.8.4 to 4.9.3.

I have a piece of code using pthread_clean_push/pop which lead to the same warning :

error: variable ‘x’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Werror=clobbered]
error: variable ‘y’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Werror=clobbered]
error: variable ‘__cancel_routine’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Werror=clobbered]

Just need to comment the pthread_clean_X call to remove the warning.

Any update on this ?

Comment 4 Gladkov Dmitry 2016-03-24 06:38:55 UTC

The same has been seen by me in gcc 4.9.2 on SLES 12.
The __cancel_routine variable is never modifed, but I see the following:

error: variable __cancel_routine might be clobbered by "longjmp" or "vfork" [-Werror=clobbered]
 #define ThreadCleanUpPush_m(func,arg)     pthread_cleanup_push((func),\

Please, provide any workaround for the problem.

Comment 5 Hervé Codina 2016-11-07 16:24:25 UTC

Same problem using Tavian attached source file on gcc 4.9.3 and 5.2.0.

Note that the problem appears when optimization is turned on (flag -O1 or more)
Without optimization, no warning.

Comment 6 Yuri Gribov 2017-03-19 09:01:59 UTC

Hm, isn't this a valid warning about bad code in pthread.h? As can be see in https://github.com/gcc-mirror/gcc/blob/67b5f619e75b3085374598d6f0626ec00d4c9de6/gcc/function.c#L4389 , the warning is issued for variables which are alive after return from longjmp but not marked as volatile. Such variables will have undefined value according to C standard (http://pubs.opengroup.org/onlinepubs/7908799/xsh/longjmp.html).

Namely in this case, __cancel_arg and __cancel_routine are used after longjmp (it's the case of __not_first_call != 0) and are not marked volatile so their values are effectively undefined. I checked that adding volatile to them supresses warnings.

Comment 7 Tavian Barnes 2017-03-20 01:30:37 UTC

> the warning is issued for variables which are alive after return from longjmp but not marked as volatile. Such variables will have undefined value according to C standard (http://pubs.opengroup.org/onlinepubs/7908799/xsh/longjmp.html).

But this condition is not met:

> - They are changed between the setjmp() invocation and longjmp() call.

Comment 8 Yuri Gribov 2017-03-20 06:49:28 UTC

(In reply to Yuri Gribov from comment #6)
> the warning is issued for variables which are alive after return
> from longjmp

Meant setjmp of course.

Comment 9 Yuri Gribov 2017-03-20 07:07:05 UTC

(In reply to Tavian Barnes from comment #7)
> But this condition is not met:
> 
>> - They are changed between the setjmp() invocation and longjmp() call.

I think it is - __cancel_arg is assigned inside a while loop and can thus be overwritten on successive iterations (compiler doesn't know that longjmp will only be from the same iteration).

Comment 10 Tavian Barnes 2017-03-20 18:28:23 UTC

> I think it is - __cancel_arg is assigned inside a while loop

Specifically a do { } while(0); loop, which obviously has only one iteration.

Comment 11 Yuri Gribov 2017-03-21 18:26:28 UTC

(In reply to Tavian Barnes from comment #10)
> > I think it is - __cancel_arg is assigned inside a while loop
> 
> Specifically a do { } while(0); loop, which obviously has only one iteration.

Actually I was talking about surrounding while ((double)future->progress/future->total < progress)...

Comment 12 Paul Eggert 2017-11-20 01:19:17 UTC

(In reply to Yuri Gribov from comment #11)
> (In reply to Tavian Barnes from comment #10)
> > > I think it is - __cancel_arg is assigned inside a while loop
> > 
> > Specifically a do { } while(0); loop, which obviously has only one iteration.
> 
> Actually I was talking about surrounding while
> ((double)future->progress/future->total < progress)...

The variables in question do not survive from one iteration to the next of the surrounding while loop, so they cannot contribute to a setjmp/longjmp problem. The code looks like this:

  while ((double)future->progress/future->total < progress) {
    ...
    void (*__cancel_routine) (void *) = (cleanup_fn);
    void *__cancel_arg = (&future->mutex);
    if (__sigsetjmp (((struct __jmp_buf_tag *) (void *)
		      __cancel_buf.__cancel_jmp_buf),
		     0)) {
      __cancel_routine (__cancel_arg);
      ...
    }
    ...
  }

As the addresses of the locals do not escape and they are never assigned to after initialization and they do not survive until the next call to __sigsetjmp, the warnings are false alarms.

Possibly GCC is hoisting the locals out of the loop, incorrectly transforming the above code into this:

  void (*__cancel_routine) (void *);
  void *__cancel_arg;
  while ((double)future->progress/future->total < progress) {
    ...
    __cancel_routine = (cleanup_fn);
    __cancel_arg = (&future->mutex);
    if (__sigsetjmp (((struct __jmp_buf_tag *) (void *)
		      __cancel_buf.__cancel_jmp_buf),
		     0)) {
      __cancel_routine (__cancel_arg);
      ...
    }
    ...
  }

where the warning would be valid.

Also see Bug 48968 which has similar symptoms.

Comment 13 Florian Weimer 2017-11-21 19:22:42 UTC

This is not simply a spurious warning (the variables in question are only assigned once and their address is not taken, so C semantics do not allow that they are clobbered by setjmp+longjmp).

I compiled the reproducer with

    pthread_cleanup_pop(1);

to see the cleanup action on the non-cancellation path, too.  All compilation with -fpic to avoid yet another GCC 7 weirdness (movl $cleanup_fn; %eax, call *%rax).

GCC 4.8.5 20150623 (Red Hat 4.8.5-22) produces:

        call    pthread_cond_wait@PLT
        movq    %r14, %rdi
        call    __pthread_unregister_cancel@PLT
        movq    %rbx, %rdi
        call    cleanup_fn@PLT

GCC 7.2.1 20170829 (Red Hat 7.2.1-1) gives me:

        movq    cleanup_fn@GOTPCREL(%rip), %rax
        movq    %rax, 32(%rsp)
…
        call    pthread_cond_wait@PLT
        movq    %rbx, %rdi
        call    __pthread_unregister_cancel@PLT
        movq    32(%rsp), %rax
        movq    40(%rsp), %rdi
        call    *%rax

This is quite unsatisfying from a security hardening point of view because we now have an unencrypted function pointer on the stack.  This situation is somewhat analogous to Windows SEH, which turned out very problematic.  (The jmp_buf on the stack isn't ideal, but at least the PC value in it is mangled.)

Comment 14 Paul Eggert 2017-11-25 18:15:03 UTC

Also please see related reports Bug 21161, Bug 48968, Bug 54561, Bug 65041, and Bug 83162. The last-listed one also is a regression, perhaps induced by the fancier optimization in recent GCC versions. I suspect that the bugs should be merged.

Comment 15 Eric Gallager 2017-11-25 21:57:15 UTC

(In reply to Paul Eggert from comment #14)
> Also please see related reports Bug 21161, Bug 48968, Bug 54561, Bug 65041,
> and Bug 83162. The last-listed one also is a regression, perhaps induced by
> the fancier optimization in recent GCC versions. I suspect that the bugs
> should be merged.

I can merge them once we choose which of them to use as the base bug.

Comment 16 Joseph S. Myers 2017-12-14 18:32:02 UTC

I think the -Wclobbered warning probably needs to be reimplemented on GIMPLE, in a way that actually looks at whether it's possible for a variable to be set after the returns-twice call and before the variable goes out of scope, but without depending on the variable being assigned to a pseudo and basing things on the properties of that pseudo.

Comment 17 Yury Gribov 2017-12-14 18:43:32 UTC

(In reply to Joseph S. Myers from comment #16)
> I think the -Wclobbered warning probably needs to be reimplemented on
> GIMPLE

FYI I tried implementing this in GIMPLE earlier this year but it was complicated by the fact that GCC created a single abnormal dispatcher block for all setjmps in a function which made tracking data dependencies impossible.

Comment 18 Eric Gallager 2017-12-30 02:23:08 UTC

(In reply to Eric Gallager from comment #15)
> (In reply to Paul Eggert from comment #14)
> > Also please see related reports Bug 21161, Bug 48968, Bug 54561, Bug 65041,
> > and Bug 83162. The last-listed one also is a regression, perhaps induced by
> > the fancier optimization in recent GCC versions. I suspect that the bugs
> > should be merged.
> 
> I can merge them once we choose which of them to use as the base bug.

OK decided; merging into bug 21161 as per Paul Eggert's comments in bug 48968

*** This bug has been marked as a duplicate of bug 21161 ***

Comment 19 Jeffrey A. Law 2018-01-02 19:20:55 UTC

Note you lost the regression marker when this was made a duplicate of 21161.  So it's unlikely anyone would have looked at it until the next release cycle.

My understanding from Florian is that at least some of the issues here are regressions, so I've added the marker to 21161.

Comment 20 Eric Gallager 2018-01-02 23:48:51 UTC

(In reply to Jeffrey A. Law from comment #19)
> Note you lost the regression marker when this was made a duplicate of 21161.
> So it's unlikely anyone would have looked at it until the next release cycle.
> 
> My understanding from Florian is that at least some of the issues here are
> regressions, so I've added the marker to 21161.

Oops, sorry, thanks for fixing it

Comment 21 Jeffrey A. Law 2018-01-02 23:53:36 UTC

No problem Eric.  I'm monitoring on behalf of Florian who'd really like to see this fixed for gcc-8.

Actually just noticed it still wasn't showing up in the queries.  It didn't have a target milestone set either, which I've just fixed.

Comment 22 Jeffrey A. Law 2018-02-21 19:34:12 UTC

I don't think this is a duplicate of 21161.

AFAICT the core of the problem here is the multiple static assignments to the pseudos holding cancel_arg and cancel_routine.

They set the objects to the same value, but the multiple-assignment nature is a key aspect of the longjmp clobbering analysis.

I think the multiple assignment largely steps from the PHIs in this block:

;;   basic block 5, loop depth 0, count 536656163 (estimated locally), maybe hot
;;    prev block 4, next block 6, flags: (NEW, REACHABLE, IRREDUCIBLE_LOOP, VISITED)
;;    pred:       4 [always (guessed)]  count:268328082 (estimated locally) (FALLTHRU,IRREDUCIBLE_LOOP,EXECUTABLE)
;;                6 [always (guessed)]  count:1716613 (estimated locally) (ABNORMAL,DFS_BACK,IRREDUCIBLE_LOOP,EXECUTABLE)
;;                3 [50.0% (guessed)]  count:268328082 (estimated locally) (IRREDUCIBLE_LOOP,FALSE_VALUE,EXECUTABLE)
  # __cancel_routine_9(ab) = PHI <cleanup_fn(4), __cancel_routine_10(ab)(6), cleanup_fn(3)>
  # __cancel_arg_12(ab) = PHI <_1(4), __cancel_arg_13(ab)(6), _1(3)>
  _24 = __sigsetjmp (&__cancel_buf.__cancel_jmp_buf, 0);
  goto <bb 7>; [99.96%]


This happens in CCP1 and FRE1 for cancel_routine and cancel_arg respectively. (replacinging a cancel_{arg,routine} with an equivalent).   Once replaced, it's exceedingly hard to undo.  One could easily argue that we shouldn't to the replacement in an abnormal PHI.

Regardless, this is separate from 21161.  There is a slim chance that fixing 21161 is a requirement to fix this BZ, but they are not duplicates AFAICT.

Comment 23 Jeffrey A. Law 2018-02-21 23:59:20 UTC

Note that if we fix the abnormal handler to target the block after the setjmp rather than the setjmp itself all the problems magically disappear.  That's actually a more accurate CFG and arguably the right thing to do.

The concern is whether or not there are assumptions, particularly in the gimple->rtl phase that fixing the CFG would break.

Comment 24 Jakub Jelinek 2018-10-26 10:12:23 UTC

GCC 6 branch is being closed

Comment 25 Richard Biener 2019-11-14 07:52:03 UTC

The GCC 7 branch is being closed, re-targeting to GCC 8.4.

Comment 26 Jakub Jelinek 2020-03-04 09:44:16 UTC

GCC 8.4.0 has been released, adjusting target milestone.

Comment 27 Jakub Jelinek 2021-05-14 09:47:12 UTC

GCC 8 branch is being closed.

Comment 28 Richard Biener 2021-06-01 08:06:17 UTC

GCC 9.4 is being released, retargeting bugs to GCC 9.5.

Comment 29 Richard Biener 2022-05-27 09:35:12 UTC

GCC 9 branch is being closed

Comment 30 Jakub Jelinek 2022-06-28 10:30:55 UTC

GCC 10.4 is being released, retargeting bugs to GCC 10.5.

Comment 31 Richard Biener 2023-07-07 10:30:17 UTC

GCC 10 branch is being closed.

Comment 32 Paul Eggert 2024-04-29 04:26:40 UTC

Created attachment 58065 [details]
Run "gunzip t.i.gz; gcc -O2 -S -Wclobbered t.i" to reproduce the false positives

I ran into this bug today when compiling GNU Emacs with gcc (GCC) 14.0.1 20240411 (Red Hat 14.0.1-0) on x86-64 (Fedora 40). I didn't see it with earlier GCC releases so I thought I'd attach a test case, derived from Emacs. Compile with:

gunzip t.i
gcc -O2 -S -Wclobbered t.i

and the incorrect diagnostics are:

t.i: In function ‘internal_lisp_condition_case’:
t.i:7969:15: warning: variable ‘sym’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Wclobbered]
 7969 |   Lisp_Object sym = XSYMBOL_WITH_POS (a)->sym;
      |               ^~~
t.i:94273:43: warning: argument ‘var’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Wclobbered]
94273 | internal_lisp_condition_case (Lisp_Object var, Lisp_Object bodyform,
      |                               ~~~~~~~~~~~~^~~

Comment 33 Andrew Pinski 2024-04-29 04:41:13 UTC

(In reply to Paul Eggert from comment #32)
> Created attachment 58065 [details]
> Run "gunzip t.i.gz; gcc -O2 -S -Wclobbered t.i" to reproduce the false
> positives
> 
> I ran into this bug today when compiling GNU Emacs with gcc (GCC) 14.0.1
> 20240411 (Red Hat 14.0.1-0) on x86-64 (Fedora 40). I didn't see it with
> earlier GCC releases so I thought I'd attach a test case, derived from
> Emacs. Compile with:

It is there for GCC 11.4.1 for me:
[apinski@xeond2 upstream-gcc-new]$ gcc ~/src/t.i -S -O2 -S -Wclobbered -fdump-tree-optimized
/home/apinski/src/t.i: In function ‘internal_lisp_condition_case’:
/home/apinski/src/t.i:7969:15: warning: variable ‘sym’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Wclobbered]
 7969 |   Lisp_Object sym = XSYMBOL_WITH_POS (a)->sym;
      |               ^~~
/home/apinski/src/t.i:94273:43: warning: argument ‘var’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Wclobbered]
94273 | internal_lisp_condition_case (Lisp_Object var, Lisp_Object bodyform,
      |                               ~~~~~~~~~~~~^~~
[apinski@xeond2 upstream-gcc-new]$ gcc ~/src/t.i -S -O2 -S -Wclobbered -fdump-tree-optimized --version
gcc (GCC) 11.4.1 20231218 (Red Hat 11.4.1-3)
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Comment 34 Richard Biener 2024-07-19 12:56:58 UTC

GCC 11 branch is being closed.