Bug 25957

Summary: -fstack-protector code on i386/x86_64 can be improved.
Product: gcc Reporter: Andi Kleen <andi-gcc>
Component: middle-endAssignee: Not yet assigned to anyone <unassigned>
Status: UNCONFIRMED ---    
Severity: enhancement CC: gcc-bugs
Priority: P3    
Version: 4.2.0   
Target Milestone: ---   
Host: x86_64-linux Target: x86_64-linux
Build: Known to work:
Known to fail: Last reconfirmed:

Description Andi Kleen 2006-01-25 11:27:07 UTC
Compiling this simple test case with

GNU C version 4.2.0 20060118 (experimental) (x86_64-unknown-linux-gnu)

extern void f2(char *s);
void f(void) 
{
        char x[100];
        f2(x);
}

i get

...
        xorq    __stack_chk_guard(%rip), %rax
        jne     .L5
        addq    $120, %rsp
        ret
.L5:
        .p2align 4,,5
        call    __stack_chk_fail

Suggestions for improvement:
- It shouldn't use p2align 4,,5 for the __stack_chk_fail trampoline
because that wastes space in very infrequent code
- It should use jne to jump the function directly because it 
should never return (when it is called the stack is compromised
and it would be a security hole)
Comment 1 Andrew Pinski 2006-01-25 13:21:38 UTC
Nope and Nope.
The alignment is because well -falign-jumps is default.

__stack_chk_fail is a noreturn function so we don't sib call it because it would be too hard for debugging.
Second jumping directly via an indirect branch is wrong as jumps only have a certain range.
Comment 2 Andi Kleen 2006-01-25 14:15:21 UTC
You're wrong. On i386 and x86-64 call and conditional jumps have the same 
range.

That you're using the normal align-jumps for such uncommon code is a bug in my opinion and that is why I opened this request. It is a waste of space
for code that is known to be uncommon at compile time like this.

The debugging argument seems bogus in this case too. You'll see the
original function in the backtrace, but that is ok here.

Comment 3 Andrew Pinski 2006-01-25 14:21:53 UTC
Actually you will not see the correct backtrace, you will see the function which calls f and not f itself.

The alignment only saves instruction space it does nothing really
Comment 4 Andi Kleen 2006-01-25 14:30:47 UTC
Yes that's the whole point of the bug. To save code space and precious
icache.
Comment 5 Andrew Pinski 2006-01-25 14:33:14 UTC
Use -Os for that.
Comment 6 Andi Kleen 2006-01-25 14:38:28 UTC
-Os is only when there should be a trade off between size and performance.
But there isn't any performance consideration here because this code
is only executed on program abort.

(is there a button to reassign the bug to a non braindead bugmaster?)
Comment 7 Andrew Pinski 2006-01-25 14:43:20 UTC
The alignment does nothing, repeat nothing even if it is not executed that much, it does not change anything because it is last in the function.  Try compiling more than this simple example and you will see that it always last for the emitted asm.  Use -fno-align-jumps which is not default on x86.

And as I have mentioned before jumping directly to the other function is not useful at all (I already filed that bug and was shot down by RTH).
Comment 8 Steven Bosscher 2006-01-25 14:46:18 UTC
It seems to me that not aligning jumps for known infrequent jumps may be useful.  Especially when you get so many of them as you do with ssp.
Comment 9 Andi Kleen 2006-01-25 14:55:51 UTC
Again the alignment wastes precious icache which is enough reason to get
rid of it. If there are a lot of such functions it will also lead to 
worse paging behaviour on big programs.

Can you explain again why it was shot down? Because of the missing frame
in the stack backtrace or something else?
Comment 11 Andi Kleen 2011-10-08 23:27:02 UTC
I just checked and the problem is still there with

 4.7.0 20111002

       xorq    %fs:40, %rax
        jne     .L4
        addq    $120, %rsp
        .cfi_remember_state
        .cfi_def_cfa_offset 8
        ret
.L4:
        .cfi_restore_state
        .p2align 4,,6
        call    __stack_chk_fail
        .cfi_endproc
.LFE0:


unnecessary wasteful alignment for the call to
abort. The basic block should be marked cold.
Comment 12 Andi Kleen 2014-09-26 17:45:54 UTC
Problem is still there in 

gcc50 (GCC) 4.9.0 20130617 (experimental)


The stack protector edge should be cold and alignment disabled.