Bug 42497 - Generate conditional tail calls.
Summary: Generate conditional tail calls.
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization
Version: 4.5.0
Importance: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
Keywords: missed-optimization
Depends on:
Reported: 2009-12-25 08:55 UTC by Carrot
Modified: 2015-09-02 02:45 UTC
CC List: 5 users

See Also:
Host: i686-linux
Target: arm-eabi
Build: i686-linux
Known to work:
Known to fail:
Last reconfirmed: 2009-12-30 17:32:59


Description Carrot 2009-12-25 08:55:15 UTC
Compile the following code with options -march=armv5te -O2:

extern void *memcpy(void *dst, const void *src, int n);

void *memmove(void *dst, const void *src, int n)
{
   const char *p = src;
   char *q = dst;
   if (__builtin_expect(q < p, 1)) {
          return memcpy(dst, src, n);
   } else {
          int i=0;
          for (; i<n; i++)
               q[i] = p[i];
   }
   return dst;
}

gcc generates:

memmove:
        cmp     r1, r0
        str     r4, [sp, #-4]!
        mov     r3, r0
        mov     ip, r1
        mov     r4, r2
        bls     .L8
        ldmfd   sp!, {r4}
        b       memcpy
.L8:
        cmp     r2, #0
        movgt   r2, #0
        ble     .L4
.L5:
        ldrb    r1, [ip, r2]    @ zero_extendqisi2
        strb    r1, [r3, r2]
        add     r2, r2, #1
        cmp     r2, r4
        bne     .L5
.L4:
        mov     r0, r3
        ldmfd   sp!, {r4}
        bx      lr

The if block is expected to be more frequent than the else block, but the generated code is not very efficient. Better code could be:

        cmp     r1, r0              
        bhi     memcpy
        str     r4, [sp, #-4]!
        mov     r3, r0
        mov     ip, r1
        mov     r4, r2
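
For anyone who wants to re-check this on a current compiler, here is a minimal self-contained sketch of the test case. The name memmove_testcase is an assumption (renamed so it does not clash with the C library's memmove), and size_t replaces the int length so the standard memcpy prototype from <string.h> can be used. It is not the exact source from the report, only an equivalent reproduction.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name: renamed so it does not clash with the libc memmove. */
void *memmove_testcase(void *dst, const void *src, size_t n)
{
    const char *p = src;
    char *q = dst;

    /* Hot path: ideally this becomes one conditional branch to memcpy,
       taken before any callee-saved register is pushed. */
    if (__builtin_expect(q < p, 1))
        return memcpy(dst, src, n);

    /* Cold path: byte-wise forward copy, as in the report. */
    for (size_t i = 0; i < n; i++)
        q[i] = p[i];
    return dst;
}
```

Compiling this with -O2 -S (plus -march=armv5te on an ARM cross compiler, as in the report) and reading the assembly should show whether the hot path still pushes and pops r4 around the branch to memcpy.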
Comment 1 Ramana Radhakrishnan 2009-12-30 17:32:59 UTC
The underlying problem appears to be that GCC cannot generate conditional tail calls (or, in this case, conditional calls) with -fno-optimize-sibling-calls. I don't read this as a problem with __builtin_expect per se, but rather as GCC being unable to generate a conditional tail call or conditional call.

A simpler test case is the following:

void foo (int x)
{
  if (x)
    bar ();
    baz ();
}

This is also not just a target problem, but probably one for the RTL optimizers rather than anywhere else.

Comment 2 Richard Biener 2009-12-31 15:33:01 UTC
What do you expect with -fno-optimize-sibling-calls ...
Comment 3 Peter Cordes 2015-09-02 02:45:58 UTC
This bug is still present with gcc 5.2 -O3 (which does include -foptimize-sibling-calls).

void fire_special_event(void);
void conditional_call(int cond) {  if(cond) fire_special_event();   }

The above code compiles to the following (x86-64, gcc 5.2 -O3):

	testl	%edi, %edi
	jne	.L4
	rep ret
.L4:
	jmp	fire_special_event

This sequence would be better:
	testl	%edi, %edi
	jne	fire_special_event

godbolt link: https://goo.gl/0K6EZx
Later functions in that listing are related to

Is there a linker limitation on relocations for conditional-branch targets that aren't part of the current compilation unit? Neither clang 3.7 nor icc 13 does any better than gcc. It seems to work for me when I modify the asm by hand to use jnz _Z18fire_special_eventv and link against a separately compiled definition.
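
A minimal sketch for re-testing the conditional_call case from this comment without a second source file. The stub body, the events_fired counter, and the fired_count helper are assumptions added only so the example links and can be checked on its own. The noinline attribute (a GCC/Clang extension) keeps the call from being inlined away; a same-file callee may still be treated differently from the extern one in the report, so for a faithful reproduction the stub should live in a separate translation unit.

```c
/* Counter is an assumption, added so the calls are observable. */
static int events_fired;

/* Stub for the callee; noinline keeps the call in the generated code. */
__attribute__((noinline)) void fire_special_event(void)
{
    events_fired++;
}

void conditional_call(int cond)
{
    /* The call is in tail position, so it is a candidate for a single
       conditional jump to fire_special_event. */
    if (cond)
        fire_special_event();
}

/* Hypothetical helper used only for checking behaviour. */
int fired_count(void)
{
    return events_fired;
}
```

Building with -O3 -S and inspecting conditional_call should show whether the compiler emits a conditional jump directly to the callee or still branches to a local label that then jumps.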