detection of a variable being decremented to reach zero missed optimization. int main(uint16_t, uint16_t ); int main(uint16_t x, uint16_t y) { uint16_t z = x; while (x > y) { if ( --z == 0 ) return 1; x++; } return 0; } produces with gcc 4.8.0 and 4.8.1 and I expect 4.8.2 as well. compiled with -Os the code at 0x82 to 0x8a uses a compare against zero, when the subi and sbc, leave the zero flag set on a atmega8. 7a: 9c 01 movw r18, r24 7c: 68 17 cp r22, r24 7e: 79 07 cpc r23, r25 80: 38 f4 brcc .+14 ; 0x90 <main+0x16> 82: 21 50 subi r18, 0x01 ; 1 84: 31 09 sbc r19, r1 86: 21 15 cp r18, r1 88: 31 05 cpc r19, r1 8a: 29 f0 breq .+10 ; 0x96 <main+0x1c> 8c: 01 96 adiw r24, 0x01 ; 1 8e: f6 cf rjmp .-20 ; 0x7c <main+0x2> 90: 80 e0 ldi r24, 0x00 ; 0 92: 90 e0 ldi r25, 0x00 ; 0 94: 08 95 ret 96: 81 e0 ldi r24, 0x01 ; 1 98: 90 e0 ldi r25, 0x00 ; 0 9a: 08 95 ret in gcc/config/avr/avr.md the code for (define_insn "add<mode>3" no longer says it alters the zero of negative flag which the 4.7.2 branch did depending on which choice of add & adc or sub & sbc choice it used.
Created attachment 32326 [details] C test case Here is a valid test case. Compiled with avr-gcc 4.8.2 $ avr-gcc pr60486.c -S -Os -mmcu=atmega8 -dp we get: pr60486: movw r18,r24 ; 5 *movhi/1 [length = 1] .L2: cp r22,r24 ; 21 *cmphi/3 [length = 2] cpc r23,r25 brsh .L7 ; 22 branch [length = 1] subi r18,1 ; 13 addhi3_clobber/2 [length = 2] sbc r19,__zero_reg__ cp r18,__zero_reg__ ; 14 *cmphi/2 [length = 2] cpc r19,__zero_reg__ breq .L5 ; 15 branch [length = 1] adiw r24,1 ; 17 addhi3_clobber/1 [length = 1] rjmp .L2 ; 55 jump [length = 1] .L7: ldi r24,0 ; 7 *movhi/2 [length = 2] ldi r25,0 ret ; 49 return [length = 1] .L5: ldi r24,lo8(1) ; 6 *movhi/5 [length = 2] ldi r25,0 ret ; 51 return [length = 1] The superfluous insn is #14 (*cmphi). This worked with 4.7 with the output as follows: pr60486: movw r18,r24 ; 7 *movhi/1 [length = 1] rjmp .L2 ; 48 jump [length = 1] .L4: subi r18,1 ; 15 addhi3_clobber/2 [length = 2] sbc r19,__zero_reg__ breq .L5 ; 17 branch [length = 1] adiw r24,1 ; 19 addhi3_clobber/1 [length = 1] .L2: cp r22,r24 ; 23 *cmphi/3 [length = 2] cpc r23,r25 brlo .L4 ; 24 branch [length = 1] ldi r18,0 ; 9 *movhi/2 [length = 2] ldi r19,0 rjmp .L3 ; 50 jump [length = 1] .L5: ldi r18,lo8(1) ; 8 *movhi/5 [length = 2] ldi r19,0 .L3: movw r24,r18 ; 56 *movhi/1 [length = 1] ret ; 55 return [length = 1]
cc0 should be set appropriately by avr.c:notice_update_cc which eventually calls avr.c:avr_out_plus_1 to get cc0 for addhi3_clobber (insn #13).
Here is a smaller test case with similar artifact (insn #7): extern void foo (unsigned); char v; void pr_60486 (unsigned z) { if (--z == 0) v = 0; foo (z); } pr_60486: sbiw r24,1 ; 6 addhi3_clobber/1 [length = 1] sbiw r24,0 ; 7 *cmphi/1 [length = 1] brne .L9 ; 8 branch [length = 1] sts v,__zero_reg__ ; 10 movqi_insn/3 [length = 2] .L9: rjmp foo ; 14 call_insn/4 [length = 1]
details also posted on avrfreaks.net http://www.avrfreaks.net/index.php?name=PNphpBB2&file=viewtopic&p=1143389 the bug has existed since the code base copy from 4.7.2 to create the vanilla 4.8.0 so all the 4.8.[0,1,2] suffer this bug namely these lines in avr_out_plus are the culprits IMO. /* Work out the shortest sequence. */ avr_out_plus_1 (op, &len_minus, MINUS, &cc_plus, code_sat, sign, out_label); avr_out_plus_1 (op, &len_plus, PLUS, &cc_minus, code_sat, sign, out_label); notice the &cc_plus and &cc_minus being the wrong way around.
Target issue. (In reply to Darryl Piper from comment #4) > details also posted on avrfreaks.net I am well aware of this post. But GCC reports are supposed to be self contained. So here we go: Root cause is swapped cc_plus and cc_minus in avr.c:avr_out_plus : /* Work out the shortest sequence. */ avr_out_plus_1 (op, &len_minus, MINUS, &cc_plus, code_sat, sign, out_label); avr_out_plus_1 (op, &len_plus, PLUS, &cc_minus, code_sat, sign, out_label); Thus there are also cases where wrong code is generated like the following one: extern void foo (unsigned); char v; void bar (unsigned long z) { if (++z == 0) v = 0; foo (z); } Output is missing the comparison because addsi3 does not set cc0 in a usable way when is uses the PLUS alternative: bar: movw r26,r24 ; 20 *movsi/1 [length = 2] movw r24,r22 adiw r24,1 ; 6 addsi3/2 [length = 3] adc r26,__zero_reg__ adc r27,__zero_reg__ brne .L5 ; 8 branch [length = 1] sts v,__zero_reg__ ; 10 movqi_insn/3 [length = 2] .L5: rjmp foo ; 14 call_insn/4 [length = 1]
Author: gjl Date: Thu Mar 13 09:16:53 2014 New Revision: 208532 URL: http://gcc.gnu.org/viewcvs?rev=208532&root=gcc&view=rev Log: PR target/60486 * config/avr/avr.c (avr_out_plus): Swap cc_plus and cc_minus in calls of avr_out_plus_1. Modified: trunk/gcc/ChangeLog trunk/gcc/config/avr/avr.c
Author: gjl Date: Thu Mar 13 09:35:42 2014 New Revision: 208534 URL: http://gcc.gnu.org/viewcvs?rev=208534&root=gcc&view=rev Log: Backport from 2014-03-13 trunk r208532. PR target/60486 * config/avr/avr.c (avr_out_plus): Swap cc_plus and cc_minus in calls of avr_out_plus_1. Modified: branches/gcc-4_8-branch/gcc/ChangeLog branches/gcc-4_8-branch/gcc/config/avr/avr.c
Fixed for 4.8.3.