The following C test case

int wmul (char a, char b)
{
    return a * (char) (b << 3);
}

$ avr-gcc wmul.c -S -Os -mmcu=atmega8 -dp

produces with current avr-gcc:

wmul:
    ldi r25,lo8(8)      ; 25 movqi_insn/2 [length = 1]
    muls r22,r25        ; 26 mulqihi3 [length = 3]
    movw r22,r0
    clr __zero_reg__
    muls r24,r22        ; 17 mulqihi3 [length = 3]
    movw r24,r0
    clr __zero_reg__
    ret                 ; 29 return [length = 1]
    .ident "GCC: (GNU) 4.8.0 20121004 (experimental)"

avr-gcc-4.7 was smarter with its code:

wmul:
    lsl r22             ; 10 *ashlqi3/5 [length = 3]
    lsl r22
    lsl r22
    muls r24,r22        ; 12 mulqihi3 [length = 3]
    movw r22,r0
    clr __zero_reg__
    movw r24,r22        ; 31 *movhi/1 [length = 1]
    ret                 ; 30 return [length = 1]
    .ident "GCC: (GNU) 4.7.2"

The 4.7 code is faster, smaller and has smaller register pressure.
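For reference, the arithmetic the test case asks for can be written out with fixed-width types. This is only a sketch for checking the semantics on a host compiler; it is not part of the PR. It assumes `signed char` in place of plain `char` (whose signedness is implementation-defined), relies on GCC's wrap-around behaviour for out-of-range narrowing conversions, and uses a hypothetical helper name `wmul_ref`. Both factors are 8-bit values, so the product always fits in 16 bits, which is why one `muls` after the shift should be enough.

```c
#include <stdint.h>

/* The test case from the report, with char made explicitly signed.  */
int wmul (signed char a, signed char b)
{
  return a * (signed char) (b << 3);
}

/* Hypothetical reference: shift, truncate to 8 bits, then perform a
   single signed 8x8->16 multiplication.  The narrowing to int8_t is
   implementation-defined in ISO C; GCC reduces the value modulo 2^8.  */
int16_t wmul_ref (int8_t a, int8_t b)
{
  int8_t shifted = (int8_t) (uint8_t) ((unsigned) b << 3);
  return (int16_t) (a * shifted);
}
```

With a GCC host compiler the two functions are expected to agree for every pair of inputs with non-negative `b`; left-shifting a negative `b` in `wmul` is formally undefined in ISO C before C23, although GCC handles it as a two's-complement shift.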
The following code has the same problem:

#include <avr/io.h>
#include <stdint.h>

uint16_t b;
uint8_t a;

template<typename A, typename B>
B Mul(const A a, const B b) {
    static constexpr uint8_t shift = (sizeof(B) - sizeof(A)) * 8;
    return static_cast<A>(b >> shift) * a ;
}

int main() {
    return Mul(a, b);
}

With 4.6.4 it produces:

main:
    lds r24,a
    lds r25,b+1
    mul r25,r24
    movw r24,r0
    clr r1
    ret

With the current 12.2 the optimization is missed:

main:
    lds r24,b+1
    ldi r25,0
    lds r18,a
    movw r20,r24
    mul r18,r20
    movw r24,r0
    mul r18,r21
    add r25,r0
    clr __zero_reg__
    ret

Interestingly, the following code produces optimal code with 12.2 as well:

template<typename A, typename B>
B MulX(const A a, const B b) {
    static const uint8_t shift = (sizeof(B) - sizeof(A)) * 8;
    return static_cast<A>((b >> shift) + 1) * a ;
}
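To make the expected code shape easier to see, here is a plain C rendition of what `Mul` computes when instantiated with A = uint8_t and B = uint16_t. The function name `mul_hi_u8` is hypothetical and the block is only a host-side sketch of the semantics: the high byte of `b` is multiplied by `a`, so both factors are 8 bits wide and a single unsigned 8x8->16 `mul` can produce the full result.

```c
#include <stdint.h>

/* Hypothetical C equivalent of Mul<uint8_t, uint16_t>:
   shift = (sizeof (uint16_t) - sizeof (uint8_t)) * 8 = 8.  */
uint16_t mul_hi_u8 (uint8_t a, uint16_t b)
{
  uint8_t hi = (uint8_t) (b >> 8);   /* high byte of b */
  /* The product of two 8-bit values is at most 255 * 255 = 65025,
     so it always fits in 16 bits.  */
  return (uint16_t) (hi * a);
}
```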
The original problem looks to be fixed on mainline. Can you confirm this Wilhelm? If so we can close this PR.

With -Os -mmcu=atmega8, we currently generate (the desired):

wmul:
    lsl r22
    lsl r22
    lsl r22
    muls r22,r24
    movw r24,r0
    clr __zero_reg__
    ret
(In reply to Roger Sayle from comment #2)
> The original problem looks to be fixed on mainline. Can you confirm this
> Wilhelm? If so we can close this PR.
>
> With -Os -mmcu=atmega8, we currently generate (the desired):
> wmul:
>     lsl r22
>     lsl r22
>     lsl r22
>     muls r22,r24
>     movw r24,r0
>     clr __zero_reg__
>     ret

Yes, this seems to be fixed in mainline.
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:

https://gcc.gnu.org/g:f006d1a5a1e136be29c78b96c8742ebd3710f4d0

commit r13-7197-gf006d1a5a1e136be29c78b96c8742ebd3710f4d0
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Sun Apr 16 13:03:10 2023 +0100

    [Committed] New test case gcc.target/avr/pr54816.c

    PR target/54816 is now fixed on mainline. This adds a test case to
    check that it doesn't regress in future. Tested with a cross compiler
    to avr-elf. Committed as obvious.

    2023-04-16  Roger Sayle  <roger@nextmovesoftware.com>

    gcc/testsuite/ChangeLog
            PR target/54816
            * gcc.target/avr/pr54816.c: New test case.
This is now fixed on mainline [but was present in GCC 12.2], and a new test case added to ensure this stays fixed.
(In reply to Roger Sayle from comment #5)
> This is now fixed on mainline [but was present in GCC 12.2], and a new test
> case added to ensure this stays fixed.

Hi Roger, I am having a problem with your new test case in gcc.target/avr/pr54816.c:

When we run the testsuite for any device other than ATmega8, it fails because of the explicit -mmcu=atmega8 in dg-options:

xgcc: error: specified option '-mmcu' more than once
compiler exited with status 1
FAIL: gcc.target/avr/pr54816.c (test for excess errors)

Usually, one runs the testsuite several times for a variety of devices like ATmega128, ATtiny40, etc., so an explicit -mmcu in dg-options is to be avoided. (The -mmcu is provided by the board description file, like atmega128-sim.exp.)

If a test requires a specific device, then place it in gcc.target/avr/mmcu/. avr-mmcu.exp takes care of removing the unwanted -mmcu so that test cases there can set -mmcu as they wish.

In your case, as you scan the assembly for the "muls" instruction, you need some -mmcu that supports MULS (like ATmega8). Hence, could you move pr54816.c to the gcc.target/avr/mmcu subfolder?

Alternatively, you can extend lib/target-supports.exp with a new feature like check_effective_target_avr_mul. The new function could be similar to the already existing check_effective_target_avr_tiny, but check for the built-in macro __AVR_HAVE_MUL__. Then use the new function as a filter, as in

/* { dg-do compile { target { avr_mul } } } */
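As an illustration of the second alternative, a device-independent test file might look roughly like the sketch below. It is not the committed test: the `avr_mul` effective-target keyword is the one proposed above and would first have to be added to lib/target-supports.exp, and the `scan-assembler` pattern is an assumption about what the test should check. No -mmcu appears in dg-options, so the board description file remains free to supply it.

```c
/* Hypothetical sketch: run only on devices that define __AVR_HAVE_MUL__.
   "avr_mul" is the proposed, not yet existing, effective-target keyword.  */
/* { dg-do compile { target { avr_mul } } } */
/* { dg-options "-Os" } */

int wmul (char a, char b)
{
  return a * (char) (b << 3);
}

/* Assumed check: the widening multiply should be emitted as "muls".  */
/* { dg-final { scan-assembler "muls" } } */
```

The other route, moving the file to gcc.target/avr/mmcu/, keeps the explicit -mmcu=atmega8 and needs no change to target-supports.exp.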
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:

https://gcc.gnu.org/g:911db256258004b2eec9a0ca3fa47f9bcb5c5856

commit r14-168-g911db256258004b2eec9a0ca3fa47f9bcb5c5856
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Sat Apr 22 20:57:28 2023 +0100

    [Committed] Move new test case to gcc.target/avr/mmcu/pr54816.c

    AVR test cases that specify a specific -mmcu option need to be placed
    in the gcc.target/avr/mmcu subdirectory. Moved thusly.

    2023-04-22  Roger Sayle  <roger@nextmovesoftware.com>

    gcc/testsuite/ChangeLog
            PR target/54816
            * gcc.target/avr/pr54816.c: Move to...
            * gcc.target/avr/mmcu/pr54816.c: ... here.