Bug 8603 - [Alpha] s?addl pattern doesn't work
Summary: [Alpha] s?addl pattern doesn't work
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 3.2.1
: P3 enhancement
Target Milestone: 4.3.5
Assignee: Uroš Bizjak
URL: http://gcc.gnu.org/ml/gcc-patches/200...
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2002-11-16 02:16 UTC by 161432
Modified: 2009-08-14 08:11 UTC (History)
2 users (show)

See Also:
Host: alpha-linux
Target: alpha-linux
Build:
Known to work: 4.3.5 4.4.2 4.5.0
Known to fail: 4.3.4 4.4.1
Last reconfirmed: 2007-11-14 19:10:09


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description 161432 2002-11-16 02:16:03 UTC
[ Reported to the Debian BTS as report #161432.
  Please CC 161432@bugs.debian.org on replies.
  Log of report can be found at http://bugs.debian.org/161432 ]

[ gcc version was gcc-3.2 branch 20020913 ]

the s?addl and s?subl patterns don't work, as illustrated by these programs:

falk@juist:/tmp% cat test.c
int f(int x, int y) { return 4 * x + y; }
int g(int x) { return 3 * x; }
falk@juist:/tmp% gcc-3.2 -O3 -c test.c
falk@juist:/tmp% objdump -d test.o

test.o:     file format elf64-alpha

Disassembly of section .text:

0000000000000000 <f>:
   0:   41 04 11 42     s4addq  a0,a1,t0
   4:   00 00 3f 40     addl    t0,zero,v0
   8:   01 80 fa 6b     ret
   c:   00 00 fe 2f     unop

0000000000000010 <g>:
  10:   61 05 10 42     s4subq  a0,a0,t0
  14:   00 00 3f 40     addl    t0,zero,v0
  18:   01 80 fa 6b     ret
  1c:   00 00 fe 2f     unop

Release:
3.2.1 (Debian) (Debian unstable)

Environment:
System: Debian GNU/Linux (unstable)
Architecture: alpha
host: alpha-linux
Configured with: /home/packages/gcc/3.2/gcc-3.2-3.2.1ds5/src/configure -v --enable-languages=c,c++,java,f77,proto,objc,ada --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-gxx-include-dir=/usr/include/c++/3.2 --enable-shared --with-system-zlib --enable-nls --without-included-gettext --enable-__cxa_atexit --enable-clocale=gnu --enable-java-gc=boehm --enable-objc-gc alpha-linux
Comment 1 Dara Hazeghi 2003-05-26 05:13:09 UTC
Hello,

with gcc 3.2.3, I get the reported behavior. With gcc 3.3 branch and mainline (20030524) I the 
following result. Is this correct? Thanks,

Dara

test.o:     file format elf64-alpha

Disassembly of section .text:

0000000000000000 <f>:
   0:   41 04 11 42     s4addq  a0,a1,t0
   4:   00 00 e1 43     sextl   t0,v0
   8:   01 80 fa 6b     ret
   c:   00 00 fe 2f     unop    

0000000000000010 <g>:
  10:   61 05 10 42     s4subq  a0,a0,t0
  14:   00 00 e1 43     sextl   t0,v0
  18:   01 80 fa 6b     ret
  1c:   00 00 fe 2f     unop    
Comment 2 falk.hueffner 2003-05-26 06:34:32 UTC
Subject: [Bug target/8603] [Alpha] s?addl pattern doesn't work

> ------- Additional Comments From dhazeghi@yahoo.com  2003-05-26 05:13 -------
> with gcc 3.2.3, I get the reported behavior. With gcc 3.3 branch and
> mainline (20030524) I the following result. Is this correct? Thanks,
> 
> test.o:     file format elf64-alpha
> 
> Disassembly of section .text:
> 
> 0000000000000000 <f>:
>    0:   41 04 11 42     s4addq  a0,a1,t0
>    4:   00 00 e1 43     sextl   t0,v0
>    8:   01 80 fa 6b     ret
>    c:   00 00 fe 2f     unop    

No, that is actually the same. The two instructions should be folded
into a single s4addl a0,a1,v0. There is a pattern for that, it even
looks correct to me and used to work, but it doesn't for some
reason...

Comment 3 Andrew Pinski 2003-05-26 14:21:47 UTC
see submitter comment.
Comment 4 Rask Ingemann Lambertsen 2007-11-14 19:10:09 UTC
For f(), combine wants a pattern to match

(set (reg:DI 76)
    (sign_extend:DI (subreg:SI (plus:DI (subreg:DI (mult:SI (reg:SI 16 $16 [ x ])
                        (const_int 4 [0x4])) 0)
                (reg:DI 17 $17 [ y ])) 0)))

but the closest one is

(define_insn "*saddl_se"
  [(set (match_operand:DI 0 "register_operand" "=r,r")
        (sign_extend:DI
         (plus:SI (mult:SI (match_operand:SI 1 "reg_not_elim_operand" "r,r")
                           (match_operand:SI 2 "const48_operand" "I,I"))
                  (match_operand:SI 3 "sext_add_operand" "rI,O"))))]

and similarily for g() where the "*ssubl_se" pattern doesn't match.

I wonder where the (subreg:DI (mult:SI ...)) part comes from. That can't be right.
Comment 5 Uroš Bizjak 2009-01-07 17:56:33 UTC
Following patch that changes addsi3 and subsi3 expander constraint fixes this problem:

--cut here--
Index: alpha.md
===================================================================
--- alpha.md	(revision 143157)
+++ alpha.md	(working copy)
@@ -261,7 +261,7 @@
   [(set (match_operand:SI 0 "register_operand" "")
 	(plus:SI (match_operand:SI 1 "reg_or_0_operand" "")
 		 (match_operand:SI 2 "add_operand" "")))]
-  "! optimize"
+  ""
   "")
 
 (define_insn "*addsi_internal"
@@ -622,7 +622,7 @@
   [(set (match_operand:SI 0 "register_operand" "")
 	(minus:SI (match_operand:SI 1 "reg_or_0_operand" "")
 		  (match_operand:SI 2 "reg_or_8bit_operand" "")))]
-  "! optimize"
+  ""
   "")
 
 (define_insn "*subsi_internal"
--cut here--

With the patch (gcc -O2):

f:
	s4addl $16,$17,$0
	ret $31,($26),1

g:
	s4subl $16,$16,$0
	ret $31,($26),1

This regression was introduced by:

Sat Feb 23 08:42:47 2002  Richard Kenner  <kenner@vlsi1.ultra.nyu.edu>

	* expr.c (store_expr): When converting expression to promoted
	equivalent type, allow using SUBREG_REG of TARGET as the target
	of the expansion of EXP.
	* loop.c (basic_induction_var, case SUBREG): Always look inside.
	* config/alpha/alpha.c (rtx_equiv_function_matters): Delete decl.
	(alpha_emit_set_const): Handle SImode when can't make new pseudos.
	(alpha_emit_set_const_1, alpha_sa_mask): Use no_new_pseudos.
	* config/alpha/alpha.md (addsi3, subsi3): Don't use if optimizing.

And the comment above addsi3 says:

;; Don't say we have addsi3 if optimizing.  This generates better code.  We
;; have the anonymous addsi3 pattern below in case combine wants to make it.

So, is this still true? Can somebody benchmark this patch? We can perhaps look at csibe numbers.
Comment 6 Matt Turner 2009-08-11 02:38:53 UTC
To show how worthwhile this trivial patch is -- the following table shows the number of times s{4,8}{add,sub}l are used in building the Linux kernel (2.6.31-rc5) with unpatched and patched gcc (4.3.4).

		unpatched	patched
s4addl		53		395
s8addl		79		132
s4subl		0		111
s8subl		0		35

This patch also causes gcc to produce exactly the same output as Compaq's C compiler (this is a good thing!) for the two test cases given in the report.

For example --
Test case:
UP1500 gcc-tests # cat s_addl.c 
int f(int x, int y) { return 4 * x + y; }
int g(int x) { return 3 * x; }

Results with unpatched gcc-4.3.x

UP1500 gcc-tests # gcc-unpatched -O3 -mcpu=ev67 -c s_addl.c 
UP1500 gcc-tests # objdump -d s_addl.o 

s_addl.o:     file format elf64-alpha

Disassembly of section .text:

0000000000000000 <f>:
   0:   40 04 11 42     s4addq  a0,a1,v0
   4:   00 00 e0 43     sextl   v0,v0         <-- unnecessary
   8:   01 80 fa 6b     ret
   c:   00 00 fe 2f     unop

0000000000000010 <g>:
  10:   60 05 10 42     s4subq  a0,a0,v0
  14:   00 00 e0 43     sextl   v0,v0         <-- unnecessary
  18:   01 80 fa 6b     ret
  1c:   00 00 fe 2f     unop

Results with patched gcc-4.3.x

UP1500 gcc-tests # gcc-patched -O3 -mcpu=ev67 -c s_addl.c 
UP1500 gcc-tests # objdump -d s_addl.o 

s_addl.o:     file format elf64-alpha

Disassembly of section .text:

0000000000000000 <f>:
   0:   40 00 11 42     s4addl  a0,a1,v0
   4:   01 80 fa 6b     ret
   8:   1f 04 ff 47     nop
   c:   00 00 fe 2f     unop

0000000000000010 <g>:
  10:   60 01 10 42     s4subl  a0,a0,v0
  14:   01 80 fa 6b     ret
  18:   1f 04 ff 47     nop
  1c:   00 00 fe 2f     unop

Results with Compaq C compiler (what we're trying to replicate)

UP1500 gcc-tests # ccc -fast -host -c s_addl.c 
UP1500 gcc-tests # objdump -d s_addl.o 

s_addl.o:     file format elf64-alpha

Disassembly of section .text:

0000000000000000 <f>:
   0:   40 00 11 42     s4addl  a0,a1,v0
   4:   01 80 fa 6b     ret
   8:   00 00 fe 2f     unop
   c:   00 00 fe 2f     unop

0000000000000010 <g>:
  10:   60 01 10 42     s4subl  a0,a0,v0
  14:   01 80 fa 6b     ret


Please add to gcc-4.3.x and gcc-4.4.x.
Comment 7 Uroš Bizjak 2009-08-11 06:46:32 UTC
(In reply to comment #6)

> Please add to gcc-4.3.x and gcc-4.4.x.

OK, I will post the patch, thanks for the analysis.

Comment 8 uros 2009-08-11 17:06:04 UTC
Subject: Bug 8603

Author: uros
Date: Tue Aug 11 17:05:38 2009
New Revision: 150654

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=150654
Log:
	PR target/8603
	* config/alpha/alpha.md (addsi3): Remove expander.
	(addsi3): Rename from *addsi3_internal insn pattern.
	(subsi3): Remove expander.
	(subsi3): Rename from *subsi3_internal insn pattern.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/alpha/alpha.md

Comment 9 uros 2009-08-13 18:57:32 UTC
Subject: Bug 8603

Author: uros
Date: Thu Aug 13 18:57:15 2009
New Revision: 150723

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=150723
Log:
	PR target/8603
	* config/alpha/alpha.md (addsi3): Remove expander.
	(addsi3): Rename from *addsi3_internal insn pattern.
	(subsi3): Remove expander.
	(subsi3): Rename from *subsi3_internal insn pattern.


Modified:
    branches/gcc-4_4-branch/gcc/ChangeLog
    branches/gcc-4_4-branch/gcc/config/alpha/alpha.md

Comment 10 uros 2009-08-14 07:41:35 UTC
Subject: Bug 8603

Author: uros
Date: Fri Aug 14 07:41:17 2009
New Revision: 150735

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=150735
Log:
	Backport from mainline:
	2009-08-11  Uros Bizjak  <ubizjak@gmail.com>

	PR target/8603
	* config/alpha/alpha.md (addsi3): Remove expander.
	(addsi3): Rename from *addsi3_internal insn pattern.
	(subsi3): Remove expander.
	(subsi3): Rename from *subsi3_internal insn pattern.


Modified:
    branches/gcc-4_3-branch/gcc/ChangeLog
    branches/gcc-4_3-branch/gcc/config/alpha/alpha.md

Comment 11 Uroš Bizjak 2009-08-14 08:11:09 UTC
Fixed.