This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug target/50751] SH Target: Displacement addressing does not work for QImode and HImode
- From: "oleg dot endo at t-online dot de" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Sun, 23 Oct 2011 21:56:56 +0000
- Subject: [Bug target/50751] SH Target: Displacement addressing does not work for QImode and HImode
- Auto-submitted: auto-generated
- References: <bug-50751-4@http.gcc.gnu.org/bugzilla/>
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50751
--- Comment #4 from Oleg Endo <oleg.endo@t-online.de> 2011-10-23 21:56:56 UTC ---
Created attachment 25582
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25582
Experimental patch for mov.b with displacement addressing
> (In reply to comment #2)
>
> Welcome to the spill-failure-for-class-'R0_REGS' club :-)
The attached patch is an experimental example only. Please ignore the wrong
formatting, the comments, and the fact that it cripples some of the SH2A
support :}
It adds support for the 'mov.b @(disp, Rm), R0' and 'mov.b R0, @(disp, Rn)'
instructions. I haven't tested it fully; so far I have only built the CSiBE
set with -m4-single -ml.
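For illustration, the kind of source that these two instructions apply to is a byte access at a small constant offset from a pointer. The function below is a hypothetical sketch, not code taken from CSiBE; the name add_bytes and the offsets 1, 2 and 5 are assumptions picked only to show the access pattern, and the code the compiler actually emits for it depends on the options and on the state of the patch.

```c
/* Hypothetical example, not taken from CSiBE: byte loads and a byte
   store at small constant offsets from pointers.  On SH these are
   candidates for 'mov.b @(disp,Rm),R0' and 'mov.b R0,@(disp,Rn)'.  */
int
add_bytes (const signed char *a, const signed char *b, signed char *c)
{
  int sum = a[1] + b[2];     /* two byte loads, offsets 1 and 2 */
  c[5] = (signed char) sum;  /* one byte store, offset 5 */
  return sum;
}
```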
Although it results in suboptimal code like ...
        ...
        mov.l   @(4,r15),r2
        mov     r1,r0
        mov.b   r0,@(5,r2)
        rts
        mov     r1,r0        ! redundant
.. or like ...
        mov.b   @(1,r4),r0
        mov     r0,r4
        mov.b   @(2,r5),r0
        add     r0,r4        ! better: add r4, r0
        mov     r4,r0        ! not needed if add operands are swapped
        mov.b   r0,@(5,r6)
        rts
        mov     r4,r0        ! redundant
.. it already shows some code size improvements:
avg:                     -563.222222 / -0.942376 %
max: compiler            22804 -> 22928    +124 / +0.543764 %
min: OpenTCP-1.0.4       27069 -> 25989   -1080 / -3.989804 %
top 5 files
mpeg2dec-0.3.1                  libmpeg2/motion_comp
         6044 ->  4796   -1248 / -20.648577 %
libpng-1.2.5                    pngrtran
        19668 -> 18904    -764 /  -3.884482 %
linux-2.4.23-pre3-testplatform  arch/testplatform/kernel/traps
         6192 ->  5532    -660 / -10.658915 %
lwip-0.5.3.preproc              src/core/tcp_input
         5424 ->  5040    -384 /  -7.079646 %
libmspack                       test/cabextract_md5
        21780 -> 21424    -356 /  -1.634527 %
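The percentages above are consistent with the size difference divided by the original size. A minimal sketch of that arithmetic follows; the function name size_change_percent is made up for this example, and it is only an assumption that the CSiBE comparison computes the figures in exactly this way.

```c
/* Relative size change in percent; a negative value means the code
   got smaller.  'before' must be non-zero.  Hypothetical helper, not
   part of CSiBE.  */
double
size_change_percent (long before, long after)
{
  return 100.0 * (double) (after - before) / (double) before;
}
```

For example, size_change_percent (27069, 25989) gives roughly -3.99, which matches the OpenTCP-1.0.4 row.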
The R0 clobber in the movqi expander and the explicit usage of R0 in the
splits effectively disable some optimizations, but this is the only
thing I could get to work so far.
I've left the straightforward but non-working patterns in the patch as
comments for reference. Basically, without the R0 clobber in the movqi
expander it eventually ends up like this:
error: insn does not satisfy its constraints:
(insn 737 40 42 4 (set (reg:QI 10 r10)
        (mem/c:QI (plus:SI (reg:SI 1 r1 [386])
                (const_int 1 [0x1])) [0 *D.4946_20+0 S1 A8]))
     {*movqi_m_reg_disp_load}
     (nil))
internal compiler error: in reload_cse_simplify_operands, at postreload.c:403
I'm puzzled why the register allocator ignores the constraint "z" when it
starts to run out of registers. In the error case above it tries to produce
something like 'mov.b @(1,r1),r10' which of course is impossible.
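To spell out why that instruction cannot exist: in the basic SH encodings the displacement field is 4 bits wide and scaled by the access size, and the byte and word forms only accept R0 as the data register. The check below is a toy model of that rule, not GCC code; the function name and parameters are invented for illustration, and it deliberately ignores any SH2A-specific forms.

```c
#include <stdbool.h>

/* Toy model, not GCC code: can 'mov.<size> @(disp,Rm),Rn' (or the
   matching store) be encoded in the basic SH instruction set?
   access_size is in bytes (1, 2 or 4), disp is the byte offset, and
   data_regno is the number of the register being loaded or stored.  */
bool
sh_disp_move_encodable (int access_size, int disp, int data_regno)
{
  if (access_size != 1 && access_size != 2 && access_size != 4)
    return false;

  /* The displacement field is 4 bits wide and scaled by the size.  */
  if (disp < 0 || disp % access_size != 0 || disp / access_size > 15)
    return false;

  /* mov.b and mov.w with displacement only work with R0.  */
  if (access_size == 1 || access_size == 2)
    return data_regno == 0;

  /* mov.l with displacement accepts any general register.  */
  return data_regno >= 0 && data_regno <= 15;
}
```

With this model, sh_disp_move_encodable (1, 1, 10) is false, which corresponds to the rejected 'mov.b @(1,r1),r10', while the same access through R0 is accepted.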
Any hints are highly appreciated.