PATCH: PR target/44588: Very inefficient 8bit mod/div
Uros Bizjak
ubizjak@gmail.com
Tue Jun 22 19:12:00 GMT 2010
On Tue, 2010-06-22 at 11:27 -0700, H.J. Lu wrote:
> >> This patch adds an 8bit divmod pattern for x86. X86 8bit divide
> >> instructions return their result in AX with
> >>
> >> AL <- Quotient
> >> AH <- Remainder
> >>
> >> This patch models it and properly extends quotient. Tested
> >> on Intel64 with -m64 and -m32. There are no regressions.
> >> OK for trunk?
> >>
> >> BTW, there is only one divb used in the gcc compilers, in
> >> subreg_get_info. The old code is
> >>
> >> movzbl mode_size(%r13), %edi
> >> movzbl mode_size(%r14), %esi
> >> xorl %edx, %edx
> >> movl %edi, %eax
> >> divw %si
> >> testw %dx, %dx
> >> jne .L1194
> >>
> >> The new one is
> >>
> >> movzbl mode_size(%r13), %edi
> >> movl %edi, %eax
> >> divb mode_size(%r14)
> >> movzbl %ah, %eax
> >> testb %al, %al
> >> jne .L1194
> >>
> >
> > Hm, something is not combined correctly, I'd say "testb %ah, %ah" is
> > optimal in the second case.
> >
>
> Here is another update adjusted for mov pattern changes in i386.md.
>
> 8bit result is stored in
>
> AL <- Quotient
> AH <- Remainder
>
> If we use AX for quotient in 8bit divmod pattern, we have to make
> sure that AX is valid for quotient. We have to extend AL with UNSPEC
> since AH isn't part of the quotient. Instead, I use AL for quotient and
> use UNSPEC_MOVQI_EXTZH to extract remainder from AH. Quotient
> access can be optimized very nicely. If remainder is used, we may have
> an extra move for UNSPEC_MOVQI_EXTZH. I think this is a reasonable
> compromise.
Why do we need to reinvent movqi_extzv_2?
I guess that <u>divqi3 has to be implemented as a multiple-set divmod
pattern using strict_low_part subregs to describe exactly which
subreg the quotient and the remainder go into.
Uros.
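
For illustration only, here is a minimal C sketch of the register layout
discussed above: after an unsigned 8bit divide, AL holds the quotient and
AH holds the remainder, so both results sit in the two byte subregs of AX.
The names divb_model, low_byte and high_byte are hypothetical and exist
only for this example; this is not GCC code and not the proposed pattern.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not GCC code: returns the value AX would hold
   after an unsigned 8bit divide of the 16bit dividend `ax` by
   `divisor`.  */
uint16_t divb_model(uint16_t ax, uint8_t divisor)
{
    /* The real instruction raises a divide error in both of these
       cases, so the model simply refuses them.  */
    assert(divisor != 0);
    assert(ax / divisor <= 0xFF);

    uint8_t quotient = (uint8_t)(ax / divisor);
    uint8_t remainder = (uint8_t)(ax % divisor);

    /* AL (low byte) <- quotient, AH (high byte) <- remainder.  */
    return (uint16_t)(((uint16_t)remainder << 8) | quotient);
}

/* Low byte of AX, i.e. AL: the quotient.  */
uint8_t low_byte(uint16_t ax) { return (uint8_t)(ax & 0xFF); }

/* High byte of AX, i.e. AH: the remainder.  */
uint8_t high_byte(uint16_t ax) { return (uint8_t)(ax >> 8); }
```

Dividing 200 by 7 in this model leaves 28 in the low byte and 4 in the
high byte. Reading the high byte corresponds to the extra extract move
mentioned in the quoted text, while the quotient is available directly
in the low byte.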