This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug target/50856] New: ARM: suboptimal code for absolute difference calculation


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50856

             Bug #: 50856
           Summary: ARM: suboptimal code for absolute difference
                    calculation
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: siarhei.siamashka@gmail.com


gcc generates suboptimal code on ARM for operations of the form "abs(a - b)",
which are used, for example, in the Paeth PNG filter:
http://www.w3.org/TR/PNG-Filters.html
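
For context, here is a minimal sketch of the Paeth predictor as the PNG
specification describes it, to show where the absolute-difference pattern
occurs. The function name paeth_predictor is chosen here for illustration only;
real decoders may structure this differently.

```c
#include <stdlib.h>  /* abs */

/* Paeth predictor sketch (illustrative name, not taken from any library).
   a = byte to the left, b = byte above, c = byte to the upper left. */
unsigned char paeth_predictor(unsigned char a, unsigned char b, unsigned char c)
{
    int p  = a + b - c;   /* initial estimate */
    int pa = abs(p - a);  /* distance from the estimate to a */
    int pb = abs(p - b);  /* distance from the estimate to b */
    int pc = abs(p - c);  /* distance from the estimate to c */

    /* Return the neighbour nearest to the estimate, ties broken a, b, c. */
    if (pa <= pb && pa <= pc)
        return a;
    if (pb <= pc)
        return b;
    return c;
}
```

Each of the three abs() calls here is an absolute difference of small
non-negative values, which is the operation this report is about.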

Given the following test code:


int absolute_difference1(unsigned char a, unsigned char b)
{
    return a > b ? a - b : b - a;
}

int absolute_difference2(unsigned char a, unsigned char b)
{
    int tmp = a;
    if ((tmp -= b) < 0)
        tmp = -tmp;
    return tmp;
}


The current gcc svn trunk (r180383) generates the following code for -O2 and
-Os optimizations:

        .cpu arm10tdmi
        .eabi_attribute 27, 3
        .eabi_attribute 28, 1
        .fpu vfp
        .eabi_attribute 20, 1
        .eabi_attribute 21, 1
        .eabi_attribute 23, 3
        .eabi_attribute 24, 1
        .eabi_attribute 25, 1
        .eabi_attribute 26, 2
        .eabi_attribute 30, 4
        .eabi_attribute 34, 0
        .eabi_attribute 18, 4
        .file   "test.c"
        .text
        .align  2
        .global absolute_difference1
        .type   absolute_difference1, %function
absolute_difference1:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        cmp     r0, r1
        rsbhi   r0, r1, r0
        rsbls   r0, r0, r1
        bx      lr
        .size   absolute_difference1, .-absolute_difference1
        .align  2
        .global absolute_difference2
        .type   absolute_difference2, %function
absolute_difference2:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        rsb     r0, r1, r0
        cmp     r0, #0
        rsblt   r0, r0, #0
        bx      lr
        .size   absolute_difference2, .-absolute_difference2
        .ident  "GCC: (GNU) 4.7.0 20111024 (experimental)"
        .section        .note.GNU-stack,"",%progbits

Even for the quite explicit second variant (the 'absolute_difference2'
function), gcc does not generate the expected SUBS + NEGLT pair of
instructions; it emits a separate RSB and CMP instead of a single flag-setting
subtract. Also, on ARMv6-capable processors, a single USAD8 instruction could
be used here if both operands are known to be in the [0, 255] range and if the
high latency of this instruction can be hidden.
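
To illustrate the two alternatives mentioned above, here is a rough sketch. The
comment shows what the hand-written SUBS + NEGLT sequence would presumably look
like (NEG being the pseudo-instruction for RSB with #0). The C function is a
portable model of what USAD8 computes, namely the sum of absolute differences
of the four byte lanes. The names usad8_model and absolute_difference_usad8 are
hypothetical and exist only for this example; this is not a compiler intrinsic.

```c
#include <stdint.h>

/* Presumed hand-written code for absolute_difference2 (an assumption about
   what the report means by "SUBS + NEGLT"):

       subs    r0, r0, r1      @ r0 = a - b, flags set by the subtract
       rsblt   r0, r0, #0      @ negate if the result is negative
       bx      lr
*/

/* Portable model of USAD8: sum of |x_i - y_i| over the four byte lanes. */
uint32_t usad8_model(uint32_t x, uint32_t y)
{
    uint32_t sum = 0;
    for (int i = 0; i < 4; i++) {
        int xb = (int)((x >> (8 * i)) & 0xff);
        int yb = (int)((y >> (8 * i)) & 0xff);
        sum += (uint32_t)(xb > yb ? xb - yb : yb - xb);
    }
    return sum;
}

/* With both operands in [0, 255] the upper three lanes are zero, so the
   sum collapses to the absolute difference of the low bytes. */
int absolute_difference_usad8(unsigned char a, unsigned char b)
{
    return (int)usad8_model(a, b);
}
```

The model only shows why the range restriction matters: if either operand had
bits set above the low byte, the other lanes would contribute to the result.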

