User account creation filtered due to spam.

Bug 57503 - [5/6/7/8 Regression] Expand uses wrong multiply routine
Summary: [5/6/7/8 Regression] Expand uses wrong multiply routine
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 4.7.2
: P4 normal
Target Milestone: 5.5
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2013-06-02 08:54 UTC by Georg-Johann Lay
Modified: 2016-08-03 11:32 UTC (History)
2 users (show)

See Also:
Host:
Target: avr
Build:
Known to work:
Known to fail: 4.7.2, 4.8.1, 4.9.0
Last reconfirmed: 2013-10-07 00:00:00


Attachments
cecky.c (173 bytes, text/plain)
2013-06-02 08:54 UTC, Georg-Johann Lay
Details
.expand dump (1.94 KB, text/plain)
2013-06-02 09:00 UTC, Georg-Johann Lay
Details
cecky2.c: More test case (201 bytes, text/plain)
2013-06-02 17:32 UTC, Georg-Johann Lay
Details
cecky2.s: Addembler dump with -dP from 4.8.0 (experimental) 2013-03-06 (1.74 KB, text/plain)
2013-06-02 17:34 UTC, Georg-Johann Lay
Details
.165r.expand: dump file with -Os -mmcu=atmega168 -fno-expensive-optimizations (1.15 KB, text/plain)
2013-06-03 15:00 UTC, Georg-Johann Lay
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Georg-Johann Lay 2013-06-02 08:54:32 UTC
Created attachment 30241 [details]
cecky.c

== Test case ==

#include <stdlib.h>

long __attribute__((__noinline__))
rot (unsigned char v)
{
    unsigned vs = v * 255;
    return (long) vs * 254;
}

int main (void)
{
    long r = rot (129);

    if (r != 8355330)
        abort();
    exit (0);
}

This program hits abort if compiled with 4.7.2 or 4.8.0.

== Command line ==

avr-gcc -mmcu=atmega168 cecky.c -Os -save-temps -dp -funsigned-char -o cecky.elf -fdump-rtl-expand-details

== Configure == 

Configured with: ../../gcc.gnu.org/gcc-4_7-branch/configure --target=avr --prefix=/local/gnu/install/gcc-4.7 --enable-languages=c,c++ --disable-nls --disable-shared --with-dwarf2 --with-avrlibc=yes
Comment 1 Georg-Johann Lay 2013-06-02 09:00:15 UTC
Created attachment 30242 [details]
.expand dump

Notice that in rot(), long D.1484_5 is unused and instead the 32-bit value D.1482_3 is used.

BTW, what does "w*" mean?



;; Function rot (rot, funcdef_no=4, decl_uid=1471, cgraph_uid=4)

rot (unsigned char v)
{
  long int D.1484;
  long int D.1483;
  int D.1482;
  int D.1481;

  # BLOCK 2 freq:10000
  # PRED: ENTRY [100.0%]  (fallthru,exec)
  D.1481_2 = (int) v_1(D);
  D.1482_3 = v_1(D) w* 255;
  D.1484_5 = (long int) D.1482_3;
  D.1483_6 = D.1482_3 w* 254;
  return D.1483_6;
  # SUCC: EXIT [100.0%] 
}
Comment 2 Georg-Johann Lay 2013-06-02 09:02:47 UTC
(In reply to Georg-Johann Lay from comment #1)
> Created attachment 30242 [details]
> .expand dump
> 
> Notice that in rot(), long D.1484_5 is unused and instead the 32-bit value
> D.1482_3 is used.

16-bit, of course.  In the s-file you can see that __usmulhisi3 is used.  This is a widening 16-bit unsigned * 16-bit signed multiplication and not correct here.
Comment 3 Andrew Pinski 2013-06-02 15:49:58 UTC
I thought this was fixed for 4.8 by bug 54295.
Comment 4 Georg-Johann Lay 2013-06-02 17:32:44 UTC
Created attachment 30245 [details]
cecky2.c: More test case

(In reply to Andrew Pinski from comment #3)
> I thought this was fixed for 4.8 by bug 54295.

My 4.8 is dated some days before the release candidate, around 2013-03-06.  If I understand correctly, that 4.8.0 (experimental) contains the fix for PR54295 but still generates wrong code?

And I don't see a backport to 4.7.

Thus the fix for PR54295 is not complete, or this is an other, unrelated problem.
Comment 5 Georg-Johann Lay 2013-06-02 17:34:29 UTC
Created attachment 30246 [details]
cecky2.s: Addembler dump with -dP from 4.8.0 (experimental) 2013-03-06
Comment 6 Georg-Johann Lay 2013-06-03 15:00:41 UTC
Created attachment 30252 [details]
.165r.expand: dump file with -Os -mmcu=atmega168 -fno-expensive-optimizations

The tree-ssa part for widening_mul can be disabled by -fno-expensive-optimizations, but the code is still wrong.  Cf. the attached .expand dump of

long
func1 (unsigned char a, unsigned char b, int c)
{
    unsigned ab = a * b;
    return (long) ab * c;
}

ab is computed in insn 10 with zero-extended inputs:

(insn 8 5 9 2 (set (reg:HI 53 [ D.1447 ])
        (zero_extend:HI (reg/v:QI 49 [ a ]))) cecky.c:4 -1
     (nil))
(insn 9 8 10 2 (set (reg:HI 54 [ D.1447 ])
        (zero_extend:HI (reg/v:QI 50 [ b ]))) cecky.c:4 -1
     (nil))
(insn 10 9 11 2 (set (reg:HI 55 [ D.1447 ])
        (mult:HI (reg:HI 53 [ D.1447 ])
            (reg:HI 54 [ D.1447 ]))) cecky.c:4 -1

This is correct, but then in insn 11 ab gets sign-extended even though it is unsigned:

(insn 11 10 12 2 (set (reg:SI 56 [ D.1448 ])
        (sign_extend:SI (reg:HI 55 [ D.1447 ]))) cecky.c:5 -1
     (nil))

Insn combine combines this wrong extension into a widening multiplication.
Comment 7 Richard Biener 2013-09-09 08:14:41 UTC
Can you please investigate the behavior on current trunk and the top of the 4.8 branch?
Comment 8 Georg-Johann Lay 2013-10-07 09:19:51 UTC
(In reply to Richard Biener from comment #7)
> Can you please investigate the behavior on current trunk and the top of the
> 4.8 branch?

Using the code from comment #7 the problem persists

$ avr-gcc -S -Os pr57503.c -mmcu=atmega8 -dP


long
func1 (unsigned char a, unsigned char b, int c)
{
    unsigned ab = a * b;
    return (long) ab * c;
}


The code is a bit different but still wrong:


func1:
 ; (insn 10 5 25 (set (reg:HI 18 r18 [orig:44 D.1450 ] [44])
 ;         (mult:HI (zero_extend:HI (reg:QI 24 r24 [ a ]))
 ;             (zero_extend:HI (reg/v:QI 22 r22 [orig:50 b ] [50])))) pr57503.c:4 168 {umulqihi3}
 ;      (expr_list:REG_DEAD (reg:QI 24 r24 [ a ])
 ;         (expr_list:REG_DEAD (reg/v:QI 22 r22 [orig:50 b ] [50])
 ;             (nil))))
	mul r24,r22	 ;  10	umulqihi3	[length = 3]
	movw r18,r0
	clr __zero_reg__
 ; (insn 25 10 26 (set (reg:HI 26 r26)
 ;         (reg/v:HI 20 r20 [orig:51 c ] [51])) pr57503.c:5 82 {*movhi}
 ;      (expr_list:REG_DEAD (reg/v:HI 20 r20 [orig:51 c ] [51])
 ;         (nil)))
	movw r26,r20	 ;  25	*movhi/1	[length = 1]
 ; (insn 26 25 21 (set (reg:SI 22 r22)
 ;         (mult:SI (sign_extend:SI (reg:HI 18 r18))
 ;             (sign_extend:SI (reg:HI 26 r26)))) pr57503.c:5 231 {*mulhisi3_call}
 ;      (expr_list:REG_DEAD (reg:HI 26 r26)
 ;         (expr_list:REG_DEAD (reg:HI 18 r18)
 ;             (nil))))
	rcall __mulhisi3	 ;  26	*mulhisi3_call	[length = 1]
 ; (jump_insn 31 21 30 (return) pr57503.c:6 451 {return}
 ;      (nil)
 ;  -> return)
	ret	 ;  31	return	[length = 1]


Insn 26 sign-extends both inputs but R18 (unsigned ab) should be zero-extended.

Tested with SVN 203240

gcc version 4.9.0 20131007 (experimental) (GCC)
Comment 9 Georg-Johann Lay 2013-10-07 10:18:03 UTC
(In reply to Richard Biener from comment #7)
> Can you please investigate the behavior on current trunk and the top of the
> 4.8 branch?

Same applies to the 4.8 branch, again it is insn 26 that is wrong:

func1:
 ; (insn 10 5 25 (set (reg:HI 18 r18 [orig:44 D.1447 ] [44])
 ;         (mult:HI (zero_extend:HI (reg:QI 24 r24 [ a ]))
 ;             (zero_extend:HI (reg/v:QI 22 r22 [orig:50 b ] [50])))) pr57503.c:4 168 {umulqihi3}
 ;      (expr_list:REG_DEAD (reg:QI 24 r24 [ a ])
 ;         (expr_list:REG_DEAD (reg/v:QI 22 r22 [orig:50 b ] [50])
 ;             (nil))))
	mul r24,r22	 ;  10	umulqihi3	[length = 3]
	movw r18,r0
	clr __zero_reg__
 ; (insn 25 10 26 (set (reg:HI 26 r26)
 ;         (reg/v:HI 20 r20 [orig:51 c ] [51])) pr57503.c:5 82 {*movhi}
 ;      (expr_list:REG_DEAD (reg/v:HI 20 r20 [orig:51 c ] [51])
 ;         (nil)))
	movw r26,r20	 ;  25	*movhi/1	[length = 1]
 ; (insn 26 25 21 (set (reg:SI 22 r22)
 ;         (mult:SI (sign_extend:SI (reg:HI 18 r18))
 ;             (sign_extend:SI (reg:HI 26 r26)))) pr57503.c:5 231 {*mulhisi3_call}
 ;      (expr_list:REG_DEAD (reg:HI 26 r26)
 ;         (expr_list:REG_DEAD (reg:HI 18 r18)
 ;             (nil))))
	rcall __mulhisi3	 ;  26	*mulhisi3_call	[length = 1]
 ; (jump_insn 31 21 30 (return) pr57503.c:6 451 {return}
 ;      (nil)
 ;  -> return)
	ret	 ;  31	return	[length = 1]
	.size	func1, .-func1
	.ident	"GCC: (GNU) 4.8.2 20131007 (prerelease)"
Comment 10 Richard Biener 2014-06-12 13:46:58 UTC
The 4.7 branch is being closed, moving target milestone to 4.8.4.
Comment 11 Jakub Jelinek 2014-12-19 13:29:53 UTC
GCC 4.8.4 has been released.
Comment 12 Richard Biener 2015-06-23 08:19:05 UTC
The gcc-4_8-branch is being closed, re-targeting regressions to 4.9.3.
Comment 13 Jakub Jelinek 2015-06-26 19:55:02 UTC
GCC 4.9.3 has been released.
Comment 14 Richard Biener 2016-08-03 11:02:29 UTC
GCC 4.9 branch is being closed