Bug 35519 - COMBINE repeating same matches and can SEG fault
Summary: COMBINE repeating same matches and can SEG fault
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 4.3.0
: P3 normal
Target Milestone: 4.3.1
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks: 35872
  Show dependency treegraph
 
Reported: 2008-03-09 23:51 UTC by Andy Hutchinson
Modified: 2008-08-22 01:17 UTC (History)
3 users (show)

See Also:
Host: i686-oc-cyqwin
Target: avr-unknown-none
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments
Patch for consideratiom towards a solution (492 bytes, patch)
2008-03-09 23:52 UTC, Andy Hutchinson
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Andy Hutchinson 2008-03-09 23:51:31 UTC
This problem potentiall affects all all targets

The data flow information that combine uses can cause Segmentation fault. I have found this with AVR experimental build but it would seem that it can affect any target.

The problem is  that the LOG_LINKS that combine creates from DF  can include multiple references between instruction pairs.

DF  will produce multiple reference between instructions if they share a register that decomposes into several smaller registers.

The multiple cross references are then used by Combine to select instruction pairs and triples to match. This results in repeat trials of the same instructions!


By was on an example, tThe RTL that triggered my problem was:

(insn 45 42 46 4 920625-1.c:55 (set (reg:SI 22 r22 [ temp.24 ])
       (mem:SI (reg/v/f:HI 71 [ alpha ]) [2 S4 A8])) 19 {*movsi} (nil))


(insn 46 45 47 4 920625-1.c:55 (set (reg:SI 18 r18)
       (mem:SI (plus:HI (reg:HI 68 [ ivtmp.18 ])
               (const_int 4 [0x4])) [2 S4 A8])) 19 {*movsi} (nil))


(insn 47 46 48 4 920625-1.c:55 (parallel [
           (set (reg:SI 22 r22)
               (mult:SI (reg:SI 22 r22)
                   (reg:SI 18 r18)))
           (clobber (reg:HI 26 r26))
           (clobber (reg:HI 30 r30))
       ]) 43 {*mulsi3_call} (expr_list:REG_DEAD (reg:SI 18 r18)
       (expr_list:REG_UNUSED (reg:HI 30 r30)
           (expr_list:REG_UNUSED (reg:HI 26 r26)
               (nil)))))



This is call to library function, and the parameter for instruction 47 are hard registers like SI:R22 - which is decomposed in DF as  R22,23,24 and 25.
DF marks all 4 sub parts in def/use chains (which seems entirely correct)

When DF information is transferred into LOG_LINKS we still have 4 references back to the definition in instructions 45 and 47. From gdb this was:

(gdb) print uid_log_links[47]
$8 = (rtx) 0x7ff140d0
(gdb) pr
(insn_list:REG_DEP_TRUE 45 (insn_list:REG_DEP_TRUE 45 (insn_list:REG_DEP_TRUE 45
(insn_list:REG_DEP_TRUE 45 (insn_list:REG_DEP_TRUE 46 (insn_list:REG_DEP_TRUE 4
6 (insn_list:REG_DEP_TRUE 46 (insn_list:REG_DEP_TRUE 46 (nil)))))))))


These multiple references causes COMBINE to try the same combination of instruction 45 and 47 multiple times ( thinking they are different instructions). In this case the match is tried 4 times - 3 more than needed. Thi part appears most benign - except for processing time/memory used.

BUT when combine tries three instructions, it can crash. In this example, combine ends up trying to combine 2 duplicate instruction 45 with 47:

I1=
(insn 45 42 46 4 920625-1.c:55 (set (reg:SI 22 r22 [ temp.24 ])
       (mem:SI (reg/v/f:HI 71 [ alpha ]) [2 S4 A8])) 19 {*movsi} (nil))
I2=
(insn 45 42 46 4 920625-1.c:55 (set (reg:SI 22 r22 [ temp.24 ])
       (mem:SI (reg/v/f:HI 71 [ alpha ]) [2 S4 A8])) 19 {*movsi} (nil))

I3=
(insn 47 46 48 4 920625-1.c:55 (parallel [
           (set (reg:SI 22 r22)
               (mult:SI (reg:SI 22 r22)
                   (reg:SI 18 r18)))
           (clobber (reg:HI 26 r26))
           (clobber (reg:HI 30 r30))
       ]) 43 {*mulsi3_call} (expr_list:REG_DEAD (reg:SI 18 r18)
       (expr_list:REG_UNUSED (reg:HI 30 r30)
           (expr_list:REG_UNUSED (reg:HI 26 r26)
               (nil)))))


Combine merges I1 into i3 and deletes I1. Combine notes that the life of R22 terminates in I2 and attempt to put a REG_DEAD note on I2 - except of course the deletion of I1 also deletes the identical i2. Segmentation fault occurs.
Comment 1 Andy Hutchinson 2008-03-09 23:52:48 UTC
Created attachment 15287 [details]
Patch for consideratiom towards a solution


Patch that removes duplicates when LOG_LINKS is created.
Comment 2 Steven Bosscher 2008-03-10 20:04:08 UTC
The patch makes adding log use an algorithm quadratic in the number of log links per insn.  It is probably better to:
1. build the log links.
2. filter out the duplicates as a post pass (and maybe sort them while at it?)
Comment 3 Andy Hutchinson 2008-03-10 22:24:51 UTC
Subject: Re:  COMBINE repeating same matches and can
 SEG fault

The quadratic nature  does not seem to be particularly problem with the 
data involved.

The log_links is build up incrementally. (with duplicates at present).

The patch does a "do over"  to check that a potential link is not one 
already recorded. So initially the list is 0 and grows (hence quadratic)
However, it is only truely quadratic if there are no duplicates.

Duplicates are grouped together, and if I understand correctly,  the 
last element added will always be checked first - so  then inner loop 
will terminate after 1 iteration if a duplicate is present.

The final  list is quite small (typically 2-3 elements in length) - 
directly reflecting the number of  links between  RTL instruction.

If instead the list is created - then pruned afterwards, we have a 
longer list. So the quadratic nature of the loop is now replaced by 
check/sorting. This could be better- but only if the final list is of a 
significant size that  can exploit non-quadratic searching

Given the overhead of adding/removing linked items, and the small length 
of list, I believe the patch is better.

If, however, the number of RTL operands could be much larger than I have 
assumed, then perhaps a change is needed. Personally I can't think of 
many insn  that refer to more than 3 other instructions.  But I could be 
wrong.

Andy




steven at gcc dot gnu dot org wrote:
> ------- Comment #2 from steven at gcc dot gnu dot org  2008-03-10 20:04 -------
> The patch makes adding log use an algorithm quadratic in the number of log
> links per insn.  It is probably better to:
> 1. build the log links.
> 2. filter out the duplicates as a post pass (and maybe sort them while at it?)
>
>
>   
Comment 4 Andy Hutchinson 2008-03-15 23:49:20 UTC
This bug also causes incorrect code and appears to be regression from 4.2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34916

The good news is that the fix is effective.

Anything else I can do to help expedite the implementation of the patch or alternate fix? 

Comment 5 Andy Hutchinson 2008-04-04 23:46:32 UTC
Subject: Bug 35519

Author: hutchinsonandy
Date: Fri Apr  4 23:45:46 2008
New Revision: 133920

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=133920
Log:
PR rtl-optimization/34916
PR middle-end/35519
* combine.c (create_log_links): Do not create duplicate LOG_LINKS
between instruction pairs

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/combine.c

Comment 6 Andy Hutchinson 2008-04-09 22:51:28 UTC
Subject: Bug 35519

Author: hutchinsonandy
Date: Wed Apr  9 22:50:42 2008
New Revision: 134152

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=134152
Log:
2008-04-09 Andy Hutchinson <hutchinsonandy@aim.com>

	PR rtl-optimization/34916
	PR middle-end/35519
	* combine.c (create_log_links): Do not create duplicate LOG_LINKS
	between instruction pairs

Modified:
    branches/gcc-4_3-branch/gcc/ChangeLog
    branches/gcc-4_3-branch/gcc/combine.c

Comment 7 Andy Hutchinson 2008-04-12 15:27:17 UTC
Fixed 4.3 and 4.4