This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: optimization/8092: cross-jump triggers too often



On Sun, 6 Oct 2002, Bernd Paysan wrote:

> 
> Compile with -O2 -fno-cross-jump to see how the initializing code for "global 
> expressions" like stdout/stderr gets cluttered all over the place. The dump 
> obtained with -dG shows which expression is copied:
> 
> GCSE pass 1
> 
> SET hash table (11 buckets, 3 entries)
> Index 0 (hash value 5)
>   (set (reg/v:SI 60)
>     (const_int 12 [0xc]))
> Index 1 (hash value 6)
>   (set (reg/v:SI 61)
>     (const_int 123 [0x7b]))
> Index 2 (hash value 7)
>   (set (reg/v:SI 62)
>     (const_int 1234 [0x4d2]))
> 
> 
> LDST list: 
>   Pattern (  0): (mem/f:SI (symbol_ref:SI ("stderr")) [5 stderr+0 S4 A32])
> 	 Loads : (insn_list 42 (nil))
> 	Stores : (nil)
> 
>   Pattern (  0): (mem/f:SI (symbol_ref:SI ("stdout")) [5 stdout+0 S4 A32])
> 	 Loads : (insn_list 28 (nil))
> 	Stores : (nil)
> 
>   Pattern (  0): (mem/f:SI (symbol_ref:SI ("stdin")) [5 stdin+0 S4 A32])
> 	 Loads : (insn_list 14 (nil))
> 	Stores : (nil)
> 
> 
> Expression hash table (15 buckets, 5 entries)
> Index 0 (hash value 7)
>   (mem/f:SI (symbol_ref:SI ("stdin")) [5 stdin+0 S4 A32])
> Index 1 (hash value 2)
>   (mem:SI (reg/v/f:SI 59) [4 S4 A32])
> Index 2 (hash value 12)
>   (plus:SI (reg/v/f:SI 59)
>     (const_int 4 [0x4]))
> Index 3 (hash value 7)
>   (mem/f:SI (symbol_ref:SI ("stdout")) [5 stdout+0 S4 A32])
> Index 4 (hash value 8)
>   (mem/f:SI (symbol_ref:SI ("stderr")) [5 stderr+0 S4 A32])
> 
> PRE: redundant insn 14 (expression 0) in bb 1, reaching reg is 77
> PRE: redundant insn 28 (expression 3) in bb 2, reaching reg is 78
> PRE: redundant insn 42 (expression 4) in bb 3, reaching reg is 79
> PRE/HOIST: edge (0,1), copy expression 0
> PRE/HOIST: end of bb 1, insn 139, copying expression 3 to reg 78
> PRE/HOIST: edge (1,2), copy expression 3
> PRE/HOIST: end of bb 1, insn 142, copying expression 4 to reg 79
> PRE/HOIST: edge (1,3), copy expression 4
> PRE/HOIST: end of bb 2, insn 145, copying expression 4 to reg 79
> PRE/HOIST: edge (2,3), copy expression 4
> PRE/HOIST: end of bb 3, insn 148, copying expression 3 to reg 78
> PRE/HOIST: edge (3,2), copy expression 3
> PRE/HOIST: end of bb 4, insn 151, copying expression 3 to reg 78
> PRE/HOIST: edge (4,2), copy expression 3
> PRE/HOIST: end of bb 4, insn 154, copying expression 4 to reg 79
> PRE/HOIST: edge (4,3), copy expression 4
> PRE/HOIST: end of bb 5, insn 157, copying expression 3 to reg 78
> PRE/HOIST: edge (5,2), copy expression 3
> PRE/HOIST: end of bb 5, insn 160, copying expression 4 to reg 79
> PRE/HOIST: edge (5,3), copy expression 4
> PRE/HOIST: end of bb 6, insn 163, copying expression 3 to reg 78
> PRE/HOIST: edge (6,2), copy expression 3
> PRE/HOIST: end of bb 6, insn 166, copying expression 4 to reg 79
> PRE/HOIST: edge (6,3), copy expression 4
> 
> PRE GCSE of foo, pass 1: 4072 bytes needed, 3 substs, 21 insns created
> 
> Apparently, GCSE copies constant expressions that seem to be "sufficiently 
> complicated" all over the place. Although, in this case, the target 
> architecture implements (mem/f (symbol_ref xxx)) with the same instruction as 
> (const_int xxx), which is not copied. If I "poison" symbol_ref expressions in 
> gcse.c (so that they aren't considdered), I still get a few redundant 
> expressions spread all over the place, e.g. increments of the stack pointer 
> (which happen fairly often, but doing so in advance is stupid).
> 
> As far as I understood the code, gcse.c copies "partially redundant" 
> expressions throughout the entire CFG
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Not the entire CFG, only where it is necessary to make them fully 
available, so it can remove a later use.

In your case, all blocks have abnormal edges to all other blocks. This 
requires it to end up inserting everywhere because all blocks lead 
to all other blocks.
It's still optimal, given this CFG.
You just "know" more than the compiler, so you think it's wrong.


> to "make them fully redundant". That's 
> what's happening here. How can an expression that occurs only once be 
> "redundant"?

It's not, it appears load motion is being stupid, and not checking 
if a load occurs more than once.

> And why doesn't gcse.c try to remove copied expressions 
> afterwards after it finds out that the attempt to improve the code has 
> failed? 
Because it's lifetime optimal, so it can't possibly have made the 
expression live longer than it used to, and thus, it has always improved 
the code (in terms of number of calculations).

Now, if you want it *cost* optimal, that's a slightly different 
algorithm.

>And why is there no cost calculation? If you copy an expression from 
> one block to m blocks, the costs are multiplied by m.

Not necessarily. You assume the code is completely straight line.
GCSE should never make an expression be calculated *more times* than it 
used to be (probably some corner case i'm forgetting about).
That's the whole point.
GCSE makes the number of calculations along a given path optimal in 
number, not in cost.

Cost optimal code motion often makes the lifetime of an expression way too 
long.

I suggest you read papers on Lazy Code Motion, and cost optimal code 
motion.



> 
> 




Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]