37471 – Move invariant pulls too many cmps out of a loop

Status:	NEW

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	rtl-optimization
Version:	4.4.0

Importance:	P3 enhancement
Target Milestone:	---
Assignee:	Not yet assigned to anyone

URL:
Keywords:	missed-optimization

Depends on:
Blocks:

Reported:	2008-09-10 22:56 UTC by Andrew Pinski
Modified:	2023-01-16 02:43 UTC
CC List:	3 users

See Also:	108412
Host:
Target:	powerpc--*
Build:
Known to work:
Known to fail:	3.4.0, 4.0.0, 4.8.3, 4.9.3, 6.0
Last reconfirmed:	2021-12-18 00:00:00

Description Andrew Pinski 2008-09-10 22:56:49 UTC

I was looking at the code generated for an internal benchmark when I noticed this.  With the current trunk on powerpc, we get an mfcr, which is slow on the Cell (and most likely on other PPCs too), because invariant motion pulls too many cmps out of some loops.
A simple example:
void g(void);

int f(int b, int l, int d, int c, int e)
{
  int i = 0;

  for (i = 0; i < l; i++)
  {
    if (b)
      g();
    if (c)
      g();
    if (d)
      g();
    if (e)
      g();
  }
  return 0;
}
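
As a rough source-level picture of what the invariant motion is doing here, each loop-invariant test is evaluated once before the loop and its result is kept live across every iteration. This is only a sketch: the real transformation happens on RTL compare insns, and the names f_original, f_hoisted and g_calls are made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the external g(): it only counts calls. */
int g_calls;
void g(void) { g_calls++; }

/* The loop as written in the testcase: each condition is tested in place. */
void f_original(int b, int l, int d, int c, int e)
{
  for (int i = 0; i < l; i++) {
    if (b) g();
    if (c) g();
    if (d) g();
    if (e) g();
  }
}

/* Source-level approximation of the hoisted form: the four compares are
   done once up front, so four results stay live across the whole loop,
   including across every call to g(). */
void f_hoisted(int b, int l, int d, int c, int e)
{
  const bool b_nz = (b != 0);
  const bool c_nz = (c != 0);
  const bool d_nz = (d != 0);
  const bool e_nz = (e != 0);

  for (int i = 0; i < l; i++) {
    if (b_nz) g();
    if (c_nz) g();
    if (d_nz) g();
    if (e_nz) g();
  }
}
```

Both versions call g() the same number of times. The only difference is that the hoisted form needs four compare results to stay live around the loop body, which is where the extra pressure on the condition register comes from.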

Comment 1 Andrew Pinski 2008-09-14 04:42:38 UTC

Note this is not a regression as loop.c did the same.

Comment 2 Andrew Pinski 2008-09-21 00:32:12 UTC

While looking at a different bug dealing with invariant motion, I noticed that estimate_reg_pressure_cost does not take into account the mode of the new register.  This seems like a big issue.  Also, init_set_costs always uses SImode, which is not a good representation of register pressure in general.
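
To illustrate why ignoring the mode matters, here is a toy model. It is not GCC's estimate_reg_pressure_cost or init_set_costs; every name, the spill cost, and the register counts are assumptions made for the sketch. The counts are chosen to resemble the call-preserved registers on a 32-bit PowerPC ELF target: r14 to r31 for GPRs, cr2 to cr4 for condition register fields.

```c
/* Toy model only: NOT GCC's actual cost code. All names are hypothetical. */
enum reg_class { CLASS_GPR, CLASS_CR, NUM_CLASSES };

/* Assumed number of registers per class that survive a call in the loop. */
static const int avail_regs[NUM_CLASSES] = { 18, 3 };

#define SPILL_COST 4 /* arbitrary illustrative cost per spilled value */

/* Class-blind estimate: every new invariant is counted as if it were a
   GPR-sized (SImode-like) value. */
int blind_cost(const int used[NUM_CLASSES], const int added[NUM_CLASSES])
{
  int total_new = 0;
  for (int c = 0; c < NUM_CLASSES; c++)
    total_new += added[c];
  int excess = used[CLASS_GPR] + total_new - avail_regs[CLASS_GPR];
  return excess > 0 ? excess * SPILL_COST : 0;
}

/* Class-aware estimate: each new invariant is charged against the class
   its mode would actually live in. */
int class_aware_cost(const int used[NUM_CLASSES], const int added[NUM_CLASSES])
{
  int cost = 0;
  for (int c = 0; c < NUM_CLASSES; c++) {
    int excess = used[c] + added[c] - avail_regs[c];
    if (excess > 0)
      cost += excess * SPILL_COST;
  }
  return cost;
}
```

With four hoisted compare results and only three call-preserved CR fields in this model, the class-blind estimate reports no cost while the class-aware one reports a spill.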

Comment 3 Andrew Pinski 2010-03-02 18:53:39 UTC

Still happens on the trunk.

Comment 4 Steven Bosscher 2010-03-02 21:58:22 UTC

Can you post the output .s of gcc, and the .s you expect?

Comment 5 Andrew Pinski 2010-03-02 22:02:22 UTC

It is pretty obvious from doing a cross build.  We get a couple of sets of:
        lwz 0,112(1)
        rlwinm 0,0,4,0xffffffff
        mtcrf 1,0
        rlwinm 0,0,28,0xffffffff
        beq 7,.L6

This loads r0 from the stack, puts it into a condition register field, and then branches.  Note the rlwinm's are there to rotate the value around so that it is in the correct location for the mtcrf.
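
For anyone not used to reading this sequence, here is a small C simulation of it. It is a sketch: the helper names rotl32, mtcrf_sim and cr_field_eq are made up, and the CR is modelled as a plain 32-bit word with field 0 in the top nibble and field 7 in the bottom one. With an all-ones mask, rlwinm is a plain rotate left, and "mtcrf 1,0" copies the low four bits of r0 into CR field 7.

```c
#include <stdint.h>

/* Rotate left, as rlwinm does when its mask is 0xffffffff. */
uint32_t rotl32(uint32_t x, unsigned n)
{
  n &= 31;
  return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Model of mtcrf FXM,rS: for each bit set in the 8-bit mask fxm
   (0x80 selects field 0, 0x01 selects field 7), copy the matching
   4-bit field of src into cr. */
uint32_t mtcrf_sim(uint32_t cr, unsigned fxm, uint32_t src)
{
  for (int field = 0; field < 8; field++) {
    if (fxm & (0x80u >> field)) {
      uint32_t mask = 0xFu << (28 - 4 * field);
      cr = (cr & ~mask) | (src & mask);
    }
  }
  return cr;
}

/* EQ bit of a CR field; the bits within a field are LT, GT, EQ, SO. */
int cr_field_eq(uint32_t cr, int field)
{
  return (cr >> (28 - 4 * field + 1)) & 1;
}
```

In this model, a saved word with EQ set in the top nibble (0x20000000) becomes 0x00000002 after rotating left by 4. mtcrf_sim with mask 1 then places it in field 7, which is what "beq 7" tests, and rotating left by 28 restores the original value of r0.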

Comment 6 Segher Boessenkool 2015-11-22 20:24:00 UTC

Oh wow.

... Still happens with trunk.  Confirmed.

Comment 7 Andrew Pinski 2021-12-19 00:50:05 UTC

I notice LLVM does similarly on this testcase too.