Bug 51506 - Function cloning misses constant struct
Summary: Function cloning misses constant struct
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 4.6.2
Importance: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
Keywords: missed-optimization
Depends on:
Reported: 2011-12-11 23:41 UTC by Peter Ward
Modified: 2011-12-12 23:25 UTC
CC List: 2 users

See Also:
Known to work:
Known to fail:
Last reconfirmed: 2011-12-12 00:00:00


Description Peter Ward 2011-12-11 23:41:13 UTC
The actual problem I’m dealing with is with avr-gcc, so the goal is to achieve a small code size. I’m trying to write my code like this:
lcd_init(lcd_t l, ...)
where the first parameter is passed a *constant* struct which contains the memory addresses of each of the pins for the LCD. Thus, I want the compiler to note that all calls have the same first argument, clone the function, and propagate the constant.

However, it doesn’t seem to be working in practice.
In trying to build this test case, I found the compiler would just inline all the functions, which defeats the point (in the actual code, the cost of inlining is too high). So, I’ve added the noinline attribute, which I don’t think should stop this optimisation, but apologies if it does.
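For context, the real-world pattern described above might look roughly like the following sketch. The layout of lcd_t, the register stand-ins (fake_port, fake_ddr), the LCD0 macro and lcd_setup are all assumptions made for illustration; the actual AVR code is not part of this report.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pin description; the real lcd_t is not shown in this report. */
typedef struct {
    volatile uint8_t *port;  /* output register */
    volatile uint8_t *ddr;   /* data-direction register */
    uint8_t mask;            /* bits wired to the LCD */
} lcd_t;

/* Stand-ins for memory-mapped registers so the sketch runs on a host. */
static volatile uint8_t fake_port, fake_ddr;

/* noinline mimics the real code, where inlining costs too much space. */
__attribute__((noinline))
static void lcd_init(lcd_t l, uint8_t value) {
    assert(l.port && l.ddr);
    *l.ddr |= l.mask;  /* configure the LCD pins as outputs */
    *l.port = (uint8_t)((*l.port & ~l.mask) | (value & l.mask));
}

/* Every call site passes this same constant struct. */
#define LCD0 ((lcd_t){ &fake_port, &fake_ddr, 0x0F })

static void lcd_setup(void) {
    lcd_init(LCD0, 0x05);
    lcd_init(LCD0, 0x0A);
}
```

Since the first argument is identical at every call, a clone of lcd_init with the struct members folded in would avoid passing and dereferencing the struct at all.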

Anyhow, here’s the testcase.
(using gcc version 4.6.2 (Debian 4.6.2-5), on 64-bit Linux)

$ cat test.c
typedef struct {
    int a;
    int b;
} dint;

/* noinline, as described above, stops -Os from simply inlining the calls. */
__attribute__((noinline))
static int compute_int(int x, int var) {
    int y = 0;
    for (int i = 0; i < x; i++)
        y += i * x;
    return y + var;
}

__attribute__((noinline))
static int compute_dint(dint x, int var) {
    int z = x.a + x.b;
    int y = 0;
    for (int i = 0; i < z; i++)
        y += i * z;
    return y + var;
}

int main(void) {
    int rv = 0;
    rv += compute_dint((dint) {6, 1}, 1);
    rv += compute_dint((dint) {6, 1}, 2);
    rv += compute_dint((dint) {6, 1}, 3);
    rv += compute_int(5, 1);
    rv += compute_int(5, 2);
    rv += compute_int(5, 3);
    return rv;
}
$ gcc -fdump-ipa-all -fipa-cp -fipa-cp-clone -Os -std=c99 test.c

Expected result:
both compute_int and compute_dint should be optimised to versions where "x" is constant.

Actual result:
only compute_int is optimised.
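
To make the expected result concrete, here is a hand-written sketch of what the specialised versions would compute. The names compute_int_x5 and compute_dint_x6_1 are made up for illustration; this is not actual compiler output, and the real clone names and bodies will differ.

```c
#include <assert.h>

typedef struct {
    int a;
    int b;
} dint;

/* General versions, as in the test case. */
static int compute_int(int x, int var) {
    int y = 0;
    for (int i = 0; i < x; i++)
        y += i * x;
    return y + var;
}

static int compute_dint(dint x, int var) {
    int z = x.a + x.b;
    int y = 0;
    for (int i = 0; i < z; i++)
        y += i * z;
    return y + var;
}

/* Hypothetical clone with x == 5 propagated. */
static int compute_int_x5(int var) {
    int y = 0;
    for (int i = 0; i < 5; i++)
        y += i * 5;
    return y + var;
}

/* Hypothetical clone with x == {6, 1} propagated, so z == 7. */
static int compute_dint_x6_1(int var) {
    int y = 0;
    for (int i = 0; i < 7; i++)
        y += i * 7;
    return y + var;
}
```

With the loop bound known, each clone's loop can in principle be folded down to a constant plus var, which is the code-size win being asked for.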
Comment 1 Richard Biener 2011-12-12 10:02:22 UTC
We currently do not easily see that they are constant:

  D.1599.a = 6;
  D.1599.b = 1;
  D.1603_1 = compute_dint (D.1599, 1);

but we could in theory improve our IL by not forcing the aggregate argument
to a temporary during gimplification of

  rv = compute_dint (<<< Unknown tree: compound_literal_expr
    struct dint D.1599 = {.a=6, .b=1}; >>>, 1) + rv;

but simply allow !is_gimple_reg_type CONSTRUCTORs that are TREE_CONSTANT,
thus have

  D.1603_1 = compute_dint ({.a=6, .b=1}, 1);

in the IL.  That would still require ipa-cp to handle aggregates though.

The above would also mean that

  D.1599 = {.a=6, .b=1};

would be valid GIMPLE (I see no good reason to disallow this either).
Comment 2 Andrew Pinski 2011-12-12 23:25:04 UTC
Hmm, shouldn't this be handled in steps?  First by IPA-SRA and then by IPA-CP?
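
The stepwise idea can be sketched by hand as follows. This only illustrates the two transformations at the source level; the function names are hypothetical, and whether the passes in this GCC version actually perform these steps on the test case is exactly what is in question.

```c
#include <assert.h>

typedef struct {
    int a;
    int b;
} dint;

/* Step 0: original function taking the aggregate by value. */
static int compute_dint(dint x, int var) {
    int z = x.a + x.b;
    int y = 0;
    for (int i = 0; i < z; i++)
        y += i * z;
    return y + var;
}

/* Step 1 (SRA-like): the aggregate parameter is split into scalars,
   and callers pass 6 and 1 instead of a temporary struct. */
static int compute_dint_sra(int x_a, int x_b, int var) {
    int z = x_a + x_b;
    int y = 0;
    for (int i = 0; i < z; i++)
        y += i * z;
    return y + var;
}

/* Step 2 (CP-like): x_a and x_b are the same constants at every call
   site, so they are propagated into a clone and dropped as parameters. */
static int compute_dint_sra_cp(int var) {
    const int z = 6 + 1;
    int y = 0;
    for (int i = 0; i < z; i++)
        y += i * z;
    return y + var;
}
```

After step 1 the arguments are plain integers, the same situation as compute_int, which is already optimised in the test case.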