This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: [new-regalloc] Dataflow too slow...


On Sat, 3 Feb 2001, Geert Bosch wrote:

>
>
> On Sat, 3 Feb 2001, Daniel Berlin wrote:
> |How many basic blocks do you have in the function.
>
> In the function that is now compiling for about 30 minutes,
> I have 2639 basic blocks it seems:
>
> (gdb) f 0
> #0  0x856499a in df_rd_global_compute (df=0x8b6f158, blocks=0x8acd458)
>     at ../../gcc/gcc/df.c:1549
> 1549              sbitmap_a_or_b (bb_info->rd_in, bb_info->rd_in,
> (gdb) p *df
> $6 = {flags = 1276, bbs = 0x40f82008, defs = 0x8cbffb0, uses = 0x40ec1008,
>   reg_def_last = 0x8b81778, regs = 0x8bd3f88, reg_size = 1076,
>   insns = 0x4110a008, insn_size = 23701, def_id = 28418, def_size = 29626,
>   n_defs = 28418, use_id = 11995, use_size = 47402, n_uses = 11995,
>   n_bbs = 2639, n_regs = 1076, def_id_save = 0, use_id_save = 0,
>   insns_modified = 0x8acc750, bbs_modified = 0x8acd2f8,
>   all_blocks = 0x8acd458, dom = 0x0}
> (gdb) p *blocks
> $7 = {n_bits = 2639, size = 83, bytes = 332, elms = {4294967295}}
>
> The function in question is one switch statements of 4811 lines,
> which a case for each Ada pragma and a little bit of processing.
> Most per-pragma processing has been broken out into different
> functions though.
>
> It seems though that the huge switch statement could not lead to extra
> dataflow iterations, as every case seen as a unit, has only one predecessor
> and one successor. So it seems that this should not be harder than
> calculating the in/out/kill/gen sets for each of the cases and one
> extra iteration to merge them all together. I don't see where there
> is inherently quadratic (or worse) behaviour...
>

It's from the ordering, i'm guessing.

I just compiled gcc.c-torture/compile/20001226-1.c with my current
df.c/new-regalloc.c[1], and it does it in 36.71 seconds at -O1 (-O0 takes
up too much memory for this machine, with 16k basic blocks, and 59k
pseudos). That's 8k basic blocks, 24k pseudos registers.

Memory usage peaks at  74 meg.

Global register allocation, including df_analyse, takes 8.02 seconds.
Tht's 3rd from the top (combiner = 20 seconds, jump= 13 seconds)

This is, of course, a profiling version.

The main change I modified df.c to use bitmaps rather than sbitmaps
to save memory, except for the worklists on reaching use/def global
calculation (since testing a given bit using bitmaps is a little to slow
compared to sbitmaps).

It appears to also save a lot of time, for various reasons (implemenation
of bitmaps, allocation costs, etc).

I'm just finishing testing on it.
--Dan



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]