This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: strange top statistics for 3.1 cc1


Brad Lucier <lucier@math.purdue.edu> writes:

>> 
>> Brad Lucier wrote:
>> > 
>> > > Could you give a source file and a cc1 command line to help track
>> > > this down?
>> > 
>> > [lucier@curie lib]$ gcc -fpic -fomit-frame-pointer -O1 -fno-math-errno -fschedule-insns2 -fno-strict-aliasing -Wall -W -Wno-unused  -c -I.  -D___PRIMAL -D___LIBRARY -D___SHARED -D___SINGLE_HOST _num.c -save-temps
>> > 
>> > and
>> > 
>> > http://www.math.purdue.edu/~lucier/_num.i.gz
>> 
>> Thanks.
>> 
>> Some remarks.  I am using the trunk checked out today, configured
>> identically to yours (except for --prefix).  I am also running Red Hat
>> 7.1, with the updated 2.4.3-12 kernel, on a 533 MHz Celeron box.
>> 
>> This case is weird.  When I first started running the testcase, it
>> looked as though I wasn't seeing your problem.  I let it run for some
>> 14 CPU minutes and never saw any sign of excessive system CPU usage.
>> 
>> Then I tried running it again under /usr/bin/time, so that I could get
>> a total CPU usage report, and went to do other things.
>> 
>> Well, it's still running, having consumed more than 160 CPU minutes!
>> And now I see the system CPU usage effect you describe.  I tried
>> attaching to the process under GDB, and I saw it was bogged down in
>> split_all_insns() at line 3486 of toplev.c, doing lots of sbitmap
>> operations.  Doing strace -p on the process showed that it was
>> performing lots of mmap2() and munmap() calls, each mapping and
>> unmapping a large sbitmap (about 11 MB).

Yummy.
This one is easy, for once.
And it's got to be where all the excessive CPU usage is coming from.
Excluding verify_flow_info (whenever I hit it, I jumped to the end of
the function), I hit split_all_insns after 4:38 of CPU time on my
PowerBook.
What happens is that split_all_insns calls find_sub_basic_blocks on
each basic block.
find_sub_basic_blocks calls make_edges.
make_edges makes an edge cache: an sbitmap vector that is
n_basic_blocks * n_basic_blocks, i.e. 9423 sbitmaps with 9423 bits in each.
(9423 * 9423) / 8 = 11,099,116 bytes.
There's your 11 MB sbitmap.
Now remember, it does this for every basic block.

So we create and destroy 11 MB 9423 times.
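Just to sanity-check those numbers, here's a throwaway stand-alone
snippet (not GCC code; the 9423 is the basic-block count from the
_num.i testcase above):

  #include <stdio.h>

  int
  main (void)
  {
    unsigned long n = 9423;                 /* n_basic_blocks for _num.i */
    unsigned long cache_bytes = n * n / 8;  /* one bit per (from, to) block pair */

    /* One edge cache -- matches the ~11 MB mmap2()/munmap() size seen in strace.  */
    printf ("one edge cache: %lu bytes\n", cache_bytes);

    /* Allocated and freed once per basic block.  */
    printf ("total allocated and freed: %.1f GB\n",
            (double) cache_bytes * n / (1024.0 * 1024.0 * 1024.0));
    return 0;
  }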
There are a bunch of possible solutions.
We could have only one edge cache that make_edges uses, ever.  Right
now, it allocates the cache, zeroes it out, passes it along, and frees
it on every call.

There should be no harm in having make_edges use a file-level variable
as the edge cache: rather than allocate, zero, pass along, and free it,
just zero it, and only reallocate it if the number of basic blocks has
changed.  That should have the same effect, no?
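Something like this is what I have in mind -- a rough sketch only, not
a tested patch.  get_edge_cache is a made-up helper name; the
sbitmap_vector_alloc/zero/free calls and the n_bits member are the real
sbitmap interfaces:

  /* File-level edge cache, reused across calls to make_edges.  */
  static sbitmap *cached_edge_cache;

  static sbitmap *
  get_edge_cache (void)
  {
    /* n_bits of the first map tells how big the cache was when it was
       last allocated; reallocate only if the block count has changed.  */
    if (cached_edge_cache == NULL
        || cached_edge_cache[0]->n_bits != (unsigned) n_basic_blocks)
      {
        if (cached_edge_cache)
          sbitmap_vector_free (cached_edge_cache);
        cached_edge_cache
          = sbitmap_vector_alloc (n_basic_blocks, n_basic_blocks);
      }

    sbitmap_vector_zero (cached_edge_cache, n_basic_blocks);
    return cached_edge_cache;
  }

make_edges would then call get_edge_cache () where it now calls
sbitmap_vector_alloc, and simply drop its sbitmap_vector_free at the
end.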

The other solution that pops to mind immediately moves toward the same
goal in a way that, IMHO, makes less sense: create an edge cache in
split_all_insns, and make find_sub_basic_blocks and make_edges take an
edge cache argument as well, passing it along the chain.
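At the interface level that would look roughly like this (hypothetical
prototypes and call site, just to show how the cache would get threaded
through; I haven't checked the exact current signatures):

  /* In split_all_insns: allocate the cache once for the whole pass ...  */
  sbitmap *edge_cache = sbitmap_vector_alloc (n_basic_blocks, n_basic_blocks);

  /* ... then every function on the path grows an extra parameter,
     and make_edges takes the cache instead of allocating it:  */
  extern void find_sub_basic_blocks (basic_block bb, sbitmap *edge_cache);

  /* Freed once, after the pass.  */
  sbitmap_vector_free (edge_cache);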

It's a lot more work to update all the callers of those functions,
etc., than it is to just remove the lines of code allocating/freeing
the edge cache in make_edges, use one file-level variable that is an
edge cache (you can tell how many basic blocks there were when the
edge cache was last allocated from the n_bits member of
file_level_edge_cache[0]), and pass *that* to make_edges.


I'll do it tomorrow afternoon if no one beats me to it.

--Dan

-- 
"Everywhere is walking distance if you have the time.
"-Steven Wright

