This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] fix PR opt/2391 (mostly ARM)


Richard Henderson writes:
 > >  > Um, how often do these trigger?
 > > 
 > > Rarely.  But the first and the second do trigger in my testcases:
 > 
 > Enough to cause exponential problems?  This function is simple 
 > enough that I can't imagine real problems with it.

I guess I misunderstood your original question then.  I thought you
were worried about the overhead in the general case.  Yes, of course
in the test case it triggers quite often, enough to eliminate the
exponential problem.  See example below.

Maybe I wasn't clear enough in my first email when I explained the
result of my analysis.  The reason for the exponential behavior lies
in the way reg_last_set_value is set up and not in the functions
traversing it.

Before setting reg_last_set_value of a register to a new value,
record_value_for_reg replaces every occurrence of the register in the
new value with the original reg_last_set_value.

For example after the first a^=a+a, we have (xor (ashift r0 1) r0) in
reg_last_set_value[0].  After the second, we have
(xor (ashift (xor (ashift r0 1) r0) 1) 
     (xor (ashift r0 1) r0)) 

Thus, with every step we double the size of the expression doubling
also the time necessary to completely traverse last values.  My
changes avoid complete traversal for ``recursive'' binary expressions
of this kind which eliminates the exponential behavior.

Here are some examples for n = 20, 21 or 22 (n is the number of a^=a+a
expressions).  I guess your question can be answered by looking at the
differences in the number of recursive calls for update_table_tick.

First, with disabling my changes in update_table_tick:

$ grep "a \^= a + a" t.c | wc -l
     20
$ ./cc1 -O t.c 2>&1 | grep combiner
 combiner              :   2.75 (99%) usr   0.00 ( 0%) sys   2.75 (98%) wall
$ gprof  cc1 |grep '^\[.*update_table_tick'
[2]     96.5    1.39    0.00      92+10486199 update_table_tick [2]

$ grep "a \^= a + a" t.c | wc -l
     21
$ ./cc1 -O t.c 2>&1 | grep combiner
 combiner              :   5.37 (99%) usr   0.00 ( 0%) sys   5.37 (99%) wall
$ gprof  cc1 |grep '^\[.*update_table_tick'
[2]     97.8    2.73    0.00      96+20972002 update_table_tick [2]

$ grep "a \^= a + a" t.c | wc -l
     22
$ ./cc1 -O t.c 2>&1 | grep combiner
 combiner              :  10.77 (99%) usr   0.00 ( 0%) sys  10.78 (99%) wall
$ gprof  cc1 |grep '^\[.*update_table_tick'
[2]     98.6    5.78    0.00     100+41943567 update_table_tick [2]


And with enabling them:

$ grep "a \^= a + a" t.c | wc -l
     20
$ ./cc1 -O t.c 2>&1 | grep combiner
 combiner              :   0.02 (40%) usr   0.00 ( 0%) sys   0.02 (25%) wall
$ gprof  cc1 |grep '^\[.*update_table_tick'
[41]     0.0    0.00    0.00      92+1027    update_table_tick [41]

$ grep "a \^= a + a" t.c | wc -l
     21
$ ./cc1 -O t.c 2>&1 | grep combiner
 combiner              :   0.03 (38%) usr   0.00 ( 0%) sys   0.03 (38%) wall
$ gprof  cc1 |grep '^\[.*update_table_tick'
[42]     0.0    0.00    0.00      96+1120    update_table_tick [42]

$ grep "a \^= a + a" t.c | wc -l
     22
$ ./cc1 -O t.c 2>&1 | grep combiner
 combiner              :   0.02 (29%) usr   0.01 (100%) sys   0.03 (38%) wall
$ gprof  cc1 |grep '^\[.*update_table_tick'
[38]     0.0    0.00    0.00     100+1217    update_table_tick [38]


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]