This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH] Enhance reload_cse_move2add


On 07/02/2010 02:10 AM, Jeff Law wrote:
On 06/30/10 11:13, Jie Zhang wrote:
On 07/01/2010 12:47 AM, Jeff Law wrote:
On 06/30/10 01:45, Jie Zhang wrote:
Currently reload_cse_move2add can transform

(set (REGX) (CONST_INT A))
...
(set (REGX) (CONST_INT B))

to

(set (REGX) (CONST_INT A))
...
(set (REGX) (plus (REGX) (CONST_INT B-A)))

This patch enhances it to be able to transform

(set (REGX) (CONST (PLUS (SYMBOL_REF) (CONST_INT A))))
...
(set (REGY) (CONST (PLUS (SYMBOL_REF) (CONST_INT B))))

to

(set (REGX) (CONST (PLUS (SYMBOL_REF) (CONST_INT A))))
...
(set (REGY) (PLUS (REGX) (CONST_INT B-A)))
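
As a rough illustration of the two transformations above, here is a toy Python model of the bookkeeping involved. It is only a sketch under stated assumptions, not the code in the patch: the `SetConst` and `SetAdd` classes and the `move2add` function are hypothetical names invented for this example, only constant-load insns are modelled, and the cost and validity checks a real pass would need are left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple, Union


@dataclass(frozen=True)
class SetConst:
    """Models (set (reg dst) (const (plus (symbol_ref sym) (const_int off)))).

    sym=None models the plain (set (reg dst) (const_int off)) case.
    """
    dst: str
    sym: Optional[str]
    off: int


@dataclass(frozen=True)
class SetAdd:
    """Models an add of a constant to a register: dst = src + delta."""
    dst: str
    src: str
    delta: int


Insn = Union[SetConst, SetAdd]


def move2add(insns: List[SetConst]) -> List[Insn]:
    """Rewrite constant loads as adds when a register already holds a
    value that differs from the wanted one only by a constant.

    Assumption of this toy model: every insn is a SetConst, so nothing
    else can clobber a tracked register between two loads.
    """
    known: Dict[str, Tuple[Optional[str], int]] = {}  # reg -> (sym, off)
    out: List[Insn] = []
    for insn in insns:
        base = None
        if insn.dst in known and known[insn.dst][0] == insn.sym:
            # Same register, same base (integer or symbol): first form.
            base = insn.dst
        elif insn.sym is not None:
            # Another register holds the same symbol: second form.
            base = next((r for r, (s, _) in known.items() if s == insn.sym),
                        None)
        if base is None:
            out.append(insn)
        else:
            out.append(SetAdd(insn.dst, base, insn.off - known[base][1]))
        # Either way, dst now holds sym + off.
        known[insn.dst] = (insn.sym, insn.off)
    return out


if __name__ == "__main__":
    # REGX = r0, REGY = r1, A = 16, B = 80; "table" is a made-up symbol.
    for insn in move2add([SetConst("r0", "table", 16),
                          SetConst("r1", "table", 80)]):
        print(insn)
```

With A = 16 and B = 80 the second load becomes an add of B-A = 64 to r0. Loads of unrelated symbols are left untouched.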


Benchmarking using EEMBC on ARM Cortex-A8 shows a performance improvement on one test:

idctrn01: 6%
Was this a size or runtime performance improvement?

This is a runtime performance improvement.
That's quite a surprise. Just for giggles, does x86 show any change on
idctrn01? I realize it's an eembc benchmark, but if it's that sensitive
to this optimization, we ought to see some change in behaviour for x86
as well.

I have not benchmarked it on x86 using EEMBC. We use SPEC2000 for benchmarking on x86 here. I need to ask whether it's possible to set up EEMBC for x86 in our environment.




Benchmarking using SPEC2000 on AMD Athlon64 X2 3800+ shows a 0.4% regression on CINT2000 and a 0.1% improvement on CFP2000.

Bootstrapped and regression tested on x86_64.
Any thoughts on why spec2k showed a regression and was it a size or
runtime regression?

I'm not sure what caused the regressions. I'm redoing the
benchmarking, this time without X and with as many running servers
shut down as I can. I hope this removes the measurement error.
Strongly advised (shut down X and as many services as possible, turn off
speed scaling in the processor, etc).

Yes, I did all of these this time.

If you can get size #s, that would be interesting too -- I'd expect this
to be a small size improvement independent of the processor
architecture. Runtime performance I don't have a good feel for -- I can
easily envision cases where it's going to be better and others where
it's going to be worse.

I use 5 iterations and test -O2 and -O3 this time. It has been running for 24 hours and might need one or two more hours to complete, so I will have to report the results tomorrow.
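
One way to judge whether a small difference such as 0.4% is real or just measurement error is to compare it against the run-to-run spread of the repeated iterations. The following is a minimal Python sketch of that idea. The helper names are hypothetical, the max-minus-min spread is only a crude noise estimate, and the timings in the usage section are made-up placeholders, not results from this thread.

```python
from statistics import median
from typing import Sequence


def relative_change(base_runs: Sequence[float],
                    patched_runs: Sequence[float]) -> float:
    """Median-to-median runtime change as a fraction of the baseline.

    Positive means the patched build is slower.
    """
    b, p = median(base_runs), median(patched_runs)
    return (p - b) / b


def relative_spread(runs: Sequence[float]) -> float:
    """(max - min) / median: a crude estimate of run-to-run noise."""
    return (max(runs) - min(runs)) / median(runs)


def is_above_noise(base_runs: Sequence[float],
                   patched_runs: Sequence[float]) -> bool:
    """True if the change exceeds the larger of the two spreads."""
    change = abs(relative_change(base_runs, patched_runs))
    noise = max(relative_spread(base_runs), relative_spread(patched_runs))
    return change > noise


if __name__ == "__main__":
    # Placeholder timings in seconds, invented purely for illustration.
    base = [100.0, 100.5, 99.5, 100.2, 99.8]
    patched = [100.4, 100.9, 99.9, 100.6, 100.2]
    print(relative_change(base, patched))
    print(is_above_noise(base, patched))
```

If the spread within one configuration is as large as the difference between the two, the difference says little either way.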


Thanks,
-- 
Jie Zhang
CodeSourcery

