This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [trunk] patch to remove REG_NO_CONFLICT_BLOCKS


Steven Bosscher wrote:
On Wed, Apr 23, 2008 at 12:50 AM, Vladimir Makarov <vmakarov@redhat.com> wrote:
I saw the same difference for SH too.

OK, I'll see if I can reproduce it with a cross-compiler to SH. Do you remember which of the SH targets shows this?

You could use sh-elf with the -m4 option. It shows a difference in peak memory consumption of about 1.8GB vs 170MB. I've attached the diagrams.

I've just checked it again (I applied ra1.diff from Ken's mail at
134529). I see about 0.5% degradation for SPECINT2000 and about 1.2%
for SPECFP2000 when only global is used. This was checked on Core2 in
32-bit mode with -O2. The code is also bigger, by 0.03% and 0.4% for
SPECINT and SPECFP respectively, when only global is used. So for me
it is obvious that removing local would be unreasonable.

The spread in the results for April 22 is quite large (see mcf for example). There is also a clear positive jump for gcc in SPEC2000, but that could be a lucky one. I also see no visible effect in 64-bit mode for SPECINT2000 and a small jump for SPECFP2000. All of that is within the noise band of your tester, AFAICT. So I wouldn't jump to the conclusion that there is improvement or degradation until after a few more SPEC runs. The code size increase is something we could take a look at. This wouldn't be the first time that turning off part of the compiler exposes tuning opportunities that previously went unnoticed.


I'd like to point people to every performance or compile-time degradation and the reasons for it, but I physically cannot do that. I can and will do it in my area of expertise, though.

PS: Maybe this will be interesting to you: it looks like the byte-accurate
conflict representation gives no visible improvement for SPEC2000 on
x86/x86_64 but slows the compiler down by about 0.6% for base and 0.7% for
peak on x86_64 (approximately the same numbers for x86). You can see it on
the tester at http://vmakarov.fedorapeople.org/spec/ (see the difference
between Apr21 and Apr22, after Ken applied his patch).

That's not really unexpected. Without the bits for global, the byte level df is only used for a byte level DCE, but I don't think there are any cases in SPEC for which this is likely to do much (except crafty, maybe).

Although I have not analyzed it, I guess Ian's and Richard's work on splitting subregs removed this problem for crafty.

The slowdown is not good. We'd have to look into it. There is still
a lot of room for improvement. I've been thinking about a
current_function_has_mw_regs flag or something similar that turns this
path in the compiler off when there are no multiword regs to deal with
(i.e. the common case for most architectures).
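
The guard idea above could be sketched roughly as follows. This is only a minimal, standalone illustration and not GCC code: the struct pseudo_reg and struct function_summary types and both function names are hypothetical, and only the flag name is taken from the suggestion. It assumes the flag can be computed with a single scan over the function's registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a pseudo register: only the
   number of machine words it occupies matters for this sketch.  */
struct pseudo_reg
{
  unsigned int nwords;
};

/* Hypothetical per-function summary.  The field mirrors the
   "current_function_has_mw_regs" idea; it is not a real GCC field.  */
struct function_summary
{
  bool has_mw_regs;
};

/* Scan the registers once and record whether any of them spans more
   than one word.  */
static void
compute_has_mw_regs (struct function_summary *fn,
                     const struct pseudo_reg *regs, size_t n)
{
  assert (fn != NULL && (regs != NULL || n == 0));
  fn->has_mw_regs = false;
  for (size_t i = 0; i < n; i++)
    if (regs[i].nwords > 1)
      {
        fn->has_mw_regs = true;
        break;
      }
}

/* Return true only when the more expensive byte-level analysis is
   worth running, i.e. when multiword registers are present.  */
static bool
need_byte_level_analysis (const struct function_summary *fn)
{
  return fn->has_mw_regs;
}
```

A caller would compute the summary once per function and skip the byte-level path whenever need_byte_level_analysis returns false, which would be the common case on targets where nearly all registers fit in a single word.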

Gr.
Steven


(P.S. I'd encourage you to try IRA for targets like AVR. Since IRA
currently doesn't handle REG_NO_CONFLICT-like constructs, it might
have caused some trouble for the cute little 8/16-bit AVR, which seems
to more or less depend on REG_NO_CONFLICT for many insns... It is this
kind of regression that we are trying to avoid with the df byte level
work.)

Yes, I know about this port. I am working with the AVR compiler people on IRA problems and getting very good feedback from them.

Attachment: sh4-local.ps
Description: x-application/ps

Attachment: sh4-global.ps
Description: x-application/ps

