This is the mail archive of the gcc at gcc dot gnu dot org
mailing list for the GCC project.
Re: Combine versus volatile (Was: Re: Performance of IntegerMultiplication on PIII)
- To: Jan Hubicka <jh at suse dot cz>
- Subject: Re: Combine versus volatile (Was: Re: Performance of IntegerMultiplication on PIII)
- From: Linus Torvalds <torvalds at transmeta dot com>
- Date: Thu, 8 Nov 2001 08:39:43 -0800 (PST)
- cc: <gcc at gcc dot gnu dot org>
On Thu, 8 Nov 2001, Jan Hubicka wrote:
> Combine constructs the combined pattern, but as it contains volatile memory
> references the pattern is not recognized properly.
One alternative, of course, is just to remove the volatile.
I'd love to do that, but the volatile is there because the ability to
specify a "memory area" to gcc in asms is pitifully limited.
Btw, one of the major problems with gcc asms historically was that you
absolutely _could_not_ say that a memory area was both read and written.
Impossible. Which is why the code in question uses "volatile", so that gcc
doesn't think that the asm changes the whole word and that previous
accesses to it have died.
When did "+" become acceptable in asms? Is it there in 2.7.2 already, or
was it a 2.95 thing? I no longer have 2.7.2 to test (it's not supposed to
be used for 2.4.x kernels any more, but some people probably still do, and
I don't want to break it on purpose).
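
For reference, a read-write memory operand written with the "+" modifier
looks roughly like the sketch below. It is an illustration only, not the
kernel's code: set_bit_rw is a made-up name, the bit number is assumed to
stay inside one 32-bit word, and the non-x86 branch is just a portable
stand-in. It says nothing about which gcc releases accept "+".

```c
#include <assert.h>

/* Hypothetical helper: set bit nr (0..31) in the single word *word.
 * "+m" tells gcc the asm both reads and writes *word, so earlier
 * stores to it stay live without marking the operand volatile. */
static inline void set_bit_rw(int nr, unsigned int *word)
{
    assert(nr >= 0 && nr < 32);   /* this sketch covers one word only */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__("btsl %1,%0"
            : "+m" (*word)        /* read-write memory operand */
            : "Ir" (nr)           /* immediate 0..31, or a register */
            : "cc");
#else
    *word |= 1u << nr;            /* portable stand-in (assumption) */
#endif
}
```

Note that "+m" (*word) still only describes one word to gcc. A bit
offset that reaches past that word runs into the same "memory area"
limitation described above.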
> Unfortunately I am not at all sure if it is needed in combine - I am not
> able to come with counterexample where it can cause problems. Combine,
> in worst case, IMO can change the size of memory read and I am not
> quite sure if we require this to not change in our volatile definition.
Changing the memory size would, I think, be wrong for "volatile". It's
obviously not an issue for the asms, but if it would change it for regular
volatile accesses, that would not be ok for what volatile is traditionally
used for (ie IO space accesses etc).
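
To make the access-size point concrete, here is a small sketch of the
kind of code that depends on it. The accessor names are hypothetical and
the "register" is assumed to be a 32-bit memory-mapped location. The
point is only that each volatile access is meant to stay one 32-bit
access, even when just one bit of the result is used.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical accessors for a 32-bit device register. */
static inline uint32_t reg_read32(const volatile uint32_t *reg)
{
    return *reg;                  /* one 32-bit load is intended */
}

static inline void reg_write32(volatile uint32_t *reg, uint32_t val)
{
    *reg = val;                   /* one 32-bit store is intended */
}

/* Only one bit is wanted, but the access is still meant to be a full
 * 32-bit read, not a narrowed load of the byte that holds the bit. */
static inline int reg_bit_is_set(const volatile uint32_t *reg, unsigned bit)
{
    assert(bit < 32);
    return (int)((reg_read32(reg) >> bit) & 1u);
}
```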
> Also on related note - if you are concerned about speed with constant
> bit offsets, why you just don't use builtin_constant_p to get two versions,
> one with or doing the change that executes faster on most cases.
I may have to do that. However, it is kind of sad, since the only reason
to do it seems to be gcc being stupid for little good reason.
The argument I want combined is _not_ volatile.
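
The two-version approach suggested in the quote could look something
like the sketch below. All names here are hypothetical, and it
deliberately leaves out the lock prefix, so it is not atomic and is not
a drop-in for the kernel's bitops. It assumes gcc (or a compiler that
provides __builtin_constant_p) and falls back to plain C off x86.

```c
#include <assert.h>

/* Constant bit number: plain C "or" with a mask gcc can fold. */
static inline void set_bit_const(int nr, unsigned int *addr)
{
    assert(nr >= 0);
    addr[nr >> 5] |= 1u << (nr & 31);
}

/* Variable bit number: btsl on x86, the same C expression elsewhere.
 * The word is selected in C so "+m" names exactly the word touched. */
static inline void set_bit_var(int nr, unsigned int *addr)
{
    unsigned int *word;

    assert(nr >= 0);
    word = addr + (nr >> 5);
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__("btsl %1,%0"
            : "+m" (*word)
            : "r" (nr & 31)
            : "cc");
#else
    *word |= 1u << (nr & 31);     /* portable stand-in (assumption) */
#endif
}

/* Pick the version at compile time based on whether nr is constant. */
#define set_bit_sketch(nr, addr)                              \
    (__builtin_constant_p(nr) ? set_bit_const((nr), (addr))   \
                              : set_bit_var((nr), (addr)))
```

Both paths give the same result. The only difference is which
instruction sequence gcc gets to work with when the bit number is known
at compile time.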
> Does the semantic of or with lock prefix differ from btsl?
It sure does - on SMP.
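
One way to read the SMP point: a read-modify-write instruction on memory
without the lock prefix is not atomic with respect to other CPUs, while
the locked form is. The sketch below contrasts the two. The function
names are hypothetical, the bit number is assumed to be 0..31, and the
non-x86 branches use gcc's __atomic builtins or plain C as stand-ins.

```c
#include <assert.h>

/* Unlocked: another CPU can modify *word between this instruction's
 * read and its write, so a concurrent update can be lost. */
static inline void set_bit_nolock(int nr, unsigned int *word)
{
    assert(nr >= 0 && nr < 32);
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ __volatile__("btsl %1,%0"
                         : "+m" (*word)
                         : "Ir" (nr)
                         : "cc");
#else
    *word |= 1u << nr;            /* portable stand-in (assumption) */
#endif
}

/* Locked: the read-modify-write is atomic across CPUs. */
static inline void set_bit_locked(int nr, unsigned int *word)
{
    assert(nr >= 0 && nr < 32);
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ __volatile__("lock; orl %1,%0"
                         : "+m" (*word)
                         : "r" (1u << nr)
                         : "cc");
#else
    /* Assumes a compiler with the gcc __atomic builtins. */
    __atomic_fetch_or(word, 1u << nr, __ATOMIC_SEQ_CST);
#endif
}
```

On a single thread both produce the same value. The difference only
shows up when several CPUs update the same word at once.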