[patch/rfc] Moving constraint definitions to the machine description
Zack Weinberg
zackw@panix.com
Tue Jan 10 06:02:00 GMT 2006
On Mon, Jan 09, 2006 at 06:40:25PM +0000, Joern RENNECKE wrote:
> In http://gcc.gnu.org/ml/gcc-patches/2006-01/msg00434.html,
> Zack Weinberg wrote:
>
> > The artificial distinction between GHIJKLMNOP constraints and all other
> > constraints withers away.
>
> Actually, I think it is nice if you can tell something about a constraint
> in a new port because of standard meanings of some letters.
Yes, but this can be a matter of documented convention rather than a
matter of having to write some constraints differently just because of
their names. Note that several ports have wedged const_int/const_double
constraints into EXTRA_CONSTRAINT(_STR) because they ran out of letters
(presumably this predates multiletter constraint strings).
> I believe some code paths are also faster because a switch on the
> initial constraint letter goes straight to the right code path.
Performance issues discussed below.
> Although it would be cleaner if these standard constraints - also 'r'
> and 'g' - would be handled in one central place (which should be possible
> when you autogenerate the constraint handling code) so that all the
> other code does not have to do special-purpose checks.
I was not planning to autogenerate code for the standard constraints -
I don't like large blocks of verbatim code in the generator programs -
but they would be handled via the same table as machine-specific
constraints.
> When I did the multi-letter constraints, I considered that we
> wouldn't need them if we had eight-bit clean constraint codes, but
> that would have been unsuitable for programming, and unused codes might
> have to be reclaimed. Automatic building of tables could convert from
> the readable and maintainable multi-letter/digit format into more
> efficient internal bytecodes. We could also compact const modifiers,
> and eliminate spacing.
My vague idea - and I want to emphasize that I don't have the time to
code it, not for several months - was that the constraint-string field
in struct insn_operand_data would become an array of indices into
a new table of 'struct constraint_data', representing everything about
the constraint in already-parsed format. I have not reviewed every
place that processes constraints, so I don't know yet what goes into
that structure. Your bytecodes are probably the same thing as my array
indices, but I prefer thinking of them as indices because that makes
clear that they're not a stable thing across configurations. They're
like insn code numbers. Note that for the purposes of this table, 'x' and
'?x' would get different entries.
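A rough sketch of what such a table might look like, for illustration only. The fields of struct constraint_data are undecided, so the field names, the kind enumeration, constraint_lookup, count_constraints and the -1 terminator are all hypothetical, and example_operand merely stands in for the constraint field of struct insn_operand_data:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification; the real field set is undecided.  */
enum constraint_kind { CK_REGISTER, CK_CONST_INT, CK_MEMORY };

/* Hypothetical: one constraint in already-parsed form.  */
struct constraint_data
{
  const char *spelling;        /* as written in the .md file */
  enum constraint_kind kind;
  int disparagement;           /* number of leading '?' modifiers */
};

/* Would be generated per configuration, so the indices are no more
   stable across targets than insn code numbers are.  */
static const struct constraint_data constraint_table[] =
{
  { "r",  CK_REGISTER,  0 },   /* index 0 */
  { "x",  CK_REGISTER,  0 },   /* index 1 */
  { "?x", CK_REGISTER,  1 },   /* index 2: separate entry from 'x' */
  { "K",  CK_CONST_INT, 0 },   /* index 3 */
};

#define NUM_CONSTRAINTS \
  (sizeof constraint_table / sizeof constraint_table[0])

/* Stand-in for the constraint field of struct insn_operand_data:
   one table index per constraint, terminated by -1 (an assumption).  */
const short example_operand[] = { 1, 2, 3, -1 };

/* Map an index back to its preparsed entry.  */
const struct constraint_data *
constraint_lookup (int idx)
{
  assert (idx >= 0 && (size_t) idx < NUM_CONSTRAINTS);
  return &constraint_table[idx];
}

/* Count the constraints in a preparsed operand.  */
int
count_constraints (const short *indices)
{
  int n = 0;
  while (indices[n] >= 0)
    n++;
  return n;
}
```

With this layout a consumer never looks at the spelling at all; it walks the index array and reads whatever it needs out of the table entry.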
I'm not sure how to handle asm_operands - in the general case one would
have to dynamically allocate code numbers and table entries, but that
plays havoc with GC.
> It would be useful to have some profiles for the compiler with the
> constraint parsing put into separate functions (like your patch does)
> on reload-heavy code to get an idea how important the speed aspect
> actually is.
>
> In order to retain the speed advantage of combining constraint parsing
> with operand type parsing, it would make sense if the generator programs
> also provided macros with list of case labels that can be used in a switch
> statement.
Well, the interesting thing would be to code the preparsed-constraint
patch (breaking all unconverted ports) and benchmark that against the
unmodified compiler. That would then provide a carrot to wave at people
to get them to convert their ports. But, as I said, I don't have time to
do that. The existing patch is designed to facilitate the conversion,
and therefore has had to trade away performance. I could probably buy
it back by moving the generated functions inline into tm-preds.h, but
frankly I don't think the difference will be measurable.
Also, I have zero interest in generating code fragments to be
inserted into the machine-independent compiler; that fails to make
progress toward the goal of being able to swap out the backend
without recompiling the world.
> Something also to keep in mind about standard constraints is that
> our matching constraints are incomplete - when the modes of the to-be
> matched operands don't agree, the only match that works is little endian
> lowpart matching. We can't do lowpart matching for big endian, and
> in general, why can't we match any part of the register we want to?
> Thus, for bytecodes, we might want to reserve some encoding space for more
> general matching constraints.
I don't understand the issue you're raising here, but in my vague design,
there is no bytecode space to be reserved, there's just indices into a
table of data. Whatever new scheme you come up with can be fit in.
If you're trying to suggest additional syntax for define_constraint
and friends, then could you be a little clearer about what you have
in mind, please?
> > My biggest remaining concern, and perhaps we could just live with it,
> > is that every use of the *_FOR_LETTER / *_FOR_CONSTRAINT macros in a
> > back end has to be eliminated.
>
> This is putting it mildly. If we were to inline all uses of
> _OK_FOR_CONSTRAINT macros, we'd get massive code duplication, with
> all the maintenance headaches that this implies.
>
> The gen* machinery should spit out macros and/or functions that
> allow the constraints to be tested from C code.
As it is, one *can* call e.g. insn_const_ok_for_letter (foo, "K") from
C code, but this won't work with the preparsed-constraints patch. I
could, however, generate an enumeration for the constraint names and
a function, say, satisfies_constraint_p (rtx, enum constraint) - perhaps
this is an argument for the break-the-world patch, as then one wouldn't
have to fix each port twice.
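Roughly, the generated interface could look like the sketch below. Nothing here is in the patch: the enumerators, the per-constraint check functions and their value ranges are invented for illustration, and struct fake_rtx is a stand-in so the fragment compiles without rtl.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for GCC's rtx; only integer constants are modelled.  */
struct fake_rtx
{
  bool is_const_int;
  long value;
};
typedef const struct fake_rtx *rtx_t;

/* Hypothetical generated enumeration, one entry per constraint name.  */
enum constraint
{
  CONSTRAINT_I,
  CONSTRAINT_K,
  CONSTRAINT__LIMIT
};

/* Made-up ranges; a real port would take these from its .md file.  */
static bool
check_I (rtx_t x)
{
  return x->is_const_int && x->value >= -128 && x->value <= 127;
}

static bool
check_K (rtx_t x)
{
  return x->is_const_int && x->value >= 0 && x->value <= 65535;
}

typedef bool (*constraint_fn) (rtx_t);

/* Indexed by enum constraint.  */
static const constraint_fn constraint_checks[CONSTRAINT__LIMIT] =
{
  check_I,
  check_K,
};

/* Test X against constraint C without going through a string.  */
bool
satisfies_constraint_p (rtx_t x, enum constraint c)
{
  assert ((unsigned) c < CONSTRAINT__LIMIT);
  return constraint_checks[c] (x);
}
```

Back-end C code would then write satisfies_constraint_p (op, CONSTRAINT_K) where it currently uses one of the *_OK_FOR_LETTER macros.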
> Note that the current implementation of CONSTRAINT_LEN allows changing
> the length of '?'. The idea was that you could have more fine-grained
> control over the costs modification, although the bit to decode the
> following string for the cost value was not added.
> A character or string that does not otherwise appear after '?' could be
> used to identify the multi-character sequence(s) so that a sole '?' would
> retain its old meaning.
I have no objection to this feature, but as '?' is a generic feature,
all handling of it should be generic too. I didn't change the
callers of CONSTRAINT_LEN in this patch, because it was a distraction
from the main thrust, but the intent is that CONSTRAINT_LEN /
insn_constraint_len never sees anything generic in the first place.
Syntax like ?nnn? (where nnn are [0-9]) should work, ne? A trailer is
probably needed in case anyone's put a ? on a matching constraint.
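A minimal sketch of how generic code might recognize that syntax, assuming the form is '?', one or more decimal digits, then a closing '?', and that a bare '?' keeps its old meaning with a cost of 1. The function name and the cost convention are hypothetical, and overflow on very long digit strings is not handled:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* If P points at a '?' modifier, store its cost in *COST and return
   the number of characters it occupies; return 0 if P does not start
   with '?'.  "?nnn?" yields cost nnn; anything else starting with '?'
   is treated as a plain '?' of length 1 and cost 1.  */
size_t
parse_disparagement (const char *p, int *cost)
{
  const char *q;
  int n = 0;

  assert (p != NULL && cost != NULL);
  if (*p != '?')
    {
      *cost = 0;
      return 0;
    }

  q = p + 1;
  while (isdigit ((unsigned char) *q))
    n = n * 10 + (*q++ - '0');

  /* Extended form only if at least one digit was seen and the
     trailer is present.  */
  if (q > p + 1 && *q == '?')
    {
      *cost = n;
      return (size_t) (q + 1 - p);
    }

  *cost = 1;
  return 1;
}
```

Because the trailer is required, "?0" still reads as a plain '?' followed by matching constraint 0, while "?12?r" reads as a cost of 12 followed by 'r'.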
> > + for (p = name + 1; *p; p++)
> > + if (!ISALNUM (*p))
> > + {
> > + message_with_line (lineno, "constraint name '%s' must be only "
> > + "numbers and letters", name);
>
> That is too restrictive. h8300.h uses '<' and '>' . '_' might also
> make a lot of sense at times. The basic idea was to allow all printable
> ASCII characters, but I excluded ',' , '#' and '*' to avoid slowing down
> scans to get to the end of an alternative.
< and > are generic. I see real value in restricting machine-specific
constraint names to identifier characters. For one thing, it means
nice predictable names for those enumeration constants I mentioned
above. For another, it means a reader can have confidence that
punctuation has a generic meaning. The existing punctuation is
obscure enough without that additional worry.
Allowing _ is fine by me, I just disallowed it because no existing
port uses it.
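For concreteness, a standalone sketch of the check with _ allowed. It uses the <ctype.h> functions rather than ISALNUM so that it compiles outside the GCC tree; the function name is hypothetical, and requiring the first character to be a letter is an assumption, not something the quoted hunk shows:

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if NAME is acceptable as a machine-specific constraint
   name: a letter followed by letters, digits or underscores.  */
bool
constraint_name_ok (const char *name)
{
  const char *p;

  assert (name != NULL);
  if (!isalpha ((unsigned char) name[0]))
    return false;

  for (p = name + 1; *p; p++)
    if (!isalnum ((unsigned char) *p) && *p != '_')
      return false;

  return true;
}
```

Names that pass this test can be pasted directly into enumeration constants, which is the predictability argument made above.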
zw