[patch/rfc] Moving constraint definitions to the machine description

Tue Jan 10 06:02:00 GMT 2006

On Mon, Jan 09, 2006 at 06:40:25PM +0000, Joern RENNECKE wrote:
> In http://gcc.gnu.org/ml/gcc-patches/2006-01/msg00434.html,
> Zack Weinberg wrote:
> 
> > The artificial distinction between GHIJKLMNOP constraints and all other
> > constraints withers away.
> 
> Actually, I think it is nice if you can tell something about a constraint
> in a new port because of standard meanings of some letters.

Yes, but this can be a matter of documented convention rather than a
matter of having to write some constraints differently just because of
their names.  Note that several ports have wedged const_int/const_double
constraints into EXTRA_CONSTRAINT(_STR) because they ran out of letters
(presumably this predates multiletter constraint strings).

> I believe some code paths are also faster because a switch on the
> initial constraint letter goes straight to the right code path.

Performance issues discussed below.

> Although it would be cleaner if these standard constraints - also 'r'
> and 'g' - would be handled in one central place (which should be possible
> when you autogenerate the constraint handling code) so that all the
> other code does not have to do special-purpose checks.

I was not planning to autogenerate code for the standard constraints -
I don't like large blocks of verbatim code in the generator programs -
but they would be handled via the same table as machine-specific
constraints.

> When I did the multi-letter constraints, I considered that we
> wouldn't need them if we had eight-bit clean constraint codes, but
> that would have been unsuitable for programming, and unused codes might
> have to be reclaimed.  Automatic building of tables could convert from
> the readable and maintainable multi-letter/digit format into more
> efficient internal bytecodes.  We could also compact const modifiers,
> and eliminate spacing.

My vague idea - and I want to emphasize that I don't have the time to
code it, not for several months - was that the constraint-string field
in struct insn_operand_data would become an array of indices into
a new table of 'struct constraint_data', representing everything about
the constraint in already-parsed format.  I have not reviewed every
place that processes constraints, so I don't know yet what goes into
that structure.  Your bytecodes are probably the same thing as my array
indices, but I prefer thinking of them as indices because that makes
clear that they're not a stable thing across configurations.  They're
like insn code numbers.  Note that for the purposes of this table, 'x'
and '?x' would get different entries.
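
To make the table idea a bit more concrete, here is a rough sketch.  It is
not from the patch: constraint_kind, the fields of struct constraint_data,
operand_constraints and constraint_lookup are all invented for illustration,
since I have not yet worked out what actually belongs in the structure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch only; none of these names or fields come from
   the patch.  */
enum constraint_kind { CK_REGISTER, CK_CONST_INT, CK_MEMORY };

struct constraint_data
{
  const char *name;           /* source spelling, e.g. "x" or "?x" */
  enum constraint_kind kind;  /* what sort of operand it accepts */
  int disparage;              /* number of leading '?' in the spelling */
};

/* One entry per distinct constraint spelling.  The indices are only
   meaningful within one configuration, like insn code numbers.  */
static const struct constraint_data constraint_table[] =
{
  { "r",  CK_REGISTER,  0 },
  { "x",  CK_REGISTER,  0 },
  { "?x", CK_REGISTER,  1 },  /* distinct entry from plain "x" */
  { "K",  CK_CONST_INT, 0 },
};

#define N_CONSTRAINTS (sizeof constraint_table / sizeof constraint_table[0])

/* What the constraint field of an operand might become: an array of
   table indices instead of a string to be parsed on every use.  */
struct operand_constraints
{
  const unsigned char *alts;  /* one table index per alternative */
  size_t n_alts;
};

/* Return the preparsed data for table index IDX.  */
const struct constraint_data *
constraint_lookup (unsigned int idx)
{
  assert (idx < N_CONSTRAINTS);
  return &constraint_table[idx];
}
```

The point is only that consumers index a table rather than re-parse a
string; the real contents would depend on what reload and friends need.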

I'm not sure how to handle asm_operands - in the general case one would
have to dynamically allocate code numbers and table entries, but that
plays havoc with GC.

> It would be useful to have some profiles for the compiler with the
> constraint parsing put into separate functions (like your patch does)
> on reload-heavy code to get an idea how important the speed aspect
> actually is.
> 
> In order to retain the speed advantage of combining constraint parsing
> with operand type parsing, it would make sense if the generator programs
> also provided macros with list of case labels that can be used in a switch
> statement.

Well, the interesting thing would be to code the preparsed-constraint
patch (breaking all unconverted ports) and benchmark that against the
unmodified compiler.  That would then provide a carrot to wave at people
to get them to convert their ports.  But, as I said, I don't have time to
do that.  The existing patch is designed to facilitate the conversion,
and therefore has had to trade away performance.  I could probably buy
it back by moving the generated functions inline into tm-preds.h, but
frankly I don't think the difference will be measurable.

Also, I have zero interest in generating code fragments to be
inserted into the machine-independent compiler; that fails to make
progress toward the goal of being able to swap out the backend
without recompiling the world.

> Something also to keep in mind about standard constraints is that
> our matching constraints are incomplete - when the modes of the to-be
> matched operands don't agree, the only match that works is little endian
> lowpart matching.  We can't do lowpart matching for big endian, and
> in general, why can't we match any part of the register we want to?
> Thus, for bytecodes, we might want to reserve some encoding space for more
> general matching constraints.

I don't understand the issue you're raising here, but in my vague design,
there is no bytecode space to be reserved, there's just indices into a
table of data.  Whatever new scheme you come up with can be fit in.

If you're trying to suggest additional syntax for define_constraint
and friends, then could you be a little clearer about what you have
in mind, please?

> > My biggest remaining concern, and perhaps we could just live with it,
> > is that every use of the *_FOR_LETTER / *_FOR_CONSTRAINT macros in a
> > back end has to be eliminated.
> 
> This is putting it mildly.  If we were to inline all uses of
> _OK_FOR_CONSTRAINT macros, we'd get massive code duplication, with
> all the maintenance headaches that this implies.
> 
> The gen* machinery should spit out macros and/or functions that
> allow the constraints to be tested from C code.

As it is, one *can* call e.g. insn_const_ok_for_letter (foo, "K") from
C code, but this won't work with the preparsed-constraints patch.  I
could, however, generate an enumeration for the constraint names and
a function, say, satisfies_constraint_p (rtx, enum constraint) - perhaps
this is an argument for the break-the-world patch, as then one wouldn't
have to fix each port twice.
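
Roughly what I have in mind for the generated interface, as a sketch under
loud assumptions: struct toy_rtx is a stand-in for the real rtx so that the
example compiles on its own, and the enumerator names and the ranges tested
for 'I' and 'K' are invented, not taken from any port.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for GCC's rtx, just enough for this example.  */
struct toy_rtx { bool is_const_int; long value; };
typedef const struct toy_rtx *rtx;

/* What a generator might emit from the constraint definitions.  The
   names and the ranges below are invented for illustration.  */
enum constraint { CONSTRAINT_I, CONSTRAINT_K, CONSTRAINT__LIMIT };

/* Return true if OP satisfies constraint C.  */
bool
satisfies_constraint_p (rtx op, enum constraint c)
{
  assert (c < CONSTRAINT__LIMIT);
  switch (c)
    {
    case CONSTRAINT_I:  /* pretend: signed 8-bit integer constant */
      return op->is_const_int && op->value >= -128 && op->value <= 127;
    case CONSTRAINT_K:  /* pretend: unsigned 16-bit integer constant */
      return op->is_const_int && op->value >= 0 && op->value <= 0xffff;
    default:
      return false;
    }
}
```

A back end would then write satisfies_constraint_p (x, CONSTRAINT_K)
where it now uses one of the *_OK_FOR_LETTER macros.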

> Note that the current implementation of CONSTRAINT_LEN allows changing
> the length of '?'.  The idea was that you could have more fine-grained
> control over the cost modification, although the bit to decode the
> following string for the cost value was not added.
> A character or string that does not otherwise appear after '?' could be
> used to identify the multi-character sequence(s) so that a sole '?' would
> retain its old meaning.

I have no objection to this feature, but as '?' is a generic feature,
all handling of it should be generic too.  I didn't change the
callers of CONSTRAINT_LEN in this patch, because it was a distraction
from the main thrust, but the intent is that CONSTRAINT_LEN /
insn_constraint_len never sees anything generic in the first place.

Syntax like ?nnn? (where nnn are [0-9]) should work, ne?  A trailer is
probably needed in case anyone's put a ? on a matching constraint.
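
A minimal sketch of how generic code might decode that, assuming a sole
'?' keeps a cost of 1 and that the digits are the cost itself.  The
function name parse_disparage and both of those choices are hypothetical;
nothing like this exists in the patch.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Parse a disparagement marker at P, which must point at a '?'.
   Store the cost in *COST and return the number of characters
   consumed.  "?nnn?" yields cost nnn; anything else is treated as a
   sole '?' with cost 1.  No overflow check; this is only a sketch.  */
size_t
parse_disparage (const char *p, int *cost)
{
  size_t i = 1;
  int n = 0;

  assert (p[0] == '?');
  while (isdigit ((unsigned char) p[i]))
    {
      n = n * 10 + (p[i] - '0');
      i++;
    }

  /* Only accept the digits if the closing '?' is present.  Without
     it, "?0" is a plain '?' followed by matching constraint '0'.  */
  if (i > 1 && p[i] == '?')
    {
      *cost = n;
      return i + 1;
    }

  *cost = 1;
  return 1;
}
```

With this, "?12?r" consumes four characters with cost 12, while "?0"
consumes one character and leaves the '0' to be read as a matching
constraint, which is what the trailer is for.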

> > +  for (p = name + 1; *p; p++)
> > +    if (!ISALNUM (*p))
> > +      {
> > +     message_with_line (lineno, "constraint name '%s' must be only "
> > +                        "numbers and letters", name);
> 
> That is too restrictive.  h8300.h uses '<' and '>'.  '_' might also
> make a lot of sense at times.  The basic idea was to allow all printable
> ASCII characters, but I excluded ',', '#' and '*' to avoid slowing down
> scans to get to the end of an alternative.

< and > are generic.  I see real value in restricting machine-specific
constraint names to identifier characters.  For one thing, it means
nice predictable names for those enumeration constants I mentioned
above.  For another, it means a reader can have confidence that
punctuation has a generic meaning.  The existing punctuation is
obscure enough without that additional worry.

Allowing _ is fine by me, I just disallowed it because no existing
port uses it.
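
For the record, the relaxed check would look something like the sketch
below.  It uses plain <ctype.h> rather than the ISALNUM macro from the
patch so that it stands alone, the name constraint_name_ok_p is made up,
and requiring the first character to be a letter is my assumption here,
not something shown in the quoted hunk.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if NAME is acceptable as a machine-specific constraint
   name: a letter (an assumption of this sketch) followed by any
   number of letters, digits, or underscores.  */
bool
constraint_name_ok_p (const char *name)
{
  const char *p;

  assert (name != NULL);
  if (!isalpha ((unsigned char) name[0]))
    return false;

  for (p = name + 1; *p; p++)
    if (!isalnum ((unsigned char) *p) && *p != '_')
      return false;

  return true;
}
```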

zw