This is the mail archive of the mailing list for the GCC project.

Re: [RFC]: designing customizable format attributes

On Tue, 11 Jul 2005, Ian Lance Taylor wrote:

> Intuitively the most logical approach would seem to be a state machine
> driven by the characters in the format string.  Periodically the state
> machine would emit a type.  For example

This is a lot closer to what I think the data structures should end up 
looking like.  That is why I think adding the feature with the present 
data structures would be premature, and why I think letting the current 
data structures shape the user interface to the feature too much would 
be dangerous.

(One previous list discussion suggested regular expressions.  Though more 
or less isomorphic to state machines, I don't think they are a good match 
to this particular problem.)

I'd use something a bit higher level than your state machine, to better 
represent the structure format strings in fact have.  For example, length 
modifiers might call a subroutine "find the first of the strings in this 
list which matches at this point in the string [the empty string would be 
last] and record its index in this register".  Then, after the conversion 
specifier has been parsed, "look up an entry in this two-dimensional array 
indexed by these two registers" would be called to find the type, if any, 
for the given length modifier and conversion specifier.  Various 
subroutines would have ways to specify diagnostics for not matching.

Backtracking would also be simpler than with a pure state machine.  A 
decimal number after % could be either a width or an operand number, so 
"optionally parse a number followed by $, storing it in this register if 
found" would be called first.  If the number wasn't followed by $, the 
"parse width" routine would find itself still at the start of the number, 
rather than needing a state machine to do "parse number" then "if $ then 
operand number else width".
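
To make the subroutine style above concrete, here is a rough sketch in 
Python (chosen only for brevity).  Every name in it (match_first, 
parse_operand_number, TYPE_TABLE, the "regs" dictionary standing in for 
registers) is invented for illustration, and the table covers only a 
handful of printf directives; it is not how GCC's format checking is 
written.

```python
# Hypothetical sketch; none of these names come from GCC.

# Candidates are tried in order; the empty string is last so it always matches.
LENGTH_MODIFIERS = ["hh", "ll", "h", "l", ""]
CONVERSIONS = ["d", "u", "s"]

# TYPE_TABLE[length index][conversion index]; None means "no valid type".
TYPE_TABLE = [
    ["signed char", "unsigned char", None],        # hh
    ["long long", "unsigned long long", None],     # ll
    ["short", "unsigned short", None],             # h
    ["long", "unsigned long", "wchar_t *"],        # l
    ["int", "unsigned int", "char *"],             # no length modifier
]


def match_first(fmt, pos, candidates):
    """Return (index, new_pos) of the first candidate matching at pos,
    or (None, pos) if none matches."""
    for i, cand in enumerate(candidates):
        if fmt.startswith(cand, pos):
            return i, pos + len(cand)
    return None, pos


def parse_number(fmt, pos):
    """Parse a decimal number at pos; return (value, new_pos) or (None, pos)."""
    end = pos
    while end < len(fmt) and fmt[end].isdigit():
        end += 1
    if end == pos:
        return None, pos
    return int(fmt[pos:end]), end


def parse_operand_number(fmt, pos):
    """Optionally parse "N$".  If the number is not followed by $, the
    position is left unchanged, so the width parser sees the number."""
    value, end = parse_number(fmt, pos)
    if value is not None and end < len(fmt) and fmt[end] == "$":
        return value, end + 1
    return None, pos


def parse_directive(fmt, pos):
    """Parse one directive starting just after '%'.
    Returns (registers, type name, new_pos)."""
    regs = {}
    regs["operand"], pos = parse_operand_number(fmt, pos)
    regs["width"], pos = parse_number(fmt, pos)
    regs["length"], pos = match_first(fmt, pos, LENGTH_MODIFIERS)
    regs["conv"], pos = match_first(fmt, pos, CONVERSIONS)
    if regs["conv"] is None:
        raise ValueError(f"unknown conversion specifier at offset {pos}")
    ctype = TYPE_TABLE[regs["length"]][regs["conv"]]
    if ctype is None:
        raise ValueError("length modifier not valid with this conversion")
    return regs, ctype, pos
```

With these definitions, parse_directive("2$5ld", 0) records operand 
number 2 and width 5 and yields the type "long", while 
parse_directive("5d", 0) records no operand number, width 5, and 
yields "int".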

I think it should be possible to move towards such structures 
incrementally, gradually moving more logic into the data structures and 
reducing the number of special cases with their own flags or code.

> Since the goal is to produce an attribute string, we can see that it's
> pretty easy to describe this kind of state machine using a little
> language.  LABEL is [0-9]+.  CHAR is any character in the string.
> TYPENAME is any string, meant to be the name of a type.

I think an interface taking some form of list of strings and types would 
be better, to avoid calling back into the lexer and parser to decode type 
names extracted from the string.
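
As a rough illustration of the "list of strings and types" shape, the 
sketch below (Python again, with wholly hypothetical names such as CType, 
SPEC and build_table) pairs each length modifier and conversion specifier 
with an already-resolved type object, so nothing has to decode a type 
name out of a string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CType:
    """Hypothetical stand-in for an already-resolved type."""
    name: str


INT = CType("int")
LONG = CType("long")
CHAR_PTR = CType("char *")

# Each entry: (length modifier, conversion specifier, type object).
SPEC = [
    ("", "d", INT),
    ("l", "d", LONG),
    ("", "s", CHAR_PTR),
]


def build_table(spec):
    """Turn the list into a lookup keyed by (length, conversion),
    rejecting duplicate entries."""
    table = {}
    for length, conv, ctype in spec:
        key = (length, conv)
        if key in table:
            raise ValueError(f"duplicate entry for %{length}{conv}")
        table[key] = ctype
    return table
```

Here build_table(SPEC)[("l", "d")] is the LONG object itself, not a 
name that still needs parsing.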

> In any case, what would make this useful is to be able to say "this
> attribute string is printf plus the following".  Then the string would
> add to and override the state machine created from the default printf
> string.

Which in the state machine model is difficult to do because it depends on 
fine details of how the printf state machine is implemented.

Joseph S. Myers
