This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [Patch docs 1/5] Update the first section of md.texi
- From: James Greenhalgh <james dot greenhalgh at arm dot com>
- To: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Cc: "gerald at pfeifer dot com" <gerald at pfeifer dot com>, "joseph at codesourcery dot com" <joseph at codesourcery dot com>
- Date: Tue, 6 Jan 2015 16:56:00 +0000
- Subject: Re: [Patch docs 1/5] Update the first section of md.texi
- Authentication-results: sourceware.org; auth=none
- References: <1420543302-11008-1-git-send-email-james dot greenhalgh at arm dot com> <1420543302-11008-2-git-send-email-james dot greenhalgh at arm dot com>
As requested, please find a paragraph-by-paragraph justification for
the changes in this patch below.
I hope this aids review.
Of particular interest for this patch is the removal of the text
describing the semicolon for a comment syntax. I believe that a
description of the syntax of a .md file does not belong in the
introduction. I don't have a good suggestion as to where it should
move to.
Thanks,
James
On Tue, Jan 06, 2015 at 11:21:38AM +0000, James Greenhalgh wrote:
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index 7f0426c..0277f14 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -7,39 +7,39 @@
> @chapter Machine Descriptions
> @cindex machine descriptions
>
> -A machine description has two parts: a file of instruction patterns
> -(@file{.md} file) and a C header file of macro definitions.
> +A machine description has two parts: one or more files describing the
> +instructions available on the target (@file{.md} files) and one or more
> +C/C++ source files implementing target hooks and macros. These are
> +described in the next chapter (@pxref{Target Macros}).
The original text was inaccurate. .md files can be included in one
another, so it is wrong to say "a file". Real-world backends are not
just C header files of macro definitions.
Rather than just mentioning the next chapter, cross-reference it.
> -The @file{.md} file for a target machine contains a pattern for each
> -instruction that the target machine supports (or at least each instruction
> -that is worth telling the compiler about). It may also contain comments.
> -A semicolon causes the rest of the line to be a comment, unless the semicolon
> -is inside a quoted string.
A detailed description of the syntax of a .md file does not belong in an
introduction. It is a deficiency of this patch set that this text is not
added back elsewhere. Do you have any suggestions of where the text could
sensibly go?
> -See the next chapter for information on the C header file.
Moved above.
> +This chapter describes the @file{.md} files for a target machine. These
> +contain the instruction patterns which are used by the compiler when
> +expanding gimple to RTL, during the RTL optimization passes, and when
> +emitting assembler code.
New text. Give a high level overview of what is in an md file and why
we want it.
>
> @menu
> * Overview:: How the machine description is used.
> * Patterns:: How to write instruction patterns.
> * Example:: An explained example of a @code{define_insn} pattern.
> -* RTL Template:: The RTL template defines what insns match a pattern.
> +* RTL Template:: The RTL template defines which insns may match a
> + pattern.
s/what/which/ -- This reads better to me
s/match/may match/ -- Many things can result in an RTL Template not
matching; pattern ordering, predicates, failing conditions, etc.
> * Output Template:: The output template says how to make assembler code
> - from such an insn.
> + from an insn matched to an instruction pattern.
The previous text did not make sense. "such an insn" is not bound to
anything if I am reading the contents non-linearly (as I am likely to do).
> * Output Statement:: For more generality, write C code to output
> the assembler code.
> -* Predicates:: Controlling what kinds of operands can be used
> - for an insn.
> +* Predicates:: Controlling which kinds of operands can be used
> + when matching an insn to an instruction pattern.
s/what/which/ -- Reads better to me.
Add text qualifying when predicates are used - predicates can't be used
to instruct register allocation after pattern matching, they are only
used in pattern matching.
> * Constraints:: Fine-tuning operand selection.
> * Standard Names:: Names mark patterns to use for code generation.
> * Pattern Ordering:: When the order of patterns makes a difference.
> * Dependent Patterns:: Having one pattern may make you need another.
> * Jump Patterns:: Special considerations for patterns for jump insns.
> * Looping Patterns:: How to define patterns for special looping insns.
> -* Insn Canonicalizations::Canonicalization of Instructions
> +* Insn Canonicalizations::Canonicalization of instructions
We don't use title case elsewhere in the descriptions.
> * Expander Definitions::Generating a sequence of several RTL insns
> for a standard operation.
> -* Insn Splitting:: Splitting Instructions into Multiple Instructions.
> -* Including Patterns:: Including Patterns in Machine Descriptions.
> +* Insn Splitting:: Splitting insns into multiple insns.
> +* Including Patterns:: Including patterns in machine descriptions.
Insn splitting acts on insns (a compiler concept) not instructions (a
machine concept).
We don't use title case elsewhere in the descriptions.
> * Peephole Definitions::Defining machine-specific peephole optimizations.
> * Insn Attributes:: Specifying the value of attributes for generated insns.
> * Conditional Execution::Generating @code{define_insn} patterns for
> @@ -54,50 +54,60 @@ See the next chapter for information on the C header file.
> @node Overview
> @section Overview of How the Machine Description is Used
>
> -There are three main conversions that happen in the compiler:
> +There are four main conversions that happen in the compiler:
The previous text is inaccurate.
>
> @enumerate
>
> @item
> -The front end reads the source code and builds a parse tree.
> +The front end reads the source code and converts it to an intermediate,
> +front end specific, tree based representation.
The previous text is imprecise.
> @item
> -The parse tree is used to generate an RTL insn list based on named
> -instruction patterns.
> +This tree based representation is lowered (gimplified) to a gimple
> +representation.
The previous text is inaccurate.
>
> @item
> -The insn list is matched against the RTL templates to produce assembler
> -code.
> +The gimple representation is transformed (expanded) to an RTL
> +representation. This RTL representation is a doubly-linked chain of
> +objects called insns, known as the insn list.
New text describing expansion from gimple to RTL.
> +
> +@item
> +The insn list is optimized, and the optimized insn list is matched
> +against @code{define_insn} instruction patterns in the machine description
> +to produce assembler code.
Add text describing that the insn list is further optimized before
instruction selection.
>
> @end enumerate
>
> -For the generate pass, only the names of the insns matter, from either a
> -named @code{define_insn} or a @code{define_expand}. The compiler will
> +When expanding from gimple to RTL, only named @code{define_insn}
> +constructs and @code{define_expand} constructs are used. The compiler will
> choose the pattern with the right name and apply the operands according
> to the documentation later in this chapter, without regard for the RTL
> -template or operand constraints. Note that the names the compiler looks
> -for are hard-coded in the compiler---it will ignore unnamed patterns and
> -patterns with names it doesn't know about, but if you don't provide a
> -named pattern it needs, it will abort.
> +template or operand constraints. Note that the names which are used for
> +expansion are hard-coded in the compiler---unnamed patterns and patterns
> +with names which do not have a standard meaning are ignored during
> +expansion. If you don't provide a named pattern that the compiler is
> +trying to expand, it may try a different expansion strategy. If no
> +other expansion strategies are possible, the compiler will abort.
This is a general cleanup to make the text less casual. This could do
with a second pass over it to remove the repeated use of "you" and the
anthropomorphism of the compiler.
I prefer the text below. We lose "without regard for the RTL template or
operand constraints", but this is contradictory anyway, as the RTL template
is used when expanding a define_insn.
When expanding from gimple to RTL, only named @code{define_insn}
constructs and @code{define_expand} constructs are used. The set of
names which are meaningful during expansion are detailed later in this
chapter (@pxref{Standard Names}. Note that the names which are used for
expansion are hard-coded in the compiler---unnamed patterns and patterns
with names which do not have a standard meaning are ignored during
expansion. If a named pattern that the compiler is trying to expand,
is not provided, it may try a different expansion strategy. If no
other expansion strategies are possible, the compiler will abort.
>
> If a @code{define_insn} is used, the template given is inserted into the
> -insn list. If a @code{define_expand} is used, one of three things
> -happens, based on the condition logic. The condition logic may manually
> -create new insns for the insn list, say via @code{emit_insn()}, and
> -invoke @code{DONE}. For certain named patterns, it may invoke @code{FAIL} to tell the
> -compiler to use an alternate way of performing that task. If it invokes
> -neither @code{DONE} nor @code{FAIL}, the template given in the pattern
> +insn list.
> +
> +If a @code{define_expand} is used, one of three things happens, encoded in
> +the condition logic. The condition logic may manually create new insns
> +for the insn list, say via @code{emit_insn ()}, and invoke @code{DONE},
> +indicating a successful expansion. If the standard pattern name permits
> +it, the condition logic may invoke @code{FAIL} to express that an alternate
> +strategy should be used to performing the expansion. If the condition logic
> +invokes neither @code{DONE} nor @code{FAIL}, the template given in the pattern
> is inserted, as if the @code{define_expand} were a @code{define_insn}.
Split one paragraph to two. One dealing with define_insn expressions, one
for define_expand expressions. This aids readability.
s/based on the condition logic/encoded in the condition logic/ -- I
don't like the use of "based on" to describe code, the code explicitly
says what to do, the compiler doesn't "base" what to do on the code.
I dislike that the condition logic is the actor in the paragraph, but I
struggled to rewrite it, so I gave up and just replaced the unclear use of
"it" with "the condition logic".
> Once the insn list is generated, various optimization passes convert,
> -replace, and rearrange the insns in the insn list. This is where the
> -@code{define_split} and @code{define_peephole} patterns get used, for
> -example.
These examples don't add anything useful and introduce new keywords we've
not mentioned before.
> -Finally, the insn list's RTL is matched up with the RTL templates in the
> -@code{define_insn} patterns, and those patterns are used to emit the
> -final assembly code. For this purpose, each named @code{define_insn}
> -acts like it's unnamed, since the names are ignored.
> +replace, and rearrange the insns in the insn list.
> +
> +Finally, the insn list's RTL is matched with the RTL templates from the
> +@code{define_insn} instruction patterns, and those patterns are used to
> +emit the final assembly code. For this purpose, names are ignored. All
> +@code{define_insn} are considered for matching.
Just a rewording. The previous text is unnecessarily verbose.