This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH 0/4] RFC: RTL frontend
- From: Jeff Law <law at redhat dot com>
- To: David Malcolm <dmalcolm at redhat dot com>, gcc-patches at gcc dot gnu dot org
- Date: Mon, 16 May 2016 16:42:13 -0600
- Subject: Re: [PATCH 0/4] RFC: RTL frontend
- References: <1462394970-55471-1-git-send-email-dmalcolm at redhat dot com>
On 05/04/2016 02:49 PM, David Malcolm wrote:
* The existing RTL code is structured around a single function being
optimized, so, as a simplification, the RTL frontend can only handle
one function per input file. Also, the dump format currently uses
comments to separate functions::
;; Function test_1 (test_1, funcdef_no=0, decl_uid=1758, cgraph_uid=0, symbol_order=0)
ISTM we can fix this by adding real structure to the RTL dump.
IMHO we have the freedom to extend the RTL dumper to make the dumps
easier to read back in for this kind of work.
... various pass-specific things, sometimes expressed as comments,
sometimes not
Which seems like a bug to me.
;;
;; Full RTL generated for this function:
;;
(note 1 0 6 NOTE_INSN_DELETED)
;; etc, insns for function "test_1" go here
(insn 27 26 0 6 (use (reg/i:SI 0 ax)) ../../src/gcc/testsuite/rtl.dg/test.c:7 -1
(nil))
;; Function test_2 (test_2, funcdef_no=1, decl_uid=1765, cgraph_uid=1, symbol_order=1)
... various pass-specific things, sometimes expressed as comments,
sometimes not
;;
;; Full RTL generated for this function:
;;
(note 1 0 5 NOTE_INSN_DELETED)
;; etc, insns for function "test_2" go here
(insn 59 58 0 8 (use (reg/i:SF 21 xmm0)) ../../src/gcc/testsuite/rtl.dg/test.c:31 -1
(nil))
so that there's no clear separation of the instructions between the
two functions (and no metadata e.g. function names).
This could be fixed by adding a new clause to the dump e.g.::
Which would seem like a good idea to me.
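Until the dump has such a clause, the ";; Function" comment headers are the
only function boundary a reader can rely on. A minimal sketch of splitting a
dump on them follows, written in Python for illustration rather than in GCC's
own C++. The helper name split_functions is hypothetical, and the regular
expression assumes the header layout shown in the examples above.

```python
import re

# Matches the comment header the dumper currently prints before each function,
# e.g. ";; Function test_1 (test_1, funcdef_no=0, decl_uid=1758, ...)".
FUNC_HEADER = re.compile(r"^;; Function (\S+) \((.*)\)\s*$")


def split_functions(dump_text):
    """Return a list of (name, metadata, body_lines) tuples, one per function.

    Hypothetical helper: lines before the first header are dropped, and
    everything up to the next header is attributed to the current function.
    """
    functions = []
    current = None
    for line in dump_text.splitlines():
        m = FUNC_HEADER.match(line)
        if m:
            # Collect the key=value items (funcdef_no=0, decl_uid=1758, ...).
            parts = (part.strip() for part in m.group(2).split(","))
            meta = dict(item.split("=", 1) for item in parts if "=" in item)
            current = (m.group(1), meta, [])
            functions.append(current)
        elif current is not None:
            current[2].append(line)
    return functions
```

This only works as long as every pass prints the header in the same way, which
is the fragility an explicit clause in the dump would remove.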
* Similarly, there are no types beyond the built-in ones; all expressions
are treated as being of type int. I suspect that this approach
will be too simplistic when it comes to e.g. aliasing.
Well, we have pointers back to the tree IL for this kind of thing, but
it's far from ideal because of the lack of separation that implies.
I wouldn't lose a ton of sleep if we punted this for a while, perhaps
just dumping the alias set splay tree so we can at least carry that
information around.
* There's no support for running more than one pass; fixing this would
require being able to run passes from a certain point onwards.
I think that's OK at this stage.
* Roundtripping of recognized instructions may be an issue (i.e. those
with INSN_CODE != -1), such as the "667 {jump}" in the following::
(jump_insn 50 49 51 10
(set (pc)
(label_ref:DI 59)) ../../src/test-switch.c:18 667 {jump}
(nil) -> 59)
since the integer ID can change when the .md files are changed
(and the associated pattern name is very much target-specific).
It may be best to reset them to -1 in the input files (and delete the
operation name), giving::
Just ignore the index and the pretty name. When you're done reading the
file, call recog on each insn to get that information filled in.
(jump_insn 50 49 51 10
(set (pc)
(label_ref:DI 59)) ../../src/test-switch.c:18 -1
(nil) -> 59)
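The rewrite from "667 {jump}" to "-1" can be sketched as a text filter over an
existing dump. This is a Python illustration, not part of the patch kit: the
name strip_insn_codes is hypothetical, and the pattern assumes the
"file:line code {name}" layout seen in the examples above. Inside the
compiler, the reader would then run recog over each insn to fill the code back
in.

```python
import re

# "<file>:<line> <code> {<name>}" as printed for a recognized insn.
# Unrecognized insns already end in "<file>:<line> -1" and do not match.
RECOGNIZED = re.compile(r"(\S+:\d+) \d+ \{[^}]*\}")


def strip_insn_codes(dump_text):
    """Replace every recognized insn code and pattern name with -1."""
    return RECOGNIZED.sub(r"\1 -1", dump_text)
```

For example, strip_insn_codes applied to the line ending in
"../../src/test-switch.c:18 667 {jump}" yields the same line ending in
"../../src/test-switch.c:18 -1".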
* Currently there's no explicit CFG edge information in the dumps.
The rtl1 frontend reconstructs the edges based on jump instructions.
As I understand the distinction between cfgrtl and cfglayout modes
https://gcc.gnu.org/wiki/cfglayout_mode , this is OK for "cfgrtl" mode,
but isn't going to work for "cfglayout" mode - in the latter,
unconditional jumps are represented purely by edges in the CFG, and this
information isn't currently present in the dumps (perhaps we could add it
if it's an issue).
We could either add the CFG information or you could extract it from the
guts of the RTL you read. The former leads to the possibility of an
inconsistent view of the CFG. The latter is more up-front work and has
to deal with the differences between cfgrtl and cfglayout modes.
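To make the second option concrete, here is a rough Python sketch of
rebuilding edges from jump insns alone. The block representation (dicts with
'index', 'labels' and 'jump' keys) and the name rebuild_edges are assumptions
made up for this example, not anything in rtl1, and returns, table jumps and
barriers are ignored entirely.

```python
def rebuild_edges(blocks):
    """Rebuild CFG edges from explicit jumps, cfgrtl style.

    blocks: list of dicts in layout order, each with
      'index'  : basic block number,
      'labels' : set of code_label ids that live in the block,
      'jump'   : None, or a (kind, target_label) pair where kind is
                 'uncond' or 'cond'.
    Returns a sorted list of (src_index, dest_index) pairs.
    """
    label_to_block = {lab: b["index"] for b in blocks for lab in b["labels"]}
    edges = set()
    for pos, block in enumerate(blocks):
        jump = block["jump"]
        if jump is not None:
            _kind, target = jump
            edges.add((block["index"], label_to_block[target]))
        # Fall through to the next block unless the jump is unconditional.
        falls_through = jump is None or jump[0] == "cond"
        if falls_through and pos + 1 < len(blocks):
            edges.add((block["index"], blocks[pos + 1]["index"]))
    return sorted(edges)
```

In cfglayout mode the unconditional jumps are not in the insn stream, so a
block that really jumps elsewhere would look like a fall-through here. That is
the gap that explicit edge information in the dump would have to fill.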
Open Questions
**************
* Register numbering: consider this fragment of RTL emitted during
expansion::
(reg/f:DI 82 virtual-stack-vars)
At the time of emission, register 82 is the VIRTUAL_STACK_VARS_REGNUM,
and this value is effectively hardcoded into the dump. Presumably this
is baking in assumptions about the target into the test. Also, how likely is
this value to change? When we reload the dump, should we notice that this
is tagged with virtual-stack-vars and override the specific register
number to use the current value of VIRTUAL_STACK_VARS_REGNUM on the
target rtl1 was built for?
Those change semi-regularly: essentially any time a new version of the
ISA shows up with new register numbers.
My instinct is to drop the raw numbers and just output registers
symbolically. We can map them back into the hard register numbers
easily enough. We would want some magic to identify pseudo regs, e.g.
P1...PN in the dumps, which we'd map to FIRST_PSEUDO_REGISTER+N when we
read the file in.
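A small Python sketch of that mapping, for illustration only. The function
name regno_from_symbol and its parameters are hypothetical; in the compiler
the name table and FIRST_PSEUDO_REGISTER would come from the target the reader
was built for. The two hard register names in the usage note are the ones that
appear in the x86 dumps quoted above.

```python
import re

# P<N> is the proposed spelling for pseudo registers in the dump.
PSEUDO = re.compile(r"^P(\d+)$")


def regno_from_symbol(name, hard_reg_names, first_pseudo_register):
    """Map a symbolic register name from a dump to a register number.

    hard_reg_names: dict from hard register name to number for the
        current target (assumed to be supplied by the caller).
    first_pseudo_register: the target's FIRST_PSEUDO_REGISTER value.
    """
    m = PSEUDO.match(name)
    if m:
        # Pseudo N is offset from the end of the hard registers.
        return first_pseudo_register + int(m.group(1))
    # Anything else must be a hard register the target knows by name.
    return hard_reg_names[name]
```

With hard_reg_names = {"ax": 0, "xmm0": 21} and an illustrative
first_pseudo_register of 100, "ax" maps to 0 and "P3" maps to 103. A real
reader would also need names for the virtual registers such as
virtual-stack-vars, and would have to keep the P numbers clear of them.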
Jeff