This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [patch,avr] Fix PR 65657 - read from __memx address space tramples arguments to function call


Am 04/16/2015 um 11:28 AM schrieb Senthil Kumar Selvaraj:
On Thu, Apr 16, 2015 at 11:02:05AM +0200, Georg-Johann Lay wrote:
Am 04/16/2015 um 08:43 AM schrieb Senthil Kumar Selvaraj:
This patch fixes PR 65657.

The following artifact appears to be PR63633.


I did see that one - unfortunately, that fix won't help here. IIUC, you
check if input/output operand hard regs are in the clobber list,
and if yes, you generate pseudos to save and restore clobbered hard
regs.

In this case, the reg is actually clobbered by a different insn (one

Arrgh, yes...

that loads the next argument to the function). So unless I blindly generate pseudos for
all regs in the clobber list, the clobbering will still happen.

FWIW, I did try saving/restoring all clobbered regs, and it did fix the
problem - just that it appeared like a (worse) hack to me. Aren't we
manually replicating what RA/reload should be doing?

As it appears, we'll have to do it by hand. The attaches patch is just a sketch that indicates how the problem could be approached. Notice the new assertions in the split expanders; they will throw ICE until the fix is actually installed.

The critical insn are generated in movMM expander and are changed to have no clobbers (SCRATCHes actually). An a later pass, when appropriate life info can be made available, run yet another avr pass that

1a) Save-restore needed hard regs around the insn.

2a) Kick out hard regs overlapping the clobbers, e.g. in set_src, into new pseudos. Maybe that could happen due to some hard regs progagation, or we can use a new predicate similar combine_pseudo_register_operand.

3) Replace scratch -> hard regs for all scratch_operands.

2b) Restore respective hard regs from their pseudos.

1b) Restore respective hard regs from their pseudos.


And maybe we can get better code by allocating things like address register by hand and get better code then.

When I implemented some of the libgcc insns I tried to express the operand by means of constraints, e.h. for (reg:HI 22) and let register allocator do the job.

The resulting code was functional but *horrific*.

The register allocator is not yet ready to generate efficient code in such demanding situations...


What do you think?


IMO sooner or later we'll need such an infrastructure; maybe also for non-mov insn that are implemented by transparent libcalls like divmod, mul, etc.

When cfgexpand.c expands a function call, it first figures out the
registers in which the arguments will go, followed by expansion of the
arguments themselves (right to left). It later emits mov insns to set
the precomputed registers with the expanded RTXes.

If one of the arguments is a __memx char pointer dereference, the mov
eventually expands to gen_xload<mode>_A (for certain cases), which
clobbers R22, R21 and Z.  This causes problems if one of those
clobbered registers was precomputed to hold another argument.

In general, call expansion does not appear to take clobbers into account -

We had been warned that using hard regs is evil... But without that technique the code quality would decrease way too much.

when it checks for argument overlap, the RTX (args[i].value) is only a MEM
in QImode for the memx deref - the clobber shows up when it eventually
calls emit_move_insn, at which point, it is too late.

Such situations could only be handled by a target hook which allowed to expand specific trees by hand... Such a hook could cater for insn that must use hard registers.

This does not happen for a int pointer dereference - turns out that
precompute_register_parameters does a copy_to_mode_reg if the
cost of args[i].value is more than COSTS_N_INSNS(1) i.e., it creates a
pseudo and later assigns the pseudo to the arg register. This is done
before any moves to arg registers is done, so other arguments are not
overwritten.

Doing the same thing - providing a better cost estimate for a MEM rtx in
the non-default address space, makes this problem go away, and that is
what this patch does. Regression testing does not show any new failures.

Can you tell something about overall code quality? If it is not significantly worse then I'd propose to apply your rtx-costs solution soon. The full fix will take more time to work it out.


Johann


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]