This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: PATCH: pseudo conditional traps for ix86
- To: Greg McGary <greg at mcgary dot org>
- Subject: Re: PATCH: pseudo conditional traps for ix86
- From: Linus Torvalds <torvalds at transmeta dot com>
- Date: Tue, 29 Aug 2000 12:32:53 -0700 (PDT)
- cc: gcc-patches at gcc dot gnu dot org
On 28 Aug 2000, Greg McGary wrote:
> Linus Torvalds <torvalds@transmeta.com> writes:
>
> > PS. You might try out a small modification on your original case: just
> > emit a
> >
> > "j%c0 __internal_gcc_trap"
> >
> > inside the code directly, and add a small function to libgcc that just
> > does
> >
> > __internal_gcc_trap:
> > int $4
> >
> > and nothing more. It should be better for the branch predictor (forward
> > never-taken case), and probably nicer on prefetching. But I don't think it
> > matters all that much.
>
> Unfortunately, this will screw up debugging. If every bounds
> violation branches to `__internal_gcc_trap', then we won't be able to
> tell where the violation occurred.

Ok. The Linux kernel actually has some similar issues, and there we solve
it by using ELF sections.

Basically, the "taken branch forward" case is just very expensive due to
it getting mispredicted when the branch prediction cache is empty. For
something that basically _knows_ that it never gets triggered, you really
want to have the "not taken branch forward" case, preferably with the
taken case being a _lot_ forward just so that it never even pollutes the
icache.

The way the kernel does this is by using the equivalent of

		j%c0 1f
		.section .fault_section
	1:	trap
		.previous

which basically puts the fault in a section of its own. Now you'll have
"unique" faults for everything, so you can debug and/or recover from them
as you see fit.
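
As a rough illustration of the section trick above, here is a minimal sketch in C with GCC-style inline assembly. Everything named here is an assumption made for the example, not part of the patch or of the kernel: the helpers `check_index` and `get_checked`, the section name `.text.fault_section`, and the use of `ud2` as a stand-in for "trap". It uses `.pushsection`/`.popsection` with explicit "ax" flags rather than `.section`/`.previous`, and falls back to `__builtin_trap()` when not building for x86 ELF.

```c
#include <stdio.h>

/* Hypothetical helper: traps if idx >= limit.  On x86 ELF targets the
   trapping instruction is emitted into a separate section, so the hot
   path contains only a compare and a forward branch that is not taken. */
static inline void check_index(unsigned idx, unsigned limit)
{
#if defined(__GNUC__) && defined(__ELF__) && \
    (defined(__i386__) || defined(__x86_64__))
    __asm__ volatile(
        "cmpl %1, %0\n\t"          /* flags from idx - limit */
        "jae 1f\n\t"               /* unsigned idx >= limit: go fault */
        ".pushsection .text.fault_section,\"ax\",@progbits\n"
        "1:\tud2\n\t"              /* one trap instruction per check site */
        ".popsection"
        :
        : "r"(idx), "r"(limit)
        : "cc");
#else
    /* Assumed fallback for other targets: plain compiler-generated trap. */
    if (idx >= limit)
        __builtin_trap();
#endif
}

/* Hypothetical bounds-checked array read built on check_index. */
int get_checked(const int *a, unsigned n, unsigned i)
{
    check_index(i, n);
    return a[i];
}
```

Because every expansion of the check gets its own `ud2` in the fault section, the faulting address is distinct per check site, which is what makes the faults "unique".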
The debugger, of course, might want to have a nicer way to get the "true"
fault address than just having to match up the fault addresses with the
callers, so you might choose to go for a larger binary (with all the extra
space being in places that never get executed or hopefully even brought
into memory) with something on the order of

	1:	j%c0 2f
		.section .fault_section
	2:	pushl $1b	/* Save "real" address on the stack */
		trap
		.previous

but basically all of these require that you have an extra section and that
your linker scripts can handle this naturally. The kernel uses a special
link script to separate out all the sections the way it wants to. I don't
know if it would be acceptable to change the default link scripts for
something like this.
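
The idea of handing the "real" location to the fault path can also be sketched at the C level. This is only an analogue of the `pushl $1b` variant, under stated assumptions: instead of pushing a code address, each check site passes a pointer to a static record describing itself. The names `fault_site`, `report_fault`, `CHECK_INDEX`, `checked_get` and `last_fault` are hypothetical, the section attribute is applied only on ELF targets, and the handler here reports and returns so that the behaviour can be observed, where a real implementation would trap.

```c
#include <stdio.h>

/* Assumption: put the cold handler in its own section only on ELF. */
#ifdef __ELF__
#define FAULT_TEXT __attribute__((noinline, cold, section(".text.fault_section")))
#else
#define FAULT_TEXT __attribute__((noinline, cold))
#endif

/* Hypothetical per-site record, standing in for the pushed address. */
struct fault_site {
    const char *file;
    int line;
};

/* Last site reported, kept so a debugger or a test can inspect it. */
static const struct fault_site *last_fault;

/* Cold path: receives the identity of the check that failed. */
FAULT_TEXT static void report_fault(const struct fault_site *site)
{
    last_fault = site;
    fprintf(stderr, "bounds violation at %s:%d\n", site->file, site->line);
}

/* Hot path: one comparison and a branch hinted as not taken. */
#define CHECK_INDEX(idx, limit)                                         \
    do {                                                                \
        static const struct fault_site site_ = { __FILE__, __LINE__ };  \
        if (__builtin_expect((idx) >= (limit), 0))                      \
            report_fault(&site_);                                       \
    } while (0)

/* Hypothetical checked read; returns 0 after reporting a violation. */
int checked_get(const int *a, unsigned n, unsigned i)
{
    CHECK_INDEX(i, n);
    return i < n ? a[i] : 0;
}
```

The cost is the one described above: a larger binary, with the extra records and handler code sitting in places that are not touched unless a check fails.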
Linus