This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: target_mem_ref and memory/prefetch intrinsics


On Fri, Oct 15, 2010 at 5:36 AM, Zdenek Dvorak <rakdver@kam.mff.cuni.cz> wrote:
> Hi,
>
>> >> Yes, I'll burn you if you start to allow non-invariant addresses in
>> >> function arguments ;))
>> >
>> > could you please give some more justification for that?  We already allow
>> > memory references as function arguments, so it is not as if it made anything
>> > more complicated,
>>
>> Indeed, and I'd like to get rid of memory references as function arguments
>> (we only allow them for things that cannot be assigned to registers anyway).
>>
>> If you'd allow &a[i] then why wouldn't you allow &a + 4*i for example.
>>
>> The issue is that when we allow this we'll not be able to CSE those
>> addresses, we won't be able to propagate into them (consider
>> p = &a; foo (&p->a[i]);) with the current generic code but we'd have
>> to duplicate it for call arguments.  Etc.
>
> the point of TARGET_MEM_REFs is precisely to avoid such optimizations (which
> usually make things worse by destroying the possibility to use addressing modes).
> So, I see that as an advantage :-)
>
> The proposal would be to allow only addresses of TARGET_MEM_REFs to appear
> as arguments (and even that, only on the few selected builtins that can take the
> advantage of addressing modes).
>
> Anyway, I do not have strong preference for this; for now we can go with taking
> the address of TARGET_MEM_REFs separately, and see if that is sufficient,

Except that it can be tricky to tune the RTL forward prop.

May I suggest the following: extend TARGET_MEM_REF to handle this?
Basically, TARGET_MEM_REF could represent memory intrinsics by carrying
a special operation code, with the intrinsic's remaining parameters
passed as additional operands.

For instance,  b = __builtin_ia32_loadhps ( (__m128)a, (__m64* const)p);

can be converted into:
b = tmr (base, iv, stride, offset, OP_loadhps, a)

And

__builtin_ia32_storehps (p, a)

==>
tmr (base, iv, stride, offset, OP_storehps) = a
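
To make the intended semantics concrete, here is a rough C model of the
two forms above.  It is only a sketch: struct tmr, enum tmr_op and the
helper functions are made-up names for illustration, not existing GCC
data structures, and I am assuming the address is computed as
base + iv * stride + offset.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical operation codes (illustration only, not GCC names). */
enum tmr_op { OP_PLAIN, OP_LOADHPS, OP_STOREHPS };

/* Hypothetical model of the operands of an extended TARGET_MEM_REF. */
struct tmr {
  char *base;      /* base address */
  size_t iv;       /* induction variable (index) */
  size_t stride;   /* step, in bytes */
  size_t offset;   /* constant byte offset */
  enum tmr_op op;  /* which memory intrinsic this reference stands for */
};

/* Assumed address computation: base + iv * stride + offset. */
static char *tmr_addr(const struct tmr *t) {
  return t->base + t->iv * t->stride + t->offset;
}

/* b = tmr (base, iv, stride, offset, OP_loadhps, a):
   the low two floats come from a, the high two are loaded from memory. */
static void tmr_loadhps(const struct tmr *t, const float a[4], float b[4]) {
  b[0] = a[0];
  b[1] = a[1];
  memcpy(&b[2], tmr_addr(t), 2 * sizeof(float));
}

/* tmr (base, iv, stride, offset, OP_storehps) = a:
   only the high two floats of a are written to memory. */
static void tmr_storehps(const struct tmr *t, const float a[4]) {
  memcpy(tmr_addr(t), &a[2], 2 * sizeof(float));
}
```

Note that in tmr_storehps the value on the RHS is four floats wide while
only two floats are written, which is the type/mode mismatch in the
store form.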

Issues: 1) in the store form, the type/mode of the RHS may not match
that of the LHS; 2) handling of prefetches.  Are there other obvious
reasons this is bad?

Thanks,

David



>
> Zdenek
>

