[patch] tuning gcc for AMDFAM10 processor (patch 2)
Jagasia, Harsha
harsha.jagasia@amd.com
Tue Jan 30 16:54:00 GMT 2007
Hi Uros,
Thanks for your suggestions; here is the corrected ChangeLog. This should
also fix the issues Roger mentioned.
I used m_ATHLON_K8_AMDFAM10 for brevity, since those CPUs share mostly the
same tuning, with only minor divergences. I think it makes the surrounding
code easier to maintain and makes it easier to identify the tuning
parameters where those CPUs are similar.
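To illustrate the point about the combined mask, here is a minimal sketch of
how the per-CPU bit masks compose. The enum is a trimmed-down stand-in for
processor_type (the real one has more entries and a different order), and
example_tune_flag and tune_enabled are hypothetical names used only for this
illustration; only the m_ATHLON_K8_AMDFAM10 definition is taken from the patch.

```c
/* Simplified stand-in for processor_type in config/i386/i386.h.
   Assumption: the real enum has more entries and a different order. */
enum processor_type
{
  PROCESSOR_ATHLON,
  PROCESSOR_K8,
  PROCESSOR_AMDFAM10,
  PROCESSOR_GENERIC32,
  PROCESSOR_GENERIC64
};

/* One bit per processor. */
#define m_ATHLON (1 << PROCESSOR_ATHLON)
#define m_K8 (1 << PROCESSOR_K8)
#define m_AMDFAM10 (1 << PROCESSOR_AMDFAM10)
#define m_GENERIC32 (1 << PROCESSOR_GENERIC32)
#define m_GENERIC64 (1 << PROCESSOR_GENERIC64)
#define m_GENERIC (m_GENERIC32 | m_GENERIC64)

/* Combined mask as defined in the patch. */
#define m_ATHLON_K8_AMDFAM10 (m_K8 | m_ATHLON | m_AMDFAM10)

/* Hypothetical tuning flag: enabled for the three AMD CPUs and for
   generic tuning. A real flag would spell out its own set of CPUs. */
const int example_tune_flag = m_ATHLON_K8_AMDFAM10 | m_GENERIC;

/* Hypothetical helper: nonzero if the flag is enabled for CPU. */
int
tune_enabled (int flag_mask, enum processor_type cpu)
{
  return (flag_mask & (1 << cpu)) != 0;
}
```

With the combined mask, a flag shared by all three CPUs names them once;
a flag where amdfam10 diverges would list m_ATHLON, m_K8 and m_AMDFAM10
individually.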
-----
2007-01-30 Harsha Jagasia <harsha.jagasia@amd.com>
* config/i386/i386.h (TARGET_AMDFAM10): New macro.
(TARGET_CPU_CPP_BUILTINS): Add code for amdfam10.
Define TARGET_CPU_DEFAULT_amdfam10.
(TARGET_CPU_DEFAULT_NAMES): Add amdfam10.
(processor_type): Add PROCESSOR_AMDFAM10.
* config/i386/i386.md: Add amdfam10 as a new cpu
attribute to match processor_type in config/i386/i386.h.
Enable imul peepholes for TARGET_AMDFAM10.
* config.gcc: Add support for --with-cpu option for
amdfam10.
* config/i386/i386.c (amdfam10_cost): New variable.
(m_AMDFAM10): New macro.
(m_ATHLON_K8_AMDFAM10): New macro.
(x86_use_leave, x86_push_memory, x86_movx, x86_unroll_strlen,
x86_cmove, x86_3dnow_a, x86_deep_branch, x86_use_simode_fiop,
x86_promote_QImode, x86_integer_DFmode_moves,
x86_partial_reg_dependency, x86_memory_mismatch_stall,
x86_accumulate_outgoing_args, x86_arch_always_fancy_math_387,
x86_sse_partial_reg_dependency, x86_sse_typeless_stores,
x86_use_ffreep, x86_use_incdec, x86_four_jump_limit, x86_schedule,
x86_use_bt, x86_cmpxchg16b, x86_pad_returns): Enabled/disabled
for amdfam10.
(override_options): Add amdfam10_cost to processor_target_table.
Set up PROCESSOR_AMDFAM10 for amdfam10 entry in
processor_alias_table.
(ix86_issue_rate): Add PROCESSOR_AMDFAM10.
(ix86_adjust_cost): Add code for amdfam10.
>
>> This is the 2nd of 11 patches to tune gcc for AMD's AMDFAM10 processor
>> (based on mainline rev 121295). This patch defines mtune=amdfam10 and
>> enables/disables some existing tuning choices for amdfam10 such as
>> aligning loop tops to 32 bytes and using push/pops instead of moves for
>> prologue/epilogue.
>
>> #define m_GENERIC64 (1<<PROCESSOR_GENERIC64)
>> #define m_GENERIC (m_GENERIC32 | m_GENERIC64)
>> +#define m_ATHLON_K8_AMDFAM10 (m_K8 | m_ATHLON | m_AMDFAM10)
>
>This part isn't described in ChangeLog.
>
>> (x86_use_leave, x86_push_memory, int x86_movx, x86_unroll_strlen,
>> x86_cmove, x86_fisttp, x86_3dnow_a, x86_deep_branch,
>> x86_use_simode_fiop, x86_promote_QImode,
>
>x86_fisttp is gone.
>
>BTW: Is there really a reason to use combined defines, such as
>m_ATHLON_K8_AMDFAM10? These are used only in i386.c in the lines below
>and IMO don't bring us anything.
>
>Uros.
>
Thanks,
Harsha