[patch] tuning gcc for Intel Core2

Vladimir Makarov vmakarov@redhat.com
Thu Nov 16 18:38:00 GMT 2006


H. J. Lu wrote:

>On Thu, Nov 16, 2006 at 07:41:34AM -0800, Ian Lance Taylor wrote:
>
>>Vladimir Makarov <vmakarov@redhat.com> writes:
>>
>>>2006-11-13  Vladimir Makarov  <vmakarov@redhat.com>
>>>
>>>	* doc/invoke.texi (core2): Add item.
>>>
>>>	* config/i386/i386.h (TARGET_CORE2, TARGET_CPU_DEFAULT_core2): New
>>>	macros.
>>>	(TARGET_CPU_CPP_BUILTINS): Add code for core2.
>>>	(TARGET_CPU_DEFAULT_generic): Change value.
>>>	(TARGET_CPU_DEFAULT_NAMES): Add core2.
>>>	(processor_type): Add new constant PROCESSOR_CORE2.
>>>
>>>	* config/i386/i386.md (cpu): Add core2.
>>>
>>>	* config/i386/i386.c (core2_cost): New initialized variable.
>>>	(m_CORE2): New macro.
>>>	(x86_movx, x86_unroll_strlen, x86_cmove, x86_deep_branch,
>>>	x86_use_sahf, x86_partial_reg_stall, x86_partial_flag_reg_stall,
>>>	x86_use_simode_fiop, x86_single_stringop, x86_himode_math,
>>>	x86_promote_hi_regs, x86_sub_esp_4, x86_sub_esp_8, x86_add_esp_4,
>>>	x86_add_esp_8, x86_integer_DFmode_moves,
>>>	x86_partial_reg_dependency, x86_accumulate_outgoing_args,
>>>	x86_prologue_using_move, x86_epilogue_using_move,
>>>	x86_arch_always_fancy_math_387, x86_sse_partial_reg_dependency,
>>>	x86_sse_load0_by_pxor, x86_rep_movl_optimal,
>>>	x86_ext_80387_constants, x86_four_jump_limit, x86_schedule,
>>>	x86_pad_returns): Add m_CORE2.
>>>	(override_options): Add entries for Core2.
>>>	(ix86_issue_rate): Add case for Core2.
>>>
>>This is OK.
>>
>>I haven't seen any obvious conclusions in the following thread (other
>>than the fact that gcc/glibc block memory moves suck, which I already
>>know).  But, in case I missed something, I'll preapprove any minor
>>changes to the costs, or adding or removing m_CORE2 in the processor
>>feature bitmasks.
>>    
>>
Thanks, Ian.  I'll commit the patch with what H.J. recommends (core2
will then be the same as generic plus x86_rep_movl_optimal) and continue
working to find a better combination.  Maybe I'll achieve the same score
as generic + x86_rep_movl_optimal but with smaller code size.
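
For readers not familiar with the processor feature bitmasks discussed
here, the following is a minimal sketch of the idea, not the actual
i386.c code.  Each processor gets one bit, each tuning flag is an OR of
the processors it applies to, and a flag is active when it contains the
bit of the processor being tuned for.  The reduced processor list, the
flag values, and the helper names tune_mask and feature_enabled are
assumptions made up for illustration; only the m_CORE2,
x86_rep_movl_optimal, x86_cmove and PROCESSOR_CORE2 names come from the
ChangeLog above.

```c
#include <assert.h>

/* Hypothetical, reduced processor list; the real enum has more entries
   and a different order. */
enum processor_type {
  PROCESSOR_I386,
  PROCESSOR_PENTIUM,
  PROCESSOR_K8,
  PROCESSOR_CORE2,
  PROCESSOR_GENERIC64,
  PROCESSOR_max
};

/* One bit per processor. */
#define m_386       (1u << PROCESSOR_I386)
#define m_PENT      (1u << PROCESSOR_PENTIUM)
#define m_K8        (1u << PROCESSOR_K8)
#define m_CORE2     (1u << PROCESSOR_CORE2)
#define m_GENERIC64 (1u << PROCESSOR_GENERIC64)

/* Illustrative values only: a tuning flag lists the processors for
   which the optimization is considered profitable.  Adding or removing
   m_CORE2 here is the kind of minor change being preapproved. */
static const unsigned x86_rep_movl_optimal = m_386 | m_PENT | m_CORE2;
static const unsigned x86_cmove = m_K8 | m_CORE2 | m_GENERIC64;

/* Bit that corresponds to the processor selected with -mtune. */
static unsigned tune_mask(enum processor_type tune)
{
  assert((int) tune >= 0 && tune < PROCESSOR_max);
  return 1u << tune;
}

/* Nonzero if the tuning flag applies to the selected processor. */
static int feature_enabled(unsigned feature, enum processor_type tune)
{
  return (feature & tune_mask(tune)) != 0;
}
```

With these made-up values, feature_enabled(x86_rep_movl_optimal,
PROCESSOR_CORE2) is nonzero while the same query for
PROCESSOR_GENERIC64 is zero, which is the kind of difference meant by
"generic + x86_rep_movl_optimal".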

>I think we should distinguish 32bit and 64bit since they may need
>different optimization. Also I think 64bit -mtune=core2 should be the
>same as -mtune=generic + x86_rep_movl_optimal since -mtune=generic +
>x86_rep_movl_optimal generates better SPEC CPU 2K numbers than this
>-mtune=core2. It doesn't make sense to have -mtune=core2 generate
>slower code than -mtune=generic + x86_rep_movl_optimal.