Memset/memcpy patch

Michael Zolotukhin michael.v.zolotukhin@gmail.com
Mon Nov 21 17:16:00 GMT 2011


Hi,

While continuing to investigate the bootstrap failures, I found another
problem (besides the unknown-alignment problem described above):
size_needed and epilogue_size_needed get mixed up when we generate an
epilogue loop that also uses SSE moves but is not unrolled. That is
probably the cause of the failures we saw.
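
To make the bookkeeping concrete, here is a rough sketch in plain C, not the RTL expander code from the patch. It assumes a main loop that moves a fixed chunk per iteration, an epilogue loop that does one 16-byte move per iteration, and a byte tail. The constants, the function name staged_copy, and the use of memcpy as a stand-in for SSE moves are all hypothetical; the point is only that each stage has its own bound on the bytes it may leave behind, which is the kind of quantity size_needed and epilogue_size_needed have to track separately.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sizes, for illustration only (not taken from the patch). */
enum { SSE_MOVE = 16, UNROLL = 4, MAIN_CHUNK = SSE_MOVE * UNROLL };

/* Copy n bytes in three stages and report how many bytes each stage handled.
   memcpy stands in for the SSE moves a real expander would emit. */
static void staged_copy(unsigned char *dst, const unsigned char *src, size_t n,
                        size_t *main_bytes, size_t *epilogue_bytes,
                        size_t *tail_bytes)
{
    size_t i = 0;

    assert(dst != NULL && src != NULL);

    /* Main loop: unrolled, MAIN_CHUNK bytes per iteration.
       Leaves fewer than MAIN_CHUNK bytes. */
    for (; n - i >= (size_t)MAIN_CHUNK; i += MAIN_CHUNK)
        memcpy(dst + i, src + i, MAIN_CHUNK);
    *main_bytes = i;

    /* Epilogue loop: one 16-byte move per iteration, not unrolled.
       Leaves fewer than SSE_MOVE bytes. */
    size_t epilogue_start = i;
    for (; n - i >= (size_t)SSE_MOVE; i += SSE_MOVE)
        memcpy(dst + i, src + i, SSE_MOVE);
    *epilogue_bytes = i - epilogue_start;

    /* Tail: whatever is left after the epilogue loop. */
    *tail_bytes = n - i;
    memcpy(dst + i, src + i, n - i);
}
```

With these assumed sizes, a 100-byte copy is split into 64 bytes in the main loop, 32 in the epilogue loop, and 4 in the tail. Sizing the tail code from the main-loop chunk instead of the epilogue chunk (or the other way round) is the sort of mix-up described above.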

Please check the attached patch. Full testing isn't finished yet, but
bootstraps seem to be OK, as does the arrayarg.f90 test (with
sse_loop enabled).

On 19 November 2011 05:38, Jan Hubicka <hubicka@ucw.cz> wrote:
>> Given that x86 memset/memcpy is still broken, I think we should revert
>> it for now.
>
> Well, looking into the code, the SSE alignment issues need work - the
> alignment test merely tests whether some alignment is known, not whether 16-byte
> alignment is known, which is the cause of the failures in the 32-bit bootstrap.  I originally
> convinced myself that this is safe since we shoot for unaligned loads/stores anyway.
>
>
> I've committed the following patch, which disables SSE codegen and unbreaks the Atom
> bootstrap.  This seems more sensible to me given that the patch accumulated some
> good improvements on the non-SSE path as well, and we can return to the SSE
> alignment issues incrementally.  There is still a failure in the Fortran testcase
> that I am convinced is a previously latent issue.
>
> I will be offline tomorrow.  If there are further serious problems, just feel
> free to revert the changes and we can look into them for the next stage1.
>
> Honza
>
>        * i386.c (atom_cost): Disable SSE loop until alignment issues are fixed.
> Index: i386.c
> ===================================================================
> --- i386.c      (revision 181479)
> +++ i386.c      (working copy)
> @@ -1783,18 +1783,18 @@ struct processor_costs atom_cost = {
>   /* stringop_algs for memcpy.
>      SSE loops works best on Atom, but fall back into non-SSE unrolled loop variant
>      if that fails.  */
> -  {{{libcall, {{4096, sse_loop}, {4096, unrolled_loop}, {-1, libcall}}}, /* Known alignment.  */
> -    {libcall, {{4096, sse_loop}, {4096, unrolled_loop}, {-1, libcall}}}},
> -   {{libcall, {{2048, sse_loop}, {2048, unrolled_loop}, {-1, libcall}}}, /* Unknown alignment.  */
> -    {libcall, {{2048, sse_loop}, {2048, unrolled_loop},
> +  {{{libcall, {{4096, unrolled_loop}, {-1, libcall}}}, /* Known alignment.  */
> +    {libcall, {{4096, unrolled_loop}, {-1, libcall}}}},
> +   {{libcall, {{2048, unrolled_loop}, {-1, libcall}}}, /* Unknown alignment.  */
> +    {libcall, {{2048, unrolled_loop},
>               {-1, libcall}}}}},
>
>   /* stringop_algs for memset.  */
> -  {{{libcall, {{4096, sse_loop}, {4096, unrolled_loop}, {-1, libcall}}}, /* Known alignment.  */
> -    {libcall, {{4096, sse_loop}, {4096, unrolled_loop}, {-1, libcall}}}},
> -   {{libcall, {{1024, sse_loop}, {1024, unrolled_loop},         /* Unknown alignment.  */
> +  {{{libcall, {{4096, unrolled_loop}, {-1, libcall}}}, /* Known alignment.  */
> +    {libcall, {{4096, unrolled_loop}, {-1, libcall}}}},
> +   {{libcall, {{1024, unrolled_loop},   /* Unknown alignment.  */
>               {-1, libcall}}},
> -    {libcall, {{2048, sse_loop}, {2048, unrolled_loop},
> +    {libcall, {{2048, unrolled_loop},
>               {-1, libcall}}}}},
>   1,                                   /* scalar_stmt_cost.  */
>   1,                                   /* scalar load_cost.  */
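
For readers unfamiliar with the tables in the diff above: each entry pairs an upper size bound with an algorithm, and -1 means "no upper bound". The following is a simplified model of how such a table could be scanned, not GCC's actual selection code. The struct layout, MAX_ENTRIES, pick_alg, and the sse_usable flag are assumptions made for illustration; only the algorithm names and the table contents come from the diff.

```c
#include <stddef.h>

/* Algorithm names as they appear in the diff. */
enum alg { libcall, unrolled_loop, sse_loop };

/* Simplified table entry: use `alg` for sizes up to `max`;
   max == -1 means "no upper bound". */
struct strategy { int max; enum alg alg; };

#define MAX_ENTRIES 4   /* hypothetical capacity */

struct algs {
    enum alg unknown_size;              /* used when the size is not known */
    struct strategy size[MAX_ENTRIES];
};

/* Memcpy table for known alignment, before and after the quoted patch. */
static const struct algs memcpy_before =
    { libcall, {{4096, sse_loop}, {4096, unrolled_loop}, {-1, libcall}} };
static const struct algs memcpy_after =
    { libcall, {{4096, unrolled_loop}, {-1, libcall}} };

/* Return the first usable entry whose bound covers `size`.
   A negative size means "unknown"; sse_usable models the fallback
   mentioned in the table comment (skip sse_loop when it cannot be used). */
static enum alg pick_alg(const struct algs *t, long size, int sse_usable)
{
    if (size < 0)
        return t->unknown_size;
    for (size_t i = 0; i < MAX_ENTRIES; i++) {
        const struct strategy *s = &t->size[i];
        if (s->alg == sse_loop && !sse_usable)
            continue;
        if (s->max == -1 || size <= s->max)
            return s->alg;
    }
    return libcall;
}
```

Under this model, a 1000-byte copy picks sse_loop from the old table when SSE is usable and unrolled_loop otherwise, while the new table always yields unrolled_loop for that size and libcall above 4096 bytes.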



-- 
---
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: memfunc_epilogue_loops.patch
Type: application/octet-stream
Size: 12842 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20111121/478e818b/attachment.obj>

