This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [Patch, libfortran] Improve performance of byte swapped IO


PING.

Slightly updated patch attached. It further improves the generic-size
fallback, which is used when the element size is not 2, 4, or 8 bytes.
After changing the us_perf benchmark to use real(10), the performance
with the v2 patch is:

 Unformatted sequential write/read performance test
 Record size           Write MB/s                 Read MB/s
 ==========================================================
           4   59.028550429522085        86.019754350948787
           8   79.028327063130590        95.803502000733374
          16   99.980457395413296        138.68367462874946
          32   122.56886206338788        180.05609910155042
          64   152.00478266944486        212.69931319407567
         128   197.74137934940202        235.19728791956828
         256   155.36245780017779        244.60578379215929
         512   157.13385845966246        245.07467397691480
        1024   177.26553799130201        260.44908357795623
        2048   208.22852888945587        260.21587143113527
        4096   222.88410474980634        262.66162209490591
        8192   226.71167580652920        265.81191407123663
       16384   206.51818241747065        263.59395165591724
       32768   230.18707026455866        265.88990325026526
       65536   229.19783089391504        268.04485112932684
      131072   231.12215662044449        267.40543904427710
      262144   230.72012123598142        267.60086931504122
      524288   230.48959460456055        268.78750211303725

With the new v3 patch I get:

 Unformatted sequential write/read performance test
 Record size           Write MB/s                 Read MB/s
 ==========================================================
           4   59.779061121239941        92.777125264010024
           8   92.727504266051341        126.64775563782673
          16   128.94793911163904        184.69194300482837
          32   169.78916283536847        267.06752001266767
          64   209.50296476919556        341.60515130910238
         128   236.36709738360679        416.73212655882151
         256   251.79029695383340        465.46804746749740
         512   259.62269939828633        500.87346060356265
        1024   265.08842337586458        508.95530627428275
        2048   268.71795530051884        532.12211365683640
        4096   280.86546884821030        546.88907054369884
        8192   286.96049684823578        569.60958187426183
       16384   292.04368984868103        608.11503416324865
       32768   292.96677387959392        629.80651297065833
       65536   291.69098580137114        624.27103478079641
      131072   292.75666234956418        605.99766136491496
      262144   291.35520038228975        611.59061455535834
      524288   292.15446100501691        623.76232623081580
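
To illustrate what a generic-size fallback of this kind can look like, here
is a minimal sketch. It is not the code from the attached patch: the name
swap_elements and its signature are hypothetical, and it assumes that dest
and src do not overlap. Widths of 4 and 8 bytes go through the GCC/Clang
byte-swap builtins; every other width falls back to a plain byte-reversal
loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not taken from the patch): copy nelems elements of
   `size` bytes each from src to dest, reversing the byte order of every
   element.  dest and src are assumed not to overlap.  */
static void
swap_elements (void *dest, const void *src, size_t size, size_t nelems)
{
  unsigned char *pd = dest;
  const unsigned char *ps = src;

  assert (size > 0);

  switch (size)
    {
    case 4:
      for (size_t i = 0; i < nelems; i++)
        {
          uint32_t v;
          memcpy (&v, ps + i * 4, sizeof (v));  /* alignment-safe load */
          v = __builtin_bswap32 (v);
          memcpy (pd + i * 4, &v, sizeof (v));
        }
      break;

    case 8:
      for (size_t i = 0; i < nelems; i++)
        {
          uint64_t v;
          memcpy (&v, ps + i * 8, sizeof (v));
          v = __builtin_bswap64 (v);
          memcpy (pd + i * 8, &v, sizeof (v));
        }
      break;

    default:
      /* Generic fallback: reverse each element one byte at a time.  */
      for (size_t i = 0; i < nelems; i++)
        for (size_t j = 0; j < size; j++)
          pd[i * size + j] = ps[i * size + (size - 1 - j)];
      break;
    }
}
```

A caller would pass the element width and count, for example
swap_elements (out, in, 10, n) for n ten-byte elements. A 2-byte case could
be added in the same style as the 4- and 8-byte cases.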


On Sat, Jan 5, 2013 at 11:13 PM, Janne Blomqvist
<blomqvist.janne@gmail.com> wrote:
> On Sat, Jan 5, 2013 at 5:35 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Fri, Jan 4, 2013 at 11:35 PM, Andreas Schwab <schwab@linux-m68k.org> wrote:
>>> Janne Blomqvist <blomqvist.janne@gmail.com> writes:
>>>
>>>> diff --git a/libgfortran/io/file_pos.c b/libgfortran/io/file_pos.c
>>>> index c8ecc3a..bf2250a 100644
>>>> --- a/libgfortran/io/file_pos.c
>>>> +++ b/libgfortran/io/file_pos.c
>>>> @@ -140,15 +140,21 @@ unformatted_backspace (st_parameter_filepos *fpp, gfc_unit *u)
>>>>       }
>>>>        else
>>>>       {
>>>> +       uint32_t u32;
>>>> +       uint64_t u64;
>>>>         switch (length)
>>>>           {
>>>>           case sizeof(GFC_INTEGER_4):
>>>> -           reverse_memcpy (&m4, p, sizeof (m4));
>>>> +           memcpy (&u32, p, sizeof (u32));
>>>> +           u32 = __builtin_bswap32 (u32);
>>>> +           m4 = *(GFC_INTEGER_4*)&u32;
>>>
>>> Isn't that an aliasing violation?
>>
>> It looks like one.  Why not simply do
>>
>>    m4 = (GFC_INTEGER_4) u32;
>>
>> ?  I suppose GFC_INTEGER_4 is always the same size as uint32_t but signed?
>
> Yes, GFC_INTEGER_4 is a typedef for int32_t. As for why I didn't do
> the above, C99 6.3.1.3(3) says that if the unsigned value is outside
> the range of the signed variable, the result is
> implementation-defined. Though I suppose the sensible
> "implementation-defined behavior" in this case on a two's complement
> target is to just do a bitwise copy.
>
> Anyway, to be really safe one could use memcpy instead; the compiler
> optimizes small fixed size memcpy's just fine. Updated patch attached.
>
>
> --
> Janne Blomqvist
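
For reference, the memcpy approach discussed in the quoted text can be
sketched as follows. This is an illustration only, not the patch itself:
the helper name read_swapped_marker32 is hypothetical, and int32_t stands
in for GFC_INTEGER_4 (which, per the discussion above, is a typedef for
int32_t). The second memcpy copies the bit pattern directly, so it avoids
both the pointer cast that raised the aliasing concern and the
implementation-defined unsigned-to-signed conversion.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert (sizeof (int32_t) == sizeof (uint32_t),
               "signed and unsigned 32-bit types must have the same size");

/* Hypothetical helper: read a 32-bit value stored in the opposite byte
   order at p and return it as a signed integer.  */
static int32_t
read_swapped_marker32 (const void *p)
{
  uint32_t u32;
  int32_t m4;

  memcpy (&u32, p, sizeof (u32));    /* p need not be aligned */
  u32 = __builtin_bswap32 (u32);     /* GCC/Clang builtin */
  memcpy (&m4, &u32, sizeof (m4));   /* bitwise copy instead of a cast */
  return m4;
}
```

Compilers generally turn small fixed-size memcpy calls like these into
plain loads and stores, so the extra copy should not cost anything in
optimized builds.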



-- 
Janne Blomqvist

Attachment: bswap3.diff
Description: Binary data

