[Patch] Fortran: Avoid SAVE_EXPR for deferred-len char types

Tobias Burnus tobias@codesourcery.com
Fri Feb 17 11:13:52 GMT 2023


Short version:

This fixes potential and real bugs related to 'len=:' character variables
as for the length/byte size an old/saved expression is used instead of
the current value. - That's fine but not for allocatable/pointer with 'len=:'.


Main part of the patch: Strip the SAVE_EXPR from the size expression:

   if (len && deferred && TREE_CODE (TYPE_SIZE (type)) == SAVE_EXPR)
     {
       gcc_assert (TREE_CODE (TYPE_SIZE_UNIT (type)) == SAVE_EXPR);
       TYPE_SIZE (type) = TREE_OPERAND (TYPE_SIZE (type), 0);
       TYPE_SIZE_UNIT (type) = TREE_OPERAND (TYPE_SIZE_UNIT (type), 0);
     }


OK for mainline?

* * *

Long version:

BACKGROUND:


(A) VLA / EXPLICIT-SIZE ARRAYS + LEN=<expr|var> STRINGS


C knows something like VLA (variable length arrays), likewise Fortran
knows explicit size array and character length where the length/size
depends on an variable set before the current scoping unit. Examples:

void f(int N)
{
   int vla[N*5];
}

subroutine foo(n)
   integer :: n
   integer :: array(n*5)
   integer :: my_len
   ...
   my_len = 5
   block
     character(len=my_len, kind=4) :: str

     my_len = 99
     print *, len(str)  ! still shows 5 - not 99
   end block
end


In all cases, the size / length is not known at compile time but it won't change.
Thus, expressions like (pseudo code)
    byte_size = n * 5 * sizeof(integer)
can be saved and re-used and do not have to be re-calculated every time the
    TYPE_SIZE or TYPE_UNIT_SIZE
is used.

In particular, the 'my_len' example shows that just using the current value of
'my_len' would be wrong as it can be overridden.


* * *


(B) DEFERRED-LENGTH STRINGS ('character(len=:), pointer/allocatable')


But with deferred-length strings, such as

   character(len=:), pointer :: pstr(:)
   ...
   allocate(character(len=2) :: pstr(5))
   ...
   !$omp target enter data map(alloc: pstr(2:5))


this leads to code like:

   integer(kind=8) .pstr;
   struct array01_character(kind=1) pstr;

   D.4302 = (sizetype) NON_LVALUE_EXPR <.pstr>;
   pstr.dtype = {.elem_len=(unsigned long) .pstr, .rank=1, .type=6};
...
     .pstr = 2;  // during allocation/pointer assignment
...
       parm.1.data = pstr.data + (sizetype) ((~pstr.dim[0].lbound * D.4287)
                                 * (integer(kind=8)) SAVE_EXPR <D.4302>);

And here D.4302 is the pre-calculated value instead of the current value,
which can be either 0 or some random value.


Such code happens when using code like:
     elemsz = TYPE_SIZE_UNIT (gfc_get_element_type (type));

Of course, there are various ways to avoid this – like obtaining somehow the
string length directly - either from the expression or from the type such as
   TYPE_DOMAIN (type)
but it can easily go wrong.

  * * *

IDEAL SOLUTION:

I think from the middle-end perspective, we should do:
   build_range_type (type, 0, NULL_TREE)
leaving the upper bound unspecified – which should also help with
type-is-the-same middle-end analysis.


PRACTICAL SOLUTION:

But as code like TYPE_SIZE_UNIT is very widely used - and we currently lack
a place to store the tree decl for the length, I propose the following as discussed
with Jakub yesterday:

We just remove SAVE_EXPR after generating the type.

Side note: In some cases, the type is already constructed with len = NULL; I have not
checked when. In that case, using TYPE_SIZE will fail at compile time.
(That remains unchanged by this patch.)

  * * *

OK for mainline?

Tobias

  * * *

PS: I have no real opinion whether we want to have any backports, thoughts?


PPS: I don't have any real example I want to add as most cases have been work-around
fixed in the meanwhile. If you want to test it, the following fails. I intent to add
an extended tests as part of a larger follow-up patch which fixes more OpenMP issues:

   character(len=:), pointer :: pstr(:)
   allocate(character(len=2) :: pstr(5))
   !$omp target enter data map(alloc: pstr(2:5))
end


Compile with -fopenmp -fdump-tree-original (or a later dump).


BEFORE the patch:

   integer(kind=8) .pstr;
   ...
   D.4291 = (sizetype) NON_LVALUE_EXPR <.pstr>;
   pstr.dtype = {.elem_len=(unsigned long) .pstr, .rank=1, .type=6};
   ...
     .pstr = 2;
   ...
         pstr.data = __builtin_malloc (10);
   ...
       parm.1.data = pstr.data + (sizetype) (((2 - pstr.dim[0].lbound) * D.4287)
                                 * (integer(kind=8)) SAVE_EXPR <D.4291>);

AFTER the patch:
   .....
       parm.1.data = pstr.data + (sizetype) (((2 - pstr.dim[0].lbound) * D.4287)
                                 * NON_LVALUE_EXPR <.pstr>);
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fix-deferred-len.diff
Type: text/x-patch
Size: 13405 bytes
Desc: not available
URL: <https://gcc.gnu.org/pipermail/gcc-patches/attachments/20230217/d45c5e02/attachment.bin>


More information about the Gcc-patches mailing list