Title: Multiple Nonzero-Rank Part References
Submitted By: Aleksandar Donev
Status: For Consideration
References: J3/03-253
Basic Functionality:
I propose to delete the constraint that prohibits multiple nonzero-rank part-refs:
"In a data-ref, there shall be no more than one part-ref with nonzero rank."
There is no justification for this constraint, and removing it would enable a very
useful capability for which Fortran is uniquely suited, given its ability to deal with
non-contiguous arrays.
Rationale:
The proposed functionality gives two gains:
1) It allows for a kind of separation between the implementation of operations on data
and the way the data is actually stored which is unprecedented in other languages.
This kind of separation is much more flexible and easier to use than inheritance-based
methods (but is more limited in that only data, not methods, are covered). An example
includes the ability to code a computational geometry package which operates on a
collection of points, without specifically indicating how the coordinates of the
points are stored - in a simple multidimensional array, or inside some complicated
hierarchy of derived types.
2) It allows the use of all the powerful array syntax and intrinsics for data stored
inside derived types.
Take the simple example:
   TYPE Point3D
      ! A point in 3D
      REAL :: coordinates(3), data(2)
   END TYPE Point3D
   TYPE(Point3D), DIMENSION(10) :: points
      ! A collection of points
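The centroid of the points could then be found with
   WRITE(*,*) "The centroid is", SUM(points%coordinates, DIM=2) / SIZE(points)
which requires no loops. Here points%coordinates would be an array section of shape
(/3,10/), so the sum along dimension 2 runs over the points.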
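For comparison, here is a minimal sketch of how the same centroid can be computed under the current rules, where points%coordinates is not allowed. The temporary array coords and the fill values are assumptions made for illustration only; they are not part of the proposal.

```fortran
program centroid_today
  implicit none

  type :: point3d
    real :: coordinates(3), data(2)
  end type point3d

  type(point3d) :: points(10)
  real :: coords(3, 10)   ! hypothetical temporary, not part of the proposal
  integer :: i

  ! Fill the points with illustrative values.
  do i = 1, size(points)
    points(i)%coordinates = [real(i), 2.0 * real(i), 3.0 * real(i)]
    points(i)%data = 0.0
  end do

  ! Today only one part-ref may have nonzero rank, so the coordinates are
  ! gathered point by point into a rank-2 array of shape (/3,10/).
  do i = 1, size(points)
    coords(:, i) = points(i)%coordinates
  end do

  ! With the proposal, coords could be replaced by points%coordinates.
  write(*,*) "The centroid is", sum(coords, dim=2) / size(points)
end program centroid_today
```

The explicit gather loop and the extra copy are what the proposed data-ref would make unnecessary.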
Even more useful would be the ability to pass the coordinates of the points to a
procedure (note that this procedure need not know that the coordinates came from an
array of derived type Point3D).
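A sketch of such a procedure follows. The module name geometry_sketch and the routine bounding_box are hypothetical, chosen only to illustrate the idea. The routine sees a plain rank-2 array and knows nothing about Point3D. Under the current rules the caller has to gather the coordinates into a temporary first.

```fortran
module geometry_sketch
  implicit none
contains
  ! Hypothetical routine: axis-aligned bounding box of a set of points.
  ! coords has shape (number of dimensions, number of points).
  subroutine bounding_box(coords, lower, upper)
    real, intent(in)  :: coords(:, :)
    real, intent(out) :: lower(size(coords, 1)), upper(size(coords, 1))
    lower = minval(coords, dim=2)
    upper = maxval(coords, dim=2)
  end subroutine bounding_box
end module geometry_sketch

program use_bounding_box
  use geometry_sketch, only: bounding_box
  implicit none

  type :: point3d
    real :: coordinates(3), data(2)
  end type point3d

  type(point3d) :: points(10)
  real :: coords(3, 10), lower(3), upper(3)
  integer :: i

  do i = 1, size(points)
    points(i)%coordinates = [real(i), -real(i), 0.5 * real(i)]
    points(i)%data = 0.0
    coords(:, i) = points(i)%coordinates   ! gather required today
  end do

  ! With the proposal the actual argument could be points%coordinates.
  call bounding_box(coords, lower, upper)
  write(*,*) "lower corner:", lower
  write(*,*) "upper corner:", upper
end program use_bounding_box
```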
Estimated Impact:
The edits needed to implement this are small and localized to Section 6.1.2 (examples
are given under Detailed Specification). References with multiple nonzero-rank
part-refs are treated in all respects like data-refs with just a single nonzero-rank
part-ref; namely, they are array sections.
Therefore I estimate that no other part of the standard will need to be changed.
The implementation of this feature does require some nontrivial work.
However, the steps involved are very similar to the way current data-refs and array
pointers/sections are handled.
For the three compilers I use, I have implemented extensions that make such structure
components usable, in only a hundred lines of Fortran + C code.
I essentially use low-level C code which manipulates the compiler's array descriptors
to create a higher-rank array pointer to the data-refs I need, and then I use that
array pointer whenever I need to access the data as a multi-rank array (see my Fortran
Forum article).
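The descriptor-manipulating code itself is not reproduced here. As a rough illustration of the idea of viewing component data through a higher-rank pointer, the following sketch uses C interoperability instead of compiler descriptors. It is a swapped-in technique, not the implementation described above. It assumes that a BIND(C) type holding five default C floats has no padding, so that an array of ten such points can be viewed as a 5 x 10 array of reals. That layout assumption is not guaranteed by the standard for this kind of type punning.

```fortran
program strided_view
  use, intrinsic :: iso_c_binding, only: c_float, c_loc, c_f_pointer
  implicit none

  type, bind(c) :: point3d
    real(c_float) :: coordinates(3)
    real(c_float) :: data(2)
  end type point3d

  type(point3d), target  :: points(10)
  real(c_float), pointer :: flat(:, :), coords(:, :)
  integer :: i

  do i = 1, size(points)
    points(i)%coordinates = real(i, c_float)
    points(i)%data = 0.0_c_float
  end do

  ! ASSUMPTION: each point occupies exactly 5 contiguous reals, so the
  ! whole array can be viewed as flat(5, 10).
  call c_f_pointer(c_loc(points), flat, [5, size(points)])

  ! Rows 1..3 of each column are the coordinates; this is a
  ! non-contiguous rank-2 view of shape (/3,10/) with no copy.
  coords => flat(1:3, :)

  write(*,*) "The centroid is", sum(coords, dim=2) / size(points)
end program strided_view
```

The pointer coords plays the role that points%coordinates would play directly if the constraint were removed.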
Detailed Specification:
The main edits needed are the following:
Delete "In a data-ref, there shall be no more than one part-ref with nonzero rank".
Then add the following text and constraint
The rank of a data-ref is the sum of the ranks of the part-refs with nonzero rank,
if any; otherwise, the rank is zero.
...
Cxxx: The maximum rank of a data-ref shall be 7.
and change the way the rank and shape of data-refs are determined:
The rank and shape of a nonzero rank part-ref are determined as follows.
If the part-ref has no section-subscript-list, the rank and shape are those of
part-name. Otherwise, the rank is the number of subscript triplets and vector
subscripts in section-subscript-list, and the shape is the rank-1 array whose i-th
element is the number of integer values in the sequence indicated by the i-th subscript
triplet or vector subscript. If any of these sequences is empty, the corresponding
element in the shape is zero.
In an array-section, the rank of the array is the sum of the ranks of the nonzero rank
part-refs. The shape of the array is the rank-1 array obtained by concatenating the
shapes of the nonzero rank part-refs, in backward order, i.e., starting from the last
one. If the shape has an element with the value of zero, the array section has size zero.
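The rule can be checked by hand on the earlier points%coordinates example. The sketch below does only the shape arithmetic; the variable names are hypothetical, and the second case with an empty subscript triplet is an added illustration of the zero-size rule.

```fortran
program data_ref_rank_shape
  implicit none

  ! Shapes of the nonzero-rank part-refs of points%coordinates, in the
  ! order in which they are written.
  integer, parameter :: shape_points(1)      = [10]
  integer, parameter :: shape_coordinates(1) = [3]
  ! Shape of the part-ref points(1:0), an empty subscript triplet.
  integer, parameter :: shape_empty(1)       = [0]
  integer, allocatable :: ref_shape(:)

  ! Concatenate in backward order, starting from the last part-ref.
  ref_shape = [shape_coordinates, shape_points]
  print *, "points%coordinates:      rank", size(ref_shape), &
           " shape", ref_shape, " size", product(ref_shape)

  ! Any zero element in the shape gives a zero-size array section.
  ref_shape = [shape_coordinates, shape_empty]
  print *, "points(1:0)%coordinates: rank", size(ref_shape), &
           " shape", ref_shape, " size", product(ref_shape)

  ! The proposed constraint Cxxx limits the rank to 7.
  if (size(ref_shape) > 7) print *, "rank exceeds the proposed limit"
end program data_ref_rank_shape
```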
There are some other edits that will be needed, mostly in Section 6.1.2.
The Shape of the data-ref
A problem in the proposal as described above is that the Fortran order of specifying
components, structure%component (as opposed to the alternative component%structure),
is the opposite of the order in which the shapes of the nonzero-rank part-refs are
concatenated.
For example, the reference:
level1(1:4,1:5,1:6)%level2(1:2,1:3)%level3(1:1)
represents an array section of shape (/1,2,3,4,5,6/), and not (/4,5,6,2,3,1/) as might
be thought at first.
However, this is the best choice for the compiler, the standard, and the user alike,
despite the extra cost of having to be careful with indices in certain situations. I
believe the wrong choice was made when component references were chosen to follow the
C-style ordering of object%component instead of component%object. This cannot be changed
now without introducing a whole new syntax and the associated cost for users and
implementors. Instead, we should choose the proposed shape for the data-ref that I
describe here and accept the loss of simplicity in the syntax as unavoidable due to
past mistakes.
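The index correspondence implied by the level1%level2%level3 example can be sketched in currently valid Fortran by gathering the data into an ordinary rank-6 array. The type names level1_t and level2_t, the array gathered, and the fill values are assumptions made for illustration. Each assignment uses only one nonzero-rank part-ref, so it conforms to the present constraint.

```fortran
program nested_part_ref_shape
  implicit none

  type :: level2_t
    real :: level3(1)
  end type level2_t

  type :: level1_t
    type(level2_t) :: level2(2, 3)
  end type level1_t

  type(level1_t) :: level1(4, 5, 6)
  real :: gathered(1, 2, 3, 4, 5, 6)   ! shape proposed for the data-ref
  integer :: i, j, k, l, m

  do k = 1, 6
    do j = 1, 5
      do i = 1, 4
        do m = 1, 3
          do l = 1, 2
            ! Give every element a distinct, recognizable value.
            level1(i, j, k)%level2(l, m)%level3(1) = &
              real(i + 10*j + 100*k + 1000*l + 10000*m)
            ! Last part-ref's subscripts come first, level1's come last.
            gathered(:, l, m, i, j, k) = level1(i, j, k)%level2(l, m)%level3(:)
          end do
        end do
      end do
    end do
  end do

  print *, "shape:", shape(gathered)
  print *, "elements match:", &
    gathered(1, 2, 3, 4, 5, 6) == level1(4, 5, 6)%level2(2, 3)%level3(1)
end program nested_part_ref_shape
```

The subscripts of the innermost component vary fastest, which matches Fortran's column-major storage of the enclosing structure array.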
History:
Many debates during the design of F8x...
Comments:
John Reid, JKR Associates, Oxford:
I would like to suggest that we allow arrays of arrays, such as
a(:,:)%comp(:,:)
They are not allowed because when such an array is passed to a dummy argument dum,
a(i,j)%comp(k,l)
corresponds to
dum(k,l,i,j)
and the more array parts there are, the more confusing it is seen to be.
However, I think we could get used to the rule and it is not too hard to state.
Personally, I would prefer a shorter and simpler proposal and am prepared to work on it
if we decide in favour.
I discussed this with Lawrie Schonfelder some time ago by e-mails and he wants it.
Malcolm Cohen, Nihon NAG, Tokyo:
This seems semi-reasonable, HOWEVER
(i) we still need to maintain that no pointer or allocatable component
can occur after a nonzero rank part.
(ii) it is seriously limited without expanding our current 7-dim limit.
Expanding the 7-dim limit costs (it makes runtime library routines
that traverse an array bigger - they all need rewriting).
Overall, I'd definitely put this feature as being lower priority than
expanding our current dimension limit.
SUMMARY: (1) More important to have more dimensions; pick a number (15
is the smallest number J3 came up with, and is the largest
one I'd want to see).
(2) I don't see this particular proposal as being terribly
important.