ObjexxFCL 4.2

Developers Guide

This guide contains some supplementary information of interest to project developers who wish to understand the design and inner workings of the ObjexxFCL.

The ObjexxFCL provides a fairly complete emulation layer for features up through those of Fortran 2008, plus a number of capabilities beyond those of Fortran. The Array class template hierarchy in particular is complex and subtle, so care should be exercised when modifying and extending the ObjexxFCL code.

The Users guide is a prerequisite for this guide.


ObjexxFCL Organization

The ObjexxFCL is organized into the following source modules:

Module                Description
ObjexxFCL             ObjexxFCL declarations
ObjexxFCL.Project     ObjexxFCL Project-specific declarations
Array                 Array abstract base class template
Array.all             All-dimension Array master wrapper
ArrayN                ND Array abstract base class template
ArrayND               ND Real Array class template
ArrayNA               ND Argument (contiguous proxy) Array class template
ArrayN.all            ND Array master wrapper
ArrayNS               ND Slice Array class template
MArrayN               ND Member array class template
ArrayInitializer      Array initializer class template
ArrayTail             Array contiguous tail proxy class template
IndexRange            Index range class
InitializerSentinel   Array initializer sentinel class
ProxySentinel         Proxy array sentinel class
Vector2               Fast 2-element vector
Vector3               Fast 3-element vector
Vector4               Fast 4-element vector
CArray                C-style array wrapper
CArrayA               C-style array wrapper with alignment support
CArrayP               C-style array wrapper/proxy
ChunkVector           Chunk-contiguous 1D vector
Chunk                 ChunkVector 1D Chunk vector class template
ChunkExponent         ChunkVector exponent wrapper class
Omit                  Omitted argument sentinel class
Sticky                Sticky (persistent) initializer wrapper class
Cstring               C-style string (char*) memory managed wrapper class
string.constants      Useful std::string constants
string.functions      Useful std::string functions
char.constants        Useful char constants
char.functions        Useful char functions
Fmath                 Math intrinsics/other functions
numeric               Numeric intrinsics: kind-related, digits, huge, ...
bit                   Bit functions
byte                  Single-byte integer
ubyte                 Single-byte unsigned integer
gio                   Global i/o system
Stream                Stream wrapper and container
Read                  Formatted read support
Write                 Formatted write support
Print                 Formatted output to console support
Inquire               File/stream query support
Backspace             Back up by one record
Rewind                Move to beginning of stream/file
IOFlags               I/o control and status flag collection class
Format                Format class hierarchy for formatted input/output
FormattedIO           Non-global formatted i/o meta header
fmt                   Low-level formatted stream input/output support
fmt.manipulators      Stream manipulators used by fmt
time                  Time and date functions
random                Random number functions
command               Command line functions
environment           Environment variable functions
Optional              Optional argument wrapper class template
Required              Required argument wrapper class template
Reference             Reference (POINTER) wrapper class template
array.iterator        C array begin and end iterator functions
floops                Fortran DO loop logic support
rvalue_cast           rvalue cast to reference function template
Traits*               Format descriptor type traits class templates
TypeTraits            Type traits class template and specializations

Source modules may have header and implementation files or just header files. Only the header files for the modules in green would normally be included directly in project code, but the other headers can be used if desired. Classes intended for use in project code are forward declared in headers of the form Class.fwd.hh along with typedef names that are provided for coding convenience.


ObjexxFCL Applications

The ObjexxFCL is compatible with common 32-bit and 64-bit platforms. Very large (64-bit size type) Array and ChunkVector arrays are supported on 64-bit platforms, but indexing into each dimension of an Array is still done with int types, so each dimension's index range is limited to the range of a (32-bit) int.

The ObjexxFCL is currently intended for use with single-threaded applications and is not thread safe.

The ObjexxFCL can be built into a shared library or dynamic link library (DLL), but such use should be carefully validated on each platform/compiler combination to ensure proper functioning. Mixing a shared library built with one compiler and executables built with a different compiler, or with a different version of the same compiler, may not work because the C++ ABIs can differ.


ObjexxFCL Design

Everything in the ObjexxFCL lives in the ObjexxFCL namespace. Normally projects would bring everything into visibility with a "using namespace ObjexxFCL;" directive as in the ObjexxFCL.Project.hh header provided. Even with such a directive the ObjexxFCL:: prefix can be used for explicit disambiguation.

The design of the ObjexxFCL is focused on providing near-seamless Fortran migration support and near-Fortran performance. It is not intended to provide a complete linear algebra library or high-level matrix operations. Programming errors in the use of the ObjexxFCL are caught by assertions rather than by run-time checks, so release builds are not slowed down, but this means testing must be done with debug builds that enable the assertion checks. C++ exceptions are not used, which avoids both their performance cost and the burden on project code of handling them.


IndexRange

The IndexRange classes encapsulate the arbitrary index range that Fortran arrays can use for each dimension (unlike the zero-based array indexing of native C/C++ arrays).

Zero-sized index ranges are indicated by index ranges of the form [lower,lower-1]. (Zero-sized Arrays are supported.)

"Unbounded" index ranges, which have unknown upper bounds, have the form [lower,lower-2] and report their size as a constant named npos, defined as -1 cast to the unsigned type size_t. Unbounded argument Arrays are created when a bare array element is passed (the "faster" method) to a function parameter that is an argument array taken by value.

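The two encodings above can be illustrated with a small sketch. RangeSketch is a hypothetical stand-in written for this guide, not the ObjexxFCL IndexRange class. Only the [lower,lower-1] and [lower,lower-2] conventions and the npos definition are taken from the text.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an index range; not the ObjexxFCL IndexRange API.
struct RangeSketch
{
    // The size_t cast of -1, used as the size of an unbounded range
    static constexpr std::size_t npos = static_cast< std::size_t >( -1 );

    int l; // lower index
    int u; // upper index

    bool empty() const { return u == l - 1; }     // [lower,lower-1]
    bool unbounded() const { return u == l - 2; } // [lower,lower-2]

    std::size_t size() const
    {
        if ( unbounded() ) return npos; // upper bound is unknown
        assert( u >= l - 1 ); // anything lower is not a valid range in this sketch
        return static_cast< std::size_t >( u - l + 1 );
    }
};
```

With this encoding RangeSketch{ -3, 4 } has size 8, RangeSketch{ 1, 0 } is zero-sized, and RangeSketch{ 1, -1 } is unbounded.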

Array Hierarchy

Design

The Array hierarchy is designed to achieve a number of goals:

  • Fortran-compatibility:
    • Arbitrary index ranges for each dimension
    • Array passing "tricks"
    • Fortran 90+ array slicing and member array support
    • Note: This ObjexxFCL variant uses row-major array layout, unlike Fortran's column-major layout
  • Fast, near-Fortran run-time performance

The data is stored in a dynamically allocated array that is owned by the corresponding real array and pointed to, but not owned, by any argument arrays that might refer to the real array. The row-major ordering is obtained by the formulas giving the index into the linear array from the set of array dimensional indexes.
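As an illustration of such a formula, the sketch below computes the row-major linear index of a 3D element with arbitrary lower bounds. The function row_major_index and its parameter names are hypothetical and are not taken from the ObjexxFCL source. The only property relied on is that in row-major order the last index varies fastest.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not ObjexxFCL code.
// Dimension d has lower bound l_d and size n_d, so its index range is
// [l_d, l_d + n_d - 1]. Returns the zero-based position in the linear array.
inline std::size_t
row_major_index(
    int i1, int i2, int i3,
    int l1, int l2, int l3,
    std::size_t n1, std::size_t n2, std::size_t n3
)
{
    // Zero-based offsets within each dimension
    std::size_t const k1( static_cast< std::size_t >( i1 - l1 ) );
    std::size_t const k2( static_cast< std::size_t >( i2 - l2 ) );
    std::size_t const k3( static_cast< std::size_t >( i3 - l3 ) );
    assert( ( i1 >= l1 ) && ( k1 < n1 ) );
    assert( ( i2 >= l2 ) && ( k2 < n2 ) );
    assert( ( i3 >= l3 ) && ( k3 < n3 ) );
    return ( k1 * n2 + k2 ) * n3 + k3; // the last index varies fastest
}
```

For index ranges [1,2] x [0,2] x [-1,2] the sizes are 2, 3, and 4, so element (1,0,-1) maps to linear index 0 and element (2,2,2) maps to 23, the last of the 24 elements.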

On the assumption that subscripting calls are the most common and performance-critical operations, the Array design caches some values to speed them up. The sizes of the index ranges of all but the last dimension are cached, and an offset pointer into the data array is cached. The const subscript operator returns its value by reference, which may have a small performance cost for built-in numeric types on some platforms but is necessary to support the passing of array elements to array arguments. Linear (one-dimensional) indexing is provided for very fast access to a sequence of array elements whose linear index is easily calculated.
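The benefit of caching can be seen in a minimal 2D sketch. Grid2Sketch is hypothetical and much simpler than the real Array classes. It caches only what its own row-major formula needs, namely the size of the second dimension and a precomputed shift that folds both lower bounds into one subtraction. The sizes the library itself caches may differ. The sketch also stores the shift as an integer rather than as an offset pointer, which is a simplification made here to keep the pointer arithmetic within the allocated block.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2D array with arbitrary index ranges; not an ObjexxFCL class.
class Grid2Sketch
{
public:
    // Index ranges are [l1,u1] and [l2,u2]; both are assumed to be bounded
    Grid2Sketch( int l1, int u1, int l2, int u2 ) :
     l1_( l1 ), u1_( u1 ), l2_( l2 ), u2_( u2 ),
     n2_( static_cast< std::ptrdiff_t >( u2 ) - l2 + 1 ),
     shift_( l1 * n2_ + l2 ),
     data_( static_cast< std::size_t >( ( static_cast< std::ptrdiff_t >( u1 ) - l1 + 1 ) * n2_ ) )
    {}

    // Subscript: one multiply, one add, one subtract using cached values
    double & operator ()( int i, int j )
    {
        assert( ( l1_ <= i ) && ( i <= u1_ ) && ( l2_ <= j ) && ( j <= u2_ ) );
        return data_[ static_cast< std::size_t >( i * n2_ + j - shift_ ) ];
    }

    // Linear indexing: no index arithmetic at all
    double & operator []( std::size_t k )
    {
        assert( k < data_.size() );
        return data_[ k ];
    }

    std::size_t size() const { return data_.size(); }

private:
    int l1_, u1_, l2_, u2_;
    std::ptrdiff_t n2_;    // cached size of the second dimension
    std::ptrdiff_t shift_; // cached l1*n2 + l2
    std::vector< double > data_;
};
```

For Grid2Sketch g( 1, 3, -1, 2 ) the cached shift is 3, so g( 1, -1 ) refers to the same element as g[ 0 ] and g( 3, 2 ) to the same element as g[ 11 ].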

The argument Arrays are proxy objects that provide a view to the contiguous data of another array but act as if it is their own data. Argument Arrays don't automatically reattach to arrays that are resized. Argument Arrays are quickly constructed for use in function argument lists.

Real and argument Arrays can be passed by reference when the function array type will always match that of the caller; a reference to the common Array base class can be passed any Arrays of the same rank and can perform all of the common array operations.

The argument arrays "work" by grabbing a pointer to the passed array, array section, or element and, when possible, extracting the size of the actual data section. They can then reinterpret the pointed-to data as an array of their declared rank. Function argument declarations cannot contain constructor arguments, so when the passed array is not of the same rank and dimensions as the argument array, or when array sections or elements will be passed, the argument array dimensions must be set with a call to the dim member function, as in A.dim(3,4), before the array is used in the function.
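The mechanism can be sketched with a small non-owning view. ArgView2Sketch and corner_sum are hypothetical names. Only the idea of grabbing a pointer plus size and then calling dim inside the function comes from the text. The 1-based indexing and the vector-based caller data are assumptions made to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning 2D view; not the ObjexxFCL argument Array class.
class ArgView2Sketch
{
public:
    // Implicit on purpose so a caller can pass its data directly in an
    // argument list. Grabs the data pointer and records the available size.
    ArgView2Sketch( std::vector< double > & v ) :
     data_( v.data() ), avail_( v.size() ), n1_( 0 ), n2_( 0 )
    {}

    // Set the dimensions after construction
    ArgView2Sketch & dim( std::size_t n1, std::size_t n2 )
    {
        assert( n1 * n2 <= avail_ ); // the view must fit in the actual data
        n1_ = n1;
        n2_ = n2;
        return *this;
    }

    // 1-based, row-major subscripting over the caller's data
    double & operator ()( std::size_t i, std::size_t j )
    {
        assert( ( i >= 1 ) && ( i <= n1_ ) && ( j >= 1 ) && ( j <= n2_ ) );
        return data_[ ( i - 1 ) * n2_ + ( j - 1 ) ];
    }

private:
    double * data_;     // not owned
    std::size_t avail_; // size of the underlying data
    std::size_t n1_, n2_;
};

// The parameter declaration carries no dimensions, so dim is called first
inline double corner_sum( ArgView2Sketch A )
{
    A.dim( 3, 4 );
    return A( 1, 1 ) + A( 3, 4 );
}
```

A caller holding twelve contiguous values can pass them straight to corner_sum, which reinterprets them as a 3 x 4 array without copying.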

When array elements are passed to argument arrays the argument array can only extract the address of that element for its data pointer and it has indeterminate size. The dim call can set a size but this cannot be checked against the actual underlying data array. The loss of size information will propagate through subsequent passing of that argument array, eliminating the possibility of bounds checking for those arrays. For this reason the argument member function, a, is provided for "safer" passing, as in A.a(2,3): an ArrayTail proxy object is constructed and passed that contains the data pointer and size information. The construction of ArrayTails has a performance cost that remains in release builds, so there is a definite tradeoff.
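The difference between passing a bare element and passing a tail can be sketched as follows. TailSketch, tail_from, and fits are hypothetical names, not the ArrayTail interface. The sketch only shows why carrying the remaining size alongside the pointer makes a later dimension check possible.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tail proxy; not the ObjexxFCL ArrayTail class.
struct TailSketch
{
    double * data;    // first element of the tail (not owned)
    std::size_t size; // number of elements from there to the end of the data
};

// Build a tail starting at linear position k of the caller's data.
// A bare element reference would carry only the address, not the size.
inline TailSketch tail_from( std::vector< double > & v, std::size_t k )
{
    assert( k < v.size() );
    return TailSketch{ v.data() + k, v.size() - k };
}

// With the size available, a requested n1 x n2 dimensioning can be checked
inline bool fits( TailSketch const & t, std::size_t n1, std::size_t n2 )
{
    return n1 * n2 <= t.size;
}
```

For twelve elements, a tail starting at position 4 has size 8, so a 2 x 4 dimensioning fits and a 3 x 3 dimensioning does not. The same check is impossible when only an element address is passed.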

Array assignment operators have value semantics but, unlike Fortran, will resize a real array if necessary. Real arrays can also be resized by the dimension member function. Resizing real Arrays invalidates any argument or slice arrays attached to them.

In order to achieve maximal run-time performance no array bounds checks are performed for any Array classes in release builds (when NDEBUG is defined). Bounds are checked by asserts in debug builds. All new code using Arrays should be tested with debug builds.
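The debug-only nature of these checks follows from the standard assert macro, as the short sketch below shows. The functions checked_offset and bounds_checks_enabled are hypothetical illustrations and not part of the ObjexxFCL.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: map index i in [l,u] to a zero-based offset.
// The range check exists only when NDEBUG is not defined.
inline std::size_t checked_offset( int i, int l, int u )
{
    assert( ( l <= i ) && ( i <= u ) ); // compiled out in release builds
    return static_cast< std::size_t >( i - l );
}

// Reports whether this translation unit was compiled with asserts enabled
inline bool bounds_checks_enabled()
{
#ifdef NDEBUG
    return false;
#else
    return true;
#endif
}
```

In a release build an out-of-range index passes through checked_offset silently, which is why new code should be exercised in a debug build first.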

The Array implementation is heavily tuned for performance and thus has some unusual features:

  • Protected data and some manually inlined functions are used to improve the performance of non-inlining debug builds.
  • Overrides of non-virtual functions are used to allow more efficient calls to be made from the concrete Array classes.

Extensions

The Array hierarchy could be extended in many ways for specific applications:

  • Always bounds-checked subscript functions (like std::vector::at).
  • Additional whole-array operations. Avoiding temporary arrays (such as with expression templates) where possible is important for performance.
  • An optional policy for data-preserving resize during automatic redimensioning.
  • Linear algebra operations: Gaussian and iterative solvers, inversion, etc.

Objexx has developed some of these and can develop custom extensions for clients.

In many cases it may make more sense to interface with existing matrix and numerical libraries. Arrays provide access to their row-major data arrays so they can work directly with libraries that accept arrays with this ordering. Copy in/out semantics will be required to interface with other array representations such as nested std::vectors, TNT, and Blitz++: this should be done as non-member functions declared and defined in separate files to avoid unnecessary dependencies.
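A minimal sketch of copy in/out with nested std::vectors, written as non-member functions, might look like the following. The names to_nested and from_nested are hypothetical. The sketch works on a raw pointer to row-major data, and how that pointer is obtained from an Array is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Nested = std::vector< std::vector< double > >;

// Copy out: flat row-major n1 x n2 data -> nested vectors
inline Nested to_nested( double const * flat, std::size_t n1, std::size_t n2 )
{
    Nested out( n1, std::vector< double >( n2 ) );
    for ( std::size_t i = 0; i < n1; ++i ) {
        for ( std::size_t j = 0; j < n2; ++j ) {
            out[ i ][ j ] = flat[ i * n2 + j ];
        }
    }
    return out;
}

// Copy in: nested vectors -> flat row-major data with n2 columns.
// The destination must hold at least in.size() * n2 elements.
inline void from_nested( Nested const & in, double * flat, std::size_t n2 )
{
    for ( std::size_t i = 0; i < in.size(); ++i ) {
        assert( in[ i ].size() == n2 ); // every row must have n2 columns
        for ( std::size_t j = 0; j < n2; ++j ) {
            flat[ i * n2 + j ] = in[ i ][ j ];
        }
    }
}
```

Keeping such functions in their own files means the array headers need no knowledge of the other representation.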

Slices

The ArrayS hierarchy is a new system that provides the ObjexxFCL array slice (section) support. Array slices are generally non-contiguous and thus cannot use the fast contiguous implementation of the Arrays and can't efficiently provide linear indexing. For these reasons ArrayS is currently a separate hierarchy from Array. This requires a lot of duplicated code for both types and can create overload complexities in application code, so an effort to integrate slice arrays into the Array hierarchy is planned.

Slice arrays have the same characteristics as Fortran slices: some operations are slower and memory cache performance can be degraded. Unlike Fortran, ObjexxFCL does not allow passing non-contiguous slices to contiguous array arguments.

In Fortran, if you pass an array slice to a contiguous array argument, such as an assumed-size array, the compiler will allow it but will perform slow copy-in, copy-out semantics. ObjexxFCL supports such usage only if the slice is actually contiguous. Otherwise the best solution is probably to provide the necessary slice array argument overloads.

ObjexxFCL slices use an efficient representation for good performance. This requires "drilling" into the underlying array data rather than resolving lookup calls through possibly many layers of slices.
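One way to picture such drilling is a 1D slice that stores a base pointer, an offset, and a stride. Taking a slice of a slice then composes the offsets and strides up front, so an element lookup costs the same no matter how many slicing steps produced the view. Slice1Sketch is a hypothetical, zero-based illustration and not the ArrayS implementation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical 1D slice over contiguous data; not an ObjexxFCL class.
class Slice1Sketch
{
public:
    // Element k of the slice is base[ offset + k * stride ]
    Slice1Sketch( double * base, std::ptrdiff_t offset, std::ptrdiff_t stride, std::size_t n ) :
     base_( base ), offset_( offset ), stride_( stride ), n_( n )
    {}

    double & operator []( std::size_t k ) const
    {
        assert( k < n_ );
        return base_[ offset_ + static_cast< std::ptrdiff_t >( k ) * stride_ ];
    }

    // Slice of a slice: n elements starting at start, taking every step-th.
    // The result points directly at the underlying data, not at this slice.
    Slice1Sketch slice( std::size_t start, std::size_t step, std::size_t n ) const
    {
        assert( step >= 1 );
        assert( ( n == 0 ) || ( start + ( n - 1 ) * step < n_ ) );
        return Slice1Sketch(
            base_,
            offset_ + static_cast< std::ptrdiff_t >( start ) * stride_,
            stride_ * static_cast< std::ptrdiff_t >( step ),
            n
        );
    }

    std::ptrdiff_t stride() const { return stride_; }
    std::size_t size() const { return n_; }

private:
    double * base_; // underlying data (not owned)
    std::ptrdiff_t offset_;
    std::ptrdiff_t stride_;
    std::size_t n_;
};
```

Slicing every second element of a 20-element array starting at position 1, then taking every third element of that slice starting at its position 2, gives a view with stride 6 that reads the underlying data directly.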


ChunkVector

ChunkVector is designed to support very large 1D arrays. It uses a std::vector of Chunk objects of user-controllable size to avoid trying to allocate massive contiguous blocks of memory in a possibly fragmented memory environment. By using power-of-two Chunk sizes the 2-level indexing can be done with bit shift operations and provides speed competitive with that of std::vector. As of v.2.4.0 ChunkVector was rewritten to hold Chunk objects, which handle their own memory management, so that some problems with using nested std::vectors for the Chunks could be avoided, including:

  • No control over the possible excess capacity in each Chunk (without expensive shrink operations)
  • No way to avoid initialization of elements of built-in value types
  • No bounds checking in debug builds
  • No non-preserving resize operations

ChunkVector::resize was written to take advantage of the ability to swap the old Chunks into the new outer std::vector instead of expensively copying each Chunk as std::vector::resize would do.
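The two ideas in this section, bit-shift indexing with power-of-two chunks and a resize that moves existing chunks instead of copying them, can be sketched together. ChunkVectorSketch is hypothetical and much simpler than ChunkVector. It uses std::unique_ptr< double[] > in place of the Chunk class and gives every chunk the full chunk size, so it does not model control of excess capacity.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical chunked vector of double; not the ObjexxFCL ChunkVector.
class ChunkVectorSketch
{
public:
    // The chunk size is 2^exponent elements
    ChunkVectorSketch( std::size_t size, unsigned exponent ) :
     exponent_( exponent ),
     chunk_size_( std::size_t( 1 ) << exponent ),
     mask_( chunk_size_ - 1 ),
     size_( 0 )
    {
        resize( size );
    }

    // Two-level indexing: a shift selects the chunk, a mask the element
    double & operator []( std::size_t i )
    {
        assert( i < size_ );
        return chunks_[ i >> exponent_ ][ i & mask_ ];
    }

    // Resize that keeps existing chunks by moving their pointers into a
    // new outer vector. No element is copied and new elements are left
    // uninitialized.
    void resize( std::size_t new_size )
    {
        std::size_t const n_chunks( ( new_size + chunk_size_ - 1 ) >> exponent_ );
        std::vector< std::unique_ptr< double[] > > fresh( n_chunks );
        for ( std::size_t k = 0; k < n_chunks; ++k ) {
            if ( k < chunks_.size() ) {
                fresh[ k ] = std::move( chunks_[ k ] ); // pointer move only
            } else {
                fresh[ k ].reset( new double[ chunk_size_ ] ); // uninitialized
            }
        }
        chunks_.swap( fresh ); // chunks no longer needed are freed with fresh
        size_ = new_size;
    }

    std::size_t size() const { return size_; }
    std::size_t chunk_count() const { return chunks_.size(); }

private:
    unsigned exponent_;
    std::size_t chunk_size_;
    std::size_t mask_;
    std::size_t size_;
    std::vector< std::unique_ptr< double[] > > chunks_;
};
```

Because only chunk pointers move during a resize, the address of an element that survives the resize does not change in this sketch.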