In massively parallel environments, an often difficult problem is
the reading and writing of data to files on disk. MPI-IO and MPI-2 I/O
are moving toward providing this capability, but are not yet
widely implemented. Further, the API is rather abstruse.
mpp_io_mod is an attempt at a simple API encompassing a
certain variety of the I/O tasks that will be required. It does not
attempt to be an all-encompassing standard such as MPI; however, it
can be implemented in MPI if so desired. It is equally simple to add
parallel I/O capability to mpp_io_mod based on vendor-specific
APIs while providing a layer of insulation for user codes.
The mpp_io_mod parallel I/O API is built on top of the
mpp_domains_mod and mpp_mod APIs for domain decomposition and
message passing. Features of mpp_io_mod include:
1) Simple, minimal API, with free access to underlying API for more
complicated stuff.
2) Self-describing files: comprehensive header information
(metadata) in the file itself.
3) Strong focus on performance of parallel write: the climate models
for which it is designed typically read a minimal amount of data
(typically at the beginning of the run), but tend to write copious
amounts of data during the run. An interface for reading is also
supplied, but its performance has not yet been optimized.
4) Integrated netCDF capability: netCDF is a
data format widely used in the climate/weather modeling
community. netCDF is considered the principal medium of data storage
for mpp_io_mod. But I provide a raw unformatted
Fortran I/O capability in case netCDF is not an option, either due to
unavailability, inappropriateness, or poor performance.
5) May require off-line post-processing: a tool for this purpose,
mppnccombine, is available. GFDL users may use
~hnv/pub/mppnccombine. Outside users may obtain the source
here. It can be compiled with any C compiler and linked with the
netCDF library. The program is free and is covered by the
GPL license.
The internal representation of the data being written out is
assumed to be the default real type, which can be 4-byte or 8-byte.
Time data is always written in 8 bytes to avoid overflow on climatic
time scales in units of seconds.
I/O modes in mpp_io_mod
The I/O activity critical to performance in the models for which
mpp_io_mod is designed is typically the writing of large
datasets on a model grid volume produced at intervals during
a run. Consider a 3D grid volume, where model arrays are stored as
(i,j,k). The domain decomposition is typically along
i or j: thus, to store the data to disk as a global
volume, the distributed chunks of data have to be seen as
non-contiguous. If we attempt to have all PEs write this data into a
single file, performance can be seriously compromised because of the
data reordering that will be required. Alternatives are to have
one PE acquire all the data and write it out, or to have all the PEs
write independent files, which are recombined offline. These three
modes of operation are described in the mpp_io_mod terminology
in terms of two parameters, threading and fileset,
as follows:
1) Single-threaded I/O: a single PE acquires all the data
and writes it out.
2) Multi-threaded, single-fileset I/O: many PEs write to a
single file.
3) Multi-threaded, multi-fileset I/O: many PEs write to
independent files. This is also called distributed I/O.
The middle option is the one for which it is most difficult to achieve
good performance. The choice of one of these modes is made when a file
is opened for I/O, in mpp_open.
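As an illustration, the mode might be selected at the time the file is
opened, roughly as follows. This is only a sketch: it assumes the
MPP_WRONLY, MPP_NETCDF, MPP_SINGLE and MPP_MULTI flags and the
threading/fileset arguments of mpp_open described in its call syntax
below, and the file name is hypothetical.
! single-threaded I/O: one PE acquires and writes all the data
call mpp_open( unit, 'fields.nc', action=MPP_WRONLY, form=MPP_NETCDF, &
     threading=MPP_SINGLE )
! multi-threaded, single-fileset I/O: many PEs write to one file
call mpp_open( unit, 'fields.nc', action=MPP_WRONLY, form=MPP_NETCDF, &
     threading=MPP_MULTI, fileset=MPP_SINGLE )
! multi-threaded, multi-fileset (distributed) I/O: each PE writes its own file
call mpp_open( unit, 'fields.nc', action=MPP_WRONLY, form=MPP_NETCDF, &
     threading=MPP_MULTI, fileset=MPP_MULTI )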
Metadata in mpp_io_mod
A requirement of the design of mpp_io_mod is that the file must
be entirely self-describing: comprehensive header information
describing its contents is present in the header of every file. The
header information follows the model of netCDF. Variables in the file
are divided into axes and fields. An axis describes a
co-ordinate variable, e.g. x, y, z or t. A field consists of data in
the space described by the axes. An axis is described in
mpp_io_mod using the defined type axistype:
type, public :: axistype
   sequence
   character(len=128) :: name
   character(len=128) :: units
   character(len=256) :: longname
   character(len=8) :: cartesian
   integer :: len
   integer :: sense            !+/-1, depth or height?
   type(domain1D), pointer :: domain
   real, dimension(:), pointer :: data
   integer :: id, did
   integer :: type             ! external NetCDF type format for axis data
   integer :: natt
   type(atttype), pointer :: Att(:) ! axis attributes
end type axistype
A field is described using the type
fieldtype:
type, public :: fieldtype
   sequence
   character(len=128) :: name
   character(len=128) :: units
   character(len=256) :: longname
   real :: min, max, missing, fill, scale, add
   integer :: pack
   type(axistype), dimension(:), pointer :: axes
   integer, dimension(:), pointer :: size
   integer :: time_axis_index
   integer :: id
   integer :: type             ! external NetCDF format for field data
   integer :: natt, ndim
   type(atttype), pointer :: Att(:) ! field metadata
end type fieldtype
An attribute (global, field or axis) is described using the
atttype:
type, public :: atttype
   sequence
   integer :: type, len
   character(len=128) :: name
   character(len=256) :: catt
   real(FLOAT_KIND), pointer :: fatt(:)
end type atttype
This default set of field attributes corresponds
closely to various conventions established for netCDF files. The
pack attribute of a field defines whether or not a
field is to be packed on output. Allowed values of
pack are 1, 2, 4 and 8. The value of
pack is the number of variables written into 8
bytes. In typical use, we write 4-byte reals to netCDF output; thus
the default value of pack is 2. For pack = 4 or 8, packing uses a
simple-minded linear scaling scheme using the scale and add
attributes. There is thus likely to be a significant loss of dynamic
range with packing. When a field is declared to be packed, the
missing and fill attributes, if supplied, are packed also.
Please note that the pack values have the same meaning even if the
default real is 4 bytes, i.e. pack=1 still follows the definition
above and writes out 8 bytes.
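As an illustration, packing might be controlled per field when its
metadata are declared with mpp_write_meta (described below), as in the
following sketch. It assumes the optional pack, scale and add arguments
of mpp_write_meta; u and v are hypothetical variables of type
fieldtype, the scale/add values are arbitrary, and x, y, z, t are axes
as in the example further below.
! write field 'u' unpacked, as 8-byte reals (pack=1)
call mpp_write_meta( unit, u, (/x,y,z,t/), 'u', 'm/s', 'zonal velocity', &
     missing=-1e36, pack=1 )
! write field 'v' packed into 2-byte values (pack=4) via linear scale/add
call mpp_write_meta( unit, v, (/x,y,z,t/), 'v', 'm/s', 'meridional velocity', &
     missing=-1e36, scale=0.01, add=0., pack=4 )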
A set of attributes for each variable is also available. The
variable definitions and attribute information are written/read by
calling mpp_write_meta or mpp_read_meta. A typical calling
sequence for writing data might be:
...
type(domain2D), dimension(:), allocatable, target :: domain
type(fieldtype) :: field
type(axistype) :: x, y, z, t
...
call mpp_define_domains( (/1,nx,1,ny/), domain )
allocate( a(domain(pe)%x%data%start_index:domain(pe)%x%data%end_index, &
            domain(pe)%y%data%start_index:domain(pe)%y%data%end_index,nz) )
...
call mpp_write_meta( unit, x, 'X', 'km', 'X distance', &
     domain=domain(pe)%x, data=(/(float(i),i=1,nx)/) )
call mpp_write_meta( unit, y, 'Y', 'km', 'Y distance', &
     domain=domain(pe)%y, data=(/(float(i),i=1,ny)/) )
call mpp_write_meta( unit, z, 'Z', 'km', 'Z distance', &
     data=(/(float(i),i=1,nz)/) )
call mpp_write_meta( unit, t, 'Time', 'second', 'Time' )
call mpp_write_meta( unit, field, (/x,y,z,t/), 'a', '(m/s)', 'AAA', &
     missing=-1e36 )
...
call mpp_write( unit, x )
call mpp_write( unit, y )
call mpp_write( unit, z )
...
In this example, x and y have been
declared as distributed axes, since a domain decomposition has been
associated. z and t are undistributed axes. t is known to be a
record axis (netCDF terminology) since we do not allocate the
data element of the axistype. Only one record axis may be
associated with a file. The call to mpp_write_meta initializes
the axes, and associates a unique variable ID with each axis. The call
to mpp_write_meta with argument field declares
field to be a 4D variable that is a function of (x,y,z,t), and a
unique variable ID is associated with it. A 3D field will be written
at each call to mpp_write(field).
The data for any variable, including axes, are written by mpp_write.
Any additional attributes of variables can be added through
subsequent mpp_write_meta calls, using the variable ID as a
handle. Global attributes, associated with the dataset as a
whole, can also be written thus. See the mpp_write_meta call syntax
below for further details.
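For example, an additional attribute might be attached to the field
declared above via its ID, and a global attribute written with no ID
at all. This is a sketch: it assumes the attribute forms of
mpp_write_meta (the rval/cval keyword arguments) given in the call
syntax below, and the attribute names and values are hypothetical.
! an extra attribute on the field, using its variable ID as a handle
call mpp_write_meta( unit, field%id, 'valid_range', rval=(/-50.,50./) )
! a global attribute, associated with the file as a whole
call mpp_write_meta( unit, 'title', cval='example mpp_io output' )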
You cannot interleave calls to mpp_write and
mpp_write_meta: the first call to mpp_write implies that metadata
specification is complete.
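To complete the writing example, the distributed field data might then
be written once per output interval, and the file closed at the end of
the run. This is a sketch: nsteps is a hypothetical loop count, and
tstamp is assumed to be the model time in the units declared for the
record axis t.
do n = 1,nsteps
   ! ...compute a and advance tstamp...
   call mpp_write( unit, field, domain(pe), a, tstamp )
enddo
call mpp_close(unit)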
A typical calling sequence for reading data might be:
...
integer :: unit, natt, nvar, ntime
type(domain2D), dimension(:), allocatable, target :: domain
type(fieldtype), allocatable, dimension(:) :: fields
type(atttype), allocatable, dimension(:) :: global_atts
real, allocatable, dimension(:) :: times
...
call mpp_define_domains( (/1,nx,1,ny/), domain )
call mpp_read_meta(unit)
call mpp_get_info(unit,natt,nvar,ntime)
allocate(global_atts(natt))
call mpp_get_atts(unit,global_atts)
allocate(fields(nvar))
call mpp_get_vars(unit, fields)
allocate(times(ntime))
call mpp_get_times(unit, times)
allocate( a(domain(pe)%x%data%start_index:domain(pe)%x%data%end_index, &
            domain(pe)%y%data%start_index:domain(pe)%y%data%end_index,nz) )
...
do i=1, nvar
   if (fields(i)%name == 'a') &
      call mpp_read(unit, fields(i), domain(pe), a, tindex)
enddo
...
In this example, the data are distributed as in the previous
example. The call to mpp_read_meta initializes
all of the metadata associated with the file, including global
attributes, variable attributes and non-record dimension data. The
call to mpp_get_info returns the number of global
attributes (natt), variables (nvar) and
time levels (ntime) associated with the file
identified by a unique ID (unit).
mpp_get_atts returns all global attributes for
the file in an array of derived type atttype of length natt.
mpp_get_vars returns the variable types in an array of
fieldtype of length nvar. Since the record dimension data are not
allocated for calls to mpp_write, a separate call to
mpp_get_times is required to access record dimension data. Subsequent
calls to mpp_read return the field data arrays corresponding to
the fieldtype. The domain type is an optional
argument. If domain is omitted, the incoming field
array should be dimensioned for the global domain; otherwise, the
field data are assigned to the computational domain of a local array.
Multi-fileset reads are not supported with mpp_read.
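To complete the reading example, successive time levels of the field
might be read by passing the record index explicitly, and the file
closed when done. This is a sketch; it assumes the optional tindex
argument of mpp_read described in its call syntax below.
do i=1, nvar
   if (fields(i)%name == 'a') then
      do n = 1,ntime
         call mpp_read( unit, fields(i), domain(pe), a, tindex=n )
         ! ...process a for time level times(n)...
      enddo
   endif
enddo
call mpp_close(unit)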