!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! !! !! !! GNU General Public License !! !! !! !! This file is part of the Flexible Modeling System (FMS). !! !! !! !! FMS is free software; you can redistribute it and/or modify !! !! it and are expected to follow the terms of the GNU General Public !! !! License as published by the Free Software Foundation. !! !! !! !! FMS is distributed in the hope that it will be useful, !! !! but WITHOUT ANY WARRANTY; without even the implied warranty of !! !! MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the !! !! GNU General Public License for more details. !! !! !! !! You should have received a copy of the GNU General Public License !! !! along with FMS; if not, write to: !! !! Free Software Foundation, Inc. !! !! 59 Temple Place, Suite 330 !! !! Boston, MA 02111-1307 USA !! !! or see: !! !! http://www.gnu.org/licenses/gpl.txt !! !! !! !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! !----------------------------------------------------------------------- ! Communication for message-passing codes ! ! AUTHOR: V. Balaji (V.Balaji@noaa.gov) ! SGI/GFDL Princeton University ! ! This program is free software; you can redistribute it and/or modify ! it under the terms of the GNU General Public License as published by ! the Free Software Foundation; either version 2 of the License, or ! (at your option) any later version. ! ! This program is distributed in the hope that it will be useful, ! but WITHOUT ANY WARRANTY; without even the implied warranty of ! MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ! GNU General Public License for more details. ! ! For the full text of the GNU General Public License, ! write to: Free Software Foundation, Inc., ! 675 Mass Ave, Cambridge, MA 02139, USA. !----------------------------------------------------------------------- module mpp_mod !a generalized communication package for use with shmem and MPI !will add: co_array_fortran, MPI2 !Balaji (V.Balaji@noaa.gov) 11 May 1998 ! ! V. Balaji ! ! ! ! ! mpp_mod, is a set of simple calls to provide a uniform interface ! to different message-passing libraries. It currently can be ! implemented either in the SGI/Cray native SHMEM library or in the MPI ! standard. Other libraries (e.g MPI-2, Co-Array Fortran) can be ! incorporated as the need arises. ! ! ! The data transfer between a processor and its own memory is based ! on load and store operations upon ! memory. Shared-memory systems (including distributed shared memory ! systems) have a single address space and any processor can acquire any ! data within the memory by load and ! store. The situation is different for distributed ! parallel systems. Specialized MPP systems such as the T3E can simulate ! shared-memory by direct data acquisition from remote memory. But if ! the parallel code is distributed across a cluster, or across the Net, ! messages must be sent and received using the protocols for ! long-distance communication, such as TCP/IP. This requires a ! ``handshaking'' between nodes of the distributed system. One can think ! of the two different methods as involving puts or ! gets (e.g the SHMEM library), or in the case of ! negotiated communication (e.g MPI), sends and ! recvs. ! ! The difference between SHMEM and MPI is that SHMEM uses one-sided ! communication, which can have very low-latency high-bandwidth ! implementations on tightly coupled systems. MPI is a standard ! developed for distributed computing across loosely-coupled systems, ! and therefore incurs a software penalty for negotiating the ! communication. It is however an open industry standard whereas SHMEM ! is a proprietary interface. Besides, the puts or ! gets on which it is based cannot currently be implemented in ! a cluster environment (there are recent announcements from Compaq that ! occasion hope). ! ! The message-passing requirements of climate and weather codes can be ! reduced to a fairly simple minimal set, which is easily implemented in ! any message-passing API. mpp_mod provides this API. ! ! Features of mpp_mod include: ! ! 1) Simple, minimal API, with free access to underlying API for ! more complicated stuff.
! 2) Design toward typical use in climate/weather CFD codes.
! 3) Performance to be not significantly lower than any native API. ! ! This module is used to develop higher-level calls for domain decomposition and parallel I/O. ! ! Parallel computing is initially daunting, but it soon becomes ! second nature, much the way many of us can now write vector code ! without much effort. The key insight required while reading and ! writing parallel code is in arriving at a mental grasp of several ! independent parallel execution streams through the same code (the SPMD ! model). Each variable you examine may have different values for each ! stream, the processor ID being an obvious example. Subroutines and ! function calls are particularly subtle, since it is not always obvious ! from looking at a call what synchronization between execution streams ! it implies. An example of erroneous code would be a global barrier ! call (see mpp_sync below) placed ! within a code block that not all PEs will execute, e.g: ! !
!   if( pe.EQ.0 )call mpp_sync()
!   
! ! Here only PE 0 reaches the barrier, where it will wait ! indefinitely. While this is a particularly egregious example to ! illustrate the coding flaw, more subtle versions of the same are ! among the most common errors in parallel code. ! ! It is therefore important to be conscious of the context of a ! subroutine or function call, and the implied synchronization. There ! are certain calls here (e.g mpp_declare_pelist, mpp_init, ! mpp_malloc, mpp_set_stack_size) which must be called by all ! PEs. There are others which must be called by a subset of PEs (here ! called a pelist) which must be called by all the PEs in the ! pelist (e.g mpp_max, mpp_sum, mpp_sync). Still ! others imply no synchronization at all. I will make every effort to ! highlight the context of each call in the MPP modules, so that the ! implicit synchronization is spelt out. ! ! For performance it is necessary to keep synchronization as limited ! as the algorithm being implemented will allow. For instance, a single ! message between two PEs should only imply synchronization across the ! PEs in question. A global synchronization (or barrier) ! is likely to be slow, and is best avoided. But codes first ! parallelized on a Cray T3E tend to have many global syncs, as very ! fast barriers were implemented there in hardware. ! ! Another reason to use pelists is to run a single program in MPMD ! mode, where different PE subsets work on different portions of the ! code. A typical example is to assign an ocean model and atmosphere ! model to different PE subsets, and couple them concurrently instead of ! running them serially. The MPP module provides the notion of a ! current pelist, which is set when a group of PEs branch off ! into a subset. Subsequent calls that omit the pelist optional ! argument (seen below in many of the individual calls) assume that the ! implied synchronization is across the current pelist. The calls ! mpp_root_pe and mpp_npes also return the values ! appropriate to the current pelist. The mpp_set_current_pelist ! call is provided to set the current pelist. !
! ! F90 is a strictly-typed language, and the syntax pass of the ! compiler requires matching of type, kind and rank (TKR). Most calls ! listed here use a generic type, shown here as MPP_TYPE_. This ! is resolved in the pre-processor stage to any of a variety of ! types. In general the MPP operations work on 4-byte and 8-byte ! variants of integer, real, complex, logical variables, of ! rank 0 to 5, leading to 48 specific module procedures under the same ! generic interface. Any of the variables below shown as ! MPP_TYPE_ is treated in this way. ! #include #if defined(use_libSMA) && defined(sgi_mipspro) use shmem_interface #endif #if defined(use_libMPI) && defined(sgi_mipspro) use mpi #endif use mpp_parameter_mod, only : MPP_VERBOSE, MPP_DEBUG, ALL_PES, ANY_PE, NULL_PE use mpp_parameter_mod, only : NOTE, WARNING, FATAL, MPP_CLOCK_DETAILED,MPP_CLOCK_SYNC use mpp_parameter_mod, only : CLOCK_COMPONENT, CLOCK_SUBCOMPONENT, CLOCK_MODULE_DRIVER use mpp_parameter_mod, only : CLOCK_MODULE, CLOCK_ROUTINE, CLOCK_LOOP, CLOCK_INFRA use mpp_parameter_mod, only : MAX_EVENTS, MAX_BINS, MAX_EVENT_TYPES, PESET_MAX, MAX_CLOCKS use mpp_parameter_mod, only : MAXPES, EVENT_WAIT, EVENT_ALLREDUCE, EVENT_BROADCAST use mpp_parameter_mod, only : EVENT_RECV, EVENT_SEND, MPP_READY, MPP_WAIT use mpp_parameter_mod, only : mpp_parameter_version=>version, mpp_parameter_tagname=>tagname use mpp_data_mod, only : stat, mpp_stack, ptr_stack, status, ptr_status, sync, ptr_sync use mpp_data_mod, only : mpp_from_pe, ptr_from, remote_data_loc, ptr_remote use mpp_data_mod, only : mpp_data_version=>version, mpp_data_tagname=>tagname implicit none private #if defined(use_libSMA) #include #endif #if defined(use_libMPI) && !defined(sgi_mipspro) #include !sgi_mipspro gets this from 'use mpi' #endif !--- public paramters ----------------------------------------------- public :: MPP_VERBOSE, MPP_DEBUG, ALL_PES, ANY_PE, NULL_PE, NOTE, WARNING, FATAL public :: MPP_CLOCK_SYNC, MPP_CLOCK_DETAILED, CLOCK_COMPONENT, CLOCK_SUBCOMPONENT public :: CLOCK_MODULE_DRIVER, CLOCK_MODULE, CLOCK_ROUTINE, CLOCK_LOOP, CLOCK_INFRA public :: MAXPES, EVENT_RECV, EVENT_SEND !--- public data from mpp_data_mod ------------------------------ public :: request !--- public interface from mpp_util.h ------------------------------ public :: stdin, stdout, stderr, stdlog, lowercase, uppercase, mpp_error, mpp_error_state public :: mpp_set_warn_level, mpp_sync, mpp_sync_self, mpp_set_stack_size, mpp_pe public :: mpp_node, mpp_npes, mpp_root_pe, mpp_set_root_pe, mpp_declare_pelist public :: mpp_get_current_pelist, mpp_set_current_pelist, mpp_clock_begin, mpp_clock_end public :: mpp_clock_id, mpp_clock_set_grain, mpp_record_timing_data, get_unit !--- public interface from mpp_comm.h ------------------------------ public :: mpp_chksum, mpp_max, mpp_min, mpp_sum, mpp_transmit, mpp_send, mpp_recv public :: mpp_broadcast, mpp_malloc, mpp_init, mpp_exit #ifdef use_MPI_GSM public :: mpp_gsm_malloc, mpp_gsm_free #endif !********************************************************************* ! ! public data type ! !********************************************************************* !peset hold communicators as SHMEM-compatible triads (start, log2(stride), num) type :: communicator private character(len=32) :: name integer, pointer :: list(:) =>NULL() integer :: count integer :: start, log2stride ! dummy variables when libMPI is defined. integer :: id, group ! MPI communicator and group id for this PE set. ! dummy variables when libSMA is defined. end type communicator type :: event private character(len=16) :: name integer(LONG_KIND), dimension(MAX_EVENTS) :: ticks, bytes integer :: calls end type event !a clock contains an array of event profiles for a region type :: clock private character(len=32) :: name integer(LONG_KIND) :: tick integer(LONG_KIND) :: total_ticks integer :: peset_num logical :: sync_on_begin, detailed integer :: grain type(event), pointer :: events(:) =>NULL() !if needed, allocate to MAX_EVENT_TYPES logical :: is_on !initialize to false. set true when calling mpp_clock_begin ! set false when calling mpp_clock_end end type clock type :: Clock_Data_Summary private character(len=16) :: name real(DOUBLE_KIND) :: msg_size_sums(MAX_BINS) real(DOUBLE_KIND) :: msg_time_sums(MAX_BINS) real(DOUBLE_KIND) :: total_data real(DOUBLE_KIND) :: total_time integer(LONG_KIND) :: msg_size_cnts(MAX_BINS) integer(LONG_KIND) :: total_cnts end type Clock_Data_Summary type :: Summary_Struct private character(len=16) :: name type (Clock_Data_Summary) :: event(MAX_EVENT_TYPES) end type Summary_Struct !*********************************************************************** ! ! public interface from mpp_util.h ! !*********************************************************************** ! ! ! Error handler. ! ! ! It is strongly recommended that all error exits pass through ! mpp_error to assure the program fails cleanly. An individual ! PE encountering a STOP statement, for instance, can cause the ! program to hang. The use of the STOP statement is strongly ! discouraged. ! ! Calling mpp_error with no arguments produces an immediate error ! exit, i.e: !
  !    call mpp_error
  !    call mpp_error(FATAL)
  !    
! are equivalent. ! ! The argument order !
  !    call mpp_error( routine, errormsg, errortype )
  !    
! is also provided to support legacy code. In this version of the ! call, none of the arguments may be omitted. ! ! The behaviour of mpp_error for a WARNING can be ! controlled with an additional call mpp_set_warn_level. !
  !    call mpp_set_warn_level(ERROR)
  !    
! causes mpp_error to treat WARNING ! exactly like FATAL. !
  !    call mpp_set_warn_level(WARNING)
  !    
! resets to the default behaviour described above. ! ! mpp_error also has an internal error state which ! maintains knowledge of whether a warning has been issued. This can be ! used at startup in a subroutine that checks if the model has been ! properly configured. You can generate a series of warnings using ! mpp_error, and then check at the end if any warnings has been ! issued using the function mpp_error_state(). If the value of ! this is WARNING, at least one warning has been issued, and ! the user can take appropriate action: ! !
  !    if( ... )call mpp_error( WARNING, '...' )
  !    if( ... )call mpp_error( WARNING, '...' )
  !    if( ... )call mpp_error( WARNING, '...' )
  !    ...
  !    if( mpp_error_state().EQ.WARNING )call mpp_error( FATAL, '...' )
  !    
!
! ! ! One of NOTE, WARNING or FATAL ! (these definitions are acquired by use association). ! NOTE writes errormsg to STDOUT. ! WARNING writes errormsg to STDERR. ! FATAL writes errormsg to STDERR, ! and induces a clean error exit with a call stack traceback. ! !
interface mpp_error module procedure mpp_error_basic module procedure mpp_error_mesg module procedure mpp_error_noargs module procedure mpp_error_is module procedure mpp_error_rs module procedure mpp_error_ia module procedure mpp_error_ra module procedure mpp_error_ia_ia module procedure mpp_error_ia_ra module procedure mpp_error_ra_ia module procedure mpp_error_ra_ra module procedure mpp_error_ia_is module procedure mpp_error_ia_rs module procedure mpp_error_ra_is module procedure mpp_error_ra_rs module procedure mpp_error_is_ia module procedure mpp_error_is_ra module procedure mpp_error_rs_ia module procedure mpp_error_rs_ra module procedure mpp_error_is_is module procedure mpp_error_is_rs module procedure mpp_error_rs_is module procedure mpp_error_rs_rs end interface interface array_to_char module procedure iarray_to_char module procedure rarray_to_char end interface !*********************************************************************** ! ! public interface from mpp_comm.h ! !*********************************************************************** #ifdef use_libSMA !currently SMA contains no generic shmem_wait for different integer kinds: !I have inserted one here interface shmem_integer_wait module procedure shmem_int4_wait_local module procedure shmem_int8_wait_local end interface #endif !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! ! ! ! ROUTINES TO INITIALIZE/FINALIZE MPP MODULE: mpp_init, mpp_exit ! ! ! !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! ! ! ! Initialize mpp_mod. ! ! ! Called to initialize the mpp_mod package. It is recommended ! that this call be the first executed line in your program. It sets the ! number of PEs assigned to this run (acquired from the command line, or ! through the environment variable NPES), and associates an ID ! number to each PE. These can be accessed by calling mpp_npes and mpp_pe. ! ! ! ! flags can be set to MPP_VERBOSE to ! have mpp_mod keep you informed of what it's up to. ! ! ! ! ! Exit mpp_mod. ! ! ! Called at the end of the run, or to re-initialize mpp_mod, ! should you require that for some odd reason. ! ! This call implies synchronization across all PEs. ! ! ! !####################################################################### ! ! ! Symmetric memory allocation. ! ! ! This routine is used on SGI systems when mpp_mod is ! invoked in the SHMEM library. It ensures that dynamically allocated ! memory can be used with shmem_get and ! shmem_put. This is called symmetric ! allocation and is described in the ! intro_shmem man page. ptr is a Cray ! pointer (see the section on portability). The operation can be expensive ! (since it requires a global barrier). We therefore attempt to re-use ! existing allocation whenever possible. Therefore len ! and ptr must have the SAVE attribute ! in the calling routine, and retain the information about the last call ! to mpp_malloc. Additional memory is symmetrically ! allocated if and only if newlen exceeds ! len. ! ! This is never required on Cray PVP or MPP systems. While the T3E ! manpages do talk about symmetric allocation, mpp_mod ! is coded to remove this restriction. ! ! It is never required if mpp_mod is invoked in MPI. ! ! This call implies synchronization across all PEs. ! ! ! ! a cray pointer, points to a dummy argument in this routine. ! ! ! the required allocation length for the pointer ptr ! ! ! the current allocation (0 if unallocated). ! ! !##################################################################### ! ! ! Allocate module internal workspace. ! ! ! mpp_mod maintains a private internal array called ! mpp_stack for private workspace. This call sets the length, ! in words, of this array. ! ! The mpp_init call sets this ! workspace length to a default of 32768, and this call may be used if a ! longer workspace is needed. ! ! This call implies synchronization across all PEs. ! ! This workspace is symmetrically allocated, as required for ! efficient communication on SGI and Cray MPP systems. Since symmetric ! allocation must be performed by all PEs in a job, this call ! must also be called by all PEs, using the same value of ! n. Calling mpp_set_stack_size from a subset of PEs, ! or with unequal argument n, may cause the program to hang. ! ! If any MPP call using mpp_stack overflows the declared ! stack array, the program will abort with a message specifying the ! stack length that is required. Many users wonder why, if the required ! stack length can be computed, it cannot also be specified at that ! point. This cannot be automated because there is no way for the ! program to know if all PEs are present at that call, and with equal ! values of n. The program must be rerun by the user with the ! correct argument to mpp_set_stack_size, called at an ! appropriate point in the code where all PEs are known to be present. ! ! ! ! !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! ! ! ! GLOBAL REDUCTION ROUTINES: mpp_max, mpp_sum, mpp_min ! ! ! !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! ! ! ! Reduction operations. ! ! ! Find the max of scalar a the PEs in pelist ! result is also automatically broadcast to all PEs ! ! ! ! real or integer, of 4-byte of 8-byte kind. ! ! ! If pelist is omitted, the context is assumed to be the ! current pelist. This call implies synchronization across the PEs in ! pelist, or the current pelist if pelist is absent. ! ! interface mpp_max module procedure mpp_max_real8 #ifndef no_8byte_integers module procedure mpp_max_int8 #endif #ifdef OVERLOAD_R4 module procedure mpp_max_real4 #endif module procedure mpp_max_int4 end interface interface mpp_min module procedure mpp_min_real8 #ifndef no_8byte_integers module procedure mpp_min_int8 #endif #ifdef OVERLOAD_R4 module procedure mpp_min_real4 #endif module procedure mpp_min_int4 end interface ! ! ! Reduction operation. ! ! ! MPP_TYPE_ corresponds to any 4-byte and 8-byte variant of ! integer, real, complex variables, of rank 0 or 1. A ! contiguous block from a multi-dimensional array may be passed by its ! starting address and its length, as in f77. ! ! Library reduction operators are not required or guaranteed to be ! bit-reproducible. In any case, changing the processor count changes ! the data layout, and thus very likely the order of operations. For ! bit-reproducible sums of distributed arrays, consider using the ! mpp_global_sum routine provided by the mpp_domains module. ! ! The bit_reproducible flag provided in earlier versions of ! this routine has been removed. ! ! ! If pelist is omitted, the context is assumed to be the ! current pelist. This call implies synchronization across the PEs in ! pelist, or the current pelist if pelist is absent. ! ! ! ! ! ! interface mpp_sum #ifndef no_8byte_integers module procedure mpp_sum_int8 module procedure mpp_sum_int8_scalar module procedure mpp_sum_int8_2d module procedure mpp_sum_int8_3d module procedure mpp_sum_int8_4d module procedure mpp_sum_int8_5d #endif module procedure mpp_sum_real8 module procedure mpp_sum_real8_scalar module procedure mpp_sum_real8_2d module procedure mpp_sum_real8_3d module procedure mpp_sum_real8_4d module procedure mpp_sum_real8_5d #ifdef OVERLOAD_C8 module procedure mpp_sum_cmplx8 module procedure mpp_sum_cmplx8_scalar module procedure mpp_sum_cmplx8_2d module procedure mpp_sum_cmplx8_3d module procedure mpp_sum_cmplx8_4d module procedure mpp_sum_cmplx8_5d #endif module procedure mpp_sum_int4 module procedure mpp_sum_int4_scalar module procedure mpp_sum_int4_2d module procedure mpp_sum_int4_3d module procedure mpp_sum_int4_4d module procedure mpp_sum_int4_5d #ifdef OVERLOAD_R4 module procedure mpp_sum_real4 module procedure mpp_sum_real4_scalar module procedure mpp_sum_real4_2d module procedure mpp_sum_real4_3d module procedure mpp_sum_real4_4d module procedure mpp_sum_real4_5d #endif #ifdef OVERLOAD_C4 module procedure mpp_sum_cmplx4 module procedure mpp_sum_cmplx4_scalar module procedure mpp_sum_cmplx4_2d module procedure mpp_sum_cmplx4_3d module procedure mpp_sum_cmplx4_4d module procedure mpp_sum_cmplx4_5d #endif end interface !##################################################################### ! ! ! Basic message-passing call. ! ! ! MPP_TYPE_ corresponds to any 4-byte and 8-byte variant of ! integer, real, complex, logical variables, of rank 0 or 1. A ! contiguous block from a multi-dimensional array may be passed by its ! starting address and its length, as in f77. ! ! mpp_transmit is currently implemented as asynchronous ! outward transmission and synchronous inward transmission. This follows ! the behaviour of shmem_put and shmem_get. In MPI, it ! is implemented as mpi_isend and mpi_recv. For most ! applications, transmissions occur in pairs, and are here accomplished ! in a single call. ! ! The special PE designations NULL_PE, ! ANY_PE and ALL_PES are provided by use ! association. ! ! NULL_PE: is used to disable one of the pair of ! transmissions.
! ANY_PE: is used for unspecific remote ! destination. (Please note that put_pe=ANY_PE has no meaning ! in the MPI context, though it is available in the SHMEM invocation. If ! portability is a concern, it is best avoided).
! ALL_PES: is used for broadcast operations. ! ! It is recommended that mpp_broadcast be used for ! broadcasts. ! ! The following example illustrates the use of ! NULL_PE and ALL_PES: ! !
  !    real, dimension(n) :: a
  !    if( pe.EQ.0 )then
  !        do p = 1,npes-1
  !           call mpp_transmit( a, n, p, a, n, NULL_PE )
  !        end do
  !    else
  !        call mpp_transmit( a, n, NULL_PE, a, n, 0 )
  !    end if
  !    
  !    call mpp_transmit( a, n, ALL_PES, a, n, 0 )
  !    
! ! The do loop and the broadcast operation above are equivalent. ! ! Two overloaded calls mpp_send and ! mpp_recv have also been ! provided. mpp_send calls mpp_transmit ! with get_pe=NULL_PE. mpp_recv calls ! mpp_transmit with put_pe=NULL_PE. Thus ! the do loop above could be written more succinctly: ! !
  !    if( pe.EQ.0 )then
  !        do p = 1,npes-1
  !           call mpp_send( a, n, p )
  !        end do
  !    else
  !        call mpp_recv( a, n, 0 )
  !    end if
  !    
!
! !
interface mpp_transmit module procedure mpp_transmit_real8 module procedure mpp_transmit_real8_scalar module procedure mpp_transmit_real8_2d module procedure mpp_transmit_real8_3d module procedure mpp_transmit_real8_4d module procedure mpp_transmit_real8_5d #ifdef OVERLOAD_C8 module procedure mpp_transmit_cmplx8 module procedure mpp_transmit_cmplx8_scalar module procedure mpp_transmit_cmplx8_2d module procedure mpp_transmit_cmplx8_3d module procedure mpp_transmit_cmplx8_4d module procedure mpp_transmit_cmplx8_5d #endif #ifndef no_8byte_integers module procedure mpp_transmit_int8 module procedure mpp_transmit_int8_scalar module procedure mpp_transmit_int8_2d module procedure mpp_transmit_int8_3d module procedure mpp_transmit_int8_4d module procedure mpp_transmit_int8_5d module procedure mpp_transmit_logical8 module procedure mpp_transmit_logical8_scalar module procedure mpp_transmit_logical8_2d module procedure mpp_transmit_logical8_3d module procedure mpp_transmit_logical8_4d module procedure mpp_transmit_logical8_5d #endif #ifdef OVERLOAD_R4 module procedure mpp_transmit_real4 module procedure mpp_transmit_real4_scalar module procedure mpp_transmit_real4_2d module procedure mpp_transmit_real4_3d module procedure mpp_transmit_real4_4d module procedure mpp_transmit_real4_5d #endif #ifdef OVERLOAD_C4 module procedure mpp_transmit_cmplx4 module procedure mpp_transmit_cmplx4_scalar module procedure mpp_transmit_cmplx4_2d module procedure mpp_transmit_cmplx4_3d module procedure mpp_transmit_cmplx4_4d module procedure mpp_transmit_cmplx4_5d #endif module procedure mpp_transmit_int4 module procedure mpp_transmit_int4_scalar module procedure mpp_transmit_int4_2d module procedure mpp_transmit_int4_3d module procedure mpp_transmit_int4_4d module procedure mpp_transmit_int4_5d module procedure mpp_transmit_logical4 module procedure mpp_transmit_logical4_scalar module procedure mpp_transmit_logical4_2d module procedure mpp_transmit_logical4_3d module procedure mpp_transmit_logical4_4d module procedure mpp_transmit_logical4_5d end interface interface mpp_recv module procedure mpp_recv_real8 module procedure mpp_recv_real8_scalar module procedure mpp_recv_real8_2d module procedure mpp_recv_real8_3d module procedure mpp_recv_real8_4d module procedure mpp_recv_real8_5d #ifdef OVERLOAD_C8 module procedure mpp_recv_cmplx8 module procedure mpp_recv_cmplx8_scalar module procedure mpp_recv_cmplx8_2d module procedure mpp_recv_cmplx8_3d module procedure mpp_recv_cmplx8_4d module procedure mpp_recv_cmplx8_5d #endif #ifndef no_8byte_integers module procedure mpp_recv_int8 module procedure mpp_recv_int8_scalar module procedure mpp_recv_int8_2d module procedure mpp_recv_int8_3d module procedure mpp_recv_int8_4d module procedure mpp_recv_int8_5d module procedure mpp_recv_logical8 module procedure mpp_recv_logical8_scalar module procedure mpp_recv_logical8_2d module procedure mpp_recv_logical8_3d module procedure mpp_recv_logical8_4d module procedure mpp_recv_logical8_5d #endif #ifdef OVERLOAD_R4 module procedure mpp_recv_real4 module procedure mpp_recv_real4_scalar module procedure mpp_recv_real4_2d module procedure mpp_recv_real4_3d module procedure mpp_recv_real4_4d module procedure mpp_recv_real4_5d #endif #ifdef OVERLOAD_C4 module procedure mpp_recv_cmplx4 module procedure mpp_recv_cmplx4_scalar module procedure mpp_recv_cmplx4_2d module procedure mpp_recv_cmplx4_3d module procedure mpp_recv_cmplx4_4d module procedure mpp_recv_cmplx4_5d #endif module procedure mpp_recv_int4 module procedure mpp_recv_int4_scalar module procedure mpp_recv_int4_2d module procedure mpp_recv_int4_3d module procedure mpp_recv_int4_4d module procedure mpp_recv_int4_5d module procedure mpp_recv_logical4 module procedure mpp_recv_logical4_scalar module procedure mpp_recv_logical4_2d module procedure mpp_recv_logical4_3d module procedure mpp_recv_logical4_4d module procedure mpp_recv_logical4_5d end interface interface mpp_send module procedure mpp_send_real8 module procedure mpp_send_real8_scalar module procedure mpp_send_real8_2d module procedure mpp_send_real8_3d module procedure mpp_send_real8_4d module procedure mpp_send_real8_5d #ifdef OVERLOAD_C8 module procedure mpp_send_cmplx8 module procedure mpp_send_cmplx8_scalar module procedure mpp_send_cmplx8_2d module procedure mpp_send_cmplx8_3d module procedure mpp_send_cmplx8_4d module procedure mpp_send_cmplx8_5d #endif #ifndef no_8byte_integers module procedure mpp_send_int8 module procedure mpp_send_int8_scalar module procedure mpp_send_int8_2d module procedure mpp_send_int8_3d module procedure mpp_send_int8_4d module procedure mpp_send_int8_5d module procedure mpp_send_logical8 module procedure mpp_send_logical8_scalar module procedure mpp_send_logical8_2d module procedure mpp_send_logical8_3d module procedure mpp_send_logical8_4d module procedure mpp_send_logical8_5d #endif #ifdef OVERLOAD_R4 module procedure mpp_send_real4 module procedure mpp_send_real4_scalar module procedure mpp_send_real4_2d module procedure mpp_send_real4_3d module procedure mpp_send_real4_4d module procedure mpp_send_real4_5d #endif #ifdef OVERLOAD_C4 module procedure mpp_send_cmplx4 module procedure mpp_send_cmplx4_scalar module procedure mpp_send_cmplx4_2d module procedure mpp_send_cmplx4_3d module procedure mpp_send_cmplx4_4d module procedure mpp_send_cmplx4_5d #endif module procedure mpp_send_int4 module procedure mpp_send_int4_scalar module procedure mpp_send_int4_2d module procedure mpp_send_int4_3d module procedure mpp_send_int4_4d module procedure mpp_send_int4_5d module procedure mpp_send_logical4 module procedure mpp_send_logical4_scalar module procedure mpp_send_logical4_2d module procedure mpp_send_logical4_3d module procedure mpp_send_logical4_4d module procedure mpp_send_logical4_5d end interface ! ! ! Parallel broadcasts. ! ! ! The mpp_broadcast call has been added because the original ! syntax (using ALL_PES in mpp_transmit) did not ! support a broadcast across a pelist. ! ! MPP_TYPE_ corresponds to any 4-byte and 8-byte variant of ! integer, real, complex, logical variables, of rank 0 or 1. A ! contiguous block from a multi-dimensional array may be passed by its ! starting address and its length, as in f77. ! ! Global broadcasts through the ALL_PES argument to mpp_transmit are still provided for ! backward-compatibility. ! ! If pelist is omitted, the context is assumed to be the ! current pelist. from_pe must belong to the current ! pelist. This call implies synchronization across the PEs in ! pelist, or the current pelist if pelist is absent. ! ! ! ! ! ! ! interface mpp_broadcast module procedure mpp_broadcast_real8 module procedure mpp_broadcast_real8_scalar module procedure mpp_broadcast_real8_2d module procedure mpp_broadcast_real8_3d module procedure mpp_broadcast_real8_4d module procedure mpp_broadcast_real8_5d #ifdef OVERLOAD_C8 module procedure mpp_broadcast_cmplx8 module procedure mpp_broadcast_cmplx8_scalar module procedure mpp_broadcast_cmplx8_2d module procedure mpp_broadcast_cmplx8_3d module procedure mpp_broadcast_cmplx8_4d module procedure mpp_broadcast_cmplx8_5d #endif #ifndef no_8byte_integers module procedure mpp_broadcast_int8 module procedure mpp_broadcast_int8_scalar module procedure mpp_broadcast_int8_2d module procedure mpp_broadcast_int8_3d module procedure mpp_broadcast_int8_4d module procedure mpp_broadcast_int8_5d module procedure mpp_broadcast_logical8 module procedure mpp_broadcast_logical8_scalar module procedure mpp_broadcast_logical8_2d module procedure mpp_broadcast_logical8_3d module procedure mpp_broadcast_logical8_4d module procedure mpp_broadcast_logical8_5d #endif #ifdef OVERLOAD_R4 module procedure mpp_broadcast_real4 module procedure mpp_broadcast_real4_scalar module procedure mpp_broadcast_real4_2d module procedure mpp_broadcast_real4_3d module procedure mpp_broadcast_real4_4d module procedure mpp_broadcast_real4_5d #endif #ifdef OVERLOAD_C4 module procedure mpp_broadcast_cmplx4 module procedure mpp_broadcast_cmplx4_scalar module procedure mpp_broadcast_cmplx4_2d module procedure mpp_broadcast_cmplx4_3d module procedure mpp_broadcast_cmplx4_4d module procedure mpp_broadcast_cmplx4_5d #endif module procedure mpp_broadcast_int4 module procedure mpp_broadcast_int4_scalar module procedure mpp_broadcast_int4_2d module procedure mpp_broadcast_int4_3d module procedure mpp_broadcast_int4_4d module procedure mpp_broadcast_int4_5d module procedure mpp_broadcast_logical4 module procedure mpp_broadcast_logical4_scalar module procedure mpp_broadcast_logical4_2d module procedure mpp_broadcast_logical4_3d module procedure mpp_broadcast_logical4_4d module procedure mpp_broadcast_logical4_5d end interface !##################################################################### ! ! ! Parallel checksums. ! ! ! mpp_chksum is a parallel checksum routine that returns an ! identical answer for the same array irrespective of how it has been ! partitioned across processors. LONG_KINDis the KIND ! parameter corresponding to long integers (see discussion on ! OS-dependent preprocessor directives) defined in ! the header file fms_platform.h. MPP_TYPE_ corresponds to any ! 4-byte and 8-byte variant of integer, real, complex, logical ! variables, of rank 0 to 5. ! ! Integer checksums on FP data use the F90 TRANSFER() ! intrinsic. ! ! The serial checksum module is superseded ! by this function, and is no longer being actively maintained. This ! provides identical results on a single-processor job, and to perform ! serial checksums on a single processor of a parallel job, you only ! need to use the optional pelist argument. !
  !     use mpp_mod
  !     integer :: pe, chksum
  !     real :: a(:)
  !     pe = mpp_pe()
  !     chksum = mpp_chksum( a, (/pe/) )
  !     
! ! The additional functionality of mpp_chksum over ! serial checksums is to compute the checksum across the PEs in ! pelist. The answer is guaranteed to be the same for ! the same distributed array irrespective of how it has been ! partitioned. ! ! If pelist is omitted, the context is assumed to be the ! current pelist. This call implies synchronization across the PEs in ! pelist, or the current pelist if pelist is absent. !
! ! ! !
interface mpp_chksum #ifndef no_8byte_integers module procedure mpp_chksum_i8_1d module procedure mpp_chksum_i8_2d module procedure mpp_chksum_i8_3d module procedure mpp_chksum_i8_4d #endif module procedure mpp_chksum_i4_1d module procedure mpp_chksum_i4_2d module procedure mpp_chksum_i4_3d module procedure mpp_chksum_i4_4d module procedure mpp_chksum_r8_0d module procedure mpp_chksum_r8_1d module procedure mpp_chksum_r8_2d module procedure mpp_chksum_r8_3d module procedure mpp_chksum_r8_4d module procedure mpp_chksum_r8_5d #ifdef OVERLOAD_C8 module procedure mpp_chksum_c8_0d module procedure mpp_chksum_c8_1d module procedure mpp_chksum_c8_2d module procedure mpp_chksum_c8_3d module procedure mpp_chksum_c8_4d module procedure mpp_chksum_c8_5d #endif #ifdef OVERLOAD_R4 module procedure mpp_chksum_r4_0d module procedure mpp_chksum_r4_1d module procedure mpp_chksum_r4_2d module procedure mpp_chksum_r4_3d module procedure mpp_chksum_r4_4d module procedure mpp_chksum_r4_5d #endif #ifdef OVERLOAD_C4 module procedure mpp_chksum_c4_0d module procedure mpp_chksum_c4_1d module procedure mpp_chksum_c4_2d module procedure mpp_chksum_c4_3d module procedure mpp_chksum_c4_4d module procedure mpp_chksum_c4_5d #endif end interface !*********************************************************************** ! ! module variables ! !*********************************************************************** type(communicator),save :: peset(0:PESET_MAX) !0 is a dummy used to hold single-PE "self" communicator logical :: module_is_initialized = .false. logical :: debug = .false. integer :: npes=1, root_pe=0, pe=0 integer(LONG_KIND) :: tick, ticks_per_sec, max_ticks, start_tick, end_tick, tick0=0 integer :: mpp_comm_private logical :: first_call_system_clock_mpi=.TRUE. real(DOUBLE_KIND) :: mpi_count0=0 ! use to prevent integer overflow real(DOUBLE_KIND) :: mpi_tick_rate=0.d0 ! clock rate for mpi_wtick() logical :: mpp_record_timing_data=.TRUE. type(clock),save :: clocks(MAX_CLOCKS) integer :: log_unit, etc_unit character(len=32) :: configfile='logfile' integer :: peset_num=0, current_peset_num=0 integer :: world_peset_num !the world communicator integer :: error integer :: clock_num=0, num_clock_ids=0,current_clock=0, previous_clock(MAX_CLOCKS)=0 real :: tick_rate integer, allocatable :: request(:) integer, allocatable :: request_recv(:) ! if you want to save the non-root PE information uncomment out the following line ! and comment out the assigment of etcfile to '/dev/null' #ifdef NO_DEV_NULL character(len=32) :: etcfile='._mpp.nonrootpe.msgs' #else character(len=32) :: etcfile='/dev/null' #endif #ifdef SGICRAY integer :: in_unit=100, out_unit=101, err_unit=102 !see intro_io(3F): to see why these values are used rather than 5,6,0 #else integer :: in_unit=5, out_unit=6, err_unit=0 #endif !--- variables used in mpp_util.h type(Summary_Struct) :: clock_summary(MAX_CLOCKS) logical :: warnings_are_fatal = .FALSE. integer :: error_state=0 integer :: clock_grain=CLOCK_LOOP-1 !--- variables used in mpp_comm.h #ifdef use_libMPI #ifdef _CRAYT3E !BWA: mpif.h on t3e currently does not contain MPI_INTEGER8 datatype !(O2k and t90 do) !(t3e: fixed on 3.3 I believe) integer, parameter :: MPI_INTEGER8=MPI_INTEGER #endif #endif /* use_libMPI */ #ifdef use_MPI_SMA #include integer :: pSync(SHMEM_BARRIER_SYNC_SIZE) pointer( p_pSync, pSync ) !used by SHPALLOC #endif integer :: clock0 !measures total runtime from mpp_init to mpp_exit integer :: mpp_stack_size=0, mpp_stack_hwm=0 integer :: tag=1 logical :: verbose=.FALSE. #ifdef _CRAY integer(LONG_KIND) :: word(1) #endif #if defined(sgi_mipspro) || defined(__ia64) integer(INT_KIND) :: word(1) #endif character(len=128), public :: version= & '$Id mpp.F90 $' character(len=128), public :: tagname= & '$Name: mom4p1_pubrel_dec2009_nnz $' contains #include #include #include end module mpp_mod