Programming Taskbook



1100 training tasks on programming

©  M. E. Abramyan (Southern Federal University), 1998–2018

 



One-sided communications (MPI-2)

When defining an access window object using the MPI_Win_create function, it is recommended to set the disp_unit displacement unit (the third parameter) equal to the size of a data item of the corresponding type (the type is always either MPI_INT or MPI_DOUBLE in these tasks, so the size can be obtained using the MPI_Type_size function). In this case, the target_disp displacement (the fifth parameter of the MPI_Get, MPI_Put, and MPI_Accumulate functions) can be specified as the initial index of the used part of the array, rather than as the number of bytes from the beginning of the array to the required part, as would be necessary if the disp_unit parameter were equal to 1.
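As an illustration, the recommendation above might look as follows in C. This is a fragment, not a solution of any task; the names a (the window buffer), n (its length), and buf (a receive buffer) are illustrative, and MPI is assumed to be already initialized:

```c
/* Create a window whose disp_unit equals the size of one MPI_INT,
   so that target_disp in MPI_Get/MPI_Put/MPI_Accumulate is an array
   index rather than a byte offset. */
int type_size;
MPI_Type_size(MPI_INT, &type_size);          /* size of one element */

MPI_Win win;
MPI_Win_create(a, n * type_size, type_size,  /* disp_unit = element size */
               MPI_INFO_NULL, MPI_COMM_WORLD, &win);

/* Later, in an origin process: read 3 elements starting at index 5
   of the target array (target_disp = 5, not 5 * sizeof(int)). */
MPI_Get(buf, 3, MPI_INT, /* target rank */ 0,
        /* target_disp */ 5, 3, MPI_INT, win);
```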

It suffices to specify the MPI_INFO_NULL constant as the info parameter (the fourth parameter of the MPI_Win_create function).

If you do not need to create a window in some processes, then the value 0 should be specified as the size parameter (the second parameter of the MPI_Win_create function) in these processes.

It suffices to specify the constant 0 as the assert parameter in all synchronizing functions (MPI_Win_fence, MPI_Win_start, MPI_Win_post, MPI_Win_lock); this parameter is the next-to-last one in all these functions.

In the first subgroup (MPI7Win1–MPI7Win17), you should use the collective function MPI_Win_fence for synchronization; it should be called both before the actions related to the one-sided data transfer and after these actions (but before the transferred data are accessed).
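The fence-based pattern can be sketched as follows (a fragment with illustrative names, shaped like the data layout of the first tasks of the subgroup; rank and value are assumed to be defined, and the window win to be created in the master process):

```c
/* Bracket the one-sided transfer with two MPI_Win_fence calls;
   the assert parameter is simply 0. */
MPI_Win_fence(0, win);              /* opens the access/exposure epoch */
if (rank != 0)
    MPI_Put(&value, 1, MPI_INT, 0,  /* target: the master process */
            rank - 1, 1, MPI_INT,   /* target_disp = rank - 1 */
            win);
MPI_Win_fence(0, win);              /* completes all pending transfers */
/* Only after the second fence may the master access the data
   placed into its window. */
```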

The tasks of the second subgroup (MPI7Win18–MPI7Win30) require the use of local synchronization: the MPI_Win_start, MPI_Win_complete, MPI_Win_post, MPI_Win_wait functions or a pair of the MPI_Win_lock and MPI_Win_unlock functions. Each task of this subgroup specifies which kind of local synchronization you should use.

One-sided communications with the simplest synchronization

MPI7Win1. An integer is given in each slave process. Create an access window of the size K of integers in the master process (K is the number of slave processes). Using the MPI_Put function call in the slave processes, send all the given integers to the master process and output received integers in the ascending order of ranks of sending processes.

MPI7Win2. A sequence of R real numbers is given in each slave process, where R is the process rank (1, 2, …). Create an access window of the appropriate size in the master process. Using the MPI_Put function call in the slave processes, send all the given real numbers to the master process and output received numbers in the ascending order of ranks of sending processes.

MPI7Win3. An array A of K integers is given in the master process, where K is the number of slave processes. Create an access window containing the array A in the master process. Using the MPI_Get function call in the slave processes, receive and output one element of the array A in each slave process. Elements of the array A should be received in the slave processes in descending order of their indices (that is, the element A0 should be received in the last process, the element A1 should be received in the last but one process, and so on).

MPI7Win4. An array A of K + 4 real numbers is given in the master process, where K is the number of slave processes. Create an access window containing the array A in the master process. Using the MPI_Get function call in the slave processes, receive and output five elements of the array A in each slave process starting with the element of index R − 1, where R is the slave process rank (R = 1, 2, …, K − 1).

MPI7Win5. An array A of K integers is given in the master process, where K is the number of slave processes. In addition, an index N (an integer in the range 0 to K − 1) and an integer B are given in each slave process. Create an access window containing the array A in the master process. Using the MPI_Accumulate function call in the slave processes, multiply the element AN by the number B and then output the modified array A in the master process.

Note. Some slave processes can contain the same value of N; in this case the element AN will be multiplied several times. This circumstance does not require additional synchronization due to the features of the MPI_Accumulate function implementation.
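A single call of the kind suggested in MPI7Win5 might look like this (a fragment; the names n and b stand for the given index and integer, and the call is assumed to be bracketed by MPI_Win_fence calls as in the other tasks of this subgroup):

```c
/* Multiply the target element A[n] in the master's window by the
   local integer b. MPI_Accumulate applies the reduction operation
   atomically per element, so concurrent accumulates to the same
   element from several processes need no extra synchronization. */
MPI_Accumulate(&b, 1, MPI_INT, /* target rank */ 0,
               /* target_disp */ n, 1, MPI_INT,
               MPI_PROD, win);
```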

MPI7Win6. An array A of 2K − 1 real numbers is given in the master process (K is the number of slave processes), and array B of R real numbers is given in each slave process (R is the process rank, R = 1, 2, …, K − 1). Create an access window containing the array A in the master process. Using the MPI_Accumulate function call in the slave processes, add the values of all the elements of array B from the process of rank R to the elements of array A starting with the index R − 1 (the single element B0 from the process 1 should be added to the element A0, the elements B0 and B1 from the process 2 should be added to the elements A1 and A2 respectively, the elements B0, B1, and B2 from the process 3 should be added to elements A2, A3, and A4 respectively, and so on). Output the modified array A in the master process.

Note. Elements of array A, starting from the index 2, will be modified several times by adding values from the different slave processes. This circumstance does not require additional synchronization due to the features of the MPI_Accumulate function implementation.

MPI7Win7. An array A of 2K integers is given in the master process, where K is the number of slave processes. Create an access window containing two integers in each slave process. Using the MPI_Put function call in the master process, send and output two elements of the array A in each slave process. Elements of the array A should be sent to slave processes in ascending order of their indices (that is, the elements A0 and A1 should be sent to the process 1, the elements A2 and A3 should be sent to the process 2, and so on).

MPI7Win8. An integer R and a real number B are given in each process. All the integers R are different and are in the range from 0 to K − 1, where K is the number of processes. Create an access window containing one real number in each process. Using the MPI_Put function call in each process, send the number B from this process to the process R and output received numbers in all processes.

MPI7Win9. An array A of K integers is given in each process, where K is the number of processes. Create an access window containing the array A in each process. Using several calls of the MPI_Get function in each process R (R = 0, …, K − 1), receive and output elements of all arrays A with the index R. Received elements should be output in descending order of ranks of sending processes (that is, the element received from the process K − 1 should be output first, the element received from the process K − 2 should be output second, and so on).

Note. The MPI_Get function, like the other one-sided communication functions, can also be used to access the window created in the calling process.

MPI7Win10. An array A of 3 real numbers and integers N1 and N2 are given in each process. Each of the numbers N1 and N2 is in the range 0 to 2. Create an access window containing the array A in each process. Using two calls of the MPI_Get function in each process, receive and output the element of index N1 from the array A of the previous process and then receive and output the element of index N2 from the array A of the next process (the numbers N1 and N2 are taken from the calling process, processes are taken in a cyclic order).

MPI7Win11. The number of processes K is an even number. An array A of K/2 integers is given in each process. Create an access window containing the array A in all the odd-rank processes (1, 3, …, K − 1). Using the required number of calls of the MPI_Accumulate function in each even-rank process, add the element A[I] of the process 2J to the element A[J] of the process 2I + 1 and output the changed arrays A in all the odd-rank processes.

Note. The required change of the given arrays can be described in another way: if B denotes the matrix of order K/2 whose rows coincide with the arrays A given in the even-rank processes, and C denotes the matrix of the same order whose rows coincide with the arrays A given in the odd-rank processes, then the matrix C should be transformed as follows: the elements of row I of the matrix B should be added to the corresponding elements of column I of the matrix C.

MPI7Win12. Solve the MPI7Win11 task by creating access windows in even-rank processes and using the MPI_Get function calls instead of the MPI_Accumulate function calls in odd-rank processes.

Note. Since the numbers received from the even-rank processes must be added to the elements of array A after the second MPI_Win_fence synchronization function call, it is convenient to use an auxiliary array to store the received numbers.

MPI7Win13. Three integers N1, N2, N3 are given in each process; each given integer is in the range 0 to K − 1, where K is the number of processes (the values of some of these integers in each process may coincide). In addition, an array A of R + 1 real numbers is given in each process, where R is the process rank (0, …, K − 1). Create an access window containing the array A in all the processes. Using three calls of the MPI_Accumulate function in each process, add the integer R + 1 to all elements of the arrays A given in the processes N1, N2, N3, where R is the rank of the process that calls the MPI_Accumulate function (for instance, if the number N1 in the process 3 is equal to 2 then a real number 4.0 should be added to all the elements of array A in the process 2). If some of the integers N1, N2, N3 coincide in the process R then the number R + 1 should be added to the elements of the corresponding arrays several times. Output the changed arrays A in each process.

MPI7Win14. An array of K real numbers is given in each process, where K is the number of processes. The given array contains a row of an upper triangular matrix A, including its zero-valued part (the process of rank R contains the Rth row of the matrix, the rows are numbered from 0). Create an access window containing the given array in all the processes. Using the required number of calls of the MPI_Get function in each process, write the rows of the matrix transposed to the given matrix A (including its zero-valued part) in the given arrays. Then output the changed arrays in each process. Do not use auxiliary arrays.

Notes. (1) The rows of the transposed matrix coincide with the columns of the original matrix, so the resulting matrix will be lower triangular. (2) You should write zero values to the required array elements only after the second call of the MPI_Win_fence function. (3) It is not necessary to create an access window in the last process.

MPI7Win15. Solve the MPI7Win14 task by using the MPI_Put function calls instead of the MPI_Get function calls.

Note. In this case, it is not necessary to create an access window in the master process.

MPI7Win16. One row of the square real-valued matrix A of order K is given in each process, where K is the number of processes (the process of rank R contains the Rth row of the matrix, the rows are numbered from 0). In addition, a real number B is given in each process. Create an access window containing the given row of the matrix A in all the processes. Using the required number of calls of the MPI_Accumulate function in each process R (R = 0, …, K − 1), change the matrix row given in the next process as follows: all row elements that are less than the number B from the process R should be replaced by this number B (processes are taken in a cyclic order). Then, using K calls of the MPI_Get function in each process, receive and output the Rth column of the transformed matrix A in the process R (R = 0, …, K − 1, the columns are numbered from 0).

Note. You should call the MPI_Win_fence synchronization function three times in each process.

MPI7Win17. One row of the square real-valued matrix A of order K is given in each process, where K is the number of processes (the process of rank R contains the Rth row of the matrix, the rows are numbered from 0). In addition, a real number B is given in each process. Create an access window containing the given row of the matrix A in all the processes. Using the required number of calls of the MPI_Accumulate function in each process R (R = 0, …, K − 1), change the matrix row given in the previous process as follows: all row elements that are greater than the number B from the process R should be replaced by this number B (processes are taken in a cyclic order). Then, using K calls of the MPI_Accumulate function in each slave process, add the first element of the row from each slave process R (1, …, K − 1) to all the elements of the Rth column of the transformed matrix A (the columns are numbered from 0). Output the new contents of the given row of the matrix A in each process after all transformations.

Note. You should call the MPI_Win_fence synchronization function three times in each process.

Additional types of synchronization

MPI7Win18. The number of processes K is an even number. An integer A is given in each even-rank process (0, 2, …, K − 2). Create an access window containing one integer in all the odd-rank processes (1, 3, …, K − 1). Using the MPI_Put function call in each even-rank process 2N, send the integer A to the process 2N + 1 and output the received integers. Use the MPI_Win_start and MPI_Win_complete synchronization functions in the even-rank processes and the MPI_Win_post and MPI_Win_wait synchronization functions in the odd-rank processes. Use the MPI_Group_incl function to create a group of processes specified as the first parameter of the MPI_Win_start and MPI_Win_post functions. The MPI_Group_incl function should be applied to the group of the MPI_COMM_WORLD communicator (use the MPI_Comm_group function to obtain the group of the MPI_COMM_WORLD communicator).

Note. Unlike the MPI_Win_fence collective synchronization function used in the previous tasks, the synchronization functions used in this and the subsequent tasks are local ones and, in addition, make it possible to specify the groups of origin and target processes for one-sided communications.
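The general active-target pattern with explicit process groups can be sketched as follows (a fragment shaped like the even/odd layout of MPI7Win18; the names rank, a, and win are illustrative, and the window is assumed to exist in the odd-rank processes):

```c
/* Build the peer group from the MPI_COMM_WORLD group, then open a
   matching pair of access (start/complete) and exposure (post/wait)
   epochs. The assert parameter is 0 everywhere. */
MPI_Group world_group, peer_group;
MPI_Comm_group(MPI_COMM_WORLD, &world_group);

if (rank % 2 == 0) {                       /* origin: even rank 2N */
    int peer[1] = { rank + 1 };
    MPI_Group_incl(world_group, 1, peer, &peer_group);
    MPI_Win_start(peer_group, 0, win);     /* access epoch on target */
    MPI_Put(&a, 1, MPI_INT, rank + 1, 0, 1, MPI_INT, win);
    MPI_Win_complete(win);                 /* transfer is done locally */
} else {                                   /* target: odd rank 2N + 1 */
    int peer[1] = { rank - 1 };
    MPI_Group_incl(world_group, 1, peer, &peer_group);
    MPI_Win_post(peer_group, 0, win);      /* exposure epoch */
    MPI_Win_wait(win);                     /* data is now in the window */
}
```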

MPI7Win19. An array A of K real numbers is given in the master process, where K is the number of slave processes. Create an access window containing the array A in the master process. Using the MPI_Get function call in each slave process, receive and output one of elements of the array A. The elements should be received in descending order of their indices (that is, the element with the index K − 1 should be received in the process 1, the element with the index K − 2 should be received in the process 2, and so on). Use the MPI_Win_start and MPI_Win_complete synchronization functions in the slave processes and the MPI_Win_post and MPI_Win_wait synchronization functions in the master process. Use the MPI_Group_incl function to create a group of processes specified as the first parameter of the MPI_Win_start function, use the MPI_Group_excl function to create a group of processes specified as the first parameter of the MPI_Win_post function. The MPI_Group_incl and MPI_Group_excl functions should be applied to the group of the MPI_COMM_WORLD communicator.

MPI7Win20. The number of processes K is a multiple of 3. An array A of 3 real numbers is given in the processes of rank 3N (N = 0, …, K/3 − 1). Create an access window containing the array A in all processes in which this array is given. Using one call of the MPI_Get function in the processes of rank 3N + 1 and 3N + 2 (N = 0, …, K/3 − 1), receive and output one element A0 and two elements A1, A2 respectively from the process 3N (namely, the process 1 should output the element A0 received from the process 0, the process 2 should output the elements A1 and A2 received from the process 0, the process 4 should output the element A0 received from the process 3, and so on). Use the MPI_Win_post and MPI_Win_wait synchronization functions in the processes of rank 3N and the MPI_Win_start and MPI_Win_complete synchronization functions in the other processes.

MPI7Win21. The number of processes K is an even number. An array A of K/2 real numbers and an array N of K/2 integers are given in the master process. All the elements of the array N are distinct and are in the range 1 to K − 1. Create an access window containing one real number in each slave process. Using the required number of calls of the MPI_Put function in the master process, send the real number AI to the slave process of rank NI (I = 0, …, K/2 − 1). Output the received number (or 0.0 if the process did not receive data) in each slave process. Use the MPI_Win_post and MPI_Win_wait synchronization functions in the slave processes and the MPI_Win_start and MPI_Win_complete synchronization functions in the master process.

MPI7Win22. An array A of K real numbers (where K is the number of slave processes) and an array N of 8 integers are given in the master process. All the elements of the array N are in the range 1 to K; some elements of this array may have the same value. In addition, an array B of R real numbers is given in the slave process of rank R (R = 1, …, K). Create an access window containing the array B in each slave process. Using the required number of calls of the MPI_Accumulate function in the master process, add all the elements of the array A to the corresponding elements of the array B from the process of rank NI, I = 0, …, 7 (that is, the element A0 should be added to the element B0, the element A1 should be added to the element B1, and so on). Elements of the array A can be added several times to some arrays B. Output the array B (which may be changed or not) in each slave process. Use the MPI_Win_post and MPI_Win_wait synchronization functions in the slave processes and the MPI_Win_start and MPI_Win_complete synchronization functions in the master process.

MPI7Win23. An array A of 5 real numbers is given in each process. In addition, two arrays N and M of 5 integers are given in the master process. All the elements of the array N are in the range 1 to K, where K is the number of slave processes, all the elements of the array M are in the range 0 to 4. Some elements of both the array N and the array M may have the same value. Create an access window containing the array A in each slave process. Using the required number of calls of the MPI_Get function in the master process, receive the element of A with the index MI from the process NI (I = 0, …, 4) and add the received element to the element AI in the master process. After changing the array A in the master process, change all the arrays A in the slave processes as follows: if some element of the array A from the slave process is greater than the element, with the same index, of the array A from the master process then replace this element in the slave process by the corresponding element from the master process (to do this, use the required number of calls of the MPI_Accumulate function in the master process). Output the changed arrays A in each process. Use two calls of the MPI_Win_post and MPI_Win_wait synchronization functions in the slave processes and two calls of the MPI_Win_start and MPI_Win_complete synchronization functions in the master process.

MPI7Win24. An integer N is given in each slave process; all the integers N are distinct and are in the range 0 to K − 1, where K is the number of slave processes. Create an access window containing an array A of K integers in each slave process. Without performing any synchronization function calls in the master process (except calling the MPI_Barrier function) and using a sequence of calls of the MPI_Win_lock, MPI_Win_unlock, MPI_Barrier, MPI_Win_lock, MPI_Win_unlock synchronization functions in the slave processes, change the element of the array A with the index N by assigning to it the rank of the slave process that contains the integer N (to do this, use the MPI_Put function), and then receive and output all the elements of the changed array A in each slave process (to do this, use the MPI_Get function). Use the MPI_LOCK_SHARED constant as the first parameter of the MPI_Win_lock function.

Note. The MPI_Win_lock and MPI_Win_unlock synchronization functions are mainly used for one-sided communications with a passive target. In this kind of communication, the target process does not process the data transferred to it but acts as their storage, which is accessible to the other processes.
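The lock/unlock sequence required in MPI7Win24-style tasks can be sketched as follows (a fragment with illustrative names: target is the rank of the passive target, n the given index, k the window length, and a a receive array):

```c
/* Phase 1: write one element under a shared lock. */
MPI_Win_lock(MPI_LOCK_SHARED, target, 0, win);
MPI_Put(&rank, 1, MPI_INT, target, n, 1, MPI_INT, win);
MPI_Win_unlock(target, win);      /* the put is complete on return */

/* The barrier separates the write phase from the read phase:
   after it, all writes have finished in every process. */
MPI_Barrier(MPI_COMM_WORLD);

/* Phase 2: read the whole array under another shared lock. */
MPI_Win_lock(MPI_LOCK_SHARED, target, 0, win);
MPI_Get(a, k, MPI_INT, target, 0, k, MPI_INT, win);
MPI_Win_unlock(target, win);      /* the data is now in a[] */
```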

MPI7Win25. The number of processes K is a multiple of 3. An array A of 5 real numbers is given in the processes of rank 3N (N = 0, …, K/3 − 1), an integer M and a real number B are given in the processes of rank 3N + 1. The given integers M are in the range 0 to 4. Create an access window containing the array A in all processes in which this array is given. Using the MPI_Accumulate function call in the processes of rank 3N + 1 (N = 0, …, K/3 − 1), change the array A from the process 3N as follows: if the array element with the index M is greater than the number B then this element should be replaced by the number B (the numbers M and B are taken from the process 3N + 1). Then send the changed array A from the process 3N to the process 3N + 2 and output the received array in the process 3N + 2; to do this, use the MPI_Get function call in the process of rank 3N + 2. Use the MPI_Win_lock, MPI_Win_unlock, MPI_Barrier synchronization functions in the processes of rank 3N + 1, the MPI_Barrier, MPI_Win_lock, MPI_Win_unlock synchronization functions in the processes of rank 3N + 2, and the MPI_Barrier function in the processes of rank 3N. Use the MPI_LOCK_EXCLUSIVE constant as the first parameter of the MPI_Win_lock function.

MPI7Win26. An array A of 5 positive real numbers is given in each slave process. Create an access window containing an array B of 5 zero-valued real numbers in the master process. Without performing any synchronization function calls in the master process (except calling the MPI_Barrier function) and using a sequence of calls of the MPI_Win_lock, MPI_Win_unlock, MPI_Barrier, MPI_Win_lock, MPI_Win_unlock synchronization functions in the slave processes, change elements of the array B by assigning the maximal value of the array A elements with the index I (I = 0, …, 4) to the array B element with the same index (to do this, use the MPI_Accumulate function) and then receive and output all the elements of the changed array B in each slave process (to do this, use the MPI_Get function). Use the MPI_LOCK_SHARED constant as the first parameter of the MPI_Win_lock function.

MPI7Win27. Two real numbers X, Y (the coordinates of some point on the plane) are given in each slave process. Using the MPI_Get function in the master process, receive real numbers X0, Y0 in this process that are equal to the coordinates of the point that is the most remote from the origin among all the points given in the slave processes. Then send the numbers X0, Y0 from the master process to all the slave processes and output these numbers in the slave processes; to do this, use the MPI_Get function call in the slave processes. Use the MPI_Win_lock, MPI_Win_unlock, MPI_Barrier synchronization functions in the master process and the MPI_Barrier, MPI_Win_lock, MPI_Win_unlock synchronization functions in the slave processes.

Note. If only the lock/unlock synchronization is used, this task cannot be solved with one-sided communications performed only on the side of the slave processes.

MPI7Win28. Solve the MPI7Win27 task using a single access window containing the numbers X0, Y0 in the master process. Use the MPI_Get and MPI_Put functions in the slave processes to find the numbers X0, Y0 (for some processes, the MPI_Put function is not required); use the MPI_Get function to send the numbers X0, Y0 to all the slave processes (as in the MPI7Win27 task). To synchronize exchanges when finding the numbers X0, Y0, use two calls of each of the MPI_Win_start and MPI_Win_complete functions in the slave processes and calls of the MPI_Win_post and MPI_Win_wait functions in a loop in the master process (a new group of processes must be defined at each iteration of the loop; this group should be used in the MPI_Win_post function call). To synchronize sending the numbers X0, Y0 to the slave processes, use the MPI_Barrier function in the master process and the MPI_Barrier, MPI_Win_lock, MPI_Win_unlock functions in the slave processes (as in the MPI7Win27 task).

Note. The solution method described in this task allows one-sided communications to be used only on the side of the slave processes (in contrast to the method described in the MPI7Win27 task), but it requires kinds of synchronization different from the lock/unlock one.

MPI7Win29. One row of the square integer-valued matrix of order K is given in each process, where K is the number of processes (the process of rank R contains the Rth row of the matrix, the rows are numbered from 0). Using the MPI_Get function calls in the master process, receive a matrix row with the minimal sum S of elements in this process and also find the number N of matrix rows with this minimal sum (if N > 1 then the last of such rows, that is, the row with the maximal ordinal number, should be saved in the master process). Then send this matrix row, the sum S, and the number N to each slave process using the MPI_Get function in these processes. Output all received data in each process. To do this, create an access window containing K + 2 integers in each process; the first K elements of the window should contain the elements of the matrix row, the next element should contain the sum S of its elements, and the last element should contain the number N. Use the MPI_Win_lock, MPI_Win_unlock, MPI_Barrier synchronization functions in the master process and the MPI_Barrier, MPI_Win_lock, MPI_Win_unlock synchronization functions in the slave processes.

Note. If only the lock/unlock synchronization is used, this task cannot be solved with one-sided communications performed only on the side of the slave processes.

MPI7Win30. Solve the MPI7Win29 task using a single access window containing the matrix row and the numbers S and N in the master process. Use the MPI_Get and MPI_Put functions in the slave processes to find the matrix row with the minimal sum and the related numbers S and N (for some processes, the MPI_Put function is not required); use the MPI_Get function to send the row with the minimal sum and the numbers S and N to all the slave processes (as in the MPI7Win29 task). To synchronize exchanges when finding the matrix row, use two calls of each of the MPI_Win_start and MPI_Win_complete functions in the slave processes and calls of the MPI_Win_post and MPI_Win_wait functions in a loop in the master process (a new group of processes must be defined at each iteration of the loop; this group should be used in the MPI_Win_post function call). To synchronize sending the row with the minimal sum and the numbers S and N to the slave processes, use the MPI_Barrier function in the master process and the MPI_Barrier, MPI_Win_lock, MPI_Win_unlock functions in the slave processes (as in the MPI7Win29 task).

Note. The solution method described in this task allows one-sided communications to be used only on the side of the slave processes (in contrast to the method described in the MPI7Win29 task), but it requires kinds of synchronization different from the lock/unlock one.



Designed by
M. E. Abramyan and V. N. Braguilevsky

Last revised:
06.05.2018