Write an MPI-parallel program that runs on 4 processes and computes the sum of each row of a 10,000 x 10,000 matrix. Let the process with rank 0 generate the matrix and send the corresponding rows to the processes with ranks 1, 2, and 3.
I am a professor of parallel programming and I have worked with C and MPI several times before.
I could do this project without any problems at all.
The project guidelines list CUDA as a required skill, but CUDA is for parallel programming on GPUs, and as I understand it all the calculations here have to be done in MPI.
I have prior knowledge of MPI programming and I am an expert in C. I work in the HPC domain and have recently started looking for parallel programming jobs as a freelancer.
Relevant Skills and Experience
Prior knowledge of MPI using Open MPI and MPICH.
Expert in C programming. Would rate 8/10.