HPC - Matrix-matrix addition

Moreno Marzolla

Last updated: 2022-11-23

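The program cuda-matsum.cu computes the sum of two square matrices of size \(N \times N\) using the CPU. Modify the program to use the GPU; you must modify the function matsum() in such a way that the new version is transparent to the caller, i.e., the caller is not aware whether the computation happens on the CPU or the GPU. To this end, function matsum() should: allocate device memory for copies of the two input matrices and for the result; copy the inputs from host to device; launch a kernel that computes the sum; copy the result from device to host; and free the device memory.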
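A minimal sketch of one way such a wrapper could look is given below. It is not the reference solution: the signature of matsum(), the kernel name matsum_kernel, the block side BLKDIM = 32, and the row-major layout are all assumptions, and the actual code in cuda-matsum.cu may differ. Error checking on the CUDA calls is omitted for brevity.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

#define BLKDIM 32 /* assumed block side: 32 x 32 = 1024 threads per block */

/* Each thread computes one element of r = p + q. Threads that fall
   outside the n x n matrix do nothing. (Hypothetical kernel name.) */
__global__ void matsum_kernel(const float *p, const float *q, float *r, int n)
{
    const int i = blockIdx.y * blockDim.y + threadIdx.y; /* row */
    const int j = blockIdx.x * blockDim.x + threadIdx.x; /* column */
    if (i < n && j < n) {
        r[i * n + j] = p[i * n + j] + q[i * n + j];
    }
}

/* Assumed signature: p, q, r are HOST pointers to n x n row-major
   matrices. The caller never sees a device pointer. */
void matsum(const float *p, const float *q, float *r, int n)
{
    const size_t size = (size_t)n * n * sizeof(float);
    float *d_p, *d_q, *d_r;

    /* Allocate device copies of the three matrices */
    cudaMalloc((void **)&d_p, size);
    cudaMalloc((void **)&d_q, size);
    cudaMalloc((void **)&d_r, size);

    /* Copy the inputs from host to device */
    cudaMemcpy(d_p, p, size, cudaMemcpyHostToDevice);
    cudaMemcpy(d_q, q, size, cudaMemcpyHostToDevice);

    /* Round the grid size up so that the whole matrix is covered */
    const dim3 block(BLKDIM, BLKDIM);
    const dim3 grid((n + BLKDIM - 1) / BLKDIM, (n + BLKDIM - 1) / BLKDIM);
    matsum_kernel<<<grid, block>>>(d_p, d_q, d_r, n);

    /* Copy the result back; this call waits for the kernel to finish */
    cudaMemcpy(r, d_r, size, cudaMemcpyDeviceToHost);

    /* Release device memory */
    cudaFree(d_p);
    cudaFree(d_q);
    cudaFree(d_r);
}

int main(void)
{
    const int n = 1000; /* deliberately not a multiple of BLKDIM */
    const size_t nelem = (size_t)n * n;
    float *p = (float *)malloc(nelem * sizeof(float));
    float *q = (float *)malloc(nelem * sizeof(float));
    float *r = (float *)malloc(nelem * sizeof(float));

    for (size_t k = 0; k < nelem; k++) {
        p[k] = (float)(k % 7); /* small integers, exact in float */
        q[k] = 1.0f;
    }

    matsum(p, q, r, n);

    int ok = 1;
    for (size_t k = 0; k < nelem; k++) {
        if (r[k] != p[k] + q[k]) { ok = 0; break; }
    }
    printf("%s\n", ok ? "Check OK" : "Check FAILED");

    free(p); free(q); free(r);
    return ok ? 0 : 1;
}
```

Because all allocation, transfer, and cleanup happen inside matsum(), the caller passes ordinary host pointers and cannot tell where the sum was computed.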

The program must work with any value of the matrix size \(N\), even if it is not an integer multiple of the CUDA block size. Note that there is no need to use shared memory: why?
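When \(N\) is not a multiple of the block side, a common approach is to round the number of blocks up and let the surplus threads skip the computation. The host-side arithmetic can be sketched as follows; the block side BLKDIM = 32 and the helper name blocks_for are assumptions made for illustration, not names taken from cuda-matsum.cu.

```cuda
#include <stdio.h>

#define BLKDIM 32 /* assumed block side, not taken from cuda-matsum.cu */

/* Number of blocks of side blkdim needed to cover n elements
   (ceiling of n / blkdim, using integer arithmetic only). */
int blocks_for(int n, int blkdim)
{
    return (n + blkdim - 1) / blkdim;
}

int main(void)
{
    const int sizes[] = {1024, 1000, 33};
    for (int k = 0; k < 3; k++) {
        const int n = sizes[k];
        const int nb = blocks_for(n, BLKDIM);
        /* Threads launched along one dimension, and how many of them
           fall outside the matrix and must be masked off in the kernel */
        printf("N=%d: %d blocks per dimension, %d threads, %d idle\n",
               n, nb, nb * BLKDIM, nb * BLKDIM - n);
    }
    return 0;
}
```

For example, with \(N = 1000\) and a block side of 32, 32 blocks are needed along each dimension, so 1024 threads are launched per row and the last 24 must do nothing. This is why a kernel written for arbitrary \(N\) needs a bounds check on both the row and the column index.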

To compile:

    nvcc cuda-matsum.cu -o cuda-matsum -lm

To execute:

    ./cuda-matsum [N]

Example:

    ./cuda-matsum 1024