高性能计算代写 |澳洲代写


COSC3500/7502 Assignment: Parallel programming techniques

咨询 Alpha 小助手,获取更多课业帮助

1. Summary

The goal is to implement three different matrix multiply functions for three different hardware configurations (CPU - AVX/openMP , GPU - CUDA, and a cluster of two nodes - MPI). The matrices are real-only square matrices. Performance will be benchmarked relative to a reference implementation (Intel MKL for CPU and MPI, CUBLAS for GPU) for a 2048×2048 matrix, and marks will be assigned based on speed.


3. Hardware

Final performance will be assessed on the vgpu10 - 0 and vgpu10 - 1 nodes of the rangpur.compute.eait.uq.edu.au cluster. For development, jobs can be submitted to getafix.smp.uq.edu.au. All nodes on rangpur have similar performance for CPU and GPU jobs and most MPI implementations. Only highly optimized MPI matrix multiply will show communication overhead.


4. The benchmarks

Random unitary square real - only matrices are created and multiplied. The result is checked for correctness and speed relative to reference

implementations. For CPU and GPU, matrix multiplication is on the same machine. For MPI, each node has its own copy of matrices from the start, and nodes need to maintain a copy of the current matrix product answer.


5. Turning on/off CPU, GPU, and/or MPI code

Comment out the relevant #define lines in Assignment1_Gradebot.cpp and remove relevant lines in the MakeFile if the corresponding hardware is not available.

6. The only code files you can modify for your final submission

Only 3 files can be changed: matrixMultiply.cpp , matrixMultiplyGPU.cu , and matrixMultiplyMPI.cpp. Functions must not use outside libraries (except provided headers) and must not write to stdout or file in the final submission.


10. Final Submission

Submission must include matrixMultiply.cpp , matrixMultiplyGPU.cu , matrixMultiplyMPI.cpp and a zip file slurm.zip (containing slurm job

output files) all zipped together in a file named {your 8 digit student number}.zip . If a required file is not implemented, submit the original blank file.