Led by: Murali Emani, Denis Boyda
We introduce two profiling tools for understanding the communication in distributed deep learning.
- MPI flat profiling using mpitrace
  To turn on the profiling, one has to set the environment variable LD_PRELOAD:

      export LD_PRELOAD=/lus/theta-fs0/software/datascience/thetagpu/hpctw/lib/libmpitrace.so

  Then run the application as usual. MPI profiling results will be generated after the run finishes, in files named mpi_profile.XXXX.[rank_id].
- Horovod timeline
  To perform Horovod timeline analysis, one has to set the environment variable HOROVOD_TIMELINE, which specifies the file for the output:

      export HOROVOD_TIMELINE=timeline.json

  This file is only recorded on rank 0, but it contains information about the activity of all workers. You can then open the timeline file using the chrome://tracing facility of the Chrome browser.
More details: https://horovod.readthedocs.io/en/stable/timeline_include.html
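Besides viewing the timeline in the browser, it can be summarized from a script. The following is a minimal sketch, not part of the original material. It assumes the timeline follows the Chrome trace-event JSON format (which is what chrome://tracing reads), with begin/end events marked by "ph": "B" and "ph": "E", and that the file may lack its closing bracket if the run did not finish writing it. The function names load_trace_events and total_time_by_name are hypothetical helpers introduced for this illustration.

```python
import json
from collections import defaultdict


def load_trace_events(text):
    """Parse Chrome trace-event JSON, tolerating a missing closing bracket."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Assumption: the file is a JSON array that was never closed,
        # possibly ending with a trailing comma.
        data = json.loads(text.rstrip().rstrip(",") + "]")
    # The trace-event format allows a bare array or {"traceEvents": [...]}.
    if isinstance(data, dict):
        data = data.get("traceEvents", [])
    return [ev for ev in data if isinstance(ev, dict)]


def total_time_by_name(events):
    """Sum durations (in microseconds) per event name.

    Handles paired begin/end events ("B"/"E") and complete events ("X").
    Assumes events appear in chronological order within each (pid, tid).
    """
    totals = defaultdict(int)
    open_events = defaultdict(list)  # (pid, tid) -> stack of (name, start_ts)
    for ev in events:
        phase = ev.get("ph")
        key = (ev.get("pid"), ev.get("tid"))
        if phase == "B":
            open_events[key].append((ev.get("name", "?"), ev["ts"]))
        elif phase == "E" and open_events[key]:
            name, start = open_events[key].pop()
            totals[name] += ev["ts"] - start
        elif phase == "X":
            totals[ev.get("name", "?")] += ev.get("dur", 0)
    return dict(totals)


# Example usage (assumes timeline.json exists in the working directory):
#
#     with open("timeline.json") as f:
#         events = load_trace_events(f.read())
#     totals = total_time_by_name(events)
#     for name, us in sorted(totals.items(), key=lambda kv: -kv[1]):
#         print(f"{name:30s} {us / 1e3:10.3f} ms")
```

Sorting the totals by duration gives a quick view of which phases dominate the run, which can then be inspected in detail in chrome://tracing.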
We introduce you to profiling using TensorFlow.
For profiling on Intel architectures, see IntelProfiler/
For profiling on Nvidia architectures, see NvidaProfiler/