Custom PyTorch training script spawns multiple compile jobs

Observed Error

A custom PyTorch training or evaluation script spawns multiple compile jobs, or re-executes itself recursively in an infinite loop.

Explanation

This usually happens because the top-level code of the Python script is not guarded by an if __name__ == "__main__": block. At various points during execution, subprocesses are spun off (e.g., for weight transfer or for creating surrogate jobs), and each of them can cause the whole module to be executed again, including any unguarded top-level code.
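The effect can be reproduced with the standard library alone. The sketch below assumes the subprocesses re-import the main script the way Python's multiprocessing "spawn" start method does, that is, under the module name "__mp_main__" instead of "__main__". Whether every subprocess in this workflow is started that way is an assumption, and the script contents and the file name train_script.py are purely illustrative.

```python
import os
import runpy
import tempfile

# Illustrative stand-in for a user script: one unguarded statement and one
# statement behind the __main__ guard.
SCRIPT = '''
events = []
events.append("module-level code ran")
if __name__ == "__main__":
    events.append("guarded code ran")
'''


def run_as(run_name):
    """Execute the script under the given module name and return its events."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "train_script.py")
        with open(path, "w") as f:
            f.write(SCRIPT)
        return runpy.run_path(path, run_name=run_name)["events"]


# Direct execution: both the unguarded and the guarded code run.
as_main = run_as("__main__")
# Re-import in a spawned child: only the unguarded code runs.
as_child = run_as("__mp_main__")

print(as_main)
print(as_child)
```

Anything left at module level, such as a call that starts compilation or training, would therefore run again in each such subprocess, while guarded code would not.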

Workaround

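Add an if __name__ == "__main__": guard to your Python script, and place the code that starts training or evaluation inside it so that it runs only when the script is executed directly.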
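A minimal sketch of a guarded script layout follows. The train function is a hypothetical placeholder for the real PyTorch training or evaluation loop, and the names train, main, and num_steps are assumptions made for illustration, not part of any library API.

```python
def train(num_steps):
    """Hypothetical placeholder for the real training/evaluation loop."""
    loss = 1.0
    for _ in range(num_steps):
        loss *= 0.5  # pretend each step halves the loss
    return loss


def main():
    # Everything that launches work lives here, not at module level.
    final_loss = train(num_steps=3)
    print(f"final loss: {final_loss}")
    return final_loss


# Runs only when the file is executed directly, not when a subprocess
# imports or re-executes the module under a different name.
if __name__ == "__main__":
    main()
```

Keeping only imports and definitions at module level means that a subprocess which re-executes the module defines the functions but does not start another run.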