When training a model with PaddlePaddle, GPU memory can remain occupied after training has finished, which interferes with the next training run. To release GPU memory automatically once training ends, you can borrow the multiprocess approach used with TensorFlow and run the PaddlePaddle training in a separate process. When the training process exits, its GPU resources are released automatically.
However, training PaddlePaddle models with multiprocessing sometimes fails with the following error:
CUDA error(3), initialization error.
According to the discussion in a PaddlePaddle issue on GitHub, the fix is to import every Paddle-related module inside the child process and never in the parent process. With the imports arranged this way, the modules run normally and the corresponding resources are released automatically when the process finishes.
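The arrangement described above can be sketched as follows. This is a minimal illustration and not the code from the GitHub issue: `train` and `run_in_subprocess` are hypothetical names, the body of `train` is a stand-in for a real training loop, and the sketch assumes PaddlePaddle 2.x is installed. The one point it is meant to show is that `import paddle` appears only inside the function that runs in the child process.

```python
import multiprocessing as mp


def train():
    # Import paddle here, inside the function executed by the child process,
    # and never at module top level, so the parent process does not touch CUDA.
    import paddle  # assumption: paddlepaddle 2.x is installed

    # Hypothetical stand-in for a real training loop.
    x = paddle.ones([2, 2])
    print("sum of ones:", float(x.numpy().sum()))


def run_in_subprocess(target, *args):
    """Run target(*args) in a child process and return its exit code."""
    # Assumption: use "fork" where it exists (the Linux default in Python 3.7),
    # otherwise fall back to "spawn".
    method = "fork" if "fork" in mp.get_all_start_methods() else "spawn"
    ctx = mp.get_context(method)
    proc = ctx.Process(target=target, args=args)
    proc.start()
    proc.join()  # resources held by the child are released when it exits
    return proc.exitcode


if __name__ == "__main__":
    exit_code = run_in_subprocess(train)
    print("training process exit code:", exit_code)
```

A non-zero exit code means the child failed, for example because an import or the training itself raised an exception. Because the parent only starts and joins the child, it can launch the next training run in a fresh process afterwards.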
Releasing GPU memory when a TensorFlow function completes – Zhihu
Single-GPU multi-process error · Issue #2241 · PaddlePaddle/PaddleDetection · GitHub
multiprocessing — Process-based parallelism — Python 3.7.12 documentation