Tag Archives: PaddlePaddle

[Solved] PaddlePaddle multiprocessing error: “(External) CUDA error(3), initialization error.”

When a model is trained with PaddlePaddle, the GPU memory can remain occupied after training has finished, which interferes with the next training run. TensorFlow users work around the same problem by running the work in a separate process, and the same approach applies here: run the PaddlePaddle training in a child process so that the GPU resources are released automatically when that process exits.

However, training PaddlePaddle models with multiprocessing sometimes fails with the following error:

CUDA error(3), initialization error.

According to the issue discussion on PaddlePaddle's GitHub, the fix is to place every paddle-related import inside the code that runs in the child process and not to import paddle anywhere outside of it. The modules then run normally, and the corresponding resources are released automatically when the process finishes.
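The arrangement described above can be sketched as follows. This is a minimal illustration, not the code from the GitHub issue: `train_worker`, `run_in_child`, and the `start_method` parameter are hypothetical names, and the `paddle.set_device("gpu")` call assumes the Paddle 2.x API and a visible GPU.

```python
import multiprocessing as mp


def train_worker():
    """Runs inside the child process; all paddle imports live here."""
    import paddle  # imported in the child only, never at module level

    paddle.set_device("gpu")  # assumes the Paddle 2.x API and a visible GPU
    # ... build the model and run the training loop here ...


def run_in_child(target, args=(), start_method=None):
    """Run target(*args) in its own process and return the exit code."""
    ctx = mp.get_context(start_method)  # None selects the platform default
    proc = ctx.Process(target=target, args=args)
    proc.start()
    proc.join()  # GPU memory held by the child is freed when it exits
    return proc.exitcode


# Typical use from a script, under the usual multiprocessing guard:
#     if __name__ == "__main__":
#         exit_code = run_in_child(train_worker)
```

Because the parent process never imports paddle, it never initializes CUDA itself, and each training run starts from a clean state in its own process.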

Reference:

Releasing GPU memory after a TensorFlow function finishes (Zhihu)

Single GPU multi-process error · Issue #2241 · PaddlePaddle/PaddleDetection · GitHub

multiprocessing: Process-based parallelism, Python 3.7.12 documentation

[Solved] PaddlePaddle error when iterating over data: TypeError: 'function' object is not iterable

Problem Description: reading the training data through the reader fails with the error TypeError: 'function' object is not iterable

error message:

TypeError                                 Traceback (most recent call last)
<ipython-input-12-0b74c209241b> in <module>
      2 for pass_id in range(1):
----> 3     for batch_id, data in enumerate(train_reader):
      4         train_cost, train_acc = exe.run(program=fluid.default_main_program(),
      5                                         feed=feeder.feed(data),

TypeError: 'function' object is not iterable

Reproducing the problem: the loop iterates over the reader created by paddle.batch(), but the variable is passed to enumerate() directly without being called, which raises the error. The failing code is as follows:

for batch_id, data in enumerate(train_reader):
    train_cost, train_acc = exe.run(program=fluid.default_main_program(),
                                    feed=feeder.feed(data),
                                    fetch_list=[avg_cost, acc])

Solution: paddle.batch() returns a reader, which is itself a data-reading function. The error above occurs because the loop iterates over the train_reader variable directly, and that variable refers to a function. Add parentheses to call the function and iterate over the value it returns:

for batch_id, data in enumerate(train_reader()):
    train_cost, train_acc = exe.run(program=fluid.default_main_program(),
                                    feed=feeder.feed(data),
                                    fetch_list=[avg_cost, acc])

In Python, a function name without parentheses refers to the function object itself, and nothing is executed. With parentheses, the function is called, and the expression evaluates to the value returned once the call has completed.
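The difference can be reproduced without PaddlePaddle. In the sketch below, `batch` is a hypothetical stand-in that only mimics the shape of paddle.batch() (it takes a reader and returns a new reader function); it is not the library's implementation.

```python
def batch(reader, batch_size):
    """Hypothetical stand-in for paddle.batch(): wraps a reader in a new reader."""
    def batch_reader():
        chunk = []
        for sample in reader():
            chunk.append(sample)
            if len(chunk) == batch_size:
                yield chunk
                chunk = []
        if chunk:  # emit the final, smaller batch
            yield chunk
    return batch_reader


def sample_reader():
    """A toy reader that yields five samples."""
    yield from range(5)


train_reader = batch(sample_reader, batch_size=2)

# Passing the function object itself reproduces the error.
try:
    enumerate(train_reader)
except TypeError as err:
    message = str(err)

# Calling the function returns a generator, which is iterable.
batches = list(train_reader())
print(message)
print(batches)
```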

PaddlePaddle: Fatal Python error: PyThreadState_Get: no current thread when running the Boston house price prediction code

Problem description.
After PaddlePaddle is installed successfully, running the Boston house price prediction code reports the error Fatal Python error: PyThreadState_Get: no current thread

Error output.

Fatal Python error: PyThreadState_Get: no current thread

How to reproduce:
The Mac has two copies of Python 2.7 installed, one from Anaconda and one from brew. PaddlePaddle is installed into Anaconda's Python, and the installation succeeds. Running the house price prediction model with this PaddlePaddle installation reports Fatal Python error: PyThreadState_Get: no current thread

Solution:
This problem is caused by a conflict between brew's Python and Anaconda's Python. It can be resolved as follows:

Execute otool -L /anaconda2/lib/python2.7/site-packages/py_paddle/_swig_paddle.so

The output is as follows:

/anaconda2/lib/python2.7/site-packages/py_paddle/_swig_paddle.so 
/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1445.12.0) 
/System/Library/Frameworks/Security.framework/Versions/A/Security (compatibility version 1.0.0, current version 58286.20.16) 
/usr/local/opt/python/Frameworks/Python.framework/Versions/2.7/Python (compatibility version 2.7.0, current version 2.7.0) 
/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 400.9.0) 
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.0.0)

The output shows that the library is linked against /usr/local/opt/python/Frameworks/Python.framework/Versions/2.7/Python, which is brew's Python, rather than the Python library inside Anaconda.

Execute install_name_tool -change /usr/local/opt/python/Frameworks/Python.framework/Versions/2.7/Python /anaconda/lib/libpython2.7.dylib /anaconda/lib/python2.7/site-packages/py_paddle/_swig_paddle.so (use the Anaconda prefix of your own installation in both paths)

At this point, we can run the Boston house price forecast code through paddlepaddle, and the above problem will not appear again
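The check performed with otool can also be scripted. The following is a rough sketch rather than an official tool: `linked_python_libs` and `inspect_library` are hypothetical helpers, the parsing assumes the `otool -L` output layout shown above, and `inspect_library` only works on macOS where otool is available.

```python
import subprocess

# Sample taken from the otool output shown above (shortened to three entries).
SAMPLE_OUTPUT = """/anaconda2/lib/python2.7/site-packages/py_paddle/_swig_paddle.so
/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1445.12.0)
/usr/local/opt/python/Frameworks/Python.framework/Versions/2.7/Python (compatibility version 2.7.0, current version 2.7.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.0.0)"""


def linked_python_libs(otool_output):
    """Return the dependency paths that look like a Python runtime library."""
    libs = []
    for line in otool_output.splitlines()[1:]:  # first line names the inspected file
        path = line.strip().split(" (compatibility")[0]
        if "Python.framework" in path or "libpython" in path:
            libs.append(path)
    return libs


def inspect_library(so_path):
    """Run `otool -L` on so_path (macOS only) and list its Python dependencies."""
    result = subprocess.run(
        ["otool", "-L", so_path], capture_output=True, text=True, check=True
    )
    return linked_python_libs(result.stdout)


found = linked_python_libs(SAMPLE_OUTPUT)
print(found)
```

A path under /usr/local/opt/python in the result indicates that the extension is bound to brew's Python, which is the situation the install_name_tool command corrects.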

Problem analysis:
PyThreadState_Get() is a function in the core of the Python interpreter that returns the state of the current thread, and threads in turn rely on system resources. When several Python installations coexist on a system without environment isolation, they can conflict, and one symptom of such a conflict is Fatal Python error: PyThreadState_Get: no current thread. Because this is interpreter-level code, changing it is rarely necessary, and it is difficult and costly to do. The recommended approach is to fix the environment instead, as the solution above does, by configuring the Python development environment so that the conflict cannot occur.

Further discussion:
Generally speaking, problems at the interpreter level are serious, so they tend to be fixed quickly. If such a problem occurs in a stable Python release, the cause is usually environmental, such as a version conflict or a system resource limit. The best way to handle this is to manage Python versions explicitly, usually with tools such as pyenv and virtualenv (pyenv supports only Linux and macOS). These tools create independent virtual development environments for different Python versions; the environments are well isolated and do not affect the system installation. For a specific problem such as Fatal Python error: PyThreadState_Get: no current thread, a targeted fix like the one above also works.
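When several interpreters are installed, a quick first check is to confirm which one is actually running the code. The sketch below uses only the standard `sys` module; `describe_interpreter` is a hypothetical helper name, and the `real_prefix` check is there for older virtualenv releases that set that attribute.

```python
import sys


def describe_interpreter():
    """Collect the facts that identify the running Python installation."""
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return {
        "executable": sys.executable,  # path of the running interpreter binary
        "version": "%d.%d.%d" % sys.version_info[:3],
        "prefix": sys.prefix,  # installation or environment root
        # True inside a venv/virtualenv, False for a system-wide interpreter
        "in_virtualenv": hasattr(sys, "real_prefix") or sys.prefix != base_prefix,
    }


info = describe_interpreter()
for key in sorted(info):
    print("%s: %s" % (key, info[key]))
```

If the executable or prefix printed here is not the installation that PaddlePaddle was installed into, the environment is the first thing to fix.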

Problem research:
PyThreadState_Get is a function in the core of the Python interpreter. Part of the interpreter code that uses it is shown below:

void PyErr_Restore(PyObject *type, PyObject *value, PyObject *traceback)
{
    PyThreadState *tstate = PyThreadState_GET();
    PyObject *oldtype, *oldvalue, *oldtraceback;

    if (traceback != NULL && !PyTraceBack_Check(traceback)) {
        /* XXX Should never happen -- fatal error instead?*/
        /* Well, it could be None. */
        Py_DECREF(traceback);
        traceback = NULL;
    }

    // Save previous exception messages
    oldtype = tstate->curexc_type;
    oldvalue = tstate->curexc_value;
    oldtraceback = tstate->curexc_traceback;
    // Set the current exception message
    tstate->curexc_type = type;
    tstate->curexc_value = value;
    tstate->curexc_traceback = traceback;
    // Discard previous exception messages
    Py_XDECREF(oldtype);
    Py_XDECREF(oldvalue);
    Py_XDECREF(oldtraceback);
}

Through PyThreadState_GET(), Python obtains the state object of the current thread and stores the exception information in it.

Interpreter-level code in Python rarely reports errors, so when an error does occur at this level, the first thing to check is still the development environment. Fatal Python error: PyThreadState_Get: no current thread usually appears on macOS, and the common cause is that several Python environments are installed on the Mac. An elegant approach is to use pyenv on the Mac, which keeps the system's original Python, the Python installed by brew, and any Python installed later isolated from one another.

PaddlePaddle Error: ‘map’ object is not subscriptable

Problem Description: I wrote the machine translation model by following the official PaddlePaddle documentation, and this error occurred. I compared my code with the code in the documentation and found no difference.

error message:

Original sentence:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-27-e241afef7936> in <module>()
     20 
     21     print("Original sentence:")
---> 22     print(" ".join([src_dict[w] for w in feed_data[0][0][1:-1]]))
     23 
     24     print("Translated score and sentence:")

TypeError: 'map' object is not subscriptable

Reproducing the problem:

exe = Executor(place)
exe.run(framework.default_startup_program())

for data in test_data():
    feed_data = map(lambda x: [x[0]], data)
    feed_dict = feeder.feed(feed_data)
    feed_dict['init_ids'] = init_ids
    feed_dict['init_scores'] = init_scores

    results = exe.run(
        framework.default_main_program(),
        feed=feed_dict,
        fetch_list=[translation_ids, translation_scores],
        return_numpy=False)

Problem analysis:
In Python 3, map() returns an iterator (a map object), which cannot be indexed by subscript. In Python 2, map() returns a list, so the same code runs without a problem. To fix the error, change the code to a form that is compatible with Python 3.

Solution:

To access the result of map() by subscript, first convert the map object into a list, which supports subscripting:

exe = Executor(place)
exe.run(framework.default_startup_program())

for data in test_data():
    feed_data = list(map(lambda x: [x[0]], data))
    feed_dict = feeder.feed(feed_data)
    feed_dict['init_ids'] = init_ids
    feed_dict['init_scores'] = init_scores

    results = exe.run(
        framework.default_main_program(),
        feed=feed_dict,
        fetch_list=[translation_ids, translation_scores],
        return_numpy=False)

Further discussion:
map() is a built-in function in Python, and it behaves differently in Python 2 and Python 3. In Python 2 it returns a list that holds every result, which consumes memory even when not all of the elements are needed. Python 3 therefore changed it to return a lazy iterator: each element is computed only when it is requested, and the iterator can be consumed only once.
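The Python 3 behavior described above can be seen in a few lines. This is a small sketch with made-up sample data; only the lambda mirrors the one used in the translation example.

```python
# Made-up sample rows; only the first field of each row is kept.
data = [(1, "a"), (2, "b"), (3, "c")]

lazy = map(lambda x: [x[0]], data)

# A map object cannot be indexed.
try:
    lazy[0]
except TypeError as err:
    message = str(err)

# It can be consumed only once: the second pass finds nothing left.
first_pass = list(lazy)
second_pass = list(lazy)

# Converting to a list up front gives a reusable, subscriptable object.
feed_data = list(map(lambda x: [x[0]], data))

print(message)
print(first_pass, second_pass, feed_data[0])
```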