In Python 2, text is represented internally as Unicode. Converting between encodings therefore usually goes through Unicode as the intermediate form: first decode the encoded string into Unicode, then encode the result into the target encoding.
decode converts a string in some other encoding into Unicode. For example, str1.decode('gb2312') converts the GB2312-encoded string str1 into Unicode.
encode converts Unicode into a string in some other encoding. For example, str2.encode('gb2312') converts the Unicode string str2 into GB2312.
Therefore, when transcoding, you must first know which encoding the string str uses, then decode it into Unicode, and finally encode it into the target encoding.
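As a minimal sketch of the decode-then-encode flow described above, the helper below converts GB2312 bytes to UTF-8 by going through Unicode. It assumes Python 3, where the byte string type is bytes and the Unicode type is str; the name transcode is a hypothetical helper introduced only for illustration.

```python
def transcode(data, source_encoding, target_encoding):
    """Convert encoded bytes from one encoding to another via Unicode."""
    # Step 1: decode the source bytes into a Unicode string.
    text = data.decode(source_encoding)
    # Step 2: encode the Unicode string into the target encoding.
    return text.encode(target_encoding)


# The two characters U+4E2D U+6587 encoded as GB2312 (two bytes each).
gb_bytes = b"\xd6\xd0\xce\xc4"
utf8_bytes = transcode(gb_bytes, "gb2312", "utf-8")
```

The same two characters take four bytes in GB2312 and six bytes in UTF-8, so the byte strings differ even though the decoded text is identical.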
A string literal in the source code has the same encoding as the source file itself.
When Python is installed, the default encoding is ASCII. When the program contains non-ASCII data, Python often reports an error like this: UnicodeDecodeError: 'ascii' codec can't decode byte 0x?? in position 1: ordinal not in range(128). The ASCII codec cannot handle non-ASCII bytes. In this case, you need to set Python's default encoding to utf8.
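The error described above can be reproduced by decoding non-ASCII bytes with the ASCII codec explicitly. This is a small sketch assuming Python 3 syntax; the variable names are illustrative only.

```python
# UTF-8 bytes for the character U+4E2D; every byte is above 0x7F.
raw = b"\xe4\xb8\xad"

try:
    raw.decode("ascii")
    error_message = None
except UnicodeDecodeError as exc:
    # The ASCII codec rejects any byte outside the range 0-127.
    error_message = str(exc)

# Naming the real encoding explicitly avoids the error.
text = raw.decode("utf-8")
```

Passing the actual encoding to decode is an alternative to changing the default encoding when the encoding of the data is known.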
1. Modify on the command line (valid only for the current session):
1) view the current encoding with >>> sys.getdefaultencoding() (if an error is reported, first run >>> import sys and then >>> reload(sys));
2) set the encoding with >>> sys.setdefaultencoding('utf8')
2. Modify in the program file (more tedious, but the most effective):
1) add the following three lines to the program file:
import sys
reload(sys)
sys.setdefaultencoding('utf8')
3. Modify the Python environment (recommended)
Create a new sitecustomize.py file in Python's Lib/site-packages folder with the following contents:
# coding=utf8
import sys
reload(sys)
sys.setdefaultencoding('utf8')
Restart the Python interpreter and you will find that the default encoding is now utf8, with the same effect as scheme 2. This works because Python loads this file automatically at startup and sets the default encoding, so the code no longer has to be added by hand every time. This is a once-and-for-all solution.
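The three-line snippet used in the schemes above relies on reload and sys.setdefaultencoding, which are only available in Python 2. As a hedged sketch, the same lines can be wrapped in a version check so the file also runs under Python 3; the function name ensure_utf8_default_encoding is hypothetical.

```python
import sys


def ensure_utf8_default_encoding():
    """Return the default encoding, forcing utf8 on Python 2 only."""
    if sys.version_info[0] == 2:
        # Python 2: reload(sys) restores setdefaultencoding, which
        # site.py deletes at startup.
        reload(sys)  # noqa: F821 (builtin in Python 2 only)
        sys.setdefaultencoding("utf8")
    # Python 3 already defaults to UTF-8 and has no setdefaultencoding.
    return sys.getdefaultencoding()


default_encoding = ensure_utf8_default_encoding()
```

Calling sys.getdefaultencoding() afterwards is the same check as step 1) of scheme 1, so the return value confirms whether the setting took effect.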
Note the direction of each conversion between str and unicode:
str -> unicode: str.decode(encoding)
unicode -> str: unicode.encode(encoding)