Today, while analyzing a crawler's encryption parameters, I needed Base64 decoding and ran into an "Incorrect padding" error (reported as TypeError in Python 2 and as binascii.Error in Python 3). The solution is recorded here for easy reference.
In normal use Base64 causes no problems, as in the following code:
#!/usr/bin/env python
# coding:utf-8

import base64

a = b'hello'
# Base64-encode a
b = base64.b64encode(a)
print(b)  # b'aGVsbG8='
# Base64-decode b, i.e. decode the encoded form of a
c = base64.b64decode(b)
print(c)  # b'hello'
The encoded result in the code above is complete, so decoding it directly works. If the encoded value is incomplete, for example if the bytes object passed to the decoder is b'aGVsbG8' (the trailing = is missing), an "Incorrect padding" exception is raised. For example, the following code fails:
#!/usr/bin/env python
# coding:utf-8
import base64
a = b'aGVsbG8'
b = base64.b64decode(a)  # raises binascii.Error: Incorrect padding
print(b)  # never reached
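If the input comes from a crawler and may or may not be padded, it can help to catch the exception rather than let it stop the script. The following is only a minimal sketch for Python 3; the helper name is_decodable is hypothetical and not part of the base64 module.

```python
import base64
import binascii


def is_decodable(data: bytes) -> bool:
    """Return True if base64.b64decode accepts data as-is (hypothetical helper)."""
    try:
        base64.b64decode(data)
    except binascii.Error:
        # In Python 3, bad padding surfaces as binascii.Error,
        # which is a subclass of ValueError.
        return False
    return True


print(is_decodable(b'aGVsbG8='))  # True
print(is_decodable(b'aGVsbG8'))   # False
```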
The solution is as follows:
#!/usr/bin/env python
# coding:utf-8

import base64

a = b'aGVsbG8'
# Number of '=' needed to make the length a multiple of 4 (0 if it already is)
missing_padding = -len(a) % 4
if missing_padding:
    a += b'=' * missing_padding
b = base64.b64decode(a)
print(b)  # b'hello'
That solves the problem. All we do is append equals signs to the end, and missing_padding is the number of equals signs to add. If you already know how many = signs are missing, you can simply append them directly, as in the following code:
#!/usr/bin/env python
# coding:utf-8

import base64

a = b'aGVsbG8'
# One '=' is missing here, so append it directly
c = base64.b64decode(a + b'=')
print(c)  # b'hello'
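The two approaches can be wrapped into one reusable function. This is a sketch rather than a definitive implementation: the name b64decode_padded is made up for illustration, and it assumes the only defect in the input is stripped trailing padding.

```python
import base64


def b64decode_padded(data: bytes) -> bytes:
    """Decode Base64 whose trailing '=' padding may have been stripped."""
    # -len(data) % 4 is 0 when the length is already a multiple of 4,
    # so correctly padded input is passed through unchanged.
    missing_padding = -len(data) % 4
    return base64.b64decode(data + b'=' * missing_padding)


print(b64decode_padded(b'aGVsbG8'))   # b'hello'
print(b64decode_padded(b'aGVsbG8='))  # b'hello'
print(b64decode_padded(b'aA'))        # b'h'
```

An input whose length leaves a remainder of 1 when divided by 4 cannot be valid Base64, so adding padding will not rescue it and the decoder should still be expected to raise an error.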
To see how the number is calculated, we need to understand how Base64 works. It rests on the equation 3×8 = 4×6: the 24 bits that make up 3 bytes are regrouped into 4 units of 6 bits each, and each 6-bit unit is written out as one character. Since each unit carries only 6 bits, the top two bits of the resulting byte are 0, and if the last group of input has fewer than 3 bytes, its final unit is completed with 0 bits. These four characters are treated as one block, so after Base64 encoding the length of the output is at least 4 and always a multiple of 4, with the missing positions filled with '='.
Confused? My explanation may not be the clearest, and I would rather not draw a diagram, so let's look at the code instead:
#!/usr/bin/env python
# coding:utf-8

import base64

# Original data: 1 x 8 = 8 bits
a = b'h'
# 8 / 6 = 1 remainder 2, so 2 characters are needed to hold the 8 bits;
# to make the length divisible by 4, two = signs are added
b = base64.b64encode(a)
print(b)  # b'aA=='

# Strip the '=' signs from the encoded result
c = b.decode('utf-8').rstrip('=')
# We already worked out that 2 = signs are needed, so just append them before decoding
d = base64.b64decode(c.encode('utf-8') + b'=' * 2)
# The original data is restored
print(d)  # b'h'
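For readers who want to see the 3×8 = 4×6 regrouping itself, here is a rough sketch that encodes a single group of 1 to 3 bytes by hand. The function encode_group and the constant ALPHABET are names invented for this illustration; real code should keep using base64.b64encode.

```python
import base64
import string

# Standard Base64 alphabet: A-Z, a-z, 0-9, '+', '/'
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + '+/'


def encode_group(chunk: bytes) -> str:
    """Encode 1 to 3 bytes into one 4-character Base64 block (illustrative only)."""
    if not 1 <= len(chunk) <= 3:
        raise ValueError('a group holds 1 to 3 bytes')
    # Write the bytes as one bit string, 8 bits per byte
    bits = ''.join(f'{byte:08b}' for byte in chunk)
    # Fill the last unit with 0 bits so the total is a multiple of 6
    bits += '0' * (-len(bits) % 6)
    # Each 6-bit unit selects one character from the alphabet
    chars = ''.join(ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6))
    # Fill the block up to 4 characters with '='
    return chars.ljust(4, '=')


print(encode_group(b'hel'))  # aGVs
print(encode_group(b'lo'))   # bG8=
print(encode_group(b'h'))    # aA==
```

Joining the blocks for b'hel' and b'lo' gives aGVsbG8=, the same string that base64.b64encode(b'hello') produced at the start.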
That should make it clear at a glance, and that is all for this summary of Base64 padding errors.