Background
Recently, a project required executing Python code remotely over SSH in a CentOS 7 container running Python 3.6. Opening a file with a Chinese name, for example open('狼来了.txt'), works without problems when run locally in the CentOS 7 container. When the same code is run over an SSH connection, however, open() fails with the following error:
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2
The error arises because the Chinese file name passed to open() is encoded with the ASCII codec. ASCII defines only 128 characters (digits, uppercase and lowercase letters, and some special symbols) and contains no Chinese characters. [This article does not go into why the name is encoded at all.]
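The failure can be reproduced without SSH or any file at all. The following minimal sketch encodes a sample Chinese file name (chosen here for illustration) with both codecs; the helper try_encode is a hypothetical name, not part of the project.

```python
# Sample non-ASCII file name, chosen for illustration only.
filename = "狼来了.txt"


def try_encode(text, encoding):
    """Return the encoded bytes, or the UnicodeEncodeError if encoding fails."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError as error:
        return error


# ASCII has no code points for the three Chinese characters, so this fails.
ascii_result = try_encode(filename, "ascii")
# UTF-8 can represent every Unicode character, so this succeeds.
utf8_result = try_encode(filename, "utf-8")

print(ascii_result)
print(utf8_result)
```

The ASCII attempt reports the same character range (positions 0 to 2) as the error seen over SSH, because the three Chinese characters are the first three characters of the name.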
Therefore, the code runs normally after the following change:
open('狼来了.txt'.encode('utf8'))
This works because the file name is now explicitly encoded as UTF-8 (which, of course, the running environment must support).
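If the workaround is needed in more than one place, it could be wrapped in a small helper. This is only a sketch: open_with_utf8_name is a hypothetical name, the file name is a sample, and the demonstration writes to a temporary directory so that it leaves nothing behind.

```python
import os
import tempfile


def open_with_utf8_name(path, mode="r", **kwargs):
    """Open `path`, handing the name to the OS as explicit UTF-8 bytes.

    Passing bytes bypasses Python's locale-dependent file name encoding.
    """
    if isinstance(path, str):
        path = path.encode("utf-8")
    return open(path, mode, **kwargs)


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "狼来了.txt")
    with open_with_utf8_name(target, "w", encoding="utf-8") as handle:
        handle.write("hello")
    with open_with_utf8_name(target, encoding="utf-8") as handle:
        content = handle.read()

print(content)
```

This treats the symptom only. Every call site has to use the helper, which is why the rest of the article looks for a fix in the environment instead.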
With the root cause identified, we can turn to the SSH side of the problem.
Language encoding environment
Python runtime container environment
In the Python runtime container, the locale and locale -a commands show the language encoding environment:
[root@pyrun-test-69d4d45d79-mqg6n /]# locale -a
C
en_US.utf8
POSIX
[root@pyrun-test-69d4d45d79-mqg6n /]# locale
LANG=en_US.utf8
LC_CTYPE="en_US.utf8"
LC_NUMERIC="en_US.utf8"
LC_TIME="en_US.utf8"
LC_COLLATE="en_US.utf8"
LC_MONETARY="en_US.utf8"
LC_MESSAGES="en_US.utf8"
LC_PAPER="en_US.utf8"
LC_NAME="en_US.utf8"
LC_ADDRESS="en_US.utf8"
LC_TELEPHONE="en_US.utf8"
LC_MEASUREMENT="en_US.utf8"
LC_IDENTIFICATION="en_US.utf8"
LC_ALL=
Language encoding environment of the SSH client
The language encoding environment of the SSH client is as follows:
root@coding-editor-test-8c6cdfdd8-9tpcl:/app# locale -a
C
C.UTF-8
POSIX
root@coding-editor-test-8c6cdfdd8-9tpcl:/app# locale
LANG=C.UTF-8
LANGUAGE=
LC_CTYPE="C.UTF-8"
LC_NUMERIC="C.UTF-8"
LC_TIME="C.UTF-8"
LC_COLLATE="C.UTF-8"
LC_MONETARY="C.UTF-8"
LC_MESSAGES="C.UTF-8"
LC_PAPER="C.UTF-8"
LC_NAME="C.UTF-8"
LC_ADDRESS="C.UTF-8"
LC_TELEPHONE="C.UTF-8"
LC_MEASUREMENT="C.UTF-8"
LC_IDENTIFICATION="C.UTF-8"
LC_ALL=
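The locale command shows what the shell sees; what matters for open() is what Python derives from it. The sketch below could be run on both sides, locally and through SSH, to compare the results. The function name describe_encoding_environment is hypothetical. Note that Python 3.7 and later treat the C locale differently from Python 3.6 (they switch to UTF-8 mode), so results on a newer interpreter may not match the 3.6 behavior described in this article.

```python
import locale
import os
import sys


def describe_encoding_environment():
    """Collect the locale variables and the encodings Python derived from them."""
    return {
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
        "LC_CTYPE": os.environ.get("LC_CTYPE"),
        # Encoding used to convert str file names to bytes for the OS.
        "filesystem_encoding": sys.getfilesystemencoding(),
        # Default encoding for text files opened without an explicit encoding.
        "preferred_encoding": locale.getpreferredencoding(False),
    }


for key, value in describe_encoding_environment().items():
    print("{}: {}".format(key, value))
```

If filesystem_encoding differs between the local run and the SSH run, the environment passed in over SSH is the likely cause.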
Configure language encoding environment
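A comparison shows that the SSH client's locale is C.UTF-8, while the Python runtime environment uses en_US.utf8.
(Note: the plain C locale means ASCII, whereas en_US.utf8 and zh_CN.utf8 use UTF-8 and can therefore represent Chinese characters. The locale -a output of the runtime container lists no C.UTF-8, so a C.UTF-8 setting arriving there presumably cannot be honored and falls back to C.)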
Our first thought was that the SSH client's locale might affect the default encoding of the Python runtime, so we set the SSH client (Ubuntu in this case) to the same locale as the Python runtime environment:
#!/bin/bash
# bash is required here: "echo -e" and "source" are not portable to plain sh
# 1. Set the language to en_US.UTF-8
echo -e 'LANG="en_US.UTF-8"\nLANGUAGE="en_US:en"' >> /etc/default/locale
# Apply the configuration to the current shell
source /etc/default/locale
# 2. If the en_US.UTF-8 locale is missing, install the locales package and generate it
apt-get install --no-install-recommends -y locales
locale-gen en_US.UTF-8
localedef -v -c -i en_US -f UTF-8 en_US.UTF-8
After this change, we connected to the Python runtime container over SSH again, and open('狼来了.txt') ran without any error.
So the question is: why does code run over SSH use the client's encoding environment?
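The effect of an inherited LANG value can be explored without SSH by launching a child interpreter with a modified environment. The following is a rough sketch under several assumptions: filesystem_encoding_with is a hypothetical helper, and the PYTHONUTF8 and PYTHONCOERCECLOCALE settings are there only to switch off the locale handling added in Python 3.7 so that a newer interpreter behaves more like 3.6 (Python 3.6 ignores both). The result depends on the platform and on which locales are installed, so no particular output is claimed.

```python
import os
import subprocess
import sys


def filesystem_encoding_with(lang):
    """Start a child Python with LANG set to `lang` and report its file name encoding."""
    env = dict(os.environ)
    # Drop LC_* overrides so that LANG alone decides the locale.
    for name in list(env):
        if name.startswith("LC_"):
            del env[name]
    env["LANG"] = lang
    # Disable UTF-8 mode and C-locale coercion (Python 3.7+ only).
    env["PYTHONUTF8"] = "0"
    env["PYTHONCOERCECLOCALE"] = "0"
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.getfilesystemencoding())"],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()


for lang in ("C", "en_US.utf8", "C.UTF-8"):
    print(lang, "->", filesystem_encoding_with(lang))
```

A process started through SSH is in the same position as the child here: it works with whatever locale variables it was handed at startup.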
Research on SSH env mechanism
cat /etc/ssh/ssh_config
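The SSH client configuration file contains a SendEnv directive for the locale variables.
This directive makes the SSH client forward its local locale variables to the Python runtime environment, which is why the session ran with the client's locale. It also means the client and the runtime do not need to have identical locales installed.
It is enough to have the client send the desired value. Note that SendEnv accepts only variable names; a fixed value is set with SetEnv (available in OpenSSH 7.8 and later):
SetEnv LANG=en_US.utf8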
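To make the forwarding rule concrete, the sketch below works out which local variables a set of SendEnv patterns would select. It is a simplified illustration, not a full ssh_config parser: it ignores Host and Match scoping, quoting, and negated patterns, and the function names are hypothetical. The sample configuration is an assumption modeled on what Debian and Ubuntu typically ship, since the actual file contents are not reproduced in this article.

```python
import fnmatch


def send_env_patterns(config_text):
    """Collect the variable-name patterns listed on SendEnv lines."""
    patterns = []
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Keyword and arguments may be separated by whitespace or "=".
        words = line.replace("=", " ", 1).split()
        # ssh_config keywords are case-insensitive.
        if words[0].lower() == "sendenv":
            patterns.extend(words[1:])
    return patterns


def forwarded_variables(config_text, environment):
    """Return the variables from `environment` whose names match a SendEnv pattern."""
    patterns = send_env_patterns(config_text)
    return {
        name: value
        for name, value in environment.items()
        if any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns)
    }


# Assumed sample configuration and client environment, for illustration only.
SAMPLE_CONFIG = """
Host *
    SendEnv LANG LC_*
    HashKnownHosts yes
"""
SAMPLE_ENV = {"LANG": "C.UTF-8", "LC_CTYPE": "C.UTF-8", "HOME": "/root"}

print(forwarded_variables(SAMPLE_CONFIG, SAMPLE_ENV))
```

The server still decides whether to accept what is sent: sshd only applies forwarded variables that its own AcceptEnv setting allows.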
Mina SSH settings
Because the project does not invoke ssh directly but has Java code use Mina SSH to run commands in the Python runtime's terminal, modifying /etc/ssh/ssh_config does not solve the problem.
A look at the Mina API shows that environment variables can be passed when creating a ChannelShell. The interface in the source code is as follows:
/**
* Create a channel to start a shell using specific PTY settings and/or environment.
*
* @param ptyConfig The PTY configuration to use - if {@code null} then internal defaults are used
* @param env Extra environment configuration to be transmitted to the server - ignored if
* {@code null}/empty.
* @return The created {@link ChannelShell}
* @throws IOException If failed to create the requested channel
*/
ChannelShell createShellChannel(
PtyChannelConfigurationHolder ptyConfig, Map<String, ?> env)
throws IOException;
When creating a ChannelShell, the language encoding can be set like this:
shellChannel = session.createShellChannel(new PtyChannelConfiguration(), Map.of("LANG","en_US.utf8"));