[Solved] Python 3 executed remotely over SSH: UnicodeEncodeError: 'ascii' codec can't encode characters

Background

Recently, a project required executing Python code remotely over SSH in a CentOS 7 container running Python 3.6. Opening a file with a Chinese name, written here as open('wolf.TXT') (the real file name is in Chinese), works without problems when run locally in the CentOS 7 container. When the same code is run over an SSH connection, however, open() reports the following error:

UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2

The error occurs because the Chinese file name passed to open() is encoded with the ASCII codec. ASCII defines only 128 characters (digits, uppercase and lowercase letters, and some special symbols) and contains no Chinese characters. [This article does not cover why the file name has to be encoded at all.]
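The failure can be reproduced without touching any file. The following is a minimal sketch; the file name 狼来了.txt is a hypothetical example chosen only because it starts with three Chinese characters, which is consistent with "position 0-2" in the error message.

```python
# Hypothetical file name: three Chinese characters followed by ".txt".
filename = "狼来了.txt"

def try_encode(name, encoding):
    """Return the encoded bytes, or the UnicodeEncodeError if encoding fails."""
    try:
        return name.encode(encoding)
    except UnicodeEncodeError as exc:
        return exc

ascii_result = try_encode(filename, "ascii")   # fails: ASCII has no Chinese characters
utf8_result = try_encode(filename, "utf-8")    # succeeds: UTF-8 can represent them

print(ascii_result)
print(utf8_result)
```

The ASCII attempt returns an error that names the 'ascii' codec and the positions of the characters it could not encode, while the UTF-8 attempt returns a bytes object.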

Therefore, the code runs normally after the following change:

open('Wolf_is_coming.txt'.encode('utf8'))

This works because the file name is explicitly encoded as UTF-8 before it is passed to open() (of course, the runtime environment must support UTF-8).
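To see which encoding Python would otherwise pick for file names, it can help to print the relevant settings once locally and once over SSH and compare the two. This is a small diagnostic sketch using only the standard library; the function name encoding_report is made up for this example.

```python
import locale
import os
import sys

def encoding_report():
    """Collect the settings that influence how Python encodes file names."""
    return {
        "filesystem_encoding": sys.getfilesystemencoding(),
        "preferred_encoding": locale.getpreferredencoding(False),
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
        "LC_CTYPE": os.environ.get("LC_CTYPE"),
    }

report = encoding_report()
for key, value in report.items():
    print(f"{key}: {value}")
```

If the filesystem encoding differs between the local run and the SSH run, the difference in the LANG and LC_* values usually points to the cause.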

Now that the essence of the problem is clear, we can look at the SSH side.

Language encoding environment

Python runtime container environment

In the Python runtime container, the locale and locale -a commands show the language encoding environment:

# locale -a
C
en_US.utf8
POSIX
# locale
LANG=en_US.utf8
LC_CTYPE="en_US.utf8"
LC_NUMERIC="en_US.utf8"
LC_TIME="en_US.utf8"
LC_COLLATE="en_US.utf8"
LC_MONETARY="en_US.utf8"
LC_MESSAGES="en_US.utf8"
LC_PAPER="en_US.utf8"
LC_NAME="en_US.utf8"
LC_ADDRESS="en_US.utf8"
LC_TELEPHONE="en_US.utf8"
LC_MEASUREMENT="en_US.utf8"
LC_IDENTIFICATION="en_US.utf8"
LC_ALL=

Language encoding environment of the SSH client

The language encoding environment of the SSH client is as follows:

# locale -a
C
C.UTF-8
POSIX
# locale
LANG=C.UTF-8
LANGUAGE=
LC_CTYPE="C.UTF-8"
LC_NUMERIC="C.UTF-8"
LC_TIME="C.UTF-8"
LC_COLLATE="C.UTF-8"
LC_MONETARY="C.UTF-8"
LC_MESSAGES="C.UTF-8"
LC_PAPER="C.UTF-8"
LC_NAME="C.UTF-8"
LC_ADDRESS="C.UTF-8"
LC_TELEPHONE="C.UTF-8"
LC_MEASUREMENT="C.UTF-8"
LC_IDENTIFICATION="C.UTF-8"
LC_ALL=

Configure the language encoding environment

Comparing the two shows that the SSH client uses C.UTF-8, while the Python runtime environment uses en_US.utf8.

(PS: C stands for ASCII, while en_US.utf8 and zh_CN.utf8 can represent Chinese characters.)
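A likely explanation for the difference is that the client's C.UTF-8 locale is simply not installed in the container, so a session that requests it ends up in the plain C locale. The sketch below checks a requested locale against a list of installed ones; the helper names normalize and is_available are made up for this example, and the list is copied from the container's locale -a output shown earlier.

```python
def normalize(locale_name):
    """Lower-case the codeset and drop hyphens, so 'en_US.UTF-8' matches 'en_US.utf8'."""
    base, dot, codeset = locale_name.partition(".")
    return base + dot + codeset.lower().replace("-", "")

def is_available(requested, installed):
    """Check whether a requested locale appears in a list of installed locales."""
    wanted = normalize(requested)
    return any(normalize(name) == wanted for name in installed)

# Output of `locale -a` in the Python runtime container.
container_locales = ["C", "en_US.utf8", "POSIX"]

print(is_available("C.UTF-8", container_locales))      # False
print(is_available("en_US.UTF-8", container_locales))  # True
```

The normalization is deliberately simple and only covers the spelling differences seen in this article (UTF-8 versus utf8).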

Our first thought was that the SSH client might affect the default encoding of the Python runtime, so we set the SSH client environment (here the client is Ubuntu) to the same locale as the Python runtime environment:

#!/bin/sh

# 1. Set the language to en_US.UTF-8
# (printf is used because echo -e is not portable under /bin/sh)
printf 'LANG="en_US.UTF-8"\nLANGUAGE="en_US:en"\n' >> /etc/default/locale
# Apply the configuration to the current shell ("source" is a bash-only builtin)
. /etc/default/locale

# 2. If the en_US.UTF-8 locale is missing, install the locales package and generate it
apt-get install --no-install-recommends -y locales
locale-gen en_US.UTF-8
localedef -v -c -i en_US -f UTF-8 en_US.UTF-8

After this change, we connected over SSH to the Python runtime container again, and open('wolf.TXT') ran without any error.

So the question is: why does the SSH session run with the client's encoding environment?

Research on SSH env mechanism

cat /etc/ssh/ssh_config 

The SSH client configuration file contains the following line (it is part of the default configuration on Ubuntu):

SendEnv LANG LC_*

This setting makes the SSH client send its local locale variables to the Python runtime environment. So instead of making the client's locale match the Python runtime environment, we can change this setting.

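You only need to modify the configuration. Note that SendEnv accepts only variable names, not values. To send a fixed value regardless of the client's own locale, OpenSSH 7.8 and later provide SetEnv (the server must still allow LANG through AcceptEnv):

SetEnv LANG=en_US.utf8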
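When the client configuration cannot be changed, another option is to set LANG as part of the remote command itself, which does not depend on the environment forwarding settings. The sketch below only builds the argument list; the host user@python-container, the script read_file.py, and the function name build_ssh_command are placeholders, and it assumes the remote login shell accepts a VAR=value prefix before a command.

```python
import shlex

def build_ssh_command(host, remote_command, lang="en_US.utf8"):
    """Build an ssh argument list that sets LANG on the remote side for one command."""
    remote = "LANG={} {}".format(shlex.quote(lang), remote_command)
    return ["ssh", host, remote]

# Placeholder host and script names, for illustration only.
command = build_ssh_command("user@python-container", "python3 read_file.py")
print(command)
```

The resulting list can be passed to subprocess.run() to execute the command.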

Mina SSH settings

Because the project does not call ssh directly but uses Java code that connects to the terminal of the Python runtime environment through Mina SSH, modifying /etc/ssh/ssh_config does not solve the problem.

Looking at Mina shows that parameters can be passed when creating a ChannelShell. The interface in the source code is as follows:

/**
 * Create a channel to start a shell using specific PTY settings and/or environment.
 *
 * @param  ptyConfig   The PTY configuration to use - if {@code null} then internal defaults are used
 * @param  env         Extra environment configuration to be transmitted to the server - ignored if
 *                     {@code null}/empty.
 * @return             The created {@link ChannelShell}
 * @throws IOException If failed to create the requested channel
 */
ChannelShell createShellChannel(
  PtyChannelConfigurationHolder ptyConfig, Map<String, ?> env)
  throws IOException;

When creating a ChannelShell, the language encoding can be configured like this:

shellChannel = session.createShellChannel(new PtyChannelConfiguration(), Map.of("LANG","en_US.utf8"));
