Category Archives: Linux

[Solved] SYSTEM.credentials.domains.root.ASM.Self.XXXX.root not found

1. Symptom

An Oracle 12.2.0.1.0 cluster cannot be started after a shutdown.

The CRS status check shows ora.storage stuck in STARTING, and the other resources cannot start:

[root@db2 ~]# /oracle/product/12.2.0.1/grid/bin/crsctl status res -t -init
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details       
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       xd1archdb2               STABLE
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       xd1archdb2               STABLE
ora.crf
      1        ONLINE  OFFLINE                               STABLE
ora.crsd
      1        ONLINE  OFFLINE                               STABLE
ora.cssd
      1        ONLINE  ONLINE       xd1archdb2               STABLE
ora.cssdmonitor
      1        ONLINE  ONLINE       xd1archdb2               STABLE
ora.ctssd
      1        ONLINE  ONLINE       xd1archdb2               OBSERVER,STABLE
ora.diskmon
      1        OFFLINE OFFLINE                               STABLE
ora.driver.afd
      1        ONLINE  ONLINE       xd1archdb2               STABLE
ora.drivers.acfs
      1        ONLINE  ONLINE       xd1archdb2               STABLE
ora.evmd
      1        ONLINE  INTERMEDIATE xd1archdb2               STABLE
ora.gipcd
      1        ONLINE  ONLINE       xd1archdb2               STABLE
ora.gpnpd
      1        ONLINE  ONLINE       xd1archdb2               STABLE
ora.mdnsd
      1        ONLINE  ONLINE       xd1archdb2               STABLE
ora.storage
      1        ONLINE  OFFLINE      xd1archdb2               STARTING      <===================
--------------------------------------------------------------------------------

The last entry in the CRS alert log is:

[ORAROOTAGENT(199820)]CRS-5019: All OCR locations are on ASM disk groups [OCR_VOTE], and none of these disk groups are mounted. Details are at "(:CLSN00140:)" in "/oracle/gridbase/diag/crs/xd1archdb2/crs/trace/ohasd_orarootagent_root.trc".

That trace file shows:


2021-11-24 11:15:34.101 : USRTHRD:1556182784: {0:5:3} 8154 Error 4 querying length of attr ASM_STATIC_DISCOVERY_ADDRESS

2021-11-24 11:15:34.136 : CLSCRED:1556182784: (:CLSCRED1079:)clsCredOcrKeyExists: Obj dom : SYSTEM.credentials.domains.root.ASM.Self.9d0ad57d52f57f81bf9bdc78d36d559f.root not found
2021-11-24 11:15:34.136 : USRTHRD:1556182784: {0:5:3} 7872 Error 4 opening dom root in 0x7fc828518480

2021-11-24 11:15:35.190 : default:1556182784: clsCredDomClose: Credctx deleted 0x7fc828228ed0
2021-11-24 11:15:36.207 :   CLSNS:1556182784: clsns_SetTraceLevel:trace level set to 1.
2021-11-24 11:15:36.210 : default:1556182784: Inited LSF context: 0x7fc828321c50 
2021-11-24 11:15:36.214 : CLSCRED:1556182784: clsCredCommonInit: Inited singleton credctx.
2021-11-24 11:15:36.214 : CLSCRED:1556182784: (:CLSCRED0101:)clsCredDomInitRootDom: Using user given storage context for repository access.
2021-11-24 11:15:36.237 : USRTHRD:1556182784: {0:5:3} 8154 Error 4 querying length of attr ASM_DISCOVERY_ADDRESS

2021-11-24 11:15:36.241 : USRTHRD:1556182784: {0:5:3} 8154 Error 4 querying length of attr ASM_STATIC_DISCOVERY_ADDRESS

2021-11-24 11:15:36.276 : CLSCRED:1556182784: (:CLSCRED1079:)clsCredOcrKeyExists: Obj dom : SYSTEM.credentials.domains.root.ASM.Self.9d0ad57d52f57f81bf9bdc78d36d559f.root not found
2021-11-24 11:15:36.276 : USRTHRD:1556182784: {0:5:3} 7872 Error 4 opening dom root in 0x7fc82834a180

2021-11-24 11:15:37.361 : default:1556182784: clsCredDomClose: Credctx deleted 0x7fc828228ed0
2021-11-24 11:15:37.361 : USRTHRD:1556182784: {0:5:3} -- trace dump on error exit --

2021-11-24 11:15:37.361 : USRTHRD:1556182784: {0:5:3} Error [kgfoAl06] in [kgfokge] at kgfo.c:3115

2021-11-24 11:15:37.361 : USRTHRD:1556182784: {0:5:3} ORA-12547: TNS:lost contact
ORA-12547: TNS:lost contact
ORA-15077: could not locate ASM instance serving a required diskgroup


2021-11-24 11:15:37.361 : USRTHRD:1556182784: {0:5:3} Category: 7

2021-11-24 11:15:37.361 : USRTHRD:1556182784: {0:5:3} DepInfo: 12547

2021-11-24 11:15:37.361 : USRTHRD:1556182784: {0:5:3} -- trace dump end --

The trace reports "SYSTEM.credentials.domains.root.ASM.Self.9d0ad57d52f57f81bf9bdc78d36d559f.root not found" followed by "ORA-15077: could not locate ASM instance serving a required diskgroup", meaning the agent cannot connect to the ASM instance. Although crsctl reports ora.asm as ONLINE, ASM is not actually running:

[root@db2 trace]# ps -ef | grep asm
root     206099 200334  0 11:17 pts/2    00:00:00 grep --color=auto asm

2. Cause analysis

ASM did not actually start, so the disk group holding the OCR could not be located.

3. Solution

Start the ASM instance manually, running as the ASM owner:

sqlplus / as sysasm

startup

4. Root cause

This version supports Flex ASM, and this environment runs in Flex mode:

[grid@db1 ~]$  asmcmd
ASMCMD> showclustermode
ASM cluster : Flex mode enabled
ASMCMD> exit
[grid@db1 ~]$ srvctl config asm
ASM home: <CRS home>
Password file: +OCR_VOTE/orapwASM
Backup of Password file: 
ASM listener: LISTENER
ASM instance count: 3
Cluster ASM listener: ASMNET1LSNR_ASM

When an ASM instance starts under Flex ASM, all ASM networks need to be reachable.

Check the ASM listener:

[grid@db1 ~]$ lsnrctl status ASMNET1LSNR_ASM

LSNRCTL for Linux: Version 12.2.0.1.0 - Production on 24-NOV-2021 19:00:17

Copyright (c) 1991, 2016, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=ASMNET1LSNR_ASM)))
STATUS of the LISTENER
------------------------
Alias                     ASMNET1LSNR_ASM
Version                   TNSLSNR for Linux: Version 12.2.0.1.0 - Production
Start Date                24-NOV-2021 11:10:01
Uptime                    0 days 7 hr. 50 min. 15 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /oracle/product/12.2.0.1/grid/network/admin/listener.ora
Listener Log File         /oracle/gridbase/diag/tnslsnr/xd1archdb1/asmnet1lsnr_asm/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=ASMNET1LSNR_ASM)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.*.*.111)(PORT=1526)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.*.*.111)(PORT=1526)))
The listener supports no services
The command completed successfully

No services are registered with the ASM listener.

Check the listener log:

Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.*.*.111)(PORT=1526)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.*.*.111)(PORT=1526)))
24-NOV-2021 11:10:03 * (ADDRESS=(PROTOCOL=ipc)(KEY=ASMNET1LSNR_ASM)) * service_register * LsnrAgt * 0
2021-11-24T11:10:06.129013+08:00
24-NOV-2021 11:10:06 * (CONNECT_DATA=(CID=(PROGRAM=)(HOST=db1)(USER=grid))(COMMAND=status)(ARGUMENTS=64)(SERVICE=ASMNET1LSNR_ASM)(VERSION=203424000)) * status * 0
2021-11-24T11:10:08.041279+08:00
Incoming connection from 10.*.*.111 rejected 
24-NOV-2021 11:10:08 * 12546
TNS-12546: TNS:permission denied
 TNS-12560: TNS:protocol adapter error
  TNS-00516: Permission denied

Connections from the other node were rejected as well:

2021-11-24T11:13:50.958237+08:00
Incoming connection from 10.*.*.112 rejected 
24-NOV-2021 11:13:50 * 12546
TNS-12546: TNS:permission denied
 TNS-12560: TNS:protocol adapter error
  TNS-00516: Permission denied

Connections from both .111 and .112 are rejected, which matches the ORA-12547 error in the CRS log on node 112. Checking the whitelist in sqlnet.ora shows that the address is not in the ACL.

To fix this, add the missing address to sqlnet.ora and restart the ASM listener so the change takes effect.
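As a rough aid for the whitelist check, the sketch below parses the text of a sqlnet.ora file and reports which addresses are missing from the list. It assumes the whitelist is expressed with the TCP.INVITED_NODES parameter in a single parenthesized list; the function names, the sample file content, and the 10.0.0.x addresses are placeholders, since the real addresses are masked in the logs above.

```python
import re


def invited_nodes(sqlnet_text):
    """Return the entries of TCP.INVITED_NODES, or [] if the parameter is absent."""
    match = re.search(
        r"^\s*tcp\.invited_nodes\s*=\s*\(([^)]*)\)",
        sqlnet_text,
        re.IGNORECASE | re.MULTILINE,
    )
    if not match:
        return []
    return [node.strip() for node in match.group(1).split(",") if node.strip()]


def missing_nodes(sqlnet_text, required):
    """Return the required hosts/addresses that are not in the invited list."""
    allowed = {node.lower() for node in invited_nodes(sqlnet_text)}
    return [node for node in required if node.lower() not in allowed]


# Placeholder content; a real file would be read from the Grid home's network/admin directory.
sample = """
tcp.validnode_checking = yes
tcp.invited_nodes = (10.0.0.110, db1)
"""

print(missing_nodes(sample, ["10.0.0.111", "10.0.0.112"]))
```

Any address the function returns would need to be added to the list before restarting the listener.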

[Solved] Docker error: Error starting daemon: Error initializing network controller: list bridge addresses failed: no available network

Errors:

1. # systemctl restart docker

Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details.

2. # systemctl status docker.service

● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
Active: failed (Result: start-limit) since Thu 2021-11-25 18:12:33 CST; 1min 3s ago
Docs: https://docs.docker.com
Main PID: 35104 (code=exited, status=1/FAILURE)
Nov 25 18:12:33 xx:xx:xx:xx systemd[1]: Failed to start Docker Application Container Engine.
Nov 25 18:12:33 xx:xx:xx:xx systemd[1]: Unit docker.service entered failed state.
Nov 25 18:12:33 xx:xx:xx:xx systemd[1]: docker.service failed.
Nov 25 18:12:33 xx:xx:xx:xx systemd[1]: docker.service holdoff time over, scheduling restart.
Nov 25 18:12:33 xx:xx:xx:xx systemd[1]: Stopped Docker Application Container Engine.
Nov 25 18:12:33 xx:xx:xx:xx systemd[1]: start request repeated too quickly for docker.service
Nov 25 18:12:33 xx:xx:xx:xx systemd[1]: Failed to start Docker Application Container Engine.
Nov 25 18:12:33 xx:xx:xx:xx systemd[1]: Unit docker.service entered failed state.
Nov 25 18:12:33 xx:xx:xx:xx systemd[1]: docker.service failed.

3. # journalctl -xe

Nov 25 18:58:43 xx:xx:xx:xx dockerd[30363]: time="2021-11-25T18:58:43.204191101+08:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" modu
Nov 25 18:58:43 xx:xx:xx:xx dockerd[30363]: time="2021-11-25T18:58:43.204305943+08:00" level=info msg="ccResolverWrapper: sending new addresses to cc: [{unix:///var/ru
Nov 25 18:58:43 xx:xx:xx:xx dockerd[30363]: time="2021-11-25T18:58:43.204335414+08:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Nov 25 18:58:43 xx:xx:xx:xx dockerd[30363]: time="2021-11-25T18:58:43.204428835+08:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc42053b820, CONNE
Nov 25 18:58:43 xx:xx:xx:xx dockerd[30363]: time="2021-11-25T18:58:43.204743716+08:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc42053b820, READY
Nov 25 18:58:43 xx:xx:xx:xx dockerd[30363]: time="2021-11-25T18:58:43.204781727+08:00" level=info msg="Loading containers: start."
Nov 25 18:58:43 xx:xx:xx:xx dockerd[30363]: Error starting daemon: Error initializing network controller: list bridge addresses failed: no available network
Nov 25 18:58:43 xx:xx:xx:xx systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE
Nov 25 18:58:43 xx:xx:xx:xx systemd[1]: Failed to start Docker Application Container Engine.
-- Subject: Unit docker.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
…………

Solution:

Run the following in a terminal to create the docker0 bridge manually:

ip link add name docker0 type bridge
ip addr add dev docker0 172.17.0.1/16

Then start Docker again:

1) systemctl start docker
2) docker info | grep 'Server Version'
3) systemctl enable docker

Done!
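To confirm that the bridge exists before starting Docker, one option is to look for it under sysfs. This is a minimal sketch assuming a Linux host with /sys mounted; the function name bridge_exists and its sysfs_root parameter are illustrative.

```python
import os


def bridge_exists(name="docker0", sysfs_root="/sys/class/net"):
    """Return True if the interface exists and the kernel exposes it as a bridge."""
    # Bridge devices have a "bridge" subdirectory under their sysfs entry.
    return os.path.isdir(os.path.join(sysfs_root, name, "bridge"))


print("docker0 bridge present:", bridge_exists())
```

If this prints False, the ip link command above has not been run or did not succeed.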

[Solved] JVM Error: Failed to write core dump. Core dumps have been disabled.(jar was Forced to Exit)

The JVM exits with: Failed to write core dump. Core dumps have been disabled.

Under high concurrency in a big data workload, the Linux server reports "fork: retry: Resource temporarily unavailable", and the JVM writes a crash log with a name like hs_err_pid74299.log.

By default the core file size limit on Linux is 0, which is why core dumps are disabled. Raising that limit enables core dumps but does not fix the underlying problem.

The root cause is that the limits on open files and on processes for the user running the application are too low; the default here is 4096.

Raise them in /etc/security/limits.conf:

vi /etc/security/limits.conf

* soft nofile 327680
* hard nofile 327680
hdfs soft nproc 131072
hdfs hard nproc 131072
mapred soft nproc 131072
mapred hard nproc 131072
hbase soft nproc 131072
hbase hard nproc 131072
zookeeper soft nproc 131072
zookeeper hard nproc 131072
hive soft nproc 131072
hive hard nproc 131072
root soft nproc 131072
root hard nproc 131072
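Changes to limits.conf are applied when a new login session starts, so it can be worth checking what a process actually sees. The sketch below uses Python's standard resource module on Linux; the function name current_limits is illustrative.

```python
import resource


def current_limits():
    """Return the (soft, hard) limits for open files and processes of this process."""
    names = {
        "nofile": resource.RLIMIT_NOFILE,
        "nproc": resource.RLIMIT_NPROC,
    }
    return {name: resource.getrlimit(limit) for name, limit in names.items()}


for name, (soft, hard) in current_limits().items():
    # A value of resource.RLIM_INFINITY means "unlimited".
    print(f"{name}: soft={soft} hard={hard}")
```

Run it as the user that launches the application, from a fresh session, and compare the values with those configured above.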

[Solved] G++ undefined reference / _GLIBCXX_USE_CXX11_ABI compilation error: std::basic_string and std::__cxx11::basic_string do not match

1. Reason:

Starting with GCC 5, std::string is compiled by default with the new C++11 ABI as std::__cxx11::basic_string<char>. If a library you call was built without the C++11 ABI, its std::string is actually std::basic_string<char>. Passing a new-ABI string into such a library produces errors such as: cannot convert 'const std::__cxx11::basic_string<char>' to 'const char*', or an undefined reference at link time.

In the GCC installation directory, go to include/c++/5.4.0 and open x86_64-unknown-linux-gnu/bits/c++config.h (or /usr/include/x86_64-linux-gnu/c++/5/bits/c++config.h) to see the key macro definitions:

#if _GLIBCXX_USE_CXX11_ABI
namespace std
{
  inline namespace __cxx11 __attribute__((__abi_tag__ ("cxx11"))) { }
}
namespace __gnu_cxx
{
  inline namespace __cxx11 __attribute__((__abi_tag__ ("cxx11"))) { }
}
# define _GLIBCXX_NAMESPACE_CXX11 __cxx11::
# define _GLIBCXX_BEGIN_NAMESPACE_CXX11 namespace __cxx11 {
# define _GLIBCXX_END_NAMESPACE_CXX11 }
# define _GLIBCXX_DEFAULT_ABI_TAG _GLIBCXX_ABI_TAG_CXX11
#else
# define _GLIBCXX_NAMESPACE_CXX11
# define _GLIBCXX_BEGIN_NAMESPACE_CXX11
# define _GLIBCXX_END_NAMESPACE_CXX11
# define _GLIBCXX_DEFAULT_ABI_TAG
#endif

Then check bits/basic_string.h:

#if _GLIBCXX_USE_CXX11_ABI
_GLIBCXX_BEGIN_NAMESPACE_CXX11
// line 52~2441
_GLIBCXX_END_NAMESPACE_CXX11
#else  // !_GLIBCXX_USE_CXX11_ABI
  // Reference-counted COW string implentation
  // ...
#endif

In other words, with an old GCC or with -D_GLIBCXX_USE_CXX11_ABI=0, std::string is the old std::basic_string; with a newer GCC's defaults, std::string is std::__cxx11::basic_string. So if the libraries being linked were built with different GCC versions or different ABI options, errors like the following occur:

libboost_regex.so.1.72.0: undefined reference to `std::__cxx11::messages<char> const& std::use_facet<std::__cxx11::messages<char> >(std::locale const&)@GLIBCXX_3.4.21'

2. If std::string is not involved and you build the library from your own code, you can add the cxx11 inline namespace yourself at compile time, as follows:

#  define DUAL_ABI cxx11 __attribute__((abi_tag("cxx11")))

namespace ClassA {
  inline namespace DUAL_ABI {
    // library goes here
  }
}

This can resolve errors such as undefined reference to `std::__cxx11::funa()`.

3. Otherwise, the only option is to build with the matching compile option or the matching GCC version.

4. Commands for inspecting compiled symbols

Use strings to list symbols and c++filt to demangle them:

strings libfuna.so | grep init
#_ZN12TensorRT_SDK4initESsi 
c++filt _ZN12TensorRT_SDK4initESsi
#TensorRT_SDK::init(std::basic_string<char, std::char_traits<char>, std::allocator<char> >, int)
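The mangled name itself hints at which ABI a symbol was built with: new-ABI symbols carry the __cxx11 namespace or the cxx11 ABI tag (encoded as B5cxx11). The sketch below is a simple substring check, not a demangler; the function name is illustrative, and the second symbol is my own illustration of how the same signature would look under the new ABI rather than output taken from the library above.

```python
def uses_cxx11_abi(mangled):
    """Return True if a mangled symbol name references the C++11 ABI.

    New-ABI symbols contain either the std::__cxx11 namespace
    (e.g. "St7__cxx11") or the abi_tag encoding "B5cxx11".
    """
    return "__cxx11" in mangled or "B5cxx11" in mangled


# Symbol from the strings output above: "Ss" is the old std::string.
old_abi = "_ZN12TensorRT_SDK4initESsi"
# Illustrative new-ABI form of the same signature (an assumption, see lead-in).
new_abi = "_ZN12TensorRT_SDK4initENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEi"

for symbol in (old_abi, new_abi):
    print(symbol, "->", "cxx11 ABI" if uses_cxx11_abi(symbol) else "no cxx11 marker")
```

A symbol without either marker is not proof of the old ABI, since a function that takes no std::string or std::list carries no marker in either case.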

[Solved] Navicat error 2059 when connecting to MySQL (Tomcat deployment on Linux): authentication plugin 'caching_sha2_password' cannot be loaded

The error occurs because Navicat 15.0.26 does not support caching_sha2_password, the authentication plugin that MySQL 8.0 uses by default, so the user's authentication plugin has to be changed. First log in to MySQL:

mysql -uroot -p

My environment:

Mysql 8.0.27
Navicat 15.0.26
tomcat 9.0.54
jdk 1.8.0.65

Check the default authentication plugin:

show variables like 'default_authentication_plugin';

In the mysql command-line client, switch to the mysql system database by entering use mysql;

Query the user table for each account's current authentication plugin:

select user,host,plugin,authentication_string from user;

Switch the user to mysql_native_password, the plugin used by earlier MySQL versions:

ALTER USER 'root'@'localhost' IDENTIFIED WITH mysql_native_password BY 'root';

Then connect with Navicat again.

Windows: How to Solve Git Pull Error

Error content:

Unable to negotiate with x.x.x.x port 22: no matching host key type found. Their offer: ssh-rsa
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

The Windows machine's public key had already been added to the server and pulls worked at first, but then they suddenly started failing on Windows. This typically happens after the SSH client is upgraded, because OpenSSH 8.8 and later disable the ssh-rsa signature algorithm by default.

Solution:

Go to C:\Users\Administrator\.ssh (this is the path on my computer; yours may differ).

Create a file named config in this directory with the following content:

Host x.x.x.x
    HostkeyAlgorithms +ssh-rsa
    PubkeyAcceptedAlgorithms +ssh-rsa

Note that x.x.x.x here stands for the IP address of your Git server.

Pull the code again and it works.
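If the config file already exists, the block should be appended rather than overwriting the file. The sketch below works on the file's text and adds the block only when no Host entry for that server is present; the function name and the 192.0.2.10 address are placeholders, and an existing Host entry is left untouched for manual editing.

```python
def ensure_ssh_rsa_block(config_text, host):
    """Return config_text with an ssh-rsa block for host appended.

    If a "Host" line already names this host, the text is returned unchanged.
    """
    for line in config_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0].lower() == "host" and host in parts[1:]:
            return config_text
    block = (
        f"Host {host}\n"
        "    HostkeyAlgorithms +ssh-rsa\n"
        "    PubkeyAcceptedAlgorithms +ssh-rsa\n"
    )
    # Make sure the new block starts on its own line.
    separator = "" if not config_text or config_text.endswith("\n") else "\n"
    return config_text + separator + block


# 192.0.2.10 is a placeholder for the Git server's address.
print(ensure_ssh_rsa_block("", "192.0.2.10"))
```

Calling it a second time on its own output leaves the text unchanged, so it is safe to rerun.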

[Solved] /bin/bash: /us/rbin/jdk1.8.0/bin/java: No such file or directory

Running a MapReduce program fails with the following error:

Container exited with a non-zero exit code 127. Error file: prelaunch.err
/bin/bash: /us/rbin/jdk1.8.0/bin/java: No such file or directory

Reason:

YARN's Java path is not configured, or is configured incorrectly.

Solution:

vim hadoop/etc/hadoop/yarn-env.sh

Change JAVA_HOME to the correct path, for example:

export JAVA_HOME=/usr/bin/jdk1.8.0

yarn-env.sh must be updated on every machine. You can also use scp to copy the file to the other machines.

Suppose there are four machines, c1 through c4, and you are currently on c1:

cd hadoop/etc/hadoop
scp yarn-env.sh c2:/hadoop/etc/hadoop
scp yarn-env.sh c3:/hadoop/etc/hadoop
scp yarn-env.sh c4:/hadoop/etc/hadoop

*The change does not take effect until Hadoop is restarted.
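Since the original error came from a mistyped path, it may help to confirm on each machine that the configured JAVA_HOME really contains an executable bin/java before restarting. This is a minimal sketch; the function name is illustrative and the paths are the ones quoted in this post.

```python
import os


def java_home_is_valid(java_home):
    """Return True if java_home contains an executable bin/java."""
    java_bin = os.path.join(java_home, "bin", "java")
    return os.path.isfile(java_bin) and os.access(java_bin, os.X_OK)


# The mistyped path from the error message and the corrected example path.
for candidate in ("/us/rbin/jdk1.8.0", "/usr/bin/jdk1.8.0"):
    print(candidate, "->", "ok" if java_home_is_valid(candidate) else "bin/java not found")
```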

fatal error: bits/c++config.h: No such file or directory / fatal error: bits/libc-header-start.h: No such file or directory

fatal error: bits/libc-header-start.h: No such file or directory

sudo apt-get install gcc-multilib

fatal error: bits/c++config.h: No such file or directory

sudo apt-get install g++-multilib