Category Archives: Error

error: cannot lock ref ‘refs/remotes/origin/xxx’: is at `someHash` but expected `someOtherHash`

When pulling code, Git reports the error: cannot lock ref 'refs/remotes/origin/xxx': is at someHash but expected someOtherHash

Fix: run git remote prune origin to prune stale remote-tracking refs so that the local view of the remote matches it again.

The original author notes that the command should be run from the repository's root directory; if it is run elsewhere, the local refs reportedly may not be refreshed properly.
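
A minimal sketch of the whole fix (the repository path below is hypothetical):

cd /path/to/your/repo       # run from the repository root
git remote prune origin     # drop stale remote-tracking refs
git pull                    # pull again; the lock error should be gone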

fatal: unable to access ‘https://github.com/XXXX.git/’: OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to github.com:443

$ git push guaiguai master
fatal: unable to access 'https://github.com/torchstar/20211202.git/': OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to github.com:443

Solution: remove the Git proxy configuration with the following commands:

git config --global --unset http.proxy

git config --global --unset https.proxy
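
You can confirm that the proxy settings are really gone and then retry the push; git config --get prints nothing once a key has been removed:

git config --global --get http.proxy     # should print nothing
git config --global --get https.proxy    # should print nothing
git push guaiguai master                 # retry the push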

  

How to Solve the web-app Tag Error in web.xml

The error message:

The content of element type “web-app” must match……

Cause of the problem

The child tags inside <web-app> were not written in the order required by the web-app_2_3.dtd specification.

Solution:

1. Follow the specification, i.e. write the child tags of <web-app> in the order listed in the error message. (recommended)

2. Remove the DTD declaration from web.xml altogether.

How to Solve PaddlePaddle Multi-process Error: "(External) CUDA error(3), initialization error."

When training a model with PaddlePaddle, the GPU memory sometimes stays occupied after training finishes, which interferes with the next training run. To release the GPU memory automatically once training ends, you can borrow TensorFlow's trick of releasing memory by using a separate process: run the PaddlePaddle training in a child process, so that GPU resources are freed automatically when that process exits.

However, when training PaddlePaddle models with multiprocessing in this way, you may still occasionally run into the following error:

CUDA error(3), initialization error.

According to the discussion in PaddlePaddle's GitHub issue, the fix is to import all paddle-related modules inside the child process (i.e. inside the function that multiprocessing runs), and not to import them in the parent process. With the imports moved into the subprocess, the code runs normally and the corresponding GPU resources are released automatically once the process finishes.
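
The actual fix is just the import placement described above. If you want to confirm that GPU memory really is released once the training subprocess exits, you can watch the memory usage from the shell (assuming nvidia-smi is available on the machine):

watch -n 1 nvidia-smi    # memory usage should drop back down after the training process exits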

Reference:

Releasing GPU memory in TensorFlow by running the work in a separate process – Zhihu

Single GPU multi-process error · Issue #2241 · PaddlePaddle/PaddleDetection · GitHub

multiprocessing — Process-based parallelism — Python 3.7.12 documentation

How to Solve Arthas startup Error

Solutions for errors reported after starting Arthas:

1. Permission problem: if the target Java process runs as an ordinary (non-root) user, Arthas must be started as that same user in order to attach to the process.

2. A connection error is reported after startup:

The required packages are missing and need to be installed:

# telnet client
yum install telnet

# telnet server
yum install telnet-server

# xinetd daemon
yum install xinetd

If the error is still reported:

yum -y install tftp-server
yum -y install xinetd

/bin/systemctl enable xinetd.service    # enable the service
/bin/systemctl start xinetd.service     # start the service

Start command:

java -jar arthas-boot.jar --target-ip 0.0.0.0
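
If the connection error persists, it can help to confirm that xinetd is running and that the Arthas console is actually reachable (assuming Arthas's default telnet port 3658):

systemctl status xinetd      # confirm the daemon is running
telnet 127.0.0.1 3658        # try connecting to the Arthas console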

[Solved] k8s Deploy postgresql Error: initdb: error: directory “/var/lib/postgresql/data” exists but is not empty

While recently deploying GitLab on Kubernetes with Ceph as the underlying storage, PostgreSQL failed with the following error:

$ kubectl -n gitlab logs postgresql-76d969dd56-m79wm
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.utf8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

initdb: error: directory "/var/lib/postgresql/data" exists but is not empty
It contains a lost+found directory, perhaps due to it being a mount point.
Using a mount point directly as the data directory is not recommended.
Create a subdirectory under the mount point.

PostgreSQL yaml file:

# pvc
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgresql-ceph-pvc
  namespace: gitlab
spec:
  storageClassName: ceph-leffss
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
 
---
apiVersion: v1
kind: Service
metadata:
  name: postgresql
  namespace: gitlab
  labels:
    app: gitlab
    tier: postgreSQL
spec:
  ports:
    - port: 5432
  selector:
    app: gitlab
    tier: postgreSQL

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgresql
  namespace: gitlab
  labels:
    app: gitlab
    tier: postgreSQL
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gitlab
      tier: postgreSQL
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: gitlab
        tier: postgreSQL
    spec:
      #nodeSelector:
      #  gee/disk: "500g"
      containers:
        - image: hub.leffss.com/library/postgres:12.6-alpine
        #- image: harbor.boge.com/library/postgres:12.6-alpine
          name: postgresql
          env:
            - name: POSTGRES_USER
              value: gitlab
            - name: POSTGRES_DB
              value: gitlabhq_production
            - name: POSTGRES_PASSWORD
              value: bogeusepg
            - name: TZ
              value: Asia/Shanghai
          ports:
            - containerPort: 5432
              name: postgresql
          livenessProbe:
            exec:
              command:
              - sh
              - -c
              - exec pg_isready -U gitlab -h 127.0.0.1 -p 5432 -d gitlabhq_production
            initialDelaySeconds: 110
            timeoutSeconds: 5
            failureThreshold: 6
          readinessProbe:
            exec:
              command:
              - sh
              - -c
              - exec pg_isready -U gitlab -h 127.0.0.1 -p 5432 -d gitlabhq_production
            initialDelaySeconds: 20
            timeoutSeconds: 3
            periodSeconds: 5
#          resources:
#            requests:
#              cpu: 100m
#              memory: 512Mi
#            limits:
#              cpu: "1"
#              memory: 1Gi
          volumeMounts:
            - name: postgresql
              mountPath: /var/lib/postgresql/data
      volumes:
        - name: postgresql
          persistentVolumeClaim:
            claimName: postgresql-ceph-pvc

Solution: add a subPath parameter to the volumeMounts entry:

  volumeMounts:
    - name: postgresql
      mountPath: /var/lib/postgresql/data
      subPath: data
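
A quick way to roll the change out and confirm the fix (the manifest filename postgresql.yaml is assumed; because the Deployment uses the Recreate strategy, the old pod is replaced):

kubectl apply -f postgresql.yaml          # re-apply the updated manifest
kubectl -n gitlab get pods -w             # wait for the new postgresql pod to become Ready
kubectl -n gitlab logs deploy/postgresql  # initdb should now run without the error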

[Solved] gdb peda Install Plug-in Error: Error in sourced command file: Undefined command: “from”.

Cause

The PEDA plug-in for GDB was installed from the official repository https://github.com/longld/peda, but after running the gdb command the plug-in is not loaded.

After troubleshooting, it turns out that GDB reports the following error on startup:

Error in sourced command file: Undefined command: “from”.

Solution

Simply reinstall GDB:

sudo apt install gdb

[Solved] Hadoop Error: The directory item limit is exceeded: limit=1048576 items=1048576

Problem Description:

The scheduling system kept failing to execute a Hive task. The errors reported were as follows:

java.io.IOException: java.net.ConnectException: Call From #hostname/#ip to #hostname:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused

Caused by: java.net.ConnectException: Call From #hostname/#ip to #hostname:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused

CONSOLE# Ended Job = job_1638255473937_0568 with exception 'java.io.IOException(java.net.ConnectException: Call From #hostname/#ip to #hostname:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused)'

CONSOLE# FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. java.net.ConnectException: Call From #hostname/#ip to #hostname:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused

The specific problem cannot be seen from this message, nor from any of the logs on the server itself; it only becomes visible in the YARN logs.

Get the application ID from the scheduling system (application_1638255473937_0568), then look up the corresponding log information in HDFS.

View yarn log information:

[hdfs@centos hadoop27]$ yarn logs -applicationId application_1638255473937_0568

Key error message:

org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): The directory item limit of /tmp/hadoop-yarn/staging/history/done_intermediate/hdfs is exceeded: limit=1048576 items=1048576

Cause of the error:

A single HDFS directory contains more than 1048576 items; 1048576 is the default limit, so the limit needs to be increased.

Solution 1:

Add the configuration parameter dfs.namenode.fs-limits.max-directory-items to the hdfs-site.xml configuration file, and increase the parameter value.

Push the configuration file to all nodes of the Hadoop cluster and restart the Hadoop service.
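
After the restart, you can check the configured value from a cluster node if you want to confirm the change (hdfs getconf reads it from the local configuration files):

hdfs getconf -confKey dfs.namenode.fs-limits.max-directory-items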

Solution 2:

If it is inconvenient to modify the configuration and restart the Hadoop cluster services, you can instead delete this directory first: /tmp/hadoop-yarn/staging/history/done_intermediate/hdfs

Then rebuild the directory.

hadoop fs -rm -r /tmp/hadoop-yarn/staging/history/done_intermediate/hdfs
hadoop fs -mkdir /tmp/hadoop-yarn/staging/history/done_intermediate/hdfs

The number of files in this directory exceeded the limit because the Hadoop cluster had not been running the JobHistory Server, so historical job log information was never cleaned up.

Extended information:

1: How to view the yarn log storage directory and log details

1: View it through the JobHistory Server web UI (in this case http://IP:8801/jobhistory).

2: View it with the yarn command (run it as the same user that submitted the task)

2.1: yarn application -list -appStates ALL

2.2: yarn logs -applicationId application_1638255473937_0568

3: View the logs directly from their HDFS path (they are stored in an HDFS directory, not in a local log directory on the CentOS system)

3.1: check the yarn-site.xml file and confirm the log configuration directory.

    <property>
        <name>yarn.nodemanager.remote-app-log-dir</name>
        <value>/data1/hadoop27/logs</value>
    </property>

3.2: view log file information

[hdfs@centos hadoop]$ hdfs dfs -ls /data1/hadoop27/logs/hdfs/logs/application_1638255473937_0568
Found 1 items
-rw-r-----   2 hdfs hdfs      66188 2021-11-30 20:24 /data1/hadoop27/logs/hdfs/logs/application_1638255473937_0568/centos.pp1.db_46654

3.3: view log details

3.3.1: yarn logs -applicationId application_1638255473937_0568 (same as 2.2 above)

3.3.2: hdfs dfs -cat /data1/hadoop27/logs/hdfs/logs/application_1638255473937_0568/centos.pp1.db_46654    ## view the log with -cat

3.3.3: hdfs dfs -cat /data1/hadoop27/logs/hdfs/logs/application_1638255473937_0568/centos.pp1.db_46654 > tmp.log    ## save the output to tmp.log in the current directory

3.3.4: hdfs dfs -get /data1/hadoop27/logs/hdfs/logs/application_1638255473937_0568/centos.pp1.db_46654    ## download the file from HDFS to the current directory with -get, then view it locally

2: HDFS operation commands:

2.1: count the number of folders and files in a given HDFS directory.

[hdfs@centos hadoop]$ hadoop fs -count /tmp/hadoop-yarn/staging/history/done_intermediate/hdfs
           1            1048576             3253261451467 /tmp/hadoop-yarn/staging/history/done_intermediate/hdfs

The first value (1) is the directory count.

The second value (1048576) is the number of files in this directory.

The third value (3253261451467) is the total size, in bytes, of all files in the directory.