Linux Server: How to Check for ECC Memory Errors

The ipmitool package is required:

[root@Resource ~]# yum install ipmitool

First, list the system event log (SEL) and check whether it contains any ECC errors, as shown below:

[root@Resource ~]# ipmitool sel list
   1 | 11/26/2016 | 05:21:07 | Event Logging Disabled #0x72 | Log area reset/cleared | Asserted
   2 | 11/26/2016 | 05:30:54 | OS Boot | C: boot completed | Asserted
   3 | 11/26/2016 | 05:30:54 | OEM record dc | 000137 | 00001e395800
   4 | 02/14/2017 | 16:58:06 | Physical Security #0x73 | General Chassis intrusion () | Asserted
   5 | 02/14/2017 | 16:58:11 | Physical Security #0x73 | General Chassis intrusion () | Deasserted
   6 | 02/14/2017 | 16:58:15 | Power Supply #0x74 | Redundancy Lost | Asserted
   7 | 02/14/2017 | 17:24:43 | Physical Security #0x73 | General Chassis intrusion () | Asserted
   8 | 02/14/2017 | 17:29:56 | Physical Security #0x73 | General Chassis intrusion () | Asserted
   9 | 02/14/2017 | 17:40:14 | Physical Security #0x73 | General Chassis intrusion () | Asserted
   a | 02/14/2017 | 17:40:40 | Unknown #0x2e |  | Asserted
   b | 02/14/2017 | 17:40:40 | Memory #0x02 | Uncorrectable ECC (UnCorrectable ECC |  DIMMB3) | Asserted
   c | 02/14/2017 | 17:40:40 | Unknown #0x2e |  | Asserted
   d | 02/14/2017 | 17:40:40 | Memory #0x02 | Uncorrectable ECC (UnCorrectable ECC |  DIMMB3) | Asserted
   e | 02/14/2017 | 17:42:26 | Physical Security #0x73 | General Chassis intrusion () | Asserted
   f | 02/14/2017 | 17:42:56 | Unknown #0x2e |  | Asserted
  10 | 02/14/2017 | 17:42:56 | Memory #0x02 | Uncorrectable ECC (UnCorrectable ECC |  DIMMB3) | Asserted
  11 | 02/14/2017 | 17:42:56 | Unknown #0x2e |  | Asserted
  12 | 02/14/2017 | 17:42:56 | Memory #0x02 | Uncorrectable ECC (UnCorrectable ECC |  DIMMB3) | Asserted
  13 | 02/14/2017 | 17:44:24 | Physical Security #0x73 | General Chassis intrusion () | Asserted
  14 | 02/14/2017 | 17:44:49 | Unknown #0x2e |  | Asserted
  15 | 02/14/2017 | 17:44:49 | Memory #0x02 | Uncorrectable ECC (UnCorrectable ECC |  DIMMB3) | Asserted
  16 | 02/14/2017 | 17:44:49 | Unknown #0x2e |  | Asserted
  17 | 02/14/2017 | 17:44:49 | Memory #0x02 | Uncorrectable ECC (UnCorrectable ECC |  DIMMB3) | Asserted
  18 | 02/14/2017 | 17:48:39 | Physical Security #0x73 | General Chassis intrusion () | Asserted
  19 | 02/15/2017 | 11:37:24 | Physical Security #0x73 | General Chassis intrusion () | Asserted
  1a | 02/15/2017 | 11:37:29 | Physical Security #0x73 | General Chassis intrusion () | Deasserted
  1b | 02/16/2017 | 01:41:13 | Drive Slot #0xa1 | Drive Present () | Deasserted
  1c | 02/16/2017 | 01:41:14 | Drive Slot #0xa3 | Drive Present () | Deasserted
  1d | 02/16/2017 | 01:41:15 | Drive Slot #0xa2 | Drive Present () | Deasserted
  1e | 02/16/2017 | 04:23:43 | Drive Slot #0xa1 | Drive Present () | Asserted
  1f | 02/16/2017 | 04:23:43 | Drive Slot #0xa3 | Drive Present () | Asserted
  20 | 02/16/2017 | 04:23:45 | Drive Slot #0xa0 | Drive Present () | Deasserted
  21 | 02/16/2017 | 04:23:45 | Drive Slot #0xa2 | Drive Present () | Asserted
  22 | 02/16/2017 | 04:25:49 | Drive Slot #0xa0 | Drive Present () | Asserted
  23 | 07/10/2017 | 07:27:14 | Temperature #0x04 | Upper Non-critical going high | Asserted
  24 | 07/10/2017 | 10:00:12 | Temperature #0x04 | Upper Non-critical going high | Deasserted
  25 | 07/10/2017 | 10:01:37 | Temperature #0x04 | Upper Non-critical going high | Asserted
  26 | 07/10/2017 | 10:26:07 | Temperature #0x04 | Upper Non-critical going high | Deasserted
  27 | 11/09/2017 | 06:09:42 | Physical Security #0x73 | General Chassis intrusion () | Asserted
  28 | 11/09/2017 | 06:12:32 | Physical Security #0x73 | General Chassis intrusion () | Deasserted
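
On a long event log it can help to filter for the ECC entries and count them per DIMM. The following is a minimal sketch, not part of the original procedure: the function name `ecc_by_dimm` is hypothetical, and it assumes the DIMM label appears in the description as `DIMM` followed by letters or digits, as in the output above. Other vendors may label slots differently.

```shell
# Summarise ECC events per DIMM label.
# Reads `ipmitool sel list` output on stdin, so it also works on a saved copy.
# Assumption: slot labels look like "DIMMB3"; adjust the pattern for your board.
ecc_by_dimm() {
    grep -i 'ecc' | grep -o 'DIMM[A-Za-z0-9]*' | sort | uniq -c
}

# On the server itself (requires ipmitool and access to the BMC):
#   ipmitool sel list | ecc_by_dimm
```

Each output line is a count followed by the DIMM label, which makes a repeatedly failing module easy to spot.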

 

Save the event log to a text file, as shown below:

[root@Resource ~]# ipmitool sel save SN12345.txt
1 | 11/26/2016 | 05:21:07 | Event Logging Disabled #0x72 | Log area reset/cleared | Asserted
2 | 11/26/2016 | 05:30:54 | OS Boot | C: boot completed | Asserted
3 | 11/26/2016 | 05:30:54 | OEM record dc | 000137 | 00001e395800
4 | 02/14/2017 | 16:58:06 | Physical Security #0x73 | General Chassis intrusion () | Asserted
5 | 02/14/2017 | 16:58:11 | Physical Security #0x73 | General Chassis intrusion () | Deasserted
6 | 02/14/2017 | 16:58:15 | Power Supply #0x74 | Redundancy Lost | Asserted
7 | 02/14/2017 | 17:24:43 | Physical Security #0x73 | General Chassis intrusion () | Asserted
8 | 02/14/2017 | 17:29:56 | Physical Security #0x73 | General Chassis intrusion () | Asserted
9 | 02/14/2017 | 17:40:14 | Physical Security #0x73 | General Chassis intrusion () | Asserted
a | 02/14/2017 | 17:40:40 | Unknown #0x2e | | Asserted
b | 02/14/2017 | 17:40:40 | Memory #0x02 | Uncorrectable ECC (UnCorrectable ECC | DIMMB3) | Asserted
c | 02/14/2017 | 17:40:40 | Unknown #0x2e | | Asserted
d | 02/14/2017 | 17:40:40 | Memory #0x02 | Uncorrectable ECC (UnCorrectable ECC | DIMMB3) | Asserted
e | 02/14/2017 | 17:42:26 | Physical Security #0x73 | General Chassis intrusion () | Asserted
f | 02/14/2017 | 17:42:56 | Unknown #0x2e | | Asserted
10 | 02/14/2017 | 17:42:56 | Memory #0x02 | Uncorrectable ECC (UnCorrectable ECC | DIMMB3) | Asserted
11 | 02/14/2017 | 17:42:56 | Unknown #0x2e | | Asserted
12 | 02/14/2017 | 17:42:56 | Memory #0x02 | Uncorrectable ECC (UnCorrectable ECC | DIMMB3) | Asserted
13 | 02/14/2017 | 17:44:24 | Physical Security #0x73 | General Chassis intrusion () | Asserted
14 | 02/14/2017 | 17:44:49 | Unknown #0x2e | | Asserted
15 | 02/14/2017 | 17:44:49 | Memory #0x02 | Uncorrectable ECC (UnCorrectable ECC | DIMMB3) | Asserted
16 | 02/14/2017 | 17:44:49 | Unknown #0x2e | | Asserted
17 | 02/14/2017 | 17:44:49 | Memory #0x02 | Uncorrectable ECC (UnCorrectable ECC | DIMMB3) | Asserted
18 | 02/14/2017 | 17:48:39 | Physical Security #0x73 | General Chassis intrusion () | Asserted
19 | 02/15/2017 | 11:37:24 | Physical Security #0x73 | General Chassis intrusion () | Asserted
1a | 02/15/2017 | 11:37:29 | Physical Security #0x73 | General Chassis intrusion () | Deasserted
1b | 02/16/2017 | 01:41:13 | Drive Slot #0xa1 | Drive Present () | Deasserted
1c | 02/16/2017 | 01:41:14 | Drive Slot #0xa3 | Drive Present () | Deasserted
1d | 02/16/2017 | 01:41:15 | Drive Slot #0xa2 | Drive Present () | Deasserted
1e | 02/16/2017 | 04:23:43 | Drive Slot #0xa1 | Drive Present () | Asserted
1f | 02/16/2017 | 04:23:43 | Drive Slot #0xa3 | Drive Present () | Asserted
20 | 02/16/2017 | 04:23:45 | Drive Slot #0xa0 | Drive Present () | Deasserted
21 | 02/16/2017 | 04:23:45 | Drive Slot #0xa2 | Drive Present () | Asserted
22 | 02/16/2017 | 04:25:49 | Drive Slot #0xa0 | Drive Present () | Asserted
23 | 07/10/2017 | 07:27:14 | Temperature #0x04 | Upper Non-critical going high | Asserted
24 | 07/10/2017 | 10:00:12 | Temperature #0x04 | Upper Non-critical going high | Deasserted
25 | 07/10/2017 | 10:01:37 | Temperature #0x04 | Upper Non-critical going high | Asserted
26 | 07/10/2017 | 10:26:07 | Temperature #0x04 | Upper Non-critical going high | Deasserted
27 | 11/09/2017 | 06:09:42 | Physical Security #0x73 | General Chassis intrusion () | Asserted
28 | 11/09/2017 | 06:12:32 | Physical Security #0x73 | General Chassis intrusion () | Deasserted
[root@Resource ~]# ll
total 64
-rw-r--r--. 1 root root 3220 Nov 17 14:32 SN12345.txt

View the contents of the saved file:

 

[root@Resource ~]# cat SN12345.txt 
0x04 0x10 0x72 0x6f 0x02 0xff 0xff # Event Logging Disabled #0x72 Log area reset/cleared
0x04 0x1f 0x00 0x6f 0x01 0xff 0xff # OS Boot #0x00 C: boot completed
0x37 0x00 0x00 0x1e 0x39 0x58 0x00 # Reserved #0x00 Unknown
0x04 0x05 0x73 0x6f 0x80 0x02 0xff # Physical Security #0x73 General Chassis intrusion ()
0x04 0x05 0x73 0xef 0x80 0x02 0xff # Physical Security #0x73 General Chassis intrusion ()
0x04 0x08 0x74 0x0b 0x01 0xff 0xff # Power Supply #0x74 Redundancy Lost
0x04 0x05 0x73 0x6f 0x80 0x02 0xff # Physical Security #0x73 General Chassis intrusion ()
0x04 0x05 0x73 0x6f 0x80 0x02 0xff # Physical Security #0x73 General Chassis intrusion ()
0x04 0x05 0x73 0x6f 0x80 0x02 0xff # Physical Security #0x73 General Chassis intrusion ()
0x04 0xc1 0x2e 0x72 0xa2 0x04 0x00 # Unknown #0x2e Unknown
0x04 0x0c 0x02 0x6f 0xa1 0xc1 0x40 # Memory #0x02 Uncorrectable ECC (UnCorrectable ECC | DIMMB3)
0x04 0xc1 0x2e 0x72 0xa2 0x04 0x00 # Unknown #0x2e Unknown
0x04 0x0c 0x02 0x6f 0xa1 0xc1 0x40 # Memory #0x02 Uncorrectable ECC (UnCorrectable ECC | DIMMB3)
0x04 0x05 0x73 0x6f 0x80 0x02 0xff # Physical Security #0x73 General Chassis intrusion ()
0x04 0xc1 0x2e 0x72 0xa2 0x04 0x00 # Unknown #0x2e Unknown
0x04 0x0c 0x02 0x6f 0xa1 0xc1 0x40 # Memory #0x02 Uncorrectable ECC (UnCorrectable ECC | DIMMB3)
0x04 0xc1 0x2e 0x72 0xa2 0x04 0x00 # Unknown #0x2e Unknown
0x04 0x0c 0x02 0x6f 0xa1 0xc1 0x40 # Memory #0x02 Uncorrectable ECC (UnCorrectable ECC | DIMMB3)
0x04 0x05 0x73 0x6f 0x80 0x02 0xff # Physical Security #0x73 General Chassis intrusion ()
0x04 0xc1 0x2e 0x72 0xa2 0x04 0x00 # Unknown #0x2e Unknown
0x04 0x0c 0x02 0x6f 0xa1 0xc1 0x40 # Memory #0x02 Uncorrectable ECC (UnCorrectable ECC | DIMMB3)
0x04 0xc1 0x2e 0x72 0xa2 0x04 0x00 # Unknown #0x2e Unknown
0x04 0x0c 0x02 0x6f 0xa1 0xc1 0x40 # Memory #0x02 Uncorrectable ECC (UnCorrectable ECC | DIMMB3)
0x04 0x05 0x73 0x6f 0x80 0x02 0xff # Physical Security #0x73 General Chassis intrusion ()
0x04 0x05 0x73 0x6f 0x80 0x02 0xff # Physical Security #0x73 General Chassis intrusion ()
0x04 0x05 0x73 0xef 0x80 0x02 0xff # Physical Security #0x73 General Chassis intrusion ()
0x04 0x0d 0xa1 0xef 0xa0 0x01 0x01 # Drive Slot #0xa1 Drive Present ()
0x04 0x0d 0xa3 0xef 0xa0 0x01 0x03 # Drive Slot #0xa3 Drive Present ()
0x04 0x0d 0xa2 0xef 0xa0 0x01 0x02 # Drive Slot #0xa2 Drive Present ()
0x04 0x0d 0xa1 0x6f 0xa0 0x01 0x01 # Drive Slot #0xa1 Drive Present ()
0x04 0x0d 0xa3 0x6f 0xa0 0x01 0x03 # Drive Slot #0xa3 Drive Present ()
0x04 0x0d 0xa0 0xef 0xa0 0x01 0x00 # Drive Slot #0xa0 Drive Present ()
0x04 0x0d 0xa2 0x6f 0xa0 0x01 0x02 # Drive Slot #0xa2 Drive Present ()
0x04 0x0d 0xa0 0x6f 0xa0 0x01 0x00 # Drive Slot #0xa0 Drive Present ()
0x04 0x01 0x04 0x01 0x57 0xaa 0xaa # Temperature #0x04 Upper Non-critical going high
0x04 0x01 0x04 0x81 0x57 0xa7 0xaa # Temperature #0x04 Upper Non-critical going high
0x04 0x01 0x04 0x01 0x57 0xaa 0xaa # Temperature #0x04 Upper Non-critical going high
0x04 0x01 0x04 0x81 0x57 0xa7 0xaa # Temperature #0x04 Upper Non-critical going high
0x04 0x05 0x73 0x6f 0x80 0x02 0xff # Physical Security #0x73 General Chassis intrusion ()
0x04 0x05 0x73 0xef 0x80 0x01 0xff # Physical Security #0x73 General Chassis intrusion ()

Find the entries that report an ECC error and note the hexadecimal code in front of the description:

0x04 0x0c 0x02 0x6f 0xa1 0xc1 0x40 # Memory #0x02 Uncorrectable ECC (UnCorrectable ECC | DIMMB3)

 

Use these hexadecimal codes to determine the location of the faulty memory module, then replace that module.

How the codes map to a physical slot is defined in the technical documentation from the server manufacturer, because the mapping differs from model to model.
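
To make the hexadecimal codes easier to read, the seven bytes can be labelled by field. The sketch below assumes the saved lines follow the standard IPMI event layout (event message revision, sensor type, sensor number, event direction and type, then three event data bytes). The function name `decode_sel_line` is hypothetical. It deliberately does not translate the last two bytes into a slot name, since that part is vendor-specific.

```shell
# Label the bytes of one line from the file written by `ipmitool sel save`.
# Assumption: standard IPMI event layout; OEM records use a different layout.
# Note: plain POSIX sh, so the helper variables below are global.
decode_sel_line() {
    # Drop the trailing "# description" comment, then split the seven bytes
    # into the positional parameters $1..$7.
    bytes=${1%%#*}
    set -- $bytes
    dir_type=$4
    # Bit 7 of the fourth byte is the event direction (1 = deassertion).
    if [ $(( $dir_type & 0x80 )) -ne 0 ]; then
        direction=Deasserted
    else
        direction=Asserted
    fi
    # Low 7 bits of byte 4: event type. Low nibble of byte 5: event offset.
    printf 'sensor_type=%s sensor_num=%s event_type=0x%02x direction=%s offset=0x%x data2=%s data3=%s\n' \
        "$2" "$3" $(( $dir_type & 0x7f )) "$direction" $(( $5 & 0x0f )) "$6" "$7"
}

decode_sel_line '0x04 0x0c 0x02 0x6f 0xa1 0xc1 0x40 # Memory #0x02 Uncorrectable ECC (UnCorrectable ECC | DIMMB3)'
```

For the ECC line this reports sensor type 0x0c (Memory) with offset 0x1 (uncorrectable ECC). The `data2` and `data3` values are the ones to look up in the manufacturer's documentation.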
