background
Some disks cannot be aggregated by multipath. Running multipath -v2 prints the following messages:
reject: 363c9d56100e06a30eed2649d00000075 undef HUAWEI ,XSG1
size=200G features='0' hwhandler='0' wp=undef
`-+- policy='service-time 0' prio=-1 status=undef
|- 7:0:4:8 sdaq 66:160 undef ready running
`- 7:0:3:8 sdl 8:176 undef ready running
Aug 25 02:24:55 | sdar: alua not supported
Aug 25 02:24:55 | sdm: alua not supported
reject: 363c9d56100e06a30eed2655600000076 undef HUAWEI ,XSG1
size=200G features='0' hwhandler='0' wp=undef
`-+- policy='service-time 0' prio=-1 status=undef
|- 7:0:4:9 sdar 66:176 undef ready running
`- 7:0:3:9 sdm 8:192 undef ready running
Here is what happened. A SAN storage array used by an Oracle RAC database was about to be taken offline and replaced with new Huawei storage. I unplugged the fiber of one port of the host's HBA card from the fibre switch serving the old storage and plugged it into the fibre switch serving the new storage, allocated the LUNs on the storage, and scanned for disks on the host. The disks were recognized normally and were aggregated by multipath with the default rules.
The problem appeared after I added alias rules for these LUNs to the configuration file /etc/multipath.conf: reloading or restarting multipath could no longer aggregate the new disks, even after the configuration file was restored, while the LUNs of the old storage still aggregated normally. Running multipath -v2 showed the reject messages above. A search of the Red Hat knowledge base showed that the size attribute of the new LUN was inconsistent with that of the original LUN. Restarting the host solved the problem. The following is the official analysis.
Multipath device is rejected and subsequently not created; size discard and domap failed to add paths
environment
Red Hat Enterprise Linux 5
Red Hat Enterprise Linux 6
Red Hat Enterprise Linux 7
device-mapper-multipath
problem
Multipath does not create a multipath device when the native mpio and SAN configuration is such that a device should be created.
resolution
Rescanning the SCSI devices, deleting and rescanning them, or rebooting the system should resolve this problem:
# echo 1 > /sys/block/sdX/device/rescan
or
# echo 1 > /sys/block/sdX/device/delete
# rescan-scsi-bus.sh -a <<<-------requires sg3_utils to be installed
Rescan for multipath devices:
# multipath -v2
# multipath -ll
Verify the previously rejected multipath device has now been created.
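A LUN usually has several paths, so the rescan has to be repeated for every sd device behind it. The loop below is a minimal sketch of that step, not part of the official article. The function name rescan_paths and the SYSFS_ROOT variable are made up for this illustration (SYSFS_ROOT exists only so the function can be tried against a scratch directory); the rescan attribute is the same one written to in the commands above.

```shell
#!/bin/sh
# Hypothetical helper: ask the kernel to re-read each listed sd device.
# SYSFS_ROOT is an assumption of this sketch; it defaults to the real /sys.
rescan_paths() {
    root="${SYSFS_ROOT:-/sys}"
    rc=0
    for dev in "$@"; do
        attr="$root/block/$dev/device/rescan"
        if [ -w "$attr" ]; then
            # Same effect as: echo 1 > /sys/block/sdX/device/rescan
            echo 1 > "$attr"
            echo "rescanned $dev"
        else
            echo "skipped $dev: $attr is not writable" >&2
            rc=1
        fi
    done
    return $rc
}
```

For the first rejected LUN in the background section this would be called as rescan_paths sdl sdaq (as root), followed by multipath -v2 to rebuild the map.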
The sd device(s) in sysfs contain stale size attributes, resulting in attributes that do not match the other sd devices associated with the LUN. When multipath checks the sd devices associated with a SAN LUN and detects size inconsistencies, it discards the sd devices associated with the LUN, rejects the LUN, and generates "domap failed" logs for the underlying sd devices.
From the code:
for (i = k + 1; i < VECTOR_SIZE(pathvec); i++) {
    pp2 = VECTOR_SLOT(pathvec, i);
    if (strcmp(pp1->wwid, pp2->wwid))
        continue;
    if (!pp2->size)
        continue;
    if (pp2->size != mpp->size) {
        /*
         * ouch, avoid feeding that to the DM
         */
        condlog(0, "%s: size %llu, expected %llu. "
            "Discard", pp2->dev_t, pp2->size,
            mpp->size);
        mpp->action = ACT_REJECT;
    }
    if (pp2->priority == PRIO_UNDEF)
        mpp->action = ACT_REJECT;
}
Check for reject and discards in the multipath output:
# multipath -v2
Mar 23 11:56:17 | 8:16: size 16780800, expected 134219520. Discard
Mar 23 11:56:17 | 8:26: size 16780800, expected 134219520. Discard
Mar 23 11:56:17 | 8:114: size 16780800, expected 134219520. Discard
Mar 23 11:56:17 | DM message failed [queue_if_no_path]
reject: mpathab (360000000000012345678901234567890) undef EMC ,SYMMETRIX
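The Discard lines identify each path only by its major:minor number (the dev_t printed by the condlog call), which is awkward when the rescan commands need an sdX name. As a rough sketch, the numbers can be translated through the /sys/dev/block/MAJOR:MINOR symlinks. The function name discarded_paths and the SYSFS_ROOT variable are assumptions of this example and do not come from the article or from multipath-tools.

```shell
#!/bin/sh
# Hypothetical helper: read `multipath -v2` output on stdin and print, for
# every "Discard" line, the dev_t, the kernel device name, and both sizes.
# SYSFS_ROOT is an assumption of this sketch; it defaults to the real /sys.
discarded_paths() {
    root="${SYSFS_ROOT:-/sys}"
    # A discard line looks like:
    #   Mar 23 11:56:17 | 8:16: size 16780800, expected 134219520. Discard
    sed -n 's/^.*| \([0-9]*:[0-9]*\): size \([0-9]*\), expected \([0-9]*\)\. Discard.*$/\1 \2 \3/p' |
    while read -r devt size expected; do
        link="$root/dev/block/$devt"
        if [ -e "$link" ]; then
            # The symlink points at the device directory, whose last
            # component is the kernel name (for example sdb).
            name=$(basename "$(readlink -f "$link")")
        else
            name="unknown"
        fi
        echo "$devt $name size=$size expected=$expected"
    done
}
```

A typical call would be multipath -v2 2>&1 | discarded_paths; any dev_t that no longer exists in sysfs is reported as unknown rather than guessed.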
If you look at the -v4 output, or check the size attributes in sysfs, at least one of the devices reports a size that does not match the others:
# multipath -v4
[...]
Mar 23 15:51:20 | sdab: size = 16780800
Mar 23 15:51:20 | sdbd: size = 134219520 <<<---
Mar 23 15:51:20 | sdca: size = 16780800
Mar 23 15:51:20 | sddi: size = 16780800
and
# for i in ab bd ca di; do cat /sys/block/sd$i/size; done
16780800
134219520 <<<---
16780800
16780800
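The same sysfs check can be wrapped in a small function that also signals a mismatch through its exit status, which is handy before and after a rescan. This is only a sketch: check_path_sizes and SYSFS_ROOT are names invented here. The one fact it relies on is that /sys/block/sdX/size is expressed in 512-byte sectors, so 2097152 sectors make 1 GiB.

```shell
#!/bin/sh
# Hypothetical helper: print the size of each listed sd device and return
# 1 if the sizes are not all equal (2 if a size could not be read).
# SYSFS_ROOT is an assumption of this sketch; it defaults to the real /sys.
check_path_sizes() {
    root="${SYSFS_ROOT:-/sys}"
    first=""
    rc=0
    for dev in "$@"; do
        size=$(cat "$root/block/$dev/size") || return 2
        # sysfs reports 512-byte sectors; 2097152 sectors = 1 GiB
        echo "$dev $size sectors (~$((size / 2097152)) GiB)"
        # Remember the first size and compare every later one against it.
        [ -n "$first" ] || first=$size
        [ "$size" = "$first" ] || rc=1
    done
    return $rc
}
```

With the values shown above, check_path_sizes sdab sdbd sdca sddi would report roughly 8 GiB for three paths and roughly 64 GiB for sdbd, and return 1. After a successful rescan, all paths of the LUN should report the same size.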