enable_group Segmentation fault #105

linzhanglong · 2024-11-24T11:08:59Z

Hello. In the do_check_path function, the following code:

if (pp->mpp->synced_count == 0) {  
    do_sync_mpp(vecs, pp->mpp);  
    /* if update_multipath_strings orphaned the path, quit early */  
    if (!pp->mpp)  
        return CHECK_PATH_SKIPPED;  
}

Should it be changed to the following?

if (pp->mpp->synced_count == 0) {  
    do_sync_mpp(vecs, pp->mpp);  
    /* if update_multipath_strings orphaned the path, quit early */  
    if (!pp->mpp || pp->mpp->need_reload)  /* proposed change */  
        return CHECK_PATH_SKIPPED;  
}

Otherwise, the subsequent code in the enable_group function may access the pgindex array out of bounds:

static void
enable_group(struct path * pp)
{
    struct pathgroup * pgp;

    if (!pp->mpp->pg || !pp->pgindex)
        return;

    pgp = VECTOR_SLOT(pp->mpp->pg, pp->pgindex - 1); /* possible out-of-bounds access here */

    if (pgp->status == PGSTATE_DISABLED) {
        condlog(2, "%s: enable group #%i", pp->mpp->alias, pp->pgindex);
        dm_enablegroup(pp->mpp->alias, pp->pgindex);
    }
}
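
To illustrate the failure mode described above, here is a minimal sketch of a bounds-checked lookup by 1-based pathgroup index. It is not the multipath-tools code: `pg_model`, `pgvec_model`, `MAX_PGS` and `pg_lookup()` are hypothetical stand-ins for the real vector and pathgroup types, used only to show why a stale `pgindex` must be rejected before the slot is read.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PGS 8 /* hypothetical capacity of the model vector */

/* Hypothetical stand-in for a pathgroup. */
struct pg_model {
	int status;
};

/* Hypothetical stand-in for the pathgroup vector of a map. */
struct pgvec_model {
	size_t allocated;               /* number of valid slots */
	struct pg_model *slot[MAX_PGS];
};

/*
 * Look up a pathgroup by its 1-based index.
 * Returns NULL for index 0 ("pathgroup unknown") and for any index
 * that points past the end of the vector, e.g. an index that became
 * stale because a pathgroup was removed after the index was stored.
 */
static struct pg_model *pg_lookup(const struct pgvec_model *v, int pgindex)
{
	if (v == NULL || pgindex <= 0)
		return NULL;
	assert(v->allocated <= MAX_PGS);
	if ((size_t)pgindex > v->allocated)
		return NULL;
	return v->slot[pgindex - 1];
}
```

With a check of this kind, a caller such as enable_group() could return early on a NULL result instead of dereferencing a slot outside the vector. Note that a bounds check alone does not catch an index that is still in range but now refers to a different pathgroup.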


mwilck · 2024-11-25T11:42:02Z

Thanks for the report.

Which multipath-tools version are you using?
Can you share the multipathd logs preceding the crash with us, please?
Are you able to reproduce the crash (for testing fixes)?

I've reviewed the code, and while it's true that update_pathvec_from_dm() may modify pp->mpp, I am not sure if your fix is correct. The problem is our lax handling of pp->pgindex in general. I'm going to send a patch to dm-devel.

linzhanglong · 2024-11-25T12:34:10Z

Hello, I'm not at the office right now; I will provide the information later. This issue occurs with very low probability in Active-Standby mode when you modify the LUN ID and then change it back. When multipathd detects a change in a path's WWID, update_pathvec_from_dm() removes the path and also removes the now-empty pathgroup, which can lead to this issue.

mwilck · 2024-11-25T14:17:30Z

No problem, take your time.

mwilck · 2024-11-25T14:35:32Z

I've sent a patch to [email protected]. Subject is "libmultipath: fix handling of pp->pgindex". You can inspect it here, too.

linzhanglong · 2024-11-27T10:07:39Z

Hello. Could the following code in the update_pathvec_from_dm function also lead to paths and empty path groups being removed, and thus cause the same issue?

/* If this fails, the device is not in sysfs */
pp->udev = get_udev_device(pp->dev_t, DEV_DEVT);

if (!pp->udev) {
    condlog(2, "%s: discarding non-existing path %s",
        mpp->alias, pp->dev_t);
    vector_del_slot(pgp->paths, j--);
    free_path(pp);
    must_reload = true;
    continue;
}

The logs related to my previous question:
2024-11-11 17:49:31.653829 err [multipathd:] 36b46e0810052656500146efa00000005: path 66:1312 WWID 36b46e0810052656512345678000103e8 doesn't match, removing from map
2024-11-11 17:49:31.653846 notice [multipathd:] 36b46e0810052656500146efa00000005: removing empty pathgroup 5
2024-11-11 17:49:31.653853 warning [multipathd:] 36b46e0810052656500146efa00000005: sdbbk - rdac checker reports path is up
2024-11-11 17:49:31.653860 notice [multipathd:] 128:1376: reinstated
2024-11-11 17:49:31.653867 notice [multipathd:] 36b46e0810052656500146efa00000005: remaining active paths: 6
2024-11-11 17:49:47.588389 notice [multipathd:] --------start up--------

pp->pgindex is set in disassemble_map() when a map is parsed. There are various possibilities for this index to become invalid. pp->pgindex is only used in enable_group() and followover_should_fallback(), and both callers take no action if it is 0, which is the right thing to do if we don't know the path's pathgroup.

Make sure pp->pgindex is reset to 0 in various places:
- when it's orphaned,
- before (re)grouping paths,
- when we detect a bad mpp assignment in update_pathvec_from_dm(),
- when a pathgroup is deleted in update_pathvec_from_dm(). In this case, pgindex needs to be invalidated for all paths in all pathgroups after the one that was deleted.

The hunk in group_paths is mostly redundant with the hunk in free_pgvec(), but because we're looping over pg->paths in the former and over pg->pgp in the latter, I think it's better to play safe.

Fixes: 99db1bd ("[multipathd] re-enable disabled PG when at least one path is up")
Fixes: opensvc#105
Signed-off-by: Martin Wilck <[email protected]>
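
The last case in the commit message (a pathgroup is deleted, so the stored indices of all later paths become stale) can be sketched as follows. This is a simplified model under stated assumptions, not the actual patch: `path_model`, `pg_model`, `map_model` and `delete_pathgroup()` are hypothetical names, and fixed-size arrays replace the real vector type.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PATHS 4 /* hypothetical limits for the model */
#define MAX_PGS   8

struct path_model {
	int pgindex; /* 1-based pathgroup index; 0 means "unknown" */
};

struct pg_model {
	size_t npaths;
	struct path_model *paths[MAX_PATHS];
};

struct map_model {
	size_t npgs;
	struct pg_model *pgs[MAX_PGS];
};

/*
 * Remove the pathgroup in 0-based slot i and invalidate pgindex for
 * every path whose pathgroup moved to a different slot as a result.
 */
static void delete_pathgroup(struct map_model *m, size_t i)
{
	size_t j, k;

	assert(i < m->npgs);

	/* Paths still listed in the deleted group (normally none, since
	 * the group is empty) no longer belong to any pathgroup. */
	for (k = 0; k < m->pgs[i]->npaths; k++)
		m->pgs[i]->paths[k]->pgindex = 0;

	/* Close the gap: each later group moves down by one slot, so the
	 * index stored in each of its paths is now wrong. Reset it to 0. */
	for (j = i; j + 1 < m->npgs; j++) {
		m->pgs[j] = m->pgs[j + 1];
		for (k = 0; k < m->pgs[j]->npaths; k++)
			m->pgs[j]->paths[k]->pgindex = 0;
	}
	m->npgs--;
}
```

Resetting to 0 rather than renumbering matches the commit message's reasoning that both users of pgindex take no action when it is 0; the correct value would then be set again the next time the map is parsed.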

mwilck · 2024-11-27T23:10:19Z

Yes. @bmarzins pointed out the same thing on dm-devel.

I have posted another patch series to the dm-devel mailing list ("[PATCH v2 0/8] multipath-tools fixes") with an updated fix that should cover your case.

The set is also on my tip branch.

mwilck · 2024-11-27T23:14:24Z

2024-11-11 17:49:31.653829 err [multipathd:] 36b46e0810052656500146efa00000005: path 66:1312 WWID 36b46e0810052656512345678000103e8 doesn't match, removing from map

You (or we) should investigate how it comes to pass that this path with a wrong WWID is in this map. The log history should provide some clue about it.

multipathd tries to work around this, but it represents some rather evil problem that has probably external reasons (a path has changed its WWID without being deleted and re-added).

linzhanglong · 2024-11-28T02:01:44Z

You (or we) should investigate how it comes to pass that this path with a wrong WWID is in this map. The log history should provide some clue about it.

multipathd tries to work around this, but it represents some rather evil problem that has probably external reasons (a path has changed its WWID without being deleted and re-added).

Okay.
When you modify the LUN ID or unmap this LUN on the storage side, the WWID of this LUN changes as seen from the host side.

mwilck · 2024-11-28T09:23:10Z

What storage type is it? You can't just have unmapped the LUN, you must have unmapped and re-mapped it, otherwise it wouldn't show up in the host any more, or am I missing something?

Anyway, testing the current patch set is more important now.

mwilck · 2024-11-29T11:34:29Z

When you modify the LUN ID or unmap this LUN on the storage side, the WWID of this LUN changes as seen from the host side.

And there are no uevents on the host when this happens?

linzhanglong · 2024-11-29T11:44:41Z

When you modify the LUN ID or unmap this LUN on the storage side, the WWID of this LUN changes as seen from the host side.

And there are no uevents on the host when this happens?

Yes, the storage LUN is not mounted, but it is mapped to the host. I modified the LUN ID on the storage side and then ran udevadm monitor --kernel --property, but no uevents were received. The storage is HUAWEI XSG1.

mwilck · 2024-11-29T11:48:24Z

Are there any kernel messages about changed LUNs or the like?

It is highly dangerous to swap a SCSI device in this way while a host is accessing it. If it isn't mounted, there's no immediate threat of data corruption, but still, I would strongly discourage doing it.

linzhanglong · 2024-11-29T11:54:11Z

Yes, this issue occurred when I modified the LUN ID, or unmapped the LUN, and then rolled the change back.
I ran these change-LUN-ID and unmap-LUN tests while I was analyzing that issue.

If the kernel issues a SCSI command (TUR) to the storage while the LUN ID is being modified, the kernel receives the return code from the storage and detects that the LUN data has changed. In this scenario a uevent is triggered, but this depends on timing. I tested many times and only occasionally received the uevent; in most cases I did not.

static void scsi_report_sense(struct scsi_device *sdev,
			      struct scsi_sense_hdr *sshdr)
{
	enum scsi_device_event evt_type = SDEV_EVT_MAXBITS;	/* i.e. none */

	if (sshdr->sense_key == UNIT_ATTENTION) {
		if (sshdr->asc == 0x3f && sshdr->ascq == 0x03) {
			evt_type = SDEV_EVT_INQUIRY_CHANGE_REPORTED;
			sdev_printk(KERN_WARNING, sdev,
				    "Inquiry data has changed");
		} else if (sshdr->asc == 0x3f && sshdr->ascq == 0x0e) {
			evt_type = SDEV_EVT_LUN_CHANGE_REPORTED;
			scsi_report_lun_change(sdev);  
			sdev_printk(KERN_WARNING, sdev,
				    "Warning! Received an indication that the "
				    "LUN assignments on this target have "
				    "changed. The Linux SCSI layer does not "
				    "automatically remap LUN assignments.\n");
		} else if (sshdr->asc == 0x3f)
			sdev_printk(KERN_WARNING, sdev,
				    "Warning! Received an indication that the "
				    "operating parameters on this target have "
				    "changed. The Linux SCSI layer does not "
				    "automatically adjust these parameters.\n");

		if (sshdr->asc == 0x38 && sshdr->ascq == 0x07) {
			evt_type = SDEV_EVT_SOFT_THRESHOLD_REACHED_REPORTED;
			sdev_printk(KERN_WARNING, sdev,
				    "Warning! Received an indication that the "
				    "LUN reached a thin provisioning soft "
				    "threshold.\n");
		}

		if (sshdr->asc == 0x29) {
			scsi_disk_reset_handler(sdev);
			evt_type = SDEV_EVT_POWER_ON_RESET_OCCURRED;
			sdev_printk(KERN_WARNING, sdev,
				    "Power-on or device reset occurred\n");
		}

		if (sshdr->asc == 0x2a && sshdr->ascq == 0x01) {
			evt_type = SDEV_EVT_MODE_PARAMETER_CHANGE_REPORTED;
			sdev_printk(KERN_WARNING, sdev,
				    "Mode parameters changed");
		} else if (sshdr->asc == 0x2a && sshdr->ascq == 0x06) {
			evt_type = SDEV_EVT_ALUA_STATE_CHANGE_REPORTED;
			sdev_printk(KERN_WARNING, sdev,
				    "Asymmetric access state changed");
		} else if (sshdr->asc == 0x2a && sshdr->ascq == 0x09) {
			evt_type = SDEV_EVT_CAPACITY_CHANGE_REPORTED;
			sdev_printk(KERN_WARNING, sdev,
				    "Capacity data has changed");
		} else if (sshdr->asc == 0x2a)
			sdev_printk(KERN_WARNING, sdev,
				    "Parameters changed");
	}

	if (evt_type != SDEV_EVT_MAXBITS) {
		set_bit(evt_type, sdev->pending_events);
		schedule_work(&sdev->event_work);
	}
}

mwilck · 2024-11-29T15:13:35Z

Ok, I understand.

Do you still observe the crash with the current patch set?

linzhanglong · 2024-11-30T02:13:18Z

Ok, I understand.

Do you still observe the crash with the current patch set?

Ok. I am testing. I have a small question: if this issue occurs and the pgindex is set to invalid, where can we ensure that the reload map will be triggered?

update_pathvec_from_dm() may set mpp->need_reload if it finds inconsistent settings. In this case, the map should be reloaded, but so far we don't do this reliably. Add a call to reload_and_sync_map() to do_sync_mpp() to clear this kind of inconsistency. In order to avoid endless reload loops, limit the number of retries to 1.

Fixes: opensvc#105
Signed-off-by: Martin Wilck <[email protected]>
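
The idea of reloading when a sync finds an inconsistency, but retrying only once, can be sketched as below. This is a toy model and not the code from the commit: `map_model`, `sync_model()`, `reload_model()` and `sync_with_limited_reload()` are hypothetical, and the `inconsistent_syncs` counter only simulates how many consecutive syncs would still report a problem.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct map_model {
	bool need_reload;       /* set by a sync that found an inconsistency */
	int reload_calls;       /* how often the map was reloaded */
	int inconsistent_syncs; /* simulation: syncs that still find a problem */
};

/* Simulated sync: flags need_reload while inconsistencies remain. */
static void sync_model(struct map_model *m)
{
	if (m->inconsistent_syncs > 0) {
		m->inconsistent_syncs--;
		m->need_reload = true;
	} else {
		m->need_reload = false;
	}
}

/* Simulated reload: only counts the call. */
static void reload_model(struct map_model *m)
{
	m->reload_calls++;
}

/*
 * Sync the map; if the sync asks for a reload, reload and sync again,
 * but at most once. Returns true if the map ended up consistent.
 */
static bool sync_with_limited_reload(struct map_model *m)
{
	int retries = 1; /* retry limit, as described in the commit message */

	assert(m != NULL);
	for (;;) {
		sync_model(m);
		if (!m->need_reload)
			return true;
		if (retries-- <= 0)
			return false; /* give up rather than loop forever */
		reload_model(m);
	}
}
```

If the map is still inconsistent after the single retry, the function gives up and leaves need_reload set, so a persistently broken map cannot cause an endless reload loop.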

mwilck · 2024-12-02T10:04:25Z

I have a small question: if this issue occurs and the pgindex is set to invalid, where can we ensure that the reload map will be triggered?

That question isn't small. We currently don't. It's a subtle matter because we must avoid spurious map reloads, and endless reload loops. The difficult part is where we can safely reload the map. I've double-checked, and I think that do_sync_mpp() is the correct place to attempt a reload like this.

I have just pushed another commit (ab60145) to my "tip" branch. I'm curious to see if it works for your case (i.e. causes a map reload for the broken map).

@bmarzins, your opinion about that commit would also be highly appreciated, as you've made lots of changes around the checkerloop recently.

mwilck · 2024-12-02T11:41:27Z

Do you observe kernel messages like this?

LUN assignments on this target have changed. The Linux SCSI layer does not automatically remap LUN assignments.

Also, could you run udevadm monitor -k -p -s scsi when the LUN assignment changes, and see if the kernel sends any notification about the changed LUN assignments to user space?

mwilck · 2024-12-02T11:56:48Z

Note that the fact that multipathd receives no notifications about SCSI Unit Attention (UA) events is a long-standing problem. We have missing links on multiple levels here.
We might get a UA, but not necessarily on the path that had changed, it can be some other path device belonging to the same SCSI target. Even if the UA is received, it doesn't trigger a target rescan by the kernel, and even if the rescan is done, it doesn't trigger a block-level uevent, even if the device ID changes.

mwilck · 2024-12-02T11:59:06Z

@linzhanglong, can you describe your test procedure in detail? You change a LUN assignment on the storage side, and then what do you do on the host side?

linzhanglong · 2024-12-02T14:59:36Z

Do you observe kernel messages like this?
LUN assignments on this target have changed. The Linux SCSI layer does not automatically remap LUN assignments.
Also, could you run udevadm monitor -k -p -s scsi when the LUN assignment changes, and see if the kernel sends any notification about the changed LUN assignments to user space?

Hello, I just tested with a single LUN. The reason I previously reported not receiving the uevent was that I was filtering with sdx. In fact, the uevents were received, and they are as follows:

KERNEL[22778.730765] change   /devices/pci0000:3a/0000:3a:00.0/0000:3b:00.1/host15/rport-15:0-13/target15:0:7/15:0:7:1 (scsi)
ACTION=change
DEVPATH=/devices/pci0000:3a/0000:3a:00.0/0000:3b:00.1/host15/rport-15:0-13/target15:0:7/15:0:7:1
DEVTYPE=scsi_device
DRIVER=sd
MODALIAS=scsi:t-0x00
SDEV_UA=REPORTED_LUNS_DATA_HAS_CHANGED
SEQNUM=18718
SUBSYSTEM=scsi
UDEV_LOG=6
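
As a rough illustration of how a user-space listener could recognize this event, the sketch below scans a list of "KEY=value" properties like the ones printed above and checks for SUBSYSTEM=scsi together with SDEV_UA=REPORTED_LUNS_DATA_HAS_CHANGED. The helper names `uevent_prop()` and `is_lun_change_event()` are hypothetical and not part of multipathd or libudev; receiving the event itself (for example through a udev monitor) is out of scope here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the value of KEY in a NULL-terminated array of "KEY=value"
 * strings, or NULL if the key is not present.
 */
static const char *uevent_prop(const char *const *props, const char *key)
{
	size_t klen;

	assert(props != NULL && key != NULL);
	klen = strlen(key);
	for (; *props != NULL; props++) {
		/* Require an exact key match followed by '='. */
		if (strncmp(*props, key, klen) == 0 && (*props)[klen] == '=')
			return *props + klen + 1;
	}
	return NULL;
}

/* Return 1 if the properties describe a SCSI "reported LUNs changed" UA. */
static int is_lun_change_event(const char *const *props)
{
	const char *subsys = uevent_prop(props, "SUBSYSTEM");
	const char *ua = uevent_prop(props, "SDEV_UA");

	return subsys != NULL && ua != NULL &&
	       strcmp(subsys, "scsi") == 0 &&
	       strcmp(ua, "REPORTED_LUNS_DATA_HAS_CHANGED") == 0;
}
```

Note that the event is reported for a SCSI device on the affected target, which is not necessarily the path whose identity changed, so a match would at most be a hint to re-verify the paths on that target.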

multipathd.log, 36b46e08100526565029487c700000108 is the WWID that the LUN changes to after I unmap it.

2024-12-02 22:39:47.929573 notice [multipathd:] sync_map_state: failing sdet state 2 dmstate 2
2024-12-02 22:39:47.929576 notice [multipathd:] sync_map_state: failing sdeg state 2 dmstate 2
2024-12-02 22:39:47.929580 notice [multipathd:] sync_map_state: failing sdeh state 2 dmstate 2
2024-12-02 22:39:47.929590 notice [multipathd:] sync_map_state: failing sdem state 2 dmstate 2
2024-12-02 22:39:47.929593 notice [multipathd:] sync_map_state: failing sdel state 2 dmstate 2
2024-12-02 22:39:47.929595 notice [multipathd:] sync_map_state: failing sden state 2 dmstate 2
2024-12-02 22:39:47.929606 notice [multipathd:] sync_map_state: failing sdei state 2 dmstate 2
2024-12-02 22:39:47.929610 notice [multipathd:] sync_map_state: failing sdej state 2 dmstate 2
2024-12-02 22:39:47.929613 notice [multipathd:] sync_map_state: failing sdek state 2 dmstate 2
2024-12-02 22:39:47.929623 notice [multipathd:] 36b46e08100526565029487c700000108: devmap dm-1 registered
2024-12-02 22:39:48.068179 err [multipathd:] 36b46e08100526565029487c700000108: path 8:16 WWID 36b46e0810052656512345678000103e8 doesn't match, removing from map
2024-12-02 22:39:48.068211 notice [multipathd:] 36b46e08100526565029487c700000108: removing empty pathgroup 0
2024-12-02 22:39:48.068215 err [multipathd:] 36b46e08100526565029487c700000108: path 69:192 WWID 36b46e0810052656512345678000103e8 doesn't match, removing from map
2024-12-02 22:39:48.068232 notice [multipathd:] 36b46e08100526565029487c700000108: removing empty pathgroup 0
2024-12-02 22:39:48.068235 err [multipathd:] 36b46e08100526565029487c700000108: path 70:32 WWID 36b46e0810052656512345678000103e8 doesn't match, removing from map
2024-12-02 22:39:48.068250 notice [multipathd:] 36b46e08100526565029487c700000108: removing empty pathgroup 0
2024-12-02 22:39:48.068255 err [multipathd:] 36b46e08100526565029487c700000108: path 67:224 WWID 36b46e0810052656512345678000103e8 doesn't match, removing from map
2024-12-02 22:39:48.068266 notice [multipathd:] 36b46e08100526565029487c700000108: removing empty pathgroup 0
2024-12-02 22:39:48.068270 err [multipathd:] 36b46e08100526565029487c700000108: path 68:64 WWID 36b46e0810052656512345678000103e8 doesn't match, removing from map
2024-12-02 22:39:48.068284 notice [multipathd:] 36b46e08100526565029487c700000108: removing empty pathgroup 0
2024-12-02 22:39:48.707487 err [multipathd:] 36b46e08100526565029487c700000108: path 8:16 WWID 36b46e0810052656512345678000103e8 doesn't match, removing from map
2024-12-02 22:39:48.707493 notice [multipathd:] 36b46e08100526565029487c700000108: removing empty pathgroup 0
2024-12-02 22:39:48.707496 err [multipathd:] 36b46e08100526565029487c700000108: path 69:192 WWID 36b46e0810052656512345678000103e8 doesn't match, removing from map
...
2024-12-02 22:39:50.068846 err [multipathd:] 36b46e08100526565029487c700000108: path 68:64 WWID 36b46e0810052656512345678000103e8 doesn't match, removing from map
2024-12-02 22:39:50.068850 notice [multipathd:] 36b46e08100526565029487c700000108: removing empty pathgroup 0
2024-12-02 22:39:50.068852 notice [multipathd:] checker failed path 129:0 in map 36b46e08100526565029487c700000108
2024-12-02 22:39:50.068894 notice [multipathd:] 36b46e08100526565029487c700000108: sdeo - rdac checker reports path is down: lun not connected
2024-12-02 22:39:50.068904 notice [multipathd:] 36b46e08100526565029487c700000108: switch to path group #1
2024-12-02 22:39:50.069057 err [multipathd:] 36b46e08100526565029487c700000108: path 8:16 WWID 36b46e0810052656512345678000103e8 doesn't match, removing from map

 # multipathd  show topo
36b46e08100526565029487c700000108 dm-1 HUAWEI,XSG1
size=20G features='0' hwhandler='1 alua' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| `- 0:0:3:1    sdf  8:80   active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 0:0:5:1    sdj  8:144  active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 0:0:1:1    sdbk 67:224 active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 15:0:5:1   sdas 66:192 active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 15:0:2:1   sdaa 65:160 active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 15:0:3:1   sdag 66:0   active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 0:0:2:1    sdbq 68:64  active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 15:0:0:1   sdh  8:112  active ready running
`-+- policy='round-robin 0' prio=50 status=enabled
  `- 15:0:6:1   sday 67:32  active ready running

linzhanglong · 2024-12-02T15:11:27Z

I retested by unmapping a LUN, and this is the uevent event for the unmapping. My previous environment had 256 LUNs.

 # udevadm  monitor -k -p -s scsi
custom logging function 0x1fe6010 registered
selinux=0
runtime dir '/run/udev'
calling: monitor
monitor will print the received events for:
KERNEL - the kernel uevent

KERNEL[23808.067273] change   /devices/pci0000:3a/0000:3a:00.0/0000:3b:00.1/host15/rport-15:0-3/target15:0:2/15:0:2:1 (scsi)
ACTION=change
DEVPATH=/devices/pci0000:3a/0000:3a:00.0/0000:3b:00.1/host15/rport-15:0-3/target15:0:2/15:0:2:1
DEVTYPE=scsi_device
DRIVER=sd
MODALIAS=scsi:t-0x00
SDEV_UA=REPORTED_LUNS_DATA_HAS_CHANGED
SEQNUM=19279
SUBSYSTEM=scsi
UDEV_LOG=6

KERNEL[23808.083982] change   /devices/pci0000:3a/0000:3a:00.0/0000:3b:00.0/host0/rport-0:0-4/target0:0:3/0:0:3:162 (scsi)
ACTION=change
DEVPATH=/devices/pci0000:3a/0000:3a:00.0/0000:3b:00.0/host0/rport-0:0-4/target0:0:3/0:0:3:162
DEVTYPE=scsi_device
DRIVER=sd
MODALIAS=scsi:t-0x00
SDEV_UA=REPORTED_LUNS_DATA_HAS_CHANGED
SEQNUM=19281
SUBSYSTEM=scsi
UDEV_LOG=6

KERNEL[23808.098779] change   /devices/pci0000:3a/0000:3a:00.0/0000:3b:00.1/host15/rport-15:0-4/target15:0:3/15:0:3:1 (scsi)
ACTION=change
DEVPATH=/devices/pci0000:3a/0000:3a:00.0/0000:3b:00.1/host15/rport-15:0-4/target15:0:3/15:0:3:1
DEVTYPE=scsi_device
DRIVER=sd
MODALIAS=scsi:t-0x00
SDEV_UA=REPORTED_LUNS_DATA_HAS_CHANGED
SEQNUM=19283
SUBSYSTEM=scsi
UDEV_LOG=6

KERNEL[23808.116540] change   /devices/pci0000:3a/0000:3a:00.0/0000:3b:00.0/host0/rport-0:0-9/target0:0:5/0:0:5:162 (scsi)
ACTION=change
DEVPATH=/devices/pci0000:3a/0000:3a:00.0/0000:3b:00.0/host0/rport-0:0-9/target0:0:5/0:0:5:162
DEVTYPE=scsi_device
DRIVER=sd
MODALIAS=scsi:t-0x00
SDEV_UA=REPORTED_LUNS_DATA_HAS_CHANGED
SEQNUM=19285
SUBSYSTEM=scsi
UDEV_LOG=6

KERNEL[23808.131014] change   /devices/pci0000:3a/0000:3a:00.0/0000:3b:00.0/host0/rport-0:0-0/target0:0:0/0:0:0:1 (scsi)
ACTION=change
DEVPATH=/devices/pci0000:3a/0000:3a:00.0/0000:3b:00.0/host0/rport-0:0-0/target0:0:0/0:0:0:1
DEVTYPE=scsi_device
DRIVER=sd
MODALIAS=scsi:t-0x00
SDEV_UA=REPORTED_LUNS_DATA_HAS_CHANGED
SEQNUM=19287
SUBSYSTEM=scsi
UDEV_LOG=6

KERNEL[23808.134275] change   /devices/pci0000:3a/0000:3a:00.0/0000:3b:00.1/host15/rport-15:0-9/target15:0:4/15:0:4:1 (scsi)
ACTION=change
DEVPATH=/devices/pci0000:3a/0000:3a:00.0/0000:3b:00.1/host15/rport-15:0-9/target15:0:4/15:0:4:1
DEVTYPE=scsi_device
DRIVER=sd
MODALIAS=scsi:t-0x00
SDEV_UA=REPORTED_LUNS_DATA_HAS_CHANGED
SEQNUM=19289
SUBSYSTEM=scsi
UDEV_LOG=6

KERNEL[23808.149047] change   /devices/pci0000:3a/0000:3a:00.0/0000:3b:00.0/host0/rport-0:0-10/target0:0:4/0:0:4:1 (scsi)
ACTION=change
DEVPATH=/devices/pci0000:3a/0000:3a:00.0/0000:3b:00.0/host0/rport-0:0-10/target0:0:4/0:0:4:1
DEVTYPE=scsi_device
DRIVER=sd
MODALIAS=scsi:t-0x00
SDEV_UA=REPORTED_LUNS_DATA_HAS_CHANGED
SEQNUM=19291
SUBSYSTEM=scsi
UDEV_LOG=6

KERNEL[23808.156474] change   /devices/pci0000:3a/0000:3a:00.0/0000:3b:00.1/host15/rport-15:0-12/target15:0:5/15:0:5:1 (scsi)
ACTION=change
DEVPATH=/devices/pci0000:3a/0000:3a:00.0/0000:3b:00.1/host15/rport-15:0-12/target15:0:5/15:0:5:1
DEVTYPE=scsi_device
DRIVER=sd
MODALIAS=scsi:t-0x00
SDEV_UA=REPORTED_LUNS_DATA_HAS_CHANGED
SEQNUM=19293
SUBSYSTEM=scsi
UDEV_LOG=6

KERNEL[23808.160778] change   /devices/pci0000:3a/0000:3a:00.0/0000:3b:00.0/host0/rport-0:0-12/target0:0:6/0:0:6:1 (scsi)
ACTION=change
DEVPATH=/devices/pci0000:3a/0000:3a:00.0/0000:3b:00.0/host0/rport-0:0-12/target0:0:6/0:0:6:1
DEVTYPE=scsi_device
DRIVER=sd
MODALIAS=scsi:t-0x00
SDEV_UA=REPORTED_LUNS_DATA_HAS_CHANGED
SEQNUM=19295
SUBSYSTEM=scsi
UDEV_LOG=6

KERNEL[23808.171297] change   /devices/pci0000:3a/0000:3a:00.0/0000:3b:00.0/host0/rport-0:0-13/target0:0:7/0:0:7:1 (scsi)
ACTION=change
DEVPATH=/devices/pci0000:3a/0000:3a:00.0/0000:3b:00.0/host0/rport-0:0-13/target0:0:7/0:0:7:1
DEVTYPE=scsi_device
DRIVER=sd
MODALIAS=scsi:t-0x00
SDEV_UA=REPORTED_LUNS_DATA_HAS_CHANGED
SEQNUM=19297
SUBSYSTEM=scsi
UDEV_LOG=6

KERNEL[23808.180670] change   /devices/pci0000:3a/0000:3a:00.0/0000:3b:00.1/host15/rport-15:0-10/target15:0:6/15:0:6:1 (scsi)
ACTION=change
DEVPATH=/devices

1. Test the unmap operation on a LUN.

1.1 before unmap LUN:

 # multipathd  show topo
36b46e08100526565029487c700000108 dm-1 HUAWEI,XSG1
size=20G features='0' hwhandler='1 alua' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| `- 0:0:5:1  sdg 8:96  active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 0:0:6:1  sdh 8:112 active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 0:0:7:1  sdi 8:128 active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 0:0:0:1  sdb 8:16  active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 0:0:4:1  sdf 8:80  active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 0:0:1:1  sdc 8:32  active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 0:0:2:1  sdd 8:48  active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 0:0:3:1  sde 8:64  active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 15:0:0:1 sdj 8:144 active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 15:0:6:1 sdw 65:96 active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 15:0:1:1 sdk 8:160 active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 15:0:5:1 sdv 65:80 active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 15:0:3:1 sdm 8:192 active ready running
`-+- policy='round-robin 0' prio=50 status=enabled
  `- 15:0:2:1 sdl 8:176 active ready running

1.2 after unmap LUN:

36b46e08100526565029487c700000108 dm-1 HUAWEI,XSG1
size=20G features='0' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=0 status=enabled
  `- 15:0:3:1 sdm 8:192  failed faulty running <------------------------------

36b46e0810052656512345678000103e8 dm-2 HUAWEI,XSG1
size=20G features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=0 status=enabled
  |- 15:0:1:1 sdk 8:160  failed faulty running
  |- 0:0:4:1  sdf 8:80   failed faulty running
  |- 15:0:2:1 sdl 8:176  failed faulty running
  |- 0:0:3:1  sde 8:64   failed faulty running
  |- 15:0:0:1 sdj 8:144  failed faulty running
  |- 0:0:0:1  sdb 8:16   failed faulty running
  |- 15:0:6:1 sdw 65:96  failed faulty running
  |- 0:0:6:1  sdh 8:112  failed faulty running
  |- 15:0:4:1 sdu 65:64  failed faulty running
  |- 0:0:1:1  sdc 8:32   failed faulty running
  |- 15:0:5:1 sdv 65:80  failed faulty running
  |- 0:0:5:1  sdg 8:96   failed faulty running
  |- 15:0:7:1 sdx 65:112 failed faulty running
  |- 0:0:2:1  sdd 8:48   failed faulty running
  `- 0:0:7:1  sdi 8:128  failed faulty running
 
----- After some time:
36b46e0810052656512345678000103e8 dm-2 HUAWEI,XSG1
size=20G features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=0 status=enabled
  |- 15:0:1:1 sdk 8:160  failed faulty running
  |- 0:0:4:1  sdf 8:80   failed faulty running
  |- 15:0:2:1 sdl 8:176  failed faulty running
  |- 0:0:3:1  sde 8:64   failed faulty running
  |- 15:0:0:1 sdj 8:144  failed faulty running
  |- 0:0:0:1  sdb 8:16   failed faulty running
  |- 15:0:6:1 sdw 65:96  failed faulty running
  |- 0:0:6:1  sdh 8:112  failed faulty running
  |- 15:0:4:1 sdu 65:64  failed faulty running
  |- 0:0:1:1  sdc 8:32   failed faulty running
  |- 15:0:5:1 sdv 65:80  failed faulty running
  |- 0:0:5:1  sdg 8:96   failed faulty running
  |- 15:0:7:1 sdx 65:112 failed faulty running
  |- 0:0:2:1  sdd 8:48   failed faulty running
  |- 15:0:3:1 sdm 8:192  failed faulty running
  `- 0:0:7:1  sdi 8:128  failed faulty running

1.3 remap LUN to host

36b46e08100526565029487c700000108 dm-1 HUAWEI,XSG1
size=20G features='0' hwhandler='1 alua' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| `- 15:0:7:1 sdx 65:112 active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 15:0:5:1 sdv 65:80  active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 0:0:7:1  sdi 8:128  active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 15:0:4:1 sdu 65:64  active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 15:0:1:1 sdk 8:160  active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 0:0:4:1  sdf 8:80   active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 15:0:2:1 sdl 8:176  active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 0:0:3:1  sde 8:64   active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 15:0:0:1 sdj 8:144  active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 0:0:0:1  sdb 8:16   active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 15:0:6:1 sdw 65:96  active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 0:0:6:1  sdh 8:112  active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 0:0:1:1  sdc 8:32   active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 0:0:5:1  sdg 8:96   active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 0:0:2:1  sdd 8:48   active ready running
`-+- policy='round-robin 0' prio=50 status=enabled
  `- 15:0:3:1 sdm 8:192  active ready running

2. Test the change LUN ID operation on a LUN.
2.1 before change

36b46e08100526565029487c700000108 dm-1 HUAWEI,XSG1
size=20G features='0' hwhandler='1 alua' wp=rw
|-+- policy='round-robin 0' prio=0 status=enabled
| `- 15:0:7:1 sdx 65:112 failed faulty running
|-+- policy='round-robin 0' prio=0 status=enabled
| `- 15:0:5:1 sdv 65:80  failed faulty running
|-+- policy='round-robin 0' prio=0 status=enabled
| `- 0:0:7:1  sdi 8:128  failed faulty running
|-+- policy='round-robin 0' prio=0 status=enabled
| `- 15:0:4:1 sdu 65:64  failed faulty running
|-+- policy='round-robin 0' prio=0 status=enabled
| `- 15:0:1:1 sdk 8:160  failed faulty running
|-+- policy='round-robin 0' prio=0 status=enabled
| `- 0:0:4:1  sdf 8:80   failed faulty running
|-+- policy='round-robin 0' prio=0 status=enabled
| `- 15:0:2:1 sdl 8:176  failed faulty running
|-+- policy='round-robin 0' prio=0 status=enabled
| `- 0:0:3:1  sde 8:64   failed faulty running
|-+- policy='round-robin 0' prio=0 status=enabled
| `- 15:0:0:1 sdj 8:144  failed faulty running
|-+- policy='round-robin 0' prio=0 status=enabled
| `- 15:0:6:1 sdw 65:96  failed faulty running
|-+- policy='round-robin 0' prio=0 status=enabled
| `- 0:0:6:1  sdh 8:112  failed faulty running
|-+- policy='round-robin 0' prio=0 status=enabled
| `- 0:0:1:1  sdc 8:32   failed faulty running
|-+- policy='round-robin 0' prio=0 status=enabled
| `- 0:0:5:1  sdg 8:96   failed faulty running
|-+- policy='round-robin 0' prio=0 status=enabled
| `- 0:0:2:1  sdd 8:48   failed faulty running
`-+- policy='round-robin 0' prio=0 status=enabled
  `- 15:0:3:1 sdm 8:192  failed faulty running

----- After some time:
36b46e0810052656512345678000103e8 dm-2 HUAWEI,XSG1
size=20G features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=0 status=enabled
  |- 0:0:0:1  sdb 8:16   failed faulty running
  |- 15:0:0:1 sdj 8:144  failed faulty running
  |- 0:0:1:1  sdc 8:32   failed faulty running
  |- 15:0:1:1 sdk 8:160  failed faulty running
  |- 0:0:2:1  sdd 8:48   failed faulty running
  |- 15:0:2:1 sdl 8:176  failed faulty running
  |- 0:0:3:1  sde 8:64   failed faulty running
  |- 15:0:3:1 sdm 8:192  failed faulty running
  |- 0:0:4:1  sdf 8:80   failed faulty running
  |- 15:0:4:1 sdu 65:64  failed faulty running
  |- 0:0:5:1  sdg 8:96   failed faulty running
  |- 15:0:5:1 sdv 65:80  failed faulty running
  |- 0:0:6:1  sdh 8:112  failed faulty running
  |- 15:0:6:1 sdw 65:96  failed faulty running
  |- 0:0:7:1  sdi 8:128  failed faulty running
  `- 15:0:7:1 sdx 65:112 failed faulty running

2.2 change back

36b46e0810052656512345678000103e8 dm-2 HUAWEI,XSG1
size=20G features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=0 status=enabled
  |- 15:0:0:1 sdj 8:144  failed faulty running
  |- 0:0:1:1  sdc 8:32   failed faulty running
  |- 15:0:1:1 sdk 8:160  failed faulty running
  |- 0:0:2:1  sdd 8:48   failed faulty running
  |- 15:0:2:1 sdl 8:176  failed faulty running
  |- 0:0:3:1  sde 8:64   failed faulty running
  |- 15:0:3:1 sdm 8:192  failed faulty running
  |- 0:0:4:1  sdf 8:80   failed faulty running
  |- 15:0:4:1 sdu 65:64  failed faulty running
  |- 0:0:5:1  sdg 8:96   failed faulty running
  |- 15:0:5:1 sdv 65:80  failed faulty running
  |- 0:0:6:1  sdh 8:112  failed faulty running
  |- 15:0:6:1 sdw 65:96  failed faulty running
  |- 0:0:7:1  sdi 8:128  failed faulty running
  `- 15:0:7:1 sdx 65:112 failed faulty running
create: 36b46e08100526565029487c700000108 dm-1 HUAWEI,XSG1
size=20G features='0' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=50 status=active
  `- 0:0:0:1  sdb 8:16   active ready  running

----- After some time:
36b46e08100526565029487c700000108 dm-1 HUAWEI,XSG1
size=20G features='0' hwhandler='1 alua' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| `- 0:0:0:1  sdb 8:16   active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 0:0:5:1  sdg 8:96   active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 0:0:4:1  sdf 8:80   active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 0:0:3:1  sde 8:64   active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 0:0:2:1  sdd 8:48   active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 0:0:1:1  sdc 8:32   active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 0:0:6:1  sdh 8:112  active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 0:0:7:1  sdi 8:128  active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 15:0:0:1 sdj 8:144  active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 15:0:1:1 sdk 8:160  active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 15:0:2:1 sdl 8:176  active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 15:0:3:1 sdm 8:192  active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 15:0:4:1 sdu 65:64  active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 15:0:5:1 sdv 65:80  active ready running
|-+- policy='round-robin 0' prio=50 status=enabled
| `- 15:0:6:1 sdw 65:96  active ready running
`-+- policy='round-robin 0' prio=50 status=enab

linzhanglong · 2024-12-03T01:53:38Z

I will merge the patch later today and test it in the environment with 256 LUNs. From reading the code, it looks like it should solve the problem.

I noticed that there are related operations for reload_and_sync_map at the end of the do_check_path function. Would it be more appropriate to reload the map there?

pp->pgindex is set in disassemble_map() when a map is parsed. There are various possibilities for this index to become invalid. pp->pgindex is only used in enable_group() and followover_should_fallback(), and both callers take no action if it is 0, which is the right thing to do if we don't know the path's pathgroup.

Make sure pp->pgindex is reset to 0 in various places:
- when it's orphaned,
- before (re)grouping paths,
- when we detect a bad mpp assignment in update_pathvec_from_dm(),
- when a pathgroup is deleted in update_pathvec_from_dm(). In this case, pgindex needs to be invalidated for all paths in all pathgroups after the one that was deleted.

The hunk in group_paths is mostly redundant with the hunk in free_pgvec(), but because we're looping over pg->paths in the former and over pg->pgp in the latter, I think it's better to play safe.

Fixes: 99db1bd ("[multipathd] re-enable disabled PG when at least one path is up")
Fixes: opensvc#105
Signed-off-by: Martin Wilck <[email protected]>
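
To illustrate the idea in the commit message above, here is a minimal sketch of a 1-based group index where 0 means "unknown", plus the invalidation step when a pathgroup goes away. The `demo_*` structures and functions are hypothetical stand-ins, not the multipath-tools types or the actual patch.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins; the real struct path and
 * struct pathgroup in multipath-tools have many more fields. */
#define DEMO_MAX_PATHS 4

struct demo_path {
	int pgindex;                    /* 1-based pathgroup index, 0 = unknown */
};

struct demo_pathgroup {
	struct demo_path *paths[DEMO_MAX_PATHS];
	size_t npaths;
};

/* Reset pgindex for every path in pathgroups [first, npg). Meant to be
 * called when the group at position `first` is about to be removed, so
 * that no path keeps an index that would point at the wrong group. */
static void demo_invalidate_pgindex(struct demo_pathgroup *pgs, size_t npg,
				    size_t first)
{
	assert(pgs != NULL);
	for (size_t i = first; i < npg; i++)
		for (size_t j = 0; j < pgs[i].npaths; j++)
			pgs[i].paths[j]->pgindex = 0;
}

/* Bounds-checked lookup: returns NULL when the index is unknown (0) or
 * out of range, so the caller can simply take no action. */
static struct demo_pathgroup *demo_group_of(struct demo_pathgroup *pgs,
					    size_t npg,
					    const struct demo_path *pp)
{
	assert(pgs != NULL && pp != NULL);
	if (pp->pgindex < 1 || (size_t)pp->pgindex > npg)
		return NULL;
	return &pgs[pp->pgindex - 1];
}
```

A caller in the style of enable_group() would then return early whenever `demo_group_of()` yields NULL instead of indexing the group vector directly.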

update_pathvec_from_dm() may set mpp->need_reload if it finds inconsistent settings. In this case, the map should be reloaded, but so far we don't do this reliably. Add a call to reload_and_sync_map() to do_sync_mpp() to clear this kind of inconsistency. In order to avoid endless reload loops, limit the number of retries to 1.

Fixes: opensvc#105
Signed-off-by: Martin Wilck <[email protected]>

mwilck · 2024-12-04T11:03:54Z

I noticed that there are related operations for reload_and_sync_map at the end of the do_check_path function. Would it be more appropriate to reload the map there?

The need_reload flag is set in update_pathvec_from_dm(), which is called in the update_multipath_strings() code path. It makes sense to do the reload closely after that. do_sync_mpp() is where this check logically belongs ¹. The calls late in do_check_path() are meant to fix priorities.

However, I think now that calling reload_and_sync_map() in do_sync_mpp(), like in my patch, is not ideal. We'll be pointlessly repeating ioctls. It's fine for testing, I think. I'll post an update to the patch.

¹ We have lots of similar code paths for refreshing some properties of the maps either in multipathd or in the kernel, and this is quite confusing even for people who've been working with this code base for years. Some day we'll need to clean this up.

update_pathvec_from_dm() may set mpp->need_reload if it finds inconsistent settings. In this case, the map should be reloaded, but so far we don't do this reliably. Add a call to reload_map() to do_sync_mpp() to clear this kind of inconsistency. In order to avoid endless reload loops, limit the number of retries to 1.

Fixes: opensvc#105
Signed-off-by: Martin Wilck <[email protected]>

mwilck · 2024-12-04T11:54:05Z

However, I think now that calling reload_and_sync_map() in do_sync_mpp(), like in my patch, is not ideal.

Indeed, this was wrong. reload_and_sync_map() may actually end up removing the map we're just working on.

I've pushed a new commit, 01ec4fa (not 15747c2, as first posted), that replaces the call to reload_and_sync_map() with one to just reload_map(). This is sufficient because we'll call update_multipath_strings() again in do_sync_mpp().

@bmarzins, your feedback would be appreciated.

mwilck · 2024-12-04T11:58:26Z

Sorry for posting the wrong commit. 01ec4fa is correct.

linzhanglong · 2024-12-04T14:22:57Z

Sorry for posting the wrong commit. 01ec4fa is correct.

Okay, I will test the new patch.

bmarzins · 2024-12-05T03:58:06Z

@mwilck, is there a big benefit to retrying immediately that I'm overlooking? Is this to deal with the case where reload_map() fails, or are you worried about successfully reloading the map, but having update_multipath_strings() still flag it as needing to be reloaded again? It seems to me that instead of immediately retrying, we could just wait till the next path check to try again.

Also, I think we do want to call reload_and_sync_map(). Right now, every call to domap() in multipathd will call setup_multipath() and sync_map_state() afterwards. I think we want to keep that for all cases. I posted a patchset that contains an alternate version of this patch.

linzhanglong · 2024-12-05T04:12:57Z

Is there a big benefit to retrying immediately that I'm overlooking? Is this to deal with the case where reload_map() fails, or are you worried about successfully reloading the map, but having update_multipath_strings() still flag it as needing to be reloaded again? It seems to me that instead of immediately retrying, we could just wait till the next path check to try again.

Also, I think we do want to call reload_and_sync_map(). Right now, every call to domap() in multipathd will call setup_multipath() and sync_map_state() afterwards. I think we want to keep that for all cases. I posted a patchset that contains an alternate version of this patch.

Hello, regarding the patch set, where can I find it?

mwilck · 2024-12-05T16:24:25Z

Is there a big benefit to retrying immediately that I'm overlooking?

In my mind, need_reload indicates an inconsistent state in the kernel. While we do our best to make sure that the kernel won't actually use this path for I/O, I think that we should attempt to fix this situation rather sooner than later.

bmarzins · 2024-12-05T16:29:04Z

@linzhanglong, the patches are available here:
https://lore.kernel.org/dm-devel/[email protected]/

bmarzins · 2024-12-05T16:50:07Z

Is there a big benefit to retrying immediately that I'm overlooking?

In my mind, need_reload indicates an inconsistent state in the kernel. While we do our best to make sure that the kernel won't actually use this path for I/O, I think that we should attempt to fix this situation rather sooner than later.

Oops. I misread your code. I thought that you did one retry of reload_map() immediately, but you just loop to call update_multipath_strings() again. I still think that reload_and_sync_map() makes more sense, but you can ignore my question about retries.

My code should do the reload_and_sync_map() as often as yours. I just moved it after the prio refresh so that if we're going to do a reload because of that anyways, we won't reload the device twice (unless the device gets set to need_reload when we are syncing after the prio changed reload).

In my version of the patch, perhaps we could set a flag when need_reload is still set after the new call to reload_and_sync_map() in checkerloop, and clear it whenever the map is reloaded. Then we could check that flag and only require the mpp->synced_count > 0 check if the last reload of the map was solely because of need_reload, and it didn't help.

That would make multipathd respond to a new need_reload within a checker tick. If the reload didn't fix the problem, we would wait till the next time a path in the device is checked before trying again, which is the same speed as yours.

mwilck · 2024-12-05T18:02:44Z

I'd also prefer to do this kind of reload no more than once per tick.
But in your version of the patch, checkerloop() could release the lock before actually reloading the map after update_multipath_strings() detects an inconsistency. That sort of contradicts my "rather sooner than later" idea. I agree that possibly having to reload the map twice during a single tick is not beautiful, but given that such inconsistencies are very rare, IMO it shouldn't hurt much.

The idea of the retry was to check if update_pathvec_from_dm() still reports an inconsistency (which includes the case in which the reload failed, because in that case we'd have failed to fix the situation). I thought one immediate retry was warranted, given that this is a rare but serious error condition. But I've no idea about the likelihood that such an immediate retry would succeed.

I'll respond in the dm-devel thread.

mwilck · 2024-12-05T18:12:13Z

Another thought: we could attempt a single reload in do_sync_mpp() without an immediate retry (leaving need_reload set), and use your patch on top. This way if the reload in do_sync_mpp() failed, we'd retry at the end of the tick, with all path properties adjusted, which makes probably more sense than retrying immediately.

bmarzins · 2024-12-05T18:45:19Z

I thought one immediate retry was warranted, given that this is a rare but serious error condition. But I've no idea about the likelihood that such an immediate retry would succeed.

But your code checks retry++ < MAX_RETRIES so that chunk will only run once and won't ever call reload_map() after jumping to try_again (retry will be 1 and MAX_RETRIES is 1). Even if mpp->need_reload gets set when it tries again, it will skip the code to reload the map. Right? That's what I didn't notice originally.

mwilck · 2024-12-05T20:16:27Z

Right, now I misinterpreted my own code :-)

Indeed the idea was just to retry reading the kernel parameters. I didn't intend to reload multiple times.

pp->pgindex is set in disassemble_map() when a map is parsed. There are various possibilities for this index to become invalid. pp->pgindex is only used in enable_group() and followover_should_fallback(), and both callers take no action if it is 0, which is the right thing to do if we don't know the path's pathgroup.

Make sure pp->pgindex is reset to 0 in various places:
- when it's orphaned,
- before (re)grouping paths,
- when we detect a bad mpp assignment in update_pathvec_from_dm(),
- when a pathgroup is deleted in update_pathvec_from_dm(). In this case, pgindex needs to be invalidated for all paths in all pathgroups after the one that was deleted.

The hunk in group_paths is mostly redundant with the hunk in free_pgvec(), but because we're looping over pg->paths in the former and over pg->pgp in the latter, I think it's better to play safe.

Fixes: 99db1bd ("[multipathd] re-enable disabled PG when at least one path is up")
Fixes: opensvc#105
Signed-off-by: Martin Wilck <[email protected]>
Reviewed-by: Benjamin Marzinski <[email protected]>

update_pathvec_from_dm() may set mpp->need_reload if it finds inconsistent settings. In this case, the map should be reloaded, but so far we don't do this reliably. A previous patch added a call to reload_and_sync_map() in the CHECKER_FINISHED state, but in the meantime the checker may have waited for checker threads to finish, and may have dropped and re-acquired the vecs lock. As mpp->need_reload is a serious but rare condition, also try to fix it early in the checker loop. Because of the previous patch, we can call reload_and_sync_map() here.

Fixes: opensvc#105
Signed-off-by: Martin Wilck <[email protected]>

mwilck · 2024-12-09T17:26:35Z

My tip branch now contains Ben's fixes plus an improved version of mine above, plus some cleanup.

@linzhanglong, Please provide feedback.

linzhanglong · 2024-12-10T03:14:23Z

Sorry for posting the wrong commit. 01ec4fa is correct.

Okay, I will test the new patch.

I have merged the changes from this patch into the test environment without any issues. If there are any new changes, I will merge and test them today.

linzhanglong · 2024-12-14T10:43:14Z

Test Okay

linzhanglong · 2024-12-16T14:24:02Z

Hello, when will the new version be released? Is the branch at https://github.com/openSUSE/multipath-tools/tree/tip the next version?
