Received: by 2002:a05:6a10:9848:0:0:0:0 with SMTP id x8csp4427695pxf; Tue, 23 Mar 2021 10:18:06 -0700 (PDT) X-Google-Smtp-Source: ABdhPJy/05dj9UlAdRXye5bKca2p/0O8UQQKKm23cbbljYWxoyrzZrIiwwmvIi/wJ7TJdR8fMocL X-Received: by 2002:a17:906:13ce:: with SMTP id g14mr4035007ejc.378.1616519886606; Tue, 23 Mar 2021 10:18:06 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1616519886; cv=none; d=google.com; s=arc-20160816; b=QNl5FIOUqBAr88lQlJwZPVF9+eyDzdZivOAOLFvjT9XX16fJvK/0erZlzY3WFev0lo FLb7S3ew1acLaIPzKGj/ixK3kLvkn5hHI8ne59VhZJCqLhr4a3d9rN27poTpTdjeV1Wr 8b71Qqvz5O/oxnnRL2Zjv4TdIypS02rgf+GM2jNRCrbuvTzTMEJq7LtwK6EWrrBPf1m9 Tze6kppkXLAJqdwfdLB/ghYGIFbRw1eUXAsozgyXxpKrVY4A7UR2X7zSgHbtv+UPz1kk zOAt8swqD8m9NpyviNtEIbXOvkDXXdT8NFf3Wev0pGIqILkSFaeap/+1H9DkxvC9jlRA sSCQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:date:cc:to:from:subject:message-id :dkim-signature; bh=CKHlmcAcFrOM8c13z+gPq/cfF8DQ2S4kR6iZLDzuW8w=; b=r9owvY50BgGxDmKLI04AY23mZuvlP0slVzhXpp3SW0RbGlE0by7/u5B0Dj5urMAdsR VLTUJxl0ZRQ7lx037RmVamO6sjOFHDU8SkyehNhKeUVpW65IxbxPOYVFkK3FWe8RWSdF ELgtdtXxVhAnsgxkVBhlGkyQyNX/ToS1t0h2F7xh3JzWN7sH3qCuIOaJzzuUfEvJQ5yc yXBOmQLxgvXXzx7HTDjJgn8HwB5I75OU6baSOSFOSGEPdL8y5tbd5FaUxnnnI++hMCR/ 5oOZ8l1mc+wYNklwBU+r4+YTiM6FC9JWb/PpW/b47s1l/vQPBXjs2xat2q6aooT21YHp cNwA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=BzlWdVnZ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id k4si13210789edq.442.2021.03.23.10.17.42; Tue, 23 Mar 2021 10:18:06 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=BzlWdVnZ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230035AbhCWRME (ORCPT + 99 others); Tue, 23 Mar 2021 13:12:04 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:27506 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229890AbhCWRLv (ORCPT ); Tue, 23 Mar 2021 13:11:51 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1616519511; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=CKHlmcAcFrOM8c13z+gPq/cfF8DQ2S4kR6iZLDzuW8w=; b=BzlWdVnZsaDD2BtCJ2ruOfVTGl9PVH0R9gqK5ASQgc/1pRGQ/5rKddAObq0zndwTTd/cSH dJuP02Hr5bANdD0s1dPQU7yJ3nRRuUcFZ4mZ1rnWW/iqjQL5RoqEWU8FnM+CxI1AsDJbvD cg45Ezxa3oiTcNsbIzCJyyPa6wrwuUw= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-304-sR3Ur4hPO2qR45qt9qq_hw-1; Tue, 23 Mar 2021 13:11:47 -0400 X-MC-Unique: sR3Ur4hPO2qR45qt9qq_hw-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id A67B05F9E1; Tue, 23 Mar 2021 17:11:45 +0000 (UTC) Received: from ovpn-112-207.phx2.redhat.com (ovpn-112-207.phx2.redhat.com [10.3.112.207]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0EB045C232; Tue, 23 Mar 2021 17:11:30 +0000 (UTC) Message-ID: <5612cd9e298488ea9277f8d99ce0fd35240bdc53.camel@redhat.com> Subject: Re: md/dm-mpath: check whether all pgpaths have same uuid in multipath_ctr() From: "Ewan D. Milne" To: lixiaokeng , Mike Snitzer , Christoph Hellwig , Zhiqiang Liu Cc: agk@redhat.com, dm-devel@redhat.com, "jejb@linux.ibm.com" , "martin.petersen@oracle.com" , linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org, linfeilong , "wubo (T)" Date: Tue, 23 Mar 2021 13:11:30 -0400 In-Reply-To: References: <20210322081155.GE1946905@infradead.org> <20210322142207.GB30698@redhat.com> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 2021-03-23 at 15:47 +0800, lixiaokeng wrote: > > On 2021/3/22 22:22, Mike Snitzer wrote: > > On Mon, Mar 22 2021 at 4:11am -0400, > > Christoph Hellwig wrote: > > > > > On Sat, Mar 20, 2021 at 03:19:23PM +0800, Zhiqiang Liu wrote: > > > > From: Zhiqiang Liu > > > > > > > > When we make IO stress test on multipath device, there will > > > > be a metadata err because of wrong path. In the test, we > > > > concurrent execute 'iscsi device login|logout' and > > > > 'multipath -r' command with IO stress on multipath device. > > > > In some case, systemd-udevd may have not time to process > > > > uevents of iscsi device logout|login, and then 'multipath -r' > > > > command triggers multipathd daemon calls ioctl to load table > > > > with incorrect old device info from systemd-udevd. > > > > Then, one iscsi path may be incorrectly attached to another > > > > multipath which has different uuid. Finally, the metadata err > > > > occurs when umounting filesystem to down write metadata on > > > > the iscsi device which is actually not owned by the multipath > > > > device. > > > > > > > > So we need to check whether all pgpaths of one multipath have > > > > the same uuid, if not, we should throw a error. > > > > > > > > Signed-off-by: Zhiqiang Liu > > > > Signed-off-by: lixiaokeng > > > > Signed-off-by: linfeilong > > > > Signed-off-by: Wubo > > > > --- > > > > drivers/md/dm-mpath.c | 52 > > > > +++++++++++++++++++++++++++++++++++++++++ > > > > drivers/scsi/scsi_lib.c | 1 + > > > > 2 files changed, 53 insertions(+) > > > > > > > > diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c > > > > index bced42f082b0..f0b995784b53 100644 > > > > --- a/drivers/md/dm-mpath.c > > > > +++ b/drivers/md/dm-mpath.c > > > > @@ -24,6 +24,7 @@ > > > > #include > > > > #include > > > > #include > > > > +#include > > > > #include > > > > #include > > > > > > > > @@ -1169,6 +1170,45 @@ static int parse_features(struct > > > > dm_arg_set *as, struct multipath *m) > > > > return r; > > > > } > > > > > > > > +#define SCSI_VPD_LUN_ID_PREFIX_LEN 4 > > > > +#define MPATH_UUID_PREFIX_LEN 7 > > > > +static int check_pg_uuid(struct priority_group *pg, char > > > > *md_uuid) > > > > +{ > > > > + char pgpath_uuid[DM_UUID_LEN] = {0}; > > > > + struct request_queue *q; > > > > + struct pgpath *pgpath; > > > > + struct scsi_device *sdev; > > > > + ssize_t count; > > > > + int r = 0; > > > > + > > > > + list_for_each_entry(pgpath, &pg->pgpaths, list) { > > > > + q = bdev_get_queue(pgpath->path.dev->bdev); > > > > + sdev = scsi_device_from_queue(q); > > > > > > Common dm-multipath code should never poke into scsi > > > internals. This > > > is something for the device handler to check. It probably also > > > won't > > > work for all older devices. > > > > Definitely. > > > > But that aside, userspace (multipathd) _should_ be able to do extra > > validation, _before_ pushing down a new table to the kernel, rather > > than > > forcing the kernel to do it. > > > > Martin (committer of multipath-tools) said that: > "Don't get me wrong, I don't argue against tough testing. But we > should > be aware that there are always time intervals during which > multipathd's > picture of the present devices is different from what the kernel > sees." > > It is difficult to solve this in multipathd. > > Regards, > Lixiaokeng > I think the patch is no good. There are plenty of devices that don't support VPD page 83h: int scsi_vpd_lun_id(struct scsi_device *sdev, char *id, size_t id_len) { u8 cur_id_type = 0xff; u8 cur_id_size = 0; unsigned char *d, *cur_id_str; unsigned char __rcu *vpd_pg83; int id_size = -EINVAL; rcu_read_lock(); vpd_pg83 = rcu_dereference(sdev->vpd_pg83); if (!vpd_pg83) { rcu_read_unlock(); return -ENXIO; } and the DM layer should not be looking at the properties of the underlying devices in this way anyway. It should be pushed down to the table. -Ewan