Received: by 2002:ad5:474a:0:0:0:0:0 with SMTP id i10csp5627077imu; Wed, 30 Jan 2019 00:27:19 -0800 (PST) X-Google-Smtp-Source: ALg8bN5e85BvqpcUDBmgWCb+fXDBySMFPnkimRP37h74HbamAxZYPQKKEOGVwpmpGdweE1MNxsj6 X-Received: by 2002:a17:902:ac1:: with SMTP id 59mr29051718plp.36.1548836839591; Wed, 30 Jan 2019 00:27:19 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1548836839; cv=none; d=google.com; s=arc-20160816; b=o2/NR0h+T79GowtZx3839cH1ZMsMbHpf1TXP945Dn2RMGk7EPVQfhbmlYRG6tIpqo0 q0clMmu6WfZV6B8UOVIORxMO5Y5EKuudVK6BWwNvdR3VEe5VkCQNGZ+7uwQWeidO8Z2T bY0OG93L+dMN9sS6DxibY2IpkgW8+j8treetNlyxUk/h7gPQ/nFqiEicW2YCZufPsdj3 QoEEvG2vm4QoQ9czda0bbwTKh6nHNSbCJEBs9cGLw2/LFqyWB9CCNzv++HoS7HnyRjyd B4FiI4wP1ecgq6+1Bx3wcpmaBu3kxAVs4+WzJ0Xtwyof4Pgoptcrnz1sN7MRupkKaufG G6Jg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from; bh=ViZPhtZk3cwuW8nXPH7QuBofHYXB5HflgleKOamAnWU=; b=DOY43RYdlPQvCIgLP7XeMDWPSUiZ/aIEZTdESjTKD5qeWP4v7jeoLZbA7z+qZqdPrW /GjlXIonNpk80aFG7E47EPNqgitLKmtqVkwRnUzpc+iINcmtYVp6++RXCfvPAsMbatXN fZ4vDetlu6yP0EUptRfJBIAfnCCK32m9PEiqsAcQ/HaL/PU/9fQ0MU/0hFPt3pBPqwn3 meH3RCr2dNXRjzJKZ8NBFqHlw34ZLCyL5Fb9hIRTem1wgVKi5bDJfg7Z3ak9L6yKWqpy hK+vlLvds25Y2Fw8SnWmkomzpikh12WkI93xLlAVUD4XDLf1n7vvSyljdzb2j9K9oRBy vxRw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id m75si879758pga.432.2019.01.30.00.27.04; Wed, 30 Jan 2019 00:27:19 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730279AbfA3IZ7 (ORCPT + 99 others); Wed, 30 Jan 2019 03:25:59 -0500 Received: from szxga04-in.huawei.com ([45.249.212.190]:2704 "EHLO huawei.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1730251AbfA3IZ7 (ORCPT ); Wed, 30 Jan 2019 03:25:59 -0500 Received: from DGGEMS409-HUB.china.huawei.com (unknown [172.30.72.60]) by Forcepoint Email with ESMTP id E8684CA78E3B31BB3F4E; Wed, 30 Jan 2019 16:25:56 +0800 (CST) Received: from huawei.com (10.175.124.28) by DGGEMS409-HUB.china.huawei.com (10.3.19.209) with Microsoft SMTP Server id 14.3.408.0; Wed, 30 Jan 2019 16:25:51 +0800 From: Jason Yan To: , CC: , , , , , , , , , , , , , Jason Yan , Xiaofei Tan , Ewan Milne , Tomas Henzl Subject: [PATCH v2 7/7] scsi: libsas: fix issue of swapping two sas disks Date: Wed, 30 Jan 2019 16:24:12 +0800 Message-ID: <20190130082412.9357-8-yanaijie@huawei.com> X-Mailer: git-send-email 2.14.4 In-Reply-To: <20190130082412.9357-1-yanaijie@huawei.com> References: <20190130082412.9357-1-yanaijie@huawei.com> MIME-Version: 1.0 Content-Type: text/plain X-Originating-IP: [10.175.124.28] X-CFilter-Loop: Reflected Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org The work flow of revalidation now is scanning expander phy by the sequence of the phy and check if the phy have changed. This will leads to an issue of swapping two sas disks on one expander. Assume we have two sas disks, connected with expander phy10 and phy11: phy10: 5000cca04eb1001d port-0:0:10 phy11: 5000cca04eb043ad port-0:0:11 Swap these two disks, and imaging the following scenario: revalidation 1: -->phy10: 0 --> delete phy10 domain device -->phy11: 5000cca04eb043ad (no change) revalidation done revalidation 2: -->step 1, check phy10: -->phy10: 5000cca04eb043ad --> add to wide port(port-0:0:11) (phy11 address is still 5000cca04eb043ad now) -->step 2, check phy11: -->phy11: 0 --> phy11 address is 0 now, but it's part of wide port(port-0:0:11), the domain device will not be deleted. revalidation done revalidation 3: -->phy10, 5000cca04eb043ad (no change) -->phy11: 5000cca04eb1001d --> try to add port-0:0:11 but failed, port-0:0:11 already exist, trigger a warning as follows revalidation done [14790.189699] sysfs: cannot create duplicate filename '/devices/pci0000:74/0000:74:02.0/host0/port-0:0/expander-0:0/port-0:0:11' [14790.201081] CPU: 25 PID: 5031 Comm: kworker/u192:3 Not tainted 4.16.0-rc1-191134-g138f084-dirty #228 [14790.210199] Hardware name: Huawei D06/D06, BIOS Hisilicon D06 EC UEFI Nemo 2.0 RC0 - B303 05/16/2018 [14790.219323] Workqueue: 0000:74:02.0_disco_q sas_revalidate_domain [14790.225404] Call trace: [14790.227842] dump_backtrace+0x0/0x18c [14790.231492] show_stack+0x14/0x1c [14790.234798] dump_stack+0x88/0xac [14790.238101] sysfs_warn_dup+0x64/0x7c [14790.241751] sysfs_create_dir_ns+0x90/0xa0 [14790.245835] kobject_add_internal+0xa0/0x284 [14790.250092] kobject_add+0xb8/0x11c [14790.253570] device_add+0xe8/0x598 [14790.256960] sas_port_add+0x24/0x50 [14790.260436] sas_ex_discover_devices+0xb10/0xc30 [14790.265040] sas_ex_revalidate_domain+0x1d8/0x518 [14790.269731] sas_revalidate_domain+0x12c/0x154 [14790.274163] process_one_work+0x128/0x2b0 [14790.278160] worker_thread+0x14c/0x408 [14790.281897] kthread+0xfc/0x128 [14790.285026] ret_from_fork+0x10/0x18 [14790.288598] ------------[ cut here ]------------ At last, the disk 5000cca04eb1001d is lost. The basic idea of fix this issue is to let the revalidation first scan all phys, and then unregisterring devices. Only when no devices need to be unregisterred, go to the next step to discover new devices. If there are devices need unregister, unregister those devices and tell the revalidation event processor to retry again. The next revalidation will process the discovering of the new devices. Tested-by: Chen Liangfei Signed-off-by: Jason Yan CC: Xiaofei Tan CC: chenxiang CC: John Garry CC: Johannes Thumshirn CC: Ewan Milne CC: Christoph Hellwig CC: Tomas Henzl CC: Dan Williams CC: Hannes Reinecke --- drivers/scsi/libsas/sas_expander.c | 146 +++++++++++++++++++++++++------------ 1 file changed, 100 insertions(+), 46 deletions(-) diff --git a/drivers/scsi/libsas/sas_expander.c b/drivers/scsi/libsas/sas_expander.c index e781941a7088..073bb5c6e353 100644 --- a/drivers/scsi/libsas/sas_expander.c +++ b/drivers/scsi/libsas/sas_expander.c @@ -2056,7 +2056,7 @@ static bool sas_process_flutter(struct domain_device *dev, struct ex_phy *phy, return true; } -static int sas_unregister(struct domain_device *dev, int phy_id, bool last, +static int sas_ex_unregister(struct domain_device *dev, int phy_id, bool last, bool *retry) { struct expander_device *ex = &dev->ex_dev; @@ -2108,21 +2108,8 @@ static int sas_unregister(struct domain_device *dev, int phy_id, bool last, return 0; } -/** - * sas_rediscover - revalidate the domain. - * @dev:domain device to be detect. - * @phy_id: the phy id will be detected. - * - * NOTE: this process _must_ quit (return) as soon as any connection - * errors are encountered. Connection recovery is done elsewhere. - * Discover process only interrogates devices in order to discover the - * domain.For plugging out, we un-register the device only when it is - * the last phy in the port, for other phys in this port, we just delete it - * from the port.For inserting, we do discovery when it is the - * first phy,for other phys in this port, we add it to the port to - * forming the wide-port. - */ -static void sas_rediscover(struct domain_device *dev, const int phy_id, + +static void sas_ex_unregister_device(struct domain_device *dev, const int phy_id, bool *retry) { struct expander_device *ex = &dev->ex_dev; @@ -2131,31 +2118,70 @@ static void sas_rediscover(struct domain_device *dev, const int phy_id, int i; bool last = true; /* is this the last phy of the port */ - pr_debug("ex %016llx phy%d originated BROADCAST(CHANGE)\n", - SAS_ADDR(dev->sas_addr), phy_id); - - if (SAS_ADDR(changed_phy->attached_sas_addr) != 0) { - for (i = 0; i < ex->num_phys; i++) { - struct ex_phy *phy = &ex->ex_phy[i]; + for (i = 0; i < ex->num_phys; i++) { + struct ex_phy *phy = &ex->ex_phy[i]; - if (i == phy_id) - continue; - if (SAS_ADDR(phy->attached_sas_addr) == - SAS_ADDR(changed_phy->attached_sas_addr)) { - pr_debug("phy%d part of wide port with phy%d\n", - phy_id, i); - last = false; - break; - } + if (i == phy_id) + continue; + if (SAS_ADDR(phy->attached_sas_addr) == + SAS_ADDR(changed_phy->attached_sas_addr)) { + pr_debug("phy%d part of wide port with phy%d\n", + phy_id, i); + last = false; + break; } - res = sas_unregister(dev, phy_id, last, retry); - } else - res = sas_discover_new(dev, phy_id); + } + res = sas_ex_unregister(dev, phy_id, last, retry); pr_debug("ex %016llx phy%d discover returned 0x%x\n", SAS_ADDR(dev->sas_addr), phy_id, res); } +static int sas_ex_try_unregister(struct domain_device *dev, u8 *changed_phy, + int nr, bool *retry) +{ + struct expander_device *ex = &dev->ex_dev; + int unregistered = 0; + struct ex_phy *phy; + int i; + + for (i = 0; i < nr; i++) { + pr_debug("ex %016llx phy%d originated BROADCAST(CHANGE)\n", + SAS_ADDR(dev->sas_addr), changed_phy[i]); + + phy = &ex->ex_phy[changed_phy[i]]; + + if (SAS_ADDR(phy->attached_sas_addr) == 0) + continue; + + sas_ex_unregister_device(dev, changed_phy[i], retry); + changed_phy[i] = 0xff; + unregistered++; + } + return unregistered; +} + +static void sas_ex_register(struct domain_device *dev, u8 *changed_phy, + int nr) +{ + struct expander_device *ex = &dev->ex_dev; + struct ex_phy *phy; + int res = 0; + int i; + + for (i = 0; i < nr; i++) { + if (changed_phy[i] == 0xff) + continue; + + phy = &ex->ex_phy[changed_phy[i]]; + + res = sas_discover_new(dev, changed_phy[i]); + + pr_debug("ex %016llx phy%d register returned 0x%x\n", + SAS_ADDR(dev->sas_addr), changed_phy[i], res); + } +} + /** * sas_ex_revalidate_domain - revalidate the domain * @port_dev: port domain device. @@ -2170,21 +2196,49 @@ void sas_ex_revalidate_domain(struct domain_device *port_dev, bool *retry) { int res; struct domain_device *dev = NULL; + u8 changed_phy[MAX_EXPANDER_PHYS]; + struct expander_device *ex; + int unregistered = 0; + int phy_id; + int nr = 0; + int i = 0; res = sas_find_bcast_dev(port_dev, &dev); - if (res == 0 && dev) { - struct expander_device *ex = &dev->ex_dev; - int i = 0, phy_id; - - do { - phy_id = -1; - res = sas_find_bcast_phy(dev, &phy_id, i, true); - if (phy_id == -1) - break; - sas_rediscover(dev, phy_id, retry); - i = phy_id + 1; - } while (i < ex->num_phys); + if (res || !dev) + return; + + memset(changed_phy, 0xff, MAX_EXPANDER_PHYS); + ex = &dev->ex_dev; + + do { + phy_id = -1; + res = sas_find_bcast_phy(dev, &phy_id, i, true); + if (phy_id == -1) + break; + changed_phy[nr++] = phy_id; + i = phy_id + 1; + } while (i < dev->ex_dev.num_phys); + + if (nr == 0) + return; + + unregistered = sas_ex_try_unregister(dev, changed_phy, nr, retry); + + if (unregistered > 0) { + struct ex_phy *phy; + + for (i = 0; i < nr; i++) { + if (changed_phy[i] == 0xff) + continue; + phy = &ex->ex_phy[changed_phy[i]]; + phy->phy_change_count = -1; + } + ex->ex_change_count = -1; + *retry = true; + return; } + + sas_ex_register(dev, changed_phy, nr); } void sas_smp_handler(struct bsg_job *job, struct Scsi_Host *shost, -- 2.14.4