From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Jane Chu, Jeff Moyer, Erwin Tsaur, Johannes Thumshirn, Dan Williams
Subject: [PATCH 4.14 013/193] libnvdimm/namespace: Fix label tracking error
Date: Wed, 29 May 2019 20:04:27 -0700
Message-Id: <20190530030449.694671104@linuxfoundation.org>
X-Mailer: git-send-email 2.21.0
In-Reply-To: 
<20190530030446.953835040@linuxfoundation.org>
References: <20190530030446.953835040@linuxfoundation.org>
User-Agent: quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

From: Dan Williams

commit c4703ce11c23423d4b46e3d59aef7979814fd608 upstream.

Users have reported intermittent occurrences of DIMM initialization
failures due to duplicate allocations of address capacity detected in
the labels, or errors of the form below, both have the same root cause.

    nd namespace1.4: failed to track label: 0
    WARNING: CPU: 17 PID: 1381 at drivers/nvdimm/label.c:863
    RIP: 0010:__pmem_label_update+0x56c/0x590 [libnvdimm]
    Call Trace:
     ? nd_pmem_namespace_label_update+0xd6/0x160 [libnvdimm]
     nd_pmem_namespace_label_update+0xd6/0x160 [libnvdimm]
     uuid_store+0x17e/0x190 [libnvdimm]
     kernfs_fop_write+0xf0/0x1a0
     vfs_write+0xb7/0x1b0
     ksys_write+0x57/0xd0
     do_syscall_64+0x60/0x210

Unfortunately those reports were typically with a busy parallel
namespace creation / destruction loop making it difficult to see the
components of the bug. However, Jane provided a simple reproducer using
the work-in-progress sub-section implementation.

When ndctl is reconfiguring a namespace it may take an existing defunct
/ disabled namespace and reconfigure it with a new uuid and other
parameters. Critically namespace_update_uuid() takes existing address
resources and renames them for the new namespace to use / reconfigure as
it sees fit. The bug is that this rename only happens in the resource
tracking tree. Existing labels with the old uuid are not reaped leading
to a scenario where multiple active labels reference the same span of
address range.

Teach namespace_update_uuid() to flag any references to the old uuid for
reaping at the next label update attempt.

Cc:
Fixes: bf9bccc14c05 ("libnvdimm: pmem label sets and namespace instantiation")
Link: https://github.com/pmem/ndctl/issues/91
Reported-by: Jane Chu
Reported-by: Jeff Moyer
Reported-by: Erwin Tsaur
Cc: Johannes Thumshirn
Signed-off-by: Dan Williams
Signed-off-by: Greg Kroah-Hartman
---
 drivers/nvdimm/label.c          |   29 ++++++++++++++++-------------
 drivers/nvdimm/namespace_devs.c |   15 +++++++++++++++
 drivers/nvdimm/nd.h             |    4 ++++
 3 files changed, 35 insertions(+), 13 deletions(-)

--- a/drivers/nvdimm/label.c
+++ b/drivers/nvdimm/label.c
@@ -614,6 +614,17 @@ static const guid_t *to_abstraction_guid
 	return &guid_null;
 }
 
+static void reap_victim(struct nd_mapping *nd_mapping,
+		struct nd_label_ent *victim)
+{
+	struct nvdimm_drvdata *ndd = to_ndd(nd_mapping);
+	u32 slot = to_slot(ndd, victim->label);
+
+	dev_dbg(ndd->dev, "free: %d\n", slot);
+	nd_label_free_slot(ndd, slot);
+	victim->label = NULL;
+}
+
 static int __pmem_label_update(struct nd_region *nd_region,
 		struct nd_mapping *nd_mapping, struct nd_namespace_pmem *nspm,
 		int pos, unsigned long flags)
@@ -621,9 +632,9 @@ static int __pmem_label_update(struct nd
 	struct nd_namespace_common *ndns = &nspm->nsio.common;
 	struct nd_interleave_set *nd_set = nd_region->nd_set;
 	struct nvdimm_drvdata *ndd = to_ndd(nd_mapping);
-	struct nd_label_ent *label_ent, *victim = NULL;
 	struct nd_namespace_label *nd_label;
 	struct nd_namespace_index *nsindex;
+	struct nd_label_ent *label_ent;
 	struct nd_label_id label_id;
 	struct resource *res;
 	unsigned long *free;
@@ -692,18 +703,10 @@ static int __pmem_label_update(struct nd
 	list_for_each_entry(label_ent, &nd_mapping->labels, list) {
 		if (!label_ent->label)
 			continue;
-		if (memcmp(nspm->uuid, label_ent->label->uuid,
-					NSLABEL_UUID_LEN) != 0)
-			continue;
-		victim = label_ent;
-		list_move_tail(&victim->list, &nd_mapping->labels);
-		break;
-	}
-	if (victim) {
-		dev_dbg(ndd->dev, "%s: free: %d\n", __func__, slot);
-		slot = to_slot(ndd, victim->label);
-		nd_label_free_slot(ndd, slot);
-		victim->label = NULL;
+		if (test_and_clear_bit(ND_LABEL_REAP, &label_ent->flags)
+				|| memcmp(nspm->uuid, label_ent->label->uuid,
+					NSLABEL_UUID_LEN) == 0)
+			reap_victim(nd_mapping, label_ent);
 	}
 
 	/* update index */
--- a/drivers/nvdimm/namespace_devs.c
+++ b/drivers/nvdimm/namespace_devs.c
@@ -1229,12 +1229,27 @@ static int namespace_update_uuid(struct
 	for (i = 0; i < nd_region->ndr_mappings; i++) {
 		struct nd_mapping *nd_mapping = &nd_region->mapping[i];
 		struct nvdimm_drvdata *ndd = to_ndd(nd_mapping);
+		struct nd_label_ent *label_ent;
 		struct resource *res;
 
 		for_each_dpa_resource(ndd, res)
 			if (strcmp(res->name, old_label_id.id) == 0)
 				sprintf((void *) res->name, "%s",
 						new_label_id.id);
+
+		mutex_lock(&nd_mapping->lock);
+		list_for_each_entry(label_ent, &nd_mapping->labels, list) {
+			struct nd_namespace_label *nd_label = label_ent->label;
+			struct nd_label_id label_id;
+
+			if (!nd_label)
+				continue;
+			nd_label_gen_id(&label_id, nd_label->uuid,
+					__le32_to_cpu(nd_label->flags));
+			if (strcmp(old_label_id.id, label_id.id) == 0)
+				set_bit(ND_LABEL_REAP, &label_ent->flags);
+		}
+		mutex_unlock(&nd_mapping->lock);
 	}
 	kfree(*old_uuid);
  out:
--- a/drivers/nvdimm/nd.h
+++ b/drivers/nvdimm/nd.h
@@ -120,8 +120,12 @@ struct nd_percpu_lane {
 	spinlock_t lock;
 };
 
+enum nd_label_flags {
+	ND_LABEL_REAP,
+};
 struct nd_label_ent {
 	struct list_head list;
+	unsigned long flags;
 	struct nd_namespace_label *label;
 };
 
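
For readers who want to see the flag-then-reap idea outside the kernel, here
is a rough userspace model of it. It is only a sketch under stated
assumptions: struct label, struct mapping, LABEL_REAP, flag_old_uuid(),
label_update() and active_labels() are hypothetical stand-ins invented for
illustration, not libnvdimm interfaces. The model uses fixed arrays instead
of the kernel list, takes no lock, and reaps before writing the new label,
which simplifies the ordering in __pmem_label_update().

```c
/*
 * Userspace model of the reap-on-rename idea. Every name here is a
 * simplified, hypothetical stand-in; none of this is the libnvdimm API.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define UUID_LEN 16
#define NSLOTS   4

#define LABEL_REAP 0x1UL	/* models the ND_LABEL_REAP bit */

struct label {			/* models one label slot */
	bool in_use;
	unsigned char uuid[UUID_LEN];
};

struct label_ent {		/* models struct nd_label_ent */
	unsigned long flags;
	struct label *label;	/* NULL when no label is tracked */
};

struct mapping {		/* models the per-DIMM label tracking */
	struct label slots[NSLOTS];
	struct label_ent ents[NSLOTS];
};

/* Release the slot behind an entry and stop tracking it. */
static void reap_victim(struct label_ent *victim)
{
	assert(victim->label != NULL);
	victim->label->in_use = false;
	victim->label = NULL;
	victim->flags &= ~LABEL_REAP;
}

/* Rename step: flag every tracked label that still carries old_uuid. */
void flag_old_uuid(struct mapping *m, const unsigned char *old_uuid)
{
	for (size_t i = 0; i < NSLOTS; i++) {
		struct label_ent *ent = &m->ents[i];

		if (ent->label
				&& memcmp(ent->label->uuid, old_uuid, UUID_LEN) == 0)
			ent->flags |= LABEL_REAP;
	}
}

/* Update step: reap flagged or same-uuid labels, then write a new one. */
int label_update(struct mapping *m, const unsigned char *uuid)
{
	for (size_t i = 0; i < NSLOTS; i++) {
		struct label_ent *ent = &m->ents[i];

		if (!ent->label)
			continue;
		if ((ent->flags & LABEL_REAP)
				|| memcmp(ent->label->uuid, uuid, UUID_LEN) == 0)
			reap_victim(ent);
	}

	for (size_t i = 0; i < NSLOTS; i++) {
		if (m->slots[i].in_use)
			continue;
		/* find an idle entry to track the new label */
		for (size_t j = 0; j < NSLOTS; j++) {
			if (m->ents[j].label)
				continue;
			m->slots[i].in_use = true;
			memcpy(m->slots[i].uuid, uuid, UUID_LEN);
			m->ents[j].label = &m->slots[i];
			return (int)i;
		}
	}
	return -1;	/* no free slot or entry */
}

/* Count label slots that are still active. */
size_t active_labels(const struct mapping *m)
{
	size_t n = 0;

	for (size_t i = 0; i < NSLOTS; i++)
		n += m->slots[i].in_use;
	return n;
}
```

In this model, writing a label under one uuid and then calling label_update()
with a different uuid leaves two active labels, which mirrors the duplicate
allocation described in the changelog. Calling flag_old_uuid() between the
two updates leaves only the new label active.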