Received: by 2002:a05:7412:8d06:b0:f9:332d:97f1 with SMTP id bj6csp19522rdb; Mon, 18 Dec 2023 07:52:34 -0800 (PST) X-Google-Smtp-Source: AGHT+IF8rai1lFHl5DsieSZnnd/GLu6JbxqjkLcCFh4OTUdxd/js05BtwmihiITrqPbcd4r4HaIy X-Received: by 2002:a50:9fc9:0:b0:553:70f0:f429 with SMTP id c67-20020a509fc9000000b0055370f0f429mr267530edf.63.1702914754149; Mon, 18 Dec 2023 07:52:34 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1702914754; cv=none; d=google.com; s=arc-20160816; b=kZ2hE7UjtW1FLv/AVbr4bua5UQslu0jCNsHX4cBX7PM6LK/Cka3r0ZYTJSzYGj59D6 QN+2LRxQSoJJMJLiw6JUikBypElt5nHQwiK5PmY1CZ6Bci4y8FIrXgTuKrTPqZnHECJW 8pfWZB2TAEEyLzqb6PVYAB7MKga9M3FkwX9lcno1GAhIy4WxQrdnRMTmqCDBqWRBF8QH jt2Uo7/WM3z3Maei0zrqEdKK4XQq+Bw1nKspIqwk02ubxcv65lcku9fFlKue5bXqvuRB U8Djctdjfcf5MY7MytZL+XG2hm6P8JLaA3k4m2IxJkMByzeqVxRlffXtG+hiTOfsmyzk S14g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:from:dkim-signature:dkim-signature :dkim-signature:dkim-signature; bh=YqGuPFrtbrN2FWZZjxVD7mfXJAhGmfEeq9rtj6/wYaA=; fh=3QIVKsEmsld8rvcEkMSr/XDw53o2lrGJeYcvS7FhzFk=; b=L/DpCEmlivgFbOZbZ2XzQkWpls+hSfPreDxNFG26CjiROgIeC6CeDNJKuYZG35/OD0 75A8GibyFbat/jSRpbwMaWi8gbSGKRsW7JYYFRrvs98PUY+4rgaTTpL7CeyMKdVqWVIi uvVN6ZFsYQW2/zHb51DqBRkLcwCmH7bFWI/VZV/ek10FPgHQzkRqqmceNiYdfjI/BvZj muqgigeBtMvNuB26RQBsYQpsewk8qNk+fV5BgxqTpaWK11xU4bs0EyJn8mBPwEAwQF6K yjeZhugX1gjnm/gyjpJoOgy3urOpfx6lScDaxRJ8oSHg3qeTF6+rgNXdCNC5a/aQBG56 1AKw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@suse.de header.s=susede2_rsa header.b=Yyw+GZ3f; dkim=neutral (no key) header.i=@suse.de header.s=susede2_ed25519; dkim=pass header.i=@suse.de header.s=susede2_rsa header.b=Yyw+GZ3f; dkim=neutral (no key) header.i=@suse.de; spf=pass (google.com: domain of linux-kernel+bounces-3987-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.80.249 as permitted sender) smtp.mailfrom="linux-kernel+bounces-3987-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=suse.de Return-Path: Received: from am.mirrors.kernel.org (am.mirrors.kernel.org. [147.75.80.249]) by mx.google.com with ESMTPS id e7-20020a056402190700b0054f64f49accsi9271423edz.305.2023.12.18.07.52.34 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 18 Dec 2023 07:52:34 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-3987-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.80.249 as permitted sender) client-ip=147.75.80.249; Authentication-Results: mx.google.com; dkim=pass header.i=@suse.de header.s=susede2_rsa header.b=Yyw+GZ3f; dkim=neutral (no key) header.i=@suse.de header.s=susede2_ed25519; dkim=pass header.i=@suse.de header.s=susede2_rsa header.b=Yyw+GZ3f; dkim=neutral (no key) header.i=@suse.de; spf=pass (google.com: domain of linux-kernel+bounces-3987-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.80.249 as permitted sender) smtp.mailfrom="linux-kernel+bounces-3987-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=suse.de Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by am.mirrors.kernel.org (Postfix) with ESMTPS id 6C09E1F272BF for ; Mon, 18 Dec 2023 15:52:32 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 98C9D3D54C; Mon, 18 Dec 2023 15:51:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="Yyw+GZ3f"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="q2zNUTsA"; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="Yyw+GZ3f"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="q2zNUTsA" X-Original-To: linux-kernel@vger.kernel.org Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 333F315485 for ; Mon, 18 Dec 2023 15:51:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.de Received: from imap2.dmz-prg2.suse.org (imap2.dmz-prg2.suse.org [10.150.64.98]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 5E4FB1F445; Mon, 18 Dec 2023 15:51:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1702914696; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=YqGuPFrtbrN2FWZZjxVD7mfXJAhGmfEeq9rtj6/wYaA=; b=Yyw+GZ3fwz+aWhW3vC8lhHRmquNamtckqmYvfeAOZ7crpeXoXUrQH6Pc9a3mxatbHEFS4F UFSmOHzfOPyaopfNaPEUkvHLoFmjaeNMZOVZepDFbO4gW9nQX1vZ4kygVhgnXCgXlB6DKS gCCdfuqu0XmXczhEuXMrspIiDdiEujI= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1702914696; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=YqGuPFrtbrN2FWZZjxVD7mfXJAhGmfEeq9rtj6/wYaA=; b=q2zNUTsA7Mv4kL3rzEaASJ72ZwNG3Ex04giS+QEKaPRqDvWu5JXsrSDFWs8dijWY6I1G4t zlW2VGzLkQrmFbBA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1702914696; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=YqGuPFrtbrN2FWZZjxVD7mfXJAhGmfEeq9rtj6/wYaA=; b=Yyw+GZ3fwz+aWhW3vC8lhHRmquNamtckqmYvfeAOZ7crpeXoXUrQH6Pc9a3mxatbHEFS4F UFSmOHzfOPyaopfNaPEUkvHLoFmjaeNMZOVZepDFbO4gW9nQX1vZ4kygVhgnXCgXlB6DKS gCCdfuqu0XmXczhEuXMrspIiDdiEujI= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1702914696; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=YqGuPFrtbrN2FWZZjxVD7mfXJAhGmfEeq9rtj6/wYaA=; b=q2zNUTsA7Mv4kL3rzEaASJ72ZwNG3Ex04giS+QEKaPRqDvWu5JXsrSDFWs8dijWY6I1G4t zlW2VGzLkQrmFbBA== Received: from imap2.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap2.dmz-prg2.suse.org (Postfix) with ESMTPS id 4F15913BC8; Mon, 18 Dec 2023 15:51:36 +0000 (UTC) Received: from dovecot-director2.suse.de ([10.150.64.162]) by imap2.dmz-prg2.suse.org with ESMTPSA id LPZdEohqgGWLAQAAn2gu4w (envelope-from ); Mon, 18 Dec 2023 15:51:36 +0000 From: Daniel Wagner To: linux-nvme@lists.infradead.org Cc: linux-kernel@vger.kernel.org, Christoph Hellwig , Sagi Grimberg , Keith Busch , James Smart , Hannes Reinecke , Daniel Wagner Subject: [PATCH v3 08/16] nvmet-fc: untangle cross refcounting objects Date: Mon, 18 Dec 2023 16:30:56 +0100 Message-ID: <20231218153105.12717-9-dwagner@suse.de> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20231218153105.12717-1-dwagner@suse.de> References: <20231218153105.12717-1-dwagner@suse.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Level: X-Spam-Level: X-Spamd-Result: default: False [-2.10 / 50.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; R_MISSING_CHARSET(2.50)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; REPLY(-4.00)[]; BROKEN_CONTENT_TYPE(1.50)[]; RCVD_COUNT_THREE(0.00)[3]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; RCPT_COUNT_SEVEN(0.00)[8]; MID_CONTAINS_FROM(1.00)[]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:email]; FUZZY_BLOCKED(0.00)[rspamd.com]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; RCVD_TLS_ALL(0.00)[]; BAYES_HAM(-3.00)[100.00%] Authentication-Results: smtp-out2.suse.de; none X-Spam-Score: -2.10 X-Spam-Flag: NO Associations take a refcount on queues, queues take a refcount on associations. The existing code lead to the situation that the target executes a disconnect and the host triggers a reconnect immediately. The reconnect command still finds an existing association and uses this. Though the reconnect crashes later on because nvmet_fc_delete_target_assoc() blindly goes ahead and removes resources while the reconnect code wants to use it. The problem is that nvmet_fc_find_target_assoc() is able to lookup an association which is being removed. So the first thing to address nvmet_fc_find_target_queue() is to remove the association out of the list and wait a RCU cycle and free resources in the free function callback of the kref_put(). The live time of the queues are strictly bound to the lifetime of an association. Thus we don't need to take reverse refcounts (queue -> association). Furthermore, streamline the cleanup code by using the workqueue for delete the association in nvmet_fc_ls_disconnect. This ensures, that we run through the same shutdown path in all non error cases. Reproducer: nvme/003 Signed-off-by: Daniel Wagner --- drivers/nvme/target/fc.c | 83 +++++++++++++++++++--------------------- 1 file changed, 40 insertions(+), 43 deletions(-) diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c index 28e432e62361..db992df13c73 100644 --- a/drivers/nvme/target/fc.c +++ b/drivers/nvme/target/fc.c @@ -166,6 +166,7 @@ struct nvmet_fc_tgt_assoc { struct nvmet_fc_hostport *hostport; struct nvmet_fc_ls_iod *rcv_disconn; struct list_head a_list; + struct nvmet_fc_tgt_queue *_queues[NVMET_NR_QUEUES + 1]; struct nvmet_fc_tgt_queue __rcu *queues[NVMET_NR_QUEUES + 1]; struct kref ref; struct work_struct del_work; @@ -803,14 +804,11 @@ nvmet_fc_alloc_target_queue(struct nvmet_fc_tgt_assoc *assoc, if (!queue) return NULL; - if (!nvmet_fc_tgt_a_get(assoc)) - goto out_free_queue; - queue->work_q = alloc_workqueue("ntfc%d.%d.%d", 0, 0, assoc->tgtport->fc_target_port.port_num, assoc->a_id, qid); if (!queue->work_q) - goto out_a_put; + goto out_free_queue; queue->qid = qid; queue->sqsize = sqsize; @@ -831,7 +829,8 @@ nvmet_fc_alloc_target_queue(struct nvmet_fc_tgt_assoc *assoc, if (ret) goto out_fail_iodlist; - WARN_ON(assoc->queues[qid]); + WARN_ON(assoc->_queues[qid]); + assoc->_queues[qid] = queue; rcu_assign_pointer(assoc->queues[qid], queue); return queue; @@ -839,8 +838,6 @@ nvmet_fc_alloc_target_queue(struct nvmet_fc_tgt_assoc *assoc, out_fail_iodlist: nvmet_fc_destroy_fcp_iodlist(assoc->tgtport, queue); destroy_workqueue(queue->work_q); -out_a_put: - nvmet_fc_tgt_a_put(assoc); out_free_queue: kfree(queue); return NULL; @@ -853,12 +850,8 @@ nvmet_fc_tgt_queue_free(struct kref *ref) struct nvmet_fc_tgt_queue *queue = container_of(ref, struct nvmet_fc_tgt_queue, ref); - rcu_assign_pointer(queue->assoc->queues[queue->qid], NULL); - nvmet_fc_destroy_fcp_iodlist(queue->assoc->tgtport, queue); - nvmet_fc_tgt_a_put(queue->assoc); - destroy_workqueue(queue->work_q); kfree_rcu(queue, rcu); @@ -1173,13 +1166,18 @@ nvmet_fc_target_assoc_free(struct kref *ref) struct nvmet_fc_tgtport *tgtport = assoc->tgtport; struct nvmet_fc_ls_iod *oldls; unsigned long flags; + int i; + + for (i = NVMET_NR_QUEUES; i >= 0; i--) { + if (assoc->_queues[i]) + nvmet_fc_delete_target_queue(assoc->_queues[i]); + } /* Send Disconnect now that all i/o has completed */ nvmet_fc_xmt_disconnect_assoc(assoc); nvmet_fc_free_hostport(assoc->hostport); spin_lock_irqsave(&tgtport->lock, flags); - list_del_rcu(&assoc->a_list); oldls = assoc->rcv_disconn; spin_unlock_irqrestore(&tgtport->lock, flags); /* if pending Rcv Disconnect Association LS, send rsp now */ @@ -1209,7 +1207,7 @@ static void nvmet_fc_delete_target_assoc(struct nvmet_fc_tgt_assoc *assoc) { struct nvmet_fc_tgtport *tgtport = assoc->tgtport; - struct nvmet_fc_tgt_queue *queue; + unsigned long flags; int i, terminating; terminating = atomic_xchg(&assoc->terminating, 1); @@ -1218,29 +1216,25 @@ nvmet_fc_delete_target_assoc(struct nvmet_fc_tgt_assoc *assoc) if (terminating) return; + /* prevent new I/Os entering the queues */ + for (i = NVMET_NR_QUEUES; i >= 0; i--) + rcu_assign_pointer(assoc->queues[i], NULL); - for (i = NVMET_NR_QUEUES; i >= 0; i--) { - rcu_read_lock(); - queue = rcu_dereference(assoc->queues[i]); - if (!queue) { - rcu_read_unlock(); - continue; - } + spin_lock_irqsave(&tgtport->lock, flags); + list_del_rcu(&assoc->a_list); + spin_unlock_irqrestore(&tgtport->lock, flags); - if (!nvmet_fc_tgt_q_get(queue)) { - rcu_read_unlock(); - continue; - } - rcu_read_unlock(); - nvmet_fc_delete_target_queue(queue); - nvmet_fc_tgt_q_put(queue); + synchronize_rcu(); + + /* ensure all in-flight I/Os have been processed */ + for (i = NVMET_NR_QUEUES; i >= 0; i--) { + if (assoc->_queues[i]) + flush_workqueue(assoc->_queues[i]->work_q); } dev_info(tgtport->dev, "{%d:%d} Association deleted\n", tgtport->fc_target_port.port_num, assoc->a_id); - - nvmet_fc_tgt_a_put(assoc); } static struct nvmet_fc_tgt_assoc * @@ -1493,9 +1487,8 @@ __nvmet_fc_free_assocs(struct nvmet_fc_tgtport *tgtport) list_for_each_entry_rcu(assoc, &tgtport->assoc_list, a_list) { if (!nvmet_fc_tgt_a_get(assoc)) continue; - if (!queue_work(nvmet_wq, &assoc->del_work)) - /* already deleting - release local reference */ - nvmet_fc_tgt_a_put(assoc); + queue_work(nvmet_wq, &assoc->del_work); + nvmet_fc_tgt_a_put(assoc); } rcu_read_unlock(); } @@ -1548,9 +1541,8 @@ nvmet_fc_invalidate_host(struct nvmet_fc_target_port *target_port, continue; assoc->hostport->invalid = 1; noassoc = false; - if (!queue_work(nvmet_wq, &assoc->del_work)) - /* already deleting - release local reference */ - nvmet_fc_tgt_a_put(assoc); + queue_work(nvmet_wq, &assoc->del_work); + nvmet_fc_tgt_a_put(assoc); } spin_unlock_irqrestore(&tgtport->lock, flags); @@ -1594,9 +1586,8 @@ nvmet_fc_delete_ctrl(struct nvmet_ctrl *ctrl) nvmet_fc_tgtport_put(tgtport); if (found_ctrl) { - if (!queue_work(nvmet_wq, &assoc->del_work)) - /* already deleting - release local reference */ - nvmet_fc_tgt_a_put(assoc); + queue_work(nvmet_wq, &assoc->del_work); + nvmet_fc_tgt_a_put(assoc); return; } @@ -1626,6 +1617,8 @@ nvmet_fc_unregister_targetport(struct nvmet_fc_target_port *target_port) /* terminate any outstanding associations */ __nvmet_fc_free_assocs(tgtport); + flush_workqueue(nvmet_wq); + /* * should terminate LS's as well. However, LS's will be generated * at the tail end of association termination, so they likely don't @@ -1871,9 +1864,6 @@ nvmet_fc_ls_disconnect(struct nvmet_fc_tgtport *tgtport, sizeof(struct fcnvme_ls_disconnect_assoc_acc)), FCNVME_LS_DISCONNECT_ASSOC); - /* release get taken in nvmet_fc_find_target_assoc */ - nvmet_fc_tgt_a_put(assoc); - /* * The rules for LS response says the response cannot * go back until ABTS's have been sent for all outstanding @@ -1888,8 +1878,6 @@ nvmet_fc_ls_disconnect(struct nvmet_fc_tgtport *tgtport, assoc->rcv_disconn = iod; spin_unlock_irqrestore(&tgtport->lock, flags); - nvmet_fc_delete_target_assoc(assoc); - if (oldls) { dev_info(tgtport->dev, "{%d:%d} Multiple Disconnect Association LS's " @@ -1905,6 +1893,9 @@ nvmet_fc_ls_disconnect(struct nvmet_fc_tgtport *tgtport, nvmet_fc_xmt_ls_rsp(tgtport, oldls); } + queue_work(nvmet_wq, &assoc->del_work); + nvmet_fc_tgt_a_put(assoc); + return false; } @@ -2903,6 +2894,9 @@ nvmet_fc_remove_port(struct nvmet_port *port) nvmet_fc_portentry_unbind(pe); + /* terminate any outstanding associations */ + __nvmet_fc_free_assocs(pe->tgtport); + kfree(pe); } @@ -2934,6 +2928,9 @@ static int __init nvmet_fc_init_module(void) static void __exit nvmet_fc_exit_module(void) { + /* ensure any shutdown operation, e.g. delete ctrls have finished */ + flush_workqueue(nvmet_wq); + /* sanity check - all lports should be removed */ if (!list_empty(&nvmet_fc_target_list)) pr_warn("%s: targetport list not empty\n", __func__); -- 2.43.0