Message-ID: <1539145206.6124.15.camel@haakon3.daterainc.com>
Subject: Re: [PATCH 0/2] target: Fix v4.19-rc active I/O shutdown deadlock
From: "Nicholas A. Bellinger"
To: target-devel
Cc: linux-scsi, lkml, "Martin K. Petersen", Mike Christie,
	Hannes Reinecke, Christoph Hellwig, Sagi Grimberg, "Bryant G. Ly",
	"Peter Zijlstra (Intel)"
Date: Tue, 09 Oct 2018 21:20:06 -0700
In-Reply-To: <1539141790-13557-1-git-send-email-nab@linux-iscsi.org>
References: <1539141790-13557-1-git-send-email-nab@linux-iscsi.org>
Content-Type: multipart/mixed; boundary="=-mAJjcxyyMXpcSErrgYWS"
X-Mailer: Evolution 3.4.4-1
Mime-Version: 1.0
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

--=-mAJjcxyyMXpcSErrgYWS
Content-Type: text/plain; charset="ISO-8859-15"
Content-Transfer-Encoding: 7bit

On Wed, 2018-10-10 at 03:23 +0000, Nicholas A. Bellinger wrote:
> From: Nicholas Bellinger
>
> Hi MNC, MKP & Co,
>
> While testing v4.19-rc recently with simple backend I/O error injection
> (via delayed BIO completion), I was able to trigger an end-less loop
> deadlock with recent changes in commit 00d909a107:
>
>   Author: Bart Van Assche
>   Date:   Fri Jun 22 14:52:53 2018 -0700
>
>     scsi: target: Make the session shutdown code also wait for commands that are being aborted
>
> It comes down to an incorrect assumption wrt signals during session
> shutdown plus active I/O quiesce, which triggers an endless loop
> immediately during session shutdown as se_session->sess_list_wq
> waits for outstanding backend I/O to complete.
>
> The easiest reproduction is with iser-target or simulation with plain
> old iscsi-target/TCP ports.

For reference, attached are two debug patches and instructions to
trigger the endless loop deadlock regression on v4.19-rc.

1) Simulate iscsi-target via iscsit_transport->iscsit_wait_conn()

This makes iscsi-target/TCP follow the isert_wait_conn() code path, and
uses iscsit_transport->iscsit_wait_conn() during active I/O shutdown to
invoke target_wait_for_sess_cmds() with signals pending, per the existing
iser-target session shutdown logic. Useful for triggering this in a VM,
without an RDMA-capable NIC.

2) Simulate IBLOCK WRITE delayed completion by 60 seconds

MNC likes to use scsi_debug for this, but I use BRD to add an arbitrary
completion delay.

-----------------------------------------------------------------------

So once a /sys/kernel/config/target/core/$IBLOCK_HBA/$IBLOCK_DEV/ has
been created + exported via /sys/kernel/config/target/iscsi/$IQN/$TPGT/,
issue a single block WRITE.

Once WRITE completion is delayed by IBLOCK, go ahead and send a
'kill -SIGINT $PID' to the iscsi_trx kthread to trigger the usual
iscsi/iser session shutdown + reconnect for the connection with the
outstanding delayed I/O.

Once target_wait_for_sess_cmds() is called with signals pending, it
will immediately kill the machine.

--=-mAJjcxyyMXpcSErrgYWS
Content-Disposition: attachment;
	filename*0=0001-iscsi-target-Add-iscsit_wait_conn-simulation-for-tes.pat;
	filename*1=ch
Content-Type: text/x-patch;
	name="0001-iscsi-target-Add-iscsit_wait_conn-simulation-for-tes.patch";
	charset="ISO-8859-15"
Content-Transfer-Encoding: 7bit

From 9b1d23148994edb0969a5efd3a131122ad25e39d Mon Sep 17 00:00:00 2001
From: Nicholas Bellinger
Date: Tue, 9 Oct 2018 01:57:53 -0700
Subject: [PATCH 1/2] iscsi-target: Add iscsit_wait_conn() simulation for testing

Simulate sess_tearing_down put in iscsit_queue_rsp following how
iser-target isert_put_cmd() works.

Signed-off-by: Nicholas Bellinger
---
 drivers/target/iscsi/iscsi_target.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
index cc756a1..b05d8af 100644
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -485,6 +485,22 @@ int iscsit_del_np(struct iscsi_np *np)
 
 int iscsit_queue_rsp(struct iscsi_conn *conn, struct iscsi_cmd *cmd)
 {
+	if (conn && conn->sess && conn->sess->se_sess) {
+		struct se_session *se_sess = conn->sess->se_sess;
+
+		if (se_sess->sess_tearing_down) {
+			printk("Got iscsit_queue_rsp sess_tearing_down !!!!!!\n");
+
+			spin_lock_bh(&conn->cmd_lock);
+			if (!list_empty(&cmd->i_conn_node))
+				list_del_init(&cmd->i_conn_node);
+			spin_unlock_bh(&conn->cmd_lock);
+
+			transport_generic_free_cmd(&cmd->se_cmd, 0);
+			return 0;
+		}
+	}
+
 	return iscsit_add_cmd_to_response_queue(cmd, cmd->conn, cmd->i_state);
 }
 EXPORT_SYMBOL(iscsit_queue_rsp);
@@ -667,6 +683,16 @@ static enum target_prot_op iscsit_get_sup_prot_ops(struct iscsi_conn *conn)
 	return TARGET_PROT_NORMAL;
 }
 
+static void iscsit_wait_conn(struct iscsi_conn *conn)
+{
+	if (conn->sess) {
+		target_sess_cmd_list_set_waiting(conn->sess->se_sess);
+		printk("se_sess: %p before target_wait_for_sess_cmds\n", conn->sess->se_sess);
+		target_wait_for_sess_cmds(conn->sess->se_sess);
+		printk("se_sess: %p after target_wait_for_sess_cmds\n", conn->sess->se_sess);
+	}
+}
+
 static struct iscsit_transport iscsi_target_transport = {
 	.name = "iSCSI/TCP",
 	.transport_type = ISCSI_TCP,
@@ -686,6 +712,7 @@ static enum target_prot_op iscsit_get_sup_prot_ops(struct iscsi_conn *conn)
 	.iscsit_xmit_pdu = iscsit_xmit_pdu,
 	.iscsit_get_rx_pdu = iscsit_get_rx_pdu,
 	.iscsit_get_sup_prot_ops = iscsit_get_sup_prot_ops,
+	.iscsit_wait_conn = iscsit_wait_conn,
 };
 
 static int __init iscsi_target_init_module(void)
-- 
1.9.1

--=-mAJjcxyyMXpcSErrgYWS
Content-Disposition: attachment;
	filename*0=0002-target-iblock-Delayed-bios-for-active-I-O-shutdown-t.pat;
	filename*1=ch
Content-Type: text/x-patch;
	name="0002-target-iblock-Delayed-bios-for-active-I-O-shutdown-t.patch";
	charset="ISO-8859-15"
Content-Transfer-Encoding: 7bit

From dc01432e1d561a4d20b20fe868fc8df22fcc4601 Mon Sep 17 00:00:00 2001
From: Nicholas Bellinger
Date: Thu, 9 Feb 2017 20:56:01 -0800
Subject: [PATCH 2/2] target/iblock: Delayed bios for active I/O shutdown testing

Signed-off-by: Nicholas Bellinger
---
 drivers/target/target_core_iblock.c | 21 +++++++++++++++++++++
 drivers/target/target_core_iblock.h |  2 ++
 2 files changed, 23 insertions(+)

diff --git a/drivers/target/target_core_iblock.c b/drivers/target/target_core_iblock.c
index ce1321a..deef231 100644
--- a/drivers/target/target_core_iblock.c
+++ b/drivers/target/target_core_iblock.c
@@ -297,6 +297,19 @@ static void iblock_complete_cmd(struct se_cmd *cmd)
 	kfree(ibr);
 }
 
+static void iblock_delayed_work(struct work_struct *work)
+{
+	struct iblock_req *ibr = container_of(work,
+				struct iblock_req, delayed_work.work);
+	struct bio *bio = ibr->bio;
+	struct se_cmd *cmd = bio->bi_private;
+
+	printk("iblock_delayed_work: cmd: %p\n", cmd);
+
+	bio_put(bio);
+	iblock_complete_cmd(cmd);
+}
+
 static void iblock_bio_done(struct bio *bio)
 {
 	struct se_cmd *cmd = bio->bi_private;
@@ -311,6 +324,14 @@ static void iblock_bio_done(struct bio *bio)
 		smp_mb__after_atomic();
 	}
 
+	if (cmd->data_direction == DMA_TO_DEVICE) {
+		ibr->bio = bio;
+		INIT_DELAYED_WORK(&ibr->delayed_work, iblock_delayed_work);
+		schedule_delayed_work(&ibr->delayed_work, 60 * HZ);
+		printk("Queued ibr->delayed_work ! :-)\n");
+		return;
+	}
+
 	bio_put(bio);
 
 	iblock_complete_cmd(cmd);
diff --git a/drivers/target/target_core_iblock.h b/drivers/target/target_core_iblock.h
index 9cc3843..122b415 100644
--- a/drivers/target/target_core_iblock.h
+++ b/drivers/target/target_core_iblock.h
@@ -14,6 +14,8 @@
 struct iblock_req {
 	refcount_t pending;
 	atomic_t ib_bio_err_cnt;
+	struct delayed_work delayed_work;
+	struct bio *bio;
 } ____cacheline_aligned;
 
 #define IBDF_HAS_UDEV_PATH 0x01
-- 
1.9.1

--=-mAJjcxyyMXpcSErrgYWS--