From: "wubo (T)"
To: Lee Duncan, "cleech@redhat.com", "jejb@linux.ibm.com",
    "martin.petersen@oracle.com", "open-iscsi@googlegroups.com",
    "linux-scsi@vger.kernel.org", "linux-kernel@vger.kernel.org",
    Ulrich Windl
CC: Mingfangsen, "liuzhiqiang (I)"
Subject: [PATCH V4] scsi: avoid potential deadlock in iscsi_if_rx func
Date: Wed, 20 Nov 2019 13:26:17 +0000
X-Mailing-List: linux-kernel@vger.kernel.org

In iscsi_if_rx(), after one request is received through
iscsi_if_recv_msg(), iscsi_if_send_reply() is called in a do-while loop
to reply to the request. If iscsi_if_send_reply() keeps returning
-EAGAIN, the loop never terminates and a deadlock occurs.

For example, if a client only sends messages and never calls recvmsg(),
the result is a watchdog soft lockup. The details are as follows.

Details of the special case which can cause the deadlock:

    sock_fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ISCSI);
    retval = bind(sock_fd, (struct sockaddr *)&src_addr, sizeof(src_addr));
    while (1) {
        state_msg = sendmsg(sock_fd, &msg, 0);
        /* Note: recvmsg(sock_fd, &msg, 0) is never called here. */
    }
    close(sock_fd);

watchdog: BUG: soft lockup - CPU#7 stuck for 22s! [netlink_test:253305]
Sample time: 4000897528 ns(HZ: 250)
Sample stat:
 curr: user: 675503481560, nice: 321724050, sys: 448689506750, idle: 4654054240530, iowait: 40885550700, irq: 14161174020, softirq: 8104324140, st: 0
 deta: user: 0, nice: 0, sys: 3998210100, idle: 0, iowait: 0, irq: 1547170, softirq: 242870, st: 0
Sample softirq:
 TIMER: 992
 SCHED: 8
Sample irqstat:
 irq 2: delta 1003, curr: 3103802, arch_timer
CPU: 7 PID: 253305 Comm: netlink_test Kdump: loaded Tainted: G OE
Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
pstate: 40400005 (nZcv daif +PAN -UAO)
pc : __alloc_skb+0x104/0x1b0
lr : __alloc_skb+0x9c/0x1b0
sp : ffff000033603a30
x29: ffff000033603a30 x28: 00000000000002dd
x27: ffff800b34ced810 x26: ffff800ba7569f00
x25: 00000000ffffffff x24: 0000000000000000
x23: ffff800f7c43f600 x22: 0000000000480020
x21: ffff0000091d9000 x20: ffff800b34eff200
x19: ffff800ba7569f00 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000
x15: 0000000000000000 x14: 0001000101000100
x13: 0000000101010000 x12: 0101000001010100
x11: 0001010101010001 x10: 00000000000002dd
x9 : ffff000033603d58 x8 : ffff800b34eff400
x7 : ffff800ba7569200 x6 : ffff800b34eff400
x5 : 0000000000000000 x4 : 00000000ffffffff
x3 : 0000000000000000 x2 : 0000000000000001
x1 : ffff800b34eff2c0 x0 : 0000000000000300
Call trace:
 __alloc_skb+0x104/0x1b0
 iscsi_if_rx+0x144/0x12bc [scsi_transport_iscsi]
 netlink_unicast+0x1e0/0x258
 netlink_sendmsg+0x310/0x378
 sock_sendmsg+0x4c/0x70
 sock_write_iter+0x90/0xf0
 __vfs_write+0x11c/0x190
 vfs_write+0xac/0x1c0
 ksys_write+0x6c/0xd8
 __arm64_sys_write+0x24/0x30
 el0_svc_common+0x78/0x130
 el0_svc_handler+0x38/0x78
 el0_svc+0x8/0xc

To avoid the deadlock, limit the number of retries in the do-while loop.

V4:
 - Modify the patch subject; no code change.

V3:
 - Replace the error with a warning, as suggested by Ulrich.

V2:
 - Add a debug kernel message, as suggested by Lee Duncan.

Signed-off-by: Bo Wu
Reviewed-by: Zhiqiang Liu
Reviewed-by: Lee Duncan
---
 drivers/scsi/scsi_transport_iscsi.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index 417b868d8735..ed8d9709b9b9 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -24,6 +24,8 @@
 
 #define ISCSI_TRANSPORT_VERSION "2.0-870"
 
+#define ISCSI_SEND_MAX_ALLOWED 10
+
 #define CREATE_TRACE_POINTS
 #include
 
@@ -3682,6 +3684,7 @@ iscsi_if_rx(struct sk_buff *skb)
 		struct nlmsghdr *nlh;
 		struct iscsi_uevent *ev;
 		uint32_t group;
+		int retries = ISCSI_SEND_MAX_ALLOWED;
 
 		nlh = nlmsg_hdr(skb);
 		if (nlh->nlmsg_len < sizeof(*nlh) + sizeof(*ev) ||
@@ -3712,6 +3715,10 @@ iscsi_if_rx(struct sk_buff *skb)
 				break;
 			err = iscsi_if_send_reply(portid, nlh->nlmsg_type,
 						  ev, sizeof(*ev));
+			if (err == -EAGAIN && --retries < 0) {
+				printk(KERN_WARNING "Send reply failed, error %d\n", err);
+				break;
+			}
 		} while (err < 0 && err != -ECONNREFUSED && err != -ESRCH);
 		skb_pull(skb, rlen);
 	}
-- 
1.8.3.1
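
For readers who want to see the effect of the retry limit outside the kernel, the loop shape can be sketched in user space. This is only an illustration under stated assumptions, not kernel code: stub_send_reply(), send_calls, reply_with_retry_limit() and SEND_MAX_ALLOWED are hypothetical names standing in for iscsi_if_send_reply() and ISCSI_SEND_MAX_ALLOWED, and the stub assumes a receiver that never drains its socket, so every send attempt fails with -EAGAIN.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical mirror of ISCSI_SEND_MAX_ALLOWED from the patch. */
#define SEND_MAX_ALLOWED 10

/* Counts how many times the stub below was invoked. */
static int send_calls;

/*
 * Hypothetical stand-in for iscsi_if_send_reply(). It assumes the
 * receiver never calls recvmsg(), so every attempt returns -EAGAIN.
 */
static int stub_send_reply(void)
{
	send_calls++;
	return -EAGAIN;
}

/*
 * Same loop shape as the patched do-while in iscsi_if_rx(): retry on
 * error, but give up once -EAGAIN has exhausted the retry budget.
 * Returns the last error code seen.
 */
static int reply_with_retry_limit(int (*send_reply)(void))
{
	int retries = SEND_MAX_ALLOWED;
	int err;

	assert(send_reply != NULL);

	do {
		err = send_reply();
		if (err == -EAGAIN && --retries < 0) {
			fprintf(stderr, "Send reply failed, error %d\n", err);
			break;
		}
	} while (err < 0 && err != -ECONNREFUSED && err != -ESRCH);

	return err;
}
```

With the always-failing stub, reply_with_retry_limit(stub_send_reply) returns -EAGAIN after SEND_MAX_ALLOWED + 1 send attempts: the first attempt plus ten retries. Without the added check, the same loop would spin indefinitely, which matches the soft lockup reported above.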