From: Jason Yan
Subject: [RFC PATCH v2] scsi: fix oops in scsi_uninit_cmd()
Date: Sat, 16 Mar 2019 10:09:05 +0800
Message-ID: <20190316020905.14962-1-yanaijie@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

If we remove the scsi disk while running I/O with fio, an oops occurs
under the following condition.

[scsi_eh_0]                             [fio]
scsi_end_request
  ->blk_update_request
    ->end_bio(io returned to userspace)
                                        close
                                          ->sd_release
                                            ->scsi_disk_put
                                              ->scsi_disk_release
                                                ->disk->private_data = NULL;
  ->scsi_mq_uninit_cmd
    ->scsi_uninit_cmd
      ->scsi_cmd_to_driver
        ->drv is NULL, Oops

There is a small window between blk_update_request() and
scsi_mq_uninit_cmd() in which the scsi disk may already have been
released. This causes an oops like the one below:

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
s/sync.c:67, func=xfer, error=In[11347.116050] Mem abort info:
put/output error
[11347.121598]   ESR = 0x96000006
[11347.126200]   Exception class = DABT (current EL), IL = 32 bits
[11347.132117]   SET = 0, FnV = 0
[11347.135170]   EA = 0, S1PTW = 0
[11347.138308] Data abort info:
[11347.141186]   ISV = 0, ISS = 0x00000006
[11347.145019]   CM = 0, WnR = 0
[11347.147977] user pgtable: 4k pages, 48-bit VAs, pgdp = 00000000a67aece2
[11347.154591] [0000000000000000] pgd=0000002f90774003, pud=0000002fab098003, pmd=0000000000000000
[11347.163304] Internal error: Oops: 96000006 [#1] PREEMPT SMP
[11347.168870] Modules linked in: hisi_sas_v3_hw hisi_sas_main libsas
[11347.175044] CPU: 56 PID: 4294 Comm: scsi_eh_2 Not tainted 4.19.0-g8052059-dirty #2
[11347.182600] Hardware name: Huawei D06/D06, BIOS Hisilicon D06 UEFI RC0 - B601 (V6.01) 11/08/2018
[11347.191370] pstate: a0c00009 (NzCv daif +PAN +UAO)
[11347.196155] pc : scsi_uninit_cmd+0x24/0x3c
[11347.200240] lr : scsi_mq_uninit_cmd+0x1c/0x30
[11347.204583] sp : ffff000024dabb60
[11347.207884] x29: ffff000024dabb60 x28: ffff000024dabd38
[11347.213184] x27: ffff000000f5b3a8 x26: ffff7df3b0181600
[11347.218484] x25: 0000000000000000 x24: ffff803bc5d36778
[11347.223783] x23: 000000000000000a x22: 0000000000000000
[11347.229082] x21: ffff803bc7397000 x20: ffff802f9148e530
[11347.234381] x19: ffff802f9148e530 x18: ffff7e0000000000
[11347.239679] x17: 0000000000000000 x16: 0000002f9e37d000
[11347.244979] x15: ffff7e0000000000 x14: 3863206336203839
[11347.250278] x13: 2036302030302038 x12: a46fac3d0d363d00
[11347.255578] x11: ffffffffffffffff x10: a46fac3d0d363d00
[11347.260877] x9 : 0000000040040000 x8 : 000000000000eb4b
[11347.266177] x7 : ffff000009771000 x6 : 0000000000210d00
[11347.271476] x5 : ffff803bc9f50000 x4 : 0000000000000000
[11347.276775] x3 : ffff802fb02b4380 x2 : ffff802f9148e400
[11347.282075] x1 : 0000000000000000 x0 : ffff802f9148e530
[11347.287375] Process scsi_eh_2 (pid: 4294, stack limit = 0x000000007d2257f8)
[11347.294323] Call trace:
Jobs: 6 (f=6): [R[RRR1XXX1XRR3] 47.296758]  scsi_uninit_cmd+0x24/0x3c
 [22.7% done] [1516MB/0KB/0KB /s] [754/0/0 iops] [eta 08m:39s]
[11347.308390]  scsi_mq_uninit_cmd+0x1c/0x30
[11347.312387]  scsi_end_request+0x7c/0x1b8
[11347.316297]  scsi_io_completion+0x464/0x668
[11347.320467]  scsi_finish_command+0xbc/0x160
[11347.324636]  scsi_eh_flush_done_q+0x10c/0x170
[11347.328990]  sas_scsi_recover_host+0x84c/0xa98 [libsas]
[11347.334202]  scsi_error_handler+0x140/0x5b0
[11347.338374]  kthread+0x100/0x12c
[11347.341590]  ret_from_fork+0x10/0x18
[11347.345153] Code: 71000c3f 540000e9 f9404c41 f941f421 (f9400021)
[11347.351234] ---[ end trace f496aacdaa1dcc51 ]---

To fix this, move the bio_endio() action from blk_update_request() to
__blk_mq_end_request().

Signed-off-by: Jason Yan
---
 block/blk-core.c       | 6 ++++--
 block/blk-mq.c         | 7 +++++++
 include/linux/blkdev.h | 1 +
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 4673ebe42255..f39ea78c0535 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -192,8 +192,10 @@ static void req_bio_endio(struct request *rq, struct bio *bio,
 	bio_advance(bio, nbytes);

 	/* don't actually finish bio if it's part of flush sequence */
-	if (bio->bi_iter.bi_size == 0 && !(rq->rq_flags & RQF_FLUSH_SEQ))
-		bio_endio(bio);
+	if (bio->bi_iter.bi_size == 0 && !(rq->rq_flags & RQF_FLUSH_SEQ)) {
+		bio->bi_next = rq->bio_to_release;
+		rq->bio_to_release = bio;
+	}
 }

 void blk_dump_rq_flags(struct request *rq, char *msg)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index a9c181603cbd..5ad595ebc198 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -529,8 +529,15 @@ EXPORT_SYMBOL_GPL(blk_mq_free_request);

 inline void __blk_mq_end_request(struct request *rq, blk_status_t error)
 {
+	struct bio *bio;
 	u64 now = 0;

+	while (rq->bio_to_release) {
+		bio = rq->bio_to_release->bi_next;
+		bio_endio(rq->bio_to_release);
+		rq->bio_to_release = bio;
+	}
+
 	if (blk_mq_need_time_stamp(rq))
 		now = ktime_get_ns();

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 0de92b29f589..74fe561d5a49 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -143,6 +143,7 @@ struct request {

 	struct bio *bio;
 	struct bio *biotail;
+	struct bio *bio_to_release;

 	struct list_head queuelist;

-- 
2.14.5
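
For illustration only, the deferral done by the patch can be modelled in
userspace roughly as follows. This is a minimal sketch, not kernel code:
toy_bio, toy_request, toy_defer_bio(), toy_bio_end() and toy_drain() are
hypothetical names standing in for struct bio, struct request, the new
branch in req_bio_endio(), bio_endio() and the new loop in
__blk_mq_end_request().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bio. */
struct toy_bio {
	struct toy_bio *next;	/* plays the role of bi_next */
	int ended;		/* 0 = not completed, else completion order */
};

/* Hypothetical stand-in for struct request. */
struct toy_request {
	struct toy_bio *to_release;	/* plays the role of bio_to_release */
	int completions;		/* number of completions run so far */
};

/* Instead of completing the bio right away, push it on the request's list. */
static void toy_defer_bio(struct toy_request *rq, struct toy_bio *bio)
{
	assert(rq != NULL && bio != NULL);
	bio->next = rq->to_release;
	rq->to_release = bio;
}

/* Stand-in for bio_endio(): record that, and when, the bio was completed. */
static void toy_bio_end(struct toy_request *rq, struct toy_bio *bio)
{
	bio->ended = ++rq->completions;
}

/* Run all deferred completions; returns how many bios were completed. */
static int toy_drain(struct toy_request *rq)
{
	int n = 0;

	while (rq->to_release) {
		/* Read the link first: the bio may go away once completed. */
		struct toy_bio *next = rq->to_release->next;

		toy_bio_end(rq, rq->to_release);
		rq->to_release = next;
		n++;
	}
	return n;
}
```

One thing the sketch makes visible: because bios are pushed at the head
of the list, they are completed in the reverse of the order in which
they were deferred. Whether that ordering matters for real bios is not
something this toy model can answer.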