From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Dongli Zhang, James Smart, Bart Van Assche, Ming Lei, Jens Axboe, Sasha Levin, linux-scsi@vger.kernel.org, "Martin K . Petersen", Christoph Hellwig, "James E . J .
Bottomley"
Subject: [PATCH 5.1 062/405] blk-mq: grab .q_usage_counter when queuing request from plug code path
Date: Wed, 29 May 2019 20:01:00 -0700
Message-Id: <20190530030544.078681607@linuxfoundation.org>
In-Reply-To: <20190530030540.291644921@linuxfoundation.org>
References: <20190530030540.291644921@linuxfoundation.org>

[ Upstream commit e87eb301bee183d82bb3d04bd71b6660889a2588 ]

Just like aio/io_uring, we need to grab two refcounts for queuing one
request: one for submission and one for completion.

If the request isn't queued from the plug code path, the refcount grabbed
in generic_make_request() serves for submission. In theory, this refcount
should have been released after the submission (async run queue) is done.
blk_freeze_queue() works together with blk_sync_queue() to avoid a race
between queue cleanup and IO submission. Because hctx->run_work is
scheduled with the refcount held, async run-queue activities are canceled,
so it is fine not to hold the refcount while the run-queue work function
dispatches IO.

However, if a request is staged in the plug list and finally queued from
the plug code path, the submission-side refcount is actually missing. We
may then start to run the queue after the queue has been removed, because
the queue's kobject refcount isn't guaranteed to be held in the plug-list
flushing context, and a kernel oops is triggered. See the following race:

blk_mq_flush_plug_list():
    blk_mq_sched_insert_requests()
        insert requests to sw queue or scheduler queue
        blk_mq_run_hw_queue

Because of the concurrent run queue, all requests inserted above may be
completed before the above blk_mq_run_hw_queue is called. The queue can
then be freed during the above blk_mq_run_hw_queue().

Fix the issue by grabbing .q_usage_counter before calling
blk_mq_sched_insert_requests() in blk_mq_flush_plug_list(). This is safe
because the queue is guaranteed to be alive before the requests are
inserted.

Cc: Dongli Zhang
Cc: James Smart
Cc: linux-scsi@vger.kernel.org
Cc: Martin K . Petersen
Cc: Christoph Hellwig
Cc: James E . J . Bottomley
Reviewed-by: Bart Van Assche
Tested-by: James Smart
Signed-off-by: Ming Lei
Signed-off-by: Jens Axboe
Signed-off-by: Sasha Levin
---
 block/blk-mq-sched.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index aa6bc5c026438..c59babca6857a 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -413,6 +413,14 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx *hctx,
 				  struct list_head *list, bool run_queue_async)
 {
 	struct elevator_queue *e;
+	struct request_queue *q = hctx->queue;
+
+	/*
+	 * blk_mq_sched_insert_requests() is called from flush plug
+	 * context only, and hold one usage counter to prevent queue
+	 * from being released.
+	 */
+	percpu_ref_get(&q->q_usage_counter);
 
 	e = hctx->queue->elevator;
 	if (e && e->type->ops.insert_requests)
@@ -426,12 +434,14 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx *hctx,
 		if (!hctx->dispatch_busy && !e && !run_queue_async) {
 			blk_mq_try_issue_list_directly(hctx, list);
 			if (list_empty(list))
-				return;
+				goto out;
 		}
 		blk_mq_insert_requests(hctx, ctx, list);
 	}
 
 	blk_mq_run_hw_queue(hctx, run_queue_async);
+ out:
+	percpu_ref_put(&q->q_usage_counter);
 }
 
 static void blk_mq_sched_free_tags(struct blk_mq_tag_set *set,
-- 
2.20.1
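
For readers who want to see why the extra reference matters, here is a minimal single-threaded userspace sketch of the get/put pairing. It is not kernel code. The names toy_queue, toy_ref_get(), toy_ref_put() and toy_insert_requests() are hypothetical, a plain int stands in for the percpu q_usage_counter, and the "concurrent" completion of every request is simulated inline as the worst case described in the changelog.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct request_queue. */
struct toy_queue {
	int usage;	/* models q->q_usage_counter as a plain int */
};

static void toy_ref_get(struct toy_queue *q)
{
	q->usage++;
}

static void toy_ref_put(struct toy_queue *q)
{
	assert(q->usage > 0);	/* catch unbalanced puts in the model */
	q->usage--;
}

/*
 * Models the insert path. Each of the nr requests is assumed to hold
 * one completion reference already. Returns true if the queue was
 * still pinned (usage > 0) at the point where the hardware queue
 * would be run.
 */
static bool toy_insert_requests(struct toy_queue *q, int nr,
				bool hold_ref, bool issued_directly)
{
	bool pinned = true;

	if (hold_ref)
		toy_ref_get(q);		/* submission-side reference */

	if (issued_directly)
		goto out;		/* early exit must still drop the ref */

	/* Worst case: a concurrent run completes every request first. */
	for (int i = 0; i < nr; i++)
		toy_ref_put(q);

	/* Stand-in for blk_mq_run_hw_queue(): is the queue still held? */
	pinned = q->usage > 0;
out:
	if (hold_ref)
		toy_ref_put(q);
	return pinned;
}
```

With three outstanding requests (usage == 3), toy_insert_requests(&q, 3, false, false) returns false: the counter has already dropped to zero when the queue would be run, which is the window in which the real queue can be freed. With hold_ref set it returns true, and the counter only reaches zero after the final put. The single exit label mirrors the reason the patch turns the early return into a goto.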