Subject: Re: [PATCH V1] block: Fix use-after-free while iterating over requests
To: John Garry, Bart Van Assche, Pradeep P V K, axboe@kernel.dk, linux-block@vger.kernel.org
Cc: stummala@codeaurora.org, linux-kernel@vger.kernel.org, Ming Lei
References: <1606402925-24420-1-git-send-email-ppvk@codeaurora.org> <693ea723-aa9e-1166-8a19-a7787f724969@huawei.com>
From: Hannes Reinecke
Message-ID: <0c925db8-e481-5f21-b0fe-f691142b0437@suse.de>
Date: Mon, 30 Nov 2020 08:04:33 +0100
In-Reply-To: <693ea723-aa9e-1166-8a19-a7787f724969@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 11/26/20 5:49 PM, John Garry wrote:
> On 26/11/2020 16:27, Bart Van Assche wrote:
>> On 11/26/20 7:02 AM, Pradeep P V K wrote:
>>> Observes below crash while
>>> accessing (use-after-free) request queue
>>> member of struct request.
>>>
>>> 191.784789:   <2> Unable to handle kernel paging request at virtual
>>> address ffffff81429a4440
>>> ...
>>> 191.786174:   <2> CPU: 3 PID: 213 Comm: kworker/3:1H Tainted: G S
>>> O      5.4.61-qgki-debug-ge45de39 #1
>>> ...
>>> 191.786226:   <2> Workqueue: kblockd blk_mq_timeout_work
>>> 191.786242:   <2> pstate: 20c00005 (nzCv daif +PAN +UAO)
>>> 191.786261:   <2> pc : bt_for_each+0x114/0x1a4
>>> 191.786274:   <2> lr : bt_for_each+0xe0/0x1a4
>>> ...
>>> 191.786494:   <2> Call trace:
>>> 191.786507:   <2>  bt_for_each+0x114/0x1a4
>>> 191.786519:   <2>  blk_mq_queue_tag_busy_iter+0x60/0xd4
>>> 191.786532:   <2>  blk_mq_timeout_work+0x54/0xe8
>>> 191.786549:   <2>  process_one_work+0x2cc/0x568
>>> 191.786562:   <2>  worker_thread+0x28c/0x518
>>> 191.786577:   <2>  kthread+0x160/0x170
>>> 191.786594:   <2>  ret_from_fork+0x10/0x18
>>> 191.786615:   <2> Code: 0b080148 f9404929 f8685921 b4fffe01 (f9400028)
>>> 191.786630:   <2> ---[ end trace 0f1f51d79ab3f955 ]---
>>> 191.786643:   <2> Kernel panic - not syncing: Fatal exception
>>>
>>> Fix this by updating the freed request with NULL.
>>> This could avoid accessing the already free request from other
>>> contexts while iterating over the requests.
>>>
>>> Signed-off-by: Pradeep P V K
>>> ---
>>>   block/blk-mq.c | 1 +
>>>   block/blk-mq.h | 1 +
>>>   2 files changed, 2 insertions(+)
>>>
>>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>>> index 55bcee5..9996cb1 100644
>>> --- a/block/blk-mq.c
>>> +++ b/block/blk-mq.c
>>> @@ -492,6 +492,7 @@ static void __blk_mq_free_request(struct request
>>> *rq)
>>>      blk_crypto_free_request(rq);
>>>      blk_pm_mark_last_busy(rq);
>>> +    hctx->tags->rqs[rq->tag] = NULL;
>>>      rq->mq_hctx = NULL;
>>>      if (rq->tag != BLK_MQ_NO_TAG)
>>>          blk_mq_put_tag(hctx->tags, ctx, rq->tag);
>>> diff --git a/block/blk-mq.h b/block/blk-mq.h
>>> index a52703c..8747bf1 100644
>>> --- a/block/blk-mq.h
>>> +++ b/block/blk-mq.h
>>> @@ -224,6 +224,7 @@ static inline int __blk_mq_active_requests(struct
>>> blk_mq_hw_ctx *hctx)
>>>  static inline void __blk_mq_put_driver_tag(struct blk_mq_hw_ctx *hctx,
>>>                         struct request *rq)
>>>  {
>>> +    hctx->tags->rqs[rq->tag] = NULL;
>>>      blk_mq_put_tag(hctx->tags, rq->mq_ctx, rq->tag);
>>>      rq->tag = BLK_MQ_NO_TAG;
>>
>> Is this perhaps a block driver bug instead of a block layer core bug? If
>> this would be a block layer core bug, it would have been reported before.
>
> Isn't this the same issue which has been reported many times:
>
> https://lore.kernel.org/linux-block/20200820180335.3109216-1-ming.lei@redhat.com/
>
>
> https://lore.kernel.org/linux-block/8376443a-ec1b-0cef-8244-ed584b96fa96@huawei.com/
>
>
> But I never saw a crash, just kasan report.
>
And if the above were a concern, I would have thought one would need to
use a WRITE_ONCE() here; otherwise we might have a race condition where
other CPUs still see the old value, no?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
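
To make the WRITE_ONCE() point concrete, here is a minimal user-space sketch, not the kernel code. The WRITE_ONCE()/READ_ONCE() macros below are simplified stand-ins for the kernel's (pointer and scalar types only, relying on the GCC/Clang __typeof__ extension), and struct tag_table, clear_slot(), count_busy() and NR_TAGS are hypothetical names used only for illustration. It shows the pairing I would expect: the free path clears the slot with a single store, and the iterator loads the slot once and then works only on its local copy.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified stand-ins for the kernel's WRITE_ONCE()/READ_ONCE().
 * Assumption: pointer/scalar types only, GCC/Clang __typeof__ available.
 */
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))

#define NR_TAGS 4	/* hypothetical table size */

/* Toy stand-ins, not the kernel's struct request / struct blk_mq_tags. */
struct request {
	int tag;
};

struct tag_table {
	struct request *rqs[NR_TAGS];
};

/* Free path: clear the slot with one store the compiler may not split. */
static void clear_slot(struct tag_table *t, int tag)
{
	assert(tag >= 0 && tag < NR_TAGS);
	WRITE_ONCE(t->rqs[tag], NULL);
}

/* Iterator path: load each slot exactly once, then use only the local copy. */
static int count_busy(struct tag_table *t)
{
	int busy = 0;

	for (int i = 0; i < NR_TAGS; i++) {
		struct request *rq = READ_ONCE(t->rqs[i]);

		if (rq)
			busy++;
	}
	return busy;
}
```

Note that this only keeps the compiler from tearing or repeating the accesses. It adds no memory barriers and does nothing about a request that is freed after the iterator has already loaded the pointer, so on its own it would presumably not close the race discussed above.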