Subject: Re: [PATCH] block: bio-integrity: fix potential null-ptr-deref in bio_integrity_free
To: Ming Lei, yebin
References: <20240606062655.2185006-1-yebin@huaweicloud.com> <6662632D.7020000@huaweicloud.com>
From: "yebin (H)"
Message-ID: <6667BAED.7060809@huawei.com>
Date: Tue, 11 Jun 2024 10:48:13 +0800
X-Mailing-List: linux-kernel@vger.kernel.org

On 2024/6/7 9:35, Ming Lei wrote:
> On Fri, Jun 07, 2024 at 09:32:29AM +0800, yebin wrote:
>>
>> On 2024/6/7 8:13, Ming Lei wrote:
>>> On Thu, Jun 06, 2024 at 02:26:55PM +0800, Ye Bin wrote:
>>>> From: Ye Bin
>>>>
>>>> There's a issue as follows when do format NVME with IO:
>>>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
>>>> PGD 101727f067 P4D 1011fae067 PUD fbed78067 PMD 0
>>>> Oops: 0000 [#1] SMP NOPTI
>>>> RIP: 0010:kfree+0x4f/0x160
>>>> RSP: 0018:ff705a800912b910 EFLAGS: 00010247
>>>> RAX: 0000000000000000 RBX: 0d06d30000000000 RCX: ff4fb320260ad990
>>>> RDX: ff4fb30ee7acba40 RSI: 0000000000000000 RDI: 00b04cff80000000
>>>> RBP: ff4fb30ee7acba40 R08: 0000000000000200 R09: ff705a800912bb60
>>>> R10: 0000000000000000 R11: ff4fb3103b67c750 R12: ffffffff9a62d566
>>>> R13: ff4fb30aa0530000 R14: 0000000000000000 R15: 000000000000000a
>>>> FS: 00007f4399b6b700(0000) GS:ff4fb31040140000(0000) knlGS:0000000000000000
>>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> CR2: 0000000000000008 CR3: 0000001014cd4002 CR4: 0000000000761ee0
>>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>> DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
>>>> PKRU: 55555554
>>>> Call Trace:
>>>>  bio_integrity_free+0xa6/0xb0
>>>>  __bio_integrity_endio+0x8c/0xa0
>>>>  bio_endio+0x2b/0x130
>>>>  blk_update_request+0x78/0x2b0
>>>>  blk_mq_end_request+0x1a/0x140
>>>>  blk_mq_try_issue_directly+0x5d/0xc0
>>>>  blk_mq_make_request+0x46b/0x540
>>>>  generic_make_request+0x121/0x300
>>>>  submit_bio+0x6c/0x140
>>>>  __blkdev_direct_IO_simple+0x1ca/0x3a0
>>>>  blkdev_direct_IO+0x3d9/0x460
>>>>  generic_file_read_iter+0xb4/0xc60
>>>>  new_sync_read+0x121/0x170
>>>>  vfs_read+0x89/0x130
>>>>  ksys_read+0x52/0xc0
>>>>  do_syscall_64+0x5d/0x1d0
>>>>  entry_SYSCALL_64_after_hwframe+0x65/0xca
>>>>
>>>> Assuming a 512 byte directIO is issued, the initial logical block size of
>>>> the state block device is 512 bytes, and then modified to 4096 bytes.
>>>> Above issue may happen as follows:
>>>>          Direct read                       format NVME
>>>> __blkdev_direct_IO_simple(iocb, iter, nr_pages);
>>>>   if ((pos | iov_iter_alignment(iter)) & (bdev_logical_block_size(bdev) - 1))
>>>>     -->The logical block size is 512, and the IO issued is 512 bytes,
>>>>        which can be checked
>>>>     return -EINVAL;
>>>>   submit_bio(&bio);
>>>>                                      nvme_dev_ioctl
>>>>                                        case NVME_IOCTL_RESCAN:
>>>>                                          nvme_queue_scan(ctrl);
>>>>                                            ...
>>>>                                            nvme_update_disk_info(disk, ns, id);
>>>>                                              blk_queue_logical_block_size(disk->queue, bs);
>>>>                                                --> 512->4096
>>>>     blk_queue_enter(q, flags)
>>>>     blk_mq_make_request(q, bio)
>>>>       bio_integrity_prep(bio)
>>>>         len = bio_integrity_bytes(bi, bio_sectors(bio));
>>>>           -->At this point, because the logical block size has increased to
>>>>              4096 bytes, the calculated 'len' here is 0
>>>>         buf = kmalloc(len, GFP_NOIO | q->bounce_gfp);
>>>>           -->Passed in len=0 and returned buf=16
>>>>         end = (((unsigned long) buf) + len + PAGE_SIZE - 1) >> PAGE_SHIFT;
>>>>         start = ((unsigned long) buf) >> PAGE_SHIFT;
>>>>         nr_pages = end - start; -->nr_pages == 1
>>>>         bip->bip_flags |= BIP_BLOCK_INTEGRITY;
>>>>         for (i = 0 ; i < nr_pages ; i++) {
>>>>           if (len <= 0)
>>>>             -->Not initializing the bip_vec of bio_integrity, will result
>>>>                in null pointer access during subsequent releases. Even if
>>>>                initialized, it will still cause subsequent releases access
>>>>                null pointer because the buffer address is incorrect.
>>>>             break;
>>>>
>>>> Firstly, it is unreasonable to format NVME in the presence of IO. It is also
>>>> possible to see IO smaller than the logical block size in the block layer for
>>>> this type of concurrency.
>>>> It is expected that this type of IO device will
>>>> return an error, so exception handling should also be done for this type of
>>>> IO to prevent null pointer access from causing system crashes.
>>> Actually unaligned IO handling is one mess for nvme hardware. Yes, IO may fail,
>>> but it is observed that meta buffer is overwrite by DMA in read IO.
>>>
>>> Ye and Yi, can you test the following patch in your 'nvme format' & IO workload?
>>>
>>>
>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>> index 82c3ae22d76d..a41ab4a3a398 100644
>>> --- a/block/blk-core.c
>>> +++ b/block/blk-core.c
>>> @@ -336,6 +336,19 @@ int blk_queue_enter(struct request_queue *q, blk_mq_req_flags_t flags)
>>>      return 0;
>>>  }
>>> +static bool bio_unaligned(struct bio *bio)
>>> +{
>>> +    unsigned int bs = bdev_logical_block_size(bio->bi_bdev);
>>> +
>>> +    if (bio->bi_iter.bi_size & (bs - 1))
>>> +        return true;
>>> +
>>> +    if ((bio->bi_iter.bi_sector << SECTOR_SHIFT) & (bs - 1))
>>> +        return true;
>>> +
>>> +    return false;
>>> +}
>> I think this judgment is a bit incorrect. It should not be sufficient to
>> only determine whether the length and starting sector are logically block
>> aligned.
> Can you explain why the two are not enough? Other limits should be handled
> by bio split.

Suppose the logical block size is 512 bytes and a bio has 4 segments, each
512 bytes long, with bio->bi_iter.bi_sector == 0. If the logical block size
then changes to 4096 bytes, bio_unaligned() will return false.
I'm not sure whether this example is appropriate.

>
> Thanks,
> Ming
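
To check the example above, here is a minimal userspace sketch of the two
mask tests from the quoted bio_unaligned(). It is not kernel code: struct
bio_sketch, bio_unaligned_sketch() and make_bio() are hypothetical stand-ins,
the block size is passed in directly instead of being read through
bdev_logical_block_size(), and it assumes bi_size is the total byte count
over all segments.

```c
/*
 * Userspace sketch of the two tests in the quoted bio_unaligned().
 * struct bio_sketch, bio_unaligned_sketch() and make_bio() are
 * hypothetical stand-ins, not kernel API.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SHIFT 9 /* 512-byte sectors */

struct bio_sketch {
	uint64_t bi_sector; /* start sector, in 512-byte units */
	uint32_t bi_size;   /* total bytes, summed over all segments */
};

/* bs is the logical block size in bytes and must be a power of two. */
static bool bio_unaligned_sketch(const struct bio_sketch *bio, uint32_t bs)
{
	if (bio->bi_size & (bs - 1))
		return true; /* total length is not a multiple of bs */

	if ((bio->bi_sector << SECTOR_SHIFT) & (bs - 1))
		return true; /* start offset is not a multiple of bs */

	return false;
}

/* Describe a bio of nr_segs segments, each seg_len bytes long. */
static struct bio_sketch make_bio(uint64_t sector, uint32_t nr_segs,
				  uint32_t seg_len)
{
	/* Guard the multiplication below against 32-bit overflow. */
	assert(seg_len != 0 && nr_segs <= UINT32_MAX / seg_len);

	struct bio_sketch bio = {
		.bi_sector = sector,
		.bi_size = nr_segs * seg_len,
	};
	return bio;
}
```

In this sketch make_bio(0, 4, 512) has bi_size == 2048, so the size test
already fires once bs is 4096 and the function returns true. A bio that
passes both tests while every segment is still 512 bytes long would need 8
segments, i.e. make_bio(0, 8, 512), so the example above may need 8 segments
rather than 4 to show the per-segment concern.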