Received: by 2002:a05:6902:102b:0:0:0:0 with SMTP id x11csp3081707ybt; Mon, 29 Jun 2020 14:52:09 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzIaxEwr/1V3G5b7vrkoCPycKH+a89iYkWy9fPy+6fNTerdwalZQ+oCux8ugepKkd4VBSfD X-Received: by 2002:a17:906:fa9a:: with SMTP id lt26mr6445561ejb.502.1593467529786; Mon, 29 Jun 2020 14:52:09 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1593467529; cv=none; d=google.com; s=arc-20160816; b=aEwEOoCeUMYEkPT/18G0toOJWUpKtGYSb0iAXZ+7o894SdSLsEvFS8JDO8+mmUQSwD jt92D87bQX5GZYNkEMM28MTkZixl3/xXNlnkR6P2cCErYCWqC7VkLKsktJMiwHOZ0JCM NEDHyPwL0f17qmzXBhtePhb9EXL4dKkjHOhVOg8Cx7ItTFgDQ5zEzKn2RI2rfr2Ey/hQ MLniXFBLH3ZeRBt0zghNdr64J7z018T0TUugsPD8p5S8gMK7vbRskXucRx6mu9ZmGPu9 ssQ35Yqzboz4H91UJc58X7aWuOkAgrjWpo9CnEoFZeqDBzdo9bUjPIBtwc2UJZxAac1Z JIeQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=AkhVLV3W2O8cKGcZ8939cMkoDCctzebgSL74dFJpFEc=; b=jbULADbcWW5CnAEzS1zFsope7WoTty410oRQtQlunE03aJ50Rlo3njtWMKWOaNw/S1 1FDzp4dKSbsDtbRJ2Tppuann6ZWVktjjRf9rHWbpquarArSYeobhn4jeccme6jtq5qgt 9WzGI5+59T/Ez3MwVE2RcKT6ioc5clrMmbjccpaW1/85UltLYXfi2iaj/X+c35JsHTBR R/YT+qR/Z3DOVHQDAkfj7nO+BnIoz3m+6c501Jd5oBn8IBps2ukPXj/hudwTNuHq1VpJ tHzIhuW4zlfLXwmvybGB3E5QZvFBQ01vuunXm/kukvHQiWMOCc6rCsloWmpUJTZKekaw e7Kg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=wBU89bRv; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id y11si481738edp.319.2020.06.29.14.51.47; Mon, 29 Jun 2020 14:52:09 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=wBU89bRv; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2404304AbgF2VtF (ORCPT + 99 others); Mon, 29 Jun 2020 17:49:05 -0400 Received: from mail.kernel.org ([198.145.29.99]:56910 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726697AbgF2Sfl (ORCPT ); Mon, 29 Jun 2020 14:35:41 -0400 Received: from sasha-vm.mshome.net (c-73-47-72-35.hsd1.nh.comcast.net [73.47.72.35]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 885952475E; Mon, 29 Jun 2020 15:21:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1593444077; bh=S4OBt6uJBfnbnKcJpzwKgOczPbY5w4kwCO+lHH4TmuU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=wBU89bRvtrD0hN08gCF4gMyvZge8vyaAND1NfjjuYgyrd9TiTRRelHfiC+jN08tBC zC/Xfj7V3M8CPMAK6weRmTsqGsYapy1P88HXfASH2+QUVLMJHTfT8um283LYIo+/9V a7KZKZXmvcoSfqqCAkG/FRlCDWzcAdfGuD7+khJY= From: Sasha Levin To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Cc: Mauricio Faria de Oliveira , Ryan Finnie , Sebastian Marsching , Coly Li , Jens Axboe , Sasha Levin Subject: [PATCH 5.7 186/265] bcache: check and adjust logical block size for backing devices Date: Mon, 29 Jun 2020 11:16:59 -0400 Message-Id: <20200629151818.2493727-187-sashal@kernel.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200629151818.2493727-1-sashal@kernel.org> References: <20200629151818.2493727-1-sashal@kernel.org> MIME-Version: 1.0 X-KernelTest-Patch: http://kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.7.7-rc1.gz X-KernelTest-Tree: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git X-KernelTest-Branch: linux-5.7.y X-KernelTest-Patches: git://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git X-KernelTest-Version: 5.7.7-rc1 X-KernelTest-Deadline: 2020-07-01T15:14+00:00 X-stable: review X-Patchwork-Hint: Ignore Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Mauricio Faria de Oliveira [ Upstream commit dcacbc1242c71e18fa9d2eadc5647e115c9c627d ] It's possible for a block driver to set logical block size to a value greater than page size incorrectly; e.g. bcache takes the value from the superblock, set by the user w/ make-bcache. This causes a BUG/NULL pointer dereference in the path: __blkdev_get() -> set_init_blocksize() // set i_blkbits based on ... -> bdev_logical_block_size() -> queue_logical_block_size() // ... this value -> bdev_disk_changed() ... -> blkdev_readpage() -> block_read_full_page() -> create_page_buffers() // size = 1 << i_blkbits -> create_empty_buffers() // give size/take pointer -> alloc_page_buffers() // return NULL .. BUG! Because alloc_page_buffers() is called with size > PAGE_SIZE, thus it initializes head = NULL, skips the loop, return head; then create_empty_buffers() gets (and uses) the NULL pointer. This has been around longer than commit ad6bf88a6c19 ("block: fix an integer overflow in logical block size"); however, it increased the range of values that can trigger the issue. Previously only 8k/16k/32k (on x86/4k page size) would do it, as greater values overflow unsigned short to zero, and queue_ logical_block_size() would then use the default of 512. Now the range with unsigned int is much larger, and users w/ the 512k value, which happened to be zero'ed previously and work fine, started to hit this issue -- as the zero is gone, and queue_logical_block_size() does return 512k (>PAGE_SIZE.) Fix this by checking the bcache device's logical block size, and if it's greater than page size, fallback to the backing/ cached device's logical page size. This doesn't affect cache devices as those are still checked for block/page size in read_super(); only the backing/cached devices are not. Apparently it's a regression from commit 2903381fce71 ("bcache: Take data offset from the bdev superblock."), moving the check into BCACHE_SB_VERSION_CDEV only. Now that we have superblocks of backing devices out there with this larger value, we cannot refuse to load them (i.e., have a similar check in _BDEV.) Ideally perhaps bcache should use all values from the backing device (physical/logical/io_min block size)? But for now just fix the problematic case. Test-case: # IMG=/root/disk.img # dd if=/dev/zero of=$IMG bs=1 count=0 seek=1G # DEV=$(losetup --find --show $IMG) # make-bcache --bdev $DEV --block 8k < see dmesg > Before: # uname -r 5.7.0-rc7 [ 55.944046] BUG: kernel NULL pointer dereference, address: 0000000000000000 ... [ 55.949742] CPU: 3 PID: 610 Comm: bcache-register Not tainted 5.7.0-rc7 #4 ... [ 55.952281] RIP: 0010:create_empty_buffers+0x1a/0x100 ... [ 55.966434] Call Trace: [ 55.967021] create_page_buffers+0x48/0x50 [ 55.967834] block_read_full_page+0x49/0x380 [ 55.972181] do_read_cache_page+0x494/0x610 [ 55.974780] read_part_sector+0x2d/0xaa [ 55.975558] read_lba+0x10e/0x1e0 [ 55.977904] efi_partition+0x120/0x5a6 [ 55.980227] blk_add_partitions+0x161/0x390 [ 55.982177] bdev_disk_changed+0x61/0xd0 [ 55.982961] __blkdev_get+0x350/0x490 [ 55.983715] __device_add_disk+0x318/0x480 [ 55.984539] bch_cached_dev_run+0xc5/0x270 [ 55.986010] register_bcache.cold+0x122/0x179 [ 55.987628] kernfs_fop_write+0xbc/0x1a0 [ 55.988416] vfs_write+0xb1/0x1a0 [ 55.989134] ksys_write+0x5a/0xd0 [ 55.989825] do_syscall_64+0x43/0x140 [ 55.990563] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 55.991519] RIP: 0033:0x7f7d60ba3154 ... After: # uname -r 5.7.0.bcachelbspgsz [ 31.672460] bcache: bcache_device_init() bcache0: sb/logical block size (8192) greater than page size (4096) falling back to device logical block size (512) [ 31.675133] bcache: register_bdev() registered backing device loop0 # grep ^ /sys/block/bcache0/queue/*_block_size /sys/block/bcache0/queue/logical_block_size:512 /sys/block/bcache0/queue/physical_block_size:8192 Reported-by: Ryan Finnie Reported-by: Sebastian Marsching Signed-off-by: Mauricio Faria de Oliveira Signed-off-by: Coly Li Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin --- drivers/md/bcache/super.c | 22 +++++++++++++++++++--- 1 file changed, 19 insertions(+), 3 deletions(-) diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index 4d8bf731b118c..a2e5a0fcd7d5c 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -819,7 +819,8 @@ static void bcache_device_free(struct bcache_device *d) } static int bcache_device_init(struct bcache_device *d, unsigned int block_size, - sector_t sectors, make_request_fn make_request_fn) + sector_t sectors, make_request_fn make_request_fn, + struct block_device *cached_bdev) { struct request_queue *q; const size_t max_stripes = min_t(size_t, INT_MAX, @@ -885,6 +886,21 @@ static int bcache_device_init(struct bcache_device *d, unsigned int block_size, q->limits.io_min = block_size; q->limits.logical_block_size = block_size; q->limits.physical_block_size = block_size; + + if (q->limits.logical_block_size > PAGE_SIZE && cached_bdev) { + /* + * This should only happen with BCACHE_SB_VERSION_BDEV. + * Block/page size is checked for BCACHE_SB_VERSION_CDEV. + */ + pr_info("%s: sb/logical block size (%u) greater than page size " + "(%lu) falling back to device logical block size (%u)", + d->disk->disk_name, q->limits.logical_block_size, + PAGE_SIZE, bdev_logical_block_size(cached_bdev)); + + /* This also adjusts physical block size/min io size if needed */ + blk_queue_logical_block_size(q, bdev_logical_block_size(cached_bdev)); + } + blk_queue_flag_set(QUEUE_FLAG_NONROT, d->disk->queue); blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, d->disk->queue); blk_queue_flag_set(QUEUE_FLAG_DISCARD, d->disk->queue); @@ -1342,7 +1358,7 @@ static int cached_dev_init(struct cached_dev *dc, unsigned int block_size) ret = bcache_device_init(&dc->disk, block_size, dc->bdev->bd_part->nr_sects - dc->sb.data_offset, - cached_dev_make_request); + cached_dev_make_request, dc->bdev); if (ret) return ret; @@ -1455,7 +1471,7 @@ static int flash_dev_run(struct cache_set *c, struct uuid_entry *u) kobject_init(&d->kobj, &bch_flash_dev_ktype); if (bcache_device_init(d, block_bytes(c), u->sectors, - flash_dev_make_request)) + flash_dev_make_request, NULL)) goto err; bcache_device_attach(d, c, u - c->uuids); -- 2.25.1