2022-04-10 02:16:46

by Christoph Hellwig

Subject: use block_device based APIs in block layer consumers v2

Hi Jens,

this series cleans up the block layer API so that the APIs consumed
by file systems are (almost) exclusively struct block_device based,
so that file systems don't have to poke into block layer internals
like the request_queue.

I also found a bunch of existing bugs related to partition offsets
and discard, so these are fixed along the way.
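To illustrate the direction of the conversion, here is a minimal sketch
(not taken from any patch in this series) of the before/after pattern in
a typical FITRIM handler; sb and range stand in for whatever superblock
and fstrim_range the caller has, and the helpers in the "after" hunk are
the ones added later in this series:

	/* before: the file system pokes into the request_queue */
	struct request_queue *q = bdev_get_queue(sb->s_bdev);

	if (!blk_queue_discard(q))
		return -EOPNOTSUPP;
	range.minlen = max_t(u64, range.minlen,
			     q->limits.discard_granularity);

	/* after: only struct block_device based helpers */
	if (!bdev_max_discard_sectors(sb->s_bdev))
		return -EOPNOTSUPP;
	range.minlen = max_t(u64, range.minlen,
			     bdev_discard_granularity(sb->s_bdev));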


Changes since v1:
- fix a bisection hazard
- minor spelling fixes
- reorder hunks between two patches to make the changes more obvious
- reorder a patch to be earlier in the series to ease backporting


Diffstat:
arch/um/drivers/ubd_kern.c | 2
block/blk-core.c | 4 -
block/blk-lib.c | 124 ++++++++++++++++++++---------------
block/blk-mq-debugfs.c | 2
block/blk-settings.c | 74 ++++++++++++++++++++
block/blk.h | 14 ---
block/fops.c | 2
block/genhd.c | 4 -
block/ioctl.c | 48 ++++++++++---
block/partitions/core.c | 12 ---
drivers/block/drbd/drbd_main.c | 51 ++++++--------
drivers/block/drbd/drbd_nl.c | 94 +++++++++++---------------
drivers/block/drbd/drbd_receiver.c | 13 +--
drivers/block/loop.c | 15 +---
drivers/block/nbd.c | 5 -
drivers/block/null_blk/main.c | 1
drivers/block/rbd.c | 1
drivers/block/rnbd/rnbd-clt.c | 6 -
drivers/block/rnbd/rnbd-srv-dev.h | 8 --
drivers/block/rnbd/rnbd-srv.c | 5 -
drivers/block/virtio_blk.c | 2
drivers/block/xen-blkback/blkback.c | 15 ++--
drivers/block/xen-blkback/xenbus.c | 9 --
drivers/block/xen-blkfront.c | 7 -
drivers/block/zram/zram_drv.c | 1
drivers/md/bcache/alloc.c | 2
drivers/md/bcache/request.c | 4 -
drivers/md/bcache/super.c | 3
drivers/md/bcache/sysfs.c | 2
drivers/md/dm-cache-target.c | 9 --
drivers/md/dm-clone-target.c | 9 --
drivers/md/dm-io.c | 2
drivers/md/dm-log-writes.c | 3
drivers/md/dm-raid.c | 9 --
drivers/md/dm-table.c | 25 +------
drivers/md/dm-thin.c | 15 ----
drivers/md/dm.c | 3
drivers/md/md-linear.c | 11 ---
drivers/md/md.c | 5 -
drivers/md/raid0.c | 7 -
drivers/md/raid1.c | 18 -----
drivers/md/raid10.c | 20 -----
drivers/md/raid5-cache.c | 8 +-
drivers/md/raid5.c | 14 +--
drivers/mmc/core/queue.c | 3
drivers/mtd/mtd_blkdevs.c | 1
drivers/nvme/host/core.c | 6 -
drivers/nvme/target/io-cmd-bdev.c | 2
drivers/nvme/target/zns.c | 3
drivers/s390/block/dasd_fba.c | 1
drivers/scsi/sd.c | 2
drivers/target/target_core_device.c | 20 ++---
drivers/target/target_core_file.c | 10 +-
drivers/target/target_core_iblock.c | 17 +---
fs/btrfs/disk-io.c | 3
fs/btrfs/extent-tree.c | 8 +-
fs/btrfs/ioctl.c | 12 +--
fs/btrfs/volumes.c | 4 -
fs/btrfs/zoned.c | 3
fs/direct-io.c | 32 +--------
fs/exfat/file.c | 5 -
fs/exfat/super.c | 10 --
fs/ext4/ioctl.c | 10 --
fs/ext4/mballoc.c | 10 +-
fs/ext4/super.c | 10 --
fs/f2fs/f2fs.h | 3
fs/f2fs/file.c | 19 ++---
fs/f2fs/segment.c | 8 --
fs/fat/file.c | 5 -
fs/fat/inode.c | 10 --
fs/gfs2/rgrp.c | 7 -
fs/iomap/direct-io.c | 3
fs/jbd2/journal.c | 9 --
fs/jfs/ioctl.c | 5 -
fs/jfs/super.c | 8 --
fs/nilfs2/ioctl.c | 6 -
fs/nilfs2/sufile.c | 4 -
fs/nilfs2/the_nilfs.c | 4 -
fs/ntfs3/file.c | 6 -
fs/ntfs3/super.c | 10 +-
fs/ocfs2/ioctl.c | 5 -
fs/super.c | 2
fs/xfs/xfs_discard.c | 8 +-
fs/xfs/xfs_log_cil.c | 2
fs/xfs/xfs_super.c | 12 +--
fs/zonefs/super.c | 3
include/linux/blkdev.h | 112 +++++++++++--------------------
include/target/target_core_backend.h | 4 -
mm/swapfile.c | 31 ++------
89 files changed, 493 insertions(+), 653 deletions(-)


2022-04-10 04:54:46

by Christoph Hellwig

Subject: [PATCH 04/27] drbd: remove assign_p_sizes_qlim

Fold each branch into its only caller.

Signed-off-by: Christoph Hellwig <[email protected]>
---
drivers/block/drbd/drbd_main.c | 47 +++++++++++++++-------------------
1 file changed, 20 insertions(+), 27 deletions(-)

diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 9676a1d214bc5..1262fe1c33618 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -903,31 +903,6 @@ void drbd_gen_and_send_sync_uuid(struct drbd_peer_device *peer_device)
}
}

-/* communicated if (agreed_features & DRBD_FF_WSAME) */
-static void
-assign_p_sizes_qlim(struct drbd_device *device, struct p_sizes *p,
- struct request_queue *q)
-{
- if (q) {
- p->qlim->physical_block_size = cpu_to_be32(queue_physical_block_size(q));
- p->qlim->logical_block_size = cpu_to_be32(queue_logical_block_size(q));
- p->qlim->alignment_offset = cpu_to_be32(queue_alignment_offset(q));
- p->qlim->io_min = cpu_to_be32(queue_io_min(q));
- p->qlim->io_opt = cpu_to_be32(queue_io_opt(q));
- p->qlim->discard_enabled = blk_queue_discard(q);
- p->qlim->write_same_capable = 0;
- } else {
- q = device->rq_queue;
- p->qlim->physical_block_size = cpu_to_be32(queue_physical_block_size(q));
- p->qlim->logical_block_size = cpu_to_be32(queue_logical_block_size(q));
- p->qlim->alignment_offset = 0;
- p->qlim->io_min = cpu_to_be32(queue_io_min(q));
- p->qlim->io_opt = cpu_to_be32(queue_io_opt(q));
- p->qlim->discard_enabled = 0;
- p->qlim->write_same_capable = 0;
- }
-}
-
int drbd_send_sizes(struct drbd_peer_device *peer_device, int trigger_reply, enum dds_flags flags)
{
struct drbd_device *device = peer_device->device;
@@ -957,14 +932,32 @@ int drbd_send_sizes(struct drbd_peer_device *peer_device, int trigger_reply, enu
q_order_type = drbd_queue_order_type(device);
max_bio_size = queue_max_hw_sectors(q) << 9;
max_bio_size = min(max_bio_size, DRBD_MAX_BIO_SIZE);
- assign_p_sizes_qlim(device, p, q);
+ p->qlim->physical_block_size =
+ cpu_to_be32(queue_physical_block_size(q));
+ p->qlim->logical_block_size =
+ cpu_to_be32(queue_logical_block_size(q));
+ p->qlim->alignment_offset =
+ cpu_to_be32(queue_alignment_offset(q));
+ p->qlim->io_min = cpu_to_be32(queue_io_min(q));
+ p->qlim->io_opt = cpu_to_be32(queue_io_opt(q));
+ p->qlim->discard_enabled = blk_queue_discard(q);
put_ldev(device);
} else {
+ struct request_queue *q = device->rq_queue;
+
+ p->qlim->physical_block_size =
+ cpu_to_be32(queue_physical_block_size(q));
+ p->qlim->logical_block_size =
+ cpu_to_be32(queue_logical_block_size(q));
+ p->qlim->alignment_offset = 0;
+ p->qlim->io_min = cpu_to_be32(queue_io_min(q));
+ p->qlim->io_opt = cpu_to_be32(queue_io_opt(q));
+ p->qlim->discard_enabled = 0;
+
d_size = 0;
u_size = 0;
q_order_type = QUEUE_ORDERED_NONE;
max_bio_size = DRBD_MAX_BIO_SIZE; /* ... multiple BIOs per peer_request */
- assign_p_sizes_qlim(device, p, NULL);
}

if (peer_device->connection->agreed_pro_version <= 94)
--
2.30.2

2022-04-10 11:05:56

by Christoph Hellwig

Subject: [PATCH 14/27] block: add a bdev_stable_writes helper

Add a helper to check the stable writes flag based on the block_device
instead of having to poke into the block layer internal request_queue.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>
---
drivers/md/dm-table.c | 4 +---
fs/super.c | 2 +-
include/linux/blkdev.h | 6 ++++++
mm/swapfile.c | 2 +-
4 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 5e38d0dd009d5..d46839faa0ca5 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1950,9 +1950,7 @@ static int device_requires_stable_pages(struct dm_target *ti,
struct dm_dev *dev, sector_t start,
sector_t len, void *data)
{
- struct request_queue *q = bdev_get_queue(dev->bdev);
-
- return blk_queue_stable_writes(q);
+ return bdev_stable_writes(dev->bdev);
}

int dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
diff --git a/fs/super.c b/fs/super.c
index f1d4a193602d6..60f57c7bc0a69 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1204,7 +1204,7 @@ static int set_bdev_super(struct super_block *s, void *data)
s->s_dev = s->s_bdev->bd_dev;
s->s_bdi = bdi_get(s->s_bdev->bd_disk->bdi);

- if (blk_queue_stable_writes(s->s_bdev->bd_disk->queue))
+ if (bdev_stable_writes(s->s_bdev))
s->s_iflags |= SB_I_STABLE_WRITES;
return 0;
}
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 075b16d4560e7..a433798c3343e 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1330,6 +1330,12 @@ static inline bool bdev_nonrot(struct block_device *bdev)
return blk_queue_nonrot(bdev_get_queue(bdev));
}

+static inline bool bdev_stable_writes(struct block_device *bdev)
+{
+ return test_bit(QUEUE_FLAG_STABLE_WRITES,
+ &bdev_get_queue(bdev)->queue_flags);
+}
+
static inline bool bdev_write_cache(struct block_device *bdev)
{
return test_bit(QUEUE_FLAG_WC, &bdev_get_queue(bdev)->queue_flags);
diff --git a/mm/swapfile.c b/mm/swapfile.c
index d5ab7ec4d92ca..4069f17a82c8e 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3065,7 +3065,7 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
goto bad_swap_unlock_inode;
}

- if (p->bdev && blk_queue_stable_writes(p->bdev->bd_disk->queue))
+ if (p->bdev && bdev_stable_writes(p->bdev))
p->flags |= SWP_STABLE_WRITES;

if (p->bdev && p->bdev->bd_disk->fops->rw_page)
--
2.30.2

2022-04-10 11:07:37

by Christoph Hellwig

Subject: [PATCH 07/27] drbd: cleanup decide_on_discard_support

Sanitize the calling conventions and use a goto label to clean up the
code flow.

Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Christoph Böhmwalder <[email protected]>
---
drivers/block/drbd/drbd_nl.c | 68 +++++++++++++++++++-----------------
1 file changed, 35 insertions(+), 33 deletions(-)

diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 02030c9c4d3b1..40bb0b356a6d6 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -1204,38 +1204,42 @@ static unsigned int drbd_max_discard_sectors(struct drbd_connection *connection)
}

static void decide_on_discard_support(struct drbd_device *device,
- struct request_queue *q,
- struct request_queue *b,
- bool discard_zeroes_if_aligned)
+ struct drbd_backing_dev *bdev)
{
- /* q = drbd device queue (device->rq_queue)
- * b = backing device queue (device->ldev->backing_bdev->bd_disk->queue),
- * or NULL if diskless
- */
- struct drbd_connection *connection = first_peer_device(device)->connection;
- bool can_do = b ? blk_queue_discard(b) : true;
-
- if (can_do && connection->cstate >= C_CONNECTED && !(connection->agreed_features & DRBD_FF_TRIM)) {
- can_do = false;
- drbd_info(connection, "peer DRBD too old, does not support TRIM: disabling discards\n");
- }
- if (can_do) {
- /* We don't care for the granularity, really.
- * Stacking limits below should fix it for the local
- * device. Whether or not it is a suitable granularity
- * on the remote device is not our problem, really. If
- * you care, you need to use devices with similar
- * topology on all peers. */
- blk_queue_discard_granularity(q, 512);
- q->limits.max_discard_sectors = drbd_max_discard_sectors(connection);
- blk_queue_flag_set(QUEUE_FLAG_DISCARD, q);
- q->limits.max_write_zeroes_sectors = drbd_max_discard_sectors(connection);
- } else {
- blk_queue_flag_clear(QUEUE_FLAG_DISCARD, q);
- blk_queue_discard_granularity(q, 0);
- q->limits.max_discard_sectors = 0;
- q->limits.max_write_zeroes_sectors = 0;
+ struct drbd_connection *connection =
+ first_peer_device(device)->connection;
+ struct request_queue *q = device->rq_queue;
+
+ if (bdev && !blk_queue_discard(bdev->backing_bdev->bd_disk->queue))
+ goto not_supported;
+
+ if (connection->cstate >= C_CONNECTED &&
+ !(connection->agreed_features & DRBD_FF_TRIM)) {
+ drbd_info(connection,
+ "peer DRBD too old, does not support TRIM: disabling discards\n");
+ goto not_supported;
}
+
+ /*
+ * We don't care for the granularity, really.
+ *
+ * Stacking limits below should fix it for the local device. Whether or
+ * not it is a suitable granularity on the remote device is not our
+ * problem, really. If you care, you need to use devices with similar
+ * topology on all peers.
+ */
+ blk_queue_discard_granularity(q, 512);
+ q->limits.max_discard_sectors = drbd_max_discard_sectors(connection);
+ blk_queue_flag_set(QUEUE_FLAG_DISCARD, q);
+ q->limits.max_write_zeroes_sectors =
+ drbd_max_discard_sectors(connection);
+ return;
+
+not_supported:
+ blk_queue_flag_clear(QUEUE_FLAG_DISCARD, q);
+ blk_queue_discard_granularity(q, 0);
+ q->limits.max_discard_sectors = 0;
+ q->limits.max_write_zeroes_sectors = 0;
}

static void fixup_discard_if_not_supported(struct request_queue *q)
@@ -1273,7 +1277,6 @@ static void drbd_setup_queue_param(struct drbd_device *device, struct drbd_backi
unsigned int max_segments = 0;
struct request_queue *b = NULL;
struct disk_conf *dc;
- bool discard_zeroes_if_aligned = true;

if (bdev) {
b = bdev->backing_bdev->bd_disk->queue;
@@ -1282,7 +1285,6 @@ static void drbd_setup_queue_param(struct drbd_device *device, struct drbd_backi
rcu_read_lock();
dc = rcu_dereference(device->ldev->disk_conf);
max_segments = dc->max_bio_bvecs;
- discard_zeroes_if_aligned = dc->discard_zeroes_if_aligned;
rcu_read_unlock();

blk_set_stacking_limits(&q->limits);
@@ -1292,7 +1294,7 @@ static void drbd_setup_queue_param(struct drbd_device *device, struct drbd_backi
/* This is the workaround for "bio would need to, but cannot, be split" */
blk_queue_max_segments(q, max_segments ? max_segments : BLK_MAX_SEGMENTS);
blk_queue_segment_boundary(q, PAGE_SIZE-1);
- decide_on_discard_support(device, q, b, discard_zeroes_if_aligned);
+ decide_on_discard_support(device, bdev);

if (b) {
blk_stack_limits(&q->limits, &b->limits, 0);
--
2.30.2

2022-04-10 18:24:08

by Christoph Hellwig

Subject: [PATCH 25/27] block: add a bdev_discard_granularity helper

Abstract away implementation details from file systems by providing a
block_device based helper to retrieve the discard granularity.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>
Acked-by: Christoph Böhmwalder <[email protected]> [btrfs]
Acked-by: Ryusuke Konishi <[email protected]>
Acked-by: David Sterba <[email protected]> [btrfs]
---
block/blk-lib.c | 5 ++---
drivers/block/drbd/drbd_nl.c | 9 +++++----
drivers/block/drbd/drbd_receiver.c | 3 +--
drivers/block/loop.c | 2 +-
drivers/target/target_core_device.c | 3 +--
fs/btrfs/ioctl.c | 12 ++++--------
fs/exfat/file.c | 3 +--
fs/ext4/mballoc.c | 6 +++---
fs/f2fs/file.c | 3 +--
fs/fat/file.c | 3 +--
fs/gfs2/rgrp.c | 7 +++----
fs/jfs/ioctl.c | 3 +--
fs/nilfs2/ioctl.c | 4 ++--
fs/ntfs3/file.c | 4 ++--
fs/ntfs3/super.c | 6 ++----
fs/ocfs2/ioctl.c | 3 +--
fs/xfs/xfs_discard.c | 4 ++--
include/linux/blkdev.h | 5 +++++
18 files changed, 38 insertions(+), 47 deletions(-)

diff --git a/block/blk-lib.c b/block/blk-lib.c
index 8b4b66d3a9bfc..43aa4d7fe859f 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -12,8 +12,7 @@

static sector_t bio_discard_limit(struct block_device *bdev, sector_t sector)
{
- unsigned int discard_granularity =
- bdev_get_queue(bdev)->limits.discard_granularity;
+ unsigned int discard_granularity = bdev_discard_granularity(bdev);
sector_t granularity_aligned_sector;

if (bdev_is_partition(bdev))
@@ -59,7 +58,7 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
}

/* In case the discard granularity isn't set by buggy device driver */
- if (WARN_ON_ONCE(!q->limits.discard_granularity)) {
+ if (WARN_ON_ONCE(!bdev_discard_granularity(bdev))) {
char dev_name[BDEVNAME_SIZE];

bdevname(bdev, dev_name);
diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index b55e5fcc21e1f..0b3e43be6414d 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -1425,7 +1425,6 @@ static void sanitize_disk_conf(struct drbd_device *device, struct disk_conf *dis
struct drbd_backing_dev *nbc)
{
struct block_device *bdev = nbc->backing_bdev;
- struct request_queue *q = bdev->bd_disk->queue;

if (disk_conf->al_extents < DRBD_AL_EXTENTS_MIN)
disk_conf->al_extents = DRBD_AL_EXTENTS_MIN;
@@ -1442,12 +1441,14 @@ static void sanitize_disk_conf(struct drbd_device *device, struct disk_conf *dis
if (disk_conf->rs_discard_granularity) {
int orig_value = disk_conf->rs_discard_granularity;
sector_t discard_size = bdev_max_discard_sectors(bdev) << 9;
+ unsigned int discard_granularity = bdev_discard_granularity(bdev);
int remainder;

- if (q->limits.discard_granularity > disk_conf->rs_discard_granularity)
- disk_conf->rs_discard_granularity = q->limits.discard_granularity;
+ if (discard_granularity > disk_conf->rs_discard_granularity)
+ disk_conf->rs_discard_granularity = discard_granularity;

- remainder = disk_conf->rs_discard_granularity % q->limits.discard_granularity;
+ remainder = disk_conf->rs_discard_granularity %
+ discard_granularity;
disk_conf->rs_discard_granularity += remainder;

if (disk_conf->rs_discard_granularity > discard_size)
diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index 8a4a47da56fe9..275c53c7b629e 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -1511,7 +1511,6 @@ void drbd_bump_write_ordering(struct drbd_resource *resource, struct drbd_backin
int drbd_issue_discard_or_zero_out(struct drbd_device *device, sector_t start, unsigned int nr_sectors, int flags)
{
struct block_device *bdev = device->ldev->backing_bdev;
- struct request_queue *q = bdev_get_queue(bdev);
sector_t tmp, nr;
unsigned int max_discard_sectors, granularity;
int alignment;
@@ -1521,7 +1520,7 @@ int drbd_issue_discard_or_zero_out(struct drbd_device *device, sector_t start, u
goto zero_out;

/* Zero-sector (unknown) and one-sector granularities are the same. */
- granularity = max(q->limits.discard_granularity >> 9, 1U);
+ granularity = max(bdev_discard_granularity(bdev) >> 9, 1U);
alignment = (bdev_discard_alignment(bdev) >> 9) % granularity;

max_discard_sectors = min(bdev_max_discard_sectors(bdev), (1U << 22));
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 0e061c9896eff..976cf987b3920 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -759,7 +759,7 @@ static void loop_config_discard(struct loop_device *lo)
struct request_queue *backingq = bdev_get_queue(I_BDEV(inode));

max_discard_sectors = backingq->limits.max_write_zeroes_sectors;
- granularity = backingq->limits.discard_granularity ?:
+ granularity = bdev_discard_granularity(I_BDEV(inode)) ?:
queue_physical_block_size(backingq);

/*
diff --git a/drivers/target/target_core_device.c b/drivers/target/target_core_device.c
index 6cb9f87843278..25f33eb25337c 100644
--- a/drivers/target/target_core_device.c
+++ b/drivers/target/target_core_device.c
@@ -835,7 +835,6 @@ struct se_device *target_alloc_device(struct se_hba *hba, const char *name)
bool target_configure_unmap_from_queue(struct se_dev_attrib *attrib,
struct block_device *bdev)
{
- struct request_queue *q = bdev_get_queue(bdev);
int block_size = bdev_logical_block_size(bdev);

if (!bdev_max_discard_sectors(bdev))
@@ -847,7 +846,7 @@ bool target_configure_unmap_from_queue(struct se_dev_attrib *attrib,
* Currently hardcoded to 1 in Linux/SCSI code..
*/
attrib->max_unmap_block_desc_count = 1;
- attrib->unmap_granularity = q->limits.discard_granularity / block_size;
+ attrib->unmap_granularity = bdev_discard_granularity(bdev) / block_size;
attrib->unmap_granularity_alignment =
bdev_discard_alignment(bdev) / block_size;
return true;
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 096bb0da03f1c..70765d59616a5 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -468,7 +468,6 @@ static noinline int btrfs_ioctl_fitrim(struct btrfs_fs_info *fs_info,
void __user *arg)
{
struct btrfs_device *device;
- struct request_queue *q;
struct fstrim_range range;
u64 minlen = ULLONG_MAX;
u64 num_devices = 0;
@@ -498,14 +497,11 @@ static noinline int btrfs_ioctl_fitrim(struct btrfs_fs_info *fs_info,
rcu_read_lock();
list_for_each_entry_rcu(device, &fs_info->fs_devices->devices,
dev_list) {
- if (!device->bdev)
+ if (!device->bdev || !bdev_max_discard_sectors(device->bdev))
continue;
- q = bdev_get_queue(device->bdev);
- if (bdev_max_discard_sectors(device->bdev)) {
- num_devices++;
- minlen = min_t(u64, q->limits.discard_granularity,
- minlen);
- }
+ num_devices++;
+ minlen = min_t(u64, bdev_discard_granularity(device->bdev),
+ minlen);
}
rcu_read_unlock();

diff --git a/fs/exfat/file.c b/fs/exfat/file.c
index 765e4f63dd18d..20d4e47f57ab2 100644
--- a/fs/exfat/file.c
+++ b/fs/exfat/file.c
@@ -351,7 +351,6 @@ int exfat_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,

static int exfat_ioctl_fitrim(struct inode *inode, unsigned long arg)
{
- struct request_queue *q = bdev_get_queue(inode->i_sb->s_bdev);
struct fstrim_range range;
int ret = 0;

@@ -365,7 +364,7 @@ static int exfat_ioctl_fitrim(struct inode *inode, unsigned long arg)
return -EFAULT;

range.minlen = max_t(unsigned int, range.minlen,
- q->limits.discard_granularity);
+ bdev_discard_granularity(inode->i_sb->s_bdev));

ret = exfat_trim_fs(inode, &range);
if (ret < 0)
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index c3668c977cd99..6d1820536d88d 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -6455,7 +6455,7 @@ ext4_trim_all_free(struct super_block *sb, ext4_group_t group,
*/
int ext4_trim_fs(struct super_block *sb, struct fstrim_range *range)
{
- struct request_queue *q = bdev_get_queue(sb->s_bdev);
+ unsigned int discard_granularity = bdev_discard_granularity(sb->s_bdev);
struct ext4_group_info *grp;
ext4_group_t group, first_group, last_group;
ext4_grpblk_t cnt = 0, first_cluster, last_cluster;
@@ -6475,9 +6475,9 @@ int ext4_trim_fs(struct super_block *sb, struct fstrim_range *range)
range->len < sb->s_blocksize)
return -EINVAL;
/* No point to try to trim less than discard granularity */
- if (range->minlen < q->limits.discard_granularity) {
+ if (range->minlen < discard_granularity) {
minlen = EXT4_NUM_B2C(EXT4_SB(sb),
- q->limits.discard_granularity >> sb->s_blocksize_bits);
+ discard_granularity >> sb->s_blocksize_bits);
if (minlen > EXT4_CLUSTERS_PER_GROUP(sb))
goto out;
}
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 5b89af0f27f05..8053d99f3920b 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -2285,7 +2285,6 @@ static int f2fs_ioc_fitrim(struct file *filp, unsigned long arg)
{
struct inode *inode = file_inode(filp);
struct super_block *sb = inode->i_sb;
- struct request_queue *q = bdev_get_queue(sb->s_bdev);
struct fstrim_range range;
int ret;

@@ -2304,7 +2303,7 @@ static int f2fs_ioc_fitrim(struct file *filp, unsigned long arg)
return ret;

range.minlen = max((unsigned int)range.minlen,
- q->limits.discard_granularity);
+ bdev_discard_granularity(sb->s_bdev));
ret = f2fs_trim_fs(F2FS_SB(sb), &range);
mnt_drop_write_file(filp);
if (ret < 0)
diff --git a/fs/fat/file.c b/fs/fat/file.c
index e4c7d10e80129..bf91f977debea 100644
--- a/fs/fat/file.c
+++ b/fs/fat/file.c
@@ -127,7 +127,6 @@ static int fat_ioctl_fitrim(struct inode *inode, unsigned long arg)
struct super_block *sb = inode->i_sb;
struct fstrim_range __user *user_range;
struct fstrim_range range;
- struct request_queue *q = bdev_get_queue(sb->s_bdev);
int err;

if (!capable(CAP_SYS_ADMIN))
@@ -141,7 +140,7 @@ static int fat_ioctl_fitrim(struct inode *inode, unsigned long arg)
return -EFAULT;

range.minlen = max_t(unsigned int, range.minlen,
- q->limits.discard_granularity);
+ bdev_discard_granularity(sb->s_bdev));

err = fat_trim_fs(inode, &range);
if (err < 0)
diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index 7f20ac9133bc6..6d26bb5254844 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -1386,7 +1386,7 @@ int gfs2_fitrim(struct file *filp, void __user *argp)
{
struct inode *inode = file_inode(filp);
struct gfs2_sbd *sdp = GFS2_SB(inode);
- struct request_queue *q = bdev_get_queue(sdp->sd_vfs->s_bdev);
+ struct block_device *bdev = sdp->sd_vfs->s_bdev;
struct buffer_head *bh;
struct gfs2_rgrpd *rgd;
struct gfs2_rgrpd *rgd_end;
@@ -1405,7 +1405,7 @@ int gfs2_fitrim(struct file *filp, void __user *argp)
if (!test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags))
return -EROFS;

- if (!bdev_max_discard_sectors(sdp->sd_vfs->s_bdev))
+ if (!bdev_max_discard_sectors(bdev))
return -EOPNOTSUPP;

if (copy_from_user(&r, argp, sizeof(r)))
@@ -1418,8 +1418,7 @@ int gfs2_fitrim(struct file *filp, void __user *argp)
start = r.start >> bs_shift;
end = start + (r.len >> bs_shift);
minlen = max_t(u64, r.minlen, sdp->sd_sb.sb_bsize);
- minlen = max_t(u64, minlen,
- q->limits.discard_granularity) >> bs_shift;
+ minlen = max_t(u64, minlen, bdev_discard_granularity(bdev)) >> bs_shift;

if (end <= start || minlen > sdp->sd_max_rg_data)
return -EINVAL;
diff --git a/fs/jfs/ioctl.c b/fs/jfs/ioctl.c
index 357ae6e5c36ec..1e7b177ece605 100644
--- a/fs/jfs/ioctl.c
+++ b/fs/jfs/ioctl.c
@@ -110,7 +110,6 @@ long jfs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
case FITRIM:
{
struct super_block *sb = inode->i_sb;
- struct request_queue *q = bdev_get_queue(sb->s_bdev);
struct fstrim_range range;
s64 ret = 0;

@@ -127,7 +126,7 @@ long jfs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
return -EFAULT;

range.minlen = max_t(unsigned int, range.minlen,
- q->limits.discard_granularity);
+ bdev_discard_granularity(sb->s_bdev));

ret = jfs_ioc_trim(inode, &range);
if (ret < 0)
diff --git a/fs/nilfs2/ioctl.c b/fs/nilfs2/ioctl.c
index 52b73f558fcb1..87e1004b606d2 100644
--- a/fs/nilfs2/ioctl.c
+++ b/fs/nilfs2/ioctl.c
@@ -1052,7 +1052,6 @@ static int nilfs_ioctl_resize(struct inode *inode, struct file *filp,
static int nilfs_ioctl_trim_fs(struct inode *inode, void __user *argp)
{
struct the_nilfs *nilfs = inode->i_sb->s_fs_info;
- struct request_queue *q = bdev_get_queue(nilfs->ns_bdev);
struct fstrim_range range;
int ret;

@@ -1065,7 +1064,8 @@ static int nilfs_ioctl_trim_fs(struct inode *inode, void __user *argp)
if (copy_from_user(&range, argp, sizeof(range)))
return -EFAULT;

- range.minlen = max_t(u64, range.minlen, q->limits.discard_granularity);
+ range.minlen = max_t(u64, range.minlen,
+ bdev_discard_granularity(nilfs->ns_bdev));

down_read(&nilfs->ns_segctor_sem);
ret = nilfs_sufile_trim_fs(nilfs->ns_sufile, &range);
diff --git a/fs/ntfs3/file.c b/fs/ntfs3/file.c
index e763236169331..15806eeae217a 100644
--- a/fs/ntfs3/file.c
+++ b/fs/ntfs3/file.c
@@ -22,7 +22,6 @@ static int ntfs_ioctl_fitrim(struct ntfs_sb_info *sbi, unsigned long arg)
{
struct fstrim_range __user *user_range;
struct fstrim_range range;
- struct request_queue *q = bdev_get_queue(sbi->sb->s_bdev);
int err;

if (!capable(CAP_SYS_ADMIN))
@@ -35,7 +34,8 @@ static int ntfs_ioctl_fitrim(struct ntfs_sb_info *sbi, unsigned long arg)
if (copy_from_user(&range, user_range, sizeof(range)))
return -EFAULT;

- range.minlen = max_t(u32, range.minlen, q->limits.discard_granularity);
+ range.minlen = max_t(u32, range.minlen,
+ bdev_discard_granularity(sbi->sb->s_bdev));

err = ntfs_trim_fs(sbi, &range);
if (err < 0)
diff --git a/fs/ntfs3/super.c b/fs/ntfs3/super.c
index c734085bcce4a..5f2e414cfa79b 100644
--- a/fs/ntfs3/super.c
+++ b/fs/ntfs3/super.c
@@ -882,7 +882,6 @@ static int ntfs_fill_super(struct super_block *sb, struct fs_context *fc)
int err;
struct ntfs_sb_info *sbi = sb->s_fs_info;
struct block_device *bdev = sb->s_bdev;
- struct request_queue *rq;
struct inode *inode;
struct ntfs_inode *ni;
size_t i, tt;
@@ -912,9 +911,8 @@ static int ntfs_fill_super(struct super_block *sb, struct fs_context *fc)
goto out;
}

- rq = bdev_get_queue(bdev);
- if (bdev_max_discard_sectors(bdev) && rq->limits.discard_granularity) {
- sbi->discard_granularity = rq->limits.discard_granularity;
+ if (bdev_max_discard_sectors(bdev) && bdev_discard_granularity(bdev)) {
+ sbi->discard_granularity = bdev_discard_granularity(bdev);
sbi->discard_granularity_mask_inv =
~(u64)(sbi->discard_granularity - 1);
}
diff --git a/fs/ocfs2/ioctl.c b/fs/ocfs2/ioctl.c
index 9b78ef103ada6..afd54ec661030 100644
--- a/fs/ocfs2/ioctl.c
+++ b/fs/ocfs2/ioctl.c
@@ -903,7 +903,6 @@ long ocfs2_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
case FITRIM:
{
struct super_block *sb = inode->i_sb;
- struct request_queue *q = bdev_get_queue(sb->s_bdev);
struct fstrim_range range;
int ret = 0;

@@ -916,7 +915,7 @@ long ocfs2_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
if (copy_from_user(&range, argp, sizeof(range)))
return -EFAULT;

- range.minlen = max_t(u64, q->limits.discard_granularity,
+ range.minlen = max_t(u64, bdev_discard_granularity(sb->s_bdev),
range.minlen);
ret = ocfs2_trim_fs(sb, &range);
if (ret < 0)
diff --git a/fs/xfs/xfs_discard.c b/fs/xfs/xfs_discard.c
index a4e6609d616b7..e2ada115c23f9 100644
--- a/fs/xfs/xfs_discard.c
+++ b/fs/xfs/xfs_discard.c
@@ -152,8 +152,8 @@ xfs_ioc_trim(
struct xfs_mount *mp,
struct fstrim_range __user *urange)
{
- struct request_queue *q = bdev_get_queue(mp->m_ddev_targp->bt_bdev);
- unsigned int granularity = q->limits.discard_granularity;
+ unsigned int granularity =
+ bdev_discard_granularity(mp->m_ddev_targp->bt_bdev);
struct fstrim_range range;
xfs_daddr_t start, end, minlen;
xfs_agnumber_t start_agno, end_agno, agno;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 767ab22e1052a..f1cf557ea20ef 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1257,6 +1257,11 @@ static inline unsigned int bdev_max_discard_sectors(struct block_device *bdev)
return bdev_get_queue(bdev)->limits.max_discard_sectors;
}

+static inline unsigned int bdev_discard_granularity(struct block_device *bdev)
+{
+ return bdev_get_queue(bdev)->limits.discard_granularity;
+}
+
static inline unsigned int bdev_write_zeroes_sectors(struct block_device *bdev)
{
struct request_queue *q = bdev_get_queue(bdev);
--
2.30.2

2022-04-10 18:24:12

by Christoph Hellwig

Subject: [PATCH 13/27] block: add a bdev_fua helper

Add a helper to check the FUA flag based on the block_device instead of
having to poke into the block layer internal request_queue.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>
---
drivers/block/rnbd/rnbd-srv.c | 3 +--
drivers/target/target_core_iblock.c | 3 +--
fs/iomap/direct-io.c | 3 +--
include/linux/blkdev.h | 6 +++++-
4 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/block/rnbd/rnbd-srv.c b/drivers/block/rnbd/rnbd-srv.c
index f8cc3c5fecb4b..beaef43a67b9d 100644
--- a/drivers/block/rnbd/rnbd-srv.c
+++ b/drivers/block/rnbd/rnbd-srv.c
@@ -533,7 +533,6 @@ static void rnbd_srv_fill_msg_open_rsp(struct rnbd_msg_open_rsp *rsp,
struct rnbd_srv_sess_dev *sess_dev)
{
struct rnbd_dev *rnbd_dev = sess_dev->rnbd_dev;
- struct request_queue *q = bdev_get_queue(rnbd_dev->bdev);

rsp->hdr.type = cpu_to_le16(RNBD_MSG_OPEN_RSP);
rsp->device_id =
@@ -560,7 +559,7 @@ static void rnbd_srv_fill_msg_open_rsp(struct rnbd_msg_open_rsp *rsp,
rsp->cache_policy = 0;
if (bdev_write_cache(rnbd_dev->bdev))
rsp->cache_policy |= RNBD_WRITEBACK;
- if (blk_queue_fua(q))
+ if (bdev_fua(rnbd_dev->bdev))
rsp->cache_policy |= RNBD_FUA;
}

diff --git a/drivers/target/target_core_iblock.c b/drivers/target/target_core_iblock.c
index 03013e85ffc03..c4a903b8a47fc 100644
--- a/drivers/target/target_core_iblock.c
+++ b/drivers/target/target_core_iblock.c
@@ -727,14 +727,13 @@ iblock_execute_rw(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,

if (data_direction == DMA_TO_DEVICE) {
struct iblock_dev *ib_dev = IBLOCK_DEV(dev);
- struct request_queue *q = bdev_get_queue(ib_dev->ibd_bd);
/*
* Force writethrough using REQ_FUA if a volatile write cache
* is not enabled, or if initiator set the Force Unit Access bit.
*/
opf = REQ_OP_WRITE;
miter_dir = SG_MITER_TO_SG;
- if (test_bit(QUEUE_FLAG_FUA, &q->queue_flags)) {
+ if (bdev_fua(ib_dev->ibd_bd)) {
if (cmd->se_cmd_flags & SCF_FUA)
opf |= REQ_FUA;
else if (!bdev_write_cache(ib_dev->ibd_bd))
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index b08f5dc31780d..62da020d02a11 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -265,8 +265,7 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
* cache flushes on IO completion.
*/
if (!(iomap->flags & (IOMAP_F_SHARED|IOMAP_F_DIRTY)) &&
- (dio->flags & IOMAP_DIO_WRITE_FUA) &&
- blk_queue_fua(bdev_get_queue(iomap->bdev)))
+ (dio->flags & IOMAP_DIO_WRITE_FUA) && bdev_fua(iomap->bdev))
use_fua = true;
}

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 807a49aa5a27a..075b16d4560e7 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -602,7 +602,6 @@ bool blk_queue_flag_test_and_set(unsigned int flag, struct request_queue *q);
REQ_FAILFAST_DRIVER))
#define blk_queue_quiesced(q) test_bit(QUEUE_FLAG_QUIESCED, &(q)->queue_flags)
#define blk_queue_pm_only(q) atomic_read(&(q)->pm_only)
-#define blk_queue_fua(q) test_bit(QUEUE_FLAG_FUA, &(q)->queue_flags)
#define blk_queue_registered(q) test_bit(QUEUE_FLAG_REGISTERED, &(q)->queue_flags)
#define blk_queue_nowait(q) test_bit(QUEUE_FLAG_NOWAIT, &(q)->queue_flags)

@@ -1336,6 +1335,11 @@ static inline bool bdev_write_cache(struct block_device *bdev)
return test_bit(QUEUE_FLAG_WC, &bdev_get_queue(bdev)->queue_flags);
}

+static inline bool bdev_fua(struct block_device *bdev)
+{
+ return test_bit(QUEUE_FLAG_FUA, &bdev_get_queue(bdev)->queue_flags);
+}
+
static inline enum blk_zoned_model bdev_zoned_model(struct block_device *bdev)
{
struct request_queue *q = bdev_get_queue(bdev);
--
2.30.2

2022-04-11 07:57:38

by Christoph Hellwig

Subject: [PATCH 18/27] block: move bdev_alignment_offset and queue_limit_alignment_offset out of line

No need to inline these fairly large helpers.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>
---
block/blk-settings.c | 23 +++++++++++++++++++++++
include/linux/blkdev.h | 21 +--------------------
2 files changed, 24 insertions(+), 20 deletions(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index b83df3d2eebca..94410a13c0dee 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -468,6 +468,16 @@ void blk_queue_io_opt(struct request_queue *q, unsigned int opt)
}
EXPORT_SYMBOL(blk_queue_io_opt);

+static int queue_limit_alignment_offset(struct queue_limits *lim,
+ sector_t sector)
+{
+ unsigned int granularity = max(lim->physical_block_size, lim->io_min);
+ unsigned int alignment = sector_div(sector, granularity >> SECTOR_SHIFT)
+ << SECTOR_SHIFT;
+
+ return (granularity + lim->alignment_offset - alignment) % granularity;
+}
+
static unsigned int blk_round_down_sectors(unsigned int sectors, unsigned int lbs)
{
sectors = round_down(sectors, lbs >> SECTOR_SHIFT);
@@ -901,3 +911,16 @@ void blk_queue_set_zoned(struct gendisk *disk, enum blk_zoned_model model)
}
}
EXPORT_SYMBOL_GPL(blk_queue_set_zoned);
+
+int bdev_alignment_offset(struct block_device *bdev)
+{
+ struct request_queue *q = bdev_get_queue(bdev);
+
+ if (q->limits.misaligned)
+ return -1;
+ if (bdev_is_partition(bdev))
+ return queue_limit_alignment_offset(&q->limits,
+ bdev->bd_start_sect);
+ return q->limits.alignment_offset;
+}
+EXPORT_SYMBOL_GPL(bdev_alignment_offset);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index d5346e72e3645..0a1795ac26275 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1251,26 +1251,7 @@ bdev_zone_write_granularity(struct block_device *bdev)
return queue_zone_write_granularity(bdev_get_queue(bdev));
}

-static inline int queue_limit_alignment_offset(struct queue_limits *lim, sector_t sector)
-{
- unsigned int granularity = max(lim->physical_block_size, lim->io_min);
- unsigned int alignment = sector_div(sector, granularity >> SECTOR_SHIFT)
- << SECTOR_SHIFT;
-
- return (granularity + lim->alignment_offset - alignment) % granularity;
-}
-
-static inline int bdev_alignment_offset(struct block_device *bdev)
-{
- struct request_queue *q = bdev_get_queue(bdev);
-
- if (q->limits.misaligned)
- return -1;
- if (bdev_is_partition(bdev))
- return queue_limit_alignment_offset(&q->limits,
- bdev->bd_start_sect);
- return q->limits.alignment_offset;
-}
+int bdev_alignment_offset(struct block_device *bdev);

static inline int queue_discard_alignment(const struct request_queue *q)
{
--
2.30.2

2022-04-11 08:22:07

by Christoph Hellwig

Subject: [PATCH 03/27] target: fix discard alignment on partitions

Use the proper bdev_discard_alignment helper that accounts for partition
offsets.
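As a sketch of why the queue-level field is not enough here:
q->limits.discard_alignment is expressed relative to the start of the
whole disk, so for a partition the alignment has to be re-derived from
the partition start sector. The math below mirrors
queue_limit_discard_alignment() (moved out of line in patch 21 later in
this thread); partition_discard_alignment() is a made-up name used only
for illustration, the real helper is bdev_discard_alignment():

	static unsigned int partition_discard_alignment(struct block_device *bdev)
	{
		struct queue_limits *lim = &bdev_get_queue(bdev)->limits;
		unsigned int granularity = lim->discard_granularity >> SECTOR_SHIFT;
		unsigned int alignment = lim->discard_alignment >> SECTOR_SHIFT;
		sector_t sector = bdev->bd_start_sect;
		unsigned int offset;

		if (!lim->max_discard_sectors || !granularity)
			return 0;

		/* offset of the partition start in 'granularity' sectors */
		offset = sector_div(sector, granularity);
		offset = (granularity + alignment - offset) % granularity;

		/* and back to bytes, matching the queue_limits fields */
		return offset << SECTOR_SHIFT;
	}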

Fixes: c66ac9db8d4a ("[SCSI] target: Add LIO target core v4.0.0-rc6")
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>
---
drivers/target/target_core_device.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/target/target_core_device.c b/drivers/target/target_core_device.c
index 3a1ec705cd80b..16e775bcf4a7c 100644
--- a/drivers/target/target_core_device.c
+++ b/drivers/target/target_core_device.c
@@ -849,8 +849,8 @@ bool target_configure_unmap_from_queue(struct se_dev_attrib *attrib,
*/
attrib->max_unmap_block_desc_count = 1;
attrib->unmap_granularity = q->limits.discard_granularity / block_size;
- attrib->unmap_granularity_alignment = q->limits.discard_alignment /
- block_size;
+ attrib->unmap_granularity_alignment =
+ bdev_discard_alignment(bdev) / block_size;
return true;
}
EXPORT_SYMBOL(target_configure_unmap_from_queue);
--
2.30.2

2022-04-11 08:51:32

by Christoph Hellwig

Subject: [PATCH 21/27] block: move {bdev,queue_limit}_discard_alignment out of line

No need to inline these fairly large helpers. Also fix the return value
to be unsigned, just like the field in struct queue_limits.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>
---
block/blk-settings.c | 35 +++++++++++++++++++++++++++++++++++
include/linux/blkdev.h | 34 +---------------------------------
2 files changed, 36 insertions(+), 33 deletions(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 94410a13c0dee..fd83d674afd0a 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -478,6 +478,30 @@ static int queue_limit_alignment_offset(struct queue_limits *lim,
return (granularity + lim->alignment_offset - alignment) % granularity;
}

+static unsigned int queue_limit_discard_alignment(struct queue_limits *lim,
+ sector_t sector)
+{
+ unsigned int alignment, granularity, offset;
+
+ if (!lim->max_discard_sectors)
+ return 0;
+
+ /* Why are these in bytes, not sectors? */
+ alignment = lim->discard_alignment >> SECTOR_SHIFT;
+ granularity = lim->discard_granularity >> SECTOR_SHIFT;
+ if (!granularity)
+ return 0;
+
+ /* Offset of the partition start in 'granularity' sectors */
+ offset = sector_div(sector, granularity);
+
+ /* And why do we do this modulus *again* in blkdev_issue_discard()? */
+ offset = (granularity + alignment - offset) % granularity;
+
+ /* Turn it back into bytes, gaah */
+ return offset << SECTOR_SHIFT;
+}
+
static unsigned int blk_round_down_sectors(unsigned int sectors, unsigned int lbs)
{
sectors = round_down(sectors, lbs >> SECTOR_SHIFT);
@@ -924,3 +948,14 @@ int bdev_alignment_offset(struct block_device *bdev)
return q->limits.alignment_offset;
}
EXPORT_SYMBOL_GPL(bdev_alignment_offset);
+
+unsigned int bdev_discard_alignment(struct block_device *bdev)
+{
+ struct request_queue *q = bdev_get_queue(bdev);
+
+ if (bdev_is_partition(bdev))
+ return queue_limit_discard_alignment(&q->limits,
+ bdev->bd_start_sect);
+ return q->limits.discard_alignment;
+}
+EXPORT_SYMBOL_GPL(bdev_discard_alignment);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 5a9b7aeda010b..34b1cfd067421 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1252,39 +1252,7 @@ bdev_zone_write_granularity(struct block_device *bdev)
}

int bdev_alignment_offset(struct block_device *bdev);
-
-static inline int queue_limit_discard_alignment(struct queue_limits *lim, sector_t sector)
-{
- unsigned int alignment, granularity, offset;
-
- if (!lim->max_discard_sectors)
- return 0;
-
- /* Why are these in bytes, not sectors? */
- alignment = lim->discard_alignment >> SECTOR_SHIFT;
- granularity = lim->discard_granularity >> SECTOR_SHIFT;
- if (!granularity)
- return 0;
-
- /* Offset of the partition start in 'granularity' sectors */
- offset = sector_div(sector, granularity);
-
- /* And why do we do this modulus *again* in blkdev_issue_discard()? */
- offset = (granularity + alignment - offset) % granularity;
-
- /* Turn it back into bytes, gaah */
- return offset << SECTOR_SHIFT;
-}
-
-static inline int bdev_discard_alignment(struct block_device *bdev)
-{
- struct request_queue *q = bdev_get_queue(bdev);
-
- if (bdev_is_partition(bdev))
- return queue_limit_discard_alignment(&q->limits,
- bdev->bd_start_sect);
- return q->limits.discard_alignment;
-}
+unsigned int bdev_discard_alignment(struct block_device *bdev);

static inline unsigned int bdev_write_zeroes_sectors(struct block_device *bdev)
{
--
2.30.2

2022-04-11 10:19:36

by Christoph Hellwig

Subject: [PATCH 20/27] block: use bdev_discard_alignment in part_discard_alignment_show

Use the bdev based alignment helper instead of open coding it.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>
---
block/partitions/core.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/block/partitions/core.c b/block/partitions/core.c
index 240b3fff521e4..70dec1c78521d 100644
--- a/block/partitions/core.c
+++ b/block/partitions/core.c
@@ -206,11 +206,7 @@ static ssize_t part_alignment_offset_show(struct device *dev,
static ssize_t part_discard_alignment_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
- struct block_device *bdev = dev_to_bdev(dev);
-
- return sprintf(buf, "%u\n",
- queue_limit_discard_alignment(&bdev_get_queue(bdev)->limits,
- bdev->bd_start_sect));
+ return sprintf(buf, "%u\n", bdev_discard_alignment(dev_to_bdev(dev)));
}

static DEVICE_ATTR(partition, 0444, part_partition_show, NULL);
--
2.30.2

2022-04-11 10:46:05

by Christoph Hellwig

Subject: [PATCH 06/27] drbd: use bdev_alignment_offset instead of queue_alignment_offset

The bdev version does the right thing for partitions, so use that.

Fixes: 9104d31a759f ("drbd: introduce WRITE_SAME support")
Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Christoph Böhmwalder <[email protected]>
---
drivers/block/drbd/drbd_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index aa2c21aeb747c..eae629c4f6eaf 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -939,7 +939,7 @@ int drbd_send_sizes(struct drbd_peer_device *peer_device, int trigger_reply, enu
p->qlim->logical_block_size =
cpu_to_be32(bdev_logical_block_size(bdev));
p->qlim->alignment_offset =
- cpu_to_be32(queue_alignment_offset(q));
+ cpu_to_be32(bdev_alignment_offset(bdev));
p->qlim->io_min = cpu_to_be32(bdev_io_min(bdev));
p->qlim->io_opt = cpu_to_be32(bdev_io_opt(bdev));
p->qlim->discard_enabled = blk_queue_discard(q);
--
2.30.2

2022-04-11 12:40:47

by Christoph Hellwig

Subject: [PATCH 16/27] block: use bdev_alignment_offset in part_alignment_offset_show

Replace the open coded offset calculation with the proper helper.
This is an ABI change in that the -1 for a misaligned partition is
properly propagated, which can be considered a bug fix and matches
what is done on the whole device.
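For reference, the -1 propagation comes from the misaligned check in the
helper; this is the same bdev_alignment_offset() that patch 18 later in
this thread moves out of line, repeated here only to show where the
behaviour difference to the old open coded sysfs path comes from:

	int bdev_alignment_offset(struct block_device *bdev)
	{
		struct request_queue *q = bdev_get_queue(bdev);

		if (q->limits.misaligned)
			return -1;
		if (bdev_is_partition(bdev))
			return queue_limit_alignment_offset(&q->limits,
					bdev->bd_start_sect);
		return q->limits.alignment_offset;
	}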

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>
---
block/partitions/core.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/block/partitions/core.c b/block/partitions/core.c
index 2ef8dfa1e5c85..240b3fff521e4 100644
--- a/block/partitions/core.c
+++ b/block/partitions/core.c
@@ -200,11 +200,7 @@ static ssize_t part_ro_show(struct device *dev,
static ssize_t part_alignment_offset_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
- struct block_device *bdev = dev_to_bdev(dev);
-
- return sprintf(buf, "%u\n",
- queue_limit_alignment_offset(&bdev_get_queue(bdev)->limits,
- bdev->bd_start_sect));
+ return sprintf(buf, "%u\n", bdev_alignment_offset(dev_to_bdev(dev)));
}

static ssize_t part_discard_alignment_show(struct device *dev,
--
2.30.2

2022-04-11 17:03:07

by Christoph Hellwig

Subject: [PATCH 15/27] block: add a bdev_max_zone_append_sectors helper

Add a helper to check the max supported sectors for zone append based on
the block_device instead of having to poke into the block layer internal
request_queue.

Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Damien Le Moal <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
---
drivers/nvme/target/zns.c | 3 +--
fs/zonefs/super.c | 3 +--
include/linux/blkdev.h | 6 ++++++
3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/target/zns.c b/drivers/nvme/target/zns.c
index e34718b095504..82b61acf7a72b 100644
--- a/drivers/nvme/target/zns.c
+++ b/drivers/nvme/target/zns.c
@@ -34,8 +34,7 @@ static int validate_conv_zones_cb(struct blk_zone *z,

bool nvmet_bdev_zns_enable(struct nvmet_ns *ns)
{
- struct request_queue *q = ns->bdev->bd_disk->queue;
- u8 zasl = nvmet_zasl(queue_max_zone_append_sectors(q));
+ u8 zasl = nvmet_zasl(bdev_max_zone_append_sectors(ns->bdev));
struct gendisk *bd_disk = ns->bdev->bd_disk;
int ret;

diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c
index 3614c7834007d..7a63807b736c4 100644
--- a/fs/zonefs/super.c
+++ b/fs/zonefs/super.c
@@ -678,13 +678,12 @@ static ssize_t zonefs_file_dio_append(struct kiocb *iocb, struct iov_iter *from)
struct inode *inode = file_inode(iocb->ki_filp);
struct zonefs_inode_info *zi = ZONEFS_I(inode);
struct block_device *bdev = inode->i_sb->s_bdev;
- unsigned int max;
+ unsigned int max = bdev_max_zone_append_sectors(bdev);
struct bio *bio;
ssize_t size;
int nr_pages;
ssize_t ret;

- max = queue_max_zone_append_sectors(bdev_get_queue(bdev));
max = ALIGN_DOWN(max << SECTOR_SHIFT, inode->i_sb->s_blocksize);
iov_iter_truncate(from, max);

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index a433798c3343e..f8c50b77543eb 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1188,6 +1188,12 @@ static inline unsigned int queue_max_zone_append_sectors(const struct request_qu
return min(l->max_zone_append_sectors, l->max_sectors);
}

+static inline unsigned int
+bdev_max_zone_append_sectors(struct block_device *bdev)
+{
+ return queue_max_zone_append_sectors(bdev_get_queue(bdev));
+}
+
static inline unsigned queue_logical_block_size(const struct request_queue *q)
{
int retval = 512;
--
2.30.2

2022-04-12 07:17:36

by David Sterba

Subject: Re: [PATCH 25/27] block: add a bdev_discard_granularity helper

On Sat, Apr 09, 2022 at 06:50:41AM +0200, Christoph Hellwig wrote:
> Abstract away implementation details from file systems by providing a
> block_device based helper to retrieve the discard granularity.
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> Reviewed-by: Martin K. Petersen <[email protected]>
> Acked-by: Christoph Böhmwalder <[email protected]> [btrfs]

This ^^^^ is for drbd

> Acked-by: Ryusuke Konishi <[email protected]>
> Acked-by: David Sterba <[email protected]> [btrfs]

2022-04-12 07:59:07

by Christoph Hellwig

Subject: [PATCH 27/27] direct-io: remove random prefetches

Randomly poking into block device internals for manual prefetches isn't
exactly a very maintainable thing to do. And none of the performance
critical direct I/O implementations still use this library function
anyway, so just drop it.

Signed-off-by: Christoph Hellwig <[email protected]>
---
fs/direct-io.c | 32 ++++----------------------------
1 file changed, 4 insertions(+), 28 deletions(-)

diff --git a/fs/direct-io.c b/fs/direct-io.c
index aef06e607b405..840752006f601 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -1115,11 +1115,10 @@ static inline int drop_refcount(struct dio *dio)
* individual fields and will generate much worse code. This is important
* for the whole file.
*/
-static inline ssize_t
-do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
- struct block_device *bdev, struct iov_iter *iter,
- get_block_t get_block, dio_iodone_t end_io,
- dio_submit_t submit_io, int flags)
+ssize_t __blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
+ struct block_device *bdev, struct iov_iter *iter,
+ get_block_t get_block, dio_iodone_t end_io,
+ dio_submit_t submit_io, int flags)
{
unsigned i_blkbits = READ_ONCE(inode->i_blkbits);
unsigned blkbits = i_blkbits;
@@ -1334,29 +1333,6 @@ do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
kmem_cache_free(dio_cache, dio);
return retval;
}
-
-ssize_t __blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
- struct block_device *bdev, struct iov_iter *iter,
- get_block_t get_block,
- dio_iodone_t end_io, dio_submit_t submit_io,
- int flags)
-{
- /*
- * The block device state is needed in the end to finally
- * submit everything. Since it's likely to be cache cold
- * prefetch it here as first thing to hide some of the
- * latency.
- *
- * Attempt to prefetch the pieces we likely need later.
- */
- prefetch(&bdev->bd_disk->part_tbl);
- prefetch(bdev->bd_disk->queue);
- prefetch((char *)bdev->bd_disk->queue + SMP_CACHE_BYTES);
-
- return do_blockdev_direct_IO(iocb, inode, bdev, iter, get_block,
- end_io, submit_io, flags);
-}
-
EXPORT_SYMBOL(__blockdev_direct_IO);

static __init int dio_init(void)
--
2.30.2

2022-04-12 10:47:06

by Christoph Hellwig

Subject: [PATCH 17/27] block: use bdev_alignment_offset in disk_alignment_offset_show

This does the same as the open coded variant except for an extra branch,
and allows removing queue_alignment_offset entirely.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>
---
block/genhd.c | 2 +-
include/linux/blkdev.h | 8 --------
2 files changed, 1 insertion(+), 9 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index b8b6759d670f0..712031ce19070 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1010,7 +1010,7 @@ static ssize_t disk_alignment_offset_show(struct device *dev,
{
struct gendisk *disk = dev_to_disk(dev);

- return sprintf(buf, "%d\n", queue_alignment_offset(disk->queue));
+ return sprintf(buf, "%d\n", bdev_alignment_offset(disk->part0));
}

static ssize_t disk_discard_alignment_show(struct device *dev,
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index f8c50b77543eb..d5346e72e3645 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1251,14 +1251,6 @@ bdev_zone_write_granularity(struct block_device *bdev)
return queue_zone_write_granularity(bdev_get_queue(bdev));
}

-static inline int queue_alignment_offset(const struct request_queue *q)
-{
- if (q->limits.misaligned)
- return -1;
-
- return q->limits.alignment_offset;
-}
-
static inline int queue_limit_alignment_offset(struct queue_limits *lim, sector_t sector)
{
unsigned int granularity = max(lim->physical_block_size, lim->io_min);
--
2.30.2

2022-04-15 06:16:23

by Chaitanya Kulkarni

Subject: Re: [PATCH 03/27] target: fix discard alignment on partitions

On 4/14/22 21:52, Christoph Hellwig wrote:
> Use the proper bdev_discard_alignment helper that accounts for partition
> offsets.
>
> Fixes: c66ac9db8d4a ("[SCSI] target: Add LIO target core v4.0.0-rc6")
> Signed-off-by: Christoph Hellwig <[email protected]>
> Reviewed-by: Martin K. Petersen <[email protected]>
> ---

The helper does handle the partition case.

Looks good.

Reviewed-by: Chaitanya Kulkarni <[email protected]>

-ck

2022-04-15 06:18:17

by Chaitanya Kulkarni

Subject: Re: [PATCH 15/27] block: add a bdev_max_zone_append_sectors helper

On 4/14/22 21:52, Christoph Hellwig wrote:
> Add a helper to check the max supported sectors for zone append based on
> the block_device instead of having to poke into the block layer internal
> request_queue.
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> Acked-by: Damien Le Moal <[email protected]>
> Reviewed-by: Martin K. Petersen <[email protected]>
> Reviewed-by: Johannes Thumshirn <[email protected]>
> ---

Looks good.

Reviewed-by: Chaitanya Kulkarni <[email protected]>

-ck


2022-04-16 01:15:06

by Chaitanya Kulkarni

Subject: Re: [PATCH 16/27] block: use bdev_alignment_offset in part_alignment_offset_show

On 4/14/22 21:52, Christoph Hellwig wrote:
> Replace the open coded offset calculation with the proper helper.
> This is an ABI change in that the -1 for a misaligned partition is
> properly propagated, which can be considered a bug fix and matches
> what is done on the whole device.
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> Reviewed-by: Martin K. Petersen <[email protected]>
> ---

Neat!

Looks good.

Reviewed-by: Chaitanya Kulkarni <[email protected]>

-ck