2021-07-07 05:30:50

by Satya Tangirala

Subject: [PATCH v4 0/9] ensure bios aren't split in middle of crypto data unit

When a bio has an encryption context, its size must be aligned to its
crypto data unit size. A bio must not be split in the middle of a data
unit. Currently, bios are split at logical block boundaries, but a crypto
data unit size might be larger than the logical block size - e.g. a machine
could be using fscrypt (which uses 4K crypto data units) with an eMMC block
device with inline encryption hardware that has a logical block size of 512
bytes.

Right now, the only user of blk-crypto is fscrypt (on ext4 and f2fs), which
only submits bios where the size of each segment is a multiple of
data_unit_size. That happens to avoid most of the cases where bios
could be split in the middle of a data unit. However, when support for
direct I/O on encrypted files is added, or when support for filesystem
metadata encryption is added, it will be possible for bios to have segment
lengths that are multiples of the logical block size, but not multiples of
the crypto data unit size. So the block layer needs to start handling this
case appropriately.

In other words, the block layer needs to support cases where the crypto
data unit size is larger than the logical block size.

Patch 1 introduces blk_ksm_is_empty(), which checks whether a keyslot
manager advertises a non-zero number of crypto capabilities. This function
helps clean up some code.
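
For illustration, the check can be modeled in userspace roughly as follows
(a sketch only; the kernel version walks the crypto_modes_supported[] array
of struct blk_keyslot_manager, and NUM_MODES here is a made-up stand-in for
the real array size):

```c
#include <assert.h>
#include <stddef.h>

#define NUM_MODES 4	/* illustrative stand-in for the real mode count */

/*
 * A keyslot manager is "empty" if it advertises no crypto capabilities,
 * i.e. every per-mode bitmap of supported data unit sizes is zero.
 */
int ksm_is_empty(const unsigned int crypto_modes_supported[NUM_MODES])
{
	size_t i;

	for (i = 0; i < NUM_MODES; i++)
		if (crypto_modes_supported[i])
			return 0;
	return 1;
}
```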

Patches 2 and 3 introduce blk_crypto_bio_sectors_alignment() and
bio_required_sector_alignment() respectively. The former returns the
required sector alignment due to any crypto requirements the bio has. The
latter returns the required sector alignment due to any reason. The number
of sectors in any bio (and in particular, the number of sectors passed to
bio_split) *must* be aligned to the value returned by the latter function
(which, of course, calls the former function to decide what to return).
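
The combined alignment rule can be sketched in userspace as follows
(illustrative names and bodies, not the kernel code; in the kernel these
helpers take a struct bio, and the assumption is that both the logical
block size and the crypto data unit size are powers of two):

```c
#include <assert.h>

#define SECTOR_SHIFT 9

/* Sector alignment required by a bio's crypto context (1 if none). */
unsigned int crypto_bio_sectors_alignment(unsigned int data_unit_size)
{
	return data_unit_size >> SECTOR_SHIFT;
}

/*
 * Overall required sector alignment: the stricter of the logical block
 * size and the crypto data unit size. Both are powers of two, so the
 * larger of the two is automatically a multiple of the smaller.
 */
unsigned int required_sector_alignment(unsigned int logical_block_size,
				       unsigned int data_unit_size)
{
	unsigned int lbs_sectors = logical_block_size >> SECTOR_SHIFT;
	unsigned int dus_sectors = crypto_bio_sectors_alignment(data_unit_size);

	return dus_sectors > lbs_sectors ? dus_sectors : lbs_sectors;
}
```

E.g. 512-byte logical blocks combined with 4K crypto data units give an
8-sector alignment requirement, matching the eMMC example above.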

Patch 4 introduces restrictions on the data unit sizes advertised by a
keyslot manager. These restrictions come about due to the request_queue's
queue_limits, and are required to ensure that blk_bio_segment_split() can
always split a bio so that it has a limited number of sectors and segments,
and that the number of sectors it has is non-zero and aligned to
bio_required_sector_alignment().
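
The size restriction itself comes down to power-of-two mask arithmetic; a
userspace sketch of the idea follows (the mask construction mirrors the
(largest_dus_allowed << 1) - 1 expression in patch 4, while everything else
here is illustrative):

```c
#include <assert.h>

/*
 * Given the largest allowed data unit size (a power of two), build a mask
 * with one bit set for every power-of-two size up to and including it.
 * E.g. 4096 -> a mask covering 1, 2, 4, ..., 4096.
 */
unsigned int dus_allowed_mask(unsigned int largest_dus_allowed)
{
	return (largest_dus_allowed << 1) - 1;
}

/*
 * A crypto mode stays supported iff it still advertises at least one
 * allowed data unit size after masking. (Supported sizes are tracked as
 * a bitmap in which the bit for size s has the value s itself.)
 */
int mode_still_supported(unsigned int mode_dus_bitmap,
			 unsigned int largest_dus_allowed)
{
	return (mode_dus_bitmap & dus_allowed_mask(largest_dus_allowed)) != 0;
}
```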

Patches 5, 6 and 7 handle the error code from blk_ksm_register() in all
callers. This return code was previously ignored by all callers because
the function could only fail if the request_queue had integrity support,
which the callers ensured would not be the case. But the patches in this
series add more cases where this function might fail, so it's better to
just handle the return code properly in all the callers.

Patch 8 updates get_max_io_size() and blk_bio_segment_split() to respect
bio_required_sector_alignment(). With this change, get_max_io_size()
always returns a value aligned to bio_required_sector_alignment(), and
together with Patch 4, this is enough to ensure that if a bio is split,
it is split at a crypto data unit size boundary.
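
The end-point selection in patch 8 can be modeled with plain integer
arithmetic (a userspace sketch with all quantities in sectors; the names
are illustrative, not the kernel code):

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sector_t;

sector_t rdown(sector_t x, sector_t align)
{
	return x - (x % align);
}

/*
 * Model of the patched get_max_io_size() logic: prefer ending the I/O on
 * a physical block size (pbs) boundary, but only when the resulting
 * length is a multiple of the required alignment; otherwise fall back to
 * rounding the length down to the required alignment.
 */
sector_t max_io_sectors(sector_t start, sector_t max_len,
			sector_t pbs, sector_t required_align)
{
	sector_t pbs_aligned_end = rdown(start + max_len, pbs);

	if (pbs_aligned_end > start &&
	    (pbs_aligned_end - start) % required_align == 0)
		return pbs_aligned_end - start;

	return rdown(max_len, required_align);
}
```

E.g. with a start sector of 1, 255 available sectors, 8-sector physical
blocks and an 8-sector required alignment, this returns 248 rather than
255, so the split point stays on a data unit boundary.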

Since all callers of bio_split() should have been updated by the previous
patches, Patch 9 adds a WARN_ON_ONCE() to bio_split() for when the number
of sectors isn't aligned to bio_required_sector_alignment() (the one
exception is bounce.c, which is legacy code and won't interact with inline
encryption).

This patch series was tested by running android xfstests on the SDM630
chipset (which has eMMC inline encryption hardware with logical block size
512 bytes) with test_dummy_encryption with and without the 'inlinecrypt'
mount option.

Changes v3 => v4
- Addressed comments and incorporated fixes from Eric
- Patch 4 in v3 has been removed (Eric points out it isn't required
without some of the changes in the device mapper patchset at
https://lore.kernel.org/linux-block/[email protected]/,
so I'll add this patch to that series instead).

Satya Tangirala (9):
block: introduce blk_ksm_is_empty()
block: blk-crypto: introduce blk_crypto_bio_sectors_alignment()
block: introduce bio_required_sector_alignment()
block: keyslot-manager: introduce
blk_ksm_restrict_dus_to_queue_limits()
ufshcd: handle error from blk_ksm_register()
mmc: handle error from blk_ksm_register()
dm: handle error from blk_ksm_register()
blk-merge: Ensure bios aren't split in middle of a crypto data unit
block: add WARN_ON_ONCE() to bio_split() for sector alignment

block/bio.c | 1 +
block/blk-crypto-internal.h | 17 +++++
block/blk-merge.c | 49 ++++++++-----
block/blk.h | 17 +++++
block/keyslot-manager.c | 120 +++++++++++++++++++++++++++++++
drivers/md/dm-table.c | 28 +++++---
drivers/mmc/core/crypto.c | 13 +++-
drivers/scsi/ufs/ufshcd-crypto.c | 13 +++-
include/linux/keyslot-manager.h | 2 +
9 files changed, 227 insertions(+), 33 deletions(-)

--
2.25.1


2021-07-07 05:30:50

by Satya Tangirala

Subject: [PATCH v4 4/9] block: keyslot-manager: introduce blk_ksm_restrict_dus_to_queue_limits()

From: Satya Tangirala <[email protected]>

Not all crypto data unit sizes might be supported by the block layer due to
certain queue limits. This new function checks the queue limits and
appropriately modifies the keyslot manager to reflect only the supported
crypto data unit sizes. blk_ksm_register() runs any given ksm through this
function before actually registering the ksm with a queue.

Signed-off-by: Satya Tangirala <[email protected]>
---
block/keyslot-manager.c | 89 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 89 insertions(+)

diff --git a/block/keyslot-manager.c b/block/keyslot-manager.c
index 4d0794506d43..01358df66e41 100644
--- a/block/keyslot-manager.c
+++ b/block/keyslot-manager.c
@@ -468,12 +468,101 @@ bool blk_ksm_is_empty(struct blk_keyslot_manager *ksm)
}
EXPORT_SYMBOL_GPL(blk_ksm_is_empty);

+/*
+ * Restrict the supported data unit sizes of the ksm based on the request queue
+ * limits
+ */
+static unsigned long
+blk_ksm_largest_dus_for_queue_limits(struct blk_keyslot_manager *ksm,
+ struct request_queue *q)
+{
+ /* The largest possible data unit size we support is PAGE_SIZE. */
+ unsigned long largest_dus = PAGE_SIZE;
+ struct queue_limits *limits = &q->limits;
+
+ /*
+ * If the queue doesn't support SG gaps, then a bio may have to be split
+ * between any two bio_vecs. Since the size of each bio_vec is only
+ * guaranteed to be a multiple of logical_block_size, logical_block_size
+ * is also the maximum crypto data unit size that can be supported in
+ * this case, as bios must not be split in the middle of a data unit.
+ */
+ if (limits->virt_boundary_mask)
+ largest_dus = queue_logical_block_size(q);
+
+ /*
+ * Similarly, if chunk_sectors is set and a bio is submitted that
+ * crosses a chunk boundary, then that bio may have to be split at a
+ * boundary that is only logical_block_size aligned. So that limits the
+ * crypto data unit size to logical_block_size as well.
+ */
+ if (limits->chunk_sectors)
+ largest_dus = queue_logical_block_size(q);
+
+ /*
+ * Each bio_vec can be as small as logical_block_size. Therefore the
+ * crypto data unit size can't be greater than 'max_segments *
+ * logical_block_size', as otherwise in the worst case there would be no
+ * way to process the first data unit without exceeding max_segments.
+ */
+ largest_dus = min(largest_dus,
+ rounddown_pow_of_two(limits->max_segments) *
+ queue_logical_block_size(q));
+
+ return largest_dus;
+}
+
+/**
+ * blk_ksm_register() - Sets the queue's keyslot manager to the provided ksm, if
+ * compatible
+ * @ksm: The ksm to register
+ * @q: The request_queue to register the ksm to
+ *
+ * Checks if the keyslot manager provided is compatible with the request queue
+ * (i.e. the queue shouldn't also support integrity). After that, the crypto
+ * capabilities of the given keyslot manager are restricted to what the queue
+ * can support based on it's limits. Note that if @ksm won't support any
+ * crypto capabilities if its capabilities are restricted, the queue's ksm is
+ * set to NULL, instead of being set to a pointer to an "empty" @ksm, and @ksm
+ * is *not* modified.
+ *
+ * Return: true if @q's ksm is set to the provided @ksm, false otherwise
+ * (in which case @ksm will not have been modified)
+ */
bool blk_ksm_register(struct blk_keyslot_manager *ksm, struct request_queue *q)
{
+ unsigned long largest_dus_allowed;
+ unsigned int dus_allowed_mask;
+ bool dus_was_restricted = false;
+ int i;
+
if (blk_integrity_queue_supports_integrity(q)) {
pr_warn("Integrity and hardware inline encryption are not supported together. Disabling hardware inline encryption.\n");
return false;
}
+
+ largest_dus_allowed = blk_ksm_largest_dus_for_queue_limits(ksm, q);
+ dus_allowed_mask = (largest_dus_allowed << 1) - 1;
+
+ /*
+ * Check if ksm will become empty if we clear disallowed data unit
+ * sizes (in which case, don't modify the ksm)
+ */
+ if (blk_ksm_is_empty_mask(ksm, dus_allowed_mask))
+ return false;
+
+ /* Clear all unsupported data unit sizes. */
+ for (i = 0; i < ARRAY_SIZE(ksm->crypto_modes_supported); i++) {
+ if (ksm->crypto_modes_supported[i] & (~dus_allowed_mask))
+ dus_was_restricted = true;
+ ksm->crypto_modes_supported[i] &= dus_allowed_mask;
+ }
+
+ if (dus_was_restricted) {
+ pr_warn("Device: %s - Disallowed use of encryption data unit sizes above %lu bytes with inline encryption hardware because of device request queue limits.\n",
+ q->backing_dev_info->dev_name, largest_dus_allowed);
+ }
+
q->ksm = ksm;
return true;
}
--
2.25.1

2021-07-07 05:30:54

by Satya Tangirala

Subject: [PATCH v4 6/9] mmc: handle error from blk_ksm_register()

From: Satya Tangirala <[email protected]>

Handle any error from blk_ksm_register() in the callers. Previously,
the callers ignored the return value because blk_ksm_register() wouldn't
fail as long as the request_queue didn't have integrity support too, but
as this is no longer the case, it's safer for the callers to just handle
the return value appropriately.

Signed-off-by: Satya Tangirala <[email protected]>
---
drivers/mmc/core/crypto.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/core/crypto.c b/drivers/mmc/core/crypto.c
index 419a368f8402..c87be58a5548 100644
--- a/drivers/mmc/core/crypto.c
+++ b/drivers/mmc/core/crypto.c
@@ -21,8 +21,17 @@ void mmc_crypto_set_initial_state(struct mmc_host *host)

void mmc_crypto_setup_queue(struct request_queue *q, struct mmc_host *host)
{
- if (host->caps2 & MMC_CAP2_CRYPTO)
- blk_ksm_register(&host->ksm, q);
+ if (host->caps2 & MMC_CAP2_CRYPTO) {
+ /*
+ * This WARN_ON should never trigger since &host->ksm won't be
+ * "empty" (i.e. will support at least 1 crypto capability), an
+ * MMC device's request queue doesn't support integrity, and all MMC
+ * devices that currently support inline encryption also satisfy all the
+ * block layer constraints (i.e. support SG gaps, don't have chunk
+ * sectors, have a sufficiently large supported max_segments per bio)
+ */
+ WARN_ON(!blk_ksm_register(&host->ksm, q));
+ }
}
EXPORT_SYMBOL_GPL(mmc_crypto_setup_queue);

--
2.25.1

2021-07-07 05:31:19

by Satya Tangirala

Subject: [PATCH v4 8/9] blk-merge: Ensure bios aren't split in middle of a crypto data unit

From: Satya Tangirala <[email protected]>

Update get_max_io_size() to return a value aligned to
bio_required_sector_alignment(). With this change, and the changes
to blk_ksm_register() that restrict the supported data unit sizes
based on the queue's limits, blk_bio_segment_split() won't split bios in
the middle of a data unit anymore.

Signed-off-by: Satya Tangirala <[email protected]>
---
block/blk-merge.c | 49 ++++++++++++++++++++++++++++++-----------------
1 file changed, 31 insertions(+), 18 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index a11b3b53717e..68c96dec5680 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -135,27 +135,39 @@ static struct bio *blk_bio_write_same_split(struct request_queue *q,

/*
* Return the maximum number of sectors from the start of a bio that may be
- * submitted as a single request to a block device. If enough sectors remain,
- * align the end to the physical block size. Otherwise align the end to the
- * logical block size. This approach minimizes the number of non-aligned
- * requests that are submitted to a block device if the start of a bio is not
- * aligned to a physical block boundary.
+ * submitted as a single request to a block device. Tries to align the end to
+ * the physical block size, while also aligning the returned number of sectors
+ * to bio_required_sector_alignment(). This approach minimizes the number of
+ * non-aligned requests that are submitted to a block device if the start of a
+ * bio is not aligned to a physical block boundary.
+ *
+ * More clearly, there are two conditions we're interested in satisfying.
+ *
+ * Condition 1) We absolutely must have @return divisible by the
+ * bio_required_sector_alignment(bio).
+ *
+ * Condition 2) *If possible*, while still satisfying Condition 1, we would like
+ * to have start_offset + @return divisible by physical block size in sectors
+ * (pbs).
*/
static inline unsigned get_max_io_size(struct request_queue *q,
struct bio *bio)
{
- unsigned sectors = blk_max_size_offset(q, bio->bi_iter.bi_sector, 0);
- unsigned max_sectors = sectors;
- unsigned pbs = queue_physical_block_size(q) >> SECTOR_SHIFT;
- unsigned lbs = queue_logical_block_size(q) >> SECTOR_SHIFT;
- unsigned start_offset = bio->bi_iter.bi_sector & (pbs - 1);
-
- max_sectors += start_offset;
- max_sectors &= ~(pbs - 1);
- if (max_sectors > start_offset)
- return max_sectors - start_offset;
-
- return sectors & ~(lbs - 1);
+ unsigned int start_offset = bio->bi_iter.bi_sector;
+ unsigned int sectors = blk_max_size_offset(q, start_offset, 0);
+ unsigned int pbs = queue_physical_block_size(q) >> SECTOR_SHIFT;
+ unsigned int req_sector_align = bio_required_sector_alignment(bio);
+ unsigned int pbs_aligned_sector = round_down(start_offset + sectors, pbs);
+
+ /*
+ * If we can return a pbs aligned endpoint while satisfying Condition 1,
+ * then do so.
+ */
+ if (pbs_aligned_sector > start_offset &&
+ IS_ALIGNED(pbs_aligned_sector - start_offset, req_sector_align))
+ return pbs_aligned_sector - start_offset;
+
+ return round_down(sectors, req_sector_align);
}

static inline unsigned get_max_segment_size(const struct request_queue *q,
@@ -235,6 +247,7 @@ static bool bvec_split_segs(const struct request_queue *q,
* following is guaranteed for the cloned bio:
* - That it has at most get_max_io_size(@q, @bio) sectors.
* - That it has at most queue_max_segments(@q) segments.
+ * - That the number of sectors is aligned to bio_required_sector_alignment(@bio).
*
* Except for discard requests the cloned bio will point at the bi_io_vec of
* the original bio. It is the responsibility of the caller to ensure that the
@@ -286,7 +299,7 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
* big IO can be trival, disable iopoll when split needed.
*/
bio->bi_opf &= ~REQ_HIPRI;
-
+ sectors = round_down(sectors, bio_required_sector_alignment(bio));
return bio_split(bio, sectors, GFP_NOIO, bs);
}

--
2.25.1

2021-07-07 05:32:19

by Satya Tangirala

Subject: [PATCH v4 9/9] block: add WARN_ON_ONCE() to bio_split() for sector alignment

From: Satya Tangirala <[email protected]>

The number of sectors passed to bio_split() must be aligned to
bio_required_sector_alignment(). All callers (other than bounce.c) have
been updated to ensure this, so add a WARN_ON_ONCE() if the number of
sectors is not aligned. (bounce.c was not updated since it's legacy code
- any device that enables bounce buffering won't declare inline
encryption support, so bounce.c will never see a bio with an encryption
context).

Signed-off-by: Satya Tangirala <[email protected]>
---
block/bio.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/block/bio.c b/block/bio.c
index 1fab762e079b..4c7bfdeefe76 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1435,6 +1435,7 @@ struct bio *bio_split(struct bio *bio, int sectors,

BUG_ON(sectors <= 0);
BUG_ON(sectors >= bio_sectors(bio));
+ WARN_ON_ONCE(!IS_ALIGNED(sectors, bio_required_sector_alignment(bio)));

/* Zone append commands cannot be split */
if (WARN_ON_ONCE(bio_op(bio) == REQ_OP_ZONE_APPEND))
--
2.25.1

2021-07-07 05:32:42

by Satya Tangirala

Subject: [PATCH v4 7/9] dm: handle error from blk_ksm_register()

From: Satya Tangirala <[email protected]>

Handle any error from blk_ksm_register() in the callers. Previously,
the callers ignored the return value because blk_ksm_register() wouldn't
fail as long as the request_queue didn't have integrity support too, but
as this is no longer the case, it's safer for the callers to just handle
the return value appropriately.

Signed-off-by: Satya Tangirala <[email protected]>
---
drivers/md/dm-table.c | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 29cbfc3e3c4b..c79c0fbe80dd 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1343,6 +1343,20 @@ static int dm_table_construct_keyslot_manager(struct dm_table *t)
*/
t->ksm = ksm;

+ /*
+ * At this point, t->ksm is either NULL or *not* empty (i.e. will support
+ * at least 1 crypto capability), the request queue doesn't support
+ * integrity, and it also satisfies all the block layer constraints
+ * "sufficiently" (as in the constraints of the DM device's request queue
+ * won't preclude any of the intersection of the supported capabilities
+ * of the underlying devices, since if some capability were precluded by
+ * the DM device's request queue's constraints, that capability would
+ * also have been precluded by one of the child device's request queues).
+ *
+ * Hence any future attempt to call blk_ksm_register() on t->ksm (if it's
+ * not NULL) will succeed.
+ */
+
return 0;
}

@@ -1354,7 +1368,8 @@ static void dm_update_keyslot_manager(struct request_queue *q,

/* Make the ksm less restrictive */
if (!q->ksm) {
- blk_ksm_register(t->ksm, q);
+ if (WARN_ON(!blk_ksm_register(t->ksm, q)))
+ dm_destroy_keyslot_manager(t->ksm);
} else {
blk_ksm_update_capabilities(q->ksm, t->ksm);
dm_destroy_keyslot_manager(t->ksm);
--
2.25.1

2021-07-23 16:53:08

by Eric Biggers

Subject: Re: [PATCH v4 0/9] ensure bios aren't split in middle of crypto data unit

On Tue, Jul 06, 2021 at 10:29:34PM -0700, Satya Tangirala wrote:
>
> Changes v3 => v4
> - Patch 4 in v3 has been removed (Eric points out it isn't required
> without some of the changes in the device mapper patchset at
> https://lore.kernel.org/linux-block/[email protected]/
> so I'll add this patch to that series instead.

Wouldn't it make more sense to have the blk-crypto-fallback change in this
series? My concern was just that it didn't make sense to have it split between
the two patch series -- it seemed like one logical change.

- Eric

2021-07-23 17:12:45

by Eric Biggers

Subject: Re: [PATCH v4 4/9] block: keyslot-manager: introduce blk_ksm_restrict_dus_to_queue_limits()

On Tue, Jul 06, 2021 at 10:29:38PM -0700, Satya Tangirala wrote:
> +/*
> + * Restrict the supported data unit sizes of the ksm based on the request queue
> + * limits
> + */
> +static unsigned long
> +blk_ksm_largest_dus_for_queue_limits(struct blk_keyslot_manager *ksm,
> + struct request_queue *q)
> +{

The ksm argument to this function isn't actually used.

Also the comment should be fixed to be something like "Return the largest data
unit size that is compatible with the given request queue.".

> +/**
> + * blk_ksm_register() - Sets the queue's keyslot manager to the provided ksm, if
> + * compatible
> + * @ksm: The ksm to register
> + * @q: The request_queue to register the ksm to
> + *
> + * Checks if the keyslot manager provided is compatible with the request queue
> + * (i.e. the queue shouldn't also support integrity). After that, the crypto
> + * capabilities of the given keyslot manager are restricted to what the queue
> + * can support based on it's limits. Note that if @ksm won't support any
> + * crypto capabilities if its capabilities are restricted, the queue's ksm is
> + * set to NULL, instead of being set to a pointer to an "empty" @ksm, and @ksm
> + * is *not* modified.
> + *
> + * Return: true if @q's ksm is set to the provided @ksm, false otherwise
> + * (in which case @ksm will not have been modified)
> + */

Can this comment be made more concise and less confusing? Something like:

Checks whether any of @ksm's crypto capabilities are compatible with the
request_queue, and if so, clears any incompatible capabilities from @ksm
and assigns @ksm to the request_queue.

Return: %true if @ksm was assigned to @q, or %false if it was not (due
to none of @ksm's crypto capabilities being compatible with @q)

> bool blk_ksm_register(struct blk_keyslot_manager *ksm, struct request_queue *q)
> {
> + unsigned long largest_dus_allowed;
> + unsigned int dus_allowed_mask;
> + bool dus_was_restricted = false;
> + int i;
> +
> if (blk_integrity_queue_supports_integrity(q)) {
> pr_warn("Integrity and hardware inline encryption are not supported together. Disabling hardware inline encryption.\n");
> return false;
> }
> +
> + largest_dus_allowed = blk_ksm_largest_dus_for_queue_limits(ksm, q);
> + dus_allowed_mask = (largest_dus_allowed << 1) - 1;
> +
> + /*
> + * Check if ksm will become empty if we clear disallowed data unit
> + * sizes (in which case, don't modify the ksm)
> + */
> + if (blk_ksm_is_empty_mask(ksm, dus_allowed_mask))
> + return false;
> +
> + /* Clear all unsupported data unit sizes. */
> + for (i = 0; i < ARRAY_SIZE(ksm->crypto_modes_supported); i++) {
> + if (ksm->crypto_modes_supported[i] & (~dus_allowed_mask))

There's no need for the parenthesis around ~dus_allowed_mask.

> + dus_was_restricted = true;
> + ksm->crypto_modes_supported[i] &= dus_allowed_mask;
> + }
> +
> + if (dus_was_restricted) {
> + pr_warn("Device: %s - Disallowed use of encryption data unit sizes above %lu bytes with inline encryption hardware because of device request queue limits.\n",
> + q->backing_dev_info->dev_name, largest_dus_allowed);
> + }

Is there a better way to get the queue/disk name? Also, device names normally
go at the very beginning of the messages, like "%s: <message>".

This message is also very long; something more concise would be good. Maybe:

"%s: only allowing crypto data unit sizes up to %lu bytes due to device limitations\n"

- Eric

2021-07-23 17:27:54

by Eric Biggers

Subject: Re: [PATCH v4 7/9] dm: handle error from blk_ksm_register()

On Tue, Jul 06, 2021 at 10:29:41PM -0700, Satya Tangirala wrote:
> From: Satya Tangirala <[email protected]>
>
> Handle any error from blk_ksm_register() in the callers. Previously,
> the callers ignored the return value because blk_ksm_register() wouldn't
> fail as long as the request_queue didn't have integrity support too, but
> as this is no longer the case, it's safer for the callers to just handle
> the return value appropriately.
>
> Signed-off-by: Satya Tangirala <[email protected]>
> ---
> drivers/md/dm-table.c | 17 ++++++++++++++++-
> 1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
> index 29cbfc3e3c4b..c79c0fbe80dd 100644
> --- a/drivers/md/dm-table.c
> +++ b/drivers/md/dm-table.c
> @@ -1343,6 +1343,20 @@ static int dm_table_construct_keyslot_manager(struct dm_table *t)
> */
> t->ksm = ksm;
>
> + /*
> + * At this point, t->ksm is either NULL or *not* empty (i.e. will support
> + * at least 1 crypto capability), the request queue doesn't support
> + * integrity, and it also satisfies all the block layer constraints
> + * "sufficiently" (as in the constraints of the DM device's request queue
> + * won't preclude any of the intersection of the supported capabilities
> + * of the underlying devices, since if some capability were precluded by
> + * the DM device's request queue's constraints, that capability would
> + * also have been precluded by one of the child device's request queues).
> + *
> + * Hence any future attempt to call blk_ksm_register() on t->ksm (if it's
> + * not NULL) will succeed.
> + */
> +
> return 0;

I don't think this properly answers the question I had on the previous version
of this patch, which was not just how we know that blk_ksm_register() will
succeed later, but also how we know that the blk_ksm_is_superset() check done
above is valid when some of the crypto capabilities may be cleared by
blk_ksm_register() later. Is the assumption actually that in the device-mapper
case, blk_ksm_register() will never clear any crypto capabilities at all?
If so, can that be explained properly?

- Eric

2021-07-23 17:32:39

by Eric Biggers

Subject: Re: [PATCH v4 9/9] block: add WARN_ON_ONCE() to bio_split() for sector alignment

On Tue, Jul 06, 2021 at 10:29:43PM -0700, Satya Tangirala wrote:
> From: Satya Tangirala <[email protected]>
>
> The number of sectors passed to bio_split() must be aligned to
> bio_required_sector_alignment(). All callers (other than bounce.c) have
> been updated to ensure this, so add a WARN_ON_ONCE() if the number of
> sectors is not aligned. (bounce.c was not updated since it's legacy code
> - any device that enables bounce buffering won't declare inline
> encryption support, so bounce.c will never see a bio with an encryption
> context).

The last sentence could say "so bounce.c will never see a bio with
bio_required_sector_alignment() greater than the logical block size".

>
> Signed-off-by: Satya Tangirala <[email protected]>
> ---
> block/bio.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/block/bio.c b/block/bio.c
> index 1fab762e079b..4c7bfdeefe76 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -1435,6 +1435,7 @@ struct bio *bio_split(struct bio *bio, int sectors,
>
> BUG_ON(sectors <= 0);
> BUG_ON(sectors >= bio_sectors(bio));
> + WARN_ON_ONCE(!IS_ALIGNED(sectors, bio_required_sector_alignment(bio)));
>
> /* Zone append commands cannot be split */
> if (WARN_ON_ONCE(bio_op(bio) == REQ_OP_ZONE_APPEND))

Reviewed-by: Eric Biggers <[email protected]>

- Eric

2021-07-23 17:56:16

by Eric Biggers

Subject: Re: [PATCH v4 0/9] ensure bios aren't split in middle of crypto data unit

On Fri, Jul 23, 2021 at 09:49:52AM -0700, Eric Biggers wrote:
> On Tue, Jul 06, 2021 at 10:29:34PM -0700, Satya Tangirala wrote:
> >
> > Changes v3 => v4
> > - Patch 4 in v3 has been removed (Eric points out it isn't required
> > without some of the changes in the device mapper patchset at
> > https://lore.kernel.org/linux-block/[email protected]/
> > so I'll add this patch to that series instead.
>
> Wouldn't it make more sense to have the blk-crypto-fallback change in this
> series? My concern was just that it didn't make sense to have it split between
> the two patch series -- it seemed like one logical change.

This series also doesn't actually remove the data_unit_size bvec alignment
requirement from block/blk-crypto.c. Isn't that the main goal here? So I
expected that it would be included.

Unless there are special considerations here, I'd expect that all the block
layer changes needed for the fscrypt direct I/O support would be in one patch
series that could go in together, and then the patch series with the direct I/O
support would be only filesystem layer changes.

- Eric

2021-07-23 18:13:42

by Eric Biggers

Subject: Re: [PATCH v4 8/9] blk-merge: Ensure bios aren't split in middle of a crypto data unit

On Tue, Jul 06, 2021 at 10:29:42PM -0700, Satya Tangirala wrote:
> From: Satya Tangirala <[email protected]>
>
> Update get_max_io_size() to return a value aligned to
> bio_required_sector_alignment(). With this change, and the changes
> to blk_ksm_register() that restrict the supported data unit sizes
> based on the queue's limits, blk_bio_segment_split() won't split bios in
> the middle of a data unit anymore
>
> Signed-off-by: Satya Tangirala <[email protected]>
> ---
> block/blk-merge.c | 49 ++++++++++++++++++++++++++++++-----------------
> 1 file changed, 31 insertions(+), 18 deletions(-)
>
> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index a11b3b53717e..68c96dec5680 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -135,27 +135,39 @@ static struct bio *blk_bio_write_same_split(struct request_queue *q,
>
> /*
> * Return the maximum number of sectors from the start of a bio that may be
> - * submitted as a single request to a block device. If enough sectors remain,
> - * align the end to the physical block size. Otherwise align the end to the
> - * logical block size. This approach minimizes the number of non-aligned
> - * requests that are submitted to a block device if the start of a bio is not
> - * aligned to a physical block boundary.
> + * submitted as a single request to a block device. Tries to align the end to
> + * the physical block size, while also aligning the returned number of sectors
> + * to bio_required_sector_alignment(). This approach minimizes the number of
> + * non-aligned requests that are submitted to a block device if the start of a
> + * bio is not aligned to a physical block boundary.
> + *
> + * More clearly, there are two conditions we're interested in satisfying.
> + *
> + * Condition 1) We absolutely must have @return divisible by the
> + * bio_required_sector_alignment(bio).
> + *
> + * Condition 2) *If possible*, while still satisfying Condition 1, we would like
> + * to have start_offset + @return divisible by physical block size in sectors
> + * (pbs).
> */

This comment is basically saying the same thing in two different ways. Could we
just say it once clearly? Maybe:

/*
* Return the maximum number of sectors from the start of a bio that may be
* submitted as a single request to a block device. The returned value will be
* a multiple of bio_required_sector_alignment(). If possible, the returned
* value will also make the request end on a physical block boundary; this
* minimizes the number of non-aligned requests that are submitted to a block
* device when a bio doesn't start on a physical block boundary.
*/

> static inline unsigned get_max_io_size(struct request_queue *q,
> struct bio *bio)
> {
> - unsigned sectors = blk_max_size_offset(q, bio->bi_iter.bi_sector, 0);
> - unsigned max_sectors = sectors;
> - unsigned pbs = queue_physical_block_size(q) >> SECTOR_SHIFT;
> - unsigned lbs = queue_logical_block_size(q) >> SECTOR_SHIFT;
> - unsigned start_offset = bio->bi_iter.bi_sector & (pbs - 1);
> -
> - max_sectors += start_offset;
> - max_sectors &= ~(pbs - 1);
> - if (max_sectors > start_offset)
> - return max_sectors - start_offset;
> -
> - return sectors & ~(lbs - 1);
> + unsigned int start_offset = bio->bi_iter.bi_sector;
> + unsigned int sectors = blk_max_size_offset(q, start_offset, 0);
> + unsigned int pbs = queue_physical_block_size(q) >> SECTOR_SHIFT;
> + unsigned int req_sector_align = bio_required_sector_alignment(bio);
> + unsigned int pbs_aligned_sector = round_down(start_offset + sectors, pbs);
> +
> + /*
> + * If we can return a pbs aligned endpoint while satisfying Condition 1,
> + * then do so.
> + */
> + if (pbs_aligned_sector > start_offset &&
> + IS_ALIGNED(pbs_aligned_sector - start_offset, req_sector_align))
> + return pbs_aligned_sector - start_offset;
> +
> + return round_down(sectors, req_sector_align);
> }

'start_offset' and 'pbs_aligned_sector' need to be 'sector_t', not 'unsigned
int'. Also 'start_offset' probably should be called 'start_sector'. And
'req_sector_align' could be shortened to just 'align'.

- Eric

2021-07-24 07:41:29

by Christoph Hellwig

Subject: Re: [PATCH v4 0/9] ensure bios aren't split in middle of crypto data unit

On Tue, Jul 06, 2021 at 10:29:34PM -0700, Satya Tangirala wrote:
> When a bio has an encryption context, its size must be aligned to its
> crypto data unit size. A bio must not be split in the middle of a data
> unit. Currently, bios are split at logical block boundaries, but a crypto
> data unit size might be larger than the logical block size - e.g. a machine
> could be using fscrypt (which uses 4K crypto data units) with an eMMC block
> device with inline encryption hardware that has a logical block size of 512
> bytes.
>
> Right now, the only user of blk-crypto is fscrypt (on ext4 and f2fs), which
> (currently) only submits bios where the size of each segment is a multiple
> of data_unit_size. That happens to avoid most of the cases where bios
> could be split in the middle of a data unit. However, when support for
> direct I/O on encrypted files is added, or when support for filesystem
> metadata encryption is added, it will be possible for bios to have segment
> lengths that are multiples of the logical block size, but not multiples of
> the crypto data unit size. So the block layer needs to start handling this
> case appropriately.

I'm still not sold on this case yet. Sector-size-aligned I/O is an
optional feature, and I don't think it is worth this overhead. And
while file system metadata can be smaller than the file system block
size in a few cases (e.g. XFS log writes), that is usually an extra
performance optimization and can be trivially disabled in mkfs.