2016-05-25 06:12:28

by Baolin Wang

Subject: [RFC 0/3] Introduce the bulk mode method when sending request to crypto layer

This patchset checks whether the cipher supports bulk mode; dm-crypt then
chooses how to send requests to the crypto layer according to the cipher
mode.

Looking forward to any comments and suggestions. Thanks.

Baolin Wang (3):
block: Introduce blk_bio_map_sg() to map one bio
crypto: Introduce CRYPTO_ALG_BULK flag
md: dm-crypt: Introduce the bulk mode method when sending request

block/blk-merge.c | 45 +++++++++++
drivers/md/dm-crypt.c | 188 ++++++++++++++++++++++++++++++++++++++++++---
include/crypto/skcipher.h | 7 ++
include/linux/blkdev.h | 3 +
include/linux/crypto.h | 6 ++
5 files changed, 237 insertions(+), 12 deletions(-)

--
1.7.9.5
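
To make the motivation concrete, here is a rough userspace sketch of how many
crypto requests one bio turns into on each path. It is only a model:
model_requests_per_bio() is a hypothetical name, not a kernel function, and it
ignores the fallback to the per-sector path that the series takes when the
scatterlist bounds check fails.

```c
#include <assert.h>
#include <stddef.h>

#define SECTOR_SHIFT 9	/* 512-byte sectors, as used by dm-crypt */

/*
 * Hypothetical helper (not kernel API): number of crypto requests issued
 * for one bio of bio_bytes bytes. The per-sector path sends one request
 * per 512-byte sector; the bulk path sends the whole bio at once.
 */
static size_t model_requests_per_bio(size_t bio_bytes, int bulk_mode)
{
	/* Assumption of this model: a bio is a whole number of sectors. */
	assert((bio_bytes & (((size_t)1 << SECTOR_SHIFT) - 1)) == 0);

	if (bio_bytes == 0)
		return 0;

	return bulk_mode ? 1 : bio_bytes >> SECTOR_SHIFT;
}
```

For a 64KB I/O this gives 128 requests on the per-sector path and a single
request on the bulk path.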


2016-05-25 06:12:29

by Baolin Wang

Subject: [RFC 1/3] block: Introduce blk_bio_map_sg() to map one bio

dm-crypt needs to map one bio to a scatterlist to improve the encryption
efficiency of hardware engines. This patch therefore introduces the
blk_bio_map_sg() function to map one bio to scatterlists.

Signed-off-by: Baolin Wang <[email protected]>
---
block/blk-merge.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
include/linux/blkdev.h | 3 +++
2 files changed, 48 insertions(+)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 2613531..9b92af4 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -417,6 +417,51 @@ single_segment:
}

/*
+ * map a bio to scatterlist, return number of sg entries setup.
+ */
+int blk_bio_map_sg(struct request_queue *q, struct bio *bio,
+ struct scatterlist *sglist,
+ struct scatterlist **sg)
+{
+ struct bio_vec bvec, bvprv = { NULL };
+ struct bvec_iter iter;
+ int nsegs, cluster;
+
+ nsegs = 0;
+ cluster = blk_queue_cluster(q);
+
+ if (bio->bi_rw & REQ_DISCARD) {
+ /*
+ * This is a hack - drivers should be neither modifying the
+ * biovec, nor relying on bi_vcnt - but because of
+ * blk_add_request_payload(), a discard bio may or may not have
+ * a payload we need to set up here (thank you Christoph) and
+ * bi_vcnt is really the only way of telling if we need to.
+ */
+
+ if (bio->bi_vcnt)
+ goto single_segment;
+
+ return 0;
+ }
+
+ if (bio->bi_rw & REQ_WRITE_SAME) {
+single_segment:
+ *sg = sglist;
+ bvec = bio_iovec(bio);
+ sg_set_page(*sg, bvec.bv_page, bvec.bv_len, bvec.bv_offset);
+ return 1;
+ }
+
+ bio_for_each_segment(bvec, bio, iter)
+ __blk_segment_map_sg(q, &bvec, sglist, &bvprv, sg,
+ &nsegs, &cluster);
+
+ return nsegs;
+}
+EXPORT_SYMBOL(blk_bio_map_sg);
+
+/*
* map a request to scatterlist, return number of sg entries setup. Caller
* must make sure sg can hold rq->nr_phys_segments entries
*/
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 1fd8fdf..e5de4f8 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1013,6 +1013,9 @@ extern void blk_queue_write_cache(struct request_queue *q, bool enabled, bool fu
extern struct backing_dev_info *blk_get_backing_dev_info(struct block_device *bdev);

extern int blk_rq_map_sg(struct request_queue *, struct request *, struct scatterlist *);
+extern int blk_bio_map_sg(struct request_queue *q, struct bio *bio,
+ struct scatterlist *sglist,
+ struct scatterlist **sg);
extern void blk_dump_rq_flags(struct request *, char *);
extern long nr_blockdev_pages(void);

--
1.7.9.5


2016-05-25 06:13:03

by Baolin Wang

Subject: [RFC 2/3] crypto: Introduce CRYPTO_ALG_BULK flag

Some cipher hardware engines prefer to handle a bulk block rather than the
single sector (512 bytes) that dm-crypt creates, because these engines can
handle the IVs (initialization vectors) by themselves within one bulk block.
This means we can increase the request size by merging requests, rather than
always sending 512 bytes, and thus increase the hardware engine's processing
speed.

So introduce the 'CRYPTO_ALG_BULK' flag to indicate that a cipher supports
bulk mode.

Signed-off-by: Baolin Wang <[email protected]>
---
include/crypto/skcipher.h | 7 +++++++
include/linux/crypto.h | 6 ++++++
2 files changed, 13 insertions(+)

diff --git a/include/crypto/skcipher.h b/include/crypto/skcipher.h
index 0f987f5..d89d29a 100644
--- a/include/crypto/skcipher.h
+++ b/include/crypto/skcipher.h
@@ -519,5 +519,12 @@ static inline void skcipher_request_set_crypt(
req->iv = iv;
}

+static inline unsigned int skcipher_is_bulk_mode(struct crypto_skcipher *sk_tfm)
+{
+ struct crypto_tfm *tfm = crypto_skcipher_tfm(sk_tfm);
+
+ return crypto_tfm_alg_bulk(tfm);
+}
+
#endif /* _CRYPTO_SKCIPHER_H */

diff --git a/include/linux/crypto.h b/include/linux/crypto.h
index 6e28c89..a315487 100644
--- a/include/linux/crypto.h
+++ b/include/linux/crypto.h
@@ -63,6 +63,7 @@
#define CRYPTO_ALG_DEAD 0x00000020
#define CRYPTO_ALG_DYING 0x00000040
#define CRYPTO_ALG_ASYNC 0x00000080
+#define CRYPTO_ALG_BULK 0x00000100

/*
* Set this bit if and only if the algorithm requires another algorithm of
@@ -623,6 +624,11 @@ static inline u32 crypto_tfm_alg_type(struct crypto_tfm *tfm)
return tfm->__crt_alg->cra_flags & CRYPTO_ALG_TYPE_MASK;
}

+static inline unsigned int crypto_tfm_alg_bulk(struct crypto_tfm *tfm)
+{
+ return tfm->__crt_alg->cra_flags & CRYPTO_ALG_BULK;
+}
+
static inline unsigned int crypto_tfm_alg_blocksize(struct crypto_tfm *tfm)
{
return tfm->__crt_alg->cra_blocksize;
--
1.7.9.5
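
As a small illustration of the flag test added above, the following userspace
sketch models the check. The two flag values are copied from the hunk; struct
model_alg and model_alg_is_bulk() are hypothetical stand-ins for the kernel's
algorithm structure and helper. Unlike the helper in the patch, which returns
the masked bit itself (0x100 or 0), this model normalises the result to a
boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values as they appear in the include/linux/crypto.h hunk. */
#define CRYPTO_ALG_ASYNC	0x00000080
#define CRYPTO_ALG_BULK		0x00000100

/* Hypothetical stand-in for the algorithm descriptor; only flags matter. */
struct model_alg {
	uint32_t cra_flags;
};

/* Returns true when the algorithm advertises bulk-mode support. */
static bool model_alg_is_bulk(const struct model_alg *alg)
{
	assert(alg != NULL);

	/* Compare against zero so callers get 0/1, not the raw bit 0x100. */
	return (alg->cra_flags & CRYPTO_ALG_BULK) != 0;
}
```

A driver that can process a whole bio in one request would set the bit next to
its other flags, for example CRYPTO_ALG_ASYNC | CRYPTO_ALG_BULK.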

2016-05-25 06:12:31

by Baolin Wang

Subject: [RFC 3/3] md: dm-crypt: Introduce the bulk mode method when sending request

In the current dm-crypt code, it is inefficient for a hardware crypto engine
to map one segment (always one sector) of a bio with only one scatterlist at
a time. In particular, some encryption modes (such as ecb or xts), when
cooperating with the crypto engine, need only one initial IV or a null IV
instead of a different IV for each sector. In this situation we can use
multiple scatterlists to map the whole bio and send all scatterlists of one
bio to the crypto engine to encrypt or decrypt, which can improve the
hardware engine's efficiency.

With this optimization, on my test setup (BeagleBone Black board) using 64KB
I/Os on an eMMC storage device, I saw about a 60% improvement in throughput
for encrypted writes and about a 100% improvement for encrypted reads.
However, this is not suitable for other modes, which need a different IV for
each sector.

Signed-off-by: Baolin Wang <[email protected]>
---
drivers/md/dm-crypt.c | 188 +++++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 176 insertions(+), 12 deletions(-)

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 4f3cb35..1c86ea7 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -33,6 +33,7 @@
#include <linux/device-mapper.h>

#define DM_MSG_PREFIX "crypt"
+#define DM_MAX_SG_LIST 1024

/*
* context holding the current state of a multi-part conversion
@@ -46,6 +47,8 @@ struct convert_context {
sector_t cc_sector;
atomic_t cc_pending;
struct skcipher_request *req;
+ struct sg_table sgt_in;
+ struct sg_table sgt_out;
};

/*
@@ -803,6 +806,108 @@ static struct crypt_iv_operations crypt_iv_tcw_ops = {
.post = crypt_iv_tcw_post
};

+/*
+ * Count in advance how many sg entries are needed to map one bio
+ * with scatterlists.
+ */
+static unsigned int crypt_sg_entry(struct bio *bio_t)
+{
+ struct request_queue *q = bdev_get_queue(bio_t->bi_bdev);
+ int cluster = blk_queue_cluster(q);
+ struct bio_vec bvec, bvprv = { NULL };
+ struct bvec_iter biter;
+ unsigned long nbytes = 0, sg_length = 0;
+ unsigned int sg_cnt = 0, first_bvec = 0;
+
+ if (bio_t->bi_rw & REQ_DISCARD) {
+ if (bio_t->bi_vcnt)
+ return 1;
+ return 0;
+ }
+
+ if (bio_t->bi_rw & REQ_WRITE_SAME)
+ return 1;
+
+ bio_for_each_segment(bvec, bio_t, biter) {
+ nbytes = bvec.bv_len;
+
+ if (!cluster) {
+ sg_cnt++;
+ continue;
+ }
+
+ if (!first_bvec) {
+ first_bvec = 1;
+ goto new_segment;
+ }
+
+ if (sg_length + nbytes > queue_max_segment_size(q))
+ goto new_segment;
+
+ if (!BIOVEC_PHYS_MERGEABLE(&bvprv, &bvec))
+ goto new_segment;
+
+ if (!BIOVEC_SEG_BOUNDARY(q, &bvprv, &bvec))
+ goto new_segment;
+
+ sg_length += nbytes;
+ continue;
+
+new_segment:
+ memcpy(&bvprv, &bvec, sizeof(struct bio_vec));
+ sg_length = nbytes;
+ sg_cnt++;
+ }
+
+ return sg_cnt;
+}
+
+static int crypt_convert_alloc_table(struct crypt_config *cc,
+ struct convert_context *ctx)
+{
+ struct bio *bio_in = ctx->bio_in;
+ struct bio *bio_out = ctx->bio_out;
+ unsigned int mode = skcipher_is_bulk_mode(any_tfm(cc));
+ unsigned int sg_in_max, sg_out_max;
+ int ret = 0;
+
+ if (!mode)
+ goto out2;
+
+ /*
+ * Need to calculate how many sg entry need to be used
+ * for this bio.
+ */
+ sg_in_max = crypt_sg_entry(bio_in) + 1;
+ if (sg_in_max > DM_MAX_SG_LIST || sg_in_max <= 2)
+ goto out2;
+
+ ret = sg_alloc_table(&ctx->sgt_in, sg_in_max, GFP_KERNEL);
+ if (ret)
+ goto out2;
+
+ if (bio_data_dir(bio_in) == READ)
+ goto out1;
+
+ sg_out_max = crypt_sg_entry(bio_out) + 1;
+ if (sg_out_max > DM_MAX_SG_LIST || sg_out_max <= 2)
+ goto out3;
+
+ ret = sg_alloc_table(&ctx->sgt_out, sg_out_max, GFP_KERNEL);
+ if (ret)
+ goto out3;
+
+ return 0;
+
+out3:
+ sg_free_table(&ctx->sgt_in);
+out2:
+ ctx->sgt_in.orig_nents = 0;
+out1:
+ ctx->sgt_out.orig_nents = 0;
+ return ret;
+}
+
static void crypt_convert_init(struct crypt_config *cc,
struct convert_context *ctx,
struct bio *bio_out, struct bio *bio_in,
@@ -843,7 +948,13 @@ static int crypt_convert_block(struct crypt_config *cc,
{
struct bio_vec bv_in = bio_iter_iovec(ctx->bio_in, ctx->iter_in);
struct bio_vec bv_out = bio_iter_iovec(ctx->bio_out, ctx->iter_out);
+ unsigned int mode = skcipher_is_bulk_mode(any_tfm(cc));
+ struct bio *bio_in = ctx->bio_in;
+ struct bio *bio_out = ctx->bio_out;
+ unsigned int total_bytes = bio_in->bi_iter.bi_size;
struct dm_crypt_request *dmreq;
+ struct scatterlist *sg_in;
+ struct scatterlist *sg_out;
u8 *iv;
int r;

@@ -852,16 +963,6 @@ static int crypt_convert_block(struct crypt_config *cc,

dmreq->iv_sector = ctx->cc_sector;
dmreq->ctx = ctx;
- sg_init_table(&dmreq->sg_in, 1);
- sg_set_page(&dmreq->sg_in, bv_in.bv_page, 1 << SECTOR_SHIFT,
- bv_in.bv_offset);
-
- sg_init_table(&dmreq->sg_out, 1);
- sg_set_page(&dmreq->sg_out, bv_out.bv_page, 1 << SECTOR_SHIFT,
- bv_out.bv_offset);
-
- bio_advance_iter(ctx->bio_in, &ctx->iter_in, 1 << SECTOR_SHIFT);
- bio_advance_iter(ctx->bio_out, &ctx->iter_out, 1 << SECTOR_SHIFT);

if (cc->iv_gen_ops) {
r = cc->iv_gen_ops->generator(cc, iv, dmreq);
@@ -869,8 +970,63 @@ static int crypt_convert_block(struct crypt_config *cc,
return r;
}

- skcipher_request_set_crypt(req, &dmreq->sg_in, &dmreq->sg_out,
- 1 << SECTOR_SHIFT, iv);
+ if (mode && ctx->sgt_in.orig_nents > 0) {
+ struct scatterlist *sg = NULL;
+ unsigned int total_sg_in, total_sg_out;
+
+ total_sg_in = blk_bio_map_sg(bdev_get_queue(bio_in->bi_bdev),
+ bio_in, ctx->sgt_in.sgl, &sg);
+ if ((total_sg_in <= 0) ||
+ (total_sg_in > ctx->sgt_in.orig_nents)) {
+ DMERR("%s in sg map error %d, sg table nents[%d]\n",
+ __func__, total_sg_in, ctx->sgt_in.orig_nents);
+ return -EINVAL;
+ }
+
+ if (sg)
+ sg_mark_end(sg);
+
+ ctx->iter_in.bi_size -= total_bytes;
+ sg_in = ctx->sgt_in.sgl;
+ sg_out = ctx->sgt_in.sgl;
+
+ if (bio_data_dir(bio_in) == READ)
+ goto set_crypt;
+
+ sg = NULL;
+ total_sg_out = blk_bio_map_sg(bdev_get_queue(bio_out->bi_bdev),
+ bio_out, ctx->sgt_out.sgl, &sg);
+ if ((total_sg_out <= 0) ||
+ (total_sg_out > ctx->sgt_out.orig_nents)) {
+ DMERR("%s out sg map error %d, sg table nents[%d]\n",
+ __func__, total_sg_out, ctx->sgt_out.orig_nents);
+ return -EINVAL;
+ }
+
+ if (sg)
+ sg_mark_end(sg);
+
+ ctx->iter_out.bi_size -= total_bytes;
+ sg_out = ctx->sgt_out.sgl;
+ } else {
+ sg_init_table(&dmreq->sg_in, 1);
+ sg_set_page(&dmreq->sg_in, bv_in.bv_page, 1 << SECTOR_SHIFT,
+ bv_in.bv_offset);
+
+ sg_init_table(&dmreq->sg_out, 1);
+ sg_set_page(&dmreq->sg_out, bv_out.bv_page, 1 << SECTOR_SHIFT,
+ bv_out.bv_offset);
+
+ bio_advance_iter(ctx->bio_in, &ctx->iter_in, 1 << SECTOR_SHIFT);
+ bio_advance_iter(ctx->bio_out, &ctx->iter_out, 1 << SECTOR_SHIFT);
+
+ sg_in = &dmreq->sg_in;
+ sg_out = &dmreq->sg_out;
+ total_bytes = 1 << SECTOR_SHIFT;
+ }
+
+set_crypt:
+ skcipher_request_set_crypt(req, sg_in, sg_out, total_bytes, iv);

if (bio_data_dir(ctx->bio_in) == WRITE)
r = crypto_skcipher_encrypt(req);
@@ -1081,6 +1237,8 @@ static void crypt_dec_pending(struct dm_crypt_io *io)
if (io->ctx.req)
crypt_free_req(cc, io->ctx.req, base_bio);

+ sg_free_table(&io->ctx.sgt_in);
+ sg_free_table(&io->ctx.sgt_out);
base_bio->bi_error = error;
bio_endio(base_bio);
}
@@ -1312,6 +1470,9 @@ static void kcryptd_crypt_write_convert(struct dm_crypt_io *io)
io->ctx.iter_out = clone->bi_iter;

sector += bio_sectors(clone);
+ r = crypt_convert_alloc_table(cc, &io->ctx);
+ if (r < 0)
+ io->error = -EIO;

crypt_inc_pending(io);
r = crypt_convert(cc, &io->ctx);
@@ -1343,6 +1504,9 @@ static void kcryptd_crypt_read_convert(struct dm_crypt_io *io)

crypt_convert_init(cc, &io->ctx, io->base_bio, io->base_bio,
io->sector);
+ r = crypt_convert_alloc_table(cc, &io->ctx);
+ if (r < 0)
+ io->error = -EIO;

r = crypt_convert(cc, &io->ctx);
if (r < 0)
--
1.7.9.5
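
The segment counting done by crypt_sg_entry() can be sketched in userspace as
follows. This is a simplified model under stated assumptions, not the kernel
logic: struct model_vec and model_count_segments() are hypothetical names,
physical contiguity is reduced to "previous address plus length equals next
address", and the queue's segment boundary check is left out. The model
compares each vector against the one immediately before it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a bio_vec: a physical address and a length. */
struct model_vec {
	uint64_t addr;
	uint32_t len;
};

/*
 * Count how many scatterlist entries are needed for n vectors.
 * With clustering off, every vector gets its own entry. With clustering
 * on, a vector is merged into the current entry when it directly follows
 * the previous vector in memory and the merged entry stays within
 * max_seg_size.
 */
static unsigned int model_count_segments(const struct model_vec *v, size_t n,
					  int cluster, uint32_t max_seg_size)
{
	unsigned int segs = 0;
	uint32_t seg_len = 0;
	size_t i;

	assert(v != NULL || n == 0);

	for (i = 0; i < n; i++) {
		int contiguous = i > 0 &&
				 v[i - 1].addr + v[i - 1].len == v[i].addr;
		/* Subtraction form avoids overflow of seg_len + len. */
		int fits = seg_len <= max_seg_size &&
			   v[i].len <= max_seg_size - seg_len;

		if (cluster && segs > 0 && contiguous && fits) {
			seg_len += v[i].len;	/* merge into current entry */
		} else {
			seg_len = v[i].len;	/* start a new entry */
			segs++;
		}
	}

	return segs;
}
```

Three 4KB vectors at 0x1000, 0x2000 and 0x8000 need two entries with
clustering enabled and a 64KB segment limit, and three entries with clustering
disabled or with a 4KB limit.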

2016-05-25 08:52:05

by Ming Lei

Subject: Re: [RFC 1/3] block: Introduce blk_bio_map_sg() to map one bio

On Wed, May 25, 2016 at 2:12 PM, Baolin Wang <[email protected]> wrote:
> In dm-crypt, it need to map one bio to scatterlist for improving the
> hardware engine encryption efficiency. Thus this patch introduces the
> blk_bio_map_sg() function to map one bio with scatterlists.
>
> Signed-off-by: Baolin Wang <[email protected]>
> ---
> block/blk-merge.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
> include/linux/blkdev.h | 3 +++
> 2 files changed, 48 insertions(+)
>
> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index 2613531..9b92af4 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -417,6 +417,51 @@ single_segment:
> }
>
> /*
> + * map a bio to scatterlist, return number of sg entries setup.
> + */
> +int blk_bio_map_sg(struct request_queue *q, struct bio *bio,
> + struct scatterlist *sglist,
> + struct scatterlist **sg)
> +{
> + struct bio_vec bvec, bvprv = { NULL };
> + struct bvec_iter iter;
> + int nsegs, cluster;
> +
> + nsegs = 0;
> + cluster = blk_queue_cluster(q);
> +
> + if (bio->bi_rw & REQ_DISCARD) {
> + /*
> + * This is a hack - drivers should be neither modifying the
> + * biovec, nor relying on bi_vcnt - but because of
> + * blk_add_request_payload(), a discard bio may or may not have
> + * a payload we need to set up here (thank you Christoph) and
> + * bi_vcnt is really the only way of telling if we need to.
> + */
> +
> + if (bio->bi_vcnt)
> + goto single_segment;
> +
> + return 0;
> + }
> +
> + if (bio->bi_rw & REQ_WRITE_SAME) {
> +single_segment:
> + *sg = sglist;
> + bvec = bio_iovec(bio);
> + sg_set_page(*sg, bvec.bv_page, bvec.bv_len, bvec.bv_offset);
> + return 1;
> + }
> +
> + bio_for_each_segment(bvec, bio, iter)
> + __blk_segment_map_sg(q, &bvec, sglist, &bvprv, sg,
> + &nsegs, &cluster);
> +
> + return nsegs;
> +}
> +EXPORT_SYMBOL(blk_bio_map_sg);

You can use __blk_bios_map_sg() to implement blk_bio_map_sg(),
then code duplication may be avoided.

> +
> +/*
> * map a request to scatterlist, return number of sg entries setup. Caller
> * must make sure sg can hold rq->nr_phys_segments entries
> */
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 1fd8fdf..e5de4f8 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -1013,6 +1013,9 @@ extern void blk_queue_write_cache(struct request_queue *q, bool enabled, bool fu
> extern struct backing_dev_info *blk_get_backing_dev_info(struct block_device *bdev);
>
> extern int blk_rq_map_sg(struct request_queue *, struct request *, struct scatterlist *);
> +extern int blk_bio_map_sg(struct request_queue *q, struct bio *bio,
> + struct scatterlist *sglist,
> + struct scatterlist **sg);
> extern void blk_dump_rq_flags(struct request *, char *);
> extern long nr_blockdev_pages(void);
>
> --
> 1.7.9.5
>

2016-05-25 09:02:53

by Baolin Wang

Subject: Re: [RFC 1/3] block: Introduce blk_bio_map_sg() to map one bio

On 25 May 2016 at 16:52, Ming Lei <[email protected]> wrote:
>> /*
>> + * map a bio to scatterlist, return number of sg entries setup.
>> + */
>> +int blk_bio_map_sg(struct request_queue *q, struct bio *bio,
>> + struct scatterlist *sglist,
>> + struct scatterlist **sg)
>> +{
>> + struct bio_vec bvec, bvprv = { NULL };
>> + struct bvec_iter iter;
>> + int nsegs, cluster;
>> +
>> + nsegs = 0;
>> + cluster = blk_queue_cluster(q);
>> +
>> + if (bio->bi_rw & REQ_DISCARD) {
>> + /*
>> + * This is a hack - drivers should be neither modifying the
>> + * biovec, nor relying on bi_vcnt - but because of
>> + * blk_add_request_payload(), a discard bio may or may not have
>> + * a payload we need to set up here (thank you Christoph) and
>> + * bi_vcnt is really the only way of telling if we need to.
>> + */
>> +
>> + if (bio->bi_vcnt)
>> + goto single_segment;
>> +
>> + return 0;
>> + }
>> +
>> + if (bio->bi_rw & REQ_WRITE_SAME) {
>> +single_segment:
>> + *sg = sglist;
>> + bvec = bio_iovec(bio);
>> + sg_set_page(*sg, bvec.bv_page, bvec.bv_len, bvec.bv_offset);
>> + return 1;
>> + }
>> +
>> + bio_for_each_segment(bvec, bio, iter)
>> + __blk_segment_map_sg(q, &bvec, sglist, &bvprv, sg,
>> + &nsegs, &cluster);
>> +
>> + return nsegs;
>> +}
>> +EXPORT_SYMBOL(blk_bio_map_sg);
>
> You can use __blk_bios_map_sg() to implement blk_bio_map_sg(),
> then code duplication may be avoided.

OK. I'll refactor the code to map one bio.

>
>> +
>> +/*
>> * map a request to scatterlist, return number of sg entries setup. Caller
>> * must make sure sg can hold rq->nr_phys_segments entries
>> */
>> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
>> index 1fd8fdf..e5de4f8 100644
>> --- a/include/linux/blkdev.h
>> +++ b/include/linux/blkdev.h
>> @@ -1013,6 +1013,9 @@ extern void blk_queue_write_cache(struct request_queue *q, bool enabled, bool fu
>> extern struct backing_dev_info *blk_get_backing_dev_info(struct block_device *bdev);
>>
>> extern int blk_rq_map_sg(struct request_queue *, struct request *, struct scatterlist *);
>> +extern int blk_bio_map_sg(struct request_queue *q, struct bio *bio,
>> + struct scatterlist *sglist,
>> + struct scatterlist **sg);
>> extern void blk_dump_rq_flags(struct request *, char *);
>> extern long nr_blockdev_pages(void);
>>
>> --
>> 1.7.9.5
>>



--
Baolin.wang
Best Regards

2016-05-26 14:04:40

by Mike Snitzer

Subject: Re: [RFC 3/3] md: dm-crypt: Introduce the bulk mode method when sending request

Comments inlined.

In general the most concerning bit is the need for memory allocation in
the IO path (see comment/question below near call to sg_alloc_table).
In DM targets we make heavy use of .ctr preallocated memory and/or
per-bio-data to avoid memory allocations in the IO path.

On Wed, May 25 2016 at 2:12am -0400,
Baolin Wang <[email protected]> wrote:

> In now dm-crypt code, it is ineffective to map one segment (always one
> sector) of one bio with just only one scatterlist at one time for hardware
> crypto engine. Especially for some encryption mode (like ecb or xts mode)
> cooperating with the crypto engine, they just need one initial IV or null
> IV instead of different IV for each sector. In this situation We can consider
> to use multiple scatterlists to map the whole bio and send all scatterlists
> of one bio to crypto engine to encrypt or decrypt, which can improve the
> hardware engine's efficiency.
>
> With this optimization, On my test setup (beaglebone black board) using 64KB
> I/Os on an eMMC storage device I saw about 60% improvement in throughput for
> encrypted writes, and about 100% improvement for encrypted reads. But this
> is not fit for other modes which need different IV for each sector.
>
> Signed-off-by: Baolin Wang <[email protected]>
> ---
> drivers/md/dm-crypt.c | 188 +++++++++++++++++++++++++++++++++++++++++++++----
> 1 file changed, 176 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
> index 4f3cb35..1c86ea7 100644
> --- a/drivers/md/dm-crypt.c
> +++ b/drivers/md/dm-crypt.c
> @@ -33,6 +33,7 @@
> #include <linux/device-mapper.h>
>
> #define DM_MSG_PREFIX "crypt"
> +#define DM_MAX_SG_LIST 1024
>
> /*
> * context holding the current state of a multi-part conversion
> @@ -46,6 +47,8 @@ struct convert_context {
> sector_t cc_sector;
> atomic_t cc_pending;
> struct skcipher_request *req;
> + struct sg_table sgt_in;
> + struct sg_table sgt_out;
> };
>
> /*
> @@ -803,6 +806,108 @@ static struct crypt_iv_operations crypt_iv_tcw_ops = {
> .post = crypt_iv_tcw_post
> };
>
> +/*
> + * Check how many sg entry numbers are needed when map one bio
> + * with scatterlists in advance.
> + */
> +static unsigned int crypt_sg_entry(struct bio *bio_t)
> +{
> + struct request_queue *q = bdev_get_queue(bio_t->bi_bdev);
> + int cluster = blk_queue_cluster(q);
> + struct bio_vec bvec, bvprv = { NULL };
> + struct bvec_iter biter;
> + unsigned long nbytes = 0, sg_length = 0;
> + unsigned int sg_cnt = 0, first_bvec = 0;
> +
> + if (bio_t->bi_rw & REQ_DISCARD) {
> + if (bio_t->bi_vcnt)
> + return 1;
> + return 0;
> + }
> +
> + if (bio_t->bi_rw & REQ_WRITE_SAME)
> + return 1;
> +
> + bio_for_each_segment(bvec, bio_t, biter) {
> + nbytes = bvec.bv_len;
> +
> + if (!cluster) {
> + sg_cnt++;
> + continue;
> + }
> +
> + if (!first_bvec) {
> + first_bvec = 1;
> + goto new_segment;
> + }
> +
> + if (sg_length + nbytes > queue_max_segment_size(q))
> + goto new_segment;
> +
> + if (!BIOVEC_PHYS_MERGEABLE(&bvprv, &bvec))
> + goto new_segment;
> +
> + if (!BIOVEC_SEG_BOUNDARY(q, &bvprv, &bvec))
> + goto new_segment;
> +
> + sg_length += nbytes;
> + continue;
> +
> +new_segment:
> + memcpy(&bvprv, &bvec, sizeof(struct bio_vec));
> + sg_length = nbytes;
> + sg_cnt++;
> + }
> +
> + return sg_cnt;
> +}
> +
> +static int crypt_convert_alloc_table(struct crypt_config *cc,
> + struct convert_context *ctx)
> +{
> + struct bio *bio_in = ctx->bio_in;
> + struct bio *bio_out = ctx->bio_out;
> + unsigned int mode = skcipher_is_bulk_mode(any_tfm(cc));

please use: bool bulk_mode = ...

> + unsigned int sg_in_max, sg_out_max;
> + int ret = 0;
> +
> + if (!mode)
> + goto out2;

Please use more descriptive label names than out[1-3]

> +
> + /*
> + * Need to calculate how many sg entry need to be used
> + * for this bio.
> + */
> + sg_in_max = crypt_sg_entry(bio_in) + 1;

The return from crypt_sg_entry() is pretty awkward, given you just go on
to add 1; as is the bounds checking. The magic value of 2 needs to be
made clearer.

> + if (sg_in_max > DM_MAX_SG_LIST || sg_in_max <= 2)
> + goto out2;
> +
> + ret = sg_alloc_table(&ctx->sgt_in, sg_in_max, GFP_KERNEL);

Is it safe to be using GFP_KERNEL here? AFAIK this is in the IO mapping
path and we try to avoid memory allocations at all costs -- due to the
risk of deadlock when issuing IO to stacked block devices (dm-crypt
could be part of a much more elaborate IO stack).

> + if (ret)
> + goto out2;
> +
> + if (bio_data_dir(bio_in) == READ)
> + goto out1;
> +
> + sg_out_max = crypt_sg_entry(bio_out) + 1;
> + if (sg_out_max > DM_MAX_SG_LIST || sg_out_max <= 2)
> + goto out3;
> +
> + ret = sg_alloc_table(&ctx->sgt_out, sg_out_max, GFP_KERNEL);
> + if (ret)
> + goto out3;
> +
> + return 0;
> +
> +out3:

out_free_table?

> + sg_free_table(&ctx->sgt_in);
> +out2:

out_skip_alloc?

> + ctx->sgt_in.orig_nents = 0;
> +out1:

out_skip_write?

> + ctx->sgt_out.orig_nents = 0;
> + return ret;
> +}
> +
> static void crypt_convert_init(struct crypt_config *cc,
> struct convert_context *ctx,
> struct bio *bio_out, struct bio *bio_in,
> @@ -843,7 +948,13 @@ static int crypt_convert_block(struct crypt_config *cc,
> {
> struct bio_vec bv_in = bio_iter_iovec(ctx->bio_in, ctx->iter_in);
> struct bio_vec bv_out = bio_iter_iovec(ctx->bio_out, ctx->iter_out);
> + unsigned int mode = skcipher_is_bulk_mode(any_tfm(cc));

again please use: bool bulk_mode = ...

> + struct bio *bio_in = ctx->bio_in;
> + struct bio *bio_out = ctx->bio_out;
> + unsigned int total_bytes = bio_in->bi_iter.bi_size;
> struct dm_crypt_request *dmreq;
> + struct scatterlist *sg_in;
> + struct scatterlist *sg_out;
> u8 *iv;
> int r;
>
> @@ -852,16 +963,6 @@ static int crypt_convert_block(struct crypt_config *cc,
>
> dmreq->iv_sector = ctx->cc_sector;
> dmreq->ctx = ctx;
> - sg_init_table(&dmreq->sg_in, 1);
> - sg_set_page(&dmreq->sg_in, bv_in.bv_page, 1 << SECTOR_SHIFT,
> - bv_in.bv_offset);
> -
> - sg_init_table(&dmreq->sg_out, 1);
> - sg_set_page(&dmreq->sg_out, bv_out.bv_page, 1 << SECTOR_SHIFT,
> - bv_out.bv_offset);
> -
> - bio_advance_iter(ctx->bio_in, &ctx->iter_in, 1 << SECTOR_SHIFT);
> - bio_advance_iter(ctx->bio_out, &ctx->iter_out, 1 << SECTOR_SHIFT);
>
> if (cc->iv_gen_ops) {
> r = cc->iv_gen_ops->generator(cc, iv, dmreq);
> @@ -869,8 +970,63 @@ static int crypt_convert_block(struct crypt_config *cc,
> return r;
> }
>
> - skcipher_request_set_crypt(req, &dmreq->sg_in, &dmreq->sg_out,
> - 1 << SECTOR_SHIFT, iv);
> + if (mode && ctx->sgt_in.orig_nents > 0) {
> + struct scatterlist *sg = NULL;
> + unsigned int total_sg_in, total_sg_out;
> +
> + total_sg_in = blk_bio_map_sg(bdev_get_queue(bio_in->bi_bdev),
> + bio_in, ctx->sgt_in.sgl, &sg);
> + if ((total_sg_in <= 0) ||
> + (total_sg_in > ctx->sgt_in.orig_nents)) {
> + DMERR("%s in sg map error %d, sg table nents[%d]\n",
> + __func__, total_sg_in, ctx->sgt_in.orig_nents);
> + return -EINVAL;
> + }
> +
> + if (sg)
> + sg_mark_end(sg);
> +
> + ctx->iter_in.bi_size -= total_bytes;
> + sg_in = ctx->sgt_in.sgl;
> + sg_out = ctx->sgt_in.sgl;
> +
> + if (bio_data_dir(bio_in) == READ)
> + goto set_crypt;
> +
> + sg = NULL;
> + total_sg_out = blk_bio_map_sg(bdev_get_queue(bio_out->bi_bdev),
> + bio_out, ctx->sgt_out.sgl, &sg);
> + if ((total_sg_out <= 0) ||
> + (total_sg_out > ctx->sgt_out.orig_nents)) {
> + DMERR("%s out sg map error %d, sg table nents[%d]\n",
> + __func__, total_sg_out, ctx->sgt_out.orig_nents);
> + return -EINVAL;
> + }
> +
> + if (sg)
> + sg_mark_end(sg);
> +
> + ctx->iter_out.bi_size -= total_bytes;
> + sg_out = ctx->sgt_out.sgl;
> + } else {
> + sg_init_table(&dmreq->sg_in, 1);
> + sg_set_page(&dmreq->sg_in, bv_in.bv_page, 1 << SECTOR_SHIFT,
> + bv_in.bv_offset);
> +
> + sg_init_table(&dmreq->sg_out, 1);
> + sg_set_page(&dmreq->sg_out, bv_out.bv_page, 1 << SECTOR_SHIFT,
> + bv_out.bv_offset);
> +
> + bio_advance_iter(ctx->bio_in, &ctx->iter_in, 1 << SECTOR_SHIFT);
> + bio_advance_iter(ctx->bio_out, &ctx->iter_out, 1 << SECTOR_SHIFT);
> +
> + sg_in = &dmreq->sg_in;
> + sg_out = &dmreq->sg_out;
> + total_bytes = 1 << SECTOR_SHIFT;
> + }
> +
> +set_crypt:
> + skcipher_request_set_crypt(req, sg_in, sg_out, total_bytes, iv);

Given how long this code has gotten I'd prefer to see this factored out
to a new setup method.

> if (bio_data_dir(ctx->bio_in) == WRITE)
> r = crypto_skcipher_encrypt(req);
> @@ -1081,6 +1237,8 @@ static void crypt_dec_pending(struct dm_crypt_io *io)
> if (io->ctx.req)
> crypt_free_req(cc, io->ctx.req, base_bio);
>
> + sg_free_table(&io->ctx.sgt_in);
> + sg_free_table(&io->ctx.sgt_out);
> base_bio->bi_error = error;
> bio_endio(base_bio);
> }
> @@ -1312,6 +1470,9 @@ static void kcryptd_crypt_write_convert(struct dm_crypt_io *io)
> io->ctx.iter_out = clone->bi_iter;
>
> sector += bio_sectors(clone);
> + r = crypt_convert_alloc_table(cc, &io->ctx);
> + if (r < 0)
> + io->error = -EIO;
>
> crypt_inc_pending(io);
> r = crypt_convert(cc, &io->ctx);
> @@ -1343,6 +1504,9 @@ static void kcryptd_crypt_read_convert(struct dm_crypt_io *io)
>
> crypt_convert_init(cc, &io->ctx, io->base_bio, io->base_bio,
> io->sector);
> + r = crypt_convert_alloc_table(cc, &io->ctx);
> + if (r < 0)
> + io->error = -EIO;
>
> r = crypt_convert(cc, &io->ctx);
> if (r < 0)
> --
> 1.7.9.5
>
> --
> dm-devel mailing list
> [email protected]
> https://www.redhat.com/mailman/listinfo/dm-devel
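
The preallocation approach raised in this review (allocate once in the
constructor, never in the I/O path) can be sketched in userspace roughly as
below. Everything here is a hypothetical model: the model_* names are not
device-mapper API, calloc() stands in for kernel allocation, and the single
busy flag is a single-threaded simplification that a real target would replace
with per-request storage or a pool. MODEL_MAX_SG mirrors DM_MAX_SG_LIST from
the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_MAX_SG 1024	/* mirrors DM_MAX_SG_LIST in the patch */

/* Hypothetical stand-in for one scatterlist entry. */
struct model_sg {
	void *page;
	unsigned int offset;
	unsigned int length;
};

/* Hypothetical per-target state holding the preallocated table. */
struct model_target {
	struct model_sg *sg_in;
	int busy;
};

/* Constructor: the only place in this model that allocates memory. */
static int model_ctr(struct model_target *t)
{
	assert(t != NULL);

	t->busy = 0;
	t->sg_in = calloc(MODEL_MAX_SG, sizeof(*t->sg_in));
	return t->sg_in ? 0 : -1;
}

/*
 * I/O path: borrow the preallocated table. Never allocates; returns NULL
 * when the request does not fit or the table is in use, so the caller can
 * fall back to the per-sector path.
 */
static struct model_sg *model_get_sg(struct model_target *t,
				     unsigned int nents)
{
	if (nents == 0 || nents > MODEL_MAX_SG || t->busy)
		return NULL;

	t->busy = 1;
	return t->sg_in;
}

/* I/O completion: hand the table back. */
static void model_put_sg(struct model_target *t)
{
	t->busy = 0;
}

/* Destructor: release what the constructor allocated. */
static void model_dtr(struct model_target *t)
{
	free(t->sg_in);
	t->sg_in = NULL;
}
```

The point of the design is that the I/O path has no failure mode that depends
on the memory allocator, so it cannot wait on writeback through the same
stack.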

2016-05-27 06:03:05

by Baolin Wang

Subject: Re: [RFC 3/3] md: dm-crypt: Introduce the bulk mode method when sending request

On 26 May 2016 at 22:04, Mike Snitzer <[email protected]> wrote:
> Comments inlined.
>
> In general the most concerning bit is the need for memory allocation in
> the IO path (see comment/question below near call to sg_alloc_table).
> In DM targets we make heavy use of .ctr preallocated memory and/or
> per-bio-data to avoid memory allocations in the IO path.

Makes sense.

>
> On Wed, May 25 2016 at 2:12am -0400,
> Baolin Wang <[email protected]> wrote:
>
>> In now dm-crypt code, it is ineffective to map one segment (always one
>> sector) of one bio with just only one scatterlist at one time for hardware
>> crypto engine. Especially for some encryption mode (like ecb or xts mode)
>> cooperating with the crypto engine, they just need one initial IV or null
>> IV instead of different IV for each sector. In this situation We can consider
>> to use multiple scatterlists to map the whole bio and send all scatterlists
>> of one bio to crypto engine to encrypt or decrypt, which can improve the
>> hardware engine's efficiency.
>>
>> With this optimization, On my test setup (beaglebone black board) using 64KB
>> I/Os on an eMMC storage device I saw about 60% improvement in throughput for
>> encrypted writes, and about 100% improvement for encrypted reads. But this
>> is not fit for other modes which need different IV for each sector.
>>
>> Signed-off-by: Baolin Wang <[email protected]>
>> ---
>> drivers/md/dm-crypt.c | 188 +++++++++++++++++++++++++++++++++++++++++++++----
>> 1 file changed, 176 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
>> index 4f3cb35..1c86ea7 100644
>> --- a/drivers/md/dm-crypt.c
>> +++ b/drivers/md/dm-crypt.c
>> @@ -33,6 +33,7 @@
>> #include <linux/device-mapper.h>
>>
>> #define DM_MSG_PREFIX "crypt"
>> +#define DM_MAX_SG_LIST 1024
>>
>> /*
>> * context holding the current state of a multi-part conversion
>> @@ -46,6 +47,8 @@ struct convert_context {
>> sector_t cc_sector;
>> atomic_t cc_pending;
>> struct skcipher_request *req;
>> + struct sg_table sgt_in;
>> + struct sg_table sgt_out;
>> };
>>
>> /*
>> @@ -803,6 +806,108 @@ static struct crypt_iv_operations crypt_iv_tcw_ops = {
>> .post = crypt_iv_tcw_post
>> };
>>
>> +/*
>> + * Check how many sg entry numbers are needed when map one bio
>> + * with scatterlists in advance.
>> + */
>> +static unsigned int crypt_sg_entry(struct bio *bio_t)
>> +{
>> + struct request_queue *q = bdev_get_queue(bio_t->bi_bdev);
>> + int cluster = blk_queue_cluster(q);
>> + struct bio_vec bvec, bvprv = { NULL };
>> + struct bvec_iter biter;
>> + unsigned long nbytes = 0, sg_length = 0;
>> + unsigned int sg_cnt = 0, first_bvec = 0;
>> +
>> + if (bio_t->bi_rw & REQ_DISCARD) {
>> + if (bio_t->bi_vcnt)
>> + return 1;
>> + return 0;
>> + }
>> +
>> + if (bio_t->bi_rw & REQ_WRITE_SAME)
>> + return 1;
>> +
>> + bio_for_each_segment(bvec, bio_t, biter) {
>> + nbytes = bvec.bv_len;
>> +
>> + if (!cluster) {
>> + sg_cnt++;
>> + continue;
>> + }
>> +
>> + if (!first_bvec) {
>> + first_bvec = 1;
>> + goto new_segment;
>> + }
>> +
>> + if (sg_length + nbytes > queue_max_segment_size(q))
>> + goto new_segment;
>> +
>> + if (!BIOVEC_PHYS_MERGEABLE(&bvprv, &bvec))
>> + goto new_segment;
>> +
>> + if (!BIOVEC_SEG_BOUNDARY(q, &bvprv, &bvec))
>> + goto new_segment;
>> +
>> + sg_length += nbytes;
>> + continue;
>> +
>> +new_segment:
>> + memcpy(&bvprv, &bvec, sizeof(struct bio_vec));
>> + sg_length = nbytes;
>> + sg_cnt++;
>> + }
>> +
>> + return sg_cnt;
>> +}
>> +
>> +static int crypt_convert_alloc_table(struct crypt_config *cc,
>> + struct convert_context *ctx)
>> +{
>> + struct bio *bio_in = ctx->bio_in;
>> + struct bio *bio_out = ctx->bio_out;
>> + unsigned int mode = skcipher_is_bulk_mode(any_tfm(cc));
>
> please use: bool bulk_mode = ...

OK.

>
>> + unsigned int sg_in_max, sg_out_max;
>> + int ret = 0;
>> +
>> + if (!mode)
>> + goto out2;
>
> Please use more descriptive label names than out[1-3]

OK.
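
A rough userspace sketch of the same unwind pattern with descriptive
labels. It is simplified compared to the patch (every failure returns an
error here), and struct tables, MAX_TABLE_BYTES and alloc_tables() are
made-up names for illustration only.

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_TABLE_BYTES 4096 /* hypothetical cap, plays the role of DM_MAX_SG_LIST */

struct tables {
	void *in;
	void *out;
};

/* Allocate both tables; on failure unwind in reverse order of setup. */
static int alloc_tables(struct tables *t, size_t in_sz, size_t out_sz)
{
	t->in = NULL;
	t->out = NULL;

	if (in_sz == 0 || in_sz > MAX_TABLE_BYTES)
		goto out_skip_alloc;

	t->in = malloc(in_sz);
	if (!t->in)
		goto out_skip_alloc;

	if (out_sz == 0 || out_sz > MAX_TABLE_BYTES)
		goto out_free_in;

	t->out = malloc(out_sz);
	if (!t->out)
		goto out_free_in;

	return 0;

out_free_in:
	/* The input table exists at this point, so release it. */
	free(t->in);
	t->in = NULL;
out_skip_alloc:
	/* Nothing (left) to free. */
	return -1;
}
```

Each label says what is undone at that point, so a reader does not have to
count out1/out2/out3 against the allocation order.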

>
>> +
>> + /*
>> + * Need to calculate how many sg entry need to be used
>> + * for this bio.
>> + */
>> + sg_in_max = crypt_sg_entry(bio_in) + 1;
>
> The return from crypt_sg_entry() is pretty awkward, given you just go on
> to add 1; as is the bounds checking.. the magic value of 2 needs to be
> be made clearer.

I'll remove the crypt_sg_entry() function.

>
>> + if (sg_in_max > DM_MAX_SG_LIST || sg_in_max <= 2)
>> + goto out2;
>> +
>> + ret = sg_alloc_table(&ctx->sgt_in, sg_in_max, GFP_KERNEL);
>
> Is it safe to be using GFP_KERNEL here? AFAIK this is in the IO mapping
> path and we try to avoid memory allocations at all costs -- due to the
> risk of deadlock when issuing IO to stacked block devices (dm-crypt
> could be part of a much more elaborate IO stack).

OK. I'll preallocate the sg tables in the .ctr function instead.
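
Roughly the idea, as a userspace sketch only: allocate once at construction
time and never in the IO path. All names here (struct crypt_ctx, ctx_ctr(),
ctx_get_table(), ctx_dtr(), MAX_SG_ENTRIES) are hypothetical and are not
the dm-crypt ones.

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_SG_ENTRIES 1024 /* hypothetical cap, plays the role of DM_MAX_SG_LIST */

struct sg_entry {
	void *page;
	unsigned int offset;
	unsigned int length;
};

struct crypt_ctx {
	struct sg_entry *sg_in; /* allocated once at construction */
	unsigned int sg_max;
};

/* Constructor-time setup: the only place that allocates memory. */
static int ctx_ctr(struct crypt_ctx *ctx)
{
	ctx->sg_in = calloc(MAX_SG_ENTRIES, sizeof(*ctx->sg_in));
	if (!ctx->sg_in)
		return -1;
	ctx->sg_max = MAX_SG_ENTRIES;
	return 0;
}

/*
 * IO path: never allocates. Returns NULL when the bio needs more entries
 * than were reserved, so the caller can fall back to per-sector requests.
 */
static struct sg_entry *ctx_get_table(struct crypt_ctx *ctx, unsigned int nents)
{
	if (nents == 0 || nents > ctx->sg_max)
		return NULL;
	return ctx->sg_in;
}

/* Destructor: release the preallocated table. */
static void ctx_dtr(struct crypt_ctx *ctx)
{
	free(ctx->sg_in);
	ctx->sg_in = NULL;
	ctx->sg_max = 0;
}
```

A single shared table like this would still need per-request ownership or
locking once several bios are in flight; the sketch leaves that out.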

>
>> + if (ret)
>> + goto out2;
>> +
>> + if (bio_data_dir(bio_in) == READ)
>> + goto out1;
>> +
>> + sg_out_max = crypt_sg_entry(bio_out) + 1;
>> + if (sg_out_max > DM_MAX_SG_LIST || sg_out_max <= 2)
>> + goto out3;
>> +
>> + ret = sg_alloc_table(&ctx->sgt_out, sg_out_max, GFP_KERNEL);
>> + if (ret)
>> + goto out3;
>> +
>> + return 0;
>> +
>> +out3:
>
> out_free_table?
>
>> + sg_free_table(&ctx->sgt_in);
>> +out2:
>
> out_skip_alloc?
>
>> + ctx->sgt_in.orig_nents = 0;
>> +out1:
>
> out_skip_write?
>
>> + ctx->sgt_out.orig_nents = 0;
>> + return ret;
>> +}
>> +
>> static void crypt_convert_init(struct crypt_config *cc,
>> struct convert_context *ctx,
>> struct bio *bio_out, struct bio *bio_in,
>> @@ -843,7 +948,13 @@ static int crypt_convert_block(struct crypt_config *cc,
>> {
>> struct bio_vec bv_in = bio_iter_iovec(ctx->bio_in, ctx->iter_in);
>> struct bio_vec bv_out = bio_iter_iovec(ctx->bio_out, ctx->iter_out);
>> + unsigned int mode = skcipher_is_bulk_mode(any_tfm(cc));
>
> again please use: bool bulk_mode = ...

OK.

>
>> + struct bio *bio_in = ctx->bio_in;
>> + struct bio *bio_out = ctx->bio_out;
>> + unsigned int total_bytes = bio_in->bi_iter.bi_size;
>> struct dm_crypt_request *dmreq;
>> + struct scatterlist *sg_in;
>> + struct scatterlist *sg_out;
>> u8 *iv;
>> int r;
>>
>> @@ -852,16 +963,6 @@ static int crypt_convert_block(struct crypt_config *cc,
>>
>> dmreq->iv_sector = ctx->cc_sector;
>> dmreq->ctx = ctx;
>> - sg_init_table(&dmreq->sg_in, 1);
>> - sg_set_page(&dmreq->sg_in, bv_in.bv_page, 1 << SECTOR_SHIFT,
>> - bv_in.bv_offset);
>> -
>> - sg_init_table(&dmreq->sg_out, 1);
>> - sg_set_page(&dmreq->sg_out, bv_out.bv_page, 1 << SECTOR_SHIFT,
>> - bv_out.bv_offset);
>> -
>> - bio_advance_iter(ctx->bio_in, &ctx->iter_in, 1 << SECTOR_SHIFT);
>> - bio_advance_iter(ctx->bio_out, &ctx->iter_out, 1 << SECTOR_SHIFT);
>>
>> if (cc->iv_gen_ops) {
>> r = cc->iv_gen_ops->generator(cc, iv, dmreq);
>> @@ -869,8 +970,63 @@ static int crypt_convert_block(struct crypt_config *cc,
>> return r;
>> }
>>
>> - skcipher_request_set_crypt(req, &dmreq->sg_in, &dmreq->sg_out,
>> - 1 << SECTOR_SHIFT, iv);
>> + if (mode && ctx->sgt_in.orig_nents > 0) {
>> + struct scatterlist *sg = NULL;
>> + unsigned int total_sg_in, total_sg_out;
>> +
>> + total_sg_in = blk_bio_map_sg(bdev_get_queue(bio_in->bi_bdev),
>> + bio_in, ctx->sgt_in.sgl, &sg);
>> + if ((total_sg_in <= 0) ||
>> + (total_sg_in > ctx->sgt_in.orig_nents)) {
>> + DMERR("%s in sg map error %d, sg table nents[%d]\n",
>> + __func__, total_sg_in, ctx->sgt_in.orig_nents);
>> + return -EINVAL;
>> + }
>> +
>> + if (sg)
>> + sg_mark_end(sg);
>> +
>> + ctx->iter_in.bi_size -= total_bytes;
>> + sg_in = ctx->sgt_in.sgl;
>> + sg_out = ctx->sgt_in.sgl;
>> +
>> + if (bio_data_dir(bio_in) == READ)
>> + goto set_crypt;
>> +
>> + sg = NULL;
>> + total_sg_out = blk_bio_map_sg(bdev_get_queue(bio_out->bi_bdev),
>> + bio_out, ctx->sgt_out.sgl, &sg);
>> + if ((total_sg_out <= 0) ||
>> + (total_sg_out > ctx->sgt_out.orig_nents)) {
>> + DMERR("%s out sg map error %d, sg table nents[%d]\n",
>> + __func__, total_sg_out, ctx->sgt_out.orig_nents);
>> + return -EINVAL;
>> + }
>> +
>> + if (sg)
>> + sg_mark_end(sg);
>> +
>> + ctx->iter_out.bi_size -= total_bytes;
>> + sg_out = ctx->sgt_out.sgl;
>> + } else {
>> + sg_init_table(&dmreq->sg_in, 1);
>> + sg_set_page(&dmreq->sg_in, bv_in.bv_page, 1 << SECTOR_SHIFT,
>> + bv_in.bv_offset);
>> +
>> + sg_init_table(&dmreq->sg_out, 1);
>> + sg_set_page(&dmreq->sg_out, bv_out.bv_page, 1 << SECTOR_SHIFT,
>> + bv_out.bv_offset);
>> +
>> + bio_advance_iter(ctx->bio_in, &ctx->iter_in, 1 << SECTOR_SHIFT);
>> + bio_advance_iter(ctx->bio_out, &ctx->iter_out, 1 << SECTOR_SHIFT);
>> +
>> + sg_in = &dmreq->sg_in;
>> + sg_out = &dmreq->sg_out;
>> + total_bytes = 1 << SECTOR_SHIFT;
>> + }
>> +
>> +set_crypt:
>> + skcipher_request_set_crypt(req, sg_in, sg_out, total_bytes, iv);
>
> Given how long this code has gotten I'd prefer to see this factored out
> to a new setup method.

I'll refactor this long function. Thanks for your comments.

>
>> if (bio_data_dir(ctx->bio_in) == WRITE)
>> r = crypto_skcipher_encrypt(req);
>> @@ -1081,6 +1237,8 @@ static void crypt_dec_pending(struct dm_crypt_io *io)
>> if (io->ctx.req)
>> crypt_free_req(cc, io->ctx.req, base_bio);
>>
>> + sg_free_table(&io->ctx.sgt_in);
>> + sg_free_table(&io->ctx.sgt_out);
>> base_bio->bi_error = error;
>> bio_endio(base_bio);
>> }
>> @@ -1312,6 +1470,9 @@ static void kcryptd_crypt_write_convert(struct dm_crypt_io *io)
>> io->ctx.iter_out = clone->bi_iter;
>>
>> sector += bio_sectors(clone);
>> + r = crypt_convert_alloc_table(cc, &io->ctx);
>> + if (r < 0)
>> + io->error = -EIO;
>>
>> crypt_inc_pending(io);
>> r = crypt_convert(cc, &io->ctx);
>> @@ -1343,6 +1504,9 @@ static void kcryptd_crypt_read_convert(struct dm_crypt_io *io)
>>
>> crypt_convert_init(cc, &io->ctx, io->base_bio, io->base_bio,
>> io->sector);
>> + r = crypt_convert_alloc_table(cc, &io->ctx);
>> + if (r < 0)
>> + io->error = -EIO;
>>
>> r = crypt_convert(cc, &io->ctx);
>> if (r < 0)
>> --
>> 1.7.9.5
>>



--
Baolin.wang
Best Regards

2016-05-27 06:31:43

by Milan Broz

Subject: Re: [RFC 2/3] crypto: Introduce CRYPTO_ALG_BULK flag

On 05/25/2016 08:12 AM, Baolin Wang wrote:
> Now some cipher hardware engines prefer to handle bulk block rather than one
> sector (512 bytes) created by dm-crypt, cause these cipher engines can handle
> the intermediate values (IV) by themselves in one bulk block. This means we
> can increase the size of the request by merging request rather than always 512
> bytes and thus increase the hardware engine processing speed.

Hi,

could you please elaborate on how exactly you process independently
encrypted sectors, for example in XTS mode? Do you play with the tweak
calculation internally? Does this keep the 512-byte sector encryption
blocks independent?

(If not, it is breaking compatibility everywhere and you are reinventing
disk encryption logic here, just for performance reasons and for some hw
not designed for this task... But that was said several times already.)

> So introduce 'CRYPTO_ALG_BULK' flag to indicate this cipher can support bulk
> mode.

What exactly will skcipher do if this flag is set?

Which drivers should use it? I do not see any posted patch that uses this flag yet.
How can we test it?

Milan

>
> Signed-off-by: Baolin Wang <[email protected]>
> ---
> include/crypto/skcipher.h | 7 +++++++
> include/linux/crypto.h | 6 ++++++
> 2 files changed, 13 insertions(+)
>
> diff --git a/include/crypto/skcipher.h b/include/crypto/skcipher.h
> index 0f987f5..d89d29a 100644
> --- a/include/crypto/skcipher.h
> +++ b/include/crypto/skcipher.h
> @@ -519,5 +519,12 @@ static inline void skcipher_request_set_crypt(
> req->iv = iv;
> }
>
> +static inline unsigned int skcipher_is_bulk_mode(struct crypto_skcipher *sk_tfm)
> +{
> + struct crypto_tfm *tfm = crypto_skcipher_tfm(sk_tfm);
> +
> + return crypto_tfm_alg_bulk(tfm);
> +}
> +
> #endif /* _CRYPTO_SKCIPHER_H */
>
> diff --git a/include/linux/crypto.h b/include/linux/crypto.h
> index 6e28c89..a315487 100644
> --- a/include/linux/crypto.h
> +++ b/include/linux/crypto.h
> @@ -63,6 +63,7 @@
> #define CRYPTO_ALG_DEAD 0x00000020
> #define CRYPTO_ALG_DYING 0x00000040
> #define CRYPTO_ALG_ASYNC 0x00000080
> +#define CRYPTO_ALG_BULK 0x00000100
>
> /*
> * Set this bit if and only if the algorithm requires another algorithm of
> @@ -623,6 +624,11 @@ static inline u32 crypto_tfm_alg_type(struct crypto_tfm *tfm)
> return tfm->__crt_alg->cra_flags & CRYPTO_ALG_TYPE_MASK;
> }
>
> +static inline unsigned int crypto_tfm_alg_bulk(struct crypto_tfm *tfm)
> +{
> + return tfm->__crt_alg->cra_flags & CRYPTO_ALG_BULK;
> +}
> +
> static inline unsigned int crypto_tfm_alg_blocksize(struct crypto_tfm *tfm)
> {
> return tfm->__crt_alg->cra_blocksize;
>

2016-05-27 07:04:24

by Baolin Wang

Subject: Re: [RFC 2/3] crypto: Introduce CRYPTO_ALG_BULK flag

Hi Milan,

On 27 May 2016 at 14:31, Milan Broz <[email protected]> wrote:
> On 05/25/2016 08:12 AM, Baolin Wang wrote:
>> Now some cipher hardware engines prefer to handle bulk block rather than one
>> sector (512 bytes) created by dm-crypt, cause these cipher engines can handle
>> the intermediate values (IV) by themselves in one bulk block. This means we
>> can increase the size of the request by merging request rather than always 512
>> bytes and thus increase the hardware engine processing speed.
>
> Hi,
>
> could you please elaborate how exactly you are processing independently
> encrypted sectors? For example with XTS mode. Do you play internally with
> tweak calculation? Does this keep 512 bytes sector encryption blocks independent?
>
> (If not, it is breaking compatibility everywhere and you are reinventing
> disk encryption logic here - just for performance reason for some hw
> not designed for this task... But that was said several times already.)

That is what the cipher hardware engine and its driver should do;
in software we just need to send one initial IV and the bulk data to the
crypto layer, which is enough.

>
>> So introduce 'CRYPTO_ALG_BULK' flag to indicate this cipher can support bulk
>> mode.
>
> What exactly skcipher will do if this flag is set?

I think that depends on how the cipher engine driver is implemented.

>
> Which drivers it should use? I do not see any posted patch that uses this flag yet.
> How we can test it?

Cipher engine drivers that support bulk mode should set this
flag. Yeah, we need to upstream one cipher driver with this flag for
testing.




--
Baolin.wang
Best Regards

2016-05-27 07:53:40

by Milan Broz

Subject: Re: [RFC 2/3] crypto: Introduce CRYPTO_ALG_BULK flag

On 05/27/2016 09:04 AM, Baolin Wang wrote:
> Hi Milan,
>
> On 27 May 2016 at 14:31, Milan Broz <[email protected]> wrote:
>> On 05/25/2016 08:12 AM, Baolin Wang wrote:
>>> Now some cipher hardware engines prefer to handle bulk block rather than one
>>> sector (512 bytes) created by dm-crypt, cause these cipher engines can handle
>>> the intermediate values (IV) by themselves in one bulk block. This means we
>>> can increase the size of the request by merging request rather than always 512
>>> bytes and thus increase the hardware engine processing speed.
>>
>> Hi,
>>
>> could you please elaborate how exactly you are processing independently
>> encrypted sectors? For example with XTS mode. Do you play internally with
>> tweak calculation? Does this keep 512 bytes sector encryption blocks independent?
>>
>> (If not, it is breaking compatibility everywhere and you are reinventing
>> disk encryption logic here - just for performance reason for some hw
>> not designed for this task... But that was said several times already.)
>
> These are what the cipher hardware engine and engine driver should do,
> for software we just need send one initial IV and bulk data to crypto
> layer, which is enough.

Hi,

Thanks for the answer.

So this is just some kind of batch processing optimization inside the driver?
I still do not understand why it is not possible to "batch" these requests
inside your driver (with the async API) and process them in one go, without
any changes to dmcrypt. (But I am not familiar with these drivers.)

If I understand it correctly, subsequent IVs are calculated inside
your driver just based on some initial value?
This can work only for sequential IVs (or, better said, for predictable
IVs like plain64, implemented inside your hw/driver).

It cannot work for IVs/tweaks that are randomized or keyed (like ESSIV).
Yes, I know that using ESSIV for XTS is considered overkill, but I do not
see you checking this condition anywhere in the code (we can definitely
configure such a mapping).
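
To illustrate what "predictable" means here, a small userspace sketch of
plain64-style IV generation: the IV is the 64-bit sector number in
little-endian order, zero-padded to the IV size. IV_SIZE = 16 and the
helper names are assumptions for the example, not dm-crypt code.

```c
#include <stdint.h>
#include <string.h>

#define SECTOR_SHIFT 9
#define IV_SIZE 16 /* assumed IV size, e.g. for AES */

/* plain64: IV is the 64-bit sector number, little-endian, zero-padded. */
static void iv_plain64(uint8_t iv[IV_SIZE], uint64_t sector)
{
	memset(iv, 0, IV_SIZE);
	for (int i = 0; i < 8; i++)
		iv[i] = (uint8_t)(sector >> (8 * i));
}

/*
 * IV for the sector located byte_offset bytes into a bulk request that
 * starts at first_sector. Only the initial sector number is needed.
 */
static void iv_for_offset(uint8_t iv[IV_SIZE], uint64_t first_sector,
			  uint64_t byte_offset)
{
	iv_plain64(iv, first_sector + (byte_offset >> SECTOR_SHIFT));
}
```

With ESSIV each IV is the sector number encrypted under a key derived from
the volume key, so a driver cannot get the next IV by just stepping the
initial one forward like this.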

>>> So introduce 'CRYPTO_ALG_BULK' flag to indicate this cipher can support bulk
>>> mode.
>>
>> What exactly skcipher will do if this flag is set?
>
> I think that depends on how to implement the cipher engine driver.

I do not care about implementation details, I just would like to see
some high-level description of the new API flag.
(Or at least read the code that implements it :)

>> Which drivers it should use? I do not see any posted patch that uses this flag yet.
>> How we can test it?
>
> Some cipher engine drivers which support bulk mode should use this
> flag. Yeah, we need upstream one cipher driver with this flag for
> testing.

I think if you introduce a new flag, you should also post drivers that use it.
Otherwise it is just unused code for mainline.

And I definitely would like to test it somewhere and see real
performance data (not just simple dd).

Thanks,
Milan


2016-05-27 09:04:35

by Baolin Wang

Subject: Re: [RFC 2/3] crypto: Introduce CRYPTO_ALG_BULK flag

On 27 May 2016 at 15:53, Milan Broz <[email protected]> wrote:
> On 05/27/2016 09:04 AM, Baolin Wang wrote:
>> Hi Milan,
>>
>> On 27 May 2016 at 14:31, Milan Broz <[email protected]> wrote:
>>> On 05/25/2016 08:12 AM, Baolin Wang wrote:
>>>> Now some cipher hardware engines prefer to handle bulk block rather than one
>>>> sector (512 bytes) created by dm-crypt, cause these cipher engines can handle
>>>> the intermediate values (IV) by themselves in one bulk block. This means we
>>>> can increase the size of the request by merging request rather than always 512
>>>> bytes and thus increase the hardware engine processing speed.
>>>
>>> Hi,
>>>
>>> could you please elaborate how exactly you are processing independently
>>> encrypted sectors? For example with XTS mode. Do you play internally with
>>> tweak calculation? Does this keep 512 bytes sector encryption blocks independent?
>>>
>>> (If not, it is breaking compatibility everywhere and you are reinventing
>>> disk encryption logic here - just for performance reason for some hw
>>> not designed for this task... But that was said several times already.)
>>
>> These are what the cipher hardware engine and engine driver should do,
>> for software we just need send one initial IV and bulk data to crypto
>> layer, which is enough.
>
> Hi,
>
> Thanks for answer.
>
> So this is just doing some kind of batch processing optimization inside the driver?
> I still do not understand why it is not possible to "batch" these requests inside
> you driver (with async API) then and process them in one go without any changes
> to dmcrypt though. (But I am not familiar with these drivers.)

I think this is not only a driver-level matter; it also concerns the
cipher API. If a cipher engine supports bulk mode, then we should send it
bulk data rather than a single sector, and that has to be done at the top
API level, for example:

skcipher_request_set_crypt(req, sgt_in.sgl, sgt_out.sgl, total_bytes,
iv); /* send sg table to crypt layer */

If we moved this optimization into the driver, the driver would have to
merge the requests from dm-crypt back into one big request for the
engine, which is inefficient and does not help. Why can't we send one
bulk request from dm-crypt in the first place? We just need to pass
different parameters to skcipher_request_set_crypt() depending on the
cipher mode in dm-crypt.
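
As a rough illustration of the difference in request count, assuming
512-byte sectors and that a whole bio fits in one sg table (both helper
names are hypothetical):

```c
#include <stddef.h>

#define SECTOR_SHIFT 9

/* Per-sector mode: one skcipher request for every 512-byte sector. */
static size_t nr_requests_per_sector(size_t bio_bytes)
{
	return bio_bytes >> SECTOR_SHIFT;
}

/* Bulk mode: the whole bio is mapped to one sg table and sent once. */
static size_t nr_requests_bulk(size_t bio_bytes)
{
	return bio_bytes ? 1 : 0;
}
```

For a 4096-byte bio that is 8 requests in per-sector mode against 1 in
bulk mode.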

>
> If I understand it correctly, subsequent IVs are calculated inside
> your driver just based on some initial value?

Yes, something like this.

> This can work only for sequential IVs (or, better said, for predictable
> IVs like plain64, implemented inside your hw/driver).
>
> It cannot work for IVs/tweaks that are randomized or keyed (like ESSIV).
> Yes, I know that using ESSIV for XTS is considered overkill but I do not
> see you are checking this condition anywhere in code (we can definitely
> configure such a mapping).

So if a cipher (such as cbc(aes)) cannot support bulk mode, then the
cipher driver should not set the CRYPTO_ALG_BULK flag for it.
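
A minimal sketch of the kind of dm-crypt-side check asked about above,
assuming bulk mode is only safe with the sector-number IV modes.
bulk_mode_allowed() is a hypothetical helper, not part of this patchset;
the flag value is the one from patch 2/3.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CRYPTO_ALG_BULK 0x00000100u /* value proposed in patch 2/3 */

/*
 * Hypothetical helper: allow bulk requests only when the cipher advertises
 * bulk support and the IV mode is one the driver can derive per sector
 * from the initial IV.
 */
static bool bulk_mode_allowed(uint32_t cra_flags, const char *ivmode)
{
	if (!(cra_flags & CRYPTO_ALG_BULK))
		return false;
	if (!ivmode)
		return false;
	return strcmp(ivmode, "plain") == 0 || strcmp(ivmode, "plain64") == 0;
}
```

Anything that fails the check would keep using the existing one request
per sector path.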

>
>>>> So introduce 'CRYPTO_ALG_BULK' flag to indicate this cipher can support bulk
>>>> mode.
>>>
>>> What exactly skcipher will do if this flag is set?
>>
>> I think that depends on how to implement the cipher engine driver.
>
> I do not care about implementation details, I just would like to see
> some high-level description for the new API flag.
> (Or at least read the code that implements it :)

At a high level: we use an sg table to map one bulk block and send it
to the crypto layer with skcipher_request_set_crypt().

>
>>> Which drivers it should use? I do not see any posted patch that uses this flag yet.
>>> How we can test it?
>>
>> Some cipher engine drivers which support bulk mode should use this
>> flag. Yeah, we need upstream one cipher driver with this flag for
>> testing.
>
> I think if you introduce new flag, you should also post drivers that uses it.
> Otherwise it is just unused code for mainline.

Yeah, that's right.

>
> And I definitely would like to test it somewhere and see real
> performance data (not just simple dd).

OK. Thanks.




--
Baolin.wang
Best Regards