This patch set fix various problems spotted during T10/DIF integrity machinery testing.
TOC:
## Fix various bugs in T10/DIF/DIX infrastructure
0001-bio-integrity-Do-not-allocate-integrity-context-for
0002-bio-integrity-bio_trim-should-truncate-integrity-vec
0003-bio-integrity-bio_integrity_advance-must-update-inte
0004-bio-integrity-fix-interface-for-bio_integrity_trim
0005-bio-integrity-fold-bio_integrity_enabled-to-bio_integrity
## Cleanup T10/DIF/DIX infrastructure
0006-T10-Move-opencoded-contants-to-common-header
0007-Guard-bvec-iteration-logic-v3
0008-bio-add-bvec_iter-rewind-API
0009-bio-integrity-Restore-original-iterator-on-verify
Changes since V2
- Cleanups in response to commens from axboe@ and hch@
- Use rewind bvec api to restore original vector on verify
- fix one more bug with bip_seed update on bio split.
Changes since V1
- fix issues potted by kbuild bot
- Replace BUG_ON with error logic for 7'th patch
Testcase: xfstest blockdev/003
https://github.com/dmonakhov/xfstests/commit/3c6509eaa83b9c17cd0bc95d73fcdd76e1c54a85
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Dmitry Monakhov <[email protected]>
---
block/bio.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/block/bio.c b/block/bio.c
index e75878f..fa84323 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1907,6 +1907,10 @@ void bio_trim(struct bio *bio, int offset, int size)
bio_advance(bio, offset << 9);
bio->bi_iter.bi_size = size;
+
+ if (bio_integrity(bio))
+ bio_integrity_trim(bio, 0, size);
+
}
EXPORT_SYMBOL_GPL(bio_trim);
--
2.9.3
Currently ->verify_fn not woks at all because at the moment it is called
bio->bi_iter.bi_size == 0, so we do not iterate integrity bvecs at all.
In order to perform verification we need to know original data vector,
with new bvec rewind API this is trivial.
testcase: https://github.com/dmonakhov/xfstests/commit/3c6509eaa83b9c17cd0bc95d73fcdd76e1c54a85
Signed-off-by: Dmitry Monakhov <[email protected]>
---
block/bio-integrity.c | 22 +++++++++++++++-------
1 file changed, 15 insertions(+), 7 deletions(-)
diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index f2b9f09..f08096d 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -184,9 +184,10 @@ static inline unsigned int bio_integrity_bytes(struct blk_integrity *bi,
/**
* bio_integrity_process - Process integrity metadata for a bio
* @bio: bio to generate/verify integrity metadata for
+ * @proc_iter: iterator to process
* @proc_fn: Pointer to the relevant processing function
*/
-static int bio_integrity_process(struct bio *bio,
+static int bio_integrity_process(struct bio *bio, struct bvec_iter *proc_iter,
integrity_processing_fn *proc_fn)
{
struct blk_integrity *bi = bdev_get_integrity(bio->bi_bdev);
@@ -200,10 +201,10 @@ static int bio_integrity_process(struct bio *bio,
iter.disk_name = bio->bi_bdev->bd_disk->disk_name;
iter.interval = 1 << bi->interval_exp;
- iter.seed = bip_get_seed(bip);
+ iter.seed = proc_iter->bi_sector;
iter.prot_buf = prot_buf;
- bio_for_each_segment(bv, bio, bviter) {
+ __bio_for_each_segment(bv, bio, bviter, *proc_iter) {
void *kaddr = kmap_atomic(bv.bv_page);
iter.data_buf = kaddr + bv.bv_offset;
@@ -310,7 +311,8 @@ static int bio_integrity_setup(struct bio *bio)
/* Auto-generate integrity metadata if this is a write */
if (bio_data_dir(bio) == WRITE)
- bio_integrity_process(bio, bi->profile->generate_fn);
+ bio_integrity_process(bio, &bio->bi_iter,
+ bi->profile->generate_fn);
return 0;
}
@@ -376,9 +378,15 @@ static void bio_integrity_verify_fn(struct work_struct *work)
container_of(work, struct bio_integrity_payload, bip_work);
struct bio *bio = bip->bip_bio;
struct blk_integrity *bi = bdev_get_integrity(bio->bi_bdev);
-
- bio->bi_error = bio_integrity_process(bio, bi->profile->verify_fn);
-
+ struct bvec_iter iter = bio->bi_iter;
+
+ /* At the moment verify is called bio's iterator was advanced
+ * during split and completion, we need to rewind iterator to
+ * it's original position */
+ bio_rewind_iter(bio, &iter, iter.bi_done);
+ if (!bio->bi_error)
+ bio->bi_error = bio_integrity_process(bio, &iter,
+ bi->profile->verify_fn);
/* Restore original bio completion handler */
bio->bi_end_io = bip->bip_end_io;
bio_endio(bio);
--
2.9.3
Currently all integrity prep hooks are open-coded, and if prepare fails
we ignore it's code and fail bio with EIO. Let's return real error to
upper layer, so later caller may react accordingly.
In fact no one want to use bio_integrity_prep() w/o bio_integrity_enabled,
so it is reasonable to fold it in to one function.
Signed-off-by: Dmitry Monakhov <[email protected]>
---
Documentation/block/data-integrity.txt | 3 --
block/bio-integrity.c | 88 ++++++++++++++++++----------------
block/blk-core.c | 5 +-
block/blk-mq.c | 8 +---
drivers/nvdimm/blk.c | 13 +----
drivers/nvdimm/btt.c | 13 +----
include/linux/bio.h | 6 ---
7 files changed, 55 insertions(+), 81 deletions(-)
diff --git a/Documentation/block/data-integrity.txt b/Documentation/block/data-integrity.txt
index f56ec97..37ac837 100644
--- a/Documentation/block/data-integrity.txt
+++ b/Documentation/block/data-integrity.txt
@@ -202,9 +202,6 @@ will require extra work due to the application tag.
added. It is up to the caller to ensure that the bio does not
change while I/O is in progress.
- bio_integrity_prep() should only be called if
- bio_integrity_enabled() returned 1.
-
5.3 PASSING EXISTING INTEGRITY METADATA
diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 6170dad..f2b9f09 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -160,44 +160,6 @@ int bio_integrity_add_page(struct bio *bio, struct page *page,
EXPORT_SYMBOL(bio_integrity_add_page);
/**
- * bio_integrity_enabled - Check whether integrity can be passed
- * @bio: bio to check
- *
- * Description: Determines whether bio_integrity_prep() can be called
- * on this bio or not. bio data direction and target device must be
- * set prior to calling. The functions honors the write_generate and
- * read_verify flags in sysfs.
- */
-bool bio_integrity_enabled(struct bio *bio)
-{
- struct blk_integrity *bi = bdev_get_integrity(bio->bi_bdev);
-
- if (bio_op(bio) != REQ_OP_READ && bio_op(bio) != REQ_OP_WRITE)
- return false;
-
- if (!bio_sectors(bio))
- return false;
-
- /* Already protected? */
- if (bio_integrity(bio))
- return false;
-
- if (bi == NULL)
- return false;
-
- if (bio_data_dir(bio) == READ && bi->profile->verify_fn != NULL &&
- (bi->flags & BLK_INTEGRITY_VERIFY))
- return true;
-
- if (bio_data_dir(bio) == WRITE && bi->profile->generate_fn != NULL &&
- (bi->flags & BLK_INTEGRITY_GENERATE))
- return true;
-
- return false;
-}
-EXPORT_SYMBOL(bio_integrity_enabled);
-
-/**
* bio_integrity_intervals - Return number of integrity intervals for a bio
* @bi: blk_integrity profile for device
* @sectors: Size of the bio in 512-byte sectors
@@ -259,7 +221,7 @@ static int bio_integrity_process(struct bio *bio,
}
/**
- * bio_integrity_prep - Prepare bio for integrity I/O
+ * bio_integrity_setup - Prepare bio for integrity I/O
* @bio: bio to prepare
*
* Description: Allocates a buffer for integrity metadata, maps the
@@ -269,7 +231,7 @@ static int bio_integrity_process(struct bio *bio,
* block device's integrity function. In the READ case, the buffer
* will be prepared for DMA and a suitable end_io handler set up.
*/
-int bio_integrity_prep(struct bio *bio)
+static int bio_integrity_setup(struct bio *bio)
{
struct bio_integrity_payload *bip;
struct blk_integrity *bi;
@@ -352,6 +314,52 @@ int bio_integrity_prep(struct bio *bio)
return 0;
}
+/**
+ * bio_integrity_prep - Prepare bio for integrity I/O
+ * @bio: bio to prepare
+ *
+ * Description: Checks if the bio already has an integrity payload attached.
+ * If it does, the payload has been generated by another kernel subsystem,
+ * and we just pass it through. Otherwise allocates integrity payload.
+ */
+
+int bio_integrity_prep(struct bio *bio)
+{
+ int err;
+ struct blk_integrity *bi = bdev_get_integrity(bio->bi_bdev);
+
+ if (bio_op(bio) != REQ_OP_READ && bio_op(bio) != REQ_OP_WRITE)
+ return 0;
+
+ if (!bio_sectors(bio))
+ return 0;
+
+ /* Already protected? */
+ if (bio_integrity(bio))
+ return 0;
+
+ if (bi == NULL)
+ return 0;
+
+ if (bio_data_dir(bio) == READ && bi->profile->verify_fn != NULL &&
+ (bi->flags & BLK_INTEGRITY_VERIFY))
+ goto need_prep;
+
+ if (bio_data_dir(bio) == WRITE && bi->profile->generate_fn != NULL &&
+ (bi->flags & BLK_INTEGRITY_GENERATE))
+ goto need_prep;
+
+ return 0;
+need_prep:
+ err = bio_integrity_setup(bio);
+ if (err) {
+ bio->bi_error = err;
+ bio_endio(bio);
+ }
+ return err;
+}
+
+
EXPORT_SYMBOL(bio_integrity_prep);
/**
diff --git a/block/blk-core.c b/block/blk-core.c
index d772c22..275c1e4 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1637,11 +1637,8 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
blk_queue_split(q, &bio, q->bio_split);
- if (bio_integrity_enabled(bio) && bio_integrity_prep(bio)) {
- bio->bi_error = -EIO;
- bio_endio(bio);
+ if (bio_integrity_prep(bio))
return BLK_QC_T_NONE;
- }
if (op_is_flush(bio->bi_opf)) {
spin_lock_irq(q->queue_lock);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 08a49c6..b19cd92 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1489,10 +1489,8 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
blk_queue_bounce(q, &bio);
- if (bio_integrity_enabled(bio) && bio_integrity_prep(bio)) {
- bio_io_error(bio);
+ if (bio_integrity_prep(bio))
return BLK_QC_T_NONE;
- }
blk_queue_split(q, &bio, q->bio_split);
@@ -1611,10 +1609,8 @@ static blk_qc_t blk_sq_make_request(struct request_queue *q, struct bio *bio)
blk_queue_bounce(q, &bio);
- if (bio_integrity_enabled(bio) && bio_integrity_prep(bio)) {
- bio_io_error(bio);
+ if (bio_integrity_prep(bio))
return BLK_QC_T_NONE;
- }
blk_queue_split(q, &bio, q->bio_split);
diff --git a/drivers/nvdimm/blk.c b/drivers/nvdimm/blk.c
index 9faaa96..0b49336 100644
--- a/drivers/nvdimm/blk.c
+++ b/drivers/nvdimm/blk.c
@@ -179,16 +179,8 @@ static blk_qc_t nd_blk_make_request(struct request_queue *q, struct bio *bio)
int err = 0, rw;
bool do_acct;
- /*
- * bio_integrity_enabled also checks if the bio already has an
- * integrity payload attached. If it does, we *don't* do a
- * bio_integrity_prep here - the payload has been generated by
- * another kernel subsystem, and we just pass it through.
- */
- if (bio_integrity_enabled(bio) && bio_integrity_prep(bio)) {
- bio->bi_error = -EIO;
- goto out;
- }
+ if (bio_integrity_prep(bio))
+ return BLK_QC_T_NONE;
bip = bio_integrity(bio);
nsblk = q->queuedata;
@@ -212,7 +204,6 @@ static blk_qc_t nd_blk_make_request(struct request_queue *q, struct bio *bio)
if (do_acct)
nd_iostat_end(bio, start);
- out:
bio_endio(bio);
return BLK_QC_T_NONE;
}
diff --git a/drivers/nvdimm/btt.c b/drivers/nvdimm/btt.c
index 368795a..3d7a9fe 100644
--- a/drivers/nvdimm/btt.c
+++ b/drivers/nvdimm/btt.c
@@ -1158,16 +1158,8 @@ static blk_qc_t btt_make_request(struct request_queue *q, struct bio *bio)
int err = 0;
bool do_acct;
- /*
- * bio_integrity_enabled also checks if the bio already has an
- * integrity payload attached. If it does, we *don't* do a
- * bio_integrity_prep here - the payload has been generated by
- * another kernel subsystem, and we just pass it through.
- */
- if (bio_integrity_enabled(bio) && bio_integrity_prep(bio)) {
- bio->bi_error = -EIO;
- goto out;
- }
+ if (bio_integrity_prep(bio))
+ return BLK_QC_T_NONE;
do_acct = nd_iostat_start(bio, &start);
bio_for_each_segment(bvec, bio, iter) {
@@ -1194,7 +1186,6 @@ static blk_qc_t btt_make_request(struct request_queue *q, struct bio *bio)
if (do_acct)
nd_iostat_end(bio, start);
-out:
bio_endio(bio);
return BLK_QC_T_NONE;
}
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 7f7bf37..65445c0 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -727,7 +727,6 @@ struct biovec_slab {
extern struct bio_integrity_payload *bio_integrity_alloc(struct bio *, gfp_t, unsigned int);
extern void bio_integrity_free(struct bio *);
extern int bio_integrity_add_page(struct bio *, struct page *, unsigned int, unsigned int);
-extern bool bio_integrity_enabled(struct bio *bio);
extern int bio_integrity_prep(struct bio *);
extern void bio_integrity_endio(struct bio *);
extern void bio_integrity_advance(struct bio *, unsigned int);
@@ -744,11 +743,6 @@ static inline void *bio_integrity(struct bio *bio)
return NULL;
}
-static inline bool bio_integrity_enabled(struct bio *bio)
-{
- return false;
-}
-
static inline int bioset_integrity_create(struct bio_set *bs, int pool_size)
{
return 0;
--
2.9.3
Signed-off-by: Dmitry Monakhov <[email protected]>
---
block/t10-pi.c | 9 +++------
drivers/scsi/lpfc/lpfc_scsi.c | 5 +++--
drivers/scsi/qla2xxx/qla_isr.c | 8 ++++----
drivers/target/target_core_sbc.c | 2 +-
include/linux/t10-pi.h | 2 ++
5 files changed, 13 insertions(+), 13 deletions(-)
diff --git a/block/t10-pi.c b/block/t10-pi.c
index 2c97912..485cecd 100644
--- a/block/t10-pi.c
+++ b/block/t10-pi.c
@@ -28,9 +28,6 @@
typedef __be16 (csum_fn) (void *, unsigned int);
-static const __be16 APP_ESCAPE = (__force __be16) 0xffff;
-static const __be32 REF_ESCAPE = (__force __be32) 0xffffffff;
-
static __be16 t10_pi_crc_fn(void *data, unsigned int len)
{
return cpu_to_be16(crc_t10dif(data, len));
@@ -82,7 +79,7 @@ static int t10_pi_verify(struct blk_integrity_iter *iter, csum_fn *fn,
switch (type) {
case 1:
case 2:
- if (pi->app_tag == APP_ESCAPE)
+ if (pi->app_tag == T10_APP_ESCAPE)
goto next;
if (be32_to_cpu(pi->ref_tag) !=
@@ -95,8 +92,8 @@ static int t10_pi_verify(struct blk_integrity_iter *iter, csum_fn *fn,
}
break;
case 3:
- if (pi->app_tag == APP_ESCAPE &&
- pi->ref_tag == REF_ESCAPE)
+ if (pi->app_tag == T10_APP_ESCAPE &&
+ pi->ref_tag == T10_REF_ESCAPE)
goto next;
break;
}
diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
index 54fd0c8..6f6b40e 100644
--- a/drivers/scsi/lpfc/lpfc_scsi.c
+++ b/drivers/scsi/lpfc/lpfc_scsi.c
@@ -26,6 +26,7 @@
#include <linux/export.h>
#include <linux/delay.h>
#include <asm/unaligned.h>
+#include <linux/t10-pi.h>
#include <linux/crc-t10dif.h>
#include <net/checksum.h>
@@ -2934,8 +2935,8 @@ lpfc_calc_bg_err(struct lpfc_hba *phba, struct lpfc_scsi_buf *lpfc_cmd)
* First check to see if a protection data
* check is valid
*/
- if ((src->ref_tag == 0xffffffff) ||
- (src->app_tag == 0xffff)) {
+ if ((src->ref_tag == T10_REF_ESCAPE) ||
+ (src->app_tag == T10_APP_ESCAPE)) {
start_ref_tag++;
goto skipit;
}
diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
index 3203367..ed4b302 100644
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -1950,9 +1950,9 @@ qla2x00_handle_dif_error(srb_t *sp, struct sts_entry_24xx *sts24)
* For type 3: ref & app tag is all 'f's
* For type 0,1,2: app tag is all 'f's
*/
- if ((a_app_tag == 0xffff) &&
+ if ((a_app_tag == T10_APP_ESCAPE) &&
((scsi_get_prot_type(cmd) != SCSI_PROT_DIF_TYPE3) ||
- (a_ref_tag == 0xffffffff))) {
+ (a_ref_tag == T10_REF_ESCAPE))) {
uint32_t blocks_done, resid;
sector_t lba_s = scsi_get_lba(cmd);
@@ -1994,9 +1994,9 @@ qla2x00_handle_dif_error(srb_t *sp, struct sts_entry_24xx *sts24)
spt = page_address(sg_page(sg)) + sg->offset;
spt += j;
- spt->app_tag = 0xffff;
+ spt->app_tag = T10_APP_ESCAPE;
if (scsi_get_prot_type(cmd) == SCSI_PROT_DIF_TYPE3)
- spt->ref_tag = 0xffffffff;
+ spt->ref_tag = T10_REF_ESCAPE;
}
return 0;
diff --git a/drivers/target/target_core_sbc.c b/drivers/target/target_core_sbc.c
index c194063..927ef44 100644
--- a/drivers/target/target_core_sbc.c
+++ b/drivers/target/target_core_sbc.c
@@ -1446,7 +1446,7 @@ sbc_dif_verify(struct se_cmd *cmd, sector_t start, unsigned int sectors,
(unsigned long long)sector, sdt->guard_tag,
sdt->app_tag, be32_to_cpu(sdt->ref_tag));
- if (sdt->app_tag == cpu_to_be16(0xffff)) {
+ if (sdt->app_tag == T10_APP_ESCAPE) {
dsg_off += block_size;
goto next;
}
diff --git a/include/linux/t10-pi.h b/include/linux/t10-pi.h
index 9fba9dd..f892442 100644
--- a/include/linux/t10-pi.h
+++ b/include/linux/t10-pi.h
@@ -33,6 +33,8 @@ struct t10_pi_tuple {
__be32 ref_tag; /* Target LBA or indirect LBA */
};
+#define T10_APP_ESCAPE cpu_to_be16(0xffff)
+#define T10_REF_ESCAPE cpu_to_be32(0xffffffff)
extern struct blk_integrity_profile t10_pi_type1_crc;
extern struct blk_integrity_profile t10_pi_type1_ip;
--
2.9.3
Some ->bi_end_io handlers (for example: pi_verify or decrypt handlers)
need to know original data vector, but after bio traverse io-stack it may
be advanced, splited and relocated many times so it is hard to guess
original iterator. Let's add 'bi_done' conter which accounts number
of bytes iterator was advanced during it's evolution. Later end_io handler
may easily restore original iterator by rewinding iterator to
iter->bi_done.
Note: this change makes sizeof (struct bvec_iter) multiple to 8
Signed-off-by: Dmitry Monakhov <[email protected]>
---
include/linux/bio.h | 21 +++++++++++++++++++--
include/linux/bvec.h | 26 ++++++++++++++++++++++++++
2 files changed, 45 insertions(+), 2 deletions(-)
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 12b7fcd..491fb0c 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -166,9 +166,10 @@ static inline void bio_advance_iter(struct bio *bio, struct bvec_iter *iter,
{
iter->bi_sector += bytes >> 9;
- if (bio_no_advance_iter(bio))
+ if (bio_no_advance_iter(bio)) {
iter->bi_size -= bytes;
- else {
+ iter->bi_done += bytes;
+ } else {
int err;
err = bvec_iter_advance(bio->bi_io_vec, iter, bytes);
if (unlikely(err))
@@ -176,6 +177,22 @@ static inline void bio_advance_iter(struct bio *bio, struct bvec_iter *iter,
}
}
+static inline void bio_rewind_iter(struct bio *bio, struct bvec_iter *iter,
+ unsigned bytes)
+{
+ iter->bi_sector -= bytes >> 9;
+
+ if (bio_no_advance_iter(bio)) {
+ iter->bi_size += bytes;
+ iter->bi_done -= bytes;
+ } else {
+ int err;
+ err = bvec_iter_rewind(bio->bi_io_vec, iter, bytes);
+ if (unlikely(err))
+ bio->bi_error = err;
+ }
+}
+
#define __bio_for_each_segment(bvl, bio, iter, start) \
for (iter = (start); \
(iter).bi_size && \
diff --git a/include/linux/bvec.h b/include/linux/bvec.h
index 984a7a8..35cb97b 100644
--- a/include/linux/bvec.h
+++ b/include/linux/bvec.h
@@ -40,6 +40,8 @@ struct bvec_iter {
unsigned int bi_idx; /* current index into bvl_vec */
+ unsigned int bi_done; /* number of bytes completed */
+
unsigned int bi_bvec_done; /* number of bytes completed in
current bvec */
};
@@ -84,6 +86,7 @@ static inline int bvec_iter_advance(const struct bio_vec *bv,
bytes -= len;
iter->bi_size -= len;
iter->bi_bvec_done += len;
+ iter->bi_done += len;
if (iter->bi_bvec_done == __bvec_iter_bvec(bv, *iter)->bv_len) {
iter->bi_bvec_done = 0;
@@ -93,6 +96,29 @@ static inline int bvec_iter_advance(const struct bio_vec *bv,
return 0;
}
+static inline int bvec_iter_rewind(const struct bio_vec *bv,
+ struct bvec_iter *iter,
+ unsigned bytes)
+{
+ while (bytes) {
+ unsigned len = min(bytes, iter->bi_bvec_done);
+ if (iter->bi_bvec_done == 0 ) {
+ if (WARN_ONCE(iter->bi_idx == 0,
+ "Attempted to rewind iter beyond "
+ "bvec's boundaries\n")) {
+ return -EINVAL;
+ }
+ iter->bi_idx--;
+ iter->bi_bvec_done = __bvec_iter_bvec(bv, *iter)->bv_len;
+ continue;
+ }
+ bytes -= len;
+ iter->bi_size += len;
+ iter->bi_bvec_done -= len;
+ }
+ return 0;
+}
+
#define for_each_bvec(bvl, bio_vec, iter, start) \
for (iter = (start); \
(iter).bi_size && \
--
2.9.3
Currently if some one try to advance bvec beyond it's size we simply
dump WARN_ONCE and continue to iterate beyond bvec array boundaries.
This simply means that we endup dereferencing/corrupting random memory
region.
Sane reaction would be to propagate error back to calling context
But bvec_iter_advance's calling context is not always good for error
handling. For safity reason let truncate iterator size to zero which
will break external iteration loop which prevent us from unpredictable
memory range corruption. And even it caller ignores an error, it will
corrupt it's own bvecs, not others.
This patch does:
- Return error back to caller with hope that it will react on this
- Truncate iterator size
Code was added long time ago here 4550dd6c, luckily no one hit it
in real life :)
changes since V1:
- Replace BUG_ON with error logic.
Signed-off-by: Dmitry Monakhov <[email protected]>
---
drivers/nvdimm/blk.c | 4 +++-
drivers/nvdimm/btt.c | 4 +++-
include/linux/bio.h | 8 ++++++--
include/linux/bvec.h | 11 ++++++++---
4 files changed, 20 insertions(+), 7 deletions(-)
diff --git a/drivers/nvdimm/blk.c b/drivers/nvdimm/blk.c
index 0b49336..c82331b 100644
--- a/drivers/nvdimm/blk.c
+++ b/drivers/nvdimm/blk.c
@@ -106,7 +106,9 @@ static int nd_blk_rw_integrity(struct nd_namespace_blk *nsblk,
len -= cur_len;
dev_offset += cur_len;
- bvec_iter_advance(bip->bip_vec, &bip->bip_iter, cur_len);
+ err = bvec_iter_advance(bip->bip_vec, &bip->bip_iter, cur_len);
+ if (err)
+ return err;
}
return err;
diff --git a/drivers/nvdimm/btt.c b/drivers/nvdimm/btt.c
index 3d7a9fe..5a68681 100644
--- a/drivers/nvdimm/btt.c
+++ b/drivers/nvdimm/btt.c
@@ -942,7 +942,9 @@ static int btt_rw_integrity(struct btt *btt, struct bio_integrity_payload *bip,
len -= cur_len;
meta_nsoff += cur_len;
- bvec_iter_advance(bip->bip_vec, &bip->bip_iter, cur_len);
+ ret = bvec_iter_advance(bip->bip_vec, &bip->bip_iter, cur_len);
+ if (ret)
+ return ret;
}
return ret;
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 65445c0..12b7fcd 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -168,8 +168,12 @@ static inline void bio_advance_iter(struct bio *bio, struct bvec_iter *iter,
if (bio_no_advance_iter(bio))
iter->bi_size -= bytes;
- else
- bvec_iter_advance(bio->bi_io_vec, iter, bytes);
+ else {
+ int err;
+ err = bvec_iter_advance(bio->bi_io_vec, iter, bytes);
+ if (unlikely(err))
+ bio->bi_error = err;
+ }
}
#define __bio_for_each_segment(bvl, bio, iter, start) \
diff --git a/include/linux/bvec.h b/include/linux/bvec.h
index 89b65b8..984a7a8 100644
--- a/include/linux/bvec.h
+++ b/include/linux/bvec.h
@@ -22,6 +22,7 @@
#include <linux/kernel.h>
#include <linux/bug.h>
+#include <linux/errno.h>
/*
* was unsigned short, but we might as well be ready for > 64kB I/O pages
@@ -66,12 +67,15 @@ struct bvec_iter {
.bv_offset = bvec_iter_offset((bvec), (iter)), \
})
-static inline void bvec_iter_advance(const struct bio_vec *bv,
+static inline int bvec_iter_advance(const struct bio_vec *bv,
struct bvec_iter *iter,
unsigned bytes)
{
- WARN_ONCE(bytes > iter->bi_size,
- "Attempted to advance past end of bvec iter\n");
+ if (WARN_ONCE(bytes > iter->bi_size,
+ "Attempted to advance past end of bvec iter\n")) {
+ iter->bi_size = 0;
+ return -EINVAL;
+ }
while (bytes) {
unsigned iter_len = bvec_iter_len(bv, *iter);
@@ -86,6 +90,7 @@ static inline void bvec_iter_advance(const struct bio_vec *bv,
iter->bi_idx++;
}
}
+ return 0;
}
#define for_each_bvec(bvl, bio_vec, iter, start) \
--
2.9.3
bio_integrity_trim inherent it's interface from bio_trim and accept
offset and size, but this API is error prone because data offset
must always be insync with bio's data offset. That is why we have
integrity update hook in bio_advance()
So only meaningful values are: offset == 0, sectors == bio_sectors(bio)
Let's just remove them completely.
changes from v1:
- remove 'sectors' arguments
Signed-off-by: Dmitry Monakhov <[email protected]>
---
block/bio-integrity.c | 11 ++---------
block/bio.c | 4 ++--
drivers/md/dm.c | 2 +-
include/linux/bio.h | 5 ++---
4 files changed, 7 insertions(+), 15 deletions(-)
diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 82a6ffb..6170dad 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -433,22 +433,15 @@ EXPORT_SYMBOL(bio_integrity_advance);
/**
* bio_integrity_trim - Trim integrity vector
* @bio: bio whose integrity vector to update
- * @offset: offset to first data sector
- * @sectors: number of data sectors
*
* Description: Used to trim the integrity vector in a cloned bio.
- * The ivec will be advanced corresponding to 'offset' data sectors
- * and the length will be truncated corresponding to 'len' data
- * sectors.
*/
-void bio_integrity_trim(struct bio *bio, unsigned int offset,
- unsigned int sectors)
+void bio_integrity_trim(struct bio *bio)
{
struct bio_integrity_payload *bip = bio_integrity(bio);
struct blk_integrity *bi = bdev_get_integrity(bio->bi_bdev);
- bio_integrity_advance(bio, offset << 9);
- bip->bip_iter.bi_size = bio_integrity_bytes(bi, sectors);
+ bip->bip_iter.bi_size = bio_integrity_bytes(bi, bio_sectors(bio));
}
EXPORT_SYMBOL(bio_integrity_trim);
diff --git a/block/bio.c b/block/bio.c
index fa84323..65da4c9 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1878,7 +1878,7 @@ struct bio *bio_split(struct bio *bio, int sectors,
split->bi_iter.bi_size = sectors << 9;
if (bio_integrity(split))
- bio_integrity_trim(split, 0, sectors);
+ bio_integrity_trim(split);
bio_advance(bio, split->bi_iter.bi_size);
@@ -1909,7 +1909,7 @@ void bio_trim(struct bio *bio, int offset, int size)
bio->bi_iter.bi_size = size;
if (bio_integrity(bio))
- bio_integrity_trim(bio, 0, size);
+ bio_integrity_trim(bio);
}
EXPORT_SYMBOL_GPL(bio_trim);
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index dfb7597..25c5a8b 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1102,7 +1102,7 @@ static int clone_bio(struct dm_target_io *tio, struct bio *bio,
clone->bi_iter.bi_size = to_bytes(len);
if (bio_integrity(bio))
- bio_integrity_trim(clone, 0, len);
+ bio_integrity_trim(clone);
return 0;
}
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 8e52119..7f7bf37 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -731,7 +731,7 @@ extern bool bio_integrity_enabled(struct bio *bio);
extern int bio_integrity_prep(struct bio *);
extern void bio_integrity_endio(struct bio *);
extern void bio_integrity_advance(struct bio *, unsigned int);
-extern void bio_integrity_trim(struct bio *, unsigned int, unsigned int);
+extern void bio_integrity_trim(struct bio *);
extern int bio_integrity_clone(struct bio *, struct bio *, gfp_t);
extern int bioset_integrity_create(struct bio_set *, int);
extern void bioset_integrity_free(struct bio_set *);
@@ -781,8 +781,7 @@ static inline void bio_integrity_advance(struct bio *bio,
return;
}
-static inline void bio_integrity_trim(struct bio *bio, unsigned int offset,
- unsigned int sectors)
+static inline void bio_integrity_trim(struct bio *bio)
{
return;
}
--
2.9.3
SCSI drivers do care about bip_seed so we must update it accordingly.
Signed-off-by: Dmitry Monakhov <[email protected]>
---
block/bio-integrity.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index b5009a8..82a6ffb 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -425,6 +425,7 @@ void bio_integrity_advance(struct bio *bio, unsigned int bytes_done)
struct blk_integrity *bi = bdev_get_integrity(bio->bi_bdev);
unsigned bytes = bio_integrity_bytes(bi, bytes_done >> 9);
+ bip->bip_iter.bi_sector += bytes_done >> 9;
bvec_iter_advance(bip->bip_vec, &bip->bip_iter, bytes);
}
EXPORT_SYMBOL(bio_integrity_advance);
--
2.9.3
If bio has no data, such as ones from blkdev_issue_flush(),
then we have nothing to protect.
This patch prevent bugon like follows:
kfree_debugcheck: out of range ptr ac1fa1d106742a5ah
kernel BUG at mm/slab.c:2773!
invalid opcode: 0000 [#1] SMP
Modules linked in: bcache
CPU: 0 PID: 4428 Comm: xfs_io Tainted: G W 4.11.0-rc4-ext4-00041-g2ef0043-dirty #43
Hardware name: Virtuozzo KVM, BIOS seabios-1.7.5-11.vz7.4 04/01/2014
task: ffff880137786440 task.stack: ffffc90000ba8000
RIP: 0010:kfree_debugcheck+0x25/0x2a
RSP: 0018:ffffc90000babde0 EFLAGS: 00010082
RAX: 0000000000000034 RBX: ac1fa1d106742a5a RCX: 0000000000000007
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88013f3ccb40
RBP: ffffc90000babde8 R08: 0000000000000000 R09: 0000000000000000
R10: 00000000fcb76420 R11: 00000000725172ed R12: 0000000000000282
R13: ffffffff8150e766 R14: ffff88013a145e00 R15: 0000000000000001
FS: 00007fb09384bf40(0000) GS:ffff88013f200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fd0172f9e40 CR3: 0000000137fa9000 CR4: 00000000000006f0
Call Trace:
kfree+0xc8/0x1b3
bio_integrity_free+0xc3/0x16b
bio_free+0x25/0x66
bio_put+0x14/0x26
blkdev_issue_flush+0x7a/0x85
blkdev_fsync+0x35/0x42
vfs_fsync_range+0x8e/0x9f
vfs_fsync+0x1c/0x1e
do_fsync+0x31/0x4a
SyS_fsync+0x10/0x14
entry_SYSCALL_64_fastpath+0x1f/0xc2
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Dmitry Monakhov <[email protected]>
---
block/bio-integrity.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 5384713..b5009a8 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -175,6 +175,9 @@ bool bio_integrity_enabled(struct bio *bio)
if (bio_op(bio) != REQ_OP_READ && bio_op(bio) != REQ_OP_WRITE)
return false;
+ if (!bio_sectors(bio))
+ return false;
+
/* Already protected? */
if (bio_integrity(bio))
return false;
--
2.9.3
On 04/04/2017 08:56 PM, Dmitry Monakhov wrote:
> If bio has no data, such as ones from blkdev_issue_flush(),
> then we have nothing to protect.
>
> This patch prevent bugon like follows:
>
> kfree_debugcheck: out of range ptr ac1fa1d106742a5ah
> kernel BUG at mm/slab.c:2773!
> invalid opcode: 0000 [#1] SMP
> Modules linked in: bcache
> CPU: 0 PID: 4428 Comm: xfs_io Tainted: G W 4.11.0-rc4-ext4-00041-g2ef0043-dirty #43
> Hardware name: Virtuozzo KVM, BIOS seabios-1.7.5-11.vz7.4 04/01/2014
> task: ffff880137786440 task.stack: ffffc90000ba8000
> RIP: 0010:kfree_debugcheck+0x25/0x2a
> RSP: 0018:ffffc90000babde0 EFLAGS: 00010082
> RAX: 0000000000000034 RBX: ac1fa1d106742a5a RCX: 0000000000000007
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88013f3ccb40
> RBP: ffffc90000babde8 R08: 0000000000000000 R09: 0000000000000000
> R10: 00000000fcb76420 R11: 00000000725172ed R12: 0000000000000282
> R13: ffffffff8150e766 R14: ffff88013a145e00 R15: 0000000000000001
> FS: 00007fb09384bf40(0000) GS:ffff88013f200000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fd0172f9e40 CR3: 0000000137fa9000 CR4: 00000000000006f0
> Call Trace:
> kfree+0xc8/0x1b3
> bio_integrity_free+0xc3/0x16b
> bio_free+0x25/0x66
> bio_put+0x14/0x26
> blkdev_issue_flush+0x7a/0x85
> blkdev_fsync+0x35/0x42
> vfs_fsync_range+0x8e/0x9f
> vfs_fsync+0x1c/0x1e
> do_fsync+0x31/0x4a
> SyS_fsync+0x10/0x14
> entry_SYSCALL_64_fastpath+0x1f/0xc2
>
> Reviewed-by: Christoph Hellwig <[email protected]>
> Signed-off-by: Dmitry Monakhov <[email protected]>
> ---
> block/bio-integrity.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/block/bio-integrity.c b/block/bio-integrity.c
> index 5384713..b5009a8 100644
> --- a/block/bio-integrity.c
> +++ b/block/bio-integrity.c
> @@ -175,6 +175,9 @@ bool bio_integrity_enabled(struct bio *bio)
> if (bio_op(bio) != REQ_OP_READ && bio_op(bio) != REQ_OP_WRITE)
> return false;
>
> + if (!bio_sectors(bio))
> + return false;
> +
> /* Already protected? */
> if (bio_integrity(bio))
> return false;
>
Reviewed-by: Hannes Reinecke <[email protected]>
Cheers,
Hannes
--
Dr. Hannes Reinecke Teamlead Storage & Networking
[email protected] +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: F. Imend?rffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG N?rnberg)
On 04/04/2017 08:56 PM, Dmitry Monakhov wrote:
> Reviewed-by: Christoph Hellwig <[email protected]>
> Signed-off-by: Dmitry Monakhov <[email protected]>
> ---
> block/bio.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/block/bio.c b/block/bio.c
> index e75878f..fa84323 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -1907,6 +1907,10 @@ void bio_trim(struct bio *bio, int offset, int size)
> bio_advance(bio, offset << 9);
>
> bio->bi_iter.bi_size = size;
> +
> + if (bio_integrity(bio))
> + bio_integrity_trim(bio, 0, size);
> +
> }
> EXPORT_SYMBOL_GPL(bio_trim);
>
>
Reviewed-by: Hannes Reinecke <[email protected]>
Cheers,
Hannes
--
Dr. Hannes Reinecke Teamlead Storage & Networking
[email protected] +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: F. Imend?rffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG N?rnberg)
On 04/04/2017 08:56 PM, Dmitry Monakhov wrote:
> SCSI drivers do care about bip_seed so we must update it accordingly.
>
> Signed-off-by: Dmitry Monakhov <[email protected]>
> ---
> block/bio-integrity.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/block/bio-integrity.c b/block/bio-integrity.c
> index b5009a8..82a6ffb 100644
> --- a/block/bio-integrity.c
> +++ b/block/bio-integrity.c
> @@ -425,6 +425,7 @@ void bio_integrity_advance(struct bio *bio, unsigned int bytes_done)
> struct blk_integrity *bi = bdev_get_integrity(bio->bi_bdev);
> unsigned bytes = bio_integrity_bytes(bi, bytes_done >> 9);
>
> + bip->bip_iter.bi_sector += bytes_done >> 9;
> bvec_iter_advance(bip->bip_vec, &bip->bip_iter, bytes);
> }
> EXPORT_SYMBOL(bio_integrity_advance);
>
Odd that we've missed that one...
Reviewed-by: Hannes Reinecke <[email protected]>
Cheers,
Hannes
--
Dr. Hannes Reinecke Teamlead Storage & Networking
[email protected] +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: F. Imend?rffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG N?rnberg)
On 04/04/2017 08:56 PM, Dmitry Monakhov wrote:
> bio_integrity_trim inherent it's interface from bio_trim and accept
> offset and size, but this API is error prone because data offset
> must always be insync with bio's data offset. That is why we have
> integrity update hook in bio_advance()
>
> So only meaningful values are: offset == 0, sectors == bio_sectors(bio)
> Let's just remove them completely.
>
> changes from v1:
> - remove 'sectors' arguments
>
> Signed-off-by: Dmitry Monakhov <[email protected]>
> ---
> block/bio-integrity.c | 11 ++---------
> block/bio.c | 4 ++--
> drivers/md/dm.c | 2 +-
> include/linux/bio.h | 5 ++---
> 4 files changed, 7 insertions(+), 15 deletions(-)
>
Reviewed-by: Hannes Reinecke <[email protected]>
Cheers,
Hannes
--
Dr. Hannes Reinecke Teamlead Storage & Networking
[email protected] +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: F. Imend?rffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG N?rnberg)
On 04/04/2017 08:56 PM, Dmitry Monakhov wrote:
> Currently all integrity prep hooks are open-coded, and if prepare fails
> we ignore it's code and fail bio with EIO. Let's return real error to
> upper layer, so later caller may react accordingly.
>
> In fact no one want to use bio_integrity_prep() w/o bio_integrity_enabled,
> so it is reasonable to fold it in to one function.
>
> Signed-off-by: Dmitry Monakhov <[email protected]>
> ---
> Documentation/block/data-integrity.txt | 3 --
> block/bio-integrity.c | 88 ++++++++++++++++++----------------
> block/blk-core.c | 5 +-
> block/blk-mq.c | 8 +---
> drivers/nvdimm/blk.c | 13 +----
> drivers/nvdimm/btt.c | 13 +----
> include/linux/bio.h | 6 ---
> 7 files changed, 55 insertions(+), 81 deletions(-)
>
Reviewed-by: Hannes Reinecke <[email protected]>
Cheers,
Hannes
--
Dr. Hannes Reinecke Teamlead Storage & Networking
[email protected] +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: F. Imend?rffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG N?rnberg)
On 04/04/2017 08:56 PM, Dmitry Monakhov wrote:
> Signed-off-by: Dmitry Monakhov <[email protected]>
> ---
> block/t10-pi.c | 9 +++------
> drivers/scsi/lpfc/lpfc_scsi.c | 5 +++--
> drivers/scsi/qla2xxx/qla_isr.c | 8 ++++----
> drivers/target/target_core_sbc.c | 2 +-
> include/linux/t10-pi.h | 2 ++
> 5 files changed, 13 insertions(+), 13 deletions(-)
>
Reviewed-by: Hannes Reinecke <[email protected]>
Cheers,
Hannes
--
Dr. Hannes Reinecke Teamlead Storage & Networking
[email protected] +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: F. Imend?rffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG N?rnberg)
On 04/04/2017 08:56 PM, Dmitry Monakhov wrote:
> Currently if some one try to advance bvec beyond it's size we simply
> dump WARN_ONCE and continue to iterate beyond bvec array boundaries.
> This simply means that we endup dereferencing/corrupting random memory
> region.
>
> Sane reaction would be to propagate error back to calling context
> But bvec_iter_advance's calling context is not always good for error
> handling. For safity reason let truncate iterator size to zero which
> will break external iteration loop which prevent us from unpredictable
> memory range corruption. And even it caller ignores an error, it will
> corrupt it's own bvecs, not others.
>
> This patch does:
> - Return error back to caller with hope that it will react on this
> - Truncate iterator size
>
> Code was added long time ago here 4550dd6c, luckily no one hit it
> in real life :)
>
> changes since V1:
> - Replace BUG_ON with error logic.
>
> Signed-off-by: Dmitry Monakhov <[email protected]>
> ---
> drivers/nvdimm/blk.c | 4 +++-
> drivers/nvdimm/btt.c | 4 +++-
> include/linux/bio.h | 8 ++++++--
> include/linux/bvec.h | 11 ++++++++---
> 4 files changed, 20 insertions(+), 7 deletions(-)
>
Reviewed-by: Hannes Reinecke <[email protected]>
Cheers,
Hannes
--
Dr. Hannes Reinecke Teamlead Storage & Networking
[email protected] +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: F. Imend?rffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG N?rnberg)
On 04/04/2017 08:56 PM, Dmitry Monakhov wrote:
> Some ->bi_end_io handlers (for example: pi_verify or decrypt handlers)
> need to know original data vector, but after bio traverse io-stack it may
> be advanced, splited and relocated many times so it is hard to guess
> original iterator. Let's add 'bi_done' conter which accounts number
> of bytes iterator was advanced during it's evolution. Later end_io handler
> may easily restore original iterator by rewinding iterator to
> iter->bi_done.
>
> Note: this change makes sizeof (struct bvec_iter) multiple to 8
> Signed-off-by: Dmitry Monakhov <[email protected]>
> ---
> include/linux/bio.h | 21 +++++++++++++++++++--
> include/linux/bvec.h | 26 ++++++++++++++++++++++++++
> 2 files changed, 45 insertions(+), 2 deletions(-)
>
Reviewed-by: Hannes Reinecke <[email protected]>
Cheers,
Hannes
--
Dr. Hannes Reinecke Teamlead Storage & Networking
[email protected] +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: F. Imend?rffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG N?rnberg)
On 04/04/2017 08:56 PM, Dmitry Monakhov wrote:
> Currently ->verify_fn not woks at all because at the moment it is called
> bio->bi_iter.bi_size == 0, so we do not iterate integrity bvecs at all.
>
> In order to perform verification we need to know original data vector,
> with new bvec rewind API this is trivial.
>
> testcase: https://github.com/dmonakhov/xfstests/commit/3c6509eaa83b9c17cd0bc95d73fcdd76e1c54a85
> Signed-off-by: Dmitry Monakhov <[email protected]>
> ---
> block/bio-integrity.c | 22 +++++++++++++++-------
> 1 file changed, 15 insertions(+), 7 deletions(-)
>
Reviewed-by: Hannes Reinecke <[email protected]>
Cheers,
Hannes
--
Dr. Hannes Reinecke Teamlead Storage & Networking
[email protected] +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: F. Imend?rffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG N?rnberg)
Looks good,
Reviewed-by: Christoph Hellwig <[email protected]>
Please drop the v2 from the subject line and move your changelog either
into the cover letter (preferred) or at least below the --- line.
Otherwise looks good:
Reviewed-by: Christoph Hellwig <[email protected]>
> --- a/Documentation/block/data-integrity.txt
> +++ b/Documentation/block/data-integrity.txt
> @@ -202,9 +202,6 @@ will require extra work due to the application tag.
> added. It is up to the caller to ensure that the bio does not
> change while I/O is in progress.
>
> - bio_integrity_prep() should only be called if
> - bio_integrity_enabled() returned 1.
> -
This should grow a blurb that bio_integrity_prep will complete the bio
if an error occurs.
> /**
> - * bio_integrity_prep - Prepare bio for integrity I/O
> + * bio_integrity_setup - Prepare bio for integrity I/O
> * @bio: bio to prepare
> *
> * Description: Allocates a buffer for integrity metadata, maps the
> @@ -269,7 +231,7 @@ static int bio_integrity_process(struct bio *bio,
> * block device's integrity function. In the READ case, the buffer
> * will be prepared for DMA and a suitable end_io handler set up.
> */
> -int bio_integrity_prep(struct bio *bio)
> +static int bio_integrity_setup(struct bio *bio)
Instead of renaming the function I'd suggest simply merging the two,
the end result will be smaller and more readable.
> + if (bio_data_dir(bio) == READ && bi->profile->verify_fn != NULL &&
> + (bi->flags & BLK_INTEGRITY_VERIFY))
> + goto need_prep;
> +
> + if (bio_data_dir(bio) == WRITE && bi->profile->generate_fn != NULL &&
> + (bi->flags & BLK_INTEGRITY_GENERATE))
> + goto need_prep;
I'd invert the conditions to avoid the goto:
if (bio_data_dir(bio) == READ) {
if (!bi->profile->verify_fn ||
!(bi->flags & BLK_INTEGRITY_VERIFY))
return 0;
} else {
if (!bi->profile->generate_fn ||
!(bi->flags & BLK_INTEGRITY_GENERATE))
return 0;
}
Looks good,
Reviewed-by: Christoph Hellwig <[email protected]>
> if (bio_no_advance_iter(bio))
> iter->bi_size -= bytes;
> - else
> - bvec_iter_advance(bio->bi_io_vec, iter, bytes);
> + else {
> + int err;
> + err = bvec_iter_advance(bio->bi_io_vec, iter, bytes);
> + if (unlikely(err))
> + bio->bi_error = err;
> + }
I don't think that setting bi_error here is a good idea without actually
completing the bio, which would be a much bigger change.
Maybe leave the error ignored here for now with a fixme comment, and
then we can look into proper error handling in a separate series.
Looks good,
Reviewed-by: Christoph Hellwig <[email protected]>
On Tue, Apr 04, 2017 at 10:56:40PM +0400, Dmitry Monakhov wrote:
> Some ->bi_end_io handlers (for example: pi_verify or decrypt handlers)
Just curious: are you planning to use it for the latter as well in
future patches? If not can you at least give the relevant maintainers
a headsup?
Looks good,
Reviewed-by: Christoph Hellwig <[email protected]>
Dmitry Monakhov <[email protected]> writes:
> If bio has no data, such as ones from blkdev_issue_flush(),
> then we have nothing to protect.
Reviewed-by: Martin K. Petersen <[email protected]>
--
Martin K. Petersen Oracle Linux Engineering
Dmitry Monakhov <[email protected]> writes:
Reviewed-by: Martin K. Petersen <[email protected]>
--
Martin K. Petersen Oracle Linux Engineering
Dmitry Monakhov <[email protected]> writes:
> SCSI drivers do care about bip_seed so we must update it accordingly.
> + bip->bip_iter.bi_sector += bytes_done >> 9;
This needs to count protection intervals. Otherwise things will break
for block sizes different from 512 bytes.
--
Martin K. Petersen Oracle Linux Engineering
Dmitry Monakhov <[email protected]> writes:
> bio_integrity_trim inherent it's interface from bio_trim and accept
> offset and size, but this API is error prone because data offset must
> always be insync with bio's data offset. That is why we have integrity
> update hook in bio_advance()
>
> So only meaningful values are: offset == 0, sectors ==
> bio_sectors(bio) Let's just remove them completely.
This interface predates the iter/immutable stuff. It should be fine to
remove the call arguments now.
Reviewed-by: Martin K. Petersen <[email protected]>
--
Martin K. Petersen Oracle Linux Engineering
Dmitry Monakhov <[email protected]> writes:
> Currently all integrity prep hooks are open-coded, and if prepare
> fails we ignore it's code and fail bio with EIO. Let's return real
> error to upper layer, so later caller may react accordingly.
>
> In fact no one want to use bio_integrity_prep() w/o
> bio_integrity_enabled, so it is reasonable to fold it in to one
> function.
Agree with Christoph's comments. Otherwise OK.
--
Martin K. Petersen Oracle Linux Engineering
Dmitry Monakhov <[email protected]> writes:
Dmitry,
> diff --git a/include/linux/t10-pi.h b/include/linux/t10-pi.h
> index 9fba9dd..f892442 100644
> --- a/include/linux/t10-pi.h
> +++ b/include/linux/t10-pi.h
> @@ -33,6 +33,8 @@ struct t10_pi_tuple {
> __be32 ref_tag; /* Target LBA or indirect LBA */
> };
>
> +#define T10_APP_ESCAPE cpu_to_be16(0xffff)
> +#define T10_REF_ESCAPE cpu_to_be32(0xffffffff)
T10 just means SCSI and does not imply protection information. Please
make these T10_PI_{REF,APP}_ESCAPE (ideally in an enum).
--
Martin K. Petersen Oracle Linux Engineering
Dmitry Monakhov <[email protected]> writes:
Dmitry,
> Some ->bi_end_io handlers (for example: pi_verify or decrypt handlers)
> need to know original data vector, but after bio traverse io-stack it
> may be advanced, splited and relocated many times so it is hard to
> guess original iterator. Let's add 'bi_done' conter which accounts
> number of bytes iterator was advanced during it's evolution. Later
> end_io handler may easily restore original iterator by rewinding
> iterator to iter->bi_done.
Originally, the original integrity bip was immutable. Any slicing and
dicing was exclusively done on clones. And therefore the original bip
was always suitable for use with verify.
--
Martin K. Petersen Oracle Linux Engineering