The NVM Express protocol added enhancements to the data integrity field
formats beyond the T10 defined protection information. A detailed
description of the new formats can be found in the NVMe's NVM Command
Set Specification, section 5.2, available at:
https://nvmexpress.org/wp-content/uploads/NVM-Command-Set-Specification-1.0b-2021.12.18-Ratified.pdf
This series implements one possible new format: the CRC64 guard with
48-bit reference tags. This does not add support for the variable
"storage tag" field, or any potential hardware acceleration.
Changes since RFC v1:
- Generated the reflected CRC table and cacluated CRC accordingly
instead of reflecting the input/output (Eric Biggers, patch 3)
- Fixed Kconfig.debug dependency for CRC tests (patch 4)
- Fixed endian conversion sparse error (patch 7).
- Added support for PRACT (Klaus Jensen, patch 7)
Keith Busch (7):
block: support pi with extended metadata
nvme: allow integrity on extended metadata formats
lib: add rocksoft model crc64
lib: add crc64 tests
asm-generic: introduce be48 unaligned accessors
block: add pi for nvme enhanced integrity
nvme: add support for enhanced metadata
block/Kconfig | 1 +
block/bio-integrity.c | 1 +
block/t10-pi.c | 198 +++++++++++++++++++++++++++++++-
drivers/nvme/host/core.c | 167 ++++++++++++++++++++++-----
drivers/nvme/host/nvme.h | 4 +-
include/asm-generic/unaligned.h | 26 +++++
include/linux/blk-integrity.h | 1 +
include/linux/crc64.h | 2 +
include/linux/nvme.h | 53 ++++++++-
include/linux/t10-pi.h | 20 ++++
lib/Kconfig.debug | 4 +
lib/Makefile | 1 +
lib/crc64.c | 26 +++++
lib/gen_crc64table.c | 51 ++++++--
lib/test_crc64.c | 68 +++++++++++
15 files changed, 576 insertions(+), 47 deletions(-)
create mode 100644 lib/test_crc64.c
--
2.25.4
The nvme spec allows protection information formats with metadata
extending beyond the pi field. Use the actual size of the metadata field
for incrementing the protection buffer.
Signed-off-by: Keith Busch <[email protected]>
---
block/bio-integrity.c | 1 +
block/t10-pi.c | 4 ++--
include/linux/blk-integrity.h | 1 +
3 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index d25114715459..e40b1f965960 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -165,6 +165,7 @@ static blk_status_t bio_integrity_process(struct bio *bio,
iter.disk_name = bio->bi_bdev->bd_disk->disk_name;
iter.interval = 1 << bi->interval_exp;
+ iter.tuple_size = bi->tuple_size;
iter.seed = proc_iter->bi_sector;
iter.prot_buf = bvec_virt(bip->bip_vec);
diff --git a/block/t10-pi.c b/block/t10-pi.c
index 25a52a2a09a8..758a76518854 100644
--- a/block/t10-pi.c
+++ b/block/t10-pi.c
@@ -44,7 +44,7 @@ static blk_status_t t10_pi_generate(struct blk_integrity_iter *iter,
pi->ref_tag = 0;
iter->data_buf += iter->interval;
- iter->prot_buf += sizeof(struct t10_pi_tuple);
+ iter->prot_buf += iter->tuple_size;
iter->seed++;
}
@@ -93,7 +93,7 @@ static blk_status_t t10_pi_verify(struct blk_integrity_iter *iter,
next:
iter->data_buf += iter->interval;
- iter->prot_buf += sizeof(struct t10_pi_tuple);
+ iter->prot_buf += iter->tuple_size;
iter->seed++;
}
diff --git a/include/linux/blk-integrity.h b/include/linux/blk-integrity.h
index 8a038ea0717e..378b2459efe2 100644
--- a/include/linux/blk-integrity.h
+++ b/include/linux/blk-integrity.h
@@ -19,6 +19,7 @@ struct blk_integrity_iter {
sector_t seed;
unsigned int data_size;
unsigned short interval;
+ unsigned char tuple_size;
const char *disk_name;
};
--
2.25.4
Provide a module to test the rocksoft crc64 calculations with well known
inputs and exepected values.
Signed-off-by: Keith Busch <[email protected]>
---
v1->v2:
Fixed Kconfig dependency
lib/Kconfig.debug | 4 +++
lib/Makefile | 1 +
lib/test_crc64.c | 68 +++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 73 insertions(+)
create mode 100644 lib/test_crc64.c
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 14b89aa37c5c..149de11ae903 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -2214,6 +2214,10 @@ config TEST_UUID
config TEST_XARRAY
tristate "Test the XArray code at runtime"
+config TEST_CRC64
+ depends on CRC64
+ tristate "Test the crc64 code at runtime"
+
config TEST_OVERFLOW
tristate "Test check_*_overflow() functions at runtime"
diff --git a/lib/Makefile b/lib/Makefile
index 300f569c626b..e100a4d6a950 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -103,6 +103,7 @@ obj-$(CONFIG_TEST_HMM) += test_hmm.o
obj-$(CONFIG_TEST_FREE_PAGES) += test_free_pages.o
obj-$(CONFIG_KPROBES_SANITY_TEST) += test_kprobes.o
obj-$(CONFIG_TEST_REF_TRACKER) += test_ref_tracker.o
+obj-$(CONFIG_TEST_CRC64) += test_crc64.o
#
# CFLAGS for compiling floating point code inside the kernel. x86/Makefile turns
# off the generation of FPU/SSE* instructions for kernel proper but FPU_FLAGS
diff --git a/lib/test_crc64.c b/lib/test_crc64.c
new file mode 100644
index 000000000000..283fef8f110e
--- /dev/null
+++ b/lib/test_crc64.c
@@ -0,0 +1,68 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Tests were selected from NVM Express NVM Command Set Specification 1.0a,
+ * section 5.2.1.3.5 "64b CRC Test Cases" available here:
+ *
+ * https://nvmexpress.org/wp-content/uploads/NVMe-NVM-Command-Set-Specification-1.0a-2021.07.26-Ratified.pdf
+ *
+ * Copyright 2022 Keith Busch <[email protected]>
+ */
+
+#include <linux/crc64.h>
+#include <linux/module.h>
+
+static unsigned int tests_passed;
+static unsigned int tests_run;
+
+#define ALL_ZEROS 0x6482D367EB22B64EULL
+#define ALL_FFS 0xC0DDBA7302ECA3ACULL
+#define INC 0x3E729F5F6750449CULL
+#define DEC 0x9A2DF64B8E9E517EULL
+
+static u8 buffer[4096];
+
+#define CRC_CHECK(c, v) do { \
+ tests_run++; \
+ if (c != v) \
+ printk("BUG at %s:%d expected:%llx got:%llx\n", \
+ __func__, __LINE__, v, c); \
+ else \
+ tests_passed++; \
+} while (0)
+
+
+static int crc_tests(void)
+{
+ __u64 crc;
+ int i;
+
+ memset(buffer, 0, sizeof(buffer));
+ crc = crc64_rocksoft(~0ULL, buffer, 4096);
+ CRC_CHECK(crc, ALL_ZEROS);
+
+ memset(buffer, 0xff, sizeof(buffer));
+ crc = crc64_rocksoft(~0ULL, buffer, 4096);
+ CRC_CHECK(crc, ALL_FFS);
+
+ for (i = 0; i < 4096; i++)
+ buffer[i] = i & 0xff;
+ crc = crc64_rocksoft(~0ULL, buffer, 4096);
+ CRC_CHECK(crc, INC);
+
+ for (i = 0; i < 4096; i++)
+ buffer[i] = 0xff - (i & 0xff);
+ crc = crc64_rocksoft(~0ULL, buffer, 4096);
+ CRC_CHECK(crc, DEC);
+
+ printk("CRC64: %u of %u tests passed\n", tests_passed, tests_run);
+ return (tests_run == tests_passed) ? 0 : -EINVAL;
+}
+
+static void crc_exit(void)
+{
+}
+
+module_init(crc_tests);
+module_exit(crc_exit);
+MODULE_AUTHOR("Keith Busch <[email protected]>");
+MODULE_LICENSE("GPL");
--
2.25.4
The NVMe protocol extended the data integrity fields with unaligned
48-bit reference tags. Provide some helper accessors in
preparation for these.
Acked-by: Arnd Bergmann <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
---
include/asm-generic/unaligned.h | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)
diff --git a/include/asm-generic/unaligned.h b/include/asm-generic/unaligned.h
index 1c4242416c9f..8fc637379899 100644
--- a/include/asm-generic/unaligned.h
+++ b/include/asm-generic/unaligned.h
@@ -126,4 +126,30 @@ static inline void put_unaligned_le24(const u32 val, void *p)
__put_unaligned_le24(val, p);
}
+static inline void __put_unaligned_be48(const u64 val, __u8 *p)
+{
+ *p++ = val >> 40;
+ *p++ = val >> 32;
+ *p++ = val >> 24;
+ *p++ = val >> 16;
+ *p++ = val >> 8;
+ *p++ = val;
+}
+
+static inline void put_unaligned_be48(const u64 val, void *p)
+{
+ __put_unaligned_be48(val, p);
+}
+
+static inline u64 __get_unaligned_be48(const u8 *p)
+{
+ return (u64)p[0] << 40 | (u64)p[1] << 32 | p[2] << 24 |
+ p[3] << 16 | p[4] << 8 | p[5];
+}
+
+static inline u64 get_unaligned_be48(const void *p)
+{
+ return __get_unaligned_be48(p);
+}
+
#endif /* __ASM_GENERIC_UNALIGNED_H */
--
2.25.4
On 2/1/22 20:01, Keith Busch wrote:
> The nvme spec allows protection information formats with metadata
> extending beyond the pi field. Use the actual size of the metadata field
> for incrementing the protection buffer.
>
> Signed-off-by: Keith Busch <[email protected]>
> ---
> block/bio-integrity.c | 1 +
> block/t10-pi.c | 4 ++--
> include/linux/blk-integrity.h | 1 +
> 3 files changed, 4 insertions(+), 2 deletions(-)
>
Reviewed-by: Hannes Reinecke <[email protected]>
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
[email protected] +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer
Keith,
> The NVMe protocol extended the data integrity fields with unaligned
> 48-bit reference tags. Provide some helper accessors in
> preparation for these.
Looks good. Needed this a few times in the past.
Reviewed-by: Martin K. Petersen <[email protected]>
--
Martin K. Petersen Oracle Linux Engineering
Keith,
> The nvme spec allows protection information formats with metadata
> extending beyond the pi field.
This may be true but it seems the rationale for the patch in the context
of this series is to enable PI metadata bigger than t10_pi_tuple?
Reviewed-by: Martin K. Petersen <[email protected]>
--
Martin K. Petersen Oracle Linux Engineering
The NVM Express specification extended data integrity fields to 64 bits
using the Rocksoft^TM parameters. Add the poly to the crc64 table
generation, and provide a library routine implementing the algorithm.
The Rocksoft 64-bit CRC model parameters are as follows:
Poly: 0xAD93D23594C93659
Initial value: 0xFFFFFFFFFFFFFFFF
Reflected Input: True
Reflected Output: True
Xor Final: 0xFFFFFFFFFFFFFFFF
Since this model used reflected bits, the implementation generates the
reflected table so the result is ordered consistently.
Cc: Eric Biggers <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
---
v1->v2:
Generate a reflected table for the polynomial so that the inputs and
outputs don't need to be reflected during calculating the CRC
include/linux/crc64.h | 2 ++
lib/crc64.c | 26 ++++++++++++++++++++++
lib/gen_crc64table.c | 51 +++++++++++++++++++++++++++++++++----------
3 files changed, 68 insertions(+), 11 deletions(-)
diff --git a/include/linux/crc64.h b/include/linux/crc64.h
index c756e65a1b58..9f2f20216503 100644
--- a/include/linux/crc64.h
+++ b/include/linux/crc64.h
@@ -8,4 +8,6 @@
#include <linux/types.h>
u64 __pure crc64_be(u64 crc, const void *p, size_t len);
+u64 __pure crc64_rocksoft(u64 crc, const void *p, size_t len);
+
#endif /* _LINUX_CRC64_H */
diff --git a/lib/crc64.c b/lib/crc64.c
index 9f852a89ee2a..e223e72be44d 100644
--- a/lib/crc64.c
+++ b/lib/crc64.c
@@ -22,6 +22,13 @@
* x^24 + x^23 + x^22 + x^21 + x^19 + x^17 + x^13 + x^12 + x^10 + x^9 +
* x^7 + x^4 + x + 1
*
+ * crc64rocksoft[256] table is from the Rocksoft specification polynomial
+ * defined as,
+ *
+ * x^64 + x^63 + x^61 + x^59 + x^58 + x^56 + x^55 + x^52 + x^49 + x^48 + x^47 +
+ * x^46 + x^44 + x^41 + x^37 + x^36 + x^34 + x^32 + x^31 + x^28 + x^26 + x^23 +
+ * x^22 + x^19 + x^16 + x^13 + x^12 + x^10 + x^9 + x^6 + x^4 + x^3 + 1
+ *
* Copyright 2018 SUSE Linux.
* Author: Coly Li <[email protected]>
*/
@@ -55,3 +62,22 @@ u64 __pure crc64_be(u64 crc, const void *p, size_t len)
return crc;
}
EXPORT_SYMBOL_GPL(crc64_be);
+
+/**
+ * crc64_rocksoft - Calculate bitwise Rocksoft CRC64
+ * @crc: seed value for computation. (u64)~0 for a new CRC calculation, or the
+ * previous crc64 value if computing incrementally.
+ * @p: pointer to buffer over which CRC64 is run
+ * @len: length of buffer @p
+ */
+u64 __pure crc64_rocksoft(u64 crc, const void *p, size_t len)
+{
+ const unsigned char *_p = p;
+ size_t i;
+
+ for (i = 0; i < len; i++)
+ crc = (crc >> 8) ^ crc64rocksofttable[(crc & 0xff) ^ *_p++];
+
+ return crc ^ (u64)~0;
+}
+EXPORT_SYMBOL_GPL(crc64_rocksoft);
diff --git a/lib/gen_crc64table.c b/lib/gen_crc64table.c
index 094b43aef8db..55e222acd0b8 100644
--- a/lib/gen_crc64table.c
+++ b/lib/gen_crc64table.c
@@ -17,10 +17,30 @@
#include <stdio.h>
#define CRC64_ECMA182_POLY 0x42F0E1EBA9EA3693ULL
+#define CRC64_ROCKSOFT_POLY 0x9A6C9329AC4BC9B5ULL
static uint64_t crc64_table[256] = {0};
+static uint64_t crc64_rocksoft_table[256] = {0};
-static void generate_crc64_table(void)
+static void generate_reflected_crc64_table(uint64_t table[256], uint64_t poly)
+{
+ uint64_t i, j, c, crc;
+
+ for (i = 0; i < 256; i++) {
+ crc = 0ULL;
+ c = i;
+
+ for (j = 0; j < 8; j++) {
+ if ((crc ^ (c >> j)) & 1)
+ crc = (crc >> 1) ^ poly;
+ else
+ crc >>= 1;
+ }
+ table[i] = crc;
+ }
+}
+
+static void generate_crc64_table(uint64_t table[256], uint64_t poly)
{
uint64_t i, j, c, crc;
@@ -30,26 +50,22 @@ static void generate_crc64_table(void)
for (j = 0; j < 8; j++) {
if ((crc ^ c) & 0x8000000000000000ULL)
- crc = (crc << 1) ^ CRC64_ECMA182_POLY;
+ crc = (crc << 1) ^ poly;
else
crc <<= 1;
c <<= 1;
}
- crc64_table[i] = crc;
+ table[i] = crc;
}
}
-static void print_crc64_table(void)
+static void output_table(uint64_t table[256])
{
int i;
- printf("/* this file is generated - do not edit */\n\n");
- printf("#include <linux/types.h>\n");
- printf("#include <linux/cache.h>\n\n");
- printf("static const u64 ____cacheline_aligned crc64table[256] = {\n");
for (i = 0; i < 256; i++) {
- printf("\t0x%016" PRIx64 "ULL", crc64_table[i]);
+ printf("\t0x%016" PRIx64 "ULL", table[i]);
if (i & 0x1)
printf(",\n");
else
@@ -58,9 +74,22 @@ static void print_crc64_table(void)
printf("};\n");
}
+static void print_crc64_tables(void)
+{
+ printf("/* this file is generated - do not edit */\n\n");
+ printf("#include <linux/types.h>\n");
+ printf("#include <linux/cache.h>\n\n");
+ printf("static const u64 ____cacheline_aligned crc64table[256] = {\n");
+ output_table(crc64_table);
+
+ printf("\nstatic const u64 ____cacheline_aligned crc64rocksofttable[256] = {\n");
+ output_table(crc64_rocksoft_table);
+}
+
int main(int argc, char *argv[])
{
- generate_crc64_table();
- print_crc64_table();
+ generate_crc64_table(crc64_table, CRC64_ECMA182_POLY);
+ generate_reflected_crc64_table(crc64_rocksoft_table, CRC64_ROCKSOFT_POLY);
+ print_crc64_tables();
return 0;
}
--
2.25.4
On 2/1/22 20:01, Keith Busch wrote:
> The NVM Express specification extended data integrity fields to 64 bits
> using the Rocksoft^TM parameters. Add the poly to the crc64 table
> generation, and provide a library routine implementing the algorithm.
>
> The Rocksoft 64-bit CRC model parameters are as follows:
> Poly: 0xAD93D23594C93659
> Initial value: 0xFFFFFFFFFFFFFFFF
> Reflected Input: True
> Reflected Output: True
> Xor Final: 0xFFFFFFFFFFFFFFFF
>
> Since this model used reflected bits, the implementation generates the
> reflected table so the result is ordered consistently.
>
> Cc: Eric Biggers <[email protected]>
> Signed-off-by: Keith Busch <[email protected]>
> ---
> v1->v2:
>
Reviewed-by: Hannes Reinecke <[email protected]>
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
[email protected] +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer
On 2/1/22 20:01, Keith Busch wrote:
> Provide a module to test the rocksoft crc64 calculations with well known
> inputs and exepected values.
>
> Signed-off-by: Keith Busch <[email protected]>
> ---
> v1->v2:
>
> Fixed Kconfig dependency
>
> lib/Kconfig.debug | 4 +++
> lib/Makefile | 1 +
> lib/test_crc64.c | 68 +++++++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 73 insertions(+)
> create mode 100644 lib/test_crc64.c
>
Reviewed-by: Hannes Reinecke <[email protected]>
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
[email protected] +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer
On 2/1/22 20:01, Keith Busch wrote:
> The NVMe protocol extended the data integrity fields with unaligned
> 48-bit reference tags. Provide some helper accessors in
> preparation for these.
>
> Acked-by: Arnd Bergmann <[email protected]>
> Signed-off-by: Keith Busch <[email protected]>
> ---
> include/asm-generic/unaligned.h | 26 ++++++++++++++++++++++++++
> 1 file changed, 26 insertions(+)
>
Hehe.
Reviewed-by: Hannes Reinecke <[email protected]>
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
[email protected] +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer