From Israel,
The purpose of this patchset is to add support for inline
encryption/decryption of the data at storage protocols like nvmf over
RDMA (at a similar way like integrity is used via unique mkey).
This patchset adds support for plaintext keys. The patches were tested
on BF-3 HW with fscrypt tool to test this feature, which showed reduce
in CPU utilization when comparing at 64k or more IO size. The CPU utilization
was improved by more than 50% comparing to the SW only solution at this case.
How to configure fscrypt to enable plaintext keys:
# mkfs.ext4 -O encrypt /dev/nvme0n1
# mount /dev/nvme0n1 /mnt/crypto -o inlinecrypt
# head -c 64 /dev/urandom > /tmp/master_key
# fscryptctl add_key /mnt/crypto/ < /tmp/master_key
# mkdir /mnt/crypto/test1
# fscryptctl set_policy 152c41b2ea39fa3d90ea06448456e7fb /mnt/crypto/test1
** “152c41b2ea39fa3d90ea06448456e7fb” is the output of the
“fscryptctl add_key” command.
# echo foo > /mnt/crypto/test1/foo
Notes:
- At plaintext mode only, the user set a master key and the fscrypt
driver derived from it the DEK and the key identifier.
- 152c41b2ea39fa3d90ea06448456e7fb is the derived key identifier
- Only on the first IO, nvme-rdma gets a callback to load the derived DEK.
There is no special configuration to support crypto at nvme modules.
Thanks
Israel Rukshin (13):
net/mlx5: Introduce crypto IFC bits and structures
net/mlx5: Introduce crypto capabilities macro
RDMA: Split kernel-only create QP flags from uverbs create QP flags
RDMA/core: Add cryptographic device capabilities
RDMA/core: Add DEK management API
RDMA/core: Introduce MR type for crypto operations
RDMA/core: Add support for creating crypto enabled QPs
RDMA/mlx5: Add cryptographic device capabilities
RDMA/mlx5: Add DEK management API
RDMA/mlx5: Add AES-XTS crypto support
nvme: Introduce a local variable
nvme: Add crypto profile at nvme controller
nvme-rdma: Add inline encryption support
drivers/infiniband/core/device.c | 3 +
drivers/infiniband/core/mr_pool.c | 2 +
drivers/infiniband/core/uverbs_std_types_qp.c | 12 +-
drivers/infiniband/core/verbs.c | 91 ++++++-
drivers/infiniband/hw/bnxt_re/ib_verbs.c | 2 +-
drivers/infiniband/hw/mlx4/mlx4_ib.h | 4 +-
drivers/infiniband/hw/mlx4/qp.c | 4 +-
drivers/infiniband/hw/mlx5/Makefile | 1 +
drivers/infiniband/hw/mlx5/crypto.c | 115 +++++++++
drivers/infiniband/hw/mlx5/crypto.h | 46 ++++
drivers/infiniband/hw/mlx5/main.c | 5 +
drivers/infiniband/hw/mlx5/mlx5_ib.h | 5 +-
drivers/infiniband/hw/mlx5/mr.c | 33 +++
drivers/infiniband/hw/mlx5/qp.c | 15 +-
drivers/infiniband/hw/mlx5/wr.c | 164 +++++++++++-
drivers/infiniband/hw/vmw_pvrdma/pvrdma_qp.c | 2 +-
drivers/net/ethernet/mellanox/mlx5/core/fw.c | 6 +
.../net/ethernet/mellanox/mlx5/core/main.c | 1 +
drivers/nvme/host/core.c | 10 +-
drivers/nvme/host/nvme.h | 4 +
drivers/nvme/host/rdma.c | 236 +++++++++++++++++-
include/linux/mlx5/device.h | 4 +
include/linux/mlx5/mlx5_ifc.h | 36 ++-
include/rdma/crypto.h | 118 +++++++++
include/rdma/ib_verbs.h | 46 +++-
include/trace/events/rdma_core.h | 33 +++
26 files changed, 954 insertions(+), 44 deletions(-)
create mode 100644 drivers/infiniband/hw/mlx5/crypto.c
create mode 100644 drivers/infiniband/hw/mlx5/crypto.h
create mode 100644 include/rdma/crypto.h
--
2.39.0
Eric,
>> Notes:
>> - At plaintext mode only, the user set a master key and the fscrypt
>> driver derived from it the DEK and the key identifier.
>> - 152c41b2ea39fa3d90ea06448456e7fb is the derived key identifier
>> - Only on the first IO, nvme-rdma gets a callback to load the derived DEK.
>>
>> There is no special configuration to support crypto at nvme modules.
>>
>> Thanks
>
> Very interesting work! Can you Cc me on future versions?
>
> I'm glad to see that this hardware allows all 16 IV bytes to be specified.
>
> Does it also handle programming and evicting keys efficiently?
>
> Also, just checking: have you tested that the ciphertext that this inline
> encryption hardware produces is correct? That's always super important to test.
> There are xfstests that test for it, e.g. generic/582. Another way to test it
> is to just manually test whether encrypted files that were created when the
> filesystem was mounted with '-o inlinecrypt' show the same contents when the
> filesystem is *not* mounted with '-o inlinecrypt' (or vice versa).
>
> - Eric
>
I'm wondering which are the xfstests that needs to run in order
to establish the correctness/stability apart from generic/582
this work ?
-ck
Hi Leon,
On Mon, Jan 16, 2023 at 03:05:47PM +0200, Leon Romanovsky wrote:
> >From Israel,
>
> The purpose of this patchset is to add support for inline
> encryption/decryption of the data at storage protocols like nvmf over
> RDMA (at a similar way like integrity is used via unique mkey).
>
> This patchset adds support for plaintext keys. The patches were tested
> on BF-3 HW with fscrypt tool to test this feature, which showed reduce
> in CPU utilization when comparing at 64k or more IO size. The CPU utilization
> was improved by more than 50% comparing to the SW only solution at this case.
>
> How to configure fscrypt to enable plaintext keys:
> # mkfs.ext4 -O encrypt /dev/nvme0n1
> # mount /dev/nvme0n1 /mnt/crypto -o inlinecrypt
> # head -c 64 /dev/urandom > /tmp/master_key
> # fscryptctl add_key /mnt/crypto/ < /tmp/master_key
> # mkdir /mnt/crypto/test1
> # fscryptctl set_policy 152c41b2ea39fa3d90ea06448456e7fb /mnt/crypto/test1
> ** “152c41b2ea39fa3d90ea06448456e7fb” is the output of the
> “fscryptctl add_key” command.
> # echo foo > /mnt/crypto/test1/foo
>
> Notes:
> - At plaintext mode only, the user set a master key and the fscrypt
> driver derived from it the DEK and the key identifier.
> - 152c41b2ea39fa3d90ea06448456e7fb is the derived key identifier
> - Only on the first IO, nvme-rdma gets a callback to load the derived DEK.
>
> There is no special configuration to support crypto at nvme modules.
>
> Thanks
Very interesting work! Can you Cc me on future versions?
I'm glad to see that this hardware allows all 16 IV bytes to be specified.
Does it also handle programming and evicting keys efficiently?
Also, just checking: have you tested that the ciphertext that this inline
encryption hardware produces is correct? That's always super important to test.
There are xfstests that test for it, e.g. generic/582. Another way to test it
is to just manually test whether encrypted files that were created when the
filesystem was mounted with '-o inlinecrypt' show the same contents when the
filesystem is *not* mounted with '-o inlinecrypt' (or vice versa).
- Eric
On Mon, Jan 16, 2023 at 03:05:47PM +0200, Leon Romanovsky wrote:
> >From Israel,
>
> The purpose of this patchset is to add support for inline
> encryption/decryption of the data at storage protocols like nvmf over
> RDMA (at a similar way like integrity is used via unique mkey).
>
> This patchset adds support for plaintext keys. The patches were tested
> on BF-3 HW with fscrypt tool to test this feature, which showed reduce
> in CPU utilization when comparing at 64k or more IO size. The CPU utilization
> was improved by more than 50% comparing to the SW only solution at this case.
One thing that needs to be solved before we can look into this is the
interaction with protection information (or integrity data in Linux
terms). Currently inline encryption and protection information are
mutally incompatible.
On Wed, Jan 18, 2023 at 07:14:30AM +0000, Chaitanya Kulkarni wrote:
> Eric,
>
> >> Notes:
> >> - At plaintext mode only, the user set a master key and the fscrypt
> >> driver derived from it the DEK and the key identifier.
> >> - 152c41b2ea39fa3d90ea06448456e7fb is the derived key identifier
> >> - Only on the first IO, nvme-rdma gets a callback to load the derived DEK.
> >>
> >> There is no special configuration to support crypto at nvme modules.
> >>
> >> Thanks
> >
> > Very interesting work! Can you Cc me on future versions?
> >
> > I'm glad to see that this hardware allows all 16 IV bytes to be specified.
> >
> > Does it also handle programming and evicting keys efficiently?
> >
> > Also, just checking: have you tested that the ciphertext that this inline
> > encryption hardware produces is correct? That's always super important to test.
> > There are xfstests that test for it, e.g. generic/582. Another way to test it
> > is to just manually test whether encrypted files that were created when the
> > filesystem was mounted with '-o inlinecrypt' show the same contents when the
> > filesystem is *not* mounted with '-o inlinecrypt' (or vice versa).
> >
> > - Eric
> >
>
> I'm wondering which are the xfstests that needs to run in order
> to establish the correctness/stability apart from generic/582
> this work ?
>
See https://docs.kernel.org/filesystems/fscrypt.html#tests.
- Eric
On Tue, Jan 17, 2023 at 10:47:44PM -0800, Eric Biggers wrote:
> Hi Leon,
>
> On Mon, Jan 16, 2023 at 03:05:47PM +0200, Leon Romanovsky wrote:
> > >From Israel,
> >
> > The purpose of this patchset is to add support for inline
> > encryption/decryption of the data at storage protocols like nvmf over
> > RDMA (at a similar way like integrity is used via unique mkey).
> >
> > This patchset adds support for plaintext keys. The patches were tested
> > on BF-3 HW with fscrypt tool to test this feature, which showed reduce
> > in CPU utilization when comparing at 64k or more IO size. The CPU utilization
> > was improved by more than 50% comparing to the SW only solution at this case.
> >
> > How to configure fscrypt to enable plaintext keys:
> > # mkfs.ext4 -O encrypt /dev/nvme0n1
> > # mount /dev/nvme0n1 /mnt/crypto -o inlinecrypt
> > # head -c 64 /dev/urandom > /tmp/master_key
> > # fscryptctl add_key /mnt/crypto/ < /tmp/master_key
> > # mkdir /mnt/crypto/test1
> > # fscryptctl set_policy 152c41b2ea39fa3d90ea06448456e7fb /mnt/crypto/test1
> > ** “152c41b2ea39fa3d90ea06448456e7fb” is the output of the
> > “fscryptctl add_key” command.
> > # echo foo > /mnt/crypto/test1/foo
> >
> > Notes:
> > - At plaintext mode only, the user set a master key and the fscrypt
> > driver derived from it the DEK and the key identifier.
> > - 152c41b2ea39fa3d90ea06448456e7fb is the derived key identifier
> > - Only on the first IO, nvme-rdma gets a callback to load the derived DEK.
> >
> > There is no special configuration to support crypto at nvme modules.
> >
> > Thanks
>
> Very interesting work! Can you Cc me on future versions?
Sure
>
> I'm glad to see that this hardware allows all 16 IV bytes to be specified.
>
> Does it also handle programming and evicting keys efficiently?
"efficiently" is a very subjective term. We are using FW command
interface to program keys and this interface can do hundreds/thousands
commands per-second.
Thanks
Hi Eric,
On 1/18/2023 8:47 AM, Eric Biggers wrote:
> Hi Leon,
>
> On Mon, Jan 16, 2023 at 03:05:47PM +0200, Leon Romanovsky wrote:
>> >From Israel,
>>
>> The purpose of this patchset is to add support for inline
>> encryption/decryption of the data at storage protocols like nvmf over
>> RDMA (at a similar way like integrity is used via unique mkey).
>>
>> This patchset adds support for plaintext keys. The patches were tested
>> on BF-3 HW with fscrypt tool to test this feature, which showed reduce
>> in CPU utilization when comparing at 64k or more IO size. The CPU utilization
>> was improved by more than 50% comparing to the SW only solution at this case.
>>
>> How to configure fscrypt to enable plaintext keys:
>> # mkfs.ext4 -O encrypt /dev/nvme0n1
>> # mount /dev/nvme0n1 /mnt/crypto -o inlinecrypt
>> # head -c 64 /dev/urandom > /tmp/master_key
>> # fscryptctl add_key /mnt/crypto/ < /tmp/master_key
>> # mkdir /mnt/crypto/test1
>> # fscryptctl set_policy 152c41b2ea39fa3d90ea06448456e7fb /mnt/crypto/test1
>> ** “152c41b2ea39fa3d90ea06448456e7fb” is the output of the
>> “fscryptctl add_key” command.
>> # echo foo > /mnt/crypto/test1/foo
>>
>> Notes:
>> - At plaintext mode only, the user set a master key and the fscrypt
>> driver derived from it the DEK and the key identifier.
>> - 152c41b2ea39fa3d90ea06448456e7fb is the derived key identifier
>> - Only on the first IO, nvme-rdma gets a callback to load the derived DEK.
>>
>> There is no special configuration to support crypto at nvme modules.
>>
>> Thanks
> Very interesting work! Can you Cc me on future versions?
>
> I'm glad to see that this hardware allows all 16 IV bytes to be specified.
>
> Does it also handle programming and evicting keys efficiently?
>
> Also, just checking: have you tested that the ciphertext that this inline
> encryption hardware produces is correct? That's always super important to test.
> There are xfstests that test for it, e.g. generic/582. Another way to test it
> is to just manually test whether encrypted files that were created when the
> filesystem was mounted with '-o inlinecrypt' show the same contents when the
> filesystem is *not* mounted with '-o inlinecrypt' (or vice versa).
sure, I ran the manual test of comparing the encrypted files content
with and without the '-o inlinecrypt' at the mount command.
>
> - Eric
- Israel
On 18/01/2023 9:36, Christoph Hellwig wrote:
> On Mon, Jan 16, 2023 at 03:05:47PM +0200, Leon Romanovsky wrote:
>> >From Israel,
>>
>> The purpose of this patchset is to add support for inline
>> encryption/decryption of the data at storage protocols like nvmf over
>> RDMA (at a similar way like integrity is used via unique mkey).
>>
>> This patchset adds support for plaintext keys. The patches were tested
>> on BF-3 HW with fscrypt tool to test this feature, which showed reduce
>> in CPU utilization when comparing at 64k or more IO size. The CPU utilization
>> was improved by more than 50% comparing to the SW only solution at this case.
> One thing that needs to be solved before we can look into this is the
> interaction with protection information (or integrity data in Linux
> terms). Currently inline encryption and protection information are
> mutally incompatible.
Well, for sure this should be solved. Not sure that it should be done
before this series.
This patch set doesn't break/change existing behavior of PI and
Encryption offloads so the above can be done incrementally.
It's already big enough series and 2 subsystems are involved.
Today, if one would like to run both features using a capable device
then the PI will be offloaded (no SW fallback) and the Encryption will
be using SW fallback.
There should be a serious instrumentation in the block layer to make
both operations offloaded.
We'll start looking into it. Any suggestions and designs are welcomed.
We also prepared patches to extend the blk lavel encryption (in addition
to fscrypt exist today).
> From Israel,
>
> The purpose of this patchset is to add support for inline
> encryption/decryption of the data at storage protocols like nvmf over
> RDMA (at a similar way like integrity is used via unique mkey).
>
> This patchset adds support for plaintext keys. The patches were tested
> on BF-3 HW with fscrypt tool to test this feature, which showed reduce
> in CPU utilization when comparing at 64k or more IO size. The CPU utilization
> was improved by more than 50% comparing to the SW only solution at this case.
>
> How to configure fscrypt to enable plaintext keys:
> # mkfs.ext4 -O encrypt /dev/nvme0n1
> # mount /dev/nvme0n1 /mnt/crypto -o inlinecrypt
> # head -c 64 /dev/urandom > /tmp/master_key
> # fscryptctl add_key /mnt/crypto/ < /tmp/master_key
> # mkdir /mnt/crypto/test1
> # fscryptctl set_policy 152c41b2ea39fa3d90ea06448456e7fb /mnt/crypto/test1
> ** “152c41b2ea39fa3d90ea06448456e7fb” is the output of the
> “fscryptctl add_key” command.
> # echo foo > /mnt/crypto/test1/foo
>
> Notes:
> - At plaintext mode only, the user set a master key and the fscrypt
> driver derived from it the DEK and the key identifier.
> - 152c41b2ea39fa3d90ea06448456e7fb is the derived key identifier
> - Only on the first IO, nvme-rdma gets a callback to load the derived DEK.
>
> There is no special configuration to support crypto at nvme modules.
Hey, this looks sane to me in a very first glance.
Few high level questions:
- what happens with multipathing? when if not all devices are
capable. SW fallback?
- Does the crypt stuff stay intact when bio is requeued?
I'm assuming you tested this with multipathing? This is not very
useful if it is incompatible with it.
Hi Sagi,
On 1/23/2023 1:27 PM, Sagi Grimberg wrote:
>
>> From Israel,
>>
>> The purpose of this patchset is to add support for inline
>> encryption/decryption of the data at storage protocols like nvmf over
>> RDMA (at a similar way like integrity is used via unique mkey).
>>
>> This patchset adds support for plaintext keys. The patches were tested
>> on BF-3 HW with fscrypt tool to test this feature, which showed reduce
>> in CPU utilization when comparing at 64k or more IO size. The CPU
>> utilization
>> was improved by more than 50% comparing to the SW only solution at
>> this case.
>>
>> How to configure fscrypt to enable plaintext keys:
>> # mkfs.ext4 -O encrypt /dev/nvme0n1
>> # mount /dev/nvme0n1 /mnt/crypto -o inlinecrypt
>> # head -c 64 /dev/urandom > /tmp/master_key
>> # fscryptctl add_key /mnt/crypto/ < /tmp/master_key
>> # mkdir /mnt/crypto/test1
>> # fscryptctl set_policy 152c41b2ea39fa3d90ea06448456e7fb
>> /mnt/crypto/test1
>> ** “152c41b2ea39fa3d90ea06448456e7fb” is the output of the
>> “fscryptctl add_key” command.
>> # echo foo > /mnt/crypto/test1/foo
>>
>> Notes:
>> - At plaintext mode only, the user set a master key and the fscrypt
>> driver derived from it the DEK and the key identifier.
>> - 152c41b2ea39fa3d90ea06448456e7fb is the derived key identifier
>> - Only on the first IO, nvme-rdma gets a callback to load the
>> derived DEK.
>>
>> There is no special configuration to support crypto at nvme modules.
>
> Hey, this looks sane to me in a very first glance.
>
> Few high level questions:
> - what happens with multipathing? when if not all devices are
> capable. SW fallback?
SW fallback happens every time the device doesn't support the specific
crypto request (which include data-unit-size, mode and dun_bytes).
So with multipathing, one path uses the HW crypto offload and the other
one uses the SW fallback.
> - Does the crypt stuff stay intact when bio is requeued?
Yes, the crypto ctx is copied when cloning the bio.
>
> I'm assuming you tested this with multipathing? This is not very
> useful if it is incompatible with it.
Yes, sure.
You can see the call to blk_crypto_reprogram_all_keys(), which is
called when the controller was reconnected after port toggling.
- Israel
On Wed, Jan 18, 2023 at 04:20:27PM +0200, Max Gurtovoy wrote:
> Today, if one would like to run both features using a capable device then
> the PI will be offloaded (no SW fallback) and the Encryption will be using
> SW fallback.
It's not just running both - it's offering both as currently registering
these fatures is mutally incompatible.
On Mon, Jan 23, 2023 at 02:57:18PM +0200, Israel Rukshin wrote:
>> - what happens with multipathing? when if not all devices are
>> capable. SW fallback?
> SW fallback happens every time the device doesn't support the specific
> crypto request (which include data-unit-size, mode and dun_bytes).
> So with multipathing, one path uses the HW crypto offload and the other one
> uses the SW fallback.
That's a big no-go. The blk-crypto-fallback code is just a toy example
and not actually safe to use in prodution. Most importantly it just
kmallocs a bio clone and pages for it without any mempool that guarantees
forward progress.
On 30/01/2023 14:35, Christoph Hellwig wrote:
> On Wed, Jan 18, 2023 at 04:20:27PM +0200, Max Gurtovoy wrote:
>> Today, if one would like to run both features using a capable device then
>> the PI will be offloaded (no SW fallback) and the Encryption will be using
>> SW fallback.
> It's not just running both - it's offering both as currently registering
> these fatures is mutally incompatible.
For sure we would like to offer both, but as mentioned before, this will
require blk layer instrumentation and ib_core instrumentation to some
pipeline_mr or something like that.
We are interested in doing this IO path but doing all in 1 series
doesn't sounds possible.
First we would like to have at least the mutual exclusive solution since
there are users that may want only one type of acceleration.
re-designing, both blk layer and ib_core for the PI + CRYPTO case in
this series is not feasible IMO.
>>> Today, if one would like to run both features using a capable device
>>> then
>>> the PI will be offloaded (no SW fallback) and the Encryption will be
>>> using
>>> SW fallback.
>> It's not just running both - it's offering both as currently registering
>> these fatures is mutally incompatible.
>
> For sure we would like to offer both, but as mentioned before, this will
> require blk layer instrumentation and ib_core instrumentation to some
> pipeline_mr or something like that.
Can you explain what is needed in the block layer that prevents queueing
a request that has PI and crypto ctx?
As for the rdma portion, I don't know enough if this is something
that is supported by the HW and lacks a SW interface, or inherently
unsupported in HW...