2023-04-28 20:56:47

by Tom Zanussi

[permalink] [raw]
Subject: [PATCH v3 00/15] crypto: Add Intel Analytics Accelerator (IAA) crypto compression driver

Hi, this is v3 of the IAA crypto driver, incorporating feedback from
v2.

v3 changes:

- Reworked the code to only allow the registered crypto alg to be
unregistered by removing the module. Also added an iaa_wq_get()
and iaa_wq_put() to take/give up a reference to the work queue
while there are compresses/decompresses in flight. This is
synchronized with the wq remove function, so that the
iaa_wq/iaa_devices can't go away beneath active operations. This
was tested by removing/disabling the iaa wqs/devices while
operations were in flight.

- Simplified the rebalance code and removed cpu_to_iaa() function
since it was overly complicated and wasn't actually working as
advertised.

- As a result of reworking the above code, fixed several bugs such
as possibly unregistering an unregistered crypto alg, a memory
leak where iaa_wqs weren't being freed, and making sure the
compression schemes were registered before registering the driver.

- Added set_/idxd_wq_private() accessors for wq private data.

- Added missing XPORT_SYMBOL_NS_GPL() to [PATCH 04/15] dmaengine:
idxd: Export descriptor management functions

- Added Dave's Reviewed-by: tags from v2.

- Updated Documentation and commit messages to reflect the changes
above.

- Rebased to to cryptodev tree, since that has the earlier changes
that moved the intel drivers to crypto/intel.

v2 changes:

- Removed legacy interface and all related code; merged async
interface into main deflate patch.

- Added support for the null destination case. Thanks to Giovanni
Cabiddu for making me aware of this as well as the selftests for
it.

- Had to do some rearrangement of the code in order to pass all the
selftests. Also added a testcase for 'canned'.

- Moved the iaa crypto driver to drivers/crypto/intel, and moved all
the other intel drivers there as well (which will be posted as a
separate series immediately following this one).

- Added an iaa crypto section to MAINTAINERS.

- Updated the documenation and commit messages to reflect the removal
of the legacy interface.

- Changed kernel version from 6.3.0 to 6.4.0 in patch 01/15 (wq
driver name support)

v1:

This series adds Linux crypto algorithm support for Intel® In-memory
Analytics Accelerator (Intel IAA) [1] hardware compression and
decompression, which is available on Sapphire Rapids systems.

The IAA crypto support is implemented as an IDXD sub-driver. The IDXD
driver already present in the kernel provides discovery and management
of the IAA devices on a system, as well as all the functionality
needed to manage, submit, and wait for completion of work executed on
them. The first 7 patches (patches starting with dmaengine:) add
small bits of underlying IDXD plumbing needed to allow external
sub-drivers to take advantage of this support and claim ownership of
specific IAA devices and workqueues.

The remaining patches add the main support for this feature via the
crypto API, making it transparently accessible to kernel features that
can make use of it such as zswap and zram (patches starting with
crypto – iaa:).

These include both sync/async support for the deflate algorithm
implemented by the IAA hardware, as well as an additional option for
driver statistics and Documentation.

Patch 8 ('[PATCH 08/15] crypto: iaa - Add IAA Compression Accelerator
Documentation') describes the IAA crypto driver in detail; the
following is just a high-level synopsis meant to aid the following
discussion.

The IAA hardware is fairly complex and generally requires a
knowledgeable administrator with sufficiently detailed understanding
of the hardware to set it up before it can be used. As mentioned in
the Documentation, this typically requires using a special tool called
accel-config to enumerate and configure IAA workqueues, engines, etc,
although this can also be done using only sysfs files.

The operation of the driver mirrors this requirement and only allows
the hardware to be accessed via the crypto layer once the hardware has
been configured and bound to the the IAA crypto driver. As an IDXD
sub-driver, the IAA crypto driver essentially takes ownership of the
hardware until it is given up explicitly by the administrator. This
occurs automatically when the administrator enables the first IAA
workqueue or disables the last one; the iaa_crypto (sync and async)
algorithms are registered when the first workqueue is enabled, and
deregistered when the last one is disabled.

The normal sequence of operations would normally be:

< configure the hardware using accel-config or sysfs >

< configure the iaa crypto driver (see below) >

< configure the subsystem e.g. zswap/zram to use the iaa_crypto algo >

< run the workload >

There are a small number of iaa_crypto driver attributes that the
administrator can configure, and which also need to be configured
before the algorithm is enabled:

compression_mode:

The IAA crypto driver supports an extensible interface supporting
any number of different compression modes that can be tailored to
specific types of workloads. These are implemented as tables and
given arbitrary names describing their intent.

There are currently only 2 compression modes, “canned” and “fixed”.
In order to set a compression mode, echo the mode’s name to the
compression_mode driver attribute:

echo "canned" > /sys/bus/dsa/drivers/crypto/compression_mode

There are a few other available iaa_crypto driver attributes (see
Documentation for details) but the main one we want to consider in
detail for now is the ‘sync_mode’ attribute.

The ‘sync_mode’ attribute has 3 possible settings: ‘sync’, ‘async’,
and ‘async_irq’.

The context for these different modes is that although the iaa_crypto
driver implements the asynchronous crypto interface, the async
interface is currently only used in a synchronous way by facilities
like zswap that make use of it.

This is fine for software compress/decompress algorithms, since
there’s no real benefit in being able to use a truly asynchronous
interface with them. This isn’t the case, though, for hardware
compress/decompress engines such as IAA, where truly asynchronous
behavior is beneficial if not completely necessary to make optimal use
of the hardware.

The IAA crypto driver ‘sync_mode’ support should allow facilities such
as zswap to ‘support async (de)compression in some way [2]’ once
they are modified to actually make use of it.

When the ‘async_irq’ sync_mode is specified, the driver sets the bits
in the IAA work descriptor to generate an irq when the work completes.
So for every compression or decompression, the IAA acomp_alg
implementations called by crypto_acomp_compress/decompress() simply
set up the descriptor, turn on the 'request irq' bit and return
immediately with -EINPROGRESS. When the work completes, the irq fires
and the IDXD driver’s irq thread for that irq invokes the callback the
iaa_crypto module registered with IDXD. When the irq thread gets
scheduled, it wakes up the caller, which could be for instance zswap,
waiting synchronously via crypto_wait_req().

Using the simple madvise test program in '[PATCH 08/15] crypto: iaa -
Add IAA Compression Accelerator Documentation' along with a set of
pages from the spec17 benchmark and tracepoint instrumentation
measuring the time taken between the start and end of each compress
and decompress, this case, async_irq, takes on average 6,847 ns for
compression and 5,840 ns for decompression. (See Table 1 below for a
summary of all the tests.)

When sync_mode is set to ‘sync’, the interrupt bit is not set and the
work descriptor is submitted in the same way it was for the previous
case. In this case the call doesn’t return but rather loops around
waiting in the iaa_crypto driver’s check_completion() function which
continually checks the descriptor’s completion bit until it finds it
set to ‘completed’. It then returns to the caller, again for example
zswap waiting in crypto_wait_req(). From the standpoint of zswap,
this case is exactly the same as the previous case, the difference
seen only in the crypto layer and the iaa_crypto driver internally;
from its standpoint they’re both synchronous calls. There is however
a large performance difference: an average of 3,177 ns for compress
and 2,235 ns for decompress.

The final sync_mode is ‘async’. In this case also the interrupt bit
is not set and the work descriptor is submitted, returning immediately
to the caller with -EINPROGRESS. Because there’s no interrupt set to
notify anyone when the work completes, the caller needs to somehow
check for work completion. Because core code like zswap can’t do this
directly by for example calling iaa_crypto’s check_completion(), there
would need to be some changes made to code like zswap and the crypto
layer in order to take advantage of this mode. As such, there are no
numbers to share for this mode.

Finally, just a quick discussion of the remaining numbers in Table 1,
those comparing the iaa_crypto sync and async irq cases to software
deflate. Software deflate took average of 108,978 ns for compress and
14,485 ns for decompress.

As can be seen from Table 1, the numbers using the iaa_crypto driver
for deflate as compared to software are so much better that merging it
would seem to make sense on its own merits. The 'async' sync_mode
described above, however, offers the possibility of even greater gains
to be had against higher-performing algorithms such as lzo, via
parallelization, once the calling facilities are modified to take
advantage of it. Follow-up patchsets to this one will demonstrate
concretely how that might be accomplished.

Thanks,

Tom


Table 1. Zswap latency and compression numbers (in ns):

Algorithm compress decompress
----------------------------------------------------------
iaa sync 3,177 2,235
iaa async irq 6,847 5,840
software deflate 108,978 14,485

[1] https://cdrdv2.intel.com/v1/dl/getContent/721858

[2] https://lore.kernel.org/lkml/[email protected]/


Dave Jiang (2):
dmaengine: idxd: add wq driver name support for accel-config user tool
dmaengine: idxd: add external module driver support for dsa_bus_type

Tom Zanussi (13):
dmaengine: idxd: Export drv_enable/disable and related functions
dmaengine: idxd: Export descriptor management functions
dmaengine: idxd: Export wq resource management functions
dmaengine: idxd: Add private_data to struct idxd_wq
dmaengine: idxd: add callback support for iaa crypto
crypto: iaa - Add IAA Compression Accelerator Documentation
crypto: iaa - Add Intel IAA Compression Accelerator crypto driver core
crypto: iaa - Add per-cpu workqueue table with rebalancing
crypto: iaa - Add compression mode management along with fixed mode
crypto: iaa - Add support for iaa_crypto deflate compression algorithm
crypto: iaa - Add support for default IAA 'canned' compression mode
crypto: iaa - Add irq support for the crypto async interface
crypto: iaa - Add IAA Compression Accelerator stats

.../ABI/stable/sysfs-driver-dma-idxd | 6 +
.../driver-api/crypto/iaa/iaa-crypto.rst | 640 +++++
Documentation/driver-api/crypto/iaa/index.rst | 20 +
Documentation/driver-api/crypto/index.rst | 20 +
Documentation/driver-api/index.rst | 1 +
MAINTAINERS | 7 +
crypto/testmgr.c | 10 +
crypto/testmgr.h | 72 +
drivers/crypto/intel/Kconfig | 1 +
drivers/crypto/intel/Makefile | 1 +
drivers/crypto/intel/iaa/Kconfig | 19 +
drivers/crypto/intel/iaa/Makefile | 12 +
drivers/crypto/intel/iaa/iaa_crypto.h | 175 ++
.../crypto/intel/iaa/iaa_crypto_comp_canned.c | 110 +
.../crypto/intel/iaa/iaa_crypto_comp_fixed.c | 92 +
drivers/crypto/intel/iaa/iaa_crypto_main.c | 2151 +++++++++++++++++
drivers/crypto/intel/iaa/iaa_crypto_stats.c | 271 +++
drivers/crypto/intel/iaa/iaa_crypto_stats.h | 58 +
drivers/dma/idxd/bus.c | 6 +
drivers/dma/idxd/cdev.c | 8 +
drivers/dma/idxd/device.c | 9 +-
drivers/dma/idxd/dma.c | 9 +-
drivers/dma/idxd/idxd.h | 84 +-
drivers/dma/idxd/irq.c | 12 +-
drivers/dma/idxd/submit.c | 9 +-
drivers/dma/idxd/sysfs.c | 28 +
include/uapi/linux/idxd.h | 1 +
27 files changed, 3812 insertions(+), 20 deletions(-)
create mode 100644 Documentation/driver-api/crypto/iaa/iaa-crypto.rst
create mode 100644 Documentation/driver-api/crypto/iaa/index.rst
create mode 100644 Documentation/driver-api/crypto/index.rst
create mode 100644 drivers/crypto/intel/iaa/Kconfig
create mode 100644 drivers/crypto/intel/iaa/Makefile
create mode 100644 drivers/crypto/intel/iaa/iaa_crypto.h
create mode 100644 drivers/crypto/intel/iaa/iaa_crypto_comp_canned.c
create mode 100644 drivers/crypto/intel/iaa/iaa_crypto_comp_fixed.c
create mode 100644 drivers/crypto/intel/iaa/iaa_crypto_main.c
create mode 100644 drivers/crypto/intel/iaa/iaa_crypto_stats.c
create mode 100644 drivers/crypto/intel/iaa/iaa_crypto_stats.h

--
2.34.1


2023-04-28 20:57:46

by Tom Zanussi

[permalink] [raw]
Subject: [PATCH v3 09/15] crypto: iaa - Add Intel IAA Compression Accelerator crypto driver core

The Intel Analytics Accelerator (IAA) is a hardware accelerator that
provides very high thoughput compression/decompression compatible with
the DEFLATE compression standard described in RFC 1951, which is the
compression/decompression algorithm exported by this module.

Users can select IAA compress/decompress acceleration by specifying
'iaa_crypto' as the compression algorithm to use by whatever facility
allows asynchronous compression algorithms to be selected.

For example, zswap can select iaa_crypto via:

# echo iaa_crypto > /sys/module/zswap/parameters/compressor

This patch adds iaa_crypto as an idxd sub-driver and tracks iaa
devices and workqueues as they are probed or removed.

[ Based on work originally by George Powley, Jing Lin and Kyung Min
Park ]

Signed-off-by: Tom Zanussi <[email protected]>
---
MAINTAINERS | 7 +
drivers/crypto/intel/Kconfig | 1 +
drivers/crypto/intel/Makefile | 1 +
drivers/crypto/intel/iaa/Kconfig | 10 +
drivers/crypto/intel/iaa/Makefile | 10 +
drivers/crypto/intel/iaa/iaa_crypto.h | 30 ++
drivers/crypto/intel/iaa/iaa_crypto_main.c | 326 +++++++++++++++++++++
7 files changed, 385 insertions(+)
create mode 100644 drivers/crypto/intel/iaa/Kconfig
create mode 100644 drivers/crypto/intel/iaa/Makefile
create mode 100644 drivers/crypto/intel/iaa/iaa_crypto.h
create mode 100644 drivers/crypto/intel/iaa/iaa_crypto_main.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 45ee4e6faf9c..8697dc9a7b36 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10313,6 +10313,13 @@ S: Supported
Q: https://patchwork.kernel.org/project/linux-dmaengine/list/
F: drivers/dma/ioat*

+INTEL IAA CRYPTO DRIVER
+M: Tom Zanussi <[email protected]>
+L: [email protected]
+S: Supported
+F: Documentation/driver-api/crypto/iaa/iaa-crypto.rst
+F: drivers/crypto/intel/iaa/*
+
INTEL IDXD DRIVER
M: Fenghua Yu <[email protected]>
M: Dave Jiang <[email protected]>
diff --git a/drivers/crypto/intel/Kconfig b/drivers/crypto/intel/Kconfig
index 3d90c87d4094..f38cd62a3f67 100644
--- a/drivers/crypto/intel/Kconfig
+++ b/drivers/crypto/intel/Kconfig
@@ -3,3 +3,4 @@
source "drivers/crypto/intel/keembay/Kconfig"
source "drivers/crypto/intel/ixp4xx/Kconfig"
source "drivers/crypto/intel/qat/Kconfig"
+source "drivers/crypto/intel/iaa/Kconfig"
diff --git a/drivers/crypto/intel/Makefile b/drivers/crypto/intel/Makefile
index b3d0352ae188..2f56f6d34cf0 100644
--- a/drivers/crypto/intel/Makefile
+++ b/drivers/crypto/intel/Makefile
@@ -3,3 +3,4 @@
obj-y += keembay/
obj-y += ixp4xx/
obj-$(CONFIG_CRYPTO_DEV_QAT) += qat/
+obj-$(CONFIG_CRYPTO_DEV_IAA_CRYPTO) += iaa/
diff --git a/drivers/crypto/intel/iaa/Kconfig b/drivers/crypto/intel/iaa/Kconfig
new file mode 100644
index 000000000000..fcccb6ff7e29
--- /dev/null
+++ b/drivers/crypto/intel/iaa/Kconfig
@@ -0,0 +1,10 @@
+config CRYPTO_DEV_IAA_CRYPTO
+ tristate "Support for Intel(R) IAA Compression Accelerator"
+ depends on CRYPTO_DEFLATE
+ depends on INTEL_IDXD
+ default n
+ help
+ This driver supports acceleration for compression and
+ decompression with the Intel Analytics Accelerator (IAA)
+ hardware using the cryptographic API. If you choose 'M'
+ here, the module will be called iaa_crypto.
diff --git a/drivers/crypto/intel/iaa/Makefile b/drivers/crypto/intel/iaa/Makefile
new file mode 100644
index 000000000000..03859431c897
--- /dev/null
+++ b/drivers/crypto/intel/iaa/Makefile
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Makefile for IAA crypto device drivers
+#
+
+ccflags-y += -I $(srctree)/drivers/dma/idxd -DDEFAULT_SYMBOL_NAMESPACE=IDXD
+
+obj-$(CONFIG_CRYPTO_DEV_IAA_CRYPTO) := iaa_crypto.o
+
+iaa_crypto-y := iaa_crypto_main.o
diff --git a/drivers/crypto/intel/iaa/iaa_crypto.h b/drivers/crypto/intel/iaa/iaa_crypto.h
new file mode 100644
index 000000000000..5d1fff7f4b8e
--- /dev/null
+++ b/drivers/crypto/intel/iaa/iaa_crypto.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Intel Corporation. All rights rsvd. */
+
+#ifndef __IAA_CRYPTO_H__
+#define __IAA_CRYPTO_H__
+
+#include <linux/crypto.h>
+#include <linux/idxd.h>
+#include <uapi/linux/idxd.h>
+
+#define IDXD_SUBDRIVER_NAME "crypto"
+
+/* Representation of IAA workqueue */
+struct iaa_wq {
+ struct list_head list;
+ struct idxd_wq *wq;
+
+ struct iaa_device *iaa_device;
+};
+
+/* Representation of IAA device with wqs, populated by probe */
+struct iaa_device {
+ struct list_head list;
+ struct idxd_device *idxd;
+
+ int n_wq;
+ struct list_head wqs;
+};
+
+#endif
diff --git a/drivers/crypto/intel/iaa/iaa_crypto_main.c b/drivers/crypto/intel/iaa/iaa_crypto_main.c
new file mode 100644
index 000000000000..8cf0c7bf9005
--- /dev/null
+++ b/drivers/crypto/intel/iaa/iaa_crypto_main.c
@@ -0,0 +1,326 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Intel Corporation. All rights rsvd. */
+
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/device.h>
+#include <linux/iommu.h>
+#include <uapi/linux/idxd.h>
+#include <linux/highmem.h>
+#include <linux/sched/smt.h>
+
+#include "idxd.h"
+#include "iaa_crypto.h"
+
+#ifdef pr_fmt
+#undef pr_fmt
+#endif
+
+#define pr_fmt(fmt) "idxd: " IDXD_SUBDRIVER_NAME ": " fmt
+
+/* number of iaa instances probed */
+static unsigned int nr_iaa;
+
+static LIST_HEAD(iaa_devices);
+static DEFINE_MUTEX(iaa_devices_lock);
+
+static struct iaa_device *iaa_device_alloc(void)
+{
+ struct iaa_device *iaa_device;
+
+ iaa_device = kzalloc(sizeof(*iaa_device), GFP_KERNEL);
+ if (!iaa_device)
+ return NULL;
+
+ INIT_LIST_HEAD(&iaa_device->wqs);
+
+ return iaa_device;
+}
+
+static void iaa_device_free(struct iaa_device *iaa_device)
+{
+ struct iaa_wq *iaa_wq, *next;
+
+ list_for_each_entry_safe(iaa_wq, next, &iaa_device->wqs, list) {
+ list_del(&iaa_wq->list);
+ kfree(iaa_wq);
+ }
+
+ kfree(iaa_device);
+}
+
+static bool iaa_has_wq(struct iaa_device *iaa_device, struct idxd_wq *wq)
+{
+ struct iaa_wq *iaa_wq;
+
+ list_for_each_entry(iaa_wq, &iaa_device->wqs, list) {
+ if (iaa_wq->wq == wq)
+ return true;
+ }
+
+ return false;
+}
+
+static struct iaa_device *add_iaa_device(struct idxd_device *idxd)
+{
+ struct iaa_device *iaa_device;
+
+ iaa_device = iaa_device_alloc();
+ if (!iaa_device)
+ return NULL;
+
+ iaa_device->idxd = idxd;
+
+ list_add_tail(&iaa_device->list, &iaa_devices);
+
+ nr_iaa++;
+
+ return iaa_device;
+}
+
+static void del_iaa_device(struct iaa_device *iaa_device)
+{
+ list_del(&iaa_device->list);
+
+ iaa_device_free(iaa_device);
+
+ nr_iaa--;
+}
+
+static int add_iaa_wq(struct iaa_device *iaa_device, struct idxd_wq *wq,
+ struct iaa_wq **new_wq)
+{
+ struct idxd_device *idxd = iaa_device->idxd;
+ struct pci_dev *pdev = idxd->pdev;
+ struct device *dev = &pdev->dev;
+ struct iaa_wq *iaa_wq;
+
+ iaa_wq = kzalloc(sizeof(*iaa_wq), GFP_KERNEL);
+ if (!iaa_wq)
+ return -ENOMEM;
+
+ iaa_wq->wq = wq;
+ iaa_wq->iaa_device = iaa_device;
+ set_idxd_wq_private(wq, iaa_wq);
+
+ list_add_tail(&iaa_wq->list, &iaa_device->wqs);
+
+ iaa_device->n_wq++;
+
+ if (new_wq)
+ *new_wq = iaa_wq;
+
+ dev_dbg(dev, "added wq %d to iaa device %d, n_wq %d\n",
+ wq->id, iaa_device->idxd->id, iaa_device->n_wq);
+
+ return 0;
+}
+
+static void del_iaa_wq(struct iaa_device *iaa_device, struct idxd_wq *wq)
+{
+ struct idxd_device *idxd = iaa_device->idxd;
+ struct pci_dev *pdev = idxd->pdev;
+ struct device *dev = &pdev->dev;
+ struct iaa_wq *iaa_wq;
+
+ list_for_each_entry(iaa_wq, &iaa_device->wqs, list) {
+ if (iaa_wq->wq == wq) {
+ list_del(&iaa_wq->list);
+ iaa_device->n_wq--;
+
+ dev_dbg(dev, "removed wq %d from iaa_device %d, n_wq %d, nr_iaa %d\n",
+ wq->id, iaa_device->idxd->id,
+ iaa_device->n_wq, nr_iaa);
+
+ if (iaa_device->n_wq == 0)
+ del_iaa_device(iaa_device);
+ break;
+ }
+ }
+}
+
+static int save_iaa_wq(struct idxd_wq *wq)
+{
+ struct iaa_device *iaa_device, *found = NULL;
+ struct idxd_device *idxd;
+ struct pci_dev *pdev;
+ struct device *dev;
+ int ret = 0;
+
+ list_for_each_entry(iaa_device, &iaa_devices, list) {
+ if (iaa_device->idxd == wq->idxd) {
+ idxd = iaa_device->idxd;
+ pdev = idxd->pdev;
+ dev = &pdev->dev;
+ /*
+ * Check to see that we don't already have this wq.
+ * Shouldn't happen but we don't control probing.
+ */
+ if (iaa_has_wq(iaa_device, wq)) {
+ dev_dbg(dev, "same wq probed multiple times for iaa_device %p\n",
+ iaa_device);
+ goto out;
+ }
+
+ found = iaa_device;
+
+ ret = add_iaa_wq(iaa_device, wq, NULL);
+ if (ret)
+ goto out;
+
+ break;
+ }
+ }
+
+ if (!found) {
+ struct iaa_device *new_device;
+ struct iaa_wq *new_wq;
+
+ new_device = add_iaa_device(wq->idxd);
+ if (!new_device) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ ret = add_iaa_wq(new_device, wq, &new_wq);
+ if (ret) {
+ del_iaa_device(new_device);
+ goto out;
+ }
+ }
+
+ if (WARN_ON(nr_iaa == 0))
+ return -EINVAL;
+
+ idxd_wq_get(wq);
+out:
+ return 0;
+}
+
+static void remove_iaa_wq(struct idxd_wq *wq)
+{
+ struct iaa_device *iaa_device;
+
+ list_for_each_entry(iaa_device, &iaa_devices, list) {
+ if (iaa_has_wq(iaa_device, wq)) {
+ del_iaa_wq(iaa_device, wq);
+ idxd_wq_put(wq);
+ break;
+ }
+ }
+}
+
+static int iaa_crypto_probe(struct idxd_dev *idxd_dev)
+{
+ struct idxd_wq *wq = idxd_dev_to_wq(idxd_dev);
+ struct idxd_device *idxd = wq->idxd;
+ struct idxd_driver_data *data = idxd->data;
+ struct device *dev = &idxd_dev->conf_dev;
+ int ret = 0;
+
+ if (idxd->state != IDXD_DEV_ENABLED)
+ return -ENXIO;
+
+ if (data->type != IDXD_TYPE_IAX)
+ return -ENODEV;
+
+ mutex_lock(&wq->wq_lock);
+
+ if (!idxd_wq_driver_name_match(wq, dev)) {
+ dev_dbg(dev, "wq %d.%d driver_name match failed: wq driver_name %s, dev driver name %s\n",
+ idxd->id, wq->id, wq->driver_name, dev->driver->name);
+ idxd->cmd_status = IDXD_SCMD_WQ_NO_DRV_NAME;
+ ret = -ENODEV;
+ goto err;
+ }
+
+ wq->type = IDXD_WQT_KERNEL;
+
+ ret = drv_enable_wq(wq);
+ if (ret < 0) {
+ dev_dbg(dev, "enable wq %d.%d failed: %d\n",
+ idxd->id, wq->id, ret);
+ ret = -ENXIO;
+ goto err;
+ }
+
+ mutex_lock(&iaa_devices_lock);
+
+ ret = save_iaa_wq(wq);
+ if (ret)
+ goto err_save;
+
+ mutex_unlock(&iaa_devices_lock);
+out:
+ mutex_unlock(&wq->wq_lock);
+
+ return ret;
+
+err_save:
+ drv_disable_wq(wq);
+err:
+ wq->type = IDXD_WQT_NONE;
+
+ goto out;
+}
+
+static void iaa_crypto_remove(struct idxd_dev *idxd_dev)
+{
+ struct idxd_wq *wq = idxd_dev_to_wq(idxd_dev);
+
+ idxd_wq_quiesce(wq);
+
+ mutex_lock(&wq->wq_lock);
+ mutex_lock(&iaa_devices_lock);
+
+ remove_iaa_wq(wq);
+ drv_disable_wq(wq);
+
+ mutex_unlock(&iaa_devices_lock);
+ mutex_unlock(&wq->wq_lock);
+}
+
+static enum idxd_dev_type dev_types[] = {
+ IDXD_DEV_WQ,
+ IDXD_DEV_NONE,
+};
+
+static struct idxd_device_driver iaa_crypto_driver = {
+ .probe = iaa_crypto_probe,
+ .remove = iaa_crypto_remove,
+ .name = IDXD_SUBDRIVER_NAME,
+ .type = dev_types,
+};
+
+static int __init iaa_crypto_init_module(void)
+{
+ int ret = 0;
+
+ ret = idxd_driver_register(&iaa_crypto_driver);
+ if (ret) {
+ pr_debug("IAA wq sub-driver registration failed\n");
+ goto out;
+ }
+
+ pr_debug("initialized\n");
+out:
+ return ret;
+}
+
+static void __exit iaa_crypto_cleanup_module(void)
+{
+ idxd_driver_unregister(&iaa_crypto_driver);
+
+ pr_debug("cleaned up\n");
+}
+
+MODULE_IMPORT_NS(IDXD);
+MODULE_LICENSE("GPL");
+MODULE_ALIAS_IDXD_DEVICE(0);
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("IAA Compression Accelerator Crypto Driver");
+
+module_init(iaa_crypto_init_module);
+module_exit(iaa_crypto_cleanup_module);
--
2.34.1

2023-04-28 20:58:06

by Tom Zanussi

[permalink] [raw]
Subject: [PATCH v3 11/15] crypto: iaa - Add compression mode management along with fixed mode

Additionally, it defines an in-kernel API for adding, removing, and
setting compression schemes, which can be used by kernel modules or
other kernel code that implements IAA compression schemes.

It also adds a separate file, iaa_crypto_comp_fixed.c, containing
huffman tables generated for a compression scheme named 'fixed'.
Future compression schemes can be added in a similar fashion.

The compression mode in effect can be selected by the user via the new
iaa_crypto 'compression_mode' driver attribute. Currently, there is
only one compression mode available, 'fixed' mode:

echo "fixed" > /sys/bus/dsa/drivers/crypto/compression_mode

Signed-off-by: Tom Zanussi <[email protected]>
---
drivers/crypto/intel/iaa/Makefile | 2 +-
drivers/crypto/intel/iaa/iaa_crypto.h | 87 ++++
.../crypto/intel/iaa/iaa_crypto_comp_fixed.c | 92 ++++
drivers/crypto/intel/iaa/iaa_crypto_main.c | 422 +++++++++++++++++-
4 files changed, 601 insertions(+), 2 deletions(-)
create mode 100644 drivers/crypto/intel/iaa/iaa_crypto_comp_fixed.c

diff --git a/drivers/crypto/intel/iaa/Makefile b/drivers/crypto/intel/iaa/Makefile
index 03859431c897..cc87feffd059 100644
--- a/drivers/crypto/intel/iaa/Makefile
+++ b/drivers/crypto/intel/iaa/Makefile
@@ -7,4 +7,4 @@ ccflags-y += -I $(srctree)/drivers/dma/idxd -DDEFAULT_SYMBOL_NAMESPACE=IDXD

obj-$(CONFIG_CRYPTO_DEV_IAA_CRYPTO) := iaa_crypto.o

-iaa_crypto-y := iaa_crypto_main.o
+iaa_crypto-y := iaa_crypto_main.o iaa_crypto_comp_fixed.o
diff --git a/drivers/crypto/intel/iaa/iaa_crypto.h b/drivers/crypto/intel/iaa/iaa_crypto.h
index c25546fa87f7..2daa3522e073 100644
--- a/drivers/crypto/intel/iaa/iaa_crypto.h
+++ b/drivers/crypto/intel/iaa/iaa_crypto.h
@@ -10,6 +10,11 @@

#define IDXD_SUBDRIVER_NAME "crypto"

+#define IAA_COMP_MODES_MAX 2
+
+#define FIXED_HDR 0x2
+#define FIXED_HDR_SIZE 3
+
/* Representation of IAA workqueue */
struct iaa_wq {
struct list_head list;
@@ -18,11 +23,24 @@ struct iaa_wq {
struct iaa_device *iaa_device;
};

+struct iaa_device_compression_mode {
+ const char *name;
+
+ struct aecs_comp_table_record *aecs_comp_table;
+ struct aecs_decomp_table_record *aecs_decomp_table;
+
+ dma_addr_t aecs_comp_table_dma_addr;
+ dma_addr_t aecs_decomp_table_dma_addr;
+};
+
/* Representation of IAA device with wqs, populated by probe */
struct iaa_device {
struct list_head list;
struct idxd_device *idxd;

+ struct iaa_device_compression_mode *compression_modes[IAA_COMP_MODES_MAX];
+ struct iaa_device_compression_mode *active_compression_mode;
+
int n_wq;
struct list_head wqs;
};
@@ -34,4 +52,73 @@ struct wq_table_entry {
int cur_wq;
};

+#define IAA_AECS_ALIGN 32
+
+/*
+ * Analytics Engine Configuration and State (AECS) contains parameters and
+ * internal state of the analytics engine.
+ */
+struct aecs_comp_table_record {
+ u32 crc;
+ u32 xor_checksum;
+ u32 reserved0[5];
+ u32 num_output_accum_bits;
+ u8 output_accum[256];
+ u32 ll_sym[286];
+ u32 reserved1;
+ u32 reserved2;
+ u32 d_sym[30];
+ u32 reserved_padding[2];
+} __packed;
+
+/* AECS for decompress */
+struct aecs_decomp_table_record {
+ u32 crc;
+ u32 xor_checksum;
+ u32 low_filter_param;
+ u32 high_filter_param;
+ u32 output_mod_idx;
+ u32 drop_init_decomp_out_bytes;
+ u32 reserved[36];
+ u32 output_accum_data[2];
+ u32 out_bits_valid;
+ u32 bit_off_indexing;
+ u32 input_accum_data[64];
+ u8 size_qw[32];
+ u32 decomp_state[1220];
+} __packed;
+
+int iaa_aecs_init_fixed(void);
+void iaa_aecs_cleanup_fixed(void);
+
+typedef int (*iaa_dev_comp_init_fn_t) (struct iaa_device_compression_mode *mode);
+typedef int (*iaa_dev_comp_free_fn_t) (struct iaa_device_compression_mode *mode);
+
+struct iaa_compression_mode {
+ const char *name;
+ u32 *ll_table;
+ int ll_table_size;
+ u32 *d_table;
+ int d_table_size;
+ u32 *header_table;
+ int header_table_size;
+ u16 gen_decomp_table_flags;
+ iaa_dev_comp_init_fn_t init;
+ iaa_dev_comp_free_fn_t free;
+};
+
+int add_iaa_compression_mode(const char *name,
+ const u32 *ll_table,
+ int ll_table_size,
+ const u32 *d_table,
+ int d_table_size,
+ const u8 *header_table,
+ int header_table_size,
+ u16 gen_decomp_table_flags,
+ iaa_dev_comp_init_fn_t init,
+ iaa_dev_comp_free_fn_t free);
+
+void remove_iaa_compression_mode(const char *name);
+int set_iaa_compression_mode(const char *name);
+
#endif
diff --git a/drivers/crypto/intel/iaa/iaa_crypto_comp_fixed.c b/drivers/crypto/intel/iaa/iaa_crypto_comp_fixed.c
new file mode 100644
index 000000000000..e965da11b4d9
--- /dev/null
+++ b/drivers/crypto/intel/iaa/iaa_crypto_comp_fixed.c
@@ -0,0 +1,92 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Intel Corporation. All rights rsvd. */
+
+#include "idxd.h"
+#include "iaa_crypto.h"
+
+/*
+ * Fixed Huffman tables the IAA hardware requires to implement RFC-1951.
+ */
+const u32 fixed_ll_sym[286] = {
+ 0x40030, 0x40031, 0x40032, 0x40033, 0x40034, 0x40035, 0x40036, 0x40037,
+ 0x40038, 0x40039, 0x4003A, 0x4003B, 0x4003C, 0x4003D, 0x4003E, 0x4003F,
+ 0x40040, 0x40041, 0x40042, 0x40043, 0x40044, 0x40045, 0x40046, 0x40047,
+ 0x40048, 0x40049, 0x4004A, 0x4004B, 0x4004C, 0x4004D, 0x4004E, 0x4004F,
+ 0x40050, 0x40051, 0x40052, 0x40053, 0x40054, 0x40055, 0x40056, 0x40057,
+ 0x40058, 0x40059, 0x4005A, 0x4005B, 0x4005C, 0x4005D, 0x4005E, 0x4005F,
+ 0x40060, 0x40061, 0x40062, 0x40063, 0x40064, 0x40065, 0x40066, 0x40067,
+ 0x40068, 0x40069, 0x4006A, 0x4006B, 0x4006C, 0x4006D, 0x4006E, 0x4006F,
+ 0x40070, 0x40071, 0x40072, 0x40073, 0x40074, 0x40075, 0x40076, 0x40077,
+ 0x40078, 0x40079, 0x4007A, 0x4007B, 0x4007C, 0x4007D, 0x4007E, 0x4007F,
+ 0x40080, 0x40081, 0x40082, 0x40083, 0x40084, 0x40085, 0x40086, 0x40087,
+ 0x40088, 0x40089, 0x4008A, 0x4008B, 0x4008C, 0x4008D, 0x4008E, 0x4008F,
+ 0x40090, 0x40091, 0x40092, 0x40093, 0x40094, 0x40095, 0x40096, 0x40097,
+ 0x40098, 0x40099, 0x4009A, 0x4009B, 0x4009C, 0x4009D, 0x4009E, 0x4009F,
+ 0x400A0, 0x400A1, 0x400A2, 0x400A3, 0x400A4, 0x400A5, 0x400A6, 0x400A7,
+ 0x400A8, 0x400A9, 0x400AA, 0x400AB, 0x400AC, 0x400AD, 0x400AE, 0x400AF,
+ 0x400B0, 0x400B1, 0x400B2, 0x400B3, 0x400B4, 0x400B5, 0x400B6, 0x400B7,
+ 0x400B8, 0x400B9, 0x400BA, 0x400BB, 0x400BC, 0x400BD, 0x400BE, 0x400BF,
+ 0x48190, 0x48191, 0x48192, 0x48193, 0x48194, 0x48195, 0x48196, 0x48197,
+ 0x48198, 0x48199, 0x4819A, 0x4819B, 0x4819C, 0x4819D, 0x4819E, 0x4819F,
+ 0x481A0, 0x481A1, 0x481A2, 0x481A3, 0x481A4, 0x481A5, 0x481A6, 0x481A7,
+ 0x481A8, 0x481A9, 0x481AA, 0x481AB, 0x481AC, 0x481AD, 0x481AE, 0x481AF,
+ 0x481B0, 0x481B1, 0x481B2, 0x481B3, 0x481B4, 0x481B5, 0x481B6, 0x481B7,
+ 0x481B8, 0x481B9, 0x481BA, 0x481BB, 0x481BC, 0x481BD, 0x481BE, 0x481BF,
+ 0x481C0, 0x481C1, 0x481C2, 0x481C3, 0x481C4, 0x481C5, 0x481C6, 0x481C7,
+ 0x481C8, 0x481C9, 0x481CA, 0x481CB, 0x481CC, 0x481CD, 0x481CE, 0x481CF,
+ 0x481D0, 0x481D1, 0x481D2, 0x481D3, 0x481D4, 0x481D5, 0x481D6, 0x481D7,
+ 0x481D8, 0x481D9, 0x481DA, 0x481DB, 0x481DC, 0x481DD, 0x481DE, 0x481DF,
+ 0x481E0, 0x481E1, 0x481E2, 0x481E3, 0x481E4, 0x481E5, 0x481E6, 0x481E7,
+ 0x481E8, 0x481E9, 0x481EA, 0x481EB, 0x481EC, 0x481ED, 0x481EE, 0x481EF,
+ 0x481F0, 0x481F1, 0x481F2, 0x481F3, 0x481F4, 0x481F5, 0x481F6, 0x481F7,
+ 0x481F8, 0x481F9, 0x481FA, 0x481FB, 0x481FC, 0x481FD, 0x481FE, 0x481FF,
+ 0x38000, 0x38001, 0x38002, 0x38003, 0x38004, 0x38005, 0x38006, 0x38007,
+ 0x38008, 0x38009, 0x3800A, 0x3800B, 0x3800C, 0x3800D, 0x3800E, 0x3800F,
+ 0x38010, 0x38011, 0x38012, 0x38013, 0x38014, 0x38015, 0x38016, 0x38017,
+ 0x400C0, 0x400C1, 0x400C2, 0x400C3, 0x400C4, 0x400C5
+};
+
+const u32 fixed_d_sym[30] = {
+ 0x28000, 0x28001, 0x28002, 0x28003, 0x28004, 0x28005, 0x28006, 0x28007,
+ 0x28008, 0x28009, 0x2800A, 0x2800B, 0x2800C, 0x2800D, 0x2800E, 0x2800F,
+ 0x28010, 0x28011, 0x28012, 0x28013, 0x28014, 0x28015, 0x28016, 0x28017,
+ 0x28018, 0x28019, 0x2801A, 0x2801B, 0x2801C, 0x2801D
+};
+
+static int init_fixed_mode(struct iaa_device_compression_mode *mode)
+{
+ struct aecs_comp_table_record *comp_table = mode->aecs_comp_table;
+ u32 bfinal = 1;
+ u32 offset;
+
+ /* Configure aecs table using fixed Huffman table */
+ comp_table->crc = 0;
+ comp_table->xor_checksum = 0;
+ offset = comp_table->num_output_accum_bits / 8;
+ comp_table->output_accum[offset] = FIXED_HDR | bfinal;
+ comp_table->num_output_accum_bits = FIXED_HDR_SIZE;
+
+ return 0;
+}
+
+int iaa_aecs_init_fixed(void)
+{
+ int ret;
+
+ ret = add_iaa_compression_mode("fixed",
+ fixed_ll_sym,
+ sizeof(fixed_ll_sym),
+ fixed_d_sym,
+ sizeof(fixed_d_sym),
+ NULL, 0, 0,
+ init_fixed_mode, NULL);
+ if (!ret)
+ pr_debug("IAA fixed compression mode initialized\n");
+
+ return ret;
+}
+
+void iaa_aecs_cleanup_fixed(void)
+{
+ remove_iaa_compression_mode("fixed");
+}
diff --git a/drivers/crypto/intel/iaa/iaa_crypto_main.c b/drivers/crypto/intel/iaa/iaa_crypto_main.c
index bc7249ab3a89..832b823cc0d5 100644
--- a/drivers/crypto/intel/iaa/iaa_crypto_main.c
+++ b/drivers/crypto/intel/iaa/iaa_crypto_main.c
@@ -66,6 +66,381 @@ static void wq_table_clear_entry(int cpu)
static LIST_HEAD(iaa_devices);
static DEFINE_MUTEX(iaa_devices_lock);

+static struct iaa_compression_mode *iaa_compression_modes[IAA_COMP_MODES_MAX];
+static int active_compression_mode;
+
+static ssize_t compression_mode_show(struct device_driver *driver, char *buf)
+{
+ int ret = 0;
+
+ ret = sprintf(buf, "%s\n", "fixed");
+
+ return ret;
+}
+
+static ssize_t compression_mode_store(struct device_driver *driver,
+ const char *buf, size_t count)
+{
+ int ret = -EBUSY;
+ char *mode_name;
+
+ mutex_lock(&iaa_devices_lock);
+
+ mode_name = kstrndup(buf, count, GFP_KERNEL);
+ if (!mode_name) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ ret = set_iaa_compression_mode(strim(mode_name));
+ if (ret == 0)
+ ret = count;
+
+ kfree(mode_name);
+out:
+ mutex_unlock(&iaa_devices_lock);
+
+ return ret;
+}
+static DRIVER_ATTR_RW(compression_mode);
+
+static int find_empty_iaa_compression_mode(void)
+{
+ int i = -EINVAL;
+
+ for (i = 0; i < IAA_COMP_MODES_MAX; i++) {
+ if (iaa_compression_modes[i])
+ continue;
+ break;
+ }
+
+ return i;
+}
+
+static struct iaa_compression_mode *find_iaa_compression_mode(const char *name, int *idx)
+{
+ struct iaa_compression_mode *mode;
+ int i;
+
+ for (i = 0; i < IAA_COMP_MODES_MAX; i++) {
+ mode = iaa_compression_modes[i];
+ if (!mode)
+ continue;
+
+ if (!strcmp(mode->name, name)) {
+ *idx = i;
+ return iaa_compression_modes[i];
+ }
+ }
+
+ return NULL;
+}
+
+static void free_iaa_compression_mode(struct iaa_compression_mode *mode)
+{
+ kfree(mode->name);
+ kfree(mode->ll_table);
+ kfree(mode->d_table);
+ kfree(mode->header_table);
+
+ kfree(mode);
+}
+
+/*
+ * IAA Compression modes are defined by an ll_table, a d_table, and an
+ * optional header_table. These tables are typically generated and
+ * captured using statistics collected from running actual
+ * compress/decompress workloads.
+ *
+ * A module or other kernel code can add and remove compression modes
+ * with a given name using the exported @add_iaa_compression_mode()
+ * and @remove_iaa_compression_mode functions.
+ *
+ * Successfully added compression modes can be selected using the
+ * function @set_iaa_compression_mode(), passing in the name of the
+ * compression mode. Henceforth, all compressions and decompressions
+ * will use the given compression mode. Any in-flight decompressions
+ * using the old mode will subsequently fail.
+ *
+ * When a new compression mode is added, the tables are saved in a
+ * global compression mode list. When IAA devices are added, a
+ * per-IAA device dma mapping is created for each IAA device, for each
+ * compression mode. These are the tables used to do the actual
+ * compression/deccompression and are unmapped if/when the devices are
+ * removed. Currently, compression modes must be added before any
+ * device is added, and removed after all devices have been removed.
+ */
+
+/**
+ * remove_iaa_compression_mode - Remove an IAA compression mode
+ * @name: The name the compression mode will be known as
+ *
+ * Remove the IAA compression mode named @name.
+ */
+void remove_iaa_compression_mode(const char *name)
+{
+ struct iaa_compression_mode *mode;
+ int idx;
+
+ mutex_lock(&iaa_devices_lock);
+
+ if (!list_empty(&iaa_devices))
+ goto out;
+
+ mode = find_iaa_compression_mode(name, &idx);
+ if (mode) {
+ free_iaa_compression_mode(mode);
+ iaa_compression_modes[idx] = NULL;
+ }
+out:
+ mutex_unlock(&iaa_devices_lock);
+}
+EXPORT_SYMBOL_GPL(remove_iaa_compression_mode);
+
+/**
+ * add_iaa_compression_mode - Add an IAA compression mode
+ * @name: The name the compression mode will be known as
+ * @ll_table: The ll table
+ * @ll_table_size: The ll table size in bytes
+ * @d_table: The d table
+ * @d_table_size: The d table size in bytes
+ * @header_table: Optional header table
+ * @header_table_size: Optional header table size in bytes
+ * @gen_decomp_table_flags: Otional flags used to generate the decomp table
+ * @init: Optional callback function to init the compression mode data
+ * @free: Optional callback function to free the compression mode data
+ *
+ * Add a new IAA compression mode named @name. If successful, @name
+ * can subsequently be given to @set_iaa_compression_mode() to make
+ * that mode the current mode for iaa compression/decompression.
+ *
+ * Returns 0 if successful, errcode otherwise.
+ */
+int add_iaa_compression_mode(const char *name,
+ const u32 *ll_table,
+ int ll_table_size,
+ const u32 *d_table,
+ int d_table_size,
+ const u8 *header_table,
+ int header_table_size,
+ u16 gen_decomp_table_flags,
+ iaa_dev_comp_init_fn_t init,
+ iaa_dev_comp_free_fn_t free)
+{
+ struct iaa_compression_mode *mode;
+ int idx, ret = -ENOMEM;
+
+ mutex_lock(&iaa_devices_lock);
+
+ if (!list_empty(&iaa_devices)) {
+ ret = -EBUSY;
+ goto out;
+ }
+
+ mode = kzalloc(sizeof(*mode), GFP_KERNEL);
+ if (!mode)
+ goto out;
+
+ mode->name = kstrdup(name, GFP_KERNEL);
+ if (!mode->name)
+ goto free;
+
+ if (ll_table) {
+ mode->ll_table = kzalloc(ll_table_size, GFP_KERNEL);
+ if (!mode->ll_table)
+ goto free;
+ memcpy(mode->ll_table, ll_table, ll_table_size);
+ mode->ll_table_size = ll_table_size;
+ }
+
+ if (d_table) {
+ mode->d_table = kzalloc(d_table_size, GFP_KERNEL);
+ if (!mode->d_table)
+ goto free;
+ memcpy(mode->d_table, d_table, d_table_size);
+ mode->d_table_size = d_table_size;
+ }
+
+ if (header_table) {
+ mode->header_table = kzalloc(header_table_size, GFP_KERNEL);
+ if (!mode->header_table)
+ goto free;
+ memcpy(mode->header_table, header_table, header_table_size);
+ mode->header_table_size = header_table_size;
+ }
+
+ mode->gen_decomp_table_flags = gen_decomp_table_flags;
+
+ mode->init = init;
+ mode->free = free;
+
+ idx = find_empty_iaa_compression_mode();
+ if (idx < 0)
+ goto free;
+
+ pr_debug("IAA compression mode %s added at idx %d\n",
+ mode->name, idx);
+
+ iaa_compression_modes[idx] = mode;
+
+ ret = 0;
+out:
+ mutex_unlock(&iaa_devices_lock);
+
+ return ret;
+free:
+ free_iaa_compression_mode(mode);
+ goto out;
+}
+EXPORT_SYMBOL_GPL(add_iaa_compression_mode);
+
+static void set_iaa_device_compression_mode(struct iaa_device *iaa_device, int idx)
+{
+ iaa_device->active_compression_mode = iaa_device->compression_modes[idx];
+}
+
+static void update_iaa_devices_compression_mode(void)
+{
+ struct iaa_device *iaa_device;
+
+ list_for_each_entry(iaa_device, &iaa_devices, list)
+ set_iaa_device_compression_mode(iaa_device, active_compression_mode);
+}
+
+/**
+ * set_iaa_compression_mode - Set an IAA compression mode
+ * @name: The name of the compression mode
+ *
+ * Make the IAA compression mode named @name the current compression
+ * mode used by compression/decompression.
+ */
+
+int set_iaa_compression_mode(const char *name)
+{
+ struct iaa_compression_mode *mode;
+ int ret = -EINVAL;
+ int idx;
+
+ mode = find_iaa_compression_mode(name, &idx);
+ if (mode) {
+ active_compression_mode = idx;
+ update_iaa_devices_compression_mode();
+ pr_debug("compression mode set to: %s\n", name);
+ ret = 0;
+ }
+
+ return ret;
+}
+
+static void free_device_compression_mode(struct iaa_device *iaa_device,
+ struct iaa_device_compression_mode *device_mode)
+{
+ size_t size = sizeof(struct aecs_comp_table_record) + IAA_AECS_ALIGN;
+ struct device *dev = &iaa_device->idxd->pdev->dev;
+
+ kfree(device_mode->name);
+
+ if (device_mode->aecs_comp_table)
+ dma_free_coherent(dev, size, device_mode->aecs_comp_table,
+ device_mode->aecs_comp_table_dma_addr);
+ if (device_mode->aecs_decomp_table)
+ dma_free_coherent(dev, size, device_mode->aecs_decomp_table,
+ device_mode->aecs_decomp_table_dma_addr);
+
+ kfree(device_mode);
+}
+
+static int init_device_compression_mode(struct iaa_device *iaa_device,
+ struct iaa_compression_mode *mode,
+ int idx, struct idxd_wq *wq)
+{
+ size_t size = sizeof(struct aecs_comp_table_record) + IAA_AECS_ALIGN;
+ struct device *dev = &iaa_device->idxd->pdev->dev;
+ struct iaa_device_compression_mode *device_mode;
+ int ret = -ENOMEM;
+
+ device_mode = kzalloc(sizeof(*device_mode), GFP_KERNEL);
+ if (!device_mode)
+ return -ENOMEM;
+
+ device_mode->name = kstrdup(mode->name, GFP_KERNEL);
+ if (!device_mode->name)
+ goto free;
+
+ device_mode->aecs_comp_table = dma_alloc_coherent(dev, size,
+ &device_mode->aecs_comp_table_dma_addr, GFP_KERNEL);
+ if (!device_mode->aecs_comp_table)
+ goto free;
+
+ device_mode->aecs_decomp_table = dma_alloc_coherent(dev, size,
+ &device_mode->aecs_decomp_table_dma_addr, GFP_KERNEL);
+ if (!device_mode->aecs_decomp_table)
+ goto free;
+
+ /* Add Huffman table to aecs */
+ memset(device_mode->aecs_comp_table, 0, sizeof(*device_mode->aecs_comp_table));
+ memcpy(device_mode->aecs_comp_table->ll_sym, mode->ll_table, mode->ll_table_size);
+ memcpy(device_mode->aecs_comp_table->d_sym, mode->d_table, mode->d_table_size);
+
+ if (mode->init) {
+ ret = mode->init(device_mode);
+ if (ret)
+ goto free;
+ }
+
+ /* mode index should match iaa_compression_modes idx */
+ iaa_device->compression_modes[idx] = device_mode;
+
+ pr_debug("IAA %s compression mode initialized for iaa device %d\n",
+ mode->name, iaa_device->idxd->id);
+
+ ret = 0;
+out:
+ return ret;
+free:
+ pr_debug("IAA %s compression mode initialization failed for iaa device %d\n",
+ mode->name, iaa_device->idxd->id);
+
+ free_device_compression_mode(iaa_device, device_mode);
+ goto out;
+}
+
+static int init_device_compression_modes(struct iaa_device *iaa_device,
+ struct idxd_wq *wq)
+{
+ struct iaa_compression_mode *mode;
+ int i, ret = 0;
+
+ for (i = 0; i < IAA_COMP_MODES_MAX; i++) {
+ mode = iaa_compression_modes[i];
+ if (!mode)
+ continue;
+
+ ret = init_device_compression_mode(iaa_device, mode, i, wq);
+ if (ret)
+ break;
+ }
+
+ return ret;
+}
+
+static void remove_device_compression_modes(struct iaa_device *iaa_device)
+{
+ struct iaa_device_compression_mode *device_mode;
+ int i;
+
+ for (i = 0; i < IAA_COMP_MODES_MAX; i++) {
+ device_mode = iaa_device->compression_modes[i];
+ if (!device_mode)
+ continue;
+
+ free_device_compression_mode(iaa_device, device_mode);
+ iaa_device->compression_modes[i] = NULL;
+ if (iaa_compression_modes[i]->free)
+ iaa_compression_modes[i]->free(device_mode);
+ }
+}
+
static struct iaa_device *iaa_device_alloc(void)
{
struct iaa_device *iaa_device;
@@ -120,8 +495,23 @@ static struct iaa_device *add_iaa_device(struct idxd_device *idxd)
return iaa_device;
}

+static int init_iaa_device(struct iaa_device *iaa_device, struct iaa_wq *iaa_wq)
+{
+ int ret = 0;
+
+ ret = init_device_compression_modes(iaa_device, iaa_wq->wq);
+ if (ret)
+ return ret;
+
+ set_iaa_device_compression_mode(iaa_device, active_compression_mode);
+
+ return ret;
+}
+
static void del_iaa_device(struct iaa_device *iaa_device)
{
+ remove_device_compression_modes(iaa_device);
+
list_del(&iaa_device->list);

iaa_device_free(iaa_device);
@@ -276,6 +666,13 @@ static int save_iaa_wq(struct idxd_wq *wq)
del_iaa_device(new_device);
goto out;
}
+
+ ret = init_iaa_device(new_device, new_wq);
+ if (ret) {
+ del_iaa_wq(new_device, new_wq->wq);
+ del_iaa_device(new_device);
+ goto out;
+ }
}

if (WARN_ON(nr_iaa == 0))
@@ -519,20 +916,43 @@ static int __init iaa_crypto_init_module(void)
nr_nodes = num_online_nodes();
nr_cpus_per_node = nr_cpus / nr_nodes;

+ ret = iaa_aecs_init_fixed();
+ if (ret < 0) {
+ pr_debug("IAA fixed compression mode init failed\n");
+ goto out;
+ }
+
ret = idxd_driver_register(&iaa_crypto_driver);
if (ret) {
pr_debug("IAA wq sub-driver registration failed\n");
- goto out;
+ goto err_driver_reg;
+ }
+
+ ret = driver_create_file(&iaa_crypto_driver.drv,
+ &driver_attr_compression_mode);
+ if (ret) {
+ pr_debug("IAA compression mode attr creation failed\n");
+ goto err_comp_attr_create;
}

pr_debug("initialized\n");
out:
return ret;
+
+err_comp_attr_create:
+ idxd_driver_unregister(&iaa_crypto_driver);
+err_driver_reg:
+ iaa_aecs_cleanup_fixed();
+
+ goto out;
}

static void __exit iaa_crypto_cleanup_module(void)
{
+ driver_remove_file(&iaa_crypto_driver.drv,
+ &driver_attr_compression_mode);
idxd_driver_unregister(&iaa_crypto_driver);
+ iaa_aecs_cleanup_fixed();

pr_debug("cleaned up\n");
}
--
2.34.1

2023-04-28 20:58:26

by Tom Zanussi

[permalink] [raw]
Subject: [PATCH v3 13/15] crypto: iaa - Add support for default IAA 'canned' compression mode

Add support for a 'canned' compression mode using the IAA compression
mode in-kernel API.

The IAA 'canned' compression mode is added alongside the existing
'fixed' compression mode and is named simply 'canned'. It implements
a good general-purpose compression scheme whose tables were generated
from statistics derived from a wide variety of SPEC17 workloads.

It provides much better overall characteristics than the existing
deflate-1951 tables implemented by 'fixed'.

Either 'fixed' or 'canned' modes can be chosen as the mode to be used
for compression/decompression via the iaa_crypto compression_mode
iaa_crypto driver attribute:

To choose 'fixed' mode:

echo "fixed" > /sys/bus/dsa/drivers/crypto/compression_mode

To choose 'canned' mode:

echo "canned" > /sys/bus/dsa/drivers/crypto/compression_mode

[ Based on work originally by George Powley, Jing Lin and Kyung Min
Park ]

Signed-off-by: Tom Zanussi <[email protected]>
---
crypto/testmgr.c | 10 ++
crypto/testmgr.h | 72 ++++++++++++
drivers/crypto/intel/iaa/Makefile | 2 +-
drivers/crypto/intel/iaa/iaa_crypto.h | 2 +
.../crypto/intel/iaa/iaa_crypto_comp_canned.c | 110 ++++++++++++++++++
drivers/crypto/intel/iaa/iaa_crypto_main.c | 44 ++++++-
6 files changed, 237 insertions(+), 3 deletions(-)
create mode 100644 drivers/crypto/intel/iaa/iaa_crypto_comp_canned.c

diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index 216878c8bc3d..28e29676a6ef 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -5313,6 +5313,16 @@ static const struct alg_test_desc alg_test_descs[] = {
.suite = {
.hash = __VECS(hmac_streebog512_tv_template)
}
+ }, {
+ .alg = "iaa-canned-deflate",
+ .test = alg_test_comp,
+ .fips_allowed = 1,
+ .suite = {
+ .comp = {
+ .comp = __VECS(iaa_canned_deflate_comp_tv_template),
+ .decomp = __VECS(iaa_canned_deflate_decomp_tv_template)
+ }
+ }
}, {
.alg = "jitterentropy_rng",
.fips_allowed = 1,
diff --git a/crypto/testmgr.h b/crypto/testmgr.h
index 5ca7a412508f..0fa6bc1a33cf 100644
--- a/crypto/testmgr.h
+++ b/crypto/testmgr.h
@@ -35829,6 +35829,78 @@ static const struct comp_testvec zlib_deflate_decomp_tv_template[] = {
},
};

+static const struct comp_testvec iaa_canned_deflate_comp_tv_template[] = {
+ {
+ .inlen = 70,
+ .outlen = 37,
+ .input = "Join us now and share the software "
+ "Join us now and share the software ",
+ .output = "\x6d\x23\x43\x23\xa4\x71\x31\xd2"
+ "\x88\xc8\x61\x52\x75\x84\x56\x1a"
+ "\x13\xa2\x8e\xd6\x49\x63\x43\x74"
+ "\xd2\x98\xc8\xe0\xd8\x61\x58\x69"
+ "\xcb\x71\x01\xe5\x7f",
+ }, {
+ .inlen = 191,
+ .outlen = 128,
+ .input = "This document describes a compression method based on the DEFLATE"
+ "compression algorithm. This document defines the application of "
+ "the DEFLATE algorithm to the IP Payload Compression Protocol.",
+ .output = "\xdd\x42\x42\x63\xa4\xda\x48\x4d"
+ "\x5c\xb8\x2e\x22\x56\xaa\xd5\xc5"
+ "\x68\xa2\x43\x83\x74\x31\x52\xb5"
+ "\x54\x13\x19\x1e\x15\xad\x8b\x89"
+ "\x09\x8d\x8c\x90\x86\xeb\x62\x43"
+ "\x22\xb5\xd2\x20\x75\x8c\x4e\x2b"
+ "\x05\x3d\x36\x44\x27\xf5\x69\xe5"
+ "\xdb\xde\xbb\x5b\x2b\x7d\x37\x75"
+ "\xd8\xc0\xc8\xe8\xd0\xd8\x90\x70"
+ "\x7b\xa9\x54\x1c\x38\x38\x34\x02"
+ "\xc2\xe2\x8e\xea\xa8\xa8\xb0\x50"
+ "\x8d\x3a\x16\xf7\x88\x0c\xd6\x8f"
+ "\x95\x1f\x40\x1a\x1b\x29\x34\xb4"
+ "\xf1\x97\xfa\xab\x87\x87\x45\xaa"
+ "\xb5\xd2\x96\x7a\x03\xf9\x47\x47"
+ "\xc6\x46\x6a\x22\xc3\xec\xff\x07",
+ },
+};
+
+static const struct comp_testvec iaa_canned_deflate_decomp_tv_template[] = {
+ {
+ .inlen = 128,
+ .outlen = 191,
+ .input = "\xdd\x42\x42\x63\xa4\xda\x48\x4d"
+ "\x5c\xb8\x2e\x22\x56\xaa\xd5\xc5"
+ "\x68\xa2\x43\x83\x74\x31\x52\xb5"
+ "\x54\x13\x19\x1e\x15\xad\x8b\x89"
+ "\x09\x8d\x8c\x90\x86\xeb\x62\x43"
+ "\x22\xb5\xd2\x20\x75\x8c\x4e\x2b"
+ "\x05\x3d\x36\x44\x27\xf5\x69\xe5"
+ "\xdb\xde\xbb\x5b\x2b\x7d\x37\x75"
+ "\xd8\xc0\xc8\xe8\xd0\xd8\x90\x70"
+ "\x7b\xa9\x54\x1c\x38\x38\x34\x02"
+ "\xc2\xe2\x8e\xea\xa8\xa8\xb0\x50"
+ "\x8d\x3a\x16\xf7\x88\x0c\xd6\x8f"
+ "\x95\x1f\x40\x1a\x1b\x29\x34\xb4"
+ "\xf1\x97\xfa\xab\x87\x87\x45\xaa"
+ "\xb5\xd2\x96\x7a\x03\xf9\x47\x47"
+ "\xc6\x46\x6a\x22\xc3\xec\xff\x07",
+ .output = "This document describes a compression method based on the DEFLATE"
+ "compression algorithm. This document defines the application of "
+ "the DEFLATE algorithm to the IP Payload Compression Protocol.",
+ }, {
+ .inlen = 37,
+ .outlen = 70,
+ .input = "\x6d\x23\x43\x23\xa4\x71\x31\xd2"
+ "\x88\xc8\x61\x52\x75\x84\x56\x1a"
+ "\x13\xa2\x8e\xd6\x49\x63\x43\x74"
+ "\xd2\x98\xc8\xe0\xd8\x61\x58\x69"
+ "\xcb\x71\x01\xe5\x7f",
+ .output = "Join us now and share the software "
+ "Join us now and share the software ",
+ },
+};
+
/*
* LZO test vectors (null-terminated strings).
*/
diff --git a/drivers/crypto/intel/iaa/Makefile b/drivers/crypto/intel/iaa/Makefile
index cc87feffd059..ff6ab1d0bc13 100644
--- a/drivers/crypto/intel/iaa/Makefile
+++ b/drivers/crypto/intel/iaa/Makefile
@@ -7,4 +7,4 @@ ccflags-y += -I $(srctree)/drivers/dma/idxd -DDEFAULT_SYMBOL_NAMESPACE=IDXD

obj-$(CONFIG_CRYPTO_DEV_IAA_CRYPTO) := iaa_crypto.o

-iaa_crypto-y := iaa_crypto_main.o iaa_crypto_comp_fixed.o
+iaa_crypto-y := iaa_crypto_main.o iaa_crypto_comp_canned.o iaa_crypto_comp_fixed.o
diff --git a/drivers/crypto/intel/iaa/iaa_crypto.h b/drivers/crypto/intel/iaa/iaa_crypto.h
index 864aa5e27e2e..cf9aec13b98d 100644
--- a/drivers/crypto/intel/iaa/iaa_crypto.h
+++ b/drivers/crypto/intel/iaa/iaa_crypto.h
@@ -115,6 +115,8 @@ struct aecs_decomp_table_record {
u32 decomp_state[1220];
} __packed;

+int iaa_aecs_init_canned(void);
+void iaa_aecs_cleanup_canned(void);
int iaa_aecs_init_fixed(void);
void iaa_aecs_cleanup_fixed(void);

diff --git a/drivers/crypto/intel/iaa/iaa_crypto_comp_canned.c b/drivers/crypto/intel/iaa/iaa_crypto_comp_canned.c
new file mode 100644
index 000000000000..aff0899ffb9e
--- /dev/null
+++ b/drivers/crypto/intel/iaa/iaa_crypto_comp_canned.c
@@ -0,0 +1,110 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Intel Corporation. All rights rsvd. */
+
+#include "idxd.h"
+#include "iaa_crypto.h"
+
+#define IAA_AECS_ALIGN 32
+
+/*
+ * These tables were generated from statistics derived from a wide
+ * variety of SPEC17 workloads and implement a good general-purpose
+ * compression scheme called simply 'canned'.
+ */
+
+static const u32 canned_ll_iaa[286] = {
+0x28002, 0x38024, 0x40066, 0x40067, 0x40068, 0x48144, 0x40069, 0x48145,
+0x4006a, 0x48146, 0x4006b, 0x48147, 0x48148, 0x48149, 0x4814a, 0x4814b,
+0x4006c, 0x4814c, 0x4814d, 0x4814e, 0x4814f, 0x48150, 0x48151, 0x48152,
+0x4006d, 0x48153, 0x48154, 0x48155, 0x48156, 0x48157, 0x48158, 0x48159,
+0x38025, 0x4815a, 0x4815b, 0x4815c, 0x4815d, 0x4815e, 0x4815f, 0x48160,
+0x4006e, 0x48161, 0x48162, 0x48163, 0x48164, 0x48165, 0x4006f, 0x48166,
+0x38026, 0x38027, 0x40070, 0x40071, 0x40072, 0x40073, 0x40074, 0x40075,
+0x38028, 0x40076, 0x40077, 0x48167, 0x40078, 0x40079, 0x4007a, 0x38029,
+0x3802a, 0x4007b, 0x48168, 0x48169, 0x4007c, 0x4816a, 0x4007d, 0x4816b,
+0x4007e, 0x4816c, 0x4816d, 0x4816e, 0x4816f, 0x48170, 0x48171, 0x48172,
+0x4007f, 0x48173, 0x48174, 0x48175, 0x48176, 0x48177, 0x48178, 0x48179,
+0x40080, 0x4817a, 0x4817b, 0x4817c, 0x4817d, 0x4817e, 0x4817f, 0x48180,
+0x40081, 0x3802b, 0x40082, 0x3802c, 0x3802d, 0x3802e, 0x40083, 0x48181,
+0x40084, 0x40085, 0x48182, 0x48183, 0x40086, 0x40087, 0x40088, 0x40089,
+0x4008a, 0x48184, 0x4008b, 0x4008c, 0x4008d, 0x4008e, 0x48185, 0x48186,
+0x4008f, 0x48187, 0x48188, 0x48189, 0x4818a, 0x4818b, 0x4818c, 0x4818d,
+0x40090, 0x4818e, 0x4818f, 0x48190, 0x48191, 0x48192, 0x48193, 0x48194,
+0x40091, 0x48195, 0x48196, 0x48197, 0x48198, 0x48199, 0x4819a, 0x4819b,
+0x40092, 0x4819c, 0x4819d, 0x4819e, 0x4819f, 0x481a0, 0x481a1, 0x481a2,
+0x40093, 0x481a3, 0x481a4, 0x481a5, 0x481a6, 0x481a7, 0x481a8, 0x481a9,
+0x40094, 0x481aa, 0x481ab, 0x481ac, 0x481ad, 0x481ae, 0x481af, 0x481b0,
+0x481b1, 0x481b2, 0x481b3, 0x481b4, 0x481b5, 0x481b6, 0x481b7, 0x481b8,
+0x40095, 0x481b9, 0x481ba, 0x481bb, 0x481bc, 0x481bd, 0x481be, 0x481bf,
+0x40096, 0x481c0, 0x481c1, 0x481c2, 0x481c3, 0x481c4, 0x481c5, 0x40097,
+0x40098, 0x481c6, 0x481c7, 0x481c8, 0x481c9, 0x481ca, 0x481cb, 0x481cc,
+0x40099, 0x481cd, 0x481ce, 0x481cf, 0x481d0, 0x481d1, 0x481d2, 0x481d3,
+0x4009a, 0x481d4, 0x481d5, 0x481d6, 0x481d7, 0x481d8, 0x481d9, 0x481da,
+0x481db, 0x481dc, 0x481dd, 0x481de, 0x481df, 0x481e0, 0x481e1, 0x481e2,
+0x4009b, 0x481e3, 0x481e4, 0x481e5, 0x481e6, 0x481e7, 0x481e8, 0x481e9,
+0x481ea, 0x481eb, 0x481ec, 0x481ed, 0x481ee, 0x481ef, 0x481f0, 0x481f1,
+0x4009c, 0x481f2, 0x481f3, 0x481f4, 0x481f5, 0x481f6, 0x481f7, 0x481f8,
+0x481f9, 0x481fa, 0x481fb, 0x503fe, 0x481fc, 0x481fd, 0x481fe, 0x4009d,
+0x503ff, 0x20000, 0x28003, 0x30010, 0x28004, 0x28005, 0x28006, 0x4009e,
+0x4009f, 0x3802f, 0x38030, 0x30011, 0x400a0, 0x38031, 0x38032, 0x400a1,
+0x28007, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
+0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
+};
+
+static const u32 canned_d_iaa[30] = {
+0x3807e, 0x20004, 0x481fe, 0x18000, 0x400fe, 0x18001, 0x3003c, 0x20005,
+0x20006, 0x28016, 0x20007, 0x28017, 0x20008, 0x28018, 0x28019, 0x20009,
+0x2000a, 0x2801a, 0x2801b, 0x2801c, 0x2801d, 0x3003d, 0x3003e, 0x481ff,
+0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
+};
+
+#define CANNED_HEADER_SIZE (71)
+
+static const u8 canned_header[] = {
+0x85, 0xd7, 0x05, 0x40, 0x54, 0x4d, 0x10, 0x06,
+0x80, 0x3d, 0x40, 0x44, 0x2c, 0x6c, 0xec, 0xb3,
+0xb1, 0xf0, 0x8e, 0x3c, 0x10, 0xb8, 0x43, 0x11,
+0xb1, 0xb1, 0xb1, 0x39, 0xee, 0x0e, 0x41, 0x29,
+0x09, 0xc5, 0xc6, 0xc6, 0xc6, 0xc6, 0xc6, 0x6e,
+0xc5, 0xc6, 0x04, 0x1b, 0x1b, 0xbb, 0x15, 0xbb,
+0x15, 0x3b, 0x7e, 0x95, 0x7f, 0xf6, 0xed, 0x2e,
+0xdc, 0x3d, 0xee, 0xd8, 0x1b, 0x3f, 0xbe, 0x37,
+0xb3, 0xb3, 0xb3, 0x3b, 0xb3, 0xf1, 0x9e,
+};
+
+#define HEADER_SIZE_IN_BITS 568
+
+#define CEIL(a, b) (((a) + ((b) - 1)) / (b))
+
+int iaa_aecs_init_canned(void)
+{
+ u16 gen_decomp_table_flags;
+ unsigned int slen;
+ int ret;
+
+ slen = CEIL(HEADER_SIZE_IN_BITS, 8);
+
+ gen_decomp_table_flags = 0x1;
+ gen_decomp_table_flags |= 1 << 9; // suppress output
+ gen_decomp_table_flags |= (((slen * 8) - HEADER_SIZE_IN_BITS) << 6);
+
+ ret = add_iaa_compression_mode("canned",
+ canned_ll_iaa,
+ sizeof(canned_ll_iaa),
+ canned_d_iaa,
+ sizeof(canned_d_iaa),
+ canned_header,
+ sizeof(canned_header),
+ gen_decomp_table_flags,
+ NULL, NULL);
+
+ if (!ret)
+ pr_debug("IAA canned compression mode initialized\n");
+
+ return ret;
+}
+
+void iaa_aecs_cleanup_canned(void)
+{
+ remove_iaa_compression_mode("canned");
+}
diff --git a/drivers/crypto/intel/iaa/iaa_crypto_main.c b/drivers/crypto/intel/iaa/iaa_crypto_main.c
index 5a9575c826b2..c82e002043bb 100644
--- a/drivers/crypto/intel/iaa/iaa_crypto_main.c
+++ b/drivers/crypto/intel/iaa/iaa_crypto_main.c
@@ -120,11 +120,16 @@ static DRIVER_ATTR_RW(verify_compress);
static struct iaa_compression_mode *iaa_compression_modes[IAA_COMP_MODES_MAX];
static int active_compression_mode;

+static bool canned_mode = true;
+
static ssize_t compression_mode_show(struct device_driver *driver, char *buf)
{
int ret = 0;

- ret = sprintf(buf, "%s\n", "fixed");
+ if (canned_mode)
+ ret = sprintf(buf, "%s\n", "canned");
+ else
+ ret = sprintf(buf, "%s\n", "fixed");

return ret;
}
@@ -383,6 +388,12 @@ int set_iaa_compression_mode(const char *name)
ret = 0;
}

+ if (ret == 0 && !strcmp(name, "canned"))
+ canned_mode = true;
+
+ if (ret == 0 && !strcmp(name, "fixed"))
+ canned_mode = false;
+
return ret;
}

@@ -1153,6 +1164,7 @@ static int iaa_compress_verify(struct crypto_tfm *tfm, struct acomp_req *req,
enum idxd_op_type optype;
struct iaa_wq *iaa_wq;
struct pci_dev *pdev;
+ dma_addr_t src2_addr;
struct device *dev;
int ret = 0;

@@ -1185,6 +1197,13 @@ static int iaa_compress_verify(struct crypto_tfm *tfm, struct acomp_req *req,
desc->max_dst_size = slen;
desc->completion_addr = idxd_desc->compl_dma;

+ if (canned_mode) {
+ src2_addr = iaa_wq->iaa_device->active_compression_mode->aecs_decomp_table_dma_addr;
+ desc->src2_addr = (u64)src2_addr;
+ desc->src2_size = 1088;
+ desc->flags |= IDXD_OP_FLAG_RD_SRC2_AECS;
+ }
+
dev_dbg(dev, "(verify) compression mode %s,"
" desc->src1_addr %llx, desc->src1_size %d,"
" desc->dst_addr %llx, desc->max_dst_size %d,"
@@ -1237,6 +1256,7 @@ static int iaa_decompress(struct crypto_tfm *tfm, struct acomp_req *req,
enum idxd_op_type optype;
struct iaa_wq *iaa_wq;
struct pci_dev *pdev;
+ dma_addr_t src2_addr;
struct device *dev;
int ret = 0;

@@ -1268,6 +1288,13 @@ static int iaa_decompress(struct crypto_tfm *tfm, struct acomp_req *req,
desc->src1_size = slen;
desc->completion_addr = idxd_desc->compl_dma;

+ if (canned_mode) {
+ src2_addr = iaa_wq->iaa_device->active_compression_mode->aecs_decomp_table_dma_addr;
+ desc->src2_addr = (u64)src2_addr;
+ desc->src2_size = 1088;
+ desc->flags |= IDXD_OP_FLAG_RD_SRC2_AECS;
+ }
+
dev_dbg(dev, "%s: decompression mode %s,"
" desc->src1_addr %llx, desc->src1_size %d,"
" desc->dst_addr %llx, desc->max_dst_size %d,"
@@ -1573,7 +1600,6 @@ static struct acomp_alg iaa_acomp_deflate = {
.decompress = iaa_comp_adecompress,
.dst_free = sgl_free,
.base = {
- .cra_name = "deflate",
.cra_driver_name = "iaa_crypto",
.cra_module = THIS_MODULE,
.cra_priority = IAA_ALG_PRIORITY,
@@ -1584,6 +1610,11 @@ static int iaa_register_compression_device(void)
{
int ret;

+ if (canned_mode)
+ strcpy(iaa_acomp_deflate.base.cra_name, "iaa-canned-deflate");
+ else
+ strcpy(iaa_acomp_deflate.base.cra_name, "deflate");
+
ret = crypto_register_acomp(&iaa_acomp_deflate);
if (ret)
pr_err("deflate algorithm acomp registration failed (%d)\n", ret);
@@ -1764,6 +1795,12 @@ static int __init iaa_crypto_init_module(void)
goto out;
}

+ ret = iaa_aecs_init_canned();
+ if (ret < 0) {
+ pr_debug("IAA canned compression mode init failed\n");
+ goto err_canned;
+ }
+
ret = idxd_driver_register(&iaa_crypto_driver);
if (ret) {
pr_debug("IAA wq sub-driver registration failed\n");
@@ -1795,6 +1832,8 @@ static int __init iaa_crypto_init_module(void)
idxd_driver_unregister(&iaa_crypto_driver);
err_driver_reg:
iaa_aecs_cleanup_fixed();
+err_canned:
+ iaa_aecs_cleanup_fixed();

goto out;
}
@@ -1809,6 +1848,7 @@ static void __exit iaa_crypto_cleanup_module(void)
driver_remove_file(&iaa_crypto_driver.drv,
&driver_attr_verify_compress);
idxd_driver_unregister(&iaa_crypto_driver);
+ iaa_aecs_cleanup_canned();
iaa_aecs_cleanup_fixed();

pr_debug("cleaned up\n");
--
2.34.1

2023-04-28 20:58:48

by Tom Zanussi

[permalink] [raw]
Subject: [PATCH v3 15/15] crypto: iaa - Add IAA Compression Accelerator stats

Add support for optional debugfs statistics support for the IAA
Compression Accelerator. This is enabled by the kernel config item:

CRYPTO_DEV_IAA_CRYPTO_STATS

When enabled, the IAA crypto driver will generate statistics which can
be accessed at /sys/kernel/debug/iaa-crypto/.

See Documentation/driver-api/crypto/iax/iax-crypto.rst for details.

Signed-off-by: Tom Zanussi <[email protected]>
---
drivers/crypto/intel/iaa/Kconfig | 9 +
drivers/crypto/intel/iaa/Makefile | 2 +
drivers/crypto/intel/iaa/iaa_crypto.h | 22 ++
drivers/crypto/intel/iaa/iaa_crypto_main.c | 65 +++++
drivers/crypto/intel/iaa/iaa_crypto_stats.c | 271 ++++++++++++++++++++
drivers/crypto/intel/iaa/iaa_crypto_stats.h | 58 +++++
6 files changed, 427 insertions(+)
create mode 100644 drivers/crypto/intel/iaa/iaa_crypto_stats.c
create mode 100644 drivers/crypto/intel/iaa/iaa_crypto_stats.h

diff --git a/drivers/crypto/intel/iaa/Kconfig b/drivers/crypto/intel/iaa/Kconfig
index fcccb6ff7e29..cffb3a4359fc 100644
--- a/drivers/crypto/intel/iaa/Kconfig
+++ b/drivers/crypto/intel/iaa/Kconfig
@@ -8,3 +8,12 @@ config CRYPTO_DEV_IAA_CRYPTO
decompression with the Intel Analytics Accelerator (IAA)
hardware using the cryptographic API. If you choose 'M'
here, the module will be called iaa_crypto.
+
+config CRYPTO_DEV_IAA_CRYPTO_STATS
+ bool "Enable Intel(R) IAA Compression Accelerator Statistics"
+ depends on CRYPTO_DEV_IAA_CRYPTO
+ default n
+ help
+ Enable statistics for the IAA compression accelerator.
+ These include per-device and per-workqueue statistics in
+ addition to global driver statistics.
diff --git a/drivers/crypto/intel/iaa/Makefile b/drivers/crypto/intel/iaa/Makefile
index ff6ab1d0bc13..2a1bee385932 100644
--- a/drivers/crypto/intel/iaa/Makefile
+++ b/drivers/crypto/intel/iaa/Makefile
@@ -8,3 +8,5 @@ ccflags-y += -I $(srctree)/drivers/dma/idxd -DDEFAULT_SYMBOL_NAMESPACE=IDXD
obj-$(CONFIG_CRYPTO_DEV_IAA_CRYPTO) := iaa_crypto.o

iaa_crypto-y := iaa_crypto_main.o iaa_crypto_comp_canned.o iaa_crypto_comp_fixed.o
+
+iaa_crypto-$(CONFIG_CRYPTO_DEV_IAA_CRYPTO_STATS) += iaa_crypto_stats.o
diff --git a/drivers/crypto/intel/iaa/iaa_crypto.h b/drivers/crypto/intel/iaa/iaa_crypto.h
index cf9aec13b98d..e322fb94ce51 100644
--- a/drivers/crypto/intel/iaa/iaa_crypto.h
+++ b/drivers/crypto/intel/iaa/iaa_crypto.h
@@ -48,6 +48,11 @@ struct iaa_wq {
bool remove;

struct iaa_device *iaa_device;
+
+ u64 comp_calls;
+ u64 comp_bytes;
+ u64 decomp_calls;
+ u64 decomp_bytes;
};

struct iaa_device_compression_mode {
@@ -70,6 +75,11 @@ struct iaa_device {

int n_wq;
struct list_head wqs;
+
+ u64 comp_calls;
+ u64 comp_bytes;
+ u64 decomp_calls;
+ u64 decomp_bytes;
};

struct wq_table_entry {
@@ -150,4 +160,16 @@ int add_iaa_compression_mode(const char *name,
void remove_iaa_compression_mode(const char *name);
int set_iaa_compression_mode(const char *name);

+#if defined(CONFIG_CRYPTO_DEV_IAA_CRYPTO_STATS)
+void global_stats_show(struct seq_file *m);
+void device_stats_show(struct seq_file *m, struct iaa_device *iaa_device);
+void reset_iaa_crypto_stats(void);
+void reset_device_stats(struct iaa_device *iaa_device);
+#else
+static inline void global_stats_show(struct seq_file *m) {}
+static inline void device_stats_show(struct seq_file *m, struct iaa_device *iaa_device) {}
+static inline void reset_iaa_crypto_stats(void) {}
+static inline void reset_device_stats(struct iaa_device *iaa_device) {}
+#endif
+
#endif
diff --git a/drivers/crypto/intel/iaa/iaa_crypto_main.c b/drivers/crypto/intel/iaa/iaa_crypto_main.c
index b1293400d466..f7f7fef2fc9a 100644
--- a/drivers/crypto/intel/iaa/iaa_crypto_main.c
+++ b/drivers/crypto/intel/iaa/iaa_crypto_main.c
@@ -14,6 +14,7 @@

#include "idxd.h"
#include "iaa_crypto.h"
+#include "iaa_crypto_stats.h"

#ifdef pr_fmt
#undef pr_fmt
@@ -1141,6 +1142,7 @@ static inline int check_completion(struct device *dev,
ret = -ETIMEDOUT;
dev_dbg(dev, "%s timed out, size=0x%x\n",
op_str, comp->output_size);
+ update_completion_timeout_errs();
goto out;
}

@@ -1150,6 +1152,7 @@ static inline int check_completion(struct device *dev,
dev_dbg(dev, "compressed > uncompressed size,"
" not compressing, size=0x%x\n",
comp->output_size);
+ update_completion_comp_buf_overflow_errs();
goto out;
}

@@ -1162,6 +1165,7 @@ static inline int check_completion(struct device *dev,
dev_dbg(dev, "iaa %s status=0x%x, error=0x%x, size=0x%x\n",
op_str, comp->status, comp->error_code, comp->output_size);
print_hex_dump(KERN_INFO, "cmp-rec: ", DUMP_PREFIX_OFFSET, 8, 1, comp, 64, 0);
+ update_completion_einval_errs();

goto out;
}
@@ -1207,6 +1211,15 @@ static void iaa_desc_complete(struct idxd_desc *idxd_desc,

ctx->req->dlen = idxd_desc->iax_completion->output_size;

+ /* Update stats */
+ if (ctx->compress) {
+ update_total_comp_bytes_out(ctx->req->dlen);
+ update_wq_comp_bytes(iaa_wq->wq, ctx->req->dlen);
+ } else {
+ update_total_decomp_bytes_in(ctx->req->dlen);
+ update_wq_decomp_bytes(iaa_wq->wq, ctx->req->dlen);
+ }
+
if (ctx->compress && iaa_verify_compress) {
u32 compression_crc;

@@ -1311,6 +1324,10 @@ static int iaa_compress(struct crypto_tfm *tfm, struct acomp_req *req,
goto err;
}

+ /* Update stats */
+ update_total_comp_calls();
+ update_wq_comp_calls(wq);
+
if (async_mode && !disable_async) {
ret = -EINPROGRESS;
dev_dbg(dev, "%s: returning -EINPROGRESS\n", __func__);
@@ -1325,6 +1342,10 @@ static int iaa_compress(struct crypto_tfm *tfm, struct acomp_req *req,

*dlen = idxd_desc->iax_completion->output_size;

+ /* Update stats */
+ update_total_comp_bytes_out(*dlen);
+ update_wq_comp_bytes(wq, *dlen);
+
*compression_crc = idxd_desc->iax_completion->crc;

if (!async_mode)
@@ -1511,6 +1532,10 @@ static int iaa_decompress(struct crypto_tfm *tfm, struct acomp_req *req,
goto err;
}

+ /* Update stats */
+ update_total_decomp_calls();
+ update_wq_decomp_calls(wq);
+
if (async_mode && !disable_async) {
ret = -EINPROGRESS;
dev_dbg(dev, "%s: returning -EINPROGRESS\n", __func__);
@@ -1527,6 +1552,10 @@ static int iaa_decompress(struct crypto_tfm *tfm, struct acomp_req *req,

if (!async_mode)
idxd_free_desc(wq, idxd_desc);
+
+ /* Update stats */
+ update_total_decomp_bytes_in(slen);
+ update_wq_decomp_bytes(wq, slen);
out:
return ret;
err:
@@ -1991,6 +2020,38 @@ static struct idxd_device_driver iaa_crypto_driver = {
.desc_complete = iaa_desc_complete,
};

+int wq_stats_show(struct seq_file *m, void *v)
+{
+ struct iaa_device *iaa_device;
+
+ mutex_lock(&iaa_devices_lock);
+
+ global_stats_show(m);
+
+ list_for_each_entry(iaa_device, &iaa_devices, list)
+ device_stats_show(m, iaa_device);
+
+ mutex_unlock(&iaa_devices_lock);
+
+ return 0;
+}
+
+int iaa_crypto_stats_reset(void *data, u64 value)
+{
+ struct iaa_device *iaa_device;
+
+ reset_iaa_crypto_stats();
+
+ mutex_lock(&iaa_devices_lock);
+
+ list_for_each_entry(iaa_device, &iaa_devices, list)
+ reset_device_stats(iaa_device);
+
+ mutex_unlock(&iaa_devices_lock);
+
+ return 0;
+}
+
static int __init iaa_crypto_init_module(void)
{
int ret = 0;
@@ -2038,6 +2099,9 @@ static int __init iaa_crypto_init_module(void)
goto err_sync_attr_create;
}

+ if (iaa_crypto_debugfs_init())
+ pr_warn("debugfs init failed, stats not available\n");
+
pr_debug("initialized\n");
out:
return ret;
@@ -2063,6 +2127,7 @@ static void __exit iaa_crypto_cleanup_module(void)
if (iaa_unregister_compression_device())
pr_debug("IAA compression device unregister failed\n");

+ iaa_crypto_debugfs_cleanup();
driver_remove_file(&iaa_crypto_driver.drv,
&driver_attr_sync_mode);
driver_remove_file(&iaa_crypto_driver.drv,
diff --git a/drivers/crypto/intel/iaa/iaa_crypto_stats.c b/drivers/crypto/intel/iaa/iaa_crypto_stats.c
new file mode 100644
index 000000000000..5ef9b6ae4d74
--- /dev/null
+++ b/drivers/crypto/intel/iaa/iaa_crypto_stats.c
@@ -0,0 +1,271 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Intel Corporation. All rights rsvd. */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/highmem.h>
+#include <linux/mm.h>
+#include <linux/slab.h>
+#include <linux/delay.h>
+#include <linux/smp.h>
+#include <uapi/linux/idxd.h>
+#include <linux/idxd.h>
+#include <linux/dmaengine.h>
+#include "../../dma/idxd/idxd.h"
+#include <linux/debugfs.h>
+#include <crypto/internal/acompress.h>
+#include "iaa_crypto.h"
+#include "iaa_crypto_stats.h"
+
+static u64 total_comp_calls;
+static u64 total_decomp_calls;
+static u64 max_comp_delay_ns;
+static u64 max_decomp_delay_ns;
+static u64 max_acomp_delay_ns;
+static u64 max_adecomp_delay_ns;
+static u64 total_comp_bytes_out;
+static u64 total_decomp_bytes_in;
+static u64 total_completion_einval_errors;
+static u64 total_completion_timeout_errors;
+static u64 total_completion_comp_buf_overflow_errors;
+
+static struct dentry *iaa_crypto_debugfs_root;
+
+void update_total_comp_calls(void)
+{
+ total_comp_calls++;
+}
+
+void update_total_comp_bytes_out(int n)
+{
+ total_comp_bytes_out += n;
+}
+
+void update_total_decomp_calls(void)
+{
+ total_decomp_calls++;
+}
+
+void update_total_decomp_bytes_in(int n)
+{
+ total_decomp_bytes_in += n;
+}
+
+void update_completion_einval_errs(void)
+{
+ total_completion_einval_errors++;
+}
+
+void update_completion_timeout_errs(void)
+{
+ total_completion_timeout_errors++;
+}
+
+void update_completion_comp_buf_overflow_errs(void)
+{
+ total_completion_comp_buf_overflow_errors++;
+}
+
+void update_max_comp_delay_ns(u64 start_time_ns)
+{
+ u64 time_diff;
+
+ time_diff = ktime_get_ns() - start_time_ns;
+
+ if (time_diff > max_comp_delay_ns)
+ max_comp_delay_ns = time_diff;
+}
+
+void update_max_decomp_delay_ns(u64 start_time_ns)
+{
+ u64 time_diff;
+
+ time_diff = ktime_get_ns() - start_time_ns;
+
+ if (time_diff > max_decomp_delay_ns)
+ max_decomp_delay_ns = time_diff;
+}
+
+void update_max_acomp_delay_ns(u64 start_time_ns)
+{
+ u64 time_diff;
+
+ time_diff = ktime_get_ns() - start_time_ns;
+
+ if (time_diff > max_acomp_delay_ns)
+ max_acomp_delay_ns = time_diff;
+}
+
+void update_max_adecomp_delay_ns(u64 start_time_ns)
+{
+ u64 time_diff;
+
+ time_diff = ktime_get_ns() - start_time_ns;
+
+ if (time_diff > max_adecomp_delay_ns)
+
+ max_adecomp_delay_ns = time_diff;
+}
+
+void update_wq_comp_calls(struct idxd_wq *idxd_wq)
+{
+ struct iaa_wq *wq = idxd_wq_private(idxd_wq);
+
+ wq->comp_calls++;
+ wq->iaa_device->comp_calls++;
+}
+
+void update_wq_comp_bytes(struct idxd_wq *idxd_wq, int n)
+{
+ struct iaa_wq *wq = idxd_wq_private(idxd_wq);
+
+ wq->comp_bytes += n;
+ wq->iaa_device->comp_bytes += n;
+}
+
+void update_wq_decomp_calls(struct idxd_wq *idxd_wq)
+{
+ struct iaa_wq *wq = idxd_wq_private(idxd_wq);
+
+ wq->decomp_calls++;
+ wq->iaa_device->decomp_calls++;
+}
+
+void update_wq_decomp_bytes(struct idxd_wq *idxd_wq, int n)
+{
+ struct iaa_wq *wq = idxd_wq_private(idxd_wq);
+
+ wq->decomp_bytes += n;
+ wq->iaa_device->decomp_bytes += n;
+}
+
+void reset_iaa_crypto_stats(void)
+{
+ total_comp_calls = 0;
+ total_decomp_calls = 0;
+ max_comp_delay_ns = 0;
+ max_decomp_delay_ns = 0;
+ max_acomp_delay_ns = 0;
+ max_adecomp_delay_ns = 0;
+ total_comp_bytes_out = 0;
+ total_decomp_bytes_in = 0;
+ total_completion_einval_errors = 0;
+ total_completion_timeout_errors = 0;
+ total_completion_comp_buf_overflow_errors = 0;
+}
+
+static void reset_wq_stats(struct iaa_wq *wq)
+{
+ wq->comp_calls = 0;
+ wq->comp_bytes = 0;
+ wq->decomp_calls = 0;
+ wq->decomp_bytes = 0;
+}
+
+void reset_device_stats(struct iaa_device *iaa_device)
+{
+ struct iaa_wq *iaa_wq;
+
+ iaa_device->comp_calls = 0;
+ iaa_device->comp_bytes = 0;
+ iaa_device->decomp_calls = 0;
+ iaa_device->decomp_bytes = 0;
+
+ list_for_each_entry(iaa_wq, &iaa_device->wqs, list)
+ reset_wq_stats(iaa_wq);
+}
+
+static void wq_show(struct seq_file *m, struct iaa_wq *iaa_wq)
+{
+ seq_printf(m, " name: %s\n", iaa_wq->wq->name);
+ seq_printf(m, " comp_calls: %llu\n", iaa_wq->comp_calls);
+ seq_printf(m, " comp_bytes: %llu\n", iaa_wq->comp_bytes);
+ seq_printf(m, " decomp_calls: %llu\n", iaa_wq->decomp_calls);
+ seq_printf(m, " decomp_bytes: %llu\n\n", iaa_wq->decomp_bytes);
+}
+
+void device_stats_show(struct seq_file *m, struct iaa_device *iaa_device)
+{
+ struct iaa_wq *iaa_wq;
+
+ seq_puts(m, "iaa device:\n");
+ seq_printf(m, " id: %d\n", iaa_device->idxd->id);
+ seq_printf(m, " n_wqs: %d\n", iaa_device->n_wq);
+ seq_printf(m, " comp_calls: %llu\n", iaa_device->comp_calls);
+ seq_printf(m, " comp_bytes: %llu\n", iaa_device->comp_bytes);
+ seq_printf(m, " decomp_calls: %llu\n", iaa_device->decomp_calls);
+ seq_printf(m, " decomp_bytes: %llu\n", iaa_device->decomp_bytes);
+ seq_puts(m, " wqs:\n");
+
+ list_for_each_entry(iaa_wq, &iaa_device->wqs, list)
+ wq_show(m, iaa_wq);
+}
+
+void global_stats_show(struct seq_file *m)
+{
+ seq_puts(m, "global stats:\n");
+ seq_printf(m, " total_comp_calls: %llu\n", total_comp_calls);
+ seq_printf(m, " total_decomp_calls: %llu\n", total_decomp_calls);
+ seq_printf(m, " total_comp_bytes_out: %llu\n", total_comp_bytes_out);
+ seq_printf(m, " total_decomp_bytes_in: %llu\n", total_decomp_bytes_in);
+ seq_printf(m, " total_completion_einval_errors: %llu\n",
+ total_completion_einval_errors);
+ seq_printf(m, " total_completion_timeout_errors: %llu\n",
+ total_completion_timeout_errors);
+ seq_printf(m, " total_completion_comp_buf_overflow_errors: %llu\n\n",
+ total_completion_comp_buf_overflow_errors);
+}
+
+static int wq_stats_open(struct inode *inode, struct file *file)
+{
+ return single_open(file, wq_stats_show, file);
+}
+
+const struct file_operations wq_stats_fops = {
+ .open = wq_stats_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = single_release,
+};
+
+DEFINE_DEBUGFS_ATTRIBUTE(wq_stats_reset_fops, NULL, iaa_crypto_stats_reset, "%llu\n");
+
+int __init iaa_crypto_debugfs_init(void)
+{
+ if (!debugfs_initialized())
+ return -ENODEV;
+
+ iaa_crypto_debugfs_root = debugfs_create_dir("iaa_crypto", NULL);
+ if (!iaa_crypto_debugfs_root)
+ return -ENOMEM;
+
+ debugfs_create_u64("max_comp_delay_ns", 0644,
+ iaa_crypto_debugfs_root, &max_comp_delay_ns);
+ debugfs_create_u64("max_decomp_delay_ns", 0644,
+ iaa_crypto_debugfs_root, &max_decomp_delay_ns);
+ debugfs_create_u64("max_acomp_delay_ns", 0644,
+ iaa_crypto_debugfs_root, &max_comp_delay_ns);
+ debugfs_create_u64("max_adecomp_delay_ns", 0644,
+ iaa_crypto_debugfs_root, &max_decomp_delay_ns);
+ debugfs_create_u64("total_comp_calls", 0644,
+ iaa_crypto_debugfs_root, &total_comp_calls);
+ debugfs_create_u64("total_decomp_calls", 0644,
+ iaa_crypto_debugfs_root, &total_decomp_calls);
+ debugfs_create_u64("total_comp_bytes_out", 0644,
+ iaa_crypto_debugfs_root, &total_comp_bytes_out);
+ debugfs_create_u64("total_decomp_bytes_in", 0644,
+ iaa_crypto_debugfs_root, &total_decomp_bytes_in);
+ debugfs_create_file("wq_stats", 0644, iaa_crypto_debugfs_root, NULL,
+ &wq_stats_fops);
+ debugfs_create_file("stats_reset", 0644, iaa_crypto_debugfs_root, NULL,
+ &wq_stats_reset_fops);
+
+ return 0;
+}
+
+void __exit iaa_crypto_debugfs_cleanup(void)
+{
+ debugfs_remove_recursive(iaa_crypto_debugfs_root);
+}
+
+MODULE_LICENSE("GPL");
diff --git a/drivers/crypto/intel/iaa/iaa_crypto_stats.h b/drivers/crypto/intel/iaa/iaa_crypto_stats.h
new file mode 100644
index 000000000000..ad8333329fa6
--- /dev/null
+++ b/drivers/crypto/intel/iaa/iaa_crypto_stats.h
@@ -0,0 +1,58 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Intel Corporation. All rights rsvd. */
+
+#ifndef __CRYPTO_DEV_IAA_CRYPTO_STATS_H__
+#define __CRYPTO_DEV_IAA_CRYPTO_STATS_H__
+
+#if defined(CONFIG_CRYPTO_DEV_IAA_CRYPTO_STATS)
+int iaa_crypto_debugfs_init(void);
+void iaa_crypto_debugfs_cleanup(void);
+
+void update_total_comp_calls(void);
+void update_total_comp_bytes_out(int n);
+void update_total_decomp_calls(void);
+void update_total_decomp_bytes_in(int n);
+void update_max_comp_delay_ns(u64 start_time_ns);
+void update_max_decomp_delay_ns(u64 start_time_ns);
+void update_max_acomp_delay_ns(u64 start_time_ns);
+void update_max_adecomp_delay_ns(u64 start_time_ns);
+void update_completion_einval_errs(void);
+void update_completion_timeout_errs(void);
+void update_completion_comp_buf_overflow_errs(void);
+
+void update_wq_comp_calls(struct idxd_wq *idxd_wq);
+void update_wq_comp_bytes(struct idxd_wq *idxd_wq, int n);
+void update_wq_decomp_calls(struct idxd_wq *idxd_wq);
+void update_wq_decomp_bytes(struct idxd_wq *idxd_wq, int n);
+
+int wq_stats_show(struct seq_file *m, void *v);
+int iaa_crypto_stats_reset(void *data, u64 value);
+
+static inline u64 iaa_get_ts(void) { return ktime_get_ns(); }
+
+#else
+static inline int iaa_crypto_debugfs_init(void) { return 0; }
+static inline void iaa_crypto_debugfs_cleanup(void) {}
+
+static inline void update_total_comp_calls(void) {}
+static inline void update_total_comp_bytes_out(int n) {}
+static inline void update_total_decomp_calls(void) {}
+static inline void update_total_decomp_bytes_in(int n) {}
+static inline void update_max_comp_delay_ns(u64 start_time_ns) {}
+static inline void update_max_decomp_delay_ns(u64 start_time_ns) {}
+static inline void update_max_acomp_delay_ns(u64 start_time_ns) {}
+static inline void update_max_adecomp_delay_ns(u64 start_time_ns) {}
+static inline void update_completion_einval_errs(void) {}
+static inline void update_completion_timeout_errs(void) {}
+static inline void update_completion_comp_buf_overflow_errs(void) {}
+
+static inline void update_wq_comp_calls(struct idxd_wq *idxd_wq) {}
+static inline void update_wq_comp_bytes(struct idxd_wq *idxd_wq, int n) {}
+static inline void update_wq_decomp_calls(struct idxd_wq *idxd_wq) {}
+static inline void update_wq_decomp_bytes(struct idxd_wq *idxd_wq, int n) {}
+
+static inline u64 iaa_get_ts(void) { return 0; }
+
+#endif // CONFIG_CRYPTO_DEV_IAA_CRYPTO_STATS
+
+#endif
--
2.34.1