2023-11-21 18:37:32

by Damian Muszynski

Subject: [PATCH] crypto: qat - add sysfs_added flag for ras

The qat_ras sysfs attribute group is registered within the
adf_dev_start() function, alongside other driver components.
If any of the functions preceding the group registration fails,
the adf_dev_start() function returns, and the caller, to undo the
operation, invokes adf_dev_stop() followed by adf_dev_shutdown().
However, the current flow lacks information about whether the
registration of the qat_ras attribute group was successful or not.

In cases where this condition is encountered, an error similar to
the following might be reported:

4xxx 0000:6b:00.0: Starting device qat_dev0
4xxx 0000:6b:00.0: qat_dev0 started 9 acceleration engines
4xxx 0000:6b:00.0: Failed to send init message
4xxx 0000:6b:00.0: Failed to start device qat_dev0
sysfs group 'qat_ras' not found for kobject '0000:6b:00.0'
...
sysfs_remove_groups+0x29/0x50
adf_sysfs_stop_ras+0x4b/0x80 [intel_qat]
adf_dev_stop+0x43/0x1d0 [intel_qat]
adf_dev_down+0x4b/0x150 [intel_qat]
...
4xxx 0000:6b:00.0: qat_dev0 stopped 9 acceleration engines
4xxx 0000:6b:00.0: Resetting device qat_dev0

To prevent attempting to remove attributes from a group that has not
been added yet, a flag named 'sysfs_added' is introduced. This flag
is set to true upon the successful registration of the attribute group.

Fixes: 532d7f6bc458 ("crypto: qat - add error counters")
Signed-off-by: Damian Muszynski <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Reviewed-by: Ahsan Atta <[email protected]>
---
 drivers/crypto/intel/qat/qat_common/adf_accel_devices.h  | 1 +
 .../crypto/intel/qat/qat_common/adf_sysfs_ras_counters.c | 7 ++++++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/intel/qat/qat_common/adf_accel_devices.h b/drivers/crypto/intel/qat/qat_common/adf_accel_devices.h
index 4ff5729a3496..9d5fdd529a2e 100644
--- a/drivers/crypto/intel/qat/qat_common/adf_accel_devices.h
+++ b/drivers/crypto/intel/qat/qat_common/adf_accel_devices.h
@@ -92,6 +92,7 @@ enum ras_errors {

 struct adf_error_counters {
 	atomic_t counter[ADF_RAS_ERRORS];
+	bool sysfs_added;
 	bool enabled;
 };

diff --git a/drivers/crypto/intel/qat/qat_common/adf_sysfs_ras_counters.c b/drivers/crypto/intel/qat/qat_common/adf_sysfs_ras_counters.c
index cffe2d722995..e97c67c87b3c 100644
--- a/drivers/crypto/intel/qat/qat_common/adf_sysfs_ras_counters.c
+++ b/drivers/crypto/intel/qat/qat_common/adf_sysfs_ras_counters.c
@@ -99,6 +99,8 @@ void adf_sysfs_start_ras(struct adf_accel_dev *accel_dev)
 	if (device_add_group(&GET_DEV(accel_dev), &qat_ras_group))
 		dev_err(&GET_DEV(accel_dev),
 			"Failed to create qat_ras attribute group.\n");
+
+	accel_dev->ras_errors.sysfs_added = true;
 }

void adf_sysfs_stop_ras(struct adf_accel_dev *accel_dev)
@@ -106,7 +108,10 @@ void adf_sysfs_stop_ras(struct adf_accel_dev *accel_dev)
 	if (!accel_dev->ras_errors.enabled)
 		return;

-	device_remove_group(&GET_DEV(accel_dev), &qat_ras_group);
+	if (accel_dev->ras_errors.sysfs_added) {
+		device_remove_group(&GET_DEV(accel_dev), &qat_ras_group);
+		accel_dev->ras_errors.sysfs_added = false;
+	}

 	ADF_RAS_ERR_CTR_CLEAR(accel_dev->ras_errors);
 }

base-commit: f36285cc1e99472bb4c6741981594a5934ad4c4e
prerequisite-patch-id: 1375fd7754ab07f7e90594b4f4893487400a7052
--
2.41.0
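
For readers outside the kernel tree, the guard-flag pattern described in the
commit message can be sketched as a small user-space C example. This is only
an illustration under stated assumptions: struct ras_state, mock_add_group(),
mock_remove_group(), start_ras() and stop_ras() are hypothetical stand-ins,
not QAT driver or sysfs APIs. One deliberate difference: the sketch sets the
flag only when registration succeeds, as the commit message words it, whereas
the hunk above sets sysfs_added after the dev_err() call whatever
device_add_group() returned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-device RAS state in the driver. */
struct ras_state {
	bool enabled;     /* feature is supported on this device */
	bool sysfs_added; /* true only while the group is registered */
};

/* Counts removals so the effect of the guard can be observed. */
static int remove_calls;

/* Mock of a group registration: 0 on success, negative on failure. */
static int mock_add_group(bool should_fail)
{
	return should_fail ? -1 : 0;
}

/* Mock of a group removal; the real call warns if the group is absent. */
static void mock_remove_group(void)
{
	remove_calls++;
}

static void start_ras(struct ras_state *ras, bool add_fails)
{
	assert(ras != NULL);

	if (!ras->enabled)
		return;

	if (mock_add_group(add_fails)) {
		fprintf(stderr, "Failed to create attribute group.\n");
		return; /* assumption: leave the flag false on failure */
	}

	ras->sysfs_added = true;
}

static void stop_ras(struct ras_state *ras)
{
	assert(ras != NULL);

	if (!ras->enabled)
		return;

	/* Remove only what was actually added, then clear the flag. */
	if (ras->sysfs_added) {
		mock_remove_group();
		ras->sysfs_added = false;
	}
}
```

With this sketch, calling stop_ras() on a state whose start never ran, or
whose registration failed, leaves remove_calls at 0. After a successful
start_ras(), one stop_ras() removes the group once and a second stop_ras()
is a no-op.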



2023-12-01 10:38:35

by Herbert Xu

Subject: Re: [PATCH] crypto: qat - add sysfs_added flag for ras

On Tue, Nov 21, 2023 at 05:59:45PM +0100, Damian Muszynski wrote:
> The qat_ras sysfs attribute group is registered within the
> adf_dev_start() function, alongside other driver components.
> If any of the functions preceding the group registration fails,
> the adf_dev_start() function returns, and the caller, to undo the
> operation, invokes adf_dev_stop() followed by adf_dev_shutdown().
> However, the current flow lacks information about whether the
> registration of the qat_ras attribute group was successful or not.
>
> In cases where this condition is encountered, an error similar to
> the following might be reported:
>
> 4xxx 0000:6b:00.0: Starting device qat_dev0
> 4xxx 0000:6b:00.0: qat_dev0 started 9 acceleration engines
> 4xxx 0000:6b:00.0: Failed to send init message
> 4xxx 0000:6b:00.0: Failed to start device qat_dev0
> sysfs group 'qat_ras' not found for kobject '0000:6b:00.0'
> ...
> sysfs_remove_groups+0x29/0x50
> adf_sysfs_stop_ras+0x4b/0x80 [intel_qat]
> adf_dev_stop+0x43/0x1d0 [intel_qat]
> adf_dev_down+0x4b/0x150 [intel_qat]
> ...
> 4xxx 0000:6b:00.0: qat_dev0 stopped 9 acceleration engines
> 4xxx 0000:6b:00.0: Resetting device qat_dev0
>
> To prevent attempting to remove attributes from a group that has not
> been added yet, a flag named 'sysfs_added' is introduced. This flag
> is set to true upon the successful registration of the attribute group.
>
> Fixes: 532d7f6bc458 ("crypto: qat - add error counters")
> Signed-off-by: Damian Muszynski <[email protected]>
> Reviewed-by: Giovanni Cabiddu <[email protected]>
> Reviewed-by: Ahsan Atta <[email protected]>
> ---
> drivers/crypto/intel/qat/qat_common/adf_accel_devices.h | 1 +
> .../crypto/intel/qat/qat_common/adf_sysfs_ras_counters.c | 7 ++++++-
> 2 files changed, 7 insertions(+), 1 deletion(-)

Patch applied. Thanks.
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt