2023-07-29 12:19:57

by Aleksa Savic

Subject: [PATCH] hwmon: (aquacomputer_d5next) Add selective 200ms delay after sending ctrl report

Add a 200ms delay after sending a ctrl report to Quadro,
Octo, D5 Next and Aquaero to give them enough time to
process the request and save the data to memory. Otherwise,
under heavier userspace loads where multiple sysfs entries
are usually set in quick succession, a new ctrl report could
be requested from the device while it's still processing the
previous one and fail with -EPIPE.

Reported by a user on Github [1] and tested by both of us.

[1] https://github.com/aleksamagicka/aquacomputer_d5next-hwmon/issues/82

Cc: [email protected]
Signed-off-by: Aleksa Savic <[email protected]>
---
drivers/hwmon/aquacomputer_d5next.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)

diff --git a/drivers/hwmon/aquacomputer_d5next.c b/drivers/hwmon/aquacomputer_d5next.c
index a997dbcb563f..9cb55d51185a 100644
--- a/drivers/hwmon/aquacomputer_d5next.c
+++ b/drivers/hwmon/aquacomputer_d5next.c
@@ -652,6 +652,31 @@ static int aqc_send_ctrl_data(struct aqc_data *priv)
 	ret = hid_hw_raw_request(priv->hdev, priv->secondary_ctrl_report_id,
 				 priv->secondary_ctrl_report, priv->secondary_ctrl_report_size,
 				 HID_FEATURE_REPORT, HID_REQ_SET_REPORT);
+	if (ret < 0)
+		return ret;
+
+	/*
+	 * Wait 200ms before returning to make sure that the device actually processed both reports
+	 * and saved ctrl data to memory. Otherwise, an aqc_get_ctrl_data() call made shortly after
+	 * may fail with -EPIPE because the device is still busy and can't provide data. This can
+	 * happen when userspace tools, such as fancontrol or liquidctl, write to sysfs entries in
+	 * quick succession.
+	 *
+	 * 200ms was found to be the sweet spot between fixing the issue and not significantly
+	 * prolonging the call. Quadro, Octo, D5 Next and Aquaero are currently known to be
+	 * affected.
+	 */
+	switch (priv->kind) {
+	case quadro:
+	case octo:
+	case d5next:
+	case aquaero:
+		msleep(200);
+		break;
+	default:
+		break;
+	}
+
 	return ret;
 }

--
2.41.0



2023-07-29 14:28:38

by kernel test robot

Subject: Re: [PATCH] hwmon: (aquacomputer_d5next) Add selective 200ms delay after sending ctrl report

Hi Aleksa,

kernel test robot noticed the following build errors:

[auto build test ERROR on groeck-staging/hwmon-next]
[also build test ERROR on linus/master v6.5-rc3 next-20230728]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Aleksa-Savic/hwmon-aquacomputer_d5next-Add-selective-200ms-delay-after-sending-ctrl-report/20230729-193038
base: https://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging.git hwmon-next
patch link: https://lore.kernel.org/r/20230729112732.5516-1-savicaleksa83%40gmail.com
patch subject: [PATCH] hwmon: (aquacomputer_d5next) Add selective 200ms delay after sending ctrl report
config: arm-randconfig-r005-20230729 (https://download.01.org/0day-ci/archive/20230729/[email protected]/config)
compiler: arm-linux-gnueabi-gcc (GCC) 12.3.0
reproduce: (https://download.01.org/0day-ci/archive/20230729/[email protected]/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/

All errors (new ones prefixed by >>):

   drivers/hwmon/aquacomputer_d5next.c: In function 'aqc_send_ctrl_data':
>> drivers/hwmon/aquacomputer_d5next.c:674:17: error: implicit declaration of function 'msleep' [-Werror=implicit-function-declaration]
     674 |                 msleep(200);
         |                 ^~~~~~
   cc1: some warnings being treated as errors


vim +/msleep +674 drivers/hwmon/aquacomputer_d5next.c

   627
   628	/* Expects the mutex to be locked */
   629	static int aqc_send_ctrl_data(struct aqc_data *priv)
   630	{
   631		int ret;
   632		u16 checksum;
   633
   634		/* Checksum is not needed for Aquaero */
   635		if (priv->kind != aquaero) {
   636			/* Init and xorout value for CRC-16/USB is 0xffff */
   637			checksum = crc16(0xffff, priv->buffer + priv->checksum_start,
   638					 priv->checksum_length);
   639			checksum ^= 0xffff;
   640
   641			/* Place the new checksum at the end of the report */
   642			put_unaligned_be16(checksum, priv->buffer + priv->checksum_offset);
   643		}
   644
   645		/* Send the patched up report back to the device */
   646		ret = hid_hw_raw_request(priv->hdev, priv->ctrl_report_id, priv->buffer, priv->buffer_size,
   647					 HID_FEATURE_REPORT, HID_REQ_SET_REPORT);
   648		if (ret < 0)
   649			return ret;
   650
   651		/* The official software sends this report after every change, so do it here as well */
   652		ret = hid_hw_raw_request(priv->hdev, priv->secondary_ctrl_report_id,
   653					 priv->secondary_ctrl_report, priv->secondary_ctrl_report_size,
   654					 HID_FEATURE_REPORT, HID_REQ_SET_REPORT);
   655		if (ret < 0)
   656			return ret;
   657
   658		/*
   659		 * Wait 200ms before returning to make sure that the device actually processed both reports
   660		 * and saved ctrl data to memory. Otherwise, an aqc_get_ctrl_data() call made shortly after
   661		 * may fail with -EPIPE because the device is still busy and can't provide data. This can
   662		 * happen when userspace tools, such as fancontrol or liquidctl, write to sysfs entries in
   663		 * quick succession.
   664		 *
   665		 * 200ms was found to be the sweet spot between fixing the issue and not significantly
   666		 * prolonging the call. Quadro, Octo, D5 Next and Aquaero are currently known to be
   667		 * affected.
   668		 */
   669		switch (priv->kind) {
   670		case quadro:
   671		case octo:
   672		case d5next:
   673		case aquaero:
 > 674			msleep(200);
   675			break;
   676		default:
   677			break;
   678		}
   679
   680		return ret;
   681	}
   682

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

2023-07-29 14:28:44

by Guenter Roeck

Subject: Re: [PATCH] hwmon: (aquacomputer_d5next) Add selective 200ms delay after sending ctrl report

On 7/29/23 04:27, Aleksa Savic wrote:
> Add a 200ms delay after sending a ctrl report to Quadro,
> Octo, D5 Next and Aquaero to give them enough time to
> process the request and save the data to memory. Otherwise,
> under heavier userspace loads where multiple sysfs entries
> are usually set in quick succession, a new ctrl report could
> be requested from the device while it's still processing the
> previous one and fail with -EPIPE.
>
> Reported by a user on Github [1] and tested by both of us.
>
> [1] https://github.com/aleksamagicka/aquacomputer_d5next-hwmon/issues/82
>
> Cc: [email protected]
> Signed-off-by: Aleksa Savic <[email protected]>
> ---
> drivers/hwmon/aquacomputer_d5next.c | 25 +++++++++++++++++++++++++
> 1 file changed, 25 insertions(+)
>
> diff --git a/drivers/hwmon/aquacomputer_d5next.c b/drivers/hwmon/aquacomputer_d5next.c
> index a997dbcb563f..9cb55d51185a 100644
> --- a/drivers/hwmon/aquacomputer_d5next.c
> +++ b/drivers/hwmon/aquacomputer_d5next.c
> @@ -652,6 +652,31 @@ static int aqc_send_ctrl_data(struct aqc_data *priv)
>  	ret = hid_hw_raw_request(priv->hdev, priv->secondary_ctrl_report_id,
>  				 priv->secondary_ctrl_report, priv->secondary_ctrl_report_size,
>  				 HID_FEATURE_REPORT, HID_REQ_SET_REPORT);
> +	if (ret < 0)
> +		return ret;
> +
> +	/*
> +	 * Wait 200ms before returning to make sure that the device actually processed both reports
> +	 * and saved ctrl data to memory. Otherwise, an aqc_get_ctrl_data() call made shortly after
> +	 * may fail with -EPIPE because the device is still busy and can't provide data. This can
> +	 * happen when userspace tools, such as fancontrol or liquidctl, write to sysfs entries in
> +	 * quick succession.
> +	 *
> +	 * 200ms was found to be the sweet spot between fixing the issue and not significantly
> +	 * prolonging the call. Quadro, Octo, D5 Next and Aquaero are currently known to be
> +	 * affected.
> +	 */
> +	switch (priv->kind) {
> +	case quadro:
> +	case octo:
> +	case d5next:
> +	case aquaero:
> +		msleep(200);
> +		break;
> +	default:
> +		break;
> +	}
> +
>  	return ret;
>  }
>

This would force writes to sleep even if there is no subsequent operation.
Please make this conditional by saving the most recent access time and wait
on the subsequent operation. I would also suggest to store the wait time
in struct aqc_data to avoid the switch statement in the data path. An example
for a driver doing something similar is drivers/hwmon/pmbus/zl6100.c.

Thanks,
Guenter
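
The review suggestion above (remember when the last ctrl report was sent and
wait only on the next access) can be illustrated with a minimal userspace
sketch. All names here (dev_state, remaining_delay_ns, wait_for_device,
mark_ctrl_report_sent) are hypothetical and are not taken from the driver or
from zl6100.c. The sketch uses clock_gettime() and nanosleep() as stand-ins;
a kernel implementation would presumably use the kernel's own timekeeping and
sleep helpers instead.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical stand-in for the fields that would live in struct aqc_data. */
struct dev_state {
	int sent;                     /* non-zero once a ctrl report has been sent */
	int64_t last_ctrl_report_ns;  /* monotonic time of the last ctrl report */
	int64_t ctrl_report_delay_ns; /* per-device delay; 0 means no wait needed */
};

/* Current monotonic time in nanoseconds. */
static int64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * NSEC_PER_SEC + ts.tv_nsec;
}

/* How much of the delay is still outstanding at time 'now' (0 if none). */
static int64_t remaining_delay_ns(const struct dev_state *st, int64_t now)
{
	int64_t elapsed;

	if (!st->sent || st->ctrl_report_delay_ns <= 0)
		return 0;

	elapsed = now - st->last_ctrl_report_ns;
	if (elapsed >= st->ctrl_report_delay_ns)
		return 0;

	return st->ctrl_report_delay_ns - elapsed;
}

/* Call before any ctrl report access; sleeps only for the outstanding part. */
static void wait_for_device(const struct dev_state *st)
{
	int64_t rem = remaining_delay_ns(st, now_ns());
	struct timespec ts;

	if (rem == 0)
		return;

	ts.tv_sec = rem / NSEC_PER_SEC;
	ts.tv_nsec = rem % NSEC_PER_SEC;

	/* Resume the sleep if a signal interrupts it. */
	while (nanosleep(&ts, &ts) == -1 && errno == EINTR)
		;
}

/* Call after a ctrl report was sent successfully. */
static void mark_ctrl_report_sent(struct dev_state *st)
{
	st->last_ctrl_report_ns = now_ns();
	st->sent = 1;
}
```

With this shape, an isolated write returns immediately after
mark_ctrl_report_sent(), and only an access that arrives within the delay
window pays for the remaining time in wait_for_device(). Keeping the delay as
a per-device field also removes the need for a switch on the device kind in
the data path: devices that need no wait simply carry a delay of 0.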


2023-07-29 20:19:02

by Aleksa Savic

Subject: Re: [PATCH] hwmon: (aquacomputer_d5next) Add selective 200ms delay after sending ctrl report

On 2023-07-29 15:44:32 GMT+02:00, Guenter Roeck wrote:
> On 7/29/23 04:27, Aleksa Savic wrote:
>> Add a 200ms delay after sending a ctrl report to Quadro,
>> Octo, D5 Next and Aquaero to give them enough time to
>> process the request and save the data to memory. Otherwise,
>> under heavier userspace loads where multiple sysfs entries
>> are usually set in quick succession, a new ctrl report could
>> be requested from the device while it's still processing the
>> previous one and fail with -EPIPE.
>>
>> Reported by a user on Github [1] and tested by both of us.
>>
>> [1] https://github.com/aleksamagicka/aquacomputer_d5next-hwmon/issues/82
>>
>> Cc: [email protected]
>> Signed-off-by: Aleksa Savic <[email protected]>
>> ---
>>   drivers/hwmon/aquacomputer_d5next.c | 25 +++++++++++++++++++++++++
>>   1 file changed, 25 insertions(+)
>>
>> diff --git a/drivers/hwmon/aquacomputer_d5next.c b/drivers/hwmon/aquacomputer_d5next.c
>> index a997dbcb563f..9cb55d51185a 100644
>> --- a/drivers/hwmon/aquacomputer_d5next.c
>> +++ b/drivers/hwmon/aquacomputer_d5next.c
>> @@ -652,6 +652,31 @@ static int aqc_send_ctrl_data(struct aqc_data *priv)
>>       ret = hid_hw_raw_request(priv->hdev, priv->secondary_ctrl_report_id,
>>                    priv->secondary_ctrl_report, priv->secondary_ctrl_report_size,
>>                    HID_FEATURE_REPORT, HID_REQ_SET_REPORT);
>> +    if (ret < 0)
>> +        return ret;
>> +
>> +    /*
>> +     * Wait 200ms before returning to make sure that the device actually processed both reports
>> +     * and saved ctrl data to memory. Otherwise, an aqc_get_ctrl_data() call made shortly after
>> +     * may fail with -EPIPE because the device is still busy and can't provide data. This can
>> +     * happen when userspace tools, such as fancontrol or liquidctl, write to sysfs entries in
>> +     * quick succession.
>> +     *
>> +     * 200ms was found to be the sweet spot between fixing the issue and not significantly
>> +     * prolonging the call. Quadro, Octo, D5 Next and Aquaero are currently known to be
>> +     * affected.
>> +     */
>> +    switch (priv->kind) {
>> +    case quadro:
>> +    case octo:
>> +    case d5next:
>> +    case aquaero:
>> +        msleep(200);
>> +        break;
>> +    default:
>> +        break;
>> +    }
>> +
>>       return ret;
>>   }
>>  
>
> This would force writes to sleep even if there is no subsequent operation.
> Please make this conditional by saving the most recent access time and wait
> on the subsequent operation. I would also suggest to store the wait time
> in struct aqc_data to avoid the switch statement in the data path. An example
> for a driver doing something similar is drivers/hwmon/pmbus/zl6100.c.
>
> Thanks,
> Guenter
>

Thanks, will take a look at it. Please ignore v2 of this patch in this case.

Aleksa