2023-09-05 16:03:05

by Lu Jialin

Subject: [PATCH v3] crypto: Fix hungtask for PADATA_RESET

We found a hungtask bug in test_aead_vec_cfg as follows:

INFO: task cryptomgr_test:391009 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Call trace:
__switch_to+0x98/0xe0
__schedule+0x6c4/0xf40
schedule+0xd8/0x1b4
schedule_timeout+0x474/0x560
wait_for_common+0x368/0x4e0
wait_for_completion+0x20/0x30
wait_for_completion+0x20/0x30
test_aead_vec_cfg+0xab4/0xd50
test_aead+0x144/0x1f0
alg_test_aead+0xd8/0x1e0
alg_test+0x634/0x890
cryptomgr_test+0x40/0x70
kthread+0x1e0/0x220
ret_from_fork+0x10/0x18
Kernel panic - not syncing: hung_task: blocked tasks

padata_do_parallel() can return 0 or -EBUSY, and in both cases
test_aead_vec_cfg ends up calling wait_for_completion(&wait->completion).
In the normal case padata_do_parallel() returns 0 and
aead_request_complete() is later called from pcrypt_aead_serial, which
wakes the waiter. But when PADATA_RESET is set in pinst->flags,
padata_do_parallel() returns -EBUSY and aead_request_complete() is never
called. test_aead_vec_cfg therefore hangs in
wait_for_completion(&wait->completion), which triggers the hung task
report.
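
The wait decision described above can be sketched in user space. This is
only a model, not kernel code: the model_* names are hypothetical, and the
rule "wait on -EINPROGRESS or -EBUSY" is assumed to mirror what the test
code does through the crypto_wait_req() helper.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical model of the caller's decision: it sleeps in
 * wait_for_completion() when the request was queued (-EINPROGRESS)
 * or reported as busy (-EBUSY), and returns immediately otherwise.
 */
static bool model_caller_waits(int err)
{
	return err == -EINPROGRESS || err == -EBUSY;
}

/*
 * The caller hangs when it decides to wait but nothing will ever
 * signal the completion for this request.
 */
static bool model_caller_hangs(int err, bool completion_will_fire)
{
	return model_caller_waits(err) && !completion_will_fire;
}

/* The three cases discussed in this patch description. */
static void model_hang_demo(void)
{
	/* Normal path: request queued, completion fires later. */
	assert(!model_caller_hangs(-EINPROGRESS, true));
	/* PADATA_RESET path before the fix: -EBUSY, nobody completes. */
	assert(model_caller_hangs(-EBUSY, false));
	/* After the fix: -EAGAIN is a plain error, the caller does not wait. */
	assert(!model_caller_hangs(-EAGAIN, false));
}
```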

The race occurs as follows:
(padata_do_parallel)                  |
    rcu_read_lock_bh();               |
    err = -EINVAL;                    |   (padata_replace)
                                      |     pinst->flags |= PADATA_RESET;
    err = -EBUSY                      |
    if (pinst->flags & PADATA_RESET)  |
        rcu_read_unlock_bh()          |
        return err

To resolve the problem, return -EAGAIN instead of -EBUSY to the caller.
-EAGAIN means that parallel_data is changing and the caller should submit
the request again.
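
The error mapping can be illustrated with a small user-space sketch. It is
a simplified model under stated assumptions, not the kernel implementation:
the model_* functions and the MODEL_PADATA_RESET value are hypothetical, and
only the PADATA_RESET check of padata_do_parallel() is represented.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical flag value, chosen only for this sketch. */
#define MODEL_PADATA_RESET 0x1u

/* Stand-in for padata_do_parallel(): only the reset check is modelled. */
static int model_padata_do_parallel(unsigned int flags)
{
	if (flags & MODEL_PADATA_RESET)
		return -EBUSY;	/* instance is being reset, nothing queued */
	return 0;		/* request queued for parallel processing */
}

/*
 * Stand-in for the tail of pcrypt_aead_encrypt()/pcrypt_aead_decrypt()
 * with this patch applied.
 */
static int model_pcrypt_submit(unsigned int flags)
{
	int err = model_padata_do_parallel(flags);

	if (!err)
		return -EINPROGRESS;	/* completion will be signalled later */
	if (err == -EBUSY)
		return -EAGAIN;		/* not queued, caller should retry */

	return err;
}

/* Both paths: normal submission and submission during a reset. */
static void model_submit_demo(void)
{
	assert(model_pcrypt_submit(0) == -EINPROGRESS);
	assert(model_pcrypt_submit(MODEL_PADATA_RESET) == -EAGAIN);
}
```

With this mapping a caller that waits only on -EINPROGRESS or -EBUSY sees
-EAGAIN as an ordinary error and returns instead of sleeping.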

v3:
remove retry and just change the return err.
v2:
introduce padata_try_do_parallel() in pcrypt_aead_encrypt and
pcrypt_aead_decrypt to solve the hungtask.

Signed-off-by: Lu Jialin <[email protected]>
Signed-off-by: Guo Zihua <[email protected]>
---
crypto/pcrypt.c | 4 ++++
kernel/padata.c | 2 +-
2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/crypto/pcrypt.c b/crypto/pcrypt.c
index 8c1d0ca41213..d0d954fe9d54 100644
--- a/crypto/pcrypt.c
+++ b/crypto/pcrypt.c
@@ -117,6 +117,8 @@ static int pcrypt_aead_encrypt(struct aead_request *req)
 	err = padata_do_parallel(ictx->psenc, padata, &ctx->cb_cpu);
 	if (!err)
 		return -EINPROGRESS;
+	if (err == -EBUSY)
+		return -EAGAIN;

 	return err;
 }
@@ -164,6 +166,8 @@ static int pcrypt_aead_decrypt(struct aead_request *req)
 	err = padata_do_parallel(ictx->psdec, padata, &ctx->cb_cpu);
 	if (!err)
 		return -EINPROGRESS;
+	if (err == -EBUSY)
+		return -EAGAIN;

 	return err;
 }
diff --git a/kernel/padata.c b/kernel/padata.c
index 222d60195de6..81c8183f3176 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -202,7 +202,7 @@ int padata_do_parallel(struct padata_shell *ps,
 		*cb_cpu = cpu;
 	}

-	err =  -EBUSY;
+	err = -EBUSY;
 	if ((pinst->flags & PADATA_RESET))
 		goto out;

--
2.34.1


2023-09-05 16:42:38

by Steffen Klassert

Subject: Re: [PATCH v3] crypto: Fix hungtask for PADATA_RESET

On Mon, Sep 04, 2023 at 01:33:41PM +0000, Lu Jialin wrote:
> ---
> crypto/pcrypt.c | 4 ++++
> kernel/padata.c | 2 +-
> 2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/crypto/pcrypt.c b/crypto/pcrypt.c
> index 8c1d0ca41213..d0d954fe9d54 100644
> --- a/crypto/pcrypt.c
> +++ b/crypto/pcrypt.c
> @@ -117,6 +117,8 @@ static int pcrypt_aead_encrypt(struct aead_request *req)
> err = padata_do_parallel(ictx->psenc, padata, &ctx->cb_cpu);
> if (!err)
> return -EINPROGRESS;
> + if (err == -EBUSY)
> + return -EAGAIN;
>
> return err;
> }
> @@ -164,6 +166,8 @@ static int pcrypt_aead_decrypt(struct aead_request *req)
> err = padata_do_parallel(ictx->psdec, padata, &ctx->cb_cpu);
> if (!err)
> return -EINPROGRESS;
> + if (err == -EBUSY)
> + return -EAGAIN;
>
> return err;
> }
> diff --git a/kernel/padata.c b/kernel/padata.c
> index 222d60195de6..81c8183f3176 100644
> --- a/kernel/padata.c
> +++ b/kernel/padata.c
> @@ -202,7 +202,7 @@ int padata_do_parallel(struct padata_shell *ps,
> *cb_cpu = cpu;
> }
>
> - err = -EBUSY;
> + err = -EBUSY;

Why not just return -EAGAIN here directly?

2023-09-06 08:32:57

by Lu Jialin

Subject: Re: [PATCH v3] crypto: Fix hungtask for PADATA_RESET

Hi Steffen,

padata_do_parallel is only called by pcrypt_aead_encrypt/decrypt, so
making the change in padata_do_parallel or in pcrypt_aead_encrypt/decrypt
has the same effect. Either should be OK.

Thanks.

Herbert, both approaches look right. Which one do you prefer?

On 2023/9/5 17:45, Steffen Klassert wrote:
> On Mon, Sep 04, 2023 at 01:33:41PM +0000, Lu Jialin wrote:
>> ---
>> crypto/pcrypt.c | 4 ++++
>> kernel/padata.c | 2 +-
>> 2 files changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/crypto/pcrypt.c b/crypto/pcrypt.c
>> index 8c1d0ca41213..d0d954fe9d54 100644
>> --- a/crypto/pcrypt.c
>> +++ b/crypto/pcrypt.c
>> @@ -117,6 +117,8 @@ static int pcrypt_aead_encrypt(struct aead_request *req)
>> err = padata_do_parallel(ictx->psenc, padata, &ctx->cb_cpu);
>> if (!err)
>> return -EINPROGRESS;
>> + if (err == -EBUSY)
>> + return -EAGAIN;
>>
>> return err;
>> }
>> @@ -164,6 +166,8 @@ static int pcrypt_aead_decrypt(struct aead_request *req)
>> err = padata_do_parallel(ictx->psdec, padata, &ctx->cb_cpu);
>> if (!err)
>> return -EINPROGRESS;
>> + if (err == -EBUSY)
>> + return -EAGAIN;
>>
>> return err;
>> }
>> diff --git a/kernel/padata.c b/kernel/padata.c
>> index 222d60195de6..81c8183f3176 100644
>> --- a/kernel/padata.c
>> +++ b/kernel/padata.c
>> @@ -202,7 +202,7 @@ int padata_do_parallel(struct padata_shell *ps,
>> *cb_cpu = cpu;
>> }
>>
>> - err = -EBUSY;
>> + err = -EBUSY;
> Why not just return -EAGAIN here directly?
>
>

2023-09-16 11:09:54

by Herbert Xu

Subject: Re: [PATCH v3] crypto: Fix hungtask for PADATA_RESET

On Mon, Sep 04, 2023 at 01:33:41PM +0000, Lu Jialin wrote:
> We found a hungtask bug in test_aead_vec_cfg as follows:
>
> INFO: task cryptomgr_test:391009 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Call trace:
> __switch_to+0x98/0xe0
> __schedule+0x6c4/0xf40
> schedule+0xd8/0x1b4
> schedule_timeout+0x474/0x560
> wait_for_common+0x368/0x4e0
> wait_for_completion+0x20/0x30
> wait_for_completion+0x20/0x30
> test_aead_vec_cfg+0xab4/0xd50
> test_aead+0x144/0x1f0
> alg_test_aead+0xd8/0x1e0
> alg_test+0x634/0x890
> cryptomgr_test+0x40/0x70
> kthread+0x1e0/0x220
> ret_from_fork+0x10/0x18
> Kernel panic - not syncing: hung_task: blocked tasks
>
> padata_do_parallel() can return 0 or -EBUSY, and in both cases
> test_aead_vec_cfg ends up calling wait_for_completion(&wait->completion).
> In the normal case padata_do_parallel() returns 0 and
> aead_request_complete() is later called from pcrypt_aead_serial, which
> wakes the waiter. But when PADATA_RESET is set in pinst->flags,
> padata_do_parallel() returns -EBUSY and aead_request_complete() is never
> called. test_aead_vec_cfg therefore hangs in
> wait_for_completion(&wait->completion), which triggers the hung task
> report.
>
> The race occurs as follows:
> (padata_do_parallel)                  |
>     rcu_read_lock_bh();               |
>     err = -EINVAL;                    |   (padata_replace)
>                                       |     pinst->flags |= PADATA_RESET;
>     err = -EBUSY                      |
>     if (pinst->flags & PADATA_RESET)  |
>         rcu_read_unlock_bh()          |
>         return err
>
> To resolve the problem, return -EAGAIN instead of -EBUSY to the caller.
> -EAGAIN means that parallel_data is changing and the caller should submit
> the request again.
>
> v3:
> remove retry and just change the return err.
> v2:
> introduce padata_try_do_parallel() in pcrypt_aead_encrypt and
> pcrypt_aead_decrypt to solve the hungtask.
>
> Signed-off-by: Lu Jialin <[email protected]>
> Signed-off-by: Guo Zihua <[email protected]>
> ---
> crypto/pcrypt.c | 4 ++++
> kernel/padata.c | 2 +-
> 2 files changed, 5 insertions(+), 1 deletion(-)

Patch applied. Thanks.
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt