2023-09-18 17:38:38

by Chuck Lever

[permalink] [raw]
Subject: [PATCH v1] SUNRPC: Remove BUG_ON call sites

From: Chuck Lever <[email protected]>

There is no need to take down the whole system for these assertions.

I'd rather not attempt a heroic save here, as some bug has occurred
that has left the transport data structures in an unknown state.
Just warn and then leak the left-over resources.

Signed-off-by: Chuck Lever <[email protected]>
---
net/sunrpc/svc.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)

Let's start here. Comments?


diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index 587811a002c9..11a1d5e7f5c7 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -575,11 +575,14 @@ svc_destroy(struct kref *ref)
timer_shutdown_sync(&serv->sv_temptimer);

/*
- * The last user is gone and thus all sockets have to be destroyed to
- * the point. Check this.
+ * Remaining transports at this point are not expected.
*/
- BUG_ON(!list_empty(&serv->sv_permsocks));
- BUG_ON(!list_empty(&serv->sv_tempsocks));
+ if (unlikely(!list_empty(&serv->sv_permsocks)))
+ pr_warn("SVC: permsocks remain for %s\n",
+ serv->sv_program->pg_name);
+ if (unlikely(!list_empty(&serv->sv_tempsocks)))
+ pr_warn("SVC: tempsocks remain for %s\n",
+ serv->sv_program->pg_name);

cache_clean_deferred(serv);




2023-09-19 13:10:36

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH v1] SUNRPC: Remove BUG_ON call sites

On Tue, 19 Sep 2023, Chuck Lever wrote:
> From: Chuck Lever <[email protected]>
>
> There is no need to take down the whole system for these assertions.
>
> I'd rather not attempt a heroic save here, as some bug has occurred
> that has left the transport data structures in an unknown state.
> Just warn and then leak the left-over resources.
>
> Signed-off-by: Chuck Lever <[email protected]>
> ---
> net/sunrpc/svc.c | 11 +++++++----
> 1 file changed, 7 insertions(+), 4 deletions(-)
>
> Let's start here. Comments?
>
>
> diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
> index 587811a002c9..11a1d5e7f5c7 100644
> --- a/net/sunrpc/svc.c
> +++ b/net/sunrpc/svc.c
> @@ -575,11 +575,14 @@ svc_destroy(struct kref *ref)
> timer_shutdown_sync(&serv->sv_temptimer);
>
> /*
> - * The last user is gone and thus all sockets have to be destroyed to
> - * the point. Check this.
> + * Remaining transports at this point are not expected.
> */
> - BUG_ON(!list_empty(&serv->sv_permsocks));
> - BUG_ON(!list_empty(&serv->sv_tempsocks));
> + if (unlikely(!list_empty(&serv->sv_permsocks)))
> + pr_warn("SVC: permsocks remain for %s\n",
> + serv->sv_program->pg_name);

I would go with WARN_ON_ONCE() but I agree with the principle.
Maybe
WARN_ONCE(!list_empty(&serv->sv_permsocks),
"SVC: permsocks remain for %s\n",
serv->sv_program->pg_name);
This gives the stack trace which might be helpful.
But

Reviewed-by: NeilBrown <[email protected]>

if you prefer it the way it is.

NeilBrown

> + if (unlikely(!list_empty(&serv->sv_tempsocks)))
> + pr_warn("SVC: tempsocks remain for %s\n",
> + serv->sv_program->pg_name);
>
> cache_clean_deferred(serv);
>
>
>
>

2023-09-19 18:48:45

by Chuck Lever III

[permalink] [raw]
Subject: Re: [PATCH v1] SUNRPC: Remove BUG_ON call sites



> On Sep 18, 2023, at 8:02 PM, NeilBrown <[email protected]> wrote:
>
> On Tue, 19 Sep 2023, Chuck Lever wrote:
>> From: Chuck Lever <[email protected]>
>>
>> There is no need to take down the whole system for these assertions.
>>
>> I'd rather not attempt a heroic save here, as some bug has occurred
>> that has left the transport data structures in an unknown state.
>> Just warn and then leak the left-over resources.
>>
>> Signed-off-by: Chuck Lever <[email protected]>
>> ---
>> net/sunrpc/svc.c | 11 +++++++----
>> 1 file changed, 7 insertions(+), 4 deletions(-)
>>
>> Let's start here. Comments?
>>
>>
>> diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
>> index 587811a002c9..11a1d5e7f5c7 100644
>> --- a/net/sunrpc/svc.c
>> +++ b/net/sunrpc/svc.c
>> @@ -575,11 +575,14 @@ svc_destroy(struct kref *ref)
>> timer_shutdown_sync(&serv->sv_temptimer);
>>
>> /*
>> - * The last user is gone and thus all sockets have to be destroyed to
>> - * the point. Check this.
>> + * Remaining transports at this point are not expected.
>> */
>> - BUG_ON(!list_empty(&serv->sv_permsocks));
>> - BUG_ON(!list_empty(&serv->sv_tempsocks));
>> + if (unlikely(!list_empty(&serv->sv_permsocks)))
>> + pr_warn("SVC: permsocks remain for %s\n",
>> + serv->sv_program->pg_name);
>
> I would go with WARN_ON_ONCE() but I agree with the principle.
> Maybe
> WARN_ONCE(!list_empty(&serv->sv_permsocks),
> "SVC: permsocks remain for %s\n",
> serv->sv_program->pg_name);
> This gives the stack trace which might be helpful.

I couldn't think of any additional value that a stack
trace would provide over which upper layer protocol
was calling, which is provided by pg_name.



> But
>
> Reviewed-by: NeilBrown <[email protected]>
>
> if you prefer it the way it is.

WARN_ONCE is a more conservative change, so let's
do that.


> NeilBrown
>
>> + if (unlikely(!list_empty(&serv->sv_tempsocks)))
>> + pr_warn("SVC: tempsocks remain for %s\n",
>> + serv->sv_program->pg_name);
>>
>> cache_clean_deferred(serv);


--
Chuck Lever


2023-09-22 20:55:38

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v1] SUNRPC: Remove BUG_ON call sites

On Mon, 2023-09-18 at 10:18 -0400, Chuck Lever wrote:
> From: Chuck Lever <[email protected]>
>
> There is no need to take down the whole system for these assertions.
>
> I'd rather not attempt a heroic save here, as some bug has occurred
> that has left the transport data structures in an unknown state.
> Just warn and then leak the left-over resources.
>
> Signed-off-by: Chuck Lever <[email protected]>
> ---
> net/sunrpc/svc.c | 11 +++++++----
> 1 file changed, 7 insertions(+), 4 deletions(-)
>
> Let's start here. Comments?
>
>
> diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
> index 587811a002c9..11a1d5e7f5c7 100644
> --- a/net/sunrpc/svc.c
> +++ b/net/sunrpc/svc.c
> @@ -575,11 +575,14 @@ svc_destroy(struct kref *ref)
> timer_shutdown_sync(&serv->sv_temptimer);
>
> /*
> - * The last user is gone and thus all sockets have to be destroyed to
> - * the point. Check this.
> + * Remaining transports at this point are not expected.
> */
> - BUG_ON(!list_empty(&serv->sv_permsocks));
> - BUG_ON(!list_empty(&serv->sv_tempsocks));
> + if (unlikely(!list_empty(&serv->sv_permsocks)))
> + pr_warn("SVC: permsocks remain for %s\n",
> + serv->sv_program->pg_name);
> + if (unlikely(!list_empty(&serv->sv_tempsocks)))
> + pr_warn("SVC: tempsocks remain for %s\n",
> + serv->sv_program->pg_name);
>
> cache_clean_deferred(serv);
>
>
>

Reviewed-by: Jeff Layton <[email protected]>