2024-04-23 03:12:54

by Dai Ngo

[permalink] [raw]
Subject: PATCH [v2 0/3] NFSD: drop TCP connections when NFSv4 client enters courtesy state

When a v4.0 client enters courtesy state all its v4 states remain valid
and its fore and back channel TCP connection remained in ESTABLISHED
state until the TCP keep-alive mechanism timed out and shuts down the
back channel connection. The fore channel connection remains in ESTABLISHED
state between 6 - 12 minutes before the NFSv4 server's 6-minute idle timer
(svc_age_temp_xprts) shuts down the idle connection.

Since NFSv4.1 mount uses the same TCP connection for both fore and back
channel connection there is no TCP keep-alive packet sent from the server
to the client. The server's idle timer does not shutdown an idle v4.1
connection since the svc_xprt->xpt_ref is more than 1: 1 for sv_tempsocks
list, one for the session's nfsd4_conn and 1 for the back channel.

To conserve system resources in large configuration where there are lots
of idle clients, this patch series drop the fore and back channel connection
of NFSv4 client as soon as it enters the courtesy state. The fore and back
channel connections are automatically re-established when the courtesy
client reconnects.

v1 - v2:
[patch 3/3]: a more reliable way to check if the connection needs to be dropped.

fs/nfsd/nfs4callback.c | 14 ++++++++++++--
fs/nfsd/nfs4state.c | 32 ++++++++++++++++++++++++++++++--
fs/nfsd/state.h | 2 ++
3 files changed, 44 insertions(+), 4 deletions(-)



2024-04-23 03:12:57

by Dai Ngo

[permalink] [raw]
Subject: [PATCH v2 1/3] NFSD: mark cl_cb_state as NFSD4_CB_DOWN if cl_cb_client is NULL

In nfsd4_run_cb_work if the rpc_clnt for the back channel is no longer
exists, the callback state in nfs4_client should be marked as NFSD4_CB_DOWN
so the server can notify the client to establish a new back channel
connection.

Signed-off-by: Dai Ngo <[email protected]>
---
fs/nfsd/nfs4callback.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index cf87ace7a1b0..f8bb5ff2e9ac 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -1491,9 +1491,14 @@ nfsd4_run_cb_work(struct work_struct *work)

clnt = clp->cl_cb_client;
if (!clnt) {
- if (test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags))
+ if (test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags)) {
nfsd41_destroy_cb(cb);
- else {
+ clear_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags);
+
+ /* let client knows BC is down when it reconnects */
+ clear_bit(NFSD4_CLIENT_CB_UPDATE, &clp->cl_flags);
+ nfsd4_mark_cb_down(clp);
+ } else {
/*
* XXX: Ideally, we could wait for the client to
* reconnect, but I haven't figured out how
--
2.39.3


2024-04-23 03:12:59

by Dai Ngo

[permalink] [raw]
Subject: [PATCH v2 2/3] NFSD: add helper to set NFSD4_CLIENT_CB_KILL to stop the callback

Add helper for nfs4state functions to set NFSD4_CLIENT_CB_KILL
to stop the callback.

Signed-off-by: Dai Ngo <[email protected]>
---
fs/nfsd/nfs4callback.c | 5 +++++
fs/nfsd/state.h | 1 +
2 files changed, 6 insertions(+)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index f8bb5ff2e9ac..4c3a4d5df626 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -1562,3 +1562,8 @@ bool nfsd4_run_cb(struct nfsd4_callback *cb)
nfsd41_cb_inflight_end(clp);
return queued;
}
+
+void nfsd4_kill_callback(struct nfs4_client *clp)
+{
+ set_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags);
+}
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index f42d8d782c84..cde05c26afd8 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -757,6 +757,7 @@ struct nfsd_file *find_any_file(struct nfs4_file *f);

#ifdef CONFIG_NFSD_V4
void nfsd4_revoke_states(struct net *net, struct super_block *sb);
+void nfsd4_kill_callback(struct nfs4_client *clp);
#else
static inline void nfsd4_revoke_states(struct net *net, struct super_block *sb)
{
--
2.39.3


2024-04-23 03:13:00

by Dai Ngo

[permalink] [raw]
Subject: [PATCH v2 3/3] NFSD: drop TCP connections when NFSv4 client enters courtesy state

When a v4.0 client enters courtesy state all its v4 states remain valid
and its fore and back channel TCP connection remained in ESTABLISHED
state until the TCP keep-alive mechanism timed out and shuts down the
back channel connection. The fore channel connection remains in ESTABLISHED
state between 6 - 12 minutes before the NFSv4 server's 6-minute idle timer
(svc_age_temp_xprts) shuts down the idle connection.

Since NFSv4.1 mount uses the same TCP connection for both fore and back
channel connection there is no TCP keep-alive packet sent from the server
to the client. The server's idle timer does not shutdown an idle v4.1
connection since the svc_xprt->xpt_ref is more than 1: 1 for sv_tempsocks
list, one for the session's nfsd4_conn and 1 for the back channel.

To conserve system resources in large configuration where there are lots
of idle clients, this patch drops the fore and back channel connection
of NFSv4 client as soon as it enters the courtesy state. The fore and back
channel connections are automatically re-established when the courtesy
client reconnects.

Signed-off-by: Dai Ngo <[email protected]>
---
fs/nfsd/nfs4state.c | 32 ++++++++++++++++++++++++++++++--
fs/nfsd/state.h | 1 +
2 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index a20c2c9d7d45..d9f6e7dbb2e1 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -6369,6 +6369,22 @@ nfs4_anylock_blockers(struct nfs4_client *clp)
return false;
}

+static void nfsd4_drop_conns(struct nfsd_net *nn, struct nfs4_client *clp)
+{
+ struct svc_xprt *xprt;
+
+ /* stop requeueing callback in nfsd4_run_cb_work */
+ nfsd4_kill_callback(clp);
+
+ spin_lock_bh(&nn->nfsd_serv->sv_lock);
+ list_for_each_entry(xprt, &nn->nfsd_serv->sv_tempsocks, xpt_list) {
+ if (rpc_cmp_addr((struct sockaddr *)&clp->cl_addr,
+ (struct sockaddr *)&xprt->xpt_remote))
+ svc_xprt_deferred_close(xprt);
+ }
+ spin_unlock_bh(&nn->nfsd_serv->sv_lock);
+}
+
static void
nfs4_get_client_reaplist(struct nfsd_net *nn, struct list_head *reaplist,
struct laundry_time *lt)
@@ -6376,10 +6392,13 @@ nfs4_get_client_reaplist(struct nfsd_net *nn, struct list_head *reaplist,
unsigned int maxreap, reapcnt = 0;
struct list_head *pos, *next;
struct nfs4_client *clp;
+ struct list_head conn_reaplist;
+ bool drop;

maxreap = (atomic_read(&nn->nfs4_client_count) >= nn->nfs4_max_clients) ?
NFSD_CLIENT_MAX_TRIM_PER_RUN : 0;
INIT_LIST_HEAD(reaplist);
+ INIT_LIST_HEAD(&conn_reaplist);
spin_lock(&nn->client_lock);
list_for_each_safe(pos, next, &nn->client_lru) {
clp = list_entry(pos, struct nfs4_client, cl_lru);
@@ -6387,16 +6406,22 @@ nfs4_get_client_reaplist(struct nfsd_net *nn, struct list_head *reaplist,
goto exp_client;
if (!state_expired(lt, clp->cl_time))
break;
+ drop = false;
if (!atomic_read(&clp->cl_rpc_users)) {
- if (clp->cl_state == NFSD4_ACTIVE)
+ if (clp->cl_state == NFSD4_ACTIVE) {
atomic_inc(&nn->nfsd_courtesy_clients);
+ drop = true;
+ }
clp->cl_state = NFSD4_COURTESY;
}
if (!client_has_state(clp))
goto exp_client;
if (!nfs4_anylock_blockers(clp))
- if (reapcnt >= maxreap)
+ if (reapcnt >= maxreap) {
+ if (drop)
+ list_add(&clp->cl_conn_lru, &conn_reaplist);
continue;
+ }
exp_client:
if (!mark_client_expired_locked(clp)) {
list_add(&clp->cl_lru, reaplist);
@@ -6404,6 +6429,9 @@ nfs4_get_client_reaplist(struct nfsd_net *nn, struct list_head *reaplist,
}
}
spin_unlock(&nn->client_lock);
+
+ list_for_each_entry(clp, &conn_reaplist, cl_conn_lru)
+ nfsd4_drop_conns(nn, clp);
}

static void
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index cde05c26afd8..fe7b5bd6460b 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -420,6 +420,7 @@ struct nfs4_client {
int cl_cb_state;
struct nfsd4_callback cl_cb_null;
struct nfsd4_session *cl_cb_session;
+ struct list_head cl_conn_lru;

/* for all client information that callback code might need: */
spinlock_t cl_lock;
--
2.39.3


2024-04-23 13:42:16

by Chuck Lever III

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] NFSD: mark cl_cb_state as NFSD4_CB_DOWN if cl_cb_client is NULL

On Mon, Apr 22, 2024 at 08:12:31PM -0700, Dai Ngo wrote:
> In nfsd4_run_cb_work if the rpc_clnt for the back channel is no longer
> exists, the callback state in nfs4_client should be marked as NFSD4_CB_DOWN
> so the server can notify the client to establish a new back channel
> connection.
>
> Signed-off-by: Dai Ngo <[email protected]>
> ---
> fs/nfsd/nfs4callback.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> index cf87ace7a1b0..f8bb5ff2e9ac 100644
> --- a/fs/nfsd/nfs4callback.c
> +++ b/fs/nfsd/nfs4callback.c
> @@ -1491,9 +1491,14 @@ nfsd4_run_cb_work(struct work_struct *work)
>
> clnt = clp->cl_cb_client;
> if (!clnt) {
> - if (test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags))
> + if (test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags)) {
> nfsd41_destroy_cb(cb);
> - else {
> + clear_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags);
> +
> + /* let client knows BC is down when it reconnects */
> + clear_bit(NFSD4_CLIENT_CB_UPDATE, &clp->cl_flags);
> + nfsd4_mark_cb_down(clp);
> + } else {
> /*
> * XXX: Ideally, we could wait for the client to
> * reconnect, but I haven't figured out how

NFSD4_CLIENT_CB_KILL is for when the lease is getting expunged. It's
not supposed to be used when only the transport is closed. Thus,
shouldn't you mark_cb_down in this arm, instead? Even so, isn't the
backchannel already marked down when we get here?


--
Chuck Lever

2024-04-23 17:58:10

by Dai Ngo

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] NFSD: mark cl_cb_state as NFSD4_CB_DOWN if cl_cb_client is NULL


On 4/23/24 6:41 AM, Chuck Lever wrote:
> On Mon, Apr 22, 2024 at 08:12:31PM -0700, Dai Ngo wrote:
>> In nfsd4_run_cb_work if the rpc_clnt for the back channel is no longer
>> exists, the callback state in nfs4_client should be marked as NFSD4_CB_DOWN
>> so the server can notify the client to establish a new back channel
>> connection.
>>
>> Signed-off-by: Dai Ngo <[email protected]>
>> ---
>> fs/nfsd/nfs4callback.c | 9 +++++++--
>> 1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
>> index cf87ace7a1b0..f8bb5ff2e9ac 100644
>> --- a/fs/nfsd/nfs4callback.c
>> +++ b/fs/nfsd/nfs4callback.c
>> @@ -1491,9 +1491,14 @@ nfsd4_run_cb_work(struct work_struct *work)
>>
>> clnt = clp->cl_cb_client;
>> if (!clnt) {
>> - if (test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags))
>> + if (test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags)) {
>> nfsd41_destroy_cb(cb);
>> - else {
>> + clear_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags);
>> +
>> + /* let client knows BC is down when it reconnects */
>> + clear_bit(NFSD4_CLIENT_CB_UPDATE, &clp->cl_flags);
>> + nfsd4_mark_cb_down(clp);
>> + } else {
>> /*
>> * XXX: Ideally, we could wait for the client to
>> * reconnect, but I haven't figured out how
> NFSD4_CLIENT_CB_KILL is for when the lease is getting expunged. It's
> not supposed to be used when only the transport is closed.

The reason NFSD4_CLIENT_CB_KILL needs to be set when the transport is
closed is because of commit c1ccfcf1a9bf3.

When the transport is closed, nfsd4_conn_lost is called which then calls
nfsd4_probe_callback to set NFSD4_CLIENT_CB_UPDATE and schedule cl_cb_null
work to activate the callback worker (nfsd4_run_cb_work) to do the update.

Callback worker calls nfsd4_process_cb_update to do rpc_shutdown_client
then clear cl_cb_client.

When nfsd4_process_cb_update returns to nfsd4_run_cb_work, if cl_cb_client
is NULL and NFSD4_CLIENT_CB_KILL not set then it re-queues the callback,
causing an infinite loop.


> Thus,
> shouldn't you mark_cb_down in this arm, instead?

I'm not clear what you mean here, the callback worker calls
nfsd4_mark_cb_down after destroying the callback.

> Even so, isn't the
> backchannel already marked down when we get here?

No, according to my testing. Without marking the back channel down the
client does not re-establish the back channel when it reconnects.

-Dai

>
>

2024-04-23 18:34:24

by Chuck Lever III

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] NFSD: mark cl_cb_state as NFSD4_CB_DOWN if cl_cb_client is NULL

On Tue, Apr 23, 2024 at 10:49:25AM -0700, Dai Ngo wrote:
>
> On 4/23/24 6:41 AM, Chuck Lever wrote:
> > On Mon, Apr 22, 2024 at 08:12:31PM -0700, Dai Ngo wrote:
> > > In nfsd4_run_cb_work if the rpc_clnt for the back channel is no longer
> > > exists, the callback state in nfs4_client should be marked as NFSD4_CB_DOWN
> > > so the server can notify the client to establish a new back channel
> > > connection.
> > >
> > > Signed-off-by: Dai Ngo <[email protected]>
> > > ---
> > > fs/nfsd/nfs4callback.c | 9 +++++++--
> > > 1 file changed, 7 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> > > index cf87ace7a1b0..f8bb5ff2e9ac 100644
> > > --- a/fs/nfsd/nfs4callback.c
> > > +++ b/fs/nfsd/nfs4callback.c
> > > @@ -1491,9 +1491,14 @@ nfsd4_run_cb_work(struct work_struct *work)
> > > clnt = clp->cl_cb_client;
> > > if (!clnt) {
> > > - if (test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags))
> > > + if (test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags)) {
> > > nfsd41_destroy_cb(cb);
> > > - else {
> > > + clear_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags);
> > > +
> > > + /* let client knows BC is down when it reconnects */
> > > + clear_bit(NFSD4_CLIENT_CB_UPDATE, &clp->cl_flags);
> > > + nfsd4_mark_cb_down(clp);
> > > + } else {
> > > /*
> > > * XXX: Ideally, we could wait for the client to
> > > * reconnect, but I haven't figured out how
> > NFSD4_CLIENT_CB_KILL is for when the lease is getting expunged. It's
> > not supposed to be used when only the transport is closed.
>
> The reason NFSD4_CLIENT_CB_KILL needs to be set when the transport is
> closed is because of commit c1ccfcf1a9bf3.
>
> When the transport is closed, nfsd4_conn_lost is called which then calls
> nfsd4_probe_callback to set NFSD4_CLIENT_CB_UPDATE and schedule cl_cb_null
> work to activate the callback worker (nfsd4_run_cb_work) to do the update.
>
> Callback worker calls nfsd4_process_cb_update to do rpc_shutdown_client
> then clear cl_cb_client.
>
> When nfsd4_process_cb_update returns to nfsd4_run_cb_work, if cl_cb_client
> is NULL and NFSD4_CLIENT_CB_KILL not set then it re-queues the callback,
> causing an infinite loop.

That's the way it is supposed to work today. The callback is
re-queued until the client reconnects, at which point the loop is
broken.


> > Thus, shouldn't you mark_cb_down in this arm, instead?
>
> I'm not clear what you mean here, the callback worker calls
> nfsd4_mark_cb_down after destroying the callback.

No, I mean in the re-queue case.


> > Even so, isn't the
> > backchannel already marked down when we get here?
>
> No, according to my testing. Without marking the back channel down the
> client does not re-establish the back channel when it reconnects.

I didn't expect that closing the transport on the server side would
need any changes in fs/nfsd/nfs4callback.c. Let me get the
backchannel retransmit behavior sorted first. I'm still working on
setting up a test rig here.


--
Chuck Lever

2024-04-23 20:26:20

by Dai Ngo

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] NFSD: mark cl_cb_state as NFSD4_CB_DOWN if cl_cb_client is NULL


On 4/23/24 11:08 AM, Chuck Lever wrote:
> On Tue, Apr 23, 2024 at 10:49:25AM -0700, Dai Ngo wrote:
>> On 4/23/24 6:41 AM, Chuck Lever wrote:
>>> On Mon, Apr 22, 2024 at 08:12:31PM -0700, Dai Ngo wrote:
>>>> In nfsd4_run_cb_work if the rpc_clnt for the back channel is no longer
>>>> exists, the callback state in nfs4_client should be marked as NFSD4_CB_DOWN
>>>> so the server can notify the client to establish a new back channel
>>>> connection.
>>>>
>>>> Signed-off-by: Dai Ngo <[email protected]>
>>>> ---
>>>> fs/nfsd/nfs4callback.c | 9 +++++++--
>>>> 1 file changed, 7 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
>>>> index cf87ace7a1b0..f8bb5ff2e9ac 100644
>>>> --- a/fs/nfsd/nfs4callback.c
>>>> +++ b/fs/nfsd/nfs4callback.c
>>>> @@ -1491,9 +1491,14 @@ nfsd4_run_cb_work(struct work_struct *work)
>>>> clnt = clp->cl_cb_client;
>>>> if (!clnt) {
>>>> - if (test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags))
>>>> + if (test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags)) {
>>>> nfsd41_destroy_cb(cb);
>>>> - else {
>>>> + clear_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags);
>>>> +
>>>> + /* let client knows BC is down when it reconnects */
>>>> + clear_bit(NFSD4_CLIENT_CB_UPDATE, &clp->cl_flags);
>>>> + nfsd4_mark_cb_down(clp);
>>>> + } else {
>>>> /*
>>>> * XXX: Ideally, we could wait for the client to
>>>> * reconnect, but I haven't figured out how
>>> NFSD4_CLIENT_CB_KILL is for when the lease is getting expunged. It's
>>> not supposed to be used when only the transport is closed.
>> The reason NFSD4_CLIENT_CB_KILL needs to be set when the transport is
>> closed is because of commit c1ccfcf1a9bf3.
>>
>> When the transport is closed, nfsd4_conn_lost is called which then calls
>> nfsd4_probe_callback to set NFSD4_CLIENT_CB_UPDATE and schedule cl_cb_null
>> work to activate the callback worker (nfsd4_run_cb_work) to do the update.
>>
>> Callback worker calls nfsd4_process_cb_update to do rpc_shutdown_client
>> then clear cl_cb_client.
>>
>> When nfsd4_process_cb_update returns to nfsd4_run_cb_work, if cl_cb_client
>> is NULL and NFSD4_CLIENT_CB_KILL not set then it re-queues the callback,
>> causing an infinite loop.
> That's the way it is supposed to work today. The callback is
> re-queued until the client reconnects, at which point the loop is
> broken.

As you mentioned below, this needs to be reworked.

What if the client never comes back, decommissioned or student hibernates
the laptop and opens it up few days later. Even when the client comes back,
it might have been rebooted so the callback does not mean anything to it.

>
>
>>> Thus, shouldn't you mark_cb_down in this arm, instead?
>> I'm not clear what you mean here, the callback worker calls
>> nfsd4_mark_cb_down after destroying the callback.
> No, I mean in the re-queue case.

In the case of re-queue, the back channel is already marked as NFSD4_CB_DOWN
and cl_flags is NFSD4_CLIENT_STABLE|NFSD4_CLIENT_RECLAIM_COMPLETE|NFSD4_CLIENT_CONFIRMED:

Apr 23 08:07:23 nfsvmc14 kernel: nfsd4_run_cb_work: NULL cl_cb_client REQUEUE CB cb[ffff888126e8a728] clp[ffff888126e8a430] cl_cb_state[2] cl_flags[0x1c]

but that does not stop the loop.

>
>>> Even so, isn't the
>>> backchannel already marked down when we get here?
>> No, according to my testing. Without marking the back channel down the
>> client does not re-establish the back channel when it reconnects.
> I didn't expect that closing the transport on the server side would
> need any changes in fs/nfsd/nfs4callback.c. Let me get the
> backchannel retransmit behavior sorted first. I'm still working on
> setting up a test rig here.

Thanks, I will wait until you sort this out.

-Dai

>
>