Hello,
I was thinking how to best address the performance regression coming from
NFS write congestion. After considering various options and concerns raised
in the previous discussion, I've got an idea for a simple option that could
help to keep the server more busy - just mimick what block devices do and
block the flush worker waiting for congestion to resolve instead of aborting
the writeback. And it actually helps enough that I don't think more complex
solutions are warranted at this point.
This patch series has two preparatory cleanups and then a patch implementing
this idea.
Honza
Atomic types should better be initialized with atomic_long_set() instead
of relying on zeroing done by kzalloc(). Clean this up.
Signed-off-by: Jan Kara <[email protected]>
---
fs/nfs/client.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index de77848ae654..3b252dceebf5 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -994,6 +994,8 @@ struct nfs_server *nfs_alloc_server(void)
server->change_attr_type = NFS4_CHANGE_TYPE_IS_UNDEFINED;
+ atomic_long_set(&server->writeback, 0);
+
ida_init(&server->openowner_id);
ida_init(&server->lockowner_id);
pnfs_init_server(server);
--
2.35.3
nfss->writeback is updated only when we are ending page writeback and at
that moment we also clear nfss->write_congested. So there's no point in
rechecking congestion state in nfs_commit_release_pages(). Drop the
pointless check.
Signed-off-by: Jan Kara <[email protected]>
---
fs/nfs/write.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 5fc12a721ec3..c6255d7edd3c 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -1837,7 +1837,6 @@ static void nfs_commit_release_pages(struct nfs_commit_data *data)
struct nfs_page *req;
int status = data->task.tk_status;
struct nfs_commit_info cinfo;
- struct nfs_server *nfss;
struct folio *folio;
while (!list_empty(&data->pages)) {
@@ -1880,9 +1879,6 @@ static void nfs_commit_release_pages(struct nfs_commit_data *data)
/* Latency breaker */
cond_resched();
}
- nfss = NFS_SERVER(data->inode);
- if (atomic_long_read(&nfss->writeback) < NFS_CONGESTION_OFF_THRESH)
- nfss->write_congested = 0;
nfs_init_cinfo(&cinfo, data->inode, data->dreq);
nfs_commit_end(cinfo.mds);
--
2.35.3
Reviewed-by: Sagi Grimberg <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
On Thu, 2024-06-13 at 10:28 +0200, Jan Kara wrote:
> nfss->writeback is updated only when we are ending page writeback and
> at
> that moment we also clear nfss->write_congested. So there's no point
> in
> rechecking congestion state in nfs_commit_release_pages(). Drop the
> pointless check.
>
> Signed-off-by: Jan Kara <[email protected]>
> ---
> fs/nfs/write.c | 4 ----
> 1 file changed, 4 deletions(-)
>
> diff --git a/fs/nfs/write.c b/fs/nfs/write.c
> index 5fc12a721ec3..c6255d7edd3c 100644
> --- a/fs/nfs/write.c
> +++ b/fs/nfs/write.c
> @@ -1837,7 +1837,6 @@ static void nfs_commit_release_pages(struct
> nfs_commit_data *data)
> struct nfs_page *req;
> int status = data->task.tk_status;
> struct nfs_commit_info cinfo;
> - struct nfs_server *nfss;
> struct folio *folio;
>
> while (!list_empty(&data->pages)) {
> @@ -1880,9 +1879,6 @@ static void nfs_commit_release_pages(struct
> nfs_commit_data *data)
> /* Latency breaker */
> cond_resched();
> }
> - nfss = NFS_SERVER(data->inode);
> - if (atomic_long_read(&nfss->writeback) <
> NFS_CONGESTION_OFF_THRESH)
> - nfss->write_congested = 0;
>
> nfs_init_cinfo(&cinfo, data->inode, data->dreq);
> nfs_commit_end(cinfo.mds);
Reviewed-by: Jeff Layton <[email protected]>
On Thu, 2024-06-13 at 10:28 +0200, Jan Kara wrote:
> Atomic types should better be initialized with atomic_long_set()
> instead
> of relying on zeroing done by kzalloc(). Clean this up.
>
> Signed-off-by: Jan Kara <[email protected]>
> ---
> fs/nfs/client.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/fs/nfs/client.c b/fs/nfs/client.c
> index de77848ae654..3b252dceebf5 100644
> --- a/fs/nfs/client.c
> +++ b/fs/nfs/client.c
> @@ -994,6 +994,8 @@ struct nfs_server *nfs_alloc_server(void)
>
> server->change_attr_type = NFS4_CHANGE_TYPE_IS_UNDEFINED;
>
> + atomic_long_set(&server->writeback, 0);
> +
> ida_init(&server->openowner_id);
> ida_init(&server->lockowner_id);
> pnfs_init_server(server);
I'm guilty of doing this, well, all over the place. Is there any
plausible scenario where another task could see this value set to non-
zero after kzalloc()? One would hope not...
--
Jeff Layton <[email protected]>
On Thu 13-06-24 15:56:36, Jeff Layton wrote:
> On Thu, 2024-06-13 at 10:28 +0200, Jan Kara wrote:
> > Atomic types should better be initialized with atomic_long_set()
> > instead
> > of relying on zeroing done by kzalloc(). Clean this up.
> >
> > Signed-off-by: Jan Kara <[email protected]>
> > ---
> > ?fs/nfs/client.c | 2 ++
> > ?1 file changed, 2 insertions(+)
> >
> > diff --git a/fs/nfs/client.c b/fs/nfs/client.c
> > index de77848ae654..3b252dceebf5 100644
> > --- a/fs/nfs/client.c
> > +++ b/fs/nfs/client.c
> > @@ -994,6 +994,8 @@ struct nfs_server *nfs_alloc_server(void)
> > ?
> > ? server->change_attr_type = NFS4_CHANGE_TYPE_IS_UNDEFINED;
> > ?
> > + atomic_long_set(&server->writeback, 0);
> > +
> > ? ida_init(&server->openowner_id);
> > ? ida_init(&server->lockowner_id);
> > ? pnfs_init_server(server);
>
> I'm guilty of doing this, well, all over the place. Is there any
> plausible scenario where another task could see this value set to non-
> zero after kzalloc()? One would hope not...
No, it is not a practical problem these days. It is more a theoretical
correctness thing that atomic_t should be initialized through atomic_set()
and not memset() because in theory you don't know how some weird
architecture decides to implement it.
Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR