2022-05-06 01:32:53

by Jing Xia

Subject: [PATCH] writeback: Avoid skipping inode writeback

We have run into an issue where a task gets stuck in
balance_dirty_pages_ratelimited() when performing I/O stress testing.
The reason we observed is that an I_DIRTY_PAGES inode with lots
of dirty pages is on the b_dirty_time list and standard background
writeback cannot write back the inode.
After studying the relevant code, we found that the following
scenario may lead to the issue:

task1                                   task2
-----                                   -----
fuse_flush
 write_inode_now //in b_dirty_time
  writeback_single_inode
   __writeback_single_inode
                                        fuse_write_end
                                         filemap_dirty_folio
                                          __xa_set_mark:PAGECACHE_TAG_DIRTY
    lock inode->i_lock
    if mapping tagged PAGECACHE_TAG_DIRTY
    inode->i_state |= I_DIRTY_PAGES
    unlock inode->i_lock
                                          __mark_inode_dirty:I_DIRTY_PAGES
                                           lock inode->i_lock
                                           -was dirty,inode stays in
                                           -b_dirty_time
                                           unlock inode->i_lock

   if(!(inode->i_state & I_DIRTY_ALL))
    -not true,so nothing done

This patch moves the dirty inode to the b_dirty list at the end of
writeback_single_inode() when the inode is not currently queued on
the b_io or b_more_io list.

Signed-off-by: Jing Xia <[email protected]>
---
fs/fs-writeback.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 591fe9cf1659..d7763feaf14a 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1712,6 +1712,9 @@ static int writeback_single_inode(struct inode *inode,
 	 */
 	if (!(inode->i_state & I_DIRTY_ALL))
 		inode_cgwb_move_to_attached(inode, wb);
+	else if (!(inode->i_state & I_SYNC_QUEUED) && (inode->i_state & I_DIRTY))
+		redirty_tail_locked(inode, wb);
+
 	spin_unlock(&wb->list_lock);
 	inode_sync_complete(inode);
 out:
--
2.17.1
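
For readers who want to step through the interleaving above outside the kernel, here is a minimal userspace sketch. It is a toy model, not kernel code: the M_* flag values, struct model_inode, the LIST_* names and every model_* function are hypothetical stand-ins invented for illustration, and the clearing of dirty bits at the start of writeback is a deliberate simplification. Only the ordering of the steps and the final list check follow the scenario and the diff in this mail.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values; only the relationships between masks matter. */
enum {
	M_DIRTY_SYNC  = 1 << 0,
	M_DIRTY_PAGES = 1 << 1,
	M_DIRTY_TIME  = 1 << 2,
	M_SYNC_QUEUED = 1 << 3,
	M_DIRTY       = M_DIRTY_SYNC | M_DIRTY_PAGES,
	M_DIRTY_ALL   = M_DIRTY | M_DIRTY_TIME,
};

/* Hypothetical stand-ins for the writeback lists an inode can sit on. */
enum model_list { LIST_ATTACHED, LIST_B_DIRTY, LIST_B_IO, LIST_B_DIRTY_TIME };

struct model_inode {
	unsigned int state;        /* models inode->i_state */
	bool mapping_tagged_dirty; /* models PAGECACHE_TAG_DIRTY on the mapping */
	enum model_list list;      /* which list the inode is queued on */
};

/* task1: start of writeback; simplified to "clear dirty state, write pages". */
static void model_writeback_begin(struct model_inode *inode)
{
	inode->state &= ~M_DIRTY_ALL;
	inode->mapping_tagged_dirty = false;
}

/* task1: recheck under i_lock; pages dirtied meanwhile re-set the flag. */
static void model_writeback_recheck(struct model_inode *inode)
{
	if (inode->mapping_tagged_dirty)
		inode->state |= M_DIRTY_PAGES;
}

/* task2: only a previously clean inode is moved to b_dirty. */
static void model_mark_inode_dirty(struct model_inode *inode, unsigned int flags)
{
	bool was_dirty = inode->state & M_DIRTY;

	assert(flags & M_DIRTY_ALL);
	inode->state |= flags;
	if (!was_dirty)
		inode->list = LIST_B_DIRTY;
}

/* task1: tail of writeback_single_inode, without or with the new branch. */
static void model_writeback_tail(struct model_inode *inode, bool with_fix)
{
	if (!(inode->state & M_DIRTY_ALL))
		inode->list = LIST_ATTACHED;
	else if (with_fix && (inode->state & M_DIRTY) &&
		 !(inode->state & M_SYNC_QUEUED))
		inode->list = LIST_B_DIRTY; /* models redirty_tail_locked() */
}

/* Replays the interleaving from the scenario and returns the final list. */
enum model_list run_scenario(bool with_fix)
{
	struct model_inode inode = {
		.state = M_DIRTY_TIME,
		.mapping_tagged_dirty = false,
		.list = LIST_B_DIRTY_TIME,
	};

	model_writeback_begin(&inode);                 /* task1 */
	inode.mapping_tagged_dirty = true;             /* task2: __xa_set_mark */
	model_writeback_recheck(&inode);               /* task1 */
	model_mark_inode_dirty(&inode, M_DIRTY_PAGES); /* task2 */
	model_writeback_tail(&inode, with_fix);        /* task1 */
	return inode.list;
}
```

In this model, run_scenario(false) leaves the inode on LIST_B_DIRTY_TIME even though M_DIRTY_PAGES is set, which mirrors the stuck case described above; run_scenario(true) ends with the inode on LIST_B_DIRTY.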



2022-05-09 09:16:32

by Jan Kara

Subject: Re: [PATCH] writeback: Avoid skipping inode writeback

On Thu 05-05-22 21:47:31, Jing Xia wrote:
> We have run into an issue where a task gets stuck in
> balance_dirty_pages_ratelimited() when performing I/O stress testing.
> The reason we observed is that an I_DIRTY_PAGES inode with lots
> of dirty pages is on the b_dirty_time list and standard background
> writeback cannot write back the inode.
> After studying the relevant code, we found that the following
> scenario may lead to the issue:
>
> task1                                   task2
> -----                                   -----
> fuse_flush
>  write_inode_now //in b_dirty_time
>   writeback_single_inode
>    __writeback_single_inode
>                                         fuse_write_end
>                                          filemap_dirty_folio
>                                           __xa_set_mark:PAGECACHE_TAG_DIRTY
>     lock inode->i_lock
>     if mapping tagged PAGECACHE_TAG_DIRTY
>     inode->i_state |= I_DIRTY_PAGES
>     unlock inode->i_lock
>                                           __mark_inode_dirty:I_DIRTY_PAGES
>                                            lock inode->i_lock
>                                            -was dirty,inode stays in
>                                            -b_dirty_time
>                                            unlock inode->i_lock
>
>    if(!(inode->i_state & I_DIRTY_ALL))
>     -not true,so nothing done
>
> This patch moves the dirty inode to the b_dirty list at the end of
> writeback_single_inode() when the inode is not currently queued on
> the b_io or b_more_io list.
>
> Signed-off-by: Jing Xia <[email protected]>

Thanks for the report and the fix! The patch looks good so feel free to add:

Reviewed-by: Jan Kara <[email protected]>

Also please add tags:

CC: [email protected]
Fixes: 0ae45f63d4ef ("vfs: add support for a lazytime mount option")

Thanks.
Honza

> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 591fe9cf1659..d7763feaf14a 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -1712,6 +1712,9 @@ static int writeback_single_inode(struct inode *inode,
>  	 */
>  	if (!(inode->i_state & I_DIRTY_ALL))
>  		inode_cgwb_move_to_attached(inode, wb);
> +	else if (!(inode->i_state & I_SYNC_QUEUED) && (inode->i_state & I_DIRTY))
> +		redirty_tail_locked(inode, wb);
> +
>  	spin_unlock(&wb->list_lock);
>  	inode_sync_complete(inode);
>  out:
> --
> 2.17.1
>
--
Jan Kara <[email protected]>
SUSE Labs, CR

2022-05-09 10:23:34

by Christoph Hellwig

Subject: Re: [PATCH] writeback: Avoid skipping inode writeback

On Thu, May 05, 2022 at 09:47:31PM +0800, Jing Xia wrote:
>  	if (!(inode->i_state & I_DIRTY_ALL))
>  		inode_cgwb_move_to_attached(inode, wb);
> +	else if (!(inode->i_state & I_SYNC_QUEUED) && (inode->i_state & I_DIRTY))

Please turn this into

	else if ((inode->i_state & I_DIRTY) &&
		 !(inode->i_state & I_SYNC_QUEUED))

to keep it a little more readable.

Otherwise looks good:

Reviewed-by: Christoph Hellwig <[email protected]>

2022-05-09 10:53:28

by jing xia

Subject: Re: [PATCH] writeback: Avoid skipping inode writeback

Thanks, I'll update the patch.

On Thu, May 5, 2022 at 11:40 PM Jan Kara <[email protected]> wrote:
>
> On Thu 05-05-22 21:47:31, Jing Xia wrote:
> > We have run into an issue where a task gets stuck in
> > balance_dirty_pages_ratelimited() when performing I/O stress testing.
> > The reason we observed is that an I_DIRTY_PAGES inode with lots
> > of dirty pages is on the b_dirty_time list and standard background
> > writeback cannot write back the inode.
> > After studying the relevant code, we found that the following
> > scenario may lead to the issue:
> >
> > task1                                   task2
> > -----                                   -----
> > fuse_flush
> >  write_inode_now //in b_dirty_time
> >   writeback_single_inode
> >    __writeback_single_inode
> >                                         fuse_write_end
> >                                          filemap_dirty_folio
> >                                           __xa_set_mark:PAGECACHE_TAG_DIRTY
> >     lock inode->i_lock
> >     if mapping tagged PAGECACHE_TAG_DIRTY
> >     inode->i_state |= I_DIRTY_PAGES
> >     unlock inode->i_lock
> >                                           __mark_inode_dirty:I_DIRTY_PAGES
> >                                            lock inode->i_lock
> >                                            -was dirty,inode stays in
> >                                            -b_dirty_time
> >                                            unlock inode->i_lock
> >
> >    if(!(inode->i_state & I_DIRTY_ALL))
> >     -not true,so nothing done
> >
> > This patch moves the dirty inode to the b_dirty list at the end of
> > writeback_single_inode() when the inode is not currently queued on
> > the b_io or b_more_io list.
> >
> > Signed-off-by: Jing Xia <[email protected]>
>
> Thanks for the report and the fix! The patch looks good so feel free to add:
>
> Reviewed-by: Jan Kara <[email protected]>
>
> Also please add tags:
>
> CC: [email protected]
> Fixes: 0ae45f63d4ef ("vfs: add support for a lazytime mount option")
>
> Thanks.
> Honza
>
> > diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> > index 591fe9cf1659..d7763feaf14a 100644
> > --- a/fs/fs-writeback.c
> > +++ b/fs/fs-writeback.c
> > @@ -1712,6 +1712,9 @@ static int writeback_single_inode(struct inode *inode,
> >  	 */
> >  	if (!(inode->i_state & I_DIRTY_ALL))
> >  		inode_cgwb_move_to_attached(inode, wb);
> > +	else if (!(inode->i_state & I_SYNC_QUEUED) && (inode->i_state & I_DIRTY))
> > +		redirty_tail_locked(inode, wb);
> > +
> >  	spin_unlock(&wb->list_lock);
> >  	inode_sync_complete(inode);
> >  out:
> > --
> > 2.17.1
> >
> --
> Jan Kara <[email protected]>
> SUSE Labs, CR

2022-05-09 10:54:18

by jing xia

Subject: Re: [PATCH] writeback: Avoid skipping inode writeback

On Mon, May 9, 2022 at 2:46 PM Christoph Hellwig <[email protected]> wrote:
>
> On Thu, May 05, 2022 at 09:47:31PM +0800, Jing Xia wrote:
> >  	if (!(inode->i_state & I_DIRTY_ALL))
> >  		inode_cgwb_move_to_attached(inode, wb);
> > +	else if (!(inode->i_state & I_SYNC_QUEUED) && (inode->i_state & I_DIRTY))
>
> Please turn this into
>
> 	else if ((inode->i_state & I_DIRTY) &&
> 		 !(inode->i_state & I_SYNC_QUEUED))
>
> to keep it a little more readable.
>
> Otherwise looks good:
>
> Reviewed-by: Christoph Hellwig <[email protected]>

Ok. And thanks for the review.