Hi Jens,
Today's linux-next merge of the block tree got a conflict in
fs/fs-writeback.c between commit ef7fdf5e8c87 ("vfs: add support for a
lazytime mount option") from the ext4 tree and commit 9c6ac78eb352
("writeback: fix a subtle race condition in I_DIRTY clearing") from the
block tree.
I fixed it up (I took a guess, plese check - see below) and can carry
the fix as necessary (no action is required).
--
Cheers,
Stephen Rothwell [email protected]
diff --cc fs/fs-writeback.c
index 3d87174408ae,2d609a5fbfea..000000000000
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@@ -482,14 -479,30 +482,30 @@@ __writeback_single_inode(struct inode *
* write_inode()
*/
spin_lock(&inode->i_lock);
- /* Clear I_DIRTY_PAGES if we've written out all dirty pages */
- if (!mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
- inode->i_state &= ~I_DIRTY_PAGES;
+
- dirty = inode->i_state & I_DIRTY;
- inode->i_state &= ~I_DIRTY;
+ dirty = inode->i_state & I_DIRTY_INODE;
+ inode->i_state &= ~I_DIRTY_INODE;
+
+ /*
+ * Paired with smp_mb() in __mark_inode_dirty(). This allows
+ * __mark_inode_dirty() to test i_state without grabbing i_lock -
+ * either they see the I_DIRTY bits cleared or we see the dirtied
+ * inode.
+ *
+ * I_DIRTY_PAGES is always cleared together above even if @mapping
+ * still has dirty pages. The flag is reinstated after smp_mb() if
+ * necessary. This guarantees that either __mark_inode_dirty()
+ * sees clear I_DIRTY_PAGES or we see PAGECACHE_TAG_DIRTY.
+ */
+ smp_mb();
+
+ if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
+ inode->i_state |= I_DIRTY_PAGES;
+
spin_unlock(&inode->i_lock);
+
/* Don't write the inode if only I_DIRTY_PAGES was set */
- if (dirty & (I_DIRTY_SYNC | I_DIRTY_DATASYNC)) {
+ if (dirty) {
int err = write_inode(inode, wbc);
if (ret == 0)
ret = err;
On Thu, Nov 27, 2014 at 02:53:47PM +1100, Stephen Rothwell wrote:
> Hi Jens,
>
> Today's linux-next merge of the block tree got a conflict in
> fs/fs-writeback.c between commit ef7fdf5e8c87 ("vfs: add support for a
> lazytime mount option")
Mergign that into a branch for linux-next must surely have been a
mistake, as the code isn't anywhere near ready.
I also think it very much should go into through the vfs tree.
Hello,
[adding Jeremiah Mahler to CC]
2014-11-27, 14:53:47 +1100, Stephen Rothwell wrote:
> Hi Jens,
>
> Today's linux-next merge of the block tree got a conflict in
> fs/fs-writeback.c between commit ef7fdf5e8c87 ("vfs: add support for a
> lazytime mount option") from the ext4 tree and commit 9c6ac78eb352
> ("writeback: fix a subtle race condition in I_DIRTY clearing") from the
> block tree.
>
> I fixed it up (I took a guess, plese check - see below) and can carry
> the fix as necessary (no action is required).
>
> --
> Cheers,
> Stephen Rothwell [email protected]
>
> diff --cc fs/fs-writeback.c
> index 3d87174408ae,2d609a5fbfea..000000000000
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@@ -482,14 -479,30 +482,30 @@@ __writeback_single_inode(struct inode *
> * write_inode()
> */
> spin_lock(&inode->i_lock);
> - /* Clear I_DIRTY_PAGES if we've written out all dirty pages */
> - if (!mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
> - inode->i_state &= ~I_DIRTY_PAGES;
> +
> - dirty = inode->i_state & I_DIRTY;
> - inode->i_state &= ~I_DIRTY;
> + dirty = inode->i_state & I_DIRTY_INODE;
> + inode->i_state &= ~I_DIRTY_INODE;
> +
> + /*
> + * Paired with smp_mb() in __mark_inode_dirty(). This allows
> + * __mark_inode_dirty() to test i_state without grabbing i_lock -
> + * either they see the I_DIRTY bits cleared or we see the dirtied
> + * inode.
> + *
> + * I_DIRTY_PAGES is always cleared together above even if @mapping
> + * still has dirty pages. The flag is reinstated after smp_mb() if
> + * necessary. This guarantees that either __mark_inode_dirty()
> + * sees clear I_DIRTY_PAGES or we see PAGECACHE_TAG_DIRTY.
> + */
> + smp_mb();
> +
> + if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
> + inode->i_state |= I_DIRTY_PAGES;
> +
> spin_unlock(&inode->i_lock);
> +
> /* Don't write the inode if only I_DIRTY_PAGES was set */
> - if (dirty & (I_DIRTY_SYNC | I_DIRTY_DATASYNC)) {
> + if (dirty) {
> int err = write_inode(inode, wbc);
> if (ret == 0)
> ret = err;
I think there's a problem in your fix, Stephen.
I'm getting hangs at boot (strangely -- in QEMU -- only when booting
via grub, not when using -kernel) and during shutdown. Jeremiah seems
to have the same problem and his bisection led to the merge commit:
https://lkml.org/lkml/2014/11/29/17
The following solves both issues for me. I think it makes sense given
the #defines from ef7fdf5e8c87, since Tejun intended to clear
I_DIRTY_PAGES.
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index b70e45f45afa..6b2510d97a0a 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -484,7 +484,7 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
spin_lock(&inode->i_lock);
dirty = inode->i_state & I_DIRTY_INODE;
- inode->i_state &= ~I_DIRTY_INODE;
+ inode->i_state &= ~I_DIRTY;
/*
* Paired with smp_mb() in __mark_inode_dirty(). This allows
--
Thanks,
Sabrina
Note the relevant changes from the ext4 tree were dropped from the
dev branch, so this shouldn't be an issue moving forward.
- Ted
Hi,
On Sat, Nov 29, 2014 at 11:08:33AM +0100, Sabrina Dubroca wrote:
> Hello,
>
> [adding Jeremiah Mahler to CC]
>
> 2014-11-27, 14:53:47 +1100, Stephen Rothwell wrote:
> > Hi Jens,
> >
> > Today's linux-next merge of the block tree got a conflict in
> > fs/fs-writeback.c between commit ef7fdf5e8c87 ("vfs: add support for a
> > lazytime mount option") from the ext4 tree and commit 9c6ac78eb352
> > ("writeback: fix a subtle race condition in I_DIRTY clearing") from the
> > block tree.
> >
> > I fixed it up (I took a guess, plese check - see below) and can carry
> > the fix as necessary (no action is required).
> >
> > --
> > Cheers,
> > Stephen Rothwell [email protected]
> >
> > diff --cc fs/fs-writeback.c
> > index 3d87174408ae,2d609a5fbfea..000000000000
> > --- a/fs/fs-writeback.c
> > +++ b/fs/fs-writeback.c
> > @@@ -482,14 -479,30 +482,30 @@@ __writeback_single_inode(struct inode *
> > * write_inode()
> > */
> > spin_lock(&inode->i_lock);
> > - /* Clear I_DIRTY_PAGES if we've written out all dirty pages */
> > - if (!mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
> > - inode->i_state &= ~I_DIRTY_PAGES;
> > +
> > - dirty = inode->i_state & I_DIRTY;
> > - inode->i_state &= ~I_DIRTY;
> > + dirty = inode->i_state & I_DIRTY_INODE;
> > + inode->i_state &= ~I_DIRTY_INODE;
> > +
> > + /*
> > + * Paired with smp_mb() in __mark_inode_dirty(). This allows
> > + * __mark_inode_dirty() to test i_state without grabbing i_lock -
> > + * either they see the I_DIRTY bits cleared or we see the dirtied
> > + * inode.
> > + *
> > + * I_DIRTY_PAGES is always cleared together above even if @mapping
> > + * still has dirty pages. The flag is reinstated after smp_mb() if
> > + * necessary. This guarantees that either __mark_inode_dirty()
> > + * sees clear I_DIRTY_PAGES or we see PAGECACHE_TAG_DIRTY.
> > + */
> > + smp_mb();
> > +
> > + if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
> > + inode->i_state |= I_DIRTY_PAGES;
> > +
> > spin_unlock(&inode->i_lock);
> > +
> > /* Don't write the inode if only I_DIRTY_PAGES was set */
> > - if (dirty & (I_DIRTY_SYNC | I_DIRTY_DATASYNC)) {
> > + if (dirty) {
> > int err = write_inode(inode, wbc);
> > if (ret == 0)
> > ret = err;
>
>
> I think there's a problem in your fix, Stephen.
>
> I'm getting hangs at boot (strangely -- in QEMU -- only when booting
> via grub, not when using -kernel) and during shutdown. Jeremiah seems
> to have the same problem and his bisection led to the merge commit:
> https://lkml.org/lkml/2014/11/29/17
>
> The following solves both issues for me. I think it makes sense given
> the #defines from ef7fdf5e8c87, since Tejun intended to clear
> I_DIRTY_PAGES.
>
>
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index b70e45f45afa..6b2510d97a0a 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -484,7 +484,7 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
> spin_lock(&inode->i_lock);
>
> dirty = inode->i_state & I_DIRTY_INODE;
> - inode->i_state &= ~I_DIRTY_INODE;
> + inode->i_state &= ~I_DIRTY;
>
> /*
> * Paired with smp_mb() in __mark_inode_dirty(). This allows
>
>
> --
> Thanks,
> Sabrina
That change fixes the problem.
Thanks Sabrina and Stephen!
--
- Jeremiah Mahler