Commit acc8d8588cb7 converted afs_writepages_region() to write back a
folio batch. The function waits for writeback to a folio, but then
proceeds to the rest of the batch without trying to write that folio
again. This patch fixes that by having it attempt to write the folio again.
This has only been compile tested.
Fixes: acc8d8588cb7 ("afs: convert afs_writepages_region() to use filemap_get_folios_tag()")
Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
fs/afs/write.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/afs/write.c b/fs/afs/write.c
index a724228e4d94..18ccb613dff8 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -731,6 +731,7 @@ static int afs_writepages_region(struct address_space *mapping,
* (changing page->mapping to NULL), or even swizzled
* back from swapper_space to tmpfs file mapping
*/
+try_again:
if (wbc->sync_mode != WB_SYNC_NONE) {
ret = folio_lock_killable(folio);
if (ret < 0) {
@@ -757,6 +758,7 @@ static int afs_writepages_region(struct address_space *mapping,
#ifdef CONFIG_AFS_FSCACHE
folio_wait_fscache(folio);
#endif
+ goto try_again;
} else {
start += folio_size(folio);
}
--
2.40.1
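The behaviour described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: toy_folio, toy_wait_writeback(), toy_write() and toy_writepages() are hypothetical names invented for illustration, and the loop is a heavily simplified stand-in for afs_writepages_region(), not the kernel code. It shows why waiting and then moving on leaves a folio dirty, while jumping back to re-examine the same folio does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio: only the two flags of interest. */
struct toy_folio {
	bool dirty;     /* holds data that has not been written back */
	bool writeback; /* an earlier write of this folio is in flight */
};

/* Model of waiting for in-flight writeback to complete. */
static void toy_wait_writeback(struct toy_folio *f)
{
	f->writeback = false;
}

/* Model of writing a folio; only legal when nothing is in flight. */
static void toy_write(struct toy_folio *f)
{
	assert(!f->writeback);
	f->dirty = false;
}

/*
 * Walk a batch of folios.
 *
 * retry == false models the behaviour the patch complains about:
 * after waiting, the loop moves on to the next folio.
 * retry == true models the patched behaviour: after waiting, the
 * loop jumps back and looks at the same folio again.
 *
 * Returns the number of folios that are still dirty afterwards.
 */
static size_t toy_writepages(struct toy_folio *batch, size_t n, bool retry)
{
	size_t still_dirty = 0;

	for (size_t i = 0; i < n; i++) {
		struct toy_folio *f = &batch[i];
try_again:
		if (!f->dirty)
			continue;
		if (f->writeback) {
			toy_wait_writeback(f);
			if (retry)
				goto try_again;
			continue; /* folio is skipped while still dirty */
		}
		toy_write(f);
	}

	for (size_t i = 0; i < n; i++)
		if (batch[i].dirty)
			still_dirty++;
	return still_dirty;
}
```

In this model, a batch containing one folio that is both dirty and under writeback ends with that folio still dirty when retry is false, and with every folio clean when retry is true.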
On Wed, 7 Jun 2023 13:41:20 -0700 "Vishal Moola (Oracle)" <[email protected]> wrote:
> Commit acc8d8588cb7 converted afs_writepages_region() to write back a
> folio batch. The function waits for writeback to a folio, but then
> proceeds to the rest of the batch without trying to write that folio
> again. This patch fixes that by having it attempt to write the folio again.
>
> This has only been compile tested.
This seems fairly serious?
> --- a/fs/afs/write.c
> +++ b/fs/afs/write.c
> @@ -731,6 +731,7 @@ static int afs_writepages_region(struct address_space *mapping,
> * (changing page->mapping to NULL), or even swizzled
> * back from swapper_space to tmpfs file mapping
> */
> +try_again:
> if (wbc->sync_mode != WB_SYNC_NONE) {
> ret = folio_lock_killable(folio);
> if (ret < 0) {
> @@ -757,6 +758,7 @@ static int afs_writepages_region(struct address_space *mapping,
> #ifdef CONFIG_AFS_FSCACHE
> folio_wait_fscache(folio);
> #endif
> + goto try_again;
> } else {
> start += folio_size(folio);
> }
From my reading, we'll fail to write out the dirty data. Presumably
not easily observable, as it will get written out again later on. But
we're also calling afs_write_back_from_locked_folio() with an unlocked
folio, which might cause mayhem.
So I'm suspecting that a cc:stable is needed. David, could you please
take a look and perhaps retest?
Thanks.
Vishal Moola (Oracle) <[email protected]> wrote:
> + goto try_again;
> } else {
> start += folio_size(folio);
The "else" is then redundant.
David
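The point about the redundant "else" can be shown with a standalone sketch. The names advance_with_else() and advance_without_else() and their parameters are hypothetical and exist only for illustration. Because the "if" branch always ends in a goto, control never falls through to the code after it, so the two forms behave identically.

```c
#include <stdbool.h>

/* Form used in the patch: goto in the "if" branch, work in the "else". */
static unsigned long advance_with_else(bool must_wait, unsigned long start,
				       unsigned long size)
{
	int waits = 0;
again:
	if (must_wait && waits == 0) {
		waits++; /* stands in for waiting on the folio */
		goto again;
	} else {
		start += size;
	}
	return start;
}

/* Same logic with the "else" dropped; the goto already leaves the branch. */
static unsigned long advance_without_else(bool must_wait, unsigned long start,
					  unsigned long size)
{
	int waits = 0;
again:
	if (must_wait && waits == 0) {
		waits++; /* stands in for waiting on the folio */
		goto again;
	}
	start += size;
	return start;
}
```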
Andrew Morton <[email protected]> wrote:
> > Commit acc8d8588cb7 converted afs_writepages_region() to write back a
> > folio batch. The function waits for writeback to a folio, but then
> > proceeds to the rest of the batch without trying to write that folio
> > again. This patch fixes that by having it attempt to write the folio again.
> >
> > This has only been compile tested.
>
> This seems fairly serious?
We will try to write the folio again later, but sync()/fsync() might now have
skipped it.
> From my reading, we'll fail to write out the dirty data. Presumably
> not easily observable, as it will get written out again later on.
As it's a network filesystem, interactions with third parties could cause
apparent corruption. Closing a file will flush it - but if there's a
simultaneous op of some other kind, a bit of a flush or a sync may get missed
and the copy visible to another user be temporarily missing that bit.
> But we're also calling afs_write_back_from_locked_folio() with an unlocked
> folio, which might cause mayhem.
Without this patch, you mean? There's a "continue" statement that should send
us back to the top of the loop before we get as far as
afs_write_back_from_locked_folio() - and then the folio_unlock() there would
go bang.
David
On Fri, 16 Jun 2023 23:43:02 +0100 David Howells <[email protected]> wrote:
> Andrew Morton <[email protected]> wrote:
>
> > > Commit acc8d8588cb7 converted afs_writepages_region() to write back a
> > > folio batch. The function waits for writeback to a folio, but then
> > > proceeds to the rest of the batch without trying to write that folio
> > > again. This patch fixes that by having it attempt to write the folio again.
> > >
> > > This has only been compile tested.
> >
> > This seems fairly serious?
>
> We will try to write the folio again later, but sync()/fsync() might now have
> skipped it.
>
> > From my reading, we'll fail to write out the dirty data. Presumably
> > not easily observable, as it will get written out again later on.
>
> As it's a network filesystem, interactions with third parties could cause
> apparent corruption. Closing a file will flush it - but if there's a
> simultaneous op of some other kind, a bit of a flush or a sync may get missed
> and the copy visible to another user be temporarily missing that bit.
>
> > But we're also calling afs_write_back_from_locked_folio() with an unlocked
> > folio, which might cause mayhem.
>
> Without this patch, you mean? There's a "continue" statement that should send
> us back to the top of the loop before we get as far as
> afs_write_back_from_locked_folio() - and then the folio_unlock() there would
> go bang.
>
Well, what I'm really asking is the thing I ask seven times a day:
- what are the end-user visible effects of the bug
- should the fix be backported into earlier kernels
Andrew Morton <[email protected]> wrote:
> Well, what I'm really asking is the thing I ask seven times a day:
>
> - what are the end-user visible effects of the bug
A third party might see an incomplete flush after they've done a sync - which
amounts to temporary file corruption.
> - should the fix be backported into earlier kernels
Yes.
David