When shmem returns AOP_WRITEPAGE_ACTIVATE, the inode pages cannot be
synced in the near future. So write_cache_pages shall stop writting this
inode, and shmem shall increase pages_skipped to instruct VFS not to
busy retry.
CC: Hugh Dickins <[email protected]>
Signed-off-by: Wu Fengguang <[email protected]>
---
---
mm/page-writeback.c | 23 +++++++++++------------
mm/shmem.c | 1 +
2 files changed, 12 insertions(+), 12 deletions(-)
--- linux.orig/mm/page-writeback.c 2009-10-06 23:39:28.000000000 +0800
+++ linux/mm/page-writeback.c 2009-10-06 23:39:29.000000000 +0800
@@ -851,19 +851,18 @@ continue_unlock:
if (ret == AOP_WRITEPAGE_ACTIVATE) {
unlock_page(page);
ret = 0;
- } else {
- /*
- * done_index is set past this page,
- * so media errors will not choke
- * background writeout for the entire
- * file. This has consequences for
- * range_cyclic semantics (ie. it may
- * not be suitable for data integrity
- * writeout).
- */
- done = 1;
- break;
}
+ /*
+ * done_index is set past this page,
+ * so media errors will not choke
+ * background writeout for the entire
+ * file. This has consequences for
+ * range_cyclic semantics (ie. it may
+ * not be suitable for data integrity
+ * writeout).
+ */
+ done = 1;
+ break;
}
if (nr_to_write > 0) {
--- linux.orig/mm/shmem.c 2009-10-06 23:37:40.000000000 +0800
+++ linux/mm/shmem.c 2009-10-06 23:39:29.000000000 +0800
@@ -1103,6 +1103,7 @@ unlock:
*/
swapcache_free(swap, NULL);
redirty:
+ wbc->pages_skipped++;
set_page_dirty(page);
if (wbc->for_reclaim)
return AOP_WRITEPAGE_ACTIVATE; /* Return with page locked */
On Wed, 7 Oct 2009, Wu Fengguang wrote:
> When shmem returns AOP_WRITEPAGE_ACTIVATE, the inode pages cannot be
> synced in the near future. So write_cache_pages shall stop writting this
> inode, and shmem shall increase pages_skipped to instruct VFS not to
> busy retry.
>
> CC: Hugh Dickins <[email protected]>
> Signed-off-by: Wu Fengguang <[email protected]>
Okay, it embarrasses me to see AOP_WRITEPAGE_ACTIVATE (and its horrid
"in this one case the page is still locked" semantic) still around -
my patch to remove it vanished from mmotm (probably caused a temporary
conflict) and I've never chased it up (partly out of guilt that I'd not
yet kept my promise to contact the openAFS people about their use of it).
But that's orthogonal to your concern here: for so long as there has
been a wbc->pages_skipped, I guess shmem_writepage() should have been
incrementing it there - thanks. But I don't believe the VFS will ever
have any interest in pages_skipped from shmem_writepage(): do you have
evidence that it does? If so, I need to investigate.
And the accompanying change to write_cache_pages() seems irrelevant
and misguided. Irrelevant because write_cache_pages() should never be
dealing with shmem_writepage() (its bdi should keep it well away), and
should never be dealing with reclaim, which is the only case in which
shmem_writepage() returns AOP_WRITEPAGE_ACTIVATE - or have your other
changes, or the bdi work, changed that?
And misguided because in your change to write_cache_pages() you're
taking AOP_WRITEPAGE_ACTIVATE to say that it should now give up, not
process more pages. We just don't know that. All it means is that
this one page couldn't be written and should be reactivated (if it
were under reclaim): it might be the case that every other page tried
after would get treated in the same way, or it might be the case that
the next page would get written successfully. That info is just not
provided.
Hugh
> ---
> mm/page-writeback.c | 23 +++++++++++------------
> mm/shmem.c | 1 +
> 2 files changed, 12 insertions(+), 12 deletions(-)
>
> --- linux.orig/mm/page-writeback.c 2009-10-06 23:39:28.000000000 +0800
> +++ linux/mm/page-writeback.c 2009-10-06 23:39:29.000000000 +0800
> @@ -851,19 +851,18 @@ continue_unlock:
> if (ret == AOP_WRITEPAGE_ACTIVATE) {
> unlock_page(page);
> ret = 0;
> - } else {
> - /*
> - * done_index is set past this page,
> - * so media errors will not choke
> - * background writeout for the entire
> - * file. This has consequences for
> - * range_cyclic semantics (ie. it may
> - * not be suitable for data integrity
> - * writeout).
> - */
> - done = 1;
> - break;
> }
> + /*
> + * done_index is set past this page,
> + * so media errors will not choke
> + * background writeout for the entire
> + * file. This has consequences for
> + * range_cyclic semantics (ie. it may
> + * not be suitable for data integrity
> + * writeout).
> + */
> + done = 1;
> + break;
> }
>
> if (nr_to_write > 0) {
> --- linux.orig/mm/shmem.c 2009-10-06 23:37:40.000000000 +0800
> +++ linux/mm/shmem.c 2009-10-06 23:39:29.000000000 +0800
> @@ -1103,6 +1103,7 @@ unlock:
> */
> swapcache_free(swap, NULL);
> redirty:
> + wbc->pages_skipped++;
> set_page_dirty(page);
> if (wbc->for_reclaim)
> return AOP_WRITEPAGE_ACTIVATE; /* Return with page locked */
On Wed, Oct 07, 2009 at 07:57:00PM +0800, Hugh Dickins wrote:
> On Wed, 7 Oct 2009, Wu Fengguang wrote:
>
> > When shmem returns AOP_WRITEPAGE_ACTIVATE, the inode pages cannot be
> > synced in the near future. So write_cache_pages shall stop writting this
> > inode, and shmem shall increase pages_skipped to instruct VFS not to
> > busy retry.
> >
> > CC: Hugh Dickins <[email protected]>
> > Signed-off-by: Wu Fengguang <[email protected]>
>
> Okay, it embarrasses me to see AOP_WRITEPAGE_ACTIVATE (and its horrid
> "in this one case the page is still locked" semantic) still around -
> my patch to remove it vanished from mmotm (probably caused a temporary
> conflict) and I've never chased it up (partly out of guilt that I'd not
> yet kept my promise to contact the openAFS people about their use of it).
Googled this one :)
http://markmail.org/thread/fivi4bgylwsy26ws
In fact we could just return from shmem_writepage() with PG_dirty and
!PG_writeback. Then the page will be put back to LRU with PG_reclaim
cleared. It could well happen in other filesystems who has trouble to
writeback the page for the time (those already do pages_skipped++).
> But that's orthogonal to your concern here: for so long as there has
> been a wbc->pages_skipped, I guess shmem_writepage() should have been
> incrementing it there - thanks. But I don't believe the VFS will ever
> have any interest in pages_skipped from shmem_writepage(): do you have
> evidence that it does? If so, I need to investigate.
Yes :) Except in this new code.
> And the accompanying change to write_cache_pages() seems irrelevant
> and misguided. Irrelevant because write_cache_pages() should never be
> dealing with shmem_writepage() (its bdi should keep it well away), and
> should never be dealing with reclaim, which is the only case in which
> shmem_writepage() returns AOP_WRITEPAGE_ACTIVATE - or have your other
> changes, or the bdi work, changed that?
That becomes possible with my another patch (30/45). I've added check
to avoid doing lumpy reclaim for shmem. So now everything returns to
normal :)
> And misguided because in your change to write_cache_pages() you're
> taking AOP_WRITEPAGE_ACTIVATE to say that it should now give up, not
> process more pages. We just don't know that. All it means is that
> this one page couldn't be written and should be reactivated (if it
> were under reclaim): it might be the case that every other page tried
> after would get treated in the same way, or it might be the case that
> the next page would get written successfully. That info is just not
> provided.
Yes, it was over-smart indeed. I'll revert that chunk (or
AOP_WRITEPAGE_ACTIVATE) totally.
Thanks,
Fengguang
> > ---
> > mm/page-writeback.c | 23 +++++++++++------------
> > mm/shmem.c | 1 +
> > 2 files changed, 12 insertions(+), 12 deletions(-)
> >
> > --- linux.orig/mm/page-writeback.c 2009-10-06 23:39:28.000000000 +0800
> > +++ linux/mm/page-writeback.c 2009-10-06 23:39:29.000000000 +0800
> > @@ -851,19 +851,18 @@ continue_unlock:
> > if (ret == AOP_WRITEPAGE_ACTIVATE) {
> > unlock_page(page);
> > ret = 0;
> > - } else {
> > - /*
> > - * done_index is set past this page,
> > - * so media errors will not choke
> > - * background writeout for the entire
> > - * file. This has consequences for
> > - * range_cyclic semantics (ie. it may
> > - * not be suitable for data integrity
> > - * writeout).
> > - */
> > - done = 1;
> > - break;
> > }
> > + /*
> > + * done_index is set past this page,
> > + * so media errors will not choke
> > + * background writeout for the entire
> > + * file. This has consequences for
> > + * range_cyclic semantics (ie. it may
> > + * not be suitable for data integrity
> > + * writeout).
> > + */
> > + done = 1;
> > + break;
> > }
> >
> > if (nr_to_write > 0) {
> > --- linux.orig/mm/shmem.c 2009-10-06 23:37:40.000000000 +0800
> > +++ linux/mm/shmem.c 2009-10-06 23:39:29.000000000 +0800
> > @@ -1103,6 +1103,7 @@ unlock:
> > */
> > swapcache_free(swap, NULL);
> > redirty:
> > + wbc->pages_skipped++;
> > set_page_dirty(page);
> > if (wbc->for_reclaim)
> > return AOP_WRITEPAGE_ACTIVATE; /* Return with page locked */