2007-10-08 16:54:54

by Peter Zijlstra

Subject: [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()

It seems that with the recent usage of ->page_mkwrite() a little detail
was overlooked.

.22-rc1 merged OCFS2 usage of this hook
.23-rc1 merged XFS usage
.24-rc1 will most likely merge NFS usage

Please consider this for .23 final and maybe even .22.x

---
Subject: mm: set_page_dirty_balance() vs ->page_mkwrite()

All the current page_mkwrite() implementations also set the page dirty. As a
result, the set_page_dirty_balance() call does _not_ call balance, because it
finds the page already dirty.

This allows us to dirty a _lot_ of pages without ever hitting
balance_dirty_pages(). Not good (tm).

Force a balance call if ->page_mkwrite() was successful.

Signed-off-by: Peter Zijlstra <[email protected]>
---
 include/linux/writeback.h |    2 +-
 mm/memory.c               |    9 +++++++--
 mm/page-writeback.c       |    4 ++--
 3 files changed, 10 insertions(+), 5 deletions(-)

Index: linux-2.6/include/linux/writeback.h
===================================================================
--- linux-2.6.orig/include/linux/writeback.h
+++ linux-2.6/include/linux/writeback.h
@@ -137,7 +137,7 @@ int sync_page_range(struct inode *inode,
 			loff_t pos, loff_t count);
 int sync_page_range_nolock(struct inode *inode, struct address_space *mapping,
 			   loff_t pos, loff_t count);
-void set_page_dirty_balance(struct page *page);
+void set_page_dirty_balance(struct page *page, int page_mkwrite);
 void writeback_set_ratelimit(void);

 /* pdflush.c */
Index: linux-2.6/mm/memory.c
===================================================================
--- linux-2.6.orig/mm/memory.c
+++ linux-2.6/mm/memory.c
@@ -1559,6 +1559,7 @@ static int do_wp_page(struct mm_struct *
 	struct page *old_page, *new_page;
 	pte_t entry;
 	int reuse = 0, ret = 0;
+	int page_mkwrite = 0;
 	struct page *dirty_page = NULL;

 	old_page = vm_normal_page(vma, address, orig_pte);
@@ -1607,6 +1608,8 @@ static int do_wp_page(struct mm_struct *
 			page_cache_release(old_page);
 			if (!pte_same(*page_table, orig_pte))
 				goto unlock;
+
+			page_mkwrite = 1;
 		}
 		dirty_page = old_page;
 		get_page(dirty_page);
@@ -1691,7 +1694,7 @@ unlock:
 		 * do_no_page is protected similarly.
 		 */
 		wait_on_page_locked(dirty_page);
-		set_page_dirty_balance(dirty_page);
+		set_page_dirty_balance(dirty_page, page_mkwrite);
 		put_page(dirty_page);
 	}
 	return ret;
@@ -2238,6 +2241,7 @@ static int __do_fault(struct mm_struct *
 	struct page *dirty_page = NULL;
 	struct vm_fault vmf;
 	int ret;
+	int page_mkwrite = 0;

 	vmf.virtual_address = (void __user *)(address & PAGE_MASK);
 	vmf.pgoff = pgoff;
@@ -2315,6 +2319,7 @@ static int __do_fault(struct mm_struct *
 					anon = 1; /* no anon but release vmf.page */
 					goto out;
 				}
+				page_mkwrite = 1;
 			}
 		}

@@ -2375,7 +2380,7 @@ out_unlocked:
 	if (anon)
 		page_cache_release(vmf.page);
 	else if (dirty_page) {
-		set_page_dirty_balance(dirty_page);
+		set_page_dirty_balance(dirty_page, page_mkwrite);
 		put_page(dirty_page);
 	}

Index: linux-2.6/mm/page-writeback.c
===================================================================
--- linux-2.6.orig/mm/page-writeback.c
+++ linux-2.6/mm/page-writeback.c
@@ -460,9 +460,9 @@ static void balance_dirty_pages(struct a
 		pdflush_operation(background_writeout, 0);
 }

-void set_page_dirty_balance(struct page *page)
+void set_page_dirty_balance(struct page *page, int page_mkwrite)
 {
-	if (set_page_dirty(page)) {
+	if (set_page_dirty(page) || page_mkwrite) {
 		struct address_space *mapping = page_mapping(page);

 		if (mapping)
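
The effect described in the changelog can be modelled in userspace. The
following is only a sketch: struct model_page, balance_calls and the model_*
functions are hypothetical stand-ins, not the kernel's definitions. The one
property carried over from the patch is that set_page_dirty() reports a
clean-to-dirty transition, so a page already dirtied by ->page_mkwrite()
skips the balance call unless the new page_mkwrite argument forces it.

```c
/* Userspace model only; none of these are the kernel's real definitions. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_page {
	bool dirty;
};

/* Counts how often the modelled throttling path was entered. */
static int balance_calls;

/* Like set_page_dirty(): returns 1 only on a clean -> dirty transition. */
static int model_set_page_dirty(struct model_page *page)
{
	assert(page != NULL);
	if (page->dirty)
		return 0;
	page->dirty = true;
	return 1;
}

/* Stand-in for a ->page_mkwrite() implementation that dirties the page. */
static int model_page_mkwrite(struct model_page *page)
{
	model_set_page_dirty(page);
	return 0;	/* success */
}

/* Before the patch: balance only when this call dirtied the page. */
static void model_dirty_balance_old(struct model_page *page)
{
	if (model_set_page_dirty(page))
		balance_calls++;
}

/* After the patch: also balance when ->page_mkwrite() succeeded. */
static void model_dirty_balance_new(struct model_page *page, int page_mkwrite)
{
	if (model_set_page_dirty(page) || page_mkwrite)
		balance_calls++;
}
```

Calling model_page_mkwrite() and then model_dirty_balance_old() on a clean
page leaves balance_calls untouched, which is the problem case; the same
sequence with model_dirty_balance_new(page, 1) increments it.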



2007-10-08 23:12:18

by Nick Piggin

Subject: Re: [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()

On Tuesday 09 October 2007 02:54, Peter Zijlstra wrote:
> It seems that with the recent usage of ->page_mkwrite() a little detail
> was overlooked.
>
> .22-rc1 merged OCFS2 usage of this hook
> .23-rc1 merged XFS usage
> .24-rc1 will most likely merge NFS usage
>
> Please consider this for .23 final and maybe even .22.x
>
> ---
> Subject: mm: set_page_dirty_balance() vs ->page_mkwrite()
>
> All the current page_mkwrite() implementations also set the page dirty.
> Which results in the set_page_dirty_balance() call to _not_ call balance,
> because the page is already found dirty.
>
> This allows us to dirty a _lot_ of pages without ever hitting
> balance_dirty_pages(). Not good (tm).
>
> Force a balance call if ->page_mkwrite() was successful.

Would it be better to just have the callers set_page_dirty_balance()?


> Signed-off-by: Peter Zijlstra <[email protected]>
> ---
> include/linux/writeback.h | 2 +-
> mm/memory.c | 9 +++++++--
> mm/page-writeback.c | 4 ++--
> 3 files changed, 10 insertions(+), 5 deletions(-)
>
> Index: linux-2.6/include/linux/writeback.h
> ===================================================================
> --- linux-2.6.orig/include/linux/writeback.h
> +++ linux-2.6/include/linux/writeback.h
> @@ -137,7 +137,7 @@ int sync_page_range(struct inode *inode,
> loff_t pos, loff_t count);
> int sync_page_range_nolock(struct inode *inode, struct address_space
> *mapping, loff_t pos, loff_t count);
> -void set_page_dirty_balance(struct page *page);
> +void set_page_dirty_balance(struct page *page, int page_mkwrite);
> void writeback_set_ratelimit(void);
>
> /* pdflush.c */
> Index: linux-2.6/mm/memory.c
> ===================================================================
> --- linux-2.6.orig/mm/memory.c
> +++ linux-2.6/mm/memory.c
> @@ -1559,6 +1559,7 @@ static int do_wp_page(struct mm_struct *
> struct page *old_page, *new_page;
> pte_t entry;
> int reuse = 0, ret = 0;
> + int page_mkwrite = 0;
> struct page *dirty_page = NULL;
>
> old_page = vm_normal_page(vma, address, orig_pte);
> @@ -1607,6 +1608,8 @@ static int do_wp_page(struct mm_struct *
> page_cache_release(old_page);
> if (!pte_same(*page_table, orig_pte))
> goto unlock;
> +
> + page_mkwrite = 1;
> }
> dirty_page = old_page;
> get_page(dirty_page);
> @@ -1691,7 +1694,7 @@ unlock:
> * do_no_page is protected similarly.
> */
> wait_on_page_locked(dirty_page);
> - set_page_dirty_balance(dirty_page);
> + set_page_dirty_balance(dirty_page, page_mkwrite);
> put_page(dirty_page);
> }
> return ret;
> @@ -2238,6 +2241,7 @@ static int __do_fault(struct mm_struct *
> struct page *dirty_page = NULL;
> struct vm_fault vmf;
> int ret;
> + int page_mkwrite = 0;
>
> vmf.virtual_address = (void __user *)(address & PAGE_MASK);
> vmf.pgoff = pgoff;
> @@ -2315,6 +2319,7 @@ static int __do_fault(struct mm_struct *
> anon = 1; /* no anon but release vmf.page */
> goto out;
> }
> + page_mkwrite = 1;
> }
> }
>
> @@ -2375,7 +2380,7 @@ out_unlocked:
> if (anon)
> page_cache_release(vmf.page);
> else if (dirty_page) {
> - set_page_dirty_balance(dirty_page);
> + set_page_dirty_balance(dirty_page, page_mkwrite);
> put_page(dirty_page);
> }
>
> Index: linux-2.6/mm/page-writeback.c
> ===================================================================
> --- linux-2.6.orig/mm/page-writeback.c
> +++ linux-2.6/mm/page-writeback.c
> @@ -460,9 +460,9 @@ static void balance_dirty_pages(struct a
> pdflush_operation(background_writeout, 0);
> }
>
> -void set_page_dirty_balance(struct page *page)
> +void set_page_dirty_balance(struct page *page, int page_mkwrite)
> {
> - if (set_page_dirty(page)) {
> + if (set_page_dirty(page) || page_mkwrite) {
> struct address_space *mapping = page_mapping(page);
>
> if (mapping)

2007-10-08 23:37:31

by David Chinner

Subject: Re: [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()

On Mon, Oct 08, 2007 at 04:37:00PM +1000, Nick Piggin wrote:
> On Tuesday 09 October 2007 02:54, Peter Zijlstra wrote:
> > It seems that with the recent usage of ->page_mkwrite() a little detail
> > was overlooked.
> >
> > .22-rc1 merged OCFS2 usage of this hook
> > .23-rc1 merged XFS usage
> > .24-rc1 will most likely merge NFS usage
> >
> > Please consider this for .23 final and maybe even .22.x
> >
> > ---
> > Subject: mm: set_page_dirty_balance() vs ->page_mkwrite()
> >
> > All the current page_mkwrite() implementations also set the page dirty.
> > Which results in the set_page_dirty_balance() call to _not_ call balance,
> > because the page is already found dirty.
> >
> > This allows us to dirty a _lot_ of pages without ever hitting
> > balance_dirty_pages(). Not good (tm).
> >
> > Force a balance call if ->page_mkwrite() was successful.
>
> Would it be better to just have the callers set_page_dirty_balance()?

block_page_mkwrite() is just using generic interfaces to do this,
same as pretty much any write() system call. The idea was to make it
as similar to the write() call path as possible...

However, unlike generic_file_buffered_write(), we are not calling
balance_dirty_pages_ratelimited(mapping) between
->prepare/commit_write call pairs. Perhaps this should be added to
block_page_mkwrite() after the page is unlocked....

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
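
A rough userspace model of the alternative suggested above, throttling from
inside the mkwrite helper once the page has been unlocked, could look like
the following. Everything here is an assumption made for illustration: the
model_* names, the per-mapping counter and the MODEL_RATELIMIT_PAGES value
are hypothetical, and the real balance_dirty_pages_ratelimited() keeps its
own accounting rather than a counter in the mapping.

```c
/* Userspace model only; names, types and the ratelimit are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_RATELIMIT_PAGES 8	/* arbitrary value for the sketch */

struct model_mapping {
	int pages_since_balance;	/* dirtyings since the last balance */
	int balance_calls;		/* times the throttling path ran */
};

struct model_page {
	struct model_mapping *mapping;
	bool locked;
	bool dirty;
};

/* Enter the (modelled) balance path once every MODEL_RATELIMIT_PAGES calls. */
static void model_balance_dirty_pages_ratelimited(struct model_mapping *mapping)
{
	if (++mapping->pages_since_balance < MODEL_RATELIMIT_PAGES)
		return;
	mapping->pages_since_balance = 0;
	mapping->balance_calls++;
}

/* Stand-in for a mkwrite helper that dirties the page under the page lock. */
static int model_block_page_mkwrite(struct model_page *page)
{
	assert(page != NULL && page->mapping != NULL);

	page->locked = true;
	page->dirty = true;	/* stands in for the prepare/commit_write pair */
	page->locked = false;

	/* Throttle only after the page is unlocked, as suggested above. */
	model_balance_dirty_pages_ratelimited(page->mapping);
	return 0;
}
```

With this shape every successful mkwrite is accounted for, regardless of
whether a later set_page_dirty() finds the page already dirty.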

2007-10-09 00:19:43

by Nick Piggin

Subject: Re: [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()

On Tuesday 09 October 2007 09:36, David Chinner wrote:
> On Mon, Oct 08, 2007 at 04:37:00PM +1000, Nick Piggin wrote:
> > On Tuesday 09 October 2007 02:54, Peter Zijlstra wrote:

> > > Force a balance call if ->page_mkwrite() was successful.
> >
> > Would it be better to just have the callers set_page_dirty_balance()?
>
> block_page_mkwrite() is just using generic interfaces to do this,
> same as pretty much any write() system call. The idea was to make it
> as similar to the write() call path as possible...
>
> However, unlike generic_file_buffered_write(), we are not calling
> balance_dirty_pages_ratelimited(mapping) between
> ->prepare/commit_write call pairs. Perhaps this should be added to
> block_page_mkwrite() after the page is unlocked....

That sounds pretty sane, in terms of matching with
generic_file_buffered_write.

2007-10-09 02:15:23

by Mark Fasheh

Subject: Re: [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()

On Mon, Oct 08, 2007 at 05:47:52PM +1000, Nick Piggin wrote:
> > block_page_mkwrite() is just using generic interfaces to do this,
> > same as pretty much any write() system call. The idea was to make it
> > as similar to the write() call path as possible...
> >
> > However, unlike generic_file_buffered_write(), we are not calling
> > balance_dirty_pages_ratelimited(mapping) between
> > ->prepare/commit_write call pairs. Perhaps this should be added to
> > block_page_mkwrite() after the page is unlocked....
>
> That sounds pretty sane, in terms of matching with
> generic_file_buffered_write.

I agree. We could also insert a call to balance_dirty_pages_ratelimited() in
__ocfs2_page_mkwrite.
--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
[email protected]

2007-10-09 07:35:50

by Nick Piggin

Subject: Re: [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()

On Tuesday 09 October 2007 12:12, Mark Fasheh wrote:
> On Mon, Oct 08, 2007 at 05:47:52PM +1000, Nick Piggin wrote:
> > > block_page_mkwrite() is just using generic interfaces to do this,
> > > same as pretty much any write() system call. The idea was to make it
> > > as similar to the write() call path as possible...
> > >
> > > However, unlike generic_file_buffered_write(), we are not calling
> > > balance_dirty_pages_ratelimited(mapping) between
> > > ->prepare/commit_write call pairs. Perhaps this should be added to
> > > block_page_mkwrite() after the page is unlocked....
> >
> > That sounds pretty sane, in terms of matching with
> > generic_file_buffered_write.
>
> I agree. We could also insert a call to balance_dirty_pages_ratelimited()
> in __ocfs2_page_mkwrite.

Hmm, Peter's patch got merged -- I suppose that's fine for 2.6.23 though...