2009-09-09 15:08:04

by Fengguang Wu

[permalink] [raw]
Subject: [RFC][PATCH 7/7] writeback: balance_dirty_pages() shall write more than dirtied pages

Some filesystem may choose to write much more than ratelimit_pages
before calling balance_dirty_pages_ratelimited_nr(). So it is safer to
determine number to write based on real number of dirtied pages.

The increased write_chunk may make the dirtier more bumpy. This is
filesystem writers' duty not to dirty too much at a time without
checking the ratelimit.

Signed-off-by: Wu Fengguang <[email protected]>
---
mm/page-writeback.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)

--- linux.orig/mm/page-writeback.c 2009-09-09 21:19:21.000000000 +0800
+++ linux/mm/page-writeback.c 2009-09-09 21:25:45.000000000 +0800
@@ -44,12 +44,12 @@ static long ratelimit_pages = 32;
/*
* When balance_dirty_pages decides that the caller needs to perform some
* non-background writeback, this is how many pages it will attempt to write.
- * It should be somewhat larger than RATELIMIT_PAGES to ensure that reasonably
+ * It should be somewhat larger than dirtied pages to ensure that reasonably
* large amounts of I/O are submitted.
*/
-static inline long sync_writeback_pages(void)
+static inline long sync_writeback_pages(unsigned long dirtied)
{
- return ratelimit_pages + ratelimit_pages / 2;
+ return dirtied + dirtied / 2;
}

/* The following parameters are exported via /proc/sys/vm */
@@ -476,7 +476,8 @@ get_dirty_limits(unsigned long *pbackgro
* If we're over `background_thresh' then pdflush is woken to perform some
* writeout.
*/
-static void balance_dirty_pages(struct address_space *mapping)
+static void balance_dirty_pages(struct address_space *mapping,
+ unsigned long write_chunk)
{
long nr_reclaimable, bdi_nr_reclaimable;
long nr_writeback, bdi_nr_writeback;
@@ -484,7 +485,6 @@ static void balance_dirty_pages(struct a
unsigned long dirty_thresh;
unsigned long bdi_thresh;
unsigned long pages_written = 0;
- unsigned long write_chunk = sync_writeback_pages();

struct backing_dev_info *bdi = mapping->backing_dev_info;

@@ -638,9 +638,10 @@ void balance_dirty_pages_ratelimited_nr(
p = &__get_cpu_var(bdp_ratelimits);
*p += nr_pages_dirtied;
if (unlikely(*p >= ratelimit)) {
+ ratelimit = sync_writeback_pages(*p);
*p = 0;
preempt_enable();
- balance_dirty_pages(mapping);
+ balance_dirty_pages(mapping, ratelimit);
return;
}
preempt_enable();

--


2009-09-09 15:44:19

by Jan Kara

[permalink] [raw]
Subject: Re: [RFC][PATCH 7/7] writeback: balance_dirty_pages() shall write more than dirtied pages

On Wed 09-09-09 22:51:48, Wu Fengguang wrote:
> Some filesystem may choose to write much more than ratelimit_pages
> before calling balance_dirty_pages_ratelimited_nr(). So it is safer to
> determine number to write based on real number of dirtied pages.
>
> The increased write_chunk may make the dirtier more bumpy. This is
> filesystem writers' duty not to dirty too much at a time without
> checking the ratelimit.
I don't get this. balance_dirty_pages_ratelimited_nr() is called when we
dirty the page, not when we write it out. So a problem would only happen if
filesystem dirties pages by set_page_dirty() and won't call
balance_dirty_pages_ratelimited_nr(). But e.g. generic_perform_write()
and do_wp_page() takes care of that. So where's the problem?

Honza
> ---
> mm/page-writeback.c | 13 +++++++------
> 1 file changed, 7 insertions(+), 6 deletions(-)
>
> --- linux.orig/mm/page-writeback.c 2009-09-09 21:19:21.000000000 +0800
> +++ linux/mm/page-writeback.c 2009-09-09 21:25:45.000000000 +0800
> @@ -44,12 +44,12 @@ static long ratelimit_pages = 32;
> /*
> * When balance_dirty_pages decides that the caller needs to perform some
> * non-background writeback, this is how many pages it will attempt to write.
> - * It should be somewhat larger than RATELIMIT_PAGES to ensure that reasonably
> + * It should be somewhat larger than dirtied pages to ensure that reasonably
> * large amounts of I/O are submitted.
> */
> -static inline long sync_writeback_pages(void)
> +static inline long sync_writeback_pages(unsigned long dirtied)
> {
> - return ratelimit_pages + ratelimit_pages / 2;
> + return dirtied + dirtied / 2;
> }
>
> /* The following parameters are exported via /proc/sys/vm */
> @@ -476,7 +476,8 @@ get_dirty_limits(unsigned long *pbackgro
> * If we're over `background_thresh' then pdflush is woken to perform some
> * writeout.
> */
> -static void balance_dirty_pages(struct address_space *mapping)
> +static void balance_dirty_pages(struct address_space *mapping,
> + unsigned long write_chunk)
> {
> long nr_reclaimable, bdi_nr_reclaimable;
> long nr_writeback, bdi_nr_writeback;
> @@ -484,7 +485,6 @@ static void balance_dirty_pages(struct a
> unsigned long dirty_thresh;
> unsigned long bdi_thresh;
> unsigned long pages_written = 0;
> - unsigned long write_chunk = sync_writeback_pages();
>
> struct backing_dev_info *bdi = mapping->backing_dev_info;
>
> @@ -638,9 +638,10 @@ void balance_dirty_pages_ratelimited_nr(
> p = &__get_cpu_var(bdp_ratelimits);
> *p += nr_pages_dirtied;
> if (unlikely(*p >= ratelimit)) {
> + ratelimit = sync_writeback_pages(*p);
> *p = 0;
> preempt_enable();
> - balance_dirty_pages(mapping);
> + balance_dirty_pages(mapping, ratelimit);
> return;
> }
> preempt_enable();
>
> --
>
--
Jan Kara <[email protected]>
SUSE Labs, CR

2009-09-10 01:42:11

by Fengguang Wu

[permalink] [raw]
Subject: Re: [RFC][PATCH 7/7] writeback: balance_dirty_pages() shall write more than dirtied pages

On Wed, Sep 09, 2009 at 11:44:13PM +0800, Jan Kara wrote:
> On Wed 09-09-09 22:51:48, Wu Fengguang wrote:
> > Some filesystem may choose to write much more than ratelimit_pages
> > before calling balance_dirty_pages_ratelimited_nr(). So it is safer to
> > determine number to write based on real number of dirtied pages.
> >
> > The increased write_chunk may make the dirtier more bumpy. This is
> > filesystem writers' duty not to dirty too much at a time without
> > checking the ratelimit.
> I don't get this. balance_dirty_pages_ratelimited_nr() is called when we
> dirty the page, not when we write it out. So a problem would only happen if
> filesystem dirties pages by set_page_dirty() and won't call
> balance_dirty_pages_ratelimited_nr(). But e.g. generic_perform_write()
> and do_wp_page() takes care of that. So where's the problem?

It seems that btrfs_file_write() is writing in chunks of up to 1024-pages
(1024 is the computed nrptrs value in a 32bit kernel). And it calls
balance_dirty_pages_ratelimited_nr() each time it dirtied such a chunk.

Thanks,
Fengguang

> > ---
> > mm/page-writeback.c | 13 +++++++------
> > 1 file changed, 7 insertions(+), 6 deletions(-)
> >
> > --- linux.orig/mm/page-writeback.c 2009-09-09 21:19:21.000000000 +0800
> > +++ linux/mm/page-writeback.c 2009-09-09 21:25:45.000000000 +0800
> > @@ -44,12 +44,12 @@ static long ratelimit_pages = 32;
> > /*
> > * When balance_dirty_pages decides that the caller needs to perform some
> > * non-background writeback, this is how many pages it will attempt to write.
> > - * It should be somewhat larger than RATELIMIT_PAGES to ensure that reasonably
> > + * It should be somewhat larger than dirtied pages to ensure that reasonably
> > * large amounts of I/O are submitted.
> > */
> > -static inline long sync_writeback_pages(void)
> > +static inline long sync_writeback_pages(unsigned long dirtied)
> > {
> > - return ratelimit_pages + ratelimit_pages / 2;
> > + return dirtied + dirtied / 2;
> > }
> >
> > /* The following parameters are exported via /proc/sys/vm */
> > @@ -476,7 +476,8 @@ get_dirty_limits(unsigned long *pbackgro
> > * If we're over `background_thresh' then pdflush is woken to perform some
> > * writeout.
> > */
> > -static void balance_dirty_pages(struct address_space *mapping)
> > +static void balance_dirty_pages(struct address_space *mapping,
> > + unsigned long write_chunk)
> > {
> > long nr_reclaimable, bdi_nr_reclaimable;
> > long nr_writeback, bdi_nr_writeback;
> > @@ -484,7 +485,6 @@ static void balance_dirty_pages(struct a
> > unsigned long dirty_thresh;
> > unsigned long bdi_thresh;
> > unsigned long pages_written = 0;
> > - unsigned long write_chunk = sync_writeback_pages();
> >
> > struct backing_dev_info *bdi = mapping->backing_dev_info;
> >
> > @@ -638,9 +638,10 @@ void balance_dirty_pages_ratelimited_nr(
> > p = &__get_cpu_var(bdp_ratelimits);
> > *p += nr_pages_dirtied;
> > if (unlikely(*p >= ratelimit)) {
> > + ratelimit = sync_writeback_pages(*p);
> > *p = 0;
> > preempt_enable();
> > - balance_dirty_pages(mapping);
> > + balance_dirty_pages(mapping, ratelimit);
> > return;
> > }
> > preempt_enable();
> >
> > --
> >
> --
> Jan Kara <[email protected]>
> SUSE Labs, CR

2009-09-10 12:58:14

by Chris Mason

[permalink] [raw]
Subject: Re: [RFC][PATCH 7/7] writeback: balance_dirty_pages() shall write more than dirtied pages

On Thu, Sep 10, 2009 at 09:42:01AM +0800, Wu Fengguang wrote:
> On Wed, Sep 09, 2009 at 11:44:13PM +0800, Jan Kara wrote:
> > On Wed 09-09-09 22:51:48, Wu Fengguang wrote:
> > > Some filesystem may choose to write much more than ratelimit_pages
> > > before calling balance_dirty_pages_ratelimited_nr(). So it is safer to
> > > determine number to write based on real number of dirtied pages.
> > >
> > > The increased write_chunk may make the dirtier more bumpy. This is
> > > filesystem writers' duty not to dirty too much at a time without
> > > checking the ratelimit.
> > I don't get this. balance_dirty_pages_ratelimited_nr() is called when we
> > dirty the page, not when we write it out. So a problem would only happen if
> > filesystem dirties pages by set_page_dirty() and won't call
> > balance_dirty_pages_ratelimited_nr(). But e.g. generic_perform_write()
> > and do_wp_page() takes care of that. So where's the problem?
>
> It seems that btrfs_file_write() is writing in chunks of up to 1024-pages
> (1024 is the computed nrptrs value in a 32bit kernel). And it calls
> balance_dirty_pages_ratelimited_nr() each time it dirtied such a chunk.

I can easily change this to call more often, but we do always call
balance_dirty_pages to reflect how much ram we've really sent down.

-chris

2009-09-10 13:22:09

by Fengguang Wu

[permalink] [raw]
Subject: Re: [RFC][PATCH 7/7] writeback: balance_dirty_pages() shall write more than dirtied pages

On Thu, Sep 10, 2009 at 08:57:42PM +0800, Chris Mason wrote:
> On Thu, Sep 10, 2009 at 09:42:01AM +0800, Wu Fengguang wrote:
> > On Wed, Sep 09, 2009 at 11:44:13PM +0800, Jan Kara wrote:
> > > On Wed 09-09-09 22:51:48, Wu Fengguang wrote:
> > > > Some filesystem may choose to write much more than ratelimit_pages
> > > > before calling balance_dirty_pages_ratelimited_nr(). So it is safer to
> > > > determine number to write based on real number of dirtied pages.
> > > >
> > > > The increased write_chunk may make the dirtier more bumpy. This is
> > > > filesystem writers' duty not to dirty too much at a time without
> > > > checking the ratelimit.
> > > I don't get this. balance_dirty_pages_ratelimited_nr() is called when we
> > > dirty the page, not when we write it out. So a problem would only happen if
> > > filesystem dirties pages by set_page_dirty() and won't call
> > > balance_dirty_pages_ratelimited_nr(). But e.g. generic_perform_write()
> > > and do_wp_page() takes care of that. So where's the problem?
> >
> > It seems that btrfs_file_write() is writing in chunks of up to 1024-pages
> > (1024 is the computed nrptrs value in a 32bit kernel). And it calls
> > balance_dirty_pages_ratelimited_nr() each time it dirtied such a chunk.
>
> I can easily change this to call more often, but we do always call
> balance_dirty_pages to reflect how much ram we've really sent down.

Btrfs is doing OK. 2MB/4MB looks like reasonable chunk sizes. The
need-change part is balance_dirty_pages_ratelimited_nr(), hence this
patch :)

Thanks,
Fengguang

2009-09-10 14:56:24

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC][PATCH 7/7] writeback: balance_dirty_pages() shall write more than dirtied pages

On Thu, 2009-09-10 at 21:21 +0800, Wu Fengguang wrote:
> On Thu, Sep 10, 2009 at 08:57:42PM +0800, Chris Mason wrote:
> > On Thu, Sep 10, 2009 at 09:42:01AM +0800, Wu Fengguang wrote:
> > > On Wed, Sep 09, 2009 at 11:44:13PM +0800, Jan Kara wrote:
> > > > On Wed 09-09-09 22:51:48, Wu Fengguang wrote:
> > > > > Some filesystem may choose to write much more than ratelimit_pages
> > > > > before calling balance_dirty_pages_ratelimited_nr(). So it is safer to
> > > > > determine number to write based on real number of dirtied pages.
> > > > >
> > > > > The increased write_chunk may make the dirtier more bumpy. This is
> > > > > filesystem writers' duty not to dirty too much at a time without
> > > > > checking the ratelimit.
> > > > I don't get this. balance_dirty_pages_ratelimited_nr() is called when we
> > > > dirty the page, not when we write it out. So a problem would only happen if
> > > > filesystem dirties pages by set_page_dirty() and won't call
> > > > balance_dirty_pages_ratelimited_nr(). But e.g. generic_perform_write()
> > > > and do_wp_page() takes care of that. So where's the problem?
> > >
> > > It seems that btrfs_file_write() is writing in chunks of up to 1024-pages
> > > (1024 is the computed nrptrs value in a 32bit kernel). And it calls
> > > balance_dirty_pages_ratelimited_nr() each time it dirtied such a chunk.
> >
> > I can easily change this to call more often, but we do always call
> > balance_dirty_pages to reflect how much ram we've really sent down.
>
> Btrfs is doing OK. 2MB/4MB looks like reasonable chunk sizes. The
> need-change part is balance_dirty_pages_ratelimited_nr(), hence this
> patch :)

I'm not getting it, it calls set_page_dirty() for each page, right? and
then it calls into balance_dirty_pages_ratelimited_nr(), that sounds
right. What is the problem with that?

2009-09-10 15:15:10

by Fengguang Wu

[permalink] [raw]
Subject: Re: [RFC][PATCH 7/7] writeback: balance_dirty_pages() shall write more than dirtied pages

On Thu, Sep 10, 2009 at 10:56:04PM +0800, Peter Zijlstra wrote:
> On Thu, 2009-09-10 at 21:21 +0800, Wu Fengguang wrote:
> > On Thu, Sep 10, 2009 at 08:57:42PM +0800, Chris Mason wrote:
> > > On Thu, Sep 10, 2009 at 09:42:01AM +0800, Wu Fengguang wrote:
> > > > On Wed, Sep 09, 2009 at 11:44:13PM +0800, Jan Kara wrote:
> > > > > On Wed 09-09-09 22:51:48, Wu Fengguang wrote:
> > > > > > Some filesystem may choose to write much more than ratelimit_pages
> > > > > > before calling balance_dirty_pages_ratelimited_nr(). So it is safer to
> > > > > > determine number to write based on real number of dirtied pages.
> > > > > >
> > > > > > The increased write_chunk may make the dirtier more bumpy. This is
> > > > > > filesystem writers' duty not to dirty too much at a time without
> > > > > > checking the ratelimit.
> > > > > I don't get this. balance_dirty_pages_ratelimited_nr() is called when we
> > > > > dirty the page, not when we write it out. So a problem would only happen if
> > > > > filesystem dirties pages by set_page_dirty() and won't call
> > > > > balance_dirty_pages_ratelimited_nr(). But e.g. generic_perform_write()
> > > > > and do_wp_page() takes care of that. So where's the problem?
> > > >
> > > > It seems that btrfs_file_write() is writing in chunks of up to 1024-pages
> > > > (1024 is the computed nrptrs value in a 32bit kernel). And it calls
> > > > balance_dirty_pages_ratelimited_nr() each time it dirtied such a chunk.
> > >
> > > I can easily change this to call more often, but we do always call
> > > balance_dirty_pages to reflect how much ram we've really sent down.
> >
> > Btrfs is doing OK. 2MB/4MB looks like reasonable chunk sizes. The
> > need-change part is balance_dirty_pages_ratelimited_nr(), hence this
> > patch :)
>
> I'm not getting it, it calls set_page_dirty() for each page, right? and
> then it calls into balance_dirty_pages_ratelimited_nr(), that sounds
> right. What is the problem with that?

It looks like btrfs_file_write() eventually calls
__set_page_dirty_buffers() which in turn won't call
balance_dirty_pages*(). This is why do_wp_page() calls
set_page_dirty_balance() to do balance_dirty_pages*().

So btrfs_file_write() explicitly calls
balance_dirty_pages_ratelimited_nr() to get throttled.

Thanks,
Fengguang

2009-09-10 15:31:59

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC][PATCH 7/7] writeback: balance_dirty_pages() shall write more than dirtied pages

On Thu, 2009-09-10 at 23:14 +0800, Wu Fengguang wrote:
> On Thu, Sep 10, 2009 at 10:56:04PM +0800, Peter Zijlstra wrote:
> > On Thu, 2009-09-10 at 21:21 +0800, Wu Fengguang wrote:
> > > On Thu, Sep 10, 2009 at 08:57:42PM +0800, Chris Mason wrote:
> > > > On Thu, Sep 10, 2009 at 09:42:01AM +0800, Wu Fengguang wrote:
> > > > > On Wed, Sep 09, 2009 at 11:44:13PM +0800, Jan Kara wrote:
> > > > > > On Wed 09-09-09 22:51:48, Wu Fengguang wrote:
> > > > > > > Some filesystem may choose to write much more than ratelimit_pages
> > > > > > > before calling balance_dirty_pages_ratelimited_nr(). So it is safer to
> > > > > > > determine number to write based on real number of dirtied pages.
> > > > > > >
> > > > > > > The increased write_chunk may make the dirtier more bumpy. This is
> > > > > > > filesystem writers' duty not to dirty too much at a time without
> > > > > > > checking the ratelimit.
> > > > > > I don't get this. balance_dirty_pages_ratelimited_nr() is called when we
> > > > > > dirty the page, not when we write it out. So a problem would only happen if
> > > > > > filesystem dirties pages by set_page_dirty() and won't call
> > > > > > balance_dirty_pages_ratelimited_nr(). But e.g. generic_perform_write()
> > > > > > and do_wp_page() takes care of that. So where's the problem?
> > > > >
> > > > > It seems that btrfs_file_write() is writing in chunks of up to 1024-pages
> > > > > (1024 is the computed nrptrs value in a 32bit kernel). And it calls
> > > > > balance_dirty_pages_ratelimited_nr() each time it dirtied such a chunk.
> > > >
> > > > I can easily change this to call more often, but we do always call
> > > > balance_dirty_pages to reflect how much ram we've really sent down.
> > >
> > > Btrfs is doing OK. 2MB/4MB looks like reasonable chunk sizes. The
> > > need-change part is balance_dirty_pages_ratelimited_nr(), hence this
> > > patch :)
> >
> > I'm not getting it, it calls set_page_dirty() for each page, right? and
> > then it calls into balance_dirty_pages_ratelimited_nr(), that sounds
> > right. What is the problem with that?
>
> It looks like btrfs_file_write() eventually calls
> __set_page_dirty_buffers() which in turn won't call
> balance_dirty_pages*(). This is why do_wp_page() calls
> set_page_dirty_balance() to do balance_dirty_pages*().
>
> So btrfs_file_write() explicitly calls
> balance_dirty_pages_ratelimited_nr() to get throttled.

Right, so what is wrong with than, and how does this patch fix that?

[ the only thing you have to be careful with is that you don't
excessively grow the error bound on the dirty limit ]

2009-09-10 15:41:27

by Fengguang Wu

[permalink] [raw]
Subject: Re: [RFC][PATCH 7/7] writeback: balance_dirty_pages() shall write more than dirtied pages

On Thu, Sep 10, 2009 at 11:31:38PM +0800, Peter Zijlstra wrote:
> On Thu, 2009-09-10 at 23:14 +0800, Wu Fengguang wrote:
> > On Thu, Sep 10, 2009 at 10:56:04PM +0800, Peter Zijlstra wrote:
> > > On Thu, 2009-09-10 at 21:21 +0800, Wu Fengguang wrote:
> > > > On Thu, Sep 10, 2009 at 08:57:42PM +0800, Chris Mason wrote:
> > > > > On Thu, Sep 10, 2009 at 09:42:01AM +0800, Wu Fengguang wrote:
> > > > > > On Wed, Sep 09, 2009 at 11:44:13PM +0800, Jan Kara wrote:
> > > > > > > On Wed 09-09-09 22:51:48, Wu Fengguang wrote:
> > > > > > > > Some filesystem may choose to write much more than ratelimit_pages
> > > > > > > > before calling balance_dirty_pages_ratelimited_nr(). So it is safer to
> > > > > > > > determine number to write based on real number of dirtied pages.
> > > > > > > >
> > > > > > > > The increased write_chunk may make the dirtier more bumpy. This is
> > > > > > > > filesystem writers' duty not to dirty too much at a time without
> > > > > > > > checking the ratelimit.
> > > > > > > I don't get this. balance_dirty_pages_ratelimited_nr() is called when we
> > > > > > > dirty the page, not when we write it out. So a problem would only happen if
> > > > > > > filesystem dirties pages by set_page_dirty() and won't call
> > > > > > > balance_dirty_pages_ratelimited_nr(). But e.g. generic_perform_write()
> > > > > > > and do_wp_page() takes care of that. So where's the problem?
> > > > > >
> > > > > > It seems that btrfs_file_write() is writing in chunks of up to 1024-pages
> > > > > > (1024 is the computed nrptrs value in a 32bit kernel). And it calls
> > > > > > balance_dirty_pages_ratelimited_nr() each time it dirtied such a chunk.
> > > > >
> > > > > I can easily change this to call more often, but we do always call
> > > > > balance_dirty_pages to reflect how much ram we've really sent down.
> > > >
> > > > Btrfs is doing OK. 2MB/4MB looks like reasonable chunk sizes. The
> > > > need-change part is balance_dirty_pages_ratelimited_nr(), hence this
> > > > patch :)
> > >
> > > I'm not getting it, it calls set_page_dirty() for each page, right? and
> > > then it calls into balance_dirty_pages_ratelimited_nr(), that sounds
> > > right. What is the problem with that?
> >
> > It looks like btrfs_file_write() eventually calls
> > __set_page_dirty_buffers() which in turn won't call
> > balance_dirty_pages*(). This is why do_wp_page() calls
> > set_page_dirty_balance() to do balance_dirty_pages*().
> >
> > So btrfs_file_write() explicitly calls
> > balance_dirty_pages_ratelimited_nr() to get throttled.
>
> Right, so what is wrong with than, and how does this patch fix that?
>
> [ the only thing you have to be careful with is that you don't
> excessively grow the error bound on the dirty limit ]

Then we could form a loop:

btrfs_file_write(): dirty 1024 pages
balance_dirty_pages(): write up to 12 pages (= ratelimit_pages * 1.5)

in which the writeback rate cannot keep up with dirty rate,
and the dirty pages go all the way beyond dirty_thresh.

Sorry for writing such a vague changelog!

Thanks,
Fengguang

2009-09-10 15:54:40

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC][PATCH 7/7] writeback: balance_dirty_pages() shall write more than dirtied pages

On Thu, 2009-09-10 at 23:41 +0800, Wu Fengguang wrote:
> > > So btrfs_file_write() explicitly calls
> > > balance_dirty_pages_ratelimited_nr() to get throttled.
> >
> > Right, so what is wrong with than, and how does this patch fix that?
> >
> > [ the only thing you have to be careful with is that you don't
> > excessively grow the error bound on the dirty limit ]
>
> Then we could form a loop:
>
> btrfs_file_write(): dirty 1024 pages
> balance_dirty_pages(): write up to 12 pages (= ratelimit_pages * 1.5)
>
> in which the writeback rate cannot keep up with dirty rate,
> and the dirty pages go all the way beyond dirty_thresh.

Ah, ok so this is to keep the error bound on the dirty limit bounded,
because we can break out of balance_dirty_pages() early, the /* We've
done our duty */ break.

Which unbalances the duty vs the dirty ratio.

I figure that with the task dirty limit stuff we could maybe try to get
rid of this break.. worth a try.

> Sorry for writing such a vague changelog!

np, as long as we get there :-)

Change makes sense now, thanks!

2009-09-10 16:09:09

by Fengguang Wu

[permalink] [raw]
Subject: Re: [RFC][PATCH 7/7] writeback: balance_dirty_pages() shall write more than dirtied pages

On Thu, Sep 10, 2009 at 11:54:29PM +0800, Peter Zijlstra wrote:
> On Thu, 2009-09-10 at 23:41 +0800, Wu Fengguang wrote:
> > > > So btrfs_file_write() explicitly calls
> > > > balance_dirty_pages_ratelimited_nr() to get throttled.
> > >
> > > Right, so what is wrong with than, and how does this patch fix that?
> > >
> > > [ the only thing you have to be careful with is that you don't
> > > excessively grow the error bound on the dirty limit ]
> >
> > Then we could form a loop:
> >
> > btrfs_file_write(): dirty 1024 pages
> > balance_dirty_pages(): write up to 12 pages (= ratelimit_pages * 1.5)
> >
> > in which the writeback rate cannot keep up with dirty rate,
> > and the dirty pages go all the way beyond dirty_thresh.
>
> Ah, ok so this is to keep the error bound on the dirty limit bounded,
> because we can break out of balance_dirty_pages() early, the /* We've
> done our duty */ break.
>
> Which unbalances the duty vs the dirty ratio.

Right!

> I figure that with the task dirty limit stuff we could maybe try to get
> rid of this break.. worth a try.

Be careful. Without that break, the time a task get throttled in a
single trip may go out of control. For example, task B get blocked
for 1000 seconds because there is a task A keep dirtying pages, in
the mean time task A's dirty thresh going down slowly, but still
larger than B's.

> > Sorry for writing such a vague changelog!
>
> np, as long as we get there :-)
>
> Change makes sense now, thanks!

May I add you ack?

Thanks,
Fengguang