2008-03-17 19:21:24

by Miklos Szeredi

[permalink] [raw]
Subject: [patch 4/8] mm: allow not updating BDI stats in end_page_writeback()

From: Miklos Szeredi <[email protected]>

Fuse's writepage will need to clear page writeback separately from
updating the per BDI counters.

This patch renames end_page_writeback() to __end_page_writeback() and
adds a boolean parameter to indicate if the per BDI stats need to be
updated.

Regular callers get an inline end_page_writeback() without the boolean
parameter.

Signed-off-by: Miklos Szeredi <[email protected]>
---
include/linux/page-flags.h | 2 +-
include/linux/pagemap.h | 7 ++++++-
mm/filemap.c | 7 ++++---
mm/page-writeback.c | 4 ++--
4 files changed, 13 insertions(+), 7 deletions(-)

Index: linux/include/linux/page-flags.h
===================================================================
--- linux.orig/include/linux/page-flags.h 2008-03-17 18:24:13.000000000 +0100
+++ linux/include/linux/page-flags.h 2008-03-17 18:25:53.000000000 +0100
@@ -300,7 +300,7 @@ struct page; /* forward declaration */

extern void cancel_dirty_page(struct page *page, unsigned int account_size);

-int test_clear_page_writeback(struct page *page);
+int test_clear_page_writeback(struct page *page, bool bdi_stats);
int test_set_page_writeback(struct page *page);

static inline void set_page_writeback(struct page *page)
Index: linux/include/linux/pagemap.h
===================================================================
--- linux.orig/include/linux/pagemap.h 2008-03-17 18:24:13.000000000 +0100
+++ linux/include/linux/pagemap.h 2008-03-17 18:25:53.000000000 +0100
@@ -223,7 +223,12 @@ static inline void wait_on_page_writebac
wait_on_page_bit(page, PG_writeback);
}

-extern void end_page_writeback(struct page *page);
+extern void __end_page_writeback(struct page *page, bool bdi_stats);
+
+static inline void end_page_writeback(struct page *page)
+{
+ __end_page_writeback(page, true);
+}

/*
* Fault a userspace page into pagetables. Return non-zero on a fault.
Index: linux/mm/filemap.c
===================================================================
--- linux.orig/mm/filemap.c 2008-03-17 18:25:38.000000000 +0100
+++ linux/mm/filemap.c 2008-03-17 18:25:53.000000000 +0100
@@ -574,19 +574,20 @@ EXPORT_SYMBOL(unlock_page);
/**
* end_page_writeback - end writeback against a page
* @page: the page
+ * @bdi_stats: update the per-bdi writeback counter
*/
-void end_page_writeback(struct page *page)
+void __end_page_writeback(struct page *page, bool bdi_stats)
{
if (TestClearPageReclaim(page))
rotate_reclaimable_page(page);

- if (!test_clear_page_writeback(page))
+ if (!test_clear_page_writeback(page, bdi_stats))
BUG();

smp_mb__after_clear_bit();
wake_up_page(page, PG_writeback);
}
-EXPORT_SYMBOL(end_page_writeback);
+EXPORT_SYMBOL(__end_page_writeback);

/**
* __lock_page - get a lock on the page, assuming we need to sleep to get it
Index: linux/mm/page-writeback.c
===================================================================
--- linux.orig/mm/page-writeback.c 2008-03-17 18:25:17.000000000 +0100
+++ linux/mm/page-writeback.c 2008-03-17 18:25:53.000000000 +0100
@@ -1242,7 +1242,7 @@ int clear_page_dirty_for_io(struct page
}
EXPORT_SYMBOL(clear_page_dirty_for_io);

-int test_clear_page_writeback(struct page *page)
+int test_clear_page_writeback(struct page *page, bool bdi_stats)
{
struct address_space *mapping = page_mapping(page);
int ret;
@@ -1257,7 +1257,7 @@ int test_clear_page_writeback(struct pag
radix_tree_tag_clear(&mapping->page_tree,
page_index(page),
PAGECACHE_TAG_WRITEBACK);
- if (bdi_cap_writeback_dirty(bdi)) {
+ if (bdi_stats && bdi_cap_writeback_dirty(bdi)) {
__dec_bdi_stat(bdi, BDI_WRITEBACK);
__bdi_writeout_inc(bdi);
}

--


2008-03-18 05:05:32

by Andrew Morton

[permalink] [raw]
Subject: Re: [patch 4/8] mm: allow not updating BDI stats in end_page_writeback()

On Mon, 17 Mar 2008 20:19:12 +0100 Miklos Szeredi <[email protected]> wrote:

> From: Miklos Szeredi <[email protected]>
>
> Fuse's writepage will need to clear page writeback separately from
> updating the per BDI counters.
>
> This patch renames end_page_writeback() to __end_page_writeback() and
> adds a boolean parameter to indicate if the per BDI stats need to be
> updated.
>
> Regular callers get an inline end_page_writeback() without the boolean
> parameter.
>
> ...
>
> Index: linux/include/linux/page-flags.h
> ===================================================================
> --- linux.orig/include/linux/page-flags.h 2008-03-17 18:24:13.000000000 +0100
> +++ linux/include/linux/page-flags.h 2008-03-17 18:25:53.000000000 +0100
> @@ -300,7 +300,7 @@ struct page; /* forward declaration */
>
> extern void cancel_dirty_page(struct page *page, unsigned int account_size);
>
> -int test_clear_page_writeback(struct page *page);
> +int test_clear_page_writeback(struct page *page, bool bdi_stats);
> int test_set_page_writeback(struct page *page);
>
> static inline void set_page_writeback(struct page *page)
> Index: linux/include/linux/pagemap.h
> ===================================================================
> --- linux.orig/include/linux/pagemap.h 2008-03-17 18:24:13.000000000 +0100
> +++ linux/include/linux/pagemap.h 2008-03-17 18:25:53.000000000 +0100
> @@ -223,7 +223,12 @@ static inline void wait_on_page_writebac
> wait_on_page_bit(page, PG_writeback);
> }
>
> -extern void end_page_writeback(struct page *page);
> +extern void __end_page_writeback(struct page *page, bool bdi_stats);
> +
> +static inline void end_page_writeback(struct page *page)
> +{
> + __end_page_writeback(page, true);
> +}
>
> /*
> * Fault a userspace page into pagetables. Return non-zero on a fault.
> Index: linux/mm/filemap.c
> ===================================================================
> --- linux.orig/mm/filemap.c 2008-03-17 18:25:38.000000000 +0100
> +++ linux/mm/filemap.c 2008-03-17 18:25:53.000000000 +0100
> @@ -574,19 +574,20 @@ EXPORT_SYMBOL(unlock_page);
> /**
> * end_page_writeback - end writeback against a page
> * @page: the page
> + * @bdi_stats: update the per-bdi writeback counter
> */
> -void end_page_writeback(struct page *page)
> +void __end_page_writeback(struct page *page, bool bdi_stats)
> {
> if (TestClearPageReclaim(page))
> rotate_reclaimable_page(page);
>
> - if (!test_clear_page_writeback(page))
> + if (!test_clear_page_writeback(page, bdi_stats))
> BUG();
>
> smp_mb__after_clear_bit();
> wake_up_page(page, PG_writeback);
> }
> -EXPORT_SYMBOL(end_page_writeback);
> +EXPORT_SYMBOL(__end_page_writeback);
>
> /**
> * __lock_page - get a lock on the page, assuming we need to sleep to get it
> Index: linux/mm/page-writeback.c
> ===================================================================
> --- linux.orig/mm/page-writeback.c 2008-03-17 18:25:17.000000000 +0100
> +++ linux/mm/page-writeback.c 2008-03-17 18:25:53.000000000 +0100
> @@ -1242,7 +1242,7 @@ int clear_page_dirty_for_io(struct page
> }
> EXPORT_SYMBOL(clear_page_dirty_for_io);
>
> -int test_clear_page_writeback(struct page *page)
> +int test_clear_page_writeback(struct page *page, bool bdi_stats)
> {
> struct address_space *mapping = page_mapping(page);
> int ret;
> @@ -1257,7 +1257,7 @@ int test_clear_page_writeback(struct pag
> radix_tree_tag_clear(&mapping->page_tree,
> page_index(page),
> PAGECACHE_TAG_WRITEBACK);
> - if (bdi_cap_writeback_dirty(bdi)) {
> + if (bdi_stats && bdi_cap_writeback_dirty(bdi)) {
> __dec_bdi_stat(bdi, BDI_WRITEBACK);
> __bdi_writeout_inc(bdi);
> }

Adding `mode' flags to a core function is generally considered poor form.
And it adds additional overhead and possibly stack utilisation for all
callers.

We generally prefer that a new function be created. After all, that's what
you've done here, only the code has gone and wedged two different functions
into one.


Another approach might be to add a new bdi_cap_foo() flag. We could then do

if (bdi_cap_writeback_dirty(bdi) && bdi_cap_mumble(bdi)) {

here. But even better would be to create a new BDI capability which
indicates that this address_space doesn't want this treatment in
test_clear_page_writeback(), then go fix up all the
!bdi_cap_writeback_dirty() address_spaces to set that flag.

So then the code becomes

if (!bdi_cap_account_writeback_in_test_clear_page_writeback(bdi)) {

(good luck thinking up a better name ;))

Reason: bdi_cap_writeback_dirty() is kinda weirdly intrepreted to mean
various different things in different places and we really should separate
its multiple interpretations into separate flags.

Note that this becomes a standalone VFS cleanup patch, and the fuse code
can then just use it later on.

2008-03-18 08:12:19

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [patch 4/8] mm: allow not updating BDI stats in end_page_writeback()

[PeterZ added to CC]

> On Mon, 17 Mar 2008 20:19:12 +0100 Miklos Szeredi <[email protected]> wrote:
>
> > From: Miklos Szeredi <[email protected]>
> >
> > Fuse's writepage will need to clear page writeback separately from
> > updating the per BDI counters.
> >
> > This patch renames end_page_writeback() to __end_page_writeback() and
> > adds a boolean parameter to indicate if the per BDI stats need to be
> > updated.
> >
> > Regular callers get an inline end_page_writeback() without the boolean
> > parameter.
> >
> > ...
> >
> > Index: linux/include/linux/page-flags.h
> > ===================================================================
> > --- linux.orig/include/linux/page-flags.h 2008-03-17 18:24:13.000000000 +0100
> > +++ linux/include/linux/page-flags.h 2008-03-17 18:25:53.000000000 +0100
> > @@ -300,7 +300,7 @@ struct page; /* forward declaration */
> >
> > extern void cancel_dirty_page(struct page *page, unsigned int account_size);
> >
> > -int test_clear_page_writeback(struct page *page);
> > +int test_clear_page_writeback(struct page *page, bool bdi_stats);
> > int test_set_page_writeback(struct page *page);
> >
> > static inline void set_page_writeback(struct page *page)
> > Index: linux/include/linux/pagemap.h
> > ===================================================================
> > --- linux.orig/include/linux/pagemap.h 2008-03-17 18:24:13.000000000 +0100
> > +++ linux/include/linux/pagemap.h 2008-03-17 18:25:53.000000000 +0100
> > @@ -223,7 +223,12 @@ static inline void wait_on_page_writebac
> > wait_on_page_bit(page, PG_writeback);
> > }
> >
> > -extern void end_page_writeback(struct page *page);
> > +extern void __end_page_writeback(struct page *page, bool bdi_stats);
> > +
> > +static inline void end_page_writeback(struct page *page)
> > +{
> > + __end_page_writeback(page, true);
> > +}
> >
> > /*
> > * Fault a userspace page into pagetables. Return non-zero on a fault.
> > Index: linux/mm/filemap.c
> > ===================================================================
> > --- linux.orig/mm/filemap.c 2008-03-17 18:25:38.000000000 +0100
> > +++ linux/mm/filemap.c 2008-03-17 18:25:53.000000000 +0100
> > @@ -574,19 +574,20 @@ EXPORT_SYMBOL(unlock_page);
> > /**
> > * end_page_writeback - end writeback against a page
> > * @page: the page
> > + * @bdi_stats: update the per-bdi writeback counter
> > */
> > -void end_page_writeback(struct page *page)
> > +void __end_page_writeback(struct page *page, bool bdi_stats)
> > {
> > if (TestClearPageReclaim(page))
> > rotate_reclaimable_page(page);
> >
> > - if (!test_clear_page_writeback(page))
> > + if (!test_clear_page_writeback(page, bdi_stats))
> > BUG();
> >
> > smp_mb__after_clear_bit();
> > wake_up_page(page, PG_writeback);
> > }
> > -EXPORT_SYMBOL(end_page_writeback);
> > +EXPORT_SYMBOL(__end_page_writeback);
> >
> > /**
> > * __lock_page - get a lock on the page, assuming we need to sleep to get it
> > Index: linux/mm/page-writeback.c
> > ===================================================================
> > --- linux.orig/mm/page-writeback.c 2008-03-17 18:25:17.000000000 +0100
> > +++ linux/mm/page-writeback.c 2008-03-17 18:25:53.000000000 +0100
> > @@ -1242,7 +1242,7 @@ int clear_page_dirty_for_io(struct page
> > }
> > EXPORT_SYMBOL(clear_page_dirty_for_io);
> >
> > -int test_clear_page_writeback(struct page *page)
> > +int test_clear_page_writeback(struct page *page, bool bdi_stats)
> > {
> > struct address_space *mapping = page_mapping(page);
> > int ret;
> > @@ -1257,7 +1257,7 @@ int test_clear_page_writeback(struct pag
> > radix_tree_tag_clear(&mapping->page_tree,
> > page_index(page),
> > PAGECACHE_TAG_WRITEBACK);
> > - if (bdi_cap_writeback_dirty(bdi)) {
> > + if (bdi_stats && bdi_cap_writeback_dirty(bdi)) {
> > __dec_bdi_stat(bdi, BDI_WRITEBACK);
> > __bdi_writeout_inc(bdi);
> > }
>
> Adding `mode' flags to a core function is generally considered poor form.
> And it adds additional overhead and possibly stack utilisation for all
> callers.
>
> We generally prefer that a new function be created. After all, that's what
> you've done here, only the code has gone and wedged two different functions
> into one.

Yes, although duplicating such a not entirely trivial function has
it's dangers as well, I think.

> Another approach might be to add a new bdi_cap_foo() flag. We could then do
>
> if (bdi_cap_writeback_dirty(bdi) && bdi_cap_mumble(bdi)) {
>
> here. But even better would be to create a new BDI capability which
> indicates that this address_space doesn't want this treatment in
> test_clear_page_writeback(), then go fix up all the
> !bdi_cap_writeback_dirty() address_spaces to set that flag.
>
> So then the code becomes
>
> if (!bdi_cap_account_writeback_in_test_clear_page_writeback(bdi)) {
>
> (good luck thinking up a better name ;))
>
> Reason: bdi_cap_writeback_dirty() is kinda weirdly intrepreted to mean
> various different things in different places and we really should separate
> its multiple interpretations into separate flags.
>
> Note that this becomes a standalone VFS cleanup patch, and the fuse code
> can then just use it later on.

Hmm, I can see two slightly different meanings of bdi_cap_writeback_dirty():

1) need to call ->writepage (sync_page_range(), ...)
2) need to update BDI stats (test_clear_page_writeback(), ...)

If these two were different flags, then fuse could set the
NEED_WRITEPAGE flag, but clear the NEED_UPDATE_BDI_STATS flag, and do
it manually.

Does that sound workable?

Thanks,
Miklos

2008-03-18 08:19:10

by Andrew Morton

[permalink] [raw]
Subject: Re: [patch 4/8] mm: allow not updating BDI stats in end_page_writeback()

On Tue, 18 Mar 2008 09:11:49 +0100 Miklos Szeredi <[email protected]> wrote:

> >
> > Reason: bdi_cap_writeback_dirty() is kinda weirdly intrepreted to mean
> > various different things in different places and we really should separate
> > its multiple interpretations into separate flags.
> >
> > Note that this becomes a standalone VFS cleanup patch, and the fuse code
> > can then just use it later on.
>
> Hmm, I can see two slightly different meanings of bdi_cap_writeback_dirty():
>
> 1) need to call ->writepage (sync_page_range(), ...)
> 2) need to update BDI stats (test_clear_page_writeback(), ...)
>
> If these two were different flags, then fuse could set the
> NEED_WRITEPAGE flag, but clear the NEED_UPDATE_BDI_STATS flag, and do
> it manually.
>
> Does that sound workable?

Yup, thanks.

2008-03-18 11:34:16

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [patch 4/8] mm: allow not updating BDI stats in end_page_writeback()

On Mon, 2008-03-17 at 20:19 +0100, Miklos Szeredi wrote:
> plain text document attachment (end_page_writeback_nobdi.patch)
> From: Miklos Szeredi <[email protected]>
>
> Fuse's writepage will need to clear page writeback separately from
> updating the per BDI counters.

This is because of the juggling with temporary pages, right?

Would be nice to have some comments in the code explaining this.

2008-03-18 11:59:26

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [patch 4/8] mm: allow not updating BDI stats in end_page_writeback()

> On Mon, 2008-03-17 at 20:19 +0100, Miklos Szeredi wrote:
> > plain text document attachment (end_page_writeback_nobdi.patch)
> > From: Miklos Szeredi <[email protected]>
> >
> > Fuse's writepage will need to clear page writeback separately from
> > updating the per BDI counters.
>
> This is because of the juggling with temporary pages, right?
>
> Would be nice to have some comments in the code explaining this.

Yup, well it will go through a bigger cleanup, as discussed with
Andrew, if that's OK with you?

Miklos

2008-03-18 12:29:53

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [patch 4/8] mm: allow not updating BDI stats in end_page_writeback()

On Tue, 2008-03-18 at 12:59 +0100, Miklos Szeredi wrote:
> > On Mon, 2008-03-17 at 20:19 +0100, Miklos Szeredi wrote:
> > > plain text document attachment (end_page_writeback_nobdi.patch)
> > > From: Miklos Szeredi <[email protected]>
> > >
> > > Fuse's writepage will need to clear page writeback separately from
> > > updating the per BDI counters.
> >
> > This is because of the juggling with temporary pages, right?
> >
> > Would be nice to have some comments in the code explaining this.
>
> Yup, well it will go through a bigger cleanup, as discussed with
> Andrew, if that's OK with you?

Well, I was pondering that - it hadn't made its way out to the other
side of my brain yet.. but I'll dump have I have:

Yes, it does two things, _however_ those two things are very much
related. Your use-case that breaks this relation is an execption - and I
haven't really grasped it yet..

I'm in general not too keen about you having to export the BDI
accounting stuff and using it explicitly like this, but I'm afraid I
don't see a way around it - the danger is that other filesystems will
get creative (hence the req for GPL - that excludes the most creative
ones).

Yes, it makes sense to delay the write completion accounting until its
actually completed.. but I would suggest all writeback accounting.

So the thing that's in your way is that removing a page from the radix
tree doesn't imply its done writing. So perhaps we should make that
distinction instead?

So instead of conditionally do part of the accounting, never do it and
require something like: page_writeback_complete() to be called after a
successfull test_clear_page_writeback().

2008-03-18 12:51:35

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [patch 4/8] mm: allow not updating BDI stats in end_page_writeback()

> Yes, it does two things, _however_ those two things are very much
> related. Your use-case that breaks this relation is an execption - and I
> haven't really grasped it yet..
>
> I'm in general not too keen about you having to export the BDI
> accounting stuff and using it explicitly like this, but I'm afraid I
> don't see a way around it - the danger is that other filesystems will
> get creative (hence the req for GPL - that excludes the most creative
> ones).
>
> Yes, it makes sense to delay the write completion accounting until its
> actually completed.. but I would suggest all writeback accounting.

Doesn't work, as long as we have throttle_vm_writeout() waiting for
NR_WRITEBACK to go below a threshold, delaying the NR_WRITEBACK
accounting could lead to a deadlock.

So at least until that's resolved NR_WRITEBACK_TEMP needs to be
separate from NR_WRITEBACK_TEMP. And it makes sense possibly even
after that, as they are fundamentally different things. The first one
is page cache pages being under writeout, the second is just kernel
buffers (mostly) unrelated to the page cache.

> So the thing that's in your way is that removing a page from the radix
> tree doesn't imply its done writing. So perhaps we should make that
> distinction instead?
>
> So instead of conditionally do part of the accounting, never do it and
> require something like: page_writeback_complete() to be called after a
> successfull test_clear_page_writeback().

Yes, that's a possibility, but then normal filesystems miss out on the
small optimization provided by doing the BDI accounting functions
inside the same IRQ disabled region as the radix tree operation.
Would that have any significant performance impact?

Miklos

2008-03-18 13:08:41

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [patch 4/8] mm: allow not updating BDI stats in end_page_writeback()

On Tue, 2008-03-18 at 13:51 +0100, Miklos Szeredi wrote:
> > Yes, it does two things, _however_ those two things are very much
> > related. Your use-case that breaks this relation is an execption - and I
> > haven't really grasped it yet..
> >
> > I'm in general not too keen about you having to export the BDI
> > accounting stuff and using it explicitly like this, but I'm afraid I
> > don't see a way around it - the danger is that other filesystems will
> > get creative (hence the req for GPL - that excludes the most creative
> > ones).
> >
> > Yes, it makes sense to delay the write completion accounting until its
> > actually completed.. but I would suggest all writeback accounting.
>
> Doesn't work, as long as we have throttle_vm_writeout() waiting for
> NR_WRITEBACK to go below a threshold, delaying the NR_WRITEBACK
> accounting could lead to a deadlock.
>
> So at least until that's resolved NR_WRITEBACK_TEMP needs to be
> separate from NR_WRITEBACK_TEMP. And it makes sense possibly even
> after that, as they are fundamentally different things. The first one
> is page cache pages being under writeout, the second is just kernel
> buffers (mostly) unrelated to the page cache.

Urgh, throttle_vm_writeout() again.. Agreed, that'll deadlock.

> > So the thing that's in your way is that removing a page from the radix
> > tree doesn't imply its done writing. So perhaps we should make that
> > distinction instead?
> >
> > So instead of conditionally do part of the accounting, never do it and
> > require something like: page_writeback_complete() to be called after a
> > successfull test_clear_page_writeback().
>
> Yes, that's a possibility, but then normal filesystems miss out on the
> small optimization provided by doing the BDI accounting functions
> inside the same IRQ disabled region as the radix tree operation.
> Would that have any significant performance impact?

Yeah, realized that. Don't know, would have to measure it somehow...
some archs are rather slow with disabling IRQs, but we're talking about
writeout which should be dominated by the IO times.

Its just that your proposal exposes too much guts, I'd like the
interface to be a little higher level.


2008-03-18 13:58:21

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [patch 4/8] mm: allow not updating BDI stats in end_page_writeback()

> > > So the thing that's in your way is that removing a page from the radix
> > > tree doesn't imply its done writing. So perhaps we should make that
> > > distinction instead?
> > >
> > > So instead of conditionally do part of the accounting, never do it and
> > > require something like: page_writeback_complete() to be called after a
> > > successfull test_clear_page_writeback().
> >
> > Yes, that's a possibility, but then normal filesystems miss out on the
> > small optimization provided by doing the BDI accounting functions
> > inside the same IRQ disabled region as the radix tree operation.
> > Would that have any significant performance impact?
>
> Yeah, realized that. Don't know, would have to measure it somehow...
> some archs are rather slow with disabling IRQs, but we're talking about
> writeout which should be dominated by the IO times.
>
> Its just that your proposal exposes too much guts, I'd like the
> interface to be a little higher level.

Well, but this is the kernel, you can't really make foolproof
interfaces. If we'll go with Andrew's suggestion, I'll add comments
warning users about not touching those flags unless they know what
they are doing, OK?

Miklos

2008-03-18 13:59:39

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [patch 4/8] mm: allow not updating BDI stats in end_page_writeback()

On Tue, 2008-03-18 at 14:58 +0100, Miklos Szeredi wrote:
> > > > So the thing that's in your way is that removing a page from the radix
> > > > tree doesn't imply its done writing. So perhaps we should make that
> > > > distinction instead?
> > > >
> > > > So instead of conditionally do part of the accounting, never do it and
> > > > require something like: page_writeback_complete() to be called after a
> > > > successfull test_clear_page_writeback().
> > >
> > > Yes, that's a possibility, but then normal filesystems miss out on the
> > > small optimization provided by doing the BDI accounting functions
> > > inside the same IRQ disabled region as the radix tree operation.
> > > Would that have any significant performance impact?
> >
> > Yeah, realized that. Don't know, would have to measure it somehow...
> > some archs are rather slow with disabling IRQs, but we're talking about
> > writeout which should be dominated by the IO times.
> >
> > Its just that your proposal exposes too much guts, I'd like the
> > interface to be a little higher level.
>
> Well, but this is the kernel, you can't really make foolproof
> interfaces. If we'll go with Andrew's suggestion, I'll add comments
> warning users about not touching those flags unless they know what
> they are doing, OK?

Yeah, I guess so :-)

2008-03-19 19:35:16

by Andrew Morton

[permalink] [raw]
Subject: Re: [patch 4/8] mm: allow not updating BDI stats in end_page_writeback()

On Tue, 18 Mar 2008 16:53:52 +0100 Miklos Szeredi <[email protected]> wrote:

> On a related note, is there a reason why bdi_cap_writeback_dirty() and
> friends need to be macros instead of inline functions?

None whatsoever.

> If not I'd clean that up as well.

Goodness.

2008-03-20 00:23:50

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [patch 4/8] mm: allow not updating BDI stats in end_page_writeback()

> > Well, but this is the kernel, you can't really make foolproof
> > interfaces. If we'll go with Andrew's suggestion, I'll add comments
> > warning users about not touching those flags unless they know what
> > they are doing, OK?
>
> Yeah, I guess so :-)

Cool :)

On a related note, is there a reason why bdi_cap_writeback_dirty() and
friends need to be macros instead of inline functions? If not I'd
clean that up as well.

Miklos