2009-06-29 08:06:45

by Yanmin Zhang

[permalink] [raw]
Subject: ffsb create_4k 16% regression

I run many ffsb test cases on JBODs (typically 13/12 disks). Comparing
with kernel 2.6.30, 2.6.31-rc1 has about 16% regression with
ffsb_create_4k. The sub test case creates files continuously for 10
minitues and every file is 1MB.

Bisect located below patch.


5cee5815d1564bbbd505fea86f4550f1efdb5cd0 is first bad commit
commit 5cee5815d1564bbbd505fea86f4550f1efdb5cd0
Author: Jan Kara <[email protected]>
Date: Mon Apr 27 16:43:51 2009 +0200

vfs: Make sys_sync() use fsync_super() (version 4)

It is unnecessarily fragile to have two places (fsync_super() and do_sync())
doing data integrity sync of the filesystem. Alter __fsync_super() to
accommodate needs of both callers and use it. So after this patch
__fsync_super() is the only place where we gather all the calls needed to
properly send all data on a filesystem to disk.


As a matter of fact, ffsb calls sys_sync in the end to make sure all data is
flushed to disks and the flushing is counted into the result. vmstat shows
ffsb is blocked when syncing for a long time. With 2.6.30, ffsb is blocked for
a short time.

I checked the patch and did experiments to recover the original methods. Eventually,
the root cause is the patch deletes the calling to wakeup_pdflush when syncing, so
only ffsb is blocked on disk I/O. wakeup_pdflush could ask pdflush to write back pages
with ffsb at the same time.

Below patch against 2.6.31-rc1 fixes it.

Signed-off-by: Zhang Yanmin <[email protected]>

---

--- linux-2.6.31-rc1/fs/sync.c 2009-06-29 12:18:03.000000000 +0800
+++ linux-2.6.31-rc1_sync/fs/sync.c 2009-06-29 14:56:49.000000000 +0800
@@ -114,6 +114,7 @@ restart:

SYSCALL_DEFINE0(sync)
{
+ wakeup_pdflush(0);
sync_filesystems(0);
sync_filesystems(1);
if (unlikely(laptop_mode))



2009-07-01 05:32:43

by Yanmin Zhang

[permalink] [raw]
Subject: Re: ffsb create_4k 16% regression

On Mon, 2009-06-29 at 16:06 +0800, Zhang, Yanmin wrote:
> I run many ffsb test cases on JBODs (typically 13/12 disks). Comparing
> with kernel 2.6.30, 2.6.31-rc1 has about 16% regression with
> ffsb_create_4k. The sub test case creates files continuously for 10
> minitues and every file is 1MB.
>
> Bisect located below patch.
>
>
> 5cee5815d1564bbbd505fea86f4550f1efdb5cd0 is first bad commit
> commit 5cee5815d1564bbbd505fea86f4550f1efdb5cd0
> Author: Jan Kara <[email protected]>
> Date: Mon Apr 27 16:43:51 2009 +0200
>
> vfs: Make sys_sync() use fsync_super() (version 4)
>
> It is unnecessarily fragile to have two places (fsync_super() and do_sync())
> doing data integrity sync of the filesystem. Alter __fsync_super() to
> accommodate needs of both callers and use it. So after this patch
> __fsync_super() is the only place where we gather all the calls needed to
> properly send all data on a filesystem to disk.
>
>
> As a matter of fact, ffsb calls sys_sync in the end to make sure all data is
> flushed to disks and the flushing is counted into the result. vmstat shows
> ffsb is blocked when syncing for a long time. With 2.6.30, ffsb is blocked for
> a short time.
>
> I checked the patch and did experiments to recover the original methods. Eventually,
> the root cause is the patch deletes the calling to wakeup_pdflush when syncing, so
> only ffsb is blocked on disk I/O. wakeup_pdflush could ask pdflush to write back pages
> with ffsb at the same time.
>
> Below patch against 2.6.31-rc1 fixes it.
>
> Signed-off-by: Zhang Yanmin <[email protected]>
>
> ---
>
> --- linux-2.6.31-rc1/fs/sync.c 2009-06-29 12:18:03.000000000 +0800
> +++ linux-2.6.31-rc1_sync/fs/sync.c 2009-06-29 14:56:49.000000000 +0800
> @@ -114,6 +114,7 @@ restart:
>
> SYSCALL_DEFINE0(sync)
> {
> + wakeup_pdflush(0);
> sync_filesystems(0);
> sync_filesystems(1);
> if (unlikely(laptop_mode))
Andrew,

Would you like to consider the patch for mm tree?

Thanks,
Yanmin

2009-07-01 05:46:52

by Andrew Morton

[permalink] [raw]
Subject: Re: ffsb create_4k 16% regression

On Wed, 01 Jul 2009 13:33:00 +0800 "Zhang, Yanmin" <[email protected]> wrote:

> On Mon, 2009-06-29 at 16:06 +0800, Zhang, Yanmin wrote:
> > I run many ffsb test cases on JBODs (typically 13/12 disks). Comparing
> > with kernel 2.6.30, 2.6.31-rc1 has about 16% regression with
> > ffsb_create_4k. The sub test case creates files continuously for 10
> > minitues and every file is 1MB.
> >
> > Bisect located below patch.
> >
> >
> > 5cee5815d1564bbbd505fea86f4550f1efdb5cd0 is first bad commit
> > commit 5cee5815d1564bbbd505fea86f4550f1efdb5cd0
> > Author: Jan Kara <[email protected]>
> > Date: Mon Apr 27 16:43:51 2009 +0200
> >
> > vfs: Make sys_sync() use fsync_super() (version 4)
> >
> > It is unnecessarily fragile to have two places (fsync_super() and do_sync())
> > doing data integrity sync of the filesystem. Alter __fsync_super() to
> > accommodate needs of both callers and use it. So after this patch
> > __fsync_super() is the only place where we gather all the calls needed to
> > properly send all data on a filesystem to disk.
> >
> >
> > As a matter of fact, ffsb calls sys_sync in the end to make sure all data is
> > flushed to disks and the flushing is counted into the result. vmstat shows
> > ffsb is blocked when syncing for a long time. With 2.6.30, ffsb is blocked for
> > a short time.
> >
> > I checked the patch and did experiments to recover the original methods. Eventually,
> > the root cause is the patch deletes the calling to wakeup_pdflush when syncing, so
> > only ffsb is blocked on disk I/O. ___wakeup_pdflush could ask pdflush to write back pages
> > with ffsb at the same time.
> >
> > Below patch against 2.6.31-rc1 fixes it.
> >
> > ___Signed-off-by: Zhang Yanmin <[email protected]>
> >
> > ---
> >
> > --- linux-2.6.31-rc1/fs/sync.c 2009-06-29 12:18:03.000000000 +0800
> > +++ linux-2.6.31-rc1_sync/fs/sync.c 2009-06-29 14:56:49.000000000 +0800
> > @@ -114,6 +114,7 @@ restart:
> >
> > SYSCALL_DEFINE0(sync)
> > {
> > + wakeup_pdflush(0);
> > sync_filesystems(0);
> > sync_filesystems(1);
> > if (unlikely(laptop_mode))
> Andrew,
>
> Would you like to consider the patch for mm tree?

OK.

But if we're to restore that operation, we should also restore the nice
comment explaining why we are doing it:

-/*
- * sync everything. Start out by waking pdflush, because that writes back
- * all queues in parallel.
- */
-static void do_sync(unsigned long wait)

I'll take care of that.