2010-06-01 14:00:37

by Greg Freemyer

Subject: Re: [PATCH 2.6.27.y 1/3] ext4: Use our own write_cache_pages()

On Mon, May 31, 2010 at 2:35 AM, Kay Diederichs
<[email protected]> wrote:
> On 30.05.2010 23:25, [email protected] wrote:
>>
>> On Fri, May 28, 2010 at 08:41:44PM -0500, Jayson R. King wrote:
>>>
>>> The difference is that, 2.6.27's write_cache_pages() in
>>> page-writeback.c still updates wbc->nr_to_write, since the patch
>>> which changed that behavior was dropped from .27-rc2 due to the XFS
>>> regression it causes on mainline. ext4 appears to want the behavior
>>> of write_cache_pages which does not update wbc->nr_to_write. This
>>> write_cache_pages_da() does what ext4 wants, without introducing the
>>> XFS regression. So I believe it is needed.
>>
>> Ah, OK. So I understand the motivation now, and that's a valid
>> concern. The question is now: how much is the goal of the 2.6.27 stable
>> branch to fix bugs, and how much is it to get the best possible
>> performance, at least with respect to ext4? It's going to be harder
>> and harder to backport fixes to 2.6.27, and I can speak from
>> experience that it's very easy to introduce regressions while trying
>> to do backports, since sometimes an individual upstream commit can end
>> up introducing a regression, and while we do try to document
>> regression fixes in later commits, sometimes the documentation isn't
>> complete.
>>
>> I just spent the better part of a day trying to fix up a backport
>> series for 2.6.32. When I was engaged in this particular exercise, it
>> turns out a particular commit to fix a quota deadlock introduced a
>> regression, and the fix to that introduced yet another, and there were
>> three or four patches that all needed to be pulled in at once. Except
>> initially I missed one, and that caused an i_blocks corruption issue
>> when using fallocate() that took me several hours and a reverse
>> git-bisection to find. (And this is one set of fixes that will
>> probably never be able to go into 2.6.27.y, since these changes also
>> interlock with probably a dozen or so quota changes that have also
>> gone in over the last couple of kernel releases.)
>>
>> I'll also add that simply testing using dbench, as you said you used
>> in another e-mail message, really isn't good enough to find all
>> possible regressions (it wouldn't have found the i_blocks corruption
>> problem in my initial set of 2.6.32 ext4 backports patches, for
>> example, since dbench only tests a very limited set of fs operations,
>> which doesn't include fallocate, or quotas, or mmap for that matter.)
>>
>> What I would recommend is using the XFSQA (also sometimes known as
>> xfstests) test suite to make sure that your changes are sound. Dbench
>> will sometimes find issues, yes, but in my experience it's not the
>> best tool. The fsstress program, which is called in a number of
>> different configurations by xfstests, has found all sorts of problems
>> that other things haven't been able to find. Run it on at least a
>> 2-core system, or preferably a 4-core or 8-core system if you have it.
>> I generally run tests using both 4k and 1k blocksize file systems to
>> make sure there aren't problems where the fs blocksize is less than
>> the pagesize.
>>
>> If you are willing to take on the support burden of ext4 for 2.6.27,
>> and do a lot of testing, I at least wouldn't have any objection to
>> these patches. It's really a question of risk vs. reward for the
>> users of the 2.6.27 stable tree, plus a question of someone willing to
>> take on the support/debugging burden, and how much testing is done to
>> appropriately tilt the risk/reward balance.
>>
>> Regards,
>>
>>                                         - Ted
>
> For what it's worth: my 2.6.27.45 fileservers deadlock reproducibly after 1
> to 2 minutes of heavy NFS load, when using ext4 (never had a problem with
> ext3). Jayson King's patch series (posted Feb 27) fixed this, and I've been
> running it since May 1 without problems.
>
> From my experience, I'd say that the ext4 deadlock needs to be fixed;
> otherwise ext4 in 2.6.27 should not be called stable.
>
> best wishes,
> Kay

It has always been marked experimental in 2.6.27, not stable so I'm
totally lost about this effort.

See http://lxr.linux.no/#linux+v2.6.27.47/fs/Kconfig

config EXT4DEV_FS
        tristate "Ext4dev/ext4 extended fs support development (EXPERIMENTAL)"
        depends on EXPERIMENTAL
        select JBD2
        select CRC16
        help
          Ext4dev is a predecessor filesystem of the next generation
          extended fs ext4, based on ext3 filesystem code. It will be
          renamed ext4 fs later, once ext4dev is mature and stabilized.
...

          If unsure, say N.

Greg


2010-06-01 14:50:15

by Theodore Ts'o

Subject: Re: [PATCH 2.6.27.y 1/3] ext4: Use our own write_cache_pages()


On Jun 1, 2010, at 9:54 AM, Greg Freemyer wrote:
>
> It has always been marked experimental in 2.6.27, not stable so I'm
> totally lost about this effort.
>
> See http://lxr.linux.no/#linux+v2.6.27.47/fs/Kconfig

This is one of the things that confuses me, actually. Why is it that there are a number of people who want to use ext4 on 2.6.27? Even the enterprise distros have moved on; SLES 11 SP1 upgraded their users from 2.6.27 to 2.6.32, for example. I wonder if it's time to start a new "stable anchor point" around 2.6.32, given that Ubuntu's latest Long-Term Support release (Lucid LTS) is based on 2.6.32, as is SLES 11 SP1. The RHEL 6 beta is also based on 2.6.32. (And I just spent quite a bit of time over the past week backporting a lot of ext4 bug fixes to 2.6.32.y :-)

If there are people who want to work on trying to backport more ext4 fixes to 2.6.27, they're of course free to do so. I am really curious as to *why*, though.

Regards,

-- Ted

2010-06-01 15:23:11

by Kay Diederichs

Subject: Re: [PATCH 2.6.27.y 1/3] ext4: Use our own write_cache_pages()

Theodore Tso wrote:
> On Jun 1, 2010, at 9:54 AM, Greg Freemyer wrote:
>> It has always been marked experimental in 2.6.27, not stable so I'm
>> totally lost about this effort.
>>
>> See http://lxr.linux.no/#linux+v2.6.27.47/fs/Kconfig
>
> This is one of the things that confuses me, actually. Why is it that there are a number of people who want to use ext4 on 2.6.27? Even the enterprise distros have moved on; SLES 11 SP1 upgraded their users from 2.6.27 to 2.6.32, for example. I wonder if it's time to start a new "stable anchor point" around 2.6.32, given that Ubuntu's latest Long-Term Support release (Lucid LTS) is based on 2.6.32, as is SLES 11 SP1. The RHEL 6 beta is also based on 2.6.32. (And I just spent quite a bit of time over the past week backporting a lot of ext4 bug fixes to 2.6.32.y :-)
>
> If there are people who want to work on trying to backport more ext4 fixes to 2.6.27, they're of course free to do so. I am really curious as to *why*, though.
>
> Regards,
>
> -- Ted
>

The answer is: because 2.6.27.y is supposed to be a _stable_ kernel. If
it were e.g. 2.6.28 or 2.6.29, nobody would care. But as long as there
is a flow of backported fixes (and there have been quite a few ext4
fixes in 2.6.27) I have the expectation that known bugs get fixed sooner
or later.

If a subsystem maintainer says "I'm not going to support this old stable
thing any longer" then things change. But this is the first time I have
heard that from you - I may have missed earlier announcements to this
effect, though.

best,

Kay



2010-06-01 20:37:37

by Jayson R. King

Subject: Re: [PATCH 2.6.27.y 1/3] ext4: Use our own write_cache_pages()

On 06/01/2010 09:49 AM, Theodore Tso wrote:
> This is one of the things that confuses me, actually. Why is it that there are a number of people who want to use ext4 on 2.6.27? Even the enterprise distros have moved on; SLES 11 SP1 upgraded their users from 2.6.27 to 2.6.32, for example. I wonder if it's time to start a new "stable anchor point" around 2.6.32, given that Ubuntu's latest Long-Term Support release (Lucid LTS) is based on 2.6.32, as is SLES 11 SP1. The RHEL 6 beta is also based on 2.6.32. (And I just spent quite a bit of time over the past week backporting a lot of ext4 bug fixes to 2.6.32.y :-)
>
> If there are people who want to work on trying to backport more ext4 fixes to 2.6.27, they're of course free to do so. I am really curious as to *why*, though.

2.6.27 is still a good kernel and ext4 is a good filesystem, IMO
(existing deadlock notwithstanding).

Like Kay Diederichs mentioned, .27 has received ext4 updates in the
past, even as recently as April this year ("ext4: Avoid null pointer
dereference..."). This of course does not imply that .27 should
receive ext4 fixes (or other fixes) forever, but it is nice to fix the
most serious, show-stopping problems if it is feasible.

(maybe OT?: When I made an attempt to switch to kernel .31 or .32
earlier, the kernel would not boot for me. Surely, I can do some
investigating and get it to boot some day, but I wasn't motivated to
solve it at the time and stuck with .27 instead.)

Thanks for the comments.

Rgds,

Jayson

2010-06-01 22:12:47

by Theodore Ts'o

Subject: Re: [PATCH 2.6.27.y 1/3] ext4: Use our own write_cache_pages()

On Tue, Jun 01, 2010 at 03:06:37PM -0500, Jayson R. King wrote:
> Like Kay Diederichs mentioned, .27 has received ext4 updates in the
> past, even as recently as April this year ("ext4: Avoid null pointer
> dereference..."). This of course does not imply that .27
> should receive ext4 fixes (or other fixes) forever, but it is nice
> to fix the most serious, show-stopping problems if it is feasible.

Some people have, but I stopped making wholesale backport attempts
about 6-9 months ago, due to lack of time and because 2.6.27 was just
getting too hard to backport to. So what has been getting backported
has been a bit scattershot. In retrospect, if I had known people would
want to keep it going for this long, I would have been more aggressive
about backporting patches which might not (strictly speaking) meet the
"critical bugfix" category, but which enable backporting of future
important bug fixes without having to go to some pretty extreme efforts
to make the backport work.

(And past a certain point, we end up needing to manually regen each
patch, and because quota and i_blocks updates are so closely tied
together, and quota received a bunch of "clean up patches", we either
need to merge in all of the quota "clean ups", or we need to regen the
patch pretty much from scratch as part of the stable backport.)

> (maybe OT?: When I made an attempt to switch to kernel .31 or .32
> earlier, the kernel would not boot for me. Surely, I can do some
> investigating and get it to boot some day, but I wasn't motivated to
> solve it at the time and stuck with .27 instead.)

You might want to try the latest 2.6.32 stable kernel and see if it
works any better for you. Since all of the major enterprise distros
are using 2.6.32 as a base, it's likely that a lot of the problems that
may have been in the original 2.6.32 kernel have been fixed.

And, to sweeten the pot, I've done all of the backporting of the ext4
patches to 2.6.32 already, and will probably keep it up for a while,
just because of the enterprise distros that are focusing on 2.6.32.

- Ted