2008-07-27 10:28:49

by Peter Meier

Subject: Trying out 2.6.26-ext4-3

Hi,

well, the timely release of 2.6.26-ext4-3 seemed like a good
omen, and so I decided to try and put my new home box in-
stall on ext4 today, using that latest patchset and e2fsprogs
1.41.0. So far I'm not getting very far.

I'm attempting to install Gentoo Linux, which involves copying
an initial tarball to the root file system and unpacking it. Already
while downloading the tarball into the FS my console and dmesg
were getting spammed with messages of the type:

JBD: pdflush wants too many credits (x > y)

And then, while unpacking it:

JBD: tar wants too many credits (x > y)
JBD: pdflush wants too many credits (x > y)

The latter repeatedly and with no end, to the point where I
eventually had to hard reset the box because it seemingly
wasn't going to stop while achieving no disk activity (LED
not blinking).

I have an inkling that mke2fs -t ext4dev didn't make a big
enough journal in the 46 GB big partition, but not being an
ext-pert (...) I wouldn't really know how to interpret the error.

So, anyone want to rescue my Sunday and give some pointers?


Greetings,
Peter


2008-07-27 10:58:31

by Peter Meier

Subject: Re: Trying out 2.6.26-ext4-3

Hi again,

some more info:

The x and y in the error messages used to be around 20k
or 30k for x and 8192 for y. Googling around, it seems that's
supposed to say that the maximum transaction size is 8192
blocks. Further, the maximum transaction size is supposedly
1/4 of the journal size. Assuming mke2fs -t ext4dev created
the FS with 4096-byte blocks (dunno), that would mean the journal
is 128MB, which is apparently typical. At least I have a 40GB
ext3 here that got the same journal size.
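
The arithmetic above can be checked with a minimal sketch. It assumes
4096-byte blocks and takes the "1/4 of the journal" rule at face value;
neither is confirmed in this thread, and the function name is made up for
illustration.

```python
# Back-of-the-envelope check of the journal size implied by the JBD message.
# Assumptions (not confirmed in this thread): 4096-byte file system blocks,
# and a maximum transaction size of 1/4 of the journal.

MAX_TRANSACTION_BLOCKS = 8192   # the "y" from the error message
JOURNAL_FRACTION = 4            # journal = 4 * max transaction (assumed)
BLOCK_SIZE_BYTES = 4096         # assumed block size


def journal_size_bytes(max_transaction_blocks, block_size_bytes,
                       fraction=JOURNAL_FRACTION):
    """Return the journal size in bytes implied by the transaction limit."""
    journal_blocks = max_transaction_blocks * fraction
    return journal_blocks * block_size_bytes


size = journal_size_bytes(MAX_TRANSACTION_BLOCKS, BLOCK_SIZE_BYTES)
print(f"journal: {size // (1024 * 1024)} MiB")
```

Under those assumptions the result is the 128 MB journal mentioned above;
with a different block size the same limit of 8192 blocks would imply a
proportionally different journal.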

So I guess the question is why a basic operation like using
links to download a 130MB tar archive to an ext4dev file sys-
tem or unpacking it with tar would die by way of huge trans-
actions that don't work with a normal journal.

Still hoping I'll be able to continue with the installation today
before the work week starts again ;). Kernel hackers read
lists on Sunday, I hope :-).


Greetings,
Peter

PS.: I'm writing via the shitty GMail web interface from an in-
stall CD; here's hoping replying will work better than I fear it
might.

2008-07-27 11:52:52

by Peter Meier

Subject: Re: Trying out 2.6.26-ext4-3

Hi once more,

looking around further, I spotted this patch in the patch queue:

ext4_journal_credits_fix_for_writepages.patch

Again, I'm not a file system hacker, but if I understand the situa-
tion correctly, this patch aims to correct the number of credits
that write activity will try to reserve for a transaction in certain
situations and when using delayed allocation. I think this patch
gets it wrong in some way, and as a result tries to reserve an
amount of credits that far exceeds the maximum transaction
size for an ordinary 128 MB journal, or maybe this hints at a
larger problem with delayed allocation of some kind.

So, it seems my options for continuing today's installation
activities are to try 2.6.26-ext4-3 but with the above patch re-
verted, or use 2.6.26-ext4-3 as is but disable delayed alloca-
tion via the mount option.

I'm very unsure about the ramifications of either. If I revert the
patch, will I get data corruption from having no solution to the
problem it attempts to fix? If I disable delayed allocation, will
my FS lose some magic I can't get back at a later date when
I re-enable it? Or is delayed allocation purely a "runtime
feature" with no effects on the on-disk format?


Greetings,
Peter

2008-07-27 13:40:41

by Peter Meier

Subject: Re: Trying out 2.6.26-ext4-3

Hi,

since my time is running out, I have now resorted to performing
my install on the file system mounted with -o nodelalloc, and I
am going to mount it this way in the running system, too, until I
hear something about this issue being sorted.

FWIW, I could reproduce the problem reliably three times on
newly created file systems when mounted with delayed alloca-
tion enabled, and it went away when I started using nodelalloc,
so it does seem to be connected with this. Unfortunately I did
not have the time to do runs with and without Ming's credit patch,
however. But reading the code without truly understanding it,
the non-da codepath always calls the function to figure out the credits
to reserve with a single-page argument, while the da
case potentially hands it a larger number, and the resulting
multiple then exceeds the maximum transaction size.
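
A toy model of that suspicion, as a minimal sketch: this is not the kernel's
code, and CREDITS_PER_PAGE and the page counts are made-up values chosen only
so the result lands in the range of the x seen in the messages.

```python
# Toy model of the suspected failure mode. This is NOT the kernel's code;
# CREDITS_PER_PAGE and the page counts are made-up illustrative values.

MAX_TRANSACTION_BLOCKS = 8192   # the "y" from the JBD message
CREDITS_PER_PAGE = 25           # hypothetical worst-case credits for one page


def credits_wanted(nr_pages, credits_per_page=CREDITS_PER_PAGE):
    """Credits requested if the per-page estimate is simply multiplied."""
    return nr_pages * credits_per_page


def fits_in_transaction(nr_pages):
    """True if the request stays within the maximum transaction size."""
    return credits_wanted(nr_pages) <= MAX_TRANSACTION_BLOCKS


# Non-delalloc path: always one page per call, so the request stays small.
print(credits_wanted(1), fits_in_transaction(1))
# Delalloc path: a large batch of pages multiplies the estimate.
print(credits_wanted(1024), fits_in_transaction(1024))
```

With these invented numbers a single page fits easily, while a batch of 1024
pages asks for 25600 credits against a limit of 8192, which is the shape of
the "wants too many credits (x > y)" message.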

As a closing note, it would be really cool if someone could put
my mind at ease about being able to enable delayed allocation
on the existing file system later after this is sorted, and about not
missing out anything by not having it from the start. As I under-
stand it right now, the only implications of not running with delalloc
are a little more fragmentation because the lack of the batching
effect from delalloc results in missed chances to get contiguous
blocks, and the FS will be slower without it, but otherwise, I'm not
missing anything and can enable delalloc later, right? (Note: A
nice answer might be eligible for the wiki, too - I'd gladly do the
work to add it).


Greetings,
Peter

2008-07-27 20:23:11

by Theodore Ts'o

Subject: Re: Trying out 2.6.26-ext4-3

On Sun, Jul 27, 2008 at 10:28:49AM +0000, Peter Meier wrote:
> Hi,
>
> well, the timely release of 2.6.26-ext4-3 seemed like a good
> omen, and so I decided to try and put my new home box in-
> stall on ext4 today, using that latest patchset and e2fsprogs
> 1.41.0. So far I'm not getting very far.
>
> I'm attempting to install Gentoo Linux, which involves copying
> an initial tarball to the root file system and unpacking it. Already
> while downloading the tarball into the FS my console and dmesg
> were getting spammed with messages of the type:
>
> JBD: pdflush wants too many credits (x > y)

Serves me right for not testing things after reordering the patches.
I just pulled Mingming's journal credits patch for now until we can
figure out what's wrong. I've just released 2.6.26-git6-ext4-2 and
2.6.26-ext4-4 with the patch pulled. I'll do a bit more testing
before I send out a full announcement.

- Ted

2008-07-28 15:11:10

by Peter Meier

Subject: Re: Trying out 2.6.26-ext4-3

Well, my installation went fine using "nodelalloc" with ext4-3,
but then of course I killed the file system when I forgot to add
rootflags=nodelalloc to the kernel's boot command line in the
grub config, so it got mounted with delalloc and proceeded
to eat itself.
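
One way to guard against that kind of slip is to check the kernel command
line for the option before rebooting. The following is a minimal sketch; the
helper names and the sample command lines (including the device name) are
made up for illustration, and only the rootflags= parameter itself is taken
from the text above.

```python
# Sketch: check whether a kernel command line passes nodelalloc for the root FS.
# The helper names and the sample command lines are made up for illustration.

def root_mount_flags(cmdline):
    """Return the options given via rootflags= (empty list if absent).

    If rootflags= appears more than once, this sketch keeps the last one;
    that choice is an assumption of the sketch.
    """
    flags = []
    for token in cmdline.split():
        if token.startswith("rootflags="):
            value = token[len("rootflags="):]
            flags = [opt for opt in value.split(",") if opt]
    return flags


def has_nodelalloc(cmdline):
    """True if nodelalloc is among the root file system mount options."""
    return "nodelalloc" in root_mount_flags(cmdline)


good = "root=/dev/sda3 ro rootflags=nodelalloc"   # hypothetical boot line
bad = "root=/dev/sda3 ro"                         # option forgotten
print(has_nodelalloc(good), has_nodelalloc(bad))
```

On a running system the same check could be fed the contents of
/proc/cmdline instead of the sample strings.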

I'm stubborn, so I'm going to try again. I'm unsure whether to
try with 2.6.26-ext4-4 and delalloc turned on, however, since
ming's patch was supposed to fix crashes, and I'm not really
looking forward to crashes during installation. So my safest
bet still seems to be to go with nodelalloc.

However, I still don't know whether I can turn delalloc back on
later on the existing FS without missing out on anything from
not having it turned on from the beginning, as per the ques-
tion in my last mail. If anyone could answer that, that'd be
much appreciated.


Greetings,
Peter

2008-07-28 17:09:45

by Peter Meier

Subject: Re: Trying out 2.6.26-ext4-3

Hi Ted,

thanks for putting my mind at ease on both accounts.

I'm about to restart installation again, and this time I'll try with ext4-4
and default mount options, i.e. delalloc turned on.

Installing Gentoo is compile-intensive, and on my dual-core box I'd
normally go with -j3. Since you say that -j4 kernel compiles work out
for you, I'm going to try with -j3 this time as well.

Fingers crossed. I'll get this install on ext4, even if it takes forever ;-).


Greetings,
Peter

2008-07-28 17:29:40

by Theodore Ts'o

Subject: Re: Trying out 2.6.26-ext4-3

On Mon, Jul 28, 2008 at 03:11:07PM +0000, Peter Meier wrote:
> Well, my installation went fine using "nodelalloc" with ext4-3,
> but then of course I killed the file system when I forgot to add
> rootflags=nodelalloc to the kernel's boot command line in the
> grub config, so it got mounted with delalloc and proceeded
> to eat itself.
>
> I'm stubborn, so I'm going to try again. I'm unsure whether to
> try with 2.6.26-ext4-4 and delalloc turned on, however, since
> ming's patch was supposed to fix crashes, and I'm not really
> looking forward to crashes during installation. So my safest
> bet still seems to be to go with nodelalloc.

I'm using 2.6.26-ext4-4 with delalloc, and it works fine for me, even
with make -j4 kernel compiles. Mingming's patches will fix crashes
that show up under pretty extreme benchmark loads; I've yet to have it
happen during day-to-day usage on my laptop.

> However, I still don't know whether I can turn delalloc back on
> later on the existing FS without missing out on anything from
> not having it turned on from the beginning, as per the ques-
> tion in my last mail. If anyone could answer that, that'd be
> much appreciated.

Well, delayed allocation can allow the filesystem to allocate blocks
in a more optimal pattern. So yes, you can in theory miss out on some
improvements by not having it enabled from the very beginning. Given
that most installers are not multithreaded, but install packages one
at a time, I doubt the difference will be significant. However,
post-installation, if you run without delayed allocation and you have
multiple threads or processes writing into the same directory at the
same time, delayed allocation could make a much bigger difference.
For things like build directories, that's probably much less
important, though, since you can always just do a "make clean" and
then rebuild the object files.

So the bottom line is yes, it will make a difference, but it's
probably very minor. I will say though that I am currently typing
this message using 2.6.26-ext4-4, and it's working just fine for me,
without any problems. I've never seen any of the crashes which some
of our testers who have been using the system much more aggressively
under benchmarking have reported.

Hopefully we will have a fix for the journal credits patch
fairly soon, though. Probably just another few days...

Regards,

- Ted

2008-07-28 17:30:52

by Peter Meier

Subject: Re: Trying out 2.6.26-ext4-3

Hi Gary,

thanks for taking the time to write a comprehensive and informative reply!

I'm about to give 2.6.26-ext4-4 a try, which removes the journal credits
patch, and am going to leave delalloc enabled given Theo's confidence
in its stability under non-synthetic usage conditions. If I'm unlucky with that
as well, I'll finally try with nodelalloc once more, which went fine until I
forgot to set rootflags and the credits patch made it eat itself.


Greetings,
Peter

2008-08-01 09:03:54

by Peter Meier

Subject: Re: Trying out 2.6.26-ext4-3

Hello,

I'm happy to report that I'm writing this from a system installed
on ext4 w/ 2.6.26-ext4-4 and default mount options: Everything
went well once the journal credits patch was out of the mix. My
heaviest load so far was multiple parallel -j3 builds of C++ applications,
and I apparently didn't have any crashes, either.


Greetings,
Peter