2008-11-29 13:18:40

by Andres Freund

[permalink] [raw]
Subject: EXT4 ENOSPC Bug

Hi Ted, Hi Andreas, hi all,

On a testsystem (spare laptop) with ext4 as root filesystem sometimes the
system starts to return ENOSPC to all write/create syscalls.
Sometimes the system runs without problems, at other times it starts having
problems soon after boot.
A reboot resolves the problem temporarily.

I don't see a specific usage triggering the problem.

Deleting some files sometimes allows the creation (just touch $unused_filename)
of some files, but not many.

Anything I can do to help you to debug the problem?

Andres

Kernel is: 2.6.28-rc6-andres-00007-ged31348

# df /
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/root_crypt
302855628 110032848 192822780 37% /

# df -i /
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/mapper/root_crypt
19234816 786691 18448125 5% /


# tune2fs -l /dev/mapper/root_crypt
tune2fs 1.41.3 (12-Oct-2008)
Filesystem volume name: root
Last mounted on: <not available>
Filesystem UUID: b81f5bab-9544-45a9-8f15-aa7012ae2522
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype
needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg
dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: (none)
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 19234816
Block count: 76921099
Reserved block count: 0
Free blocks: 48205695
Free inodes: 18448125
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 1005
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
Flex block group size: 16
Filesystem created: Tue Sep 30 11:57:33 2008
Last mount time: Thu Nov 27 14:30:58 2008
Last write time: Thu Nov 27 14:30:58 2008
Mount count: 116
Maximum mount count: 200
Last checked: Mon Oct 6 13:54:09 2008
Check interval: 31104000 (12 months)
Next check after: Thu Oct 1 13:54:09 2009
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
First orphan inode: 288694
Default directory hash: tea
Directory Hash Seed: d0501622-2c99-4b0d-9a0c-d6b6705ff4f2
Journal backup: inode blocks


Attachments:
(No filename) (2.65 kB)
signature.asc (197.00 B)
This is a digitally signed message part.
Download all attachments
Subject: Re: EXT4 ENOSPC Bug

Andres Freund wrote:
> Hi Ted, Hi Andreas, hi all,
>
> On a testsystem (spare laptop) with ext4 as root filesystem sometimes the
> system starts to return ENOSPC to all write/create syscalls.
> Sometimes the system runs without problems, at other times it starts
having
> problems soon after boot.
> A reboot resolves the problem temporarily.
I have had the same problem.
>
> I don't see a specific usage triggering the problem.
Using Vuze(formerly azureus) seemed to trigger the problem. My system
can run fine for days before it happens.
>
> Deleting some files sometimes allows the creation (just touch
$unused_filename)
> of some files, but not many.
>
> Anything I can do to help you to debug the problem?
>
> Andres
I do not have much more info to give you, this is mostly a "me too".

Thanks,
Jonathan

2008-11-29 21:15:46

by Theodore Ts'o

[permalink] [raw]
Subject: Re: EXT4 ENOSPC Bug

On Sat, Nov 29, 2008 at 02:18:19PM +0100, Andres Freund wrote:
> Hi Ted, Hi Andreas, hi all,
>
> On a testsystem (spare laptop) with ext4 as root filesystem sometimes the
> system starts to return ENOSPC to all write/create syscalls.
> Sometimes the system runs without problems, at other times it starts having
> problems soon after boot.

Hmm.... Is there anything interesting log messages relating to ext4 or
jbd2 showing up in /var/log/messages or in dmesg?

- Ted

2008-11-29 21:31:47

by Andres Freund

[permalink] [raw]
Subject: Re: EXT4 ENOSPC Bug

On Saturday 29 November 2008 22:15:33 Theodore Tso wrote:
> On Sat, Nov 29, 2008 at 02:18:19PM +0100, Andres Freund wrote:
> > On a testsystem (spare laptop) with ext4 as root filesystem sometimes the
> > system starts to return ENOSPC to all write/create syscalls.
> > Sometimes the system runs without problems, at other times it starts
> > having problems soon after boot.
> Hmm.... Is there anything interesting log messages relating to ext4 or
> jbd2 showing up in /var/log/messages or in dmesg?
Nothing else than telling that barriers are not used (I am using dm-crypt...),
but that shouldnt be related.

Andres


Attachments:
(No filename) (619.00 B)
signature.asc (197.00 B)
This is a digitally signed message part.
Download all attachments

2008-12-01 12:35:43

by Andres Freund

[permalink] [raw]
Subject: Re: EXT4 ENOSPC Bug

Hi Ted, hi all,

Are there any additional informations you could use?
The filesystem is a bit big for you to download unfortunately (and the testdata
it contains a bit to sensitive).

Andres


Attachments:
(No filename) (192.00 B)
signature.asc (197.00 B)
This is a digitally signed message part.
Download all attachments

2008-12-01 19:42:42

by Andreas Dilger

[permalink] [raw]
Subject: Re: EXT4 ENOSPC Bug

On Dec 01, 2008 13:34 +0100, Andres Freund wrote:
> Are there any additional informations you could use?
> The filesystem is a bit big for you to download unfortunately (and the
> testdata it contains a bit to sensitive).

Som information that would be useful:
- did you run "e2fsck -f" to see if there were any errors in the filesystem?
- do you run any specific applications that seem to trigger the problem
(e.g. Vuze (formerly azureus) as was reported by another user)
- do the applications writing to this file have any unusual IO pattern
(e.g. mmap IO, lots of write+truncate+write on the same file, etc)

We discussed the creation of a debugging patch to help diagnose this
problem. It looks like you have already compiled your own kernel, so
I assume it would be possible for you to run with an additional patch?

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2008-12-01 20:16:27

by Andres Freund

[permalink] [raw]
Subject: Re: EXT4 ENOSPC Bug

Hi Andreas, Hi all,

On Monday 01 December 2008 20:42:04 Andreas Dilger wrote:
> On Dec 01, 2008 13:34 +0100, Andres Freund wrote:
> > Are there any additional informations you could use?
> > The filesystem is a bit big for you to download unfortunately (and the
> > testdata it contains a bit to sensitive).
> - did you run "e2fsck -f" to see if there were any errors in the
> filesystem?
Yes, no errors.

> - do you run any specific applications that seem to trigger the
> problem (e.g. Vuze (formerly azureus) as was reported by another user)
No. The Problem occured sometimes shortly after boot (Once even before X was
up) sometimes only after days of full usage.
Boot starts postgres, thats the only thing that propably always ran. But I
have seen it while postgres was idle - i.e. just some polls and stats.

> - do the applications writing to this file have any unusual IO pattern
> (e.g. mmap IO, lots of write+truncate+write on the same file, etc)
Its not a single file, but the whole filesystem, having problems.

> We discussed the creation of a debugging patch to help diagnose this
> problem. It looks like you have already compiled your own kernel, so
> I assume it would be possible for you to run with an additional patch?
Sure, no problem.
At least if the patch is not expensive during run time. The system produces
test-data for testing our development software and I cant take it down for too
long.
Thats the reason why I can use something like ext4 on such a system at all ;-)

It sometimes takes quite a while to reproduce the problem though. The system
ran stable for 4 days so far.
Same kernel as the last time the error occured though.

Any specific debug options I should enable?

Andres


Attachments:
(No filename) (1.68 kB)
signature.asc (197.00 B)
This is a digitally signed message part.
Download all attachments

2008-12-02 07:57:31

by Theodore Ts'o

[permalink] [raw]
Subject: Re: EXT4 ENOSPC Bug

Hi Andres,

What we suspect may be happening is that somehow, there is something
in your workload which is causing a leak or a double-increment of the
number of blocks reserved so that delayed allocation will succeed.
Basically, when the system writes to a block that hasn't been
previously allocated by filesystme, we reserve a data block so that
when we finally do the late delayed allocation, we know that a free
block will be available. When we do finally determine where the block
will be located on the filesystem, we decrement the reserved block
counter. If somehow the reserved block counter is growing and not
shrinking when it should, that could lead to the problem which you
describe.

So.... could you apply this patch, attached below. You can trigger it
using the attached program, debug-ioctl. If the filesystem is
quiscent, and you've typed sync once or twice, you should get the
following in your printk logs:

[ 2742.603886] ext4 debug delalloc of dm-4
[ 2742.603948] ext4: dirty blocks 0 free blocks 7324359
[ 2742.603960] ext4 debug delalloc done

If you do have some dirty blocks that haen't been flushed out to disk,
it might look liket this:

[ 2758.653682] ext4 debug delalloc of dm-4
[ 2758.653697] ext4: dirty blocks 172 free blocks 7324439
[ 2758.653703] s_dirty list:
[ 2758.653708] ino 401167: 79 2
[ 2758.653713] ino 401200: 2 2
[ 2758.653718] ino 401197: 3 2
....
[ 2758.653828] ext4 debug delalloc done

If our theory is correct, I suspect you will start to see the number
of dirty blocks grown over time, even before you start seeing ENOSPC
errors (which will happen when the number of dirty blocks exceeds the
number of free blocks).

In that case, the list of inodes that have data and metadata blocks
reserved will hopefully tell us soemthing about what might be going
on. Just run the debug-ioctl command giving a filename or directory
within the filesystem where you want to deump out the debugging
information; it will be dumpd out in the dmesg buffer.

- Ted


Attachments:
(No filename) (1.98 kB)
debug-delalloc-patch (2.42 kB)
debug-ioctl.c (863.00 B)
Download all attachments

2008-12-02 14:59:21

by Andres Freund

[permalink] [raw]
Subject: Re: EXT4 ENOSPC Bug

Hi Ted, Hi Andreas,

On Tuesday 02 December 2008 08:57:12 Theodore Tso wrote:
> So.... could you apply this patch, attached below. You can trigger it
> using the attached program, debug-ioctl. If the filesystem is
> quiscent, and you've typed sync once or twice, you should get the
> following in your printk logs:
Ok. The system now runs (without problems) with the patch enabled and I can
get the debug output.
30 Minutes after boot the system still returns to zero dirty blocks and the
free blocks seem to stay in a sensible range.
I will let it run for a while and report back if either something interesting
happens or the problem reappears and I am seeing no significant amount of dirty
blocks.

Andres



Attachments:
(No filename) (717.00 B)
signature.asc (197.00 B)
This is a digitally signed message part.
Download all attachments
Subject: Re: EXT4 ENOSPC Bug

On Mon, 01 Dec 2008, Andreas Dilger wrote:
> - do the applications writing to this file have any unusual IO pattern
> (e.g. mmap IO, lots of write+truncate+write on the same file, etc)

Vuze does a massive ammount of sparse writes. It can also be configured to
pre-allocate space (either by creating an empty non-sparse file, or by
telling the kernel that the file is expected to grow to size foo and to
please pre-allocate that in the filesystem. I don't recall how this is
done).

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh

2008-12-02 16:47:46

by Theodore Ts'o

[permalink] [raw]
Subject: Re: EXT4 ENOSPC Bug

On Tue, Dec 02, 2008 at 03:58:53PM +0100, Andres Freund wrote:
> Ok. The system now runs (without problems) with the patch enabled and I can
> get the debug output.
> 30 Minutes after boot the system still returns to zero dirty blocks and the
> free blocks seem to stay in a sensible range.
> I will let it run for a while and report back if either something interesting
> happens or the problem reappears and I am seeing no significant amount of dirty
> blocks.

You say you are using Postgres, right? Something you might try to see
if it triggers the problem it is creating a new database and then
restoring some database dump/backup into that new database. Some
databases expand into a new table space (or whatever terminology
Postgres uses) by random writes into a sparse portion of the file.
This could be triggering the problem, or at least trigger the problem
more quickly.

The other thing I wanted to ask is whether "df" was showing the 37%
in-use statistic at the time, or was that after you rebooted. And
although I hate to ask it, you're sure this isn't the standard "delete
an in-use file but not get the space back" Unix trap, right?

Thanks, regards,

- Ted

2008-12-02 17:47:51

by Andres Freund

[permalink] [raw]
Subject: Re: EXT4 ENOSPC Bug

Hi,

On Tuesday 02 December 2008 17:47:09 Theodore Tso wrote:
> You say you are using Postgres, right? Something you might try to see
> if it triggers the problem it is creating a new database and then
> restoring some database dump/backup into that new database. Some
> databases expand into a new table space (or whatever terminology
> Postgres uses) by random writes into a sparse portion of the file.
> This could be triggering the problem, or at least trigger the problem
> more quickly.
I tried that - I have seen no problems so far. But it is not the first time I
did not see the problem for some time.

Btw, postgres just creates the database by copying over a default database.

For an easy test with sparse files, I created a big one, set it up as a loop
device, created a filesystem and ran some stuff in it.
No Problem so far.

> The other thing I wanted to ask is whether "df" was showing the 37%
> in-use statistic at the time, or was that after you rebooted.
It definitely was before a reboot. And there were plenty of both, inodes and
blocks.

I think that I have seen the problem on metadata only changes (find /tmp -type
f|xargs touch) as well, but sometimes metadata changes were possible while file
creation was not.

Another Datapoint: File deletion sometimes made it possible to create more
files, but by far not as much as the space freed.

> And although I hate to ask it, you're sure this isn't the standard "delete
> an in-use file but not get the space back" Unix trap, right?
There were over 200GB free, so I doubt that. I don't know what could have
caused an allocation of so much space unnoticed in an idle system multiple
times.
But I do understand the reason for the question ;-)


Andres


Attachments:
(No filename) (1.69 kB)
signature.asc (197.00 B)
This is a digitally signed message part.
Download all attachments

2008-12-02 20:33:31

by Theodore Ts'o

[permalink] [raw]
Subject: Re: EXT4 ENOSPC Bug

On Tue, Dec 02, 2008 at 06:47:24PM +0100, Andres Freund wrote:
>
> I think that I have seen the problem on metadata only changes (find
> /tmp -type f|xargs touch) as well, but sometimes metadata changes
> were possible while file creation was not.

Hmm... really? Was the error message ENOSPC, or something else?

The only way I can see that it might be ENOSPC would be if we got an
error in the jbd2 layer, in start_this_handle(), but that would have
resulted in a kernel printk:

if (nblocks > journal->j_max_transaction_buffers) {
printk(KERN_ERR "JBD: %s wants too many credits (%d > %d)\n",
current->comm, nblocks,
journal->j_max_transaction_buffers);
ret = -ENOSPC;
goto out;
}


If that wasn't too long ago, you might want to search your old system
log files in /var/log for any "JBD" or "ext4" messages. That might be
an important clue, although that seems rather unlikely to me. Or in
the case where metadata changes were failing, perhaps they were doing
so with some other error that ENOSPC?

- Ted

Subject: Re: EXT4 ENOSPC Bug

Theodore Tso wrote:

> Hmm... really? Was the error message ENOSPC, or something else?
>
I can confirm I get a ENOSPC on open/creat. Here is the relevant output
from strace touch test:

open("test", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = -1 ENOSPC (No
space left on device)

And the df -h output (first filesystem in list):

Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg0-root 88G 60G 25G 71% /
tmpfs 1.6G 12K 1.6G 1% /lib/init/rw
udev 10M 112K 9.9M 2% /dev
tmpfs 1.6G 0 1.6G 0% /dev/shm
/dev/sda1 92M 19M 69M 22% /boot
none 1.6G 3.3M 1.6G 1% /tmp

Cheers,
Jonathan

Subject: Re: EXT4 ENOSPC Bug

Jonathan Bastien-Filiatrault wrote:
> Theodore Tso wrote:
>
>> Hmm... really? Was the error message ENOSPC, or something else?
>>
> I can confirm I get a ENOSPC on open/creat. Here is the relevant output
> from strace touch test:
>
(Sorry for replying to myself)

This happened while compiling a kernel with ccache enabled (with a
relatively low hit rate for this specific build). Vuze was also opened,
but with no active download.

Cheers,
Jonathan

2008-12-03 04:37:32

by Theodore Ts'o

[permalink] [raw]
Subject: Re: EXT4 ENOSPC Bug

On Tue, Dec 02, 2008 at 07:40:43PM -0500, Jonathan Bastien-Filiatrault wrote:
> Jonathan Bastien-Filiatrault wrote:
> > Theodore Tso wrote:
> >
> >> Hmm... really? Was the error message ENOSPC, or something else?
> >>
> > I can confirm I get a ENOSPC on open/creat. Here is the relevant output
> > from strace touch test:
> >
> (Sorry for replying to myself)
>
> This happened while compiling a kernel with ccache enabled (with a
> relatively low hit rate for this specific build). Vuze was also opened,
> but with no active download.

Hmm... I build with ccache all the time. I normally don't run a
bittorrent client, though. What kernel version are you using?

Any chance you can apply the patch which I sent earlier in this
thread, trigger the debugging ioctl when it happens again?

Thanks,

- Ted

2008-12-03 15:36:26

by Aneesh Kumar K.V

[permalink] [raw]
Subject: Re: EXT4 ENOSPC Bug

On Tue, Dec 02, 2008 at 07:37:20PM -0500, Jonathan Bastien-Filiatrault wrote:
> Theodore Tso wrote:
>
> > Hmm... really? Was the error message ENOSPC, or something else?
> >
> I can confirm I get a ENOSPC on open/creat. Here is the relevant output
> from strace touch test:
>
> open("test", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = -1 ENOSPC (No
> space left on device)
>

Can you make sure you have the below patch in the kernel.

c001077f4003fa75793bb62979baa6241dd8eb19

commit c001077f4003fa75793bb62979baa6241dd8eb19
Author: Eric Sandeen <[email protected]>
Date: Tue Aug 19 22:19:50 2008 -0400

ext4: Fix bug where we return ENOSPC even though we have plenty of inodes

-aneesh

2008-12-03 17:23:36

by Theodore Ts'o

[permalink] [raw]
Subject: Re: EXT4 ENOSPC Bug

On Wed, Dec 03, 2008 at 09:04:41PM +0530, Aneesh Kumar K.V wrote:
>
> Can you make sure you have the below patch in the kernel.
>
> c001077f4003fa75793bb62979baa6241dd8eb19
>
> commit c001077f4003fa75793bb62979baa6241dd8eb19
> Author: Eric Sandeen <[email protected]>
> Date: Tue Aug 19 22:19:50 2008 -0400
>
> ext4: Fix bug where we return ENOSPC even though we have plenty of inodes
>

Mmmm, good point, thanks. I had been assuming this was caused by some
failure in the delayed allocation code with block accounting, but we
also had a bug fix that was causing a problem with inode allocation.

That doesn't explain a report of an ENOSPC error with metadata only
changes were failing (i.e., touching a file that already exists),
although I don't think we've gotten a lot of information about that
scenario and it feels a little unconfirmed to me still...

So, yes, this patch may solve the issue for you.

- Ted


Attachments:
(No filename) (930.00 B)
0001-ext4-Fix-bug-where-we-return-ENOSPC-even-though-we.patch (1.56 kB)
Download all attachments

2008-12-03 18:18:52

by Andres Freund

[permalink] [raw]
Subject: Re: EXT4 ENOSPC Bug

On Wednesday 03 December 2008 18:23:09 Theodore Tso wrote:
> On Wed, Dec 03, 2008 at 09:04:41PM +0530, Aneesh Kumar K.V wrote:
> > Can you make sure you have the below patch in the kernel.
> > c001077f4003fa75793bb62979baa6241dd8eb19
Yes, I do have that patch in kernel (verified through git and looking at the
source). Its a part of linus' tree.

> Mmmm, good point, thanks. I had been assuming this was caused by some
> failure in the delayed allocation code with block accounting, but we
> also had a bug fix that was causing a problem with inode allocation.
>
> That doesn't explain a report of an ENOSPC error with metadata only
> changes were failing (i.e., touching a file that already exists),
> although I don't think we've gotten a lot of information about that
> scenario and it feels a little unconfirmed to me still...
Unfortunately I am not absolutely sure (I reported this) as well, this is the
reason I did not report it at first but only after no culprit could be found.

Still no luck reproducing it once more.


Andres


Attachments:
(No filename) (1.02 kB)
signature.asc (197.00 B)
This is a digitally signed message part.
Download all attachments

2008-12-10 00:08:23

by Andres Freund

[permalink] [raw]
Subject: Re: EXT4 ENOSPC Bug

Hi Ted,

Very Strange:
I booted (after an crash which I did not see) a somewhat older kernel and I
once more caught the issue. Unfortunately I do not have your patch in this
older kernel.


older kernel: 2.6.28-rc5-andres-00070-g4e14e83
kernel I rean whole last week: 2.6.28-rc6-andres-00007-ged31348
There seem to have been no significant ext4 related changes to the kernel
between those.

for i in `seq 1 1000`;do touch tmp$i;done
touch: cannot touch `tmp139': No space left on device
...

rm tmp*;for i in `seq 1 200`;do dd if=/dev/zero of=tmp$i bs=1024k count=1;done
dd: opening `tmp139': No space left on device
139
..

Various different blocksizes and filesizes all return the same result.

This all hints to the patch Anesh linked. But it is included in 2.6.28-rc5.
Any idea how that could happen?

The kernel install is just a "make;make install; update-initramfs -c -k `make
kernelrelease`; update-grub" run...

A short idea how I could verify on binary level what kernelrelease it really
is?


If a somehow corrupted compile (murked ccache?) is the culprit of this (in
addition to my inaccuracies) I am very sorry for all your work.

I let this kernel run for the next hours if you want to run some tests...

Thanks,

Andres



Attachments:
(No filename) (1.22 kB)
signature.asc (197.00 B)
This is a digitally signed message part.
Download all attachments