2009-09-29 16:47:17

by Eric Sandeen

Subject: Where all does preallocated/extra space hide?

I was running some of the xfstests enospc tests on ext4, and they were
failing; in one case, manymanymany small files are made to fill up a
100M filesystem. ext4 stops quite early with -ENOSPC, but after a bit
(or after a "sync") we get 40MB free again. So 40% of the fs space is
hidden somewhere in preallocation...

I tried calling out to discard group prealloc but that's only a few
blocks. I'll go trace through the sync paths to see what all gets
released, but if anyone knows offhand where the rest of that space is
hiding, please give me a shout. :)

Thanks,
-Eric
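
The "40MB free again after a sync" observation can be measured by comparing the available space before and after the sync. A minimal sketch, assuming GNU coreutils `stat` and a test filesystem mounted at /mnt/test; the helper names `avail_bytes` and `reclaimed_bytes` are illustrative and not part of xfstests:

```shell
#!/bin/sh

# Print the bytes available to unprivileged users on the filesystem
# containing $1, as reported by GNU stat (%a = available blocks,
# %S = fundamental block size used for those counts).
avail_bytes() {
    out=$(stat -f -c '%a %S' "$1") || return 1
    set -- $out
    echo $(( $1 * $2 ))
}

# Print how many bytes became available between two readings:
# $1 is the earlier reading, $2 the later one.
reclaimed_bytes() {
    echo $(( $2 - $1 ))
}

# Hypothetical usage, right after a fill test has hit ENOSPC:
#   before=$(avail_bytes /mnt/test)
#   sync
#   after=$(avail_bytes /mnt/test)
#   echo "released by sync: $(reclaimed_bytes "$before" "$after") bytes"
```

A large positive difference means space was only reserved, not actually consumed, at the moment the write failed.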


2009-10-01 16:55:47

by Eric Sandeen

Subject: Re: Where all does preallocated/extra space hide?

Eric Sandeen wrote:
> I was running some of the xfstests enospc tests on ext4, and they were
> failing; in one case, manymanymany small files are made to fill up a
> 100M filesystem. ext4 stops quite early with -ENOSPC, but after a bit,
> (or after a "sync") we get 40MB free again. So 40% of the fs space is
> hidden somewhere in preallocation...
>
> I tried calling out to discard group prealloc but that's only a few
> blocks. I'll go trace through the sync paths to see what all gets
> released, but if anyone knows offhand where the rest of that space is
> hiding, please give me a shout. :)
>
> Thanks,
> -Eric

Possibly related; on a 1G filesystem, doing this:

#!/bin/bash

xfs_io -F -f -d -c 'pwrite -b 64k 0 512m' /mnt/test/io_test
rm /mnt/test/io_test

in a "while true" loop spews ENOSPC. (it's a direct IO 512m write in
64k chunks). Buffered IO seems fine ...

-Eric
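
The "while true" loop around that script could be written as a bounded loop that stops at the first ENOSPC and reports which iteration hit it. This is only a sketch: `first_enospc` and `dio_write_and_remove` are hypothetical helper names, and the check assumes the failing command prints the standard C-locale message "No space left on device" rather than relying on its exit status.

```shell
#!/bin/sh

# Run the command given after $1 up to $1 times. Print the number of the
# first iteration whose output mentions ENOSPC, or 0 if none did.
first_enospc() {
    max=$1
    shift
    i=1
    while [ "$i" -le "$max" ]; do
        # Merge stderr into stdout so the error text is seen by grep.
        if "$@" 2>&1 | grep -q 'No space left on device'; then
            echo "$i"
            return 0
        fi
        i=$(( i + 1 ))
    done
    echo 0
}

# One iteration of the reproducer quoted above: a 512m direct-IO write
# in 64k chunks, followed by removal of the file.
dio_write_and_remove() {
    xfs_io -F -f -d -c 'pwrite -b 64k 0 512m' /mnt/test/io_test
    rm /mnt/test/io_test
}

# Hypothetical usage:
#   first_enospc 100 dio_write_and_remove
```

Bounding the loop keeps the reproducer from running forever on a kernel where the problem does not show up.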

2009-10-08 11:49:34

by Aneesh Kumar K.V

Subject: Re: Where all does preallocated/extra space hide?

On Tue, Sep 29, 2009 at 11:47:19AM -0500, Eric Sandeen wrote:
> I was running some of the xfstests enospc tests on ext4, and they were
> failing; in one case, manymanymany small files are made to fill up a
> 100M filesystem. ext4 stops quite early with -ENOSPC, but after a bit,
> (or after a "sync") we get 40MB free again. So 40% of the fs space is
> hidden somewhere in preallocation...
>
> I tried calling out to discard group prealloc but that's only a few
> blocks. I'll go trace through the sync paths to see what all gets
> released, but if anyone knows offhand where the rest of that space is
> hiding, please give me a shout. :)
>


Preallocation space is discarded by default if we fail a block allocation;
ext4_mb_discard_preallocations does that. What might be holding the space
is the extra metadata blocks that we reserve to make sure we will be able
to properly insert the new extent on block allocation. I guess we should
force a data allocation when we fail with ENOSPC in ext4_da_writepages.
We currently force a journal commit so that we claim back the blocks
from deleted files. But we can also force block allocation for delayed
allocated inodes so that we free some of the extra metadata we reserved.

-aneesh

2009-10-08 15:57:02

by Eric Sandeen

Subject: Re: Where all does preallocated/extra space hide?

Aneesh Kumar K.V wrote:
> On Tue, Sep 29, 2009 at 11:47:19AM -0500, Eric Sandeen wrote:
>> I was running some of the xfstests enospc tests on ext4, and they were
>> failing; in one case, manymanymany small files are made to fill up a
>> 100M filesystem. ext4 stops quite early with -ENOSPC, but after a bit,
>> (or after a "sync") we get 40MB free again. So 40% of the fs space is
>> hidden somewhere in preallocation...
>>
>> I tried calling out to discard group prealloc but that's only a few
>> blocks. I'll go trace through the sync paths to see what all gets
>> released, but if anyone knows offhand where the rest of that space is
>> hiding, please give me a shout. :)
>>
>
>
> preallocation space is discarded by default if we fail a block allocation
> ext4_mb_discard_preallocations does that. What might be happening is the
> extra meta data blocks that we reserve for making sure we will be able
> to properly insert the new extent on block allocation. I guess we should
> force a data allocation when we fail with ENOSPC in ext4_da_writepages
> We currently force a journal commit so that the we claim back the blocks
> from deleted files. But we can also force block allocation for delayed
> allocated inodes so that we free some of the extra meta data we reserved
>
> -aneesh

Yep, I should have followed up; I narrowed it down to just that: the
worst-case metadata blocks, 2 metadata blocks for a 20-byte write into
an empty file. :)

I'm working on an inode walker to push out delalloc files on enospc.

Thanks,

-Eric
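
To see how quickly that worst case adds up, here is a rough calculation. It is a simplified model, not the kernel's accounting: it assumes every small file reserves one data block plus the two worst-case metadata blocks mentioned above, and it ignores inode tables, the journal and other overhead. The function names are illustrative.

```shell
#!/bin/sh

# Blocks reserved while $1 files are still delayed-allocated, each with
# $2 data blocks and a worst-case reservation of $3 metadata blocks.
reserved_blocks() {
    echo $(( $1 * ($2 + $3) ))
}

# Percentage (rounded down) of a per-file reservation that is metadata,
# given $1 data blocks and $2 reserved metadata blocks per file.
metadata_percent() {
    echo $(( 100 * $2 / ($1 + $2) ))
}

# Example: 1000 tiny files, 1 data block and 2 metadata blocks each.
#   reserved_blocks 1000 1 2
#   metadata_percent 1 2
```

Under these assumptions about two thirds of what such files reserve is metadata, and that part is released again once the blocks are really allocated and the worst case turns out not to be needed.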

2009-10-09 05:27:41

by Aneesh Kumar K.V

Subject: Re: Where all does preallocated/extra space hide?

On Thu, Oct 08, 2009 at 10:56:25AM -0500, Eric Sandeen wrote:
> Aneesh Kumar K.V wrote:
> > On Tue, Sep 29, 2009 at 11:47:19AM -0500, Eric Sandeen wrote:
> >> I was running some of the xfstests enospc tests on ext4, and they were
> >> failing; in one case, manymanymany small files are made to fill up a
> >> 100M filesystem. ext4 stops quite early with -ENOSPC, but after a bit,
> >> (or after a "sync") we get 40MB free again. So 40% of the fs space is
> >> hidden somewhere in preallocation...
> >>
> >> I tried calling out to discard group prealloc but that's only a few
> >> blocks. I'll go trace through the sync paths to see what all gets
> >> released, but if anyone knows offhand where the rest of that space is
> >> hiding, please give me a shout. :)
> >>
> >
> >
> > preallocation space is discarded by default if we fail a block allocation
> > ext4_mb_discard_preallocations does that. What might be happening is the
> > extra meta data blocks that we reserve for making sure we will be able
> > to properly insert the new extent on block allocation. I guess we should
> > force a data allocation when we fail with ENOSPC in ext4_da_writepages
> > We currently force a journal commit so that the we claim back the blocks
> > from deleted files. But we can also force block allocation for delayed
> > allocated inodes so that we free some of the extra meta data we reserved
> >
> > -aneesh
>
> Yep, I should have followed up, I narrowed it down to just that - the
> worst-case metadata blocks - 2 metadata blocks for a 20-byte write into
> an empty file. :)
>
> I'm working on an inode walker to push out delalloc files on enospc.


Should we do an inode walker? I guess we should be doing something
similar to balance_dirty_pages. That will kick in the flusher threads,
which in turn will force the block allocation of dirty inodes.

-aneesh

2009-10-09 14:52:09

by Eric Sandeen

Subject: Re: Where all does preallocated/extra space hide?

Aneesh Kumar K.V wrote:
> On Thu, Oct 08, 2009 at 10:56:25AM -0500, Eric Sandeen wrote:
>> Aneesh Kumar K.V wrote:
>>> On Tue, Sep 29, 2009 at 11:47:19AM -0500, Eric Sandeen wrote:
>>>> I was running some of the xfstests enospc tests on ext4, and they were
>>>> failing; in one case, manymanymany small files are made to fill up a
>>>> 100M filesystem. ext4 stops quite early with -ENOSPC, but after a bit,
>>>> (or after a "sync") we get 40MB free again. So 40% of the fs space is
>>>> hidden somewhere in preallocation...
>>>>
>>>> I tried calling out to discard group prealloc but that's only a few
>>>> blocks. I'll go trace through the sync paths to see what all gets
>>>> released, but if anyone knows offhand where the rest of that space is
>>>> hiding, please give me a shout. :)
>>>>
>>>
>>> preallocation space is discarded by default if we fail a block allocation
>>> ext4_mb_discard_preallocations does that. What might be happening is the
>>> extra meta data blocks that we reserve for making sure we will be able
>>> to properly insert the new extent on block allocation. I guess we should
>>> force a data allocation when we fail with ENOSPC in ext4_da_writepages
>>> We currently force a journal commit so that the we claim back the blocks
>>> from deleted files. But we can also force block allocation for delayed
>>> allocated inodes so that we free some of the extra meta data we reserved
>>>
>>> -aneesh
>> Yep, I should have followed up, I narrowed it down to just that - the
>> worst-case metadata blocks - 2 metadata blocks for a 20-byte write into
>> an empty file. :)
>>
>> I'm working on an inode walker to push out delalloc files on enospc.
>
>
> Should we do an inode walker ? I guess we should be doing something
> similar to balance_dirty_pages. That will kick in the flusher threads
> which inturn will force the block allocation of dirty inodes.
>
> -aneesh

It'll need to be synchronous to avoid a spurious ENOSPC, I think ...
plus, we only want to flush delalloc inodes; flushing everything would
be needlessly expensive. Need to think about the right way to
do this.

-Eric