2010-12-27 14:47:53

by Filipe David Manana

Subject: question about file space preallocation with fallocate

Hi,

I have been playing around with fallocate to preallocate space for a
file with the mode FALLOC_FL_KEEP_SIZE.
I'm running with Linux kernel 2.6.35-24 and ext4 as the fs.

I'm allocating 1 GB for a newly created file and then, in a loop, I write
1 GB of data into that file in chunks of 1 KB.
fallocate returns 0, so the call succeeded.
However, I don't see any performance gain compared to a version of
the same code that doesn't call fallocate.

The test code which does this is: http://friendpaste.com/2UR0n2U851u4IXmubeLZh0

Am I doing something wrong?

best regards,

--
Filipe David Manana,
[email protected], [email protected]

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."


2010-12-27 17:20:00

by Sunil Mushran

Subject: Re: question about file space preallocation with fallocate

On 12/27/2010 06:47 AM, Filipe David Manana wrote:
> Hi,
>
> I have been playing around with fallocate to preallocate space for a
> file with the mode FALLOC_FL_KEEP_SIZE.
> I'm running with Linux kernel 2.6.35-24 and ext4 as the fs.
>
> I'm allocating 1 GB for a newly created file and then, in a loop, I write
> 1 GB of data into that file in chunks of 1 KB.
> fallocate returns 0, so the call succeeded.
> However, I don't see any performance gain compared to a version of
> the same code that doesn't call fallocate.
>
> The test code which does this is: http://friendpaste.com/2UR0n2U851u4IXmubeLZh0
>
> Am I doing something wrong?

fallocate() gives users the ability to allocate space instantly. One way
to compare would be to time just fallocate() with another program
writing zeros for that length.

But that's not the aim of the syscall. The aim is to allow the fs to
allocate the space in as large chunks as possible to allow for better
read performance.

If you don't do fallocate() and allow writes to allocate in small chunks,
as you are doing, the allocations on disk could be interleaved in the face
of multiple processes doing the same. Fragmented allocations can only hurt
read performance.
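
A minimal sketch of the comparison suggested above: time a bare fallocate() against writing the same number of zero bytes. The function names, the 64 KiB buffer, and the fsync() at the end of the write path are assumptions made for illustration; they are not taken from the linked test code.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Seconds elapsed between two CLOCK_MONOTONIC samples. */
static double elapsed(struct timespec a, struct timespec b)
{
    return (double)(b.tv_sec - a.tv_sec) + (double)(b.tv_nsec - a.tv_nsec) / 1e9;
}

/* Reserve `len` bytes for `path` with a single fallocate() call.
 * Returns the time taken in seconds, or -1.0 on error. */
double time_fallocate(const char *path, off_t len)
{
    struct timespec t0, t1;
    int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0644);
    if (fd < 0)
        return -1.0;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    int rc = fallocate(fd, 0, (off_t)0, len);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    close(fd);
    return rc == 0 ? elapsed(t0, t1) : -1.0;
}

/* Fill `path` with `len` zero bytes using write(), then fsync() so the
 * data actually reaches the disk (assumption: that is the fair baseline).
 * Returns the time taken in seconds, or -1.0 on error. */
double time_write_zeros(const char *path, off_t len)
{
    static const char zeros[1 << 16];   /* 64 KiB, zero-initialized */
    struct timespec t0, t1;
    int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0644);
    if (fd < 0)
        return -1.0;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (off_t done = 0; done < len; ) {
        size_t want = (len - done) < (off_t)sizeof zeros
                          ? (size_t)(len - done) : sizeof zeros;
        ssize_t n = write(fd, zeros, want);
        if (n <= 0) {
            close(fd);
            return -1.0;
        }
        done += n;
    }
    if (fsync(fd) != 0) {
        close(fd);
        return -1.0;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    close(fd);
    return elapsed(t0, t1);
}
```

Calling both with the same length (for example 1 GB) and printing the two results should show the gap between reserving space and actually writing it; the absolute numbers depend on the disk and the filesystem.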

2010-12-27 17:57:23

by Filipe David Manana

Subject: Re: question about file space preallocation with fallocate

On Mon, Dec 27, 2010 at 5:17 PM, Sunil Mushran <[email protected]> wrote:
>
> fallocate() gives users the ability to allocate space instantly. One way
> to compare would be to time just fallocate() with another program
> writing zeros for that length.
>
> But that's not the aim of the syscall. The aim is to allow the fs to
> allocate the space in as large chunks as possible to allow for better
> read performance.
>
> If you don't do fallocate() and allow writes to allocate in small chunks,
> as you are doing, the allocations on disks could be interleaved in face of
> multiple processes doing the same. Fragmented allocations can only hurt
> read performance.
>

Thanks for the clarification, Sunil. But shouldn't preallocating blocks
also improve write operations? Each write would no longer cause the
OS/filesystem to allocate blocks for the file, so it should be
faster.

Also, any particular advice for improving write performance when all
the writes are done in append-only fashion?

--
Filipe David Manana,
[email protected], [email protected]

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."

2010-12-27 19:11:52

by Sunil Mushran

Subject: Re: question about file space preallocation with fallocate

On 12/27/2010 09:57 AM, Filipe David Manana wrote:
> On Mon, Dec 27, 2010 at 5:17 PM, Sunil Mushran <[email protected]> wrote:
>> fallocate() gives users the ability to allocate space instantly. One way
>> to compare would be to time just fallocate() with another program
>> writing zeros for that length.
>>
>> But that's not the aim of the syscall. The aim is to allow the fs to
>> allocate the space in as large chunks as possible to allow for better
>> read performance.
>>
>> If you don't do fallocate() and allow writes to allocate in small chunks,
>> as you are doing, the allocations on disks could be interleaved in face of
>> multiple processes doing the same. Fragmented allocations can only hurt
>> read performance.
>>
> Thanks for the clarification, Sunil. But shouldn't preallocating blocks
> also improve write operations? Each write would no longer cause the
> OS/filesystem to allocate blocks for the file, so it should be
> faster.
>
> Also, any particular advice for improving write performance when all
> the writes are done in append-only fashion?

Even with meta-data journaling, the allocation overhead is tiny
compared to the 1G data write overhead.

Considering you are using ext4, you should benefit from delayed
allocation. But for that you'll need to have enough memory and
be running a 64-bit kernel. That way you won't be limited by the
speed of the disk.

Another option is to submit writes in larger chunks, say 1MB rather
than 1KB.
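
The chunk-size suggestion can be sketched as an append loop whose buffer size is a parameter. This is only an illustration under stated assumptions: `write_in_chunks` is a hypothetical helper, the buffer is filled with a dummy byte, and the return value counts write() calls so the difference in syscall count between 1KB and 1MB chunks is visible.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Append `total` bytes to `path` using buffers of `chunk` bytes.
 * Returns the number of write() calls issued, or -1 on error. */
long write_in_chunks(const char *path, size_t total, size_t chunk)
{
    if (chunk == 0)
        return -1;

    char *buf = malloc(chunk);
    if (buf == NULL)
        return -1;
    memset(buf, 'x', chunk);            /* dummy payload */

    int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY | O_APPEND, 0644);
    if (fd < 0) {
        free(buf);
        return -1;
    }

    long calls = 0;
    size_t done = 0;
    while (done < total) {
        size_t want = (total - done) < chunk ? (total - done) : chunk;
        ssize_t n = write(fd, buf, want);
        if (n <= 0) {                   /* error or no progress */
            calls = -1;
            break;
        }
        done += (size_t)n;              /* a short write just means one more call */
        calls++;
    }

    close(fd);
    free(buf);
    return calls;
}
```

For 1 GB of data, a 1KB chunk size means roughly a million write() calls, while a 1MB chunk size needs about a thousand; wrapping the two calls in a timer is one way to measure what that costs on a given system.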

2011-01-03 17:07:32

by Eric Sandeen

Subject: Re: question about file space preallocation with fallocate

On 12/27/2010 11:17 AM, Sunil Mushran wrote:
> On 12/27/2010 06:47 AM, Filipe David Manana wrote:
>> Hi,
>>
>> I have been playing around with fallocate to preallocate space for a
>> file with the mode FALLOC_FL_KEEP_SIZE.
>> I'm running with Linux kernel 2.6.35-24 and ext4 as the fs.
>>
>> I'm allocating 1 GB for a newly created file and then, in a loop, I write
>> 1 GB of data into that file in chunks of 1 KB.
>> fallocate returns 0, so the call succeeded.
>> However, I don't see any performance gain compared to a version of
>> the same code that doesn't call fallocate.
>>
>> The test code which does this is: http://friendpaste.com/2UR0n2U851u4IXmubeLZh0
>>
>> Am I doing something wrong?
>
> fallocate() gives users the ability to allocate space instantly. One way
> to compare would be to time just fallocate() with another program
> writing zeros for that length.
>
> But that's not the aim of the syscall. The aim is to allow the fs to
> allocate the space in as large chunks as possible to allow for better
> read performance.

Well, all fallocate is really -supposed- to do is guarantee that the
space will be available for a future write.

"After a successful call to posix_fallocate(), subsequent writes to
bytes in the specified range are guaranteed not to fail because of lack
of disk space."

A practical side effect is that it is often more contiguous, but that
is not guaranteed. It -could- return your allocated space in very
fragmented extents.
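
As a small illustration of what the reservation looks like from user space, here is a sketch that calls fallocate() and then reports the file size seen by fstat(). With FALLOC_FL_KEEP_SIZE (the mode used in the original question) the space is reserved but the reported size stays at 0, so later appends still start at offset 0; with mode 0 the size grows to the requested length. The helper name `size_after_fallocate` is hypothetical.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/falloc.h>   /* FALLOC_FL_KEEP_SIZE */
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create `path`, reserve `len` bytes with fallocate() using `mode`,
 * and return the file size reported by fstat(), or -1 on error. */
long long size_after_fallocate(const char *path, off_t len, int mode)
{
    struct stat st;
    long long size = -1;

    int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0644);
    if (fd < 0)
        return -1;

    if (fallocate(fd, mode, (off_t)0, len) == 0 && fstat(fd, &st) == 0)
        size = (long long)st.st_size;

    close(fd);
    return size;
}
```

Comparing `st_blocks` before and after the call would show the reserved space itself, but how many blocks are reported depends on the filesystem, so the sketch only checks the size.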

> If you don't do fallocate() and allow writes to allocate in small chunks,
> as you are doing, the allocations on disks could be interleaved in face of
> multiple processes doing the same. Fragmented allocations can only hurt
> read performance.

As you followed up in later emails, the original test case isn't going
to show much if any difference; a 1G write is so small that it may well
all turn into a single delalloc write anyway. Since ext4 maxes out at
128MB extents, that's still several extents to allocate (at least 8 for
1G), but it's not that much overhead.

A more interesting test might be to do random writes into a large
file, and compare preallocated vs. not-preallocated. Ext4 leaves
physical gaps for logical gaps though, so even that may not show a huge
difference in performance, esp. when you consider that the random writes
will cause "fragmentation" anyway in terms of written- and
unwritten-extents which must be converted ...

-Eric
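
The random-write comparison could be sketched roughly as follows. Everything here is an assumption made for illustration: the helper name `random_write_seconds`, the fixed seed, the aligned offsets, and the use of ftruncate() to create the sparse (not preallocated) variant of the file. It is a starting point for an experiment, not a calibrated benchmark.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Issue `nwrites` writes of `io_size` bytes at random aligned offsets in a
 * file of `file_len` bytes, then fsync(). If `preallocate` is nonzero the
 * file is reserved with fallocate() first; otherwise it is only extended
 * with ftruncate(), which leaves it sparse.
 * Returns elapsed seconds for the writes plus fsync, or -1.0 on error. */
double random_write_seconds(const char *path, off_t file_len, int nwrites,
                            size_t io_size, int preallocate)
{
    struct timespec t0, t1;
    double result = -1.0;

    if (io_size == 0 || file_len < (off_t)io_size)
        return -1.0;

    char *buf = malloc(io_size);
    if (buf == NULL)
        return -1.0;
    memset(buf, 'x', io_size);          /* dummy payload */

    int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0644);
    if (fd < 0) {
        free(buf);
        return -1.0;
    }

    int rc = preallocate ? fallocate(fd, 0, (off_t)0, file_len)
                         : ftruncate(fd, file_len);
    if (rc == 0) {
        /* Number of io_size-sized slots; rand() limits this to RAND_MAX. */
        off_t slots = file_len / (off_t)io_size;
        int i;

        srand(42);                      /* same offsets for both variants */
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (i = 0; i < nwrites; i++) {
            off_t off = (off_t)(rand() % slots) * (off_t)io_size;
            if (pwrite(fd, buf, io_size, off) != (ssize_t)io_size)
                break;
        }
        if (i == nwrites && fsync(fd) == 0) {
            clock_gettime(CLOCK_MONOTONIC, &t1);
            result = (double)(t1.tv_sec - t0.tv_sec)
                   + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
        }
    }

    close(fd);
    free(buf);
    return result;
}
```

Running it once with `preallocate` set to 1 and once with 0 on a file much larger than memory, then inspecting the resulting extents with a tool such as filefrag, would be one way to see whether preallocation changes either the timing or the on-disk layout.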