2010-11-01 15:58:58

by Felipe Contreras

Subject: Re: Horrible btrfs performance due to fragmentation

On Mon, Nov 1, 2010 at 12:47 AM, Hugo Mills <[email protected]> wrote:
> On Mon, Nov 01, 2010 at 12:36:58AM +0200, Felipe Contreras wrote:
>> On Mon, Nov 1, 2010 at 12:25 AM, cwillu <[email protected]> wrote:
>> > btrfs fi defrag isn't recursive.  "btrfs filesystem defrag /home" will
>> > defragment the space used to store the folder, without touching the
>> > space used to store files in that folder.
>>
>> Yes, that came up on IRC, but:
>>
>> 1) It doesn't make sense: doesn't "btrfs filesystem" take a filesystem
>> as its argument? Why would anyone want it to be _non_ recursive?
>
>   You missed the subsequent discussion on IRC about the interaction
> of COW with defrag. Essentially, if you've got two files that are COW
> copies of each other, and one has had something written to it since,
> it's *impossible* for both files to be defragmented, without making a
> full copy of both:
>
> Start with a file (A, etc are data blocks on the disk):
>
> file1 = ABCDEF
>
> Cow copy it:
>
> file1 = ABCDEF
> file2 = ABCDEF
>
> Now write to one of them:
>
> file1 = ABCDEF
> file2 = ABCDxF
>
>   So, either file1 is contiguous, and file2 is fragmented (with the
> block x somewhere else on disk), or file2 is contiguous, and file1 is
> fragmented (with E somewhere else on disk). In fact, we've determined
> by experiment that when you defrag a file that's sharing blocks with
> another one, the file gets copied in its entirety, thus separating the
> blocks of the file and its COW duplicate.

Ok, but the fragmentation would not be an issue in this case.
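
Just to spell out the constraint you are describing, here is a toy
model in plain Python (purely an illustration of the block sharing,
nothing to do with the actual btrfs code):

disk = list("ABCDEF")           # blocks 0..5 hold A..F
file1 = [0, 1, 2, 3, 4, 5]      # file1 = ABCDEF
file2 = list(file1)             # COW copy: same blocks, nothing copied

# Writing 'x' to the fifth block of file2 allocates a new block instead
# of overwriting block 4, which file1 still references.
disk.append("x")                # new block 6
file2[4] = 6                    # file1 = ABCDEF, file2 = ABCDxF

def contiguous(f):
    return all(b2 == b1 + 1 for b1, b2 in zip(f, f[1:]))

print(contiguous(file1))        # True:  blocks 0,1,2,3,4,5
print(contiguous(file2))        # False: blocks 0,1,2,3,6,5

# Any relocation that makes file2 contiguous has to move blocks that
# file1 still shares, so either file1 ends up fragmented instead or the
# shared blocks get duplicated -- which matches the full-copy behaviour
# you observed when defragging a file with shared extents.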

>> 2) The filesystem should not degrade performance so horribly no matter
>> how long it has been used. Even git has automatic garbage
>> collection.
>
>   Since, I believe, btrfs uses COW very heavily internally for
> ensuring consistency, you can end up with fragmented files and
> directories very easily. You probably need some kind of scrubber that
> goes looking for non-COW files that are fragmented, and defrags them
> in the background.

Or, when going through all the fragments of a file, keep a counter, and
if it exceeds a certain limit mark the file somehow, so that it gets
defragmented at least to a certain extent.
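
Something like that could even be prototyped from user space. A very
rough sketch (the threshold is arbitrary, and filefrag plus btrfs-progs
are only stand-ins for what a kernel-side scrubber would do):

import os
import re
import subprocess
import sys

THRESHOLD = 50   # arbitrary: defragment files with more extents than this

def extent_count(path):
    # filefrag (from e2fsprogs) prints e.g. "foo: 123 extents found"
    out = subprocess.check_output(["filefrag", path]).decode()
    m = re.search(r"(\d+) extents? found", out)
    return int(m.group(1)) if m else 0

def scrub(root):
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            n = extent_count(path)
            if n > THRESHOLD:
                print("%s: %d extents, defragmenting" % (path, n))
                subprocess.call(["btrfs", "filesystem", "defrag", path])

if __name__ == "__main__":
    scrub(sys.argv[1] if len(sys.argv) > 1 else ".")

It also happens to address the recursion complaint above, since it
walks the whole tree itself.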

--
Felipe Contreras


2010-11-01 16:09:47

by Gregory Maxwell

Subject: Re: Horrible btrfs performance due to fragmentation

On Mon, Nov 1, 2010 at 11:58 AM, Felipe Contreras
<[email protected]> wrote:
> Or, when going through all the fragments of a file, keep a counter, and
> if it exceeds a certain limit mark the file somehow, so that it gets
> defragmented at least to a certain extent.

That's elegant: then resources are only spent on defragmenting files
which are actually in use and which actually need it... but I don't
see how it deals with the partial mutual exclusivity of defragmenting
and COWed files.
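
For what it's worth, the situation can at least be detected from user
space with the FIEMAP ioctl. A rough sketch (it assumes the filesystem
sets FIEMAP_EXTENT_SHARED on shared extents, which btrfs may well not
do yet, and detecting sharing still doesn't tell you how to defragment
both copies):

import fcntl
import struct
import sys

FS_IOC_FIEMAP = 0xC020660B          # _IOWR('f', 11, struct fiemap)
FIEMAP_FLAG_SYNC = 0x00000001
FIEMAP_EXTENT_SHARED = 0x00002000

HDR = struct.Struct("=QQIIII")      # fm_start, fm_length, fm_flags,
                                    # fm_mapped_extents, fm_extent_count,
                                    # fm_reserved
EXT = struct.Struct("=QQQQQIIII")   # fe_logical, fe_physical, fe_length,
                                    # fe_reserved64[2], fe_flags,
                                    # fe_reserved[3]

def extents(path, max_extents=512):
    # Ask for up to max_extents extents over the whole file; files with
    # more extents than that are only partially reported in this sketch.
    buf = bytearray(HDR.pack(0, 2**64 - 1, FIEMAP_FLAG_SYNC,
                             0, max_extents, 0))
    buf += bytearray(EXT.size * max_extents)
    with open(path, "rb") as f:
        fcntl.ioctl(f.fileno(), FS_IOC_FIEMAP, buf, True)
    mapped = HDR.unpack_from(buf)[3]
    return [EXT.unpack_from(buf, HDR.size + i * EXT.size)
            for i in range(mapped)]

for path in sys.argv[1:]:
    ext = extents(path)
    shared = sum(1 for e in ext if e[5] & FIEMAP_EXTENT_SHARED)
    print("%s: %d extents, %d flagged shared" % (path, len(ext), shared))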