2008-07-10 17:29:24

by Ric Wheeler

Subject: suspiciously good fsck times?

(Repost to the list - this was mistakenly sent to linux-ext4-owner)

Just to be mean, I have been trying to test the fsck speed of ext4 with
lots of small files. The test I ran uses fs_mark to fill a 1TB Seagate
drive with 45.6 million 20k files (distributed between 256 subdirectories).

Running on ext3, "fsck -f" takes about one hour.

Running on ext4, with uninit_bg, the same fsck is finished in a bit over
5 minutes - more than 10x faster. (Without uninit_bg, the fsck takes
about 10 minutes).

Is this too good to be true? Below is the fsck run itself. This was with
Ted's latest git tree and his 1.41 WIP tools.

ric


[root@localhost Perf]# time /sbin/fsck.ext4 -t -t -f /dev/sdb1
e4fsck 1.41-WIP (07-Jul-2008)
Pass 1: Checking inodes, blocks, and sizes
Pass 1: Memory used: 40632k/69424k (36424k/4209k), time: 204.95/78.22/25.58
Pass 1: I/O read: 11140MB, write: 0MB, rate: 54.35MB/s
Pass 2: Checking directory structure
Pass 2: Memory used: 70184k/61968k (51803k/18382k), time: 76.47/50.27/ 8.77
Pass 2: I/O read: 3023MB, write: 0MB, rate: 39.53MB/s
Pass 3: Checking directory connectivity
Peak memory: Memory used: 70184k/61968k (59256k/10929k), time: 281.72/128.59/34.35
Pass 3A: Memory used: 70184k/61968k (59256k/10929k), time: 0.00/ 0.00/ 0.00
Pass 3A: I/O read: 0MB, write: 0MB, rate: 0.00MB/s
Pass 3: Memory used: 70184k/61968k (51803k/18382k), time: 0.03/ 0.00/ 0.00
Pass 3: I/O read: 1MB, write: 0MB, rate: 37.86MB/s
Pass 4: Checking reference counts
Pass 4: Memory used: 70184k/44968k (27354k/42831k), time: 2.37/ 2.36/ 0.00
Pass 4: I/O read: 0MB, write: 0MB, rate: 0.00MB/s
Pass 5: Checking group summary information
Pass 5: Memory used: 70184k/240k (64619k/5566k), time: 19.40/ 5.52/ 0.29
Pass 5: I/O read: 34MB, write: 0MB, rate: 1.75MB/s
/dev/sdb1: 45600268/61054976 files (0.0% non-contiguous), 232657574/244190000 blocks
Memory used: 70184k/240k (64889k/5296k), time: 303.54/136.48/34.65
I/O read: 14198MB, write: 1MB, rate: 46.77MB/s

real 5m3.993s
user 2m16.477s
sys 0m35.041s
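
As a sanity check on the numbers above, here is a minimal sketch that parses the per-pass lines and recomputes each I/O rate. It assumes the first field after "time:" is elapsed (real) seconds and that the reported rate is megabytes moved divided by that elapsed time; both assumptions are inferred from this output, not taken from the e2fsck source. The names FSCK_OUTPUT and pass_rates are made up for illustration.

```python
import re

# Per-pass lines copied from the fsck -t -t output above (Pass 1 and Pass 2 only).
FSCK_OUTPUT = """\
Pass 1: Memory used: 40632k/69424k (36424k/4209k), time: 204.95/78.22/25.58
Pass 1: I/O read: 11140MB, write: 0MB, rate: 54.35MB/s
Pass 2: Memory used: 70184k/61968k (51803k/18382k), time: 76.47/50.27/ 8.77
Pass 2: I/O read: 3023MB, write: 0MB, rate: 39.53MB/s
"""

# Assumption: the first number after "time:" is elapsed (real) seconds.
TIME_RE = re.compile(r"^(Pass \w+): .*time:\s*([\d.]+)/", re.M)
IO_RE = re.compile(
    r"^(Pass \w+): I/O read: (\d+)MB, write: (\d+)MB, rate: ([\d.]+)MB/s", re.M
)


def pass_rates(text):
    """Return {pass name: (reported MB/s, recomputed MB/s)}."""
    real_seconds = {name: float(sec) for name, sec in TIME_RE.findall(text)}
    rates = {}
    for name, read_mb, write_mb, reported in IO_RE.findall(text):
        # Assumption: rate = (MB read + MB written) / elapsed seconds.
        recomputed = (int(read_mb) + int(write_mb)) / real_seconds[name]
        rates[name] = (float(reported), recomputed)
    return rates


if __name__ == "__main__":
    for name, (reported, recomputed) in pass_rates(FSCK_OUTPUT).items():
        print(f"{name}: reported {reported:.2f} MB/s, recomputed {recomputed:.2f} MB/s")
```

If the two figures agree for each pass, the rate line is simply data volume over wall-clock time, so a lower rate for the same volume on ext3 would point at time lost to seeking.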


2008-07-10 17:30:36

by Ric Wheeler

Subject: Re: suspiciously good fsck times?

Theodore Tso wrote:
> On Thu, Jul 10, 2008 at 08:36:42AM -0400, Ric Wheeler wrote:
>
>> Just to be mean, I have been trying to test the fsck speed of ext4 with
>> lots of small files. The test I ran uses fs_mark to fill a 1TB Seagate
>> drive with 45.6 million 20k files (distributed between 256
>> subdirectories).
>>
>> Running on ext3, "fsck -f" takes about one hour.
>>
>> Running on ext4, with uninit_bg, the same fsck is finished in a bit over
>> 5 minutes - more than 10x faster. (Without uninit_bg, the fsck takes
>> about 10 minutes).
>>
>> Is this too good to be true? Below is the fsck run itself, the tree is
>> Ted's latest git tree and his 1.41 WIP tools,
>>
>
> Wow. My guess is that flex_bg is making the difference. What we
> would want to compare is the I/O read statistics line:
>
>
>> I/O read: 14198MB, write: 1MB, rate: 46.77MB/s
>>
>
> That's pretty good, and indicates we've avoided a *lot* of seeking.
> The e2fsck -t -t output for ext3 should show roughly the same amount of
> I/O read (with 20k files, there would be no advantage towards using
> extents), but the I/O rate is probably *much* lower, indicating a lot
> more seeking is going on.
>
We did run fsck through seekwatcher & saw a significant reduction in
seeks/sec for ext4. Eric has the pretty pictures that he can share.

> Can you send the full e2fsck -t -t output of the ext3 run? And what
> are the hdparm -t -t results for the disk?
>

I didn't run the ext3 test with -t -t (but I can refill and rerun; that
takes about 12 hours).

This disk is a relatively new Seagate 1TB drive, specs at:

http://www.seagate.com/ww/v/index.jsp?vgnextoid=0732f141e7f43110VgnVCM100000f5ee0a0aRCRD

hdparm test:

[root@localhost rwheeler]# /sbin/hdparm -t -t /dev/sdb

/dev/sdb:
Timing buffered disk reads: 186 MB in 3.03 seconds = 61.33 MB/sec
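
To put the fsck read rates next to this hdparm figure, here is a rough sketch that expresses each rate as a fraction of the buffered sequential read rate. The numbers are copied from the two outputs in this thread; the names HDPARM_MB_S, FSCK_RATES_MB_S, and fraction_of_sequential are made up for illustration, and treating the hdparm result as the drive's sequential ceiling is an assumption (it is a 3 second sample at the start of the disk).

```python
# Buffered sequential read rate reported by hdparm above, in MB/s.
HDPARM_MB_S = 61.33

# I/O rates reported by the ext4 fsck -t -t run, in MB/s.
FSCK_RATES_MB_S = {
    "Pass 1": 54.35,
    "Pass 2": 39.53,
    "overall": 46.77,
}


def fraction_of_sequential(rate_mb_s, sequential_mb_s=HDPARM_MB_S):
    """Return an fsck read rate as a fraction of the sequential read rate."""
    return rate_mb_s / sequential_mb_s


if __name__ == "__main__":
    for name, rate in FSCK_RATES_MB_S.items():
        print(f"{name}: {fraction_of_sequential(rate):.0%} of the hdparm rate")
```

By this measure the ext4 fsck as a whole reads at roughly three quarters of the hdparm rate, which fits the observation that it is doing relatively little seeking.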



> If I'm right, if you create the filesystem with mke2fs -t ext4dev -O
> ^flex_bg,^uninit_bg, you should see performance back to the old ext3
> levels.
>

With uninit_bg off, the fsck ran about 10 minutes, but it would be
interesting to run with both features disabled.

> - Ted
>
> P.S. We probably do want to examine the block allocation layout with
> flex_bg to make sure that the filesystem ages well in the long term.
>
Testing aged file systems is always the holy grail - this workload is a
fairly artificial one and was laid down with 4 threads concurrently writing
to a shared subdirectory.

ric