2024-01-03 13:28:47

by Zhang Yi

Subject: Re: [linus:master] [jbd2] 6a3afb6ac6: fileio.latency_95th_ms 92.5% regression

On 2024/1/3 17:49, Jan Kara wrote:
> Hello!
>
> On Wed 03-01-24 11:31:39, Zhang Yi wrote:
>> On 2024/1/2 15:31, kernel test robot wrote:
>>>
>>>
>>> Hello,
>>>
>>> kernel test robot noticed a 92.5% regression of fileio.latency_95th_ms on:
>>
>> This seems a little weird: the test doesn't use blk-cgroup, and the patch
>> increases the IO priority in WBT, so in theory there shouldn't be any
>> negative influence.
>
> I don't have a great explanation either but there could be some impact e.g.
> due to a different request merging of IO done by JBD2 vs the flush worker or
> something like that. Note that the throughput reduction is only 5.7% so it
> is not huge.

Yeah, that makes sense. It is the one explanation I can think of at the
moment.

>
>> I've tested sysbench on my machine with an Intel Xeon Gold 6240 CPU,
>> 400GB of memory and an HDD, and couldn't reproduce this regression.
>>
>> ==
>> Without 6a3afb6ac6 ("jbd2: increase the journal IO's priority")
>> ==
>>
>> $ sysbench fileio --events=0 --threads=128 --time=600 --file-test-mode=seqwr --file-total-size=68719476736 --file-io-mode=sync --file-num=1024 run
>>
>> sysbench 1.1.0-df89d34 (using bundled LuaJIT 2.1.0-beta3)
>>
>> Running the test with following options:
>> Number of threads: 128
>> Initializing random number generator from current time
>>
>>
>> Extra file open flags: (none)
>> 1024 files, 64MiB each
>> 64GiB total file size
>> Block size 16KiB
>> Periodic FSYNC enabled, calling fsync() each 100 requests.
>> Calling fsync() at the end of test, Enabled.
>> Using synchronous I/O mode
>> Doing sequential write (creation) test
>> Initializing worker threads...
>>
>> Threads started!
>>
>>
>> Throughput:
>> read: IOPS=0.00 0.00 MiB/s (0.00 MB/s)
>> write: IOPS=31961.19 499.39 MiB/s (523.65 MB/s)
>> fsync: IOPS=327500.24
>
> Well, your setup seems to be very different from what LKP was using. You
> are achieving ~500 MB/s (likely because all the files fit into the cache
> and more or less even within the dirty limit of the page cache) while LKP
> run achieves only ~54 MB/s (i.e., we are pretty much bound by the rather
> slow disk). I'd try running with something like 32GB of RAM to really see
> the disk speed impact...
>

I'm afraid I missed the vmstat.io.bo changes. I will limit the dirty ratio
and test it again tomorrow.
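
As a rough sanity check of the page cache argument, here is a minimal
sketch comparing the sysbench file set with an approximate global dirty
limit. It assumes the default vm.dirty_ratio of 20, treats "400GB" as
400 GiB, and counts all RAM as dirtyable, so the result is only an upper
bound. The helper name dirty_limit_bytes is made up for this illustration.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def dirty_limit_bytes(dirtyable_bytes: int, dirty_ratio_pct: int = 20) -> int:
    """Approximate the global dirty threshold as a percentage of memory.

    Assumption: all of RAM counts as dirtyable. The kernel actually uses
    free plus reclaimable memory, so the real limit is somewhat lower.
    """
    return dirtyable_bytes * dirty_ratio_pct // 100


# File set from the sysbench run: 1024 files of 64 MiB each.
file_set = 1024 * 64 * MIB

for ram_gib in (400, 32):
    limit = dirty_limit_bytes(ram_gib * GIB)
    fits = file_set <= limit
    print(f"{ram_gib} GiB RAM: dirty limit ~{limit / GIB:.1f} GiB, "
          f"file set {file_set / GIB:.0f} GiB, fits within limit: {fits}")
```

Under these assumptions the 64 GiB file set stays below the dirty limit
on the 400GB machine, but far exceeds it with 32GB of RAM, which is when
writers would get throttled to disk speed.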

Thanks,
Yi.