From: Hou Tao
Subject: cfq-iosched: two questions about the hrtimer version of CFQ
To: Jan Kara
CC: Vivek Goyal
Date: Mon, 6 Mar 2017 21:50:09 +0800
Message-ID: <775f3ecb-45d1-4264-885a-f14e0458d36b@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Jan and list,

When testing the hrtimer version of CFQ, we found a performance degradation
problem which seems to be caused by commit 0b31c10 ("cfq-iosched: Charge at
least 1 jiffie instead of 1 ns").

The test process was as follows:

* filesystem and block device
    * XFS + /dev/sda mounted on /tmp/sda
* CFQ configuration
    * default configuration
* fio job configuration

    [global]
    bs=4k
    ioengine=psync
    iodepth=1
    direct=1
    rw=randwrite
    time_based
    runtime=15
    cgroup_nodelete=1
    group_reporting=1

    [cfq_a]
    filename=/tmp/sda/cfq_a.dat
    size=2G
    cgroup_weight=500
    cgroup=cfq_a
    thread=1
    numjobs=2

    [cfq_b]
    new_group
    filename=/tmp/sda/cfq_b.dat
    size=2G
    rate=4m
    cgroup_weight=500
    cgroup=cfq_b
    thread=1
    numjobs=2

The test results were as follows:

* with 0b31c10:
    * fio report
        cfq_a: bw=5312.6KB/s, iops=1328
        cfq_b: bw=8192.6KB/s, iops=2048
    * blkcg debug files
        ./cfq_a/blkio.group_wait_time:8:0 12062571233
        ./cfq_b/blkio.group_wait_time:8:0 155841600
        ./cfq_a/blkio.io_serviced:Total 19922
        ./cfq_b/blkio.io_serviced:Total 30722
        ./cfq_a/blkio.time:8:0 19406083246
        ./cfq_b/blkio.time:8:0 19417146869

* without 0b31c10:
    * fio report
        cfq_a: bw=21670KB/s, iops=5417
        cfq_b: bw=8191.2KB/s, iops=2047
    * blkcg debug files
        ./cfq_a/blkio.group_wait_time:8:0 5798452504
        ./cfq_b/blkio.group_wait_time:8:0 5131844007
        ./cfq_a/blkio.io_serviced:8:0 Write 81261
        ./cfq_b/blkio.io_serviced:8:0 Write 30722
        ./cfq_a/blkio.time:8:0 5642608173
        ./cfq_b/blkio.time:8:0 5849949812

We would like to know why you changed the minimal used slice back to 1 jiffy
for the case where the slice has not been allocated. Did the 1 ns value lead
to some performance regression or something similar? If not, I think we could
revert the minimal slice to 1 ns again.

The second question is about time comparison in the CFQ code. The non-hrtimer
version of CFQ uses time_after or time_before where possible. Why doesn't the
hrtimer version use the equivalent time_after64/time_before64? Can
ktime_get_ns() ensure there will be no wrapping problem?

Thanks very much.

Regards,
Tao
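
To make the second question concrete, here is a minimal userspace sketch of the two comparison styles. The helper names (wrap_safe_after64, wrap_safe_before64, plain_after64, ns_to_years, wrap_demo) are made up for this illustration and are not kernel APIs; the wrap-safe helper only mirrors the signed-difference idea behind time_after64(). The last helper estimates how long a nanosecond counter would have to run before wrapping, assuming it starts near zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for time_after64(a, b): true if a is after b,
 * even when the counter has wrapped between b and a. */
static inline int wrap_safe_after64(uint64_t a, uint64_t b)
{
	return (int64_t)(b - a) < 0;
}

/* Hypothetical stand-in for time_before64(a, b). */
static inline int wrap_safe_before64(uint64_t a, uint64_t b)
{
	return wrap_safe_after64(b, a);
}

/* Plain unsigned comparison; correct only if the counter never wraps. */
static inline int plain_after64(uint64_t a, uint64_t b)
{
	return a > b;
}

/* Whole 365-day years represented by a nanosecond count. */
static inline uint64_t ns_to_years(uint64_t ns)
{
	const uint64_t ns_per_year = 365ULL * 24 * 60 * 60 * 1000000000ULL;

	return ns / ns_per_year;
}

/* Small self-check showing where the two styles differ. */
void wrap_demo(void)
{
	uint64_t before_wrap = UINT64_MAX - 5;	/* timestamp taken just before a wrap */
	uint64_t after_wrap = 5;		/* timestamp taken just after the wrap */

	/* The signed-difference form still orders the two correctly. */
	assert(wrap_safe_after64(after_wrap, before_wrap));
	/* The plain form gets it wrong across the wrap point. */
	assert(!plain_after64(after_wrap, before_wrap));
	/* Away from the wrap point both forms agree. */
	assert(wrap_safe_after64(200, 100) && plain_after64(200, 100));
}
```

With these helpers, ns_to_years(UINT64_MAX) gives 584 and ns_to_years(INT64_MAX) gives 292, so a plain comparison would only misbehave after centuries of uptime if the nanosecond clock really does start near zero. Whether that assumption holds for every value compared in CFQ is exactly what the question asks.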