Subject: Re: [PATCH 6/6] writeback: throttle buffered writeback
To: Shaohua Li <shli@kernel.org>
References: <1458669320-6819-1-git-send-email-axboe@fb.com>
 <1458669320-6819-7-git-send-email-axboe@fb.com>
 <x49twjyb7cb.fsf@segfault.boston.devel.redhat.com> <56F1A8D0.4060403@fb.com>
 <20160322213000.GA45596@kernel.org>
CC: Jeff Moyer <jmoyer@redhat.com>, <linux-kernel@vger.kernel.org>,
        <linux-fsdevel@vger.kernel.org>, <linux-block@vger.kernel.org>
From: Jens Axboe <axboe@fb.com>
Message-ID: <56F1BA9D.7020209@fb.com>
Date: Tue, 22 Mar 2016 15:35:25 -0600
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
 Thunderbird/38.6.0
MIME-Version: 1.0
In-Reply-To: <20160322213000.GA45596@kernel.org>
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 1822
Lines: 42

On 03/22/2016 03:30 PM, Shaohua Li wrote:
> On Tue, Mar 22, 2016 at 02:19:28PM -0600, Jens Axboe wrote:
>> On 03/22/2016 02:12 PM, Jeff Moyer wrote:
>>> Hi, Jens,
>>>
>>> Jens Axboe <axboe@fb.com> writes:
>>>
>>>> If the device has write back caching, 'wb_cache_delay' delays by
>>>> this amount of usecs when a write completes before allowing more.
>>>
>>> What's the reason behind that?
>>
>> For classic write back caching, the cache can absorb a bunch of writes
>> shortly, which means that the completion cost only shows a small part of the
>> overall cost. This means that if we just throttle on completion, then when
>> the device starts committing to media, then we'll end up starving other IO
>> anyway. This knob is a way to attempt to tame that.
>
> Does request size matter? I think it's yes. If request size will be accounted,
> there will be issue how to evaluate IO cost of each request, which is hard.

The code currently ignores it deliberately, since we do the throttling 
checks post merging. We can experiment with doing it on a per-request 
basis. I didn't want to complicate it too much. In my testing, for this 
sort of application, the size of the request doesn't matter much. 
That's mainly because we, by default, bound the size. If it were 
unbounded, then that would be different.
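
To make the accounting concrete, here is a rough user-space model of the 
idea, not the code from the patch. The struct, the field names and the 
helpers are all made up for illustration; only 'wb_cache_delay' comes from 
the patch description. Note that request size is not an input anywhere: 
we count requests, and on a device with write back caching we also hold 
off for a while after a write completes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the throttle state; names are illustrative only. */
struct wb_throttle {
	unsigned int inflight;     /* writeback requests issued, not completed */
	unsigned int max_depth;    /* allowed depth for buffered writeback */
	uint64_t cache_delay_us;   /* stand-in for the 'wb_cache_delay' knob */
	uint64_t last_complete_us; /* time of the most recent write completion */
	bool have_completion;      /* false until the first write completes */
	bool wb_cache;             /* device has write back caching */
};

/* May one more writeback request be issued at time now_us? */
static bool wb_may_issue(const struct wb_throttle *t, uint64_t now_us)
{
	/* Depth check counts requests, regardless of how large each one is. */
	if (t->inflight >= t->max_depth)
		return false;

	/* With a write back cache, wait out the delay after a completion. */
	if (t->wb_cache && t->have_completion &&
	    now_us - t->last_complete_us < t->cache_delay_us)
		return false;

	return true;
}

static void wb_issue(struct wb_throttle *t)
{
	t->inflight++;
}

static void wb_complete(struct wb_throttle *t, uint64_t now_us)
{
	if (t->inflight)
		t->inflight--;
	t->last_complete_us = now_us;
	t->have_completion = true;
}
```

Accounting for size would mean replacing the simple inflight counter with 
some per-request cost, which is the hard part you point out.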

> Looks the throttling is done regardless if there is other IO running, which
> could hurt writeback.

I wanted to make the first cut very tough on the writes. We always want 
to throttle, but perhaps not as much as we do now. But you'd be 
surprised how close this basic low depth gets to ideal performance, on 
most devices!

Background writeback does not have to be at 100% or 99% of the device 
capability. If we sync or wait on it, then yes, we want it to go really 
fast. And it should.
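
As a sketch of that distinction (hypothetical names, and the divide-by-4 
is an arbitrary placeholder, not a number from the patch or from my 
testing), the depth limit could depend on whether anyone is waiting on 
the writeback:

```c
/* Hypothetical classification of a writeback request. */
enum wb_kind {
	WB_BACKGROUND, /* nobody is waiting on this writeback */
	WB_WAITED_ON,  /* a sync or an explicit wait depends on it */
};

/* Pick a depth limit; device_depth is the full depth the device allows. */
static unsigned int wb_depth_limit(enum wb_kind kind, unsigned int device_depth)
{
	unsigned int limit;

	/* Someone is waiting: let it use the whole device. */
	if (kind == WB_WAITED_ON)
		return device_depth;

	/* Background: a fraction is enough. The /4 is a placeholder. */
	limit = device_depth / 4;
	return limit ? limit : 1;
}
```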

-- 
Jens Axboe