Date: Tue, 8 Nov 2016 14:42:33 +0100
From: Jan Kara
To: Jens Axboe
Cc: axboe@kernel.dk, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, jack@suse.cz, hch@lst.de
Subject: Re: [PATCH 8/8] block: hook up writeback throttling
Message-ID: <20161108134233.GR32353@quack2.suse.cz>
References: <1478034531-28559-1-git-send-email-axboe@fb.com> <1478034531-28559-9-git-send-email-axboe@fb.com>
In-Reply-To: <1478034531-28559-9-git-send-email-axboe@fb.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue 01-11-16 15:08:51, Jens Axboe wrote:
> Enable throttling of buffered writeback to make it a lot
> more smooth, and has way less impact on other system activity.
> Background writeback should be, by definition, background
> activity. The fact that we flush huge bundles of it at the time
> means that it potentially has heavy impacts on foreground workloads,
> which isn't ideal. We can't easily limit the sizes of writes that
> we do, since that would impact file system layout in the presence
> of delayed allocation. So just throttle back buffered writeback,
> unless someone is waiting for it.
>
> The algorithm for when to throttle takes its inspiration in the
> CoDel networking scheduling algorithm. Like CoDel, blk-wb monitors
> the minimum latencies of requests over a window of time. In that
> window of time, if the minimum latency of any request exceeds a
> given target, then a scale count is incremented and the queue depth
> is shrunk. The next monitoring window is shrunk accordingly. Unlike
> CoDel, if we hit a window that exhibits good behavior, then we
> simply increment the scale count and re-calculate the limits for that
> scale value. This prevents us from oscillating between a
> close-to-ideal value and max all the time, instead remaining in the
> windows where we get good behavior.
>
> Unlike CoDel, blk-wb allows the scale count to go negative. This
> happens if we primarily have writes going on. Unlike positive
> scale counts, this doesn't change the size of the monitoring window.
> When the heavy writers finish, blk-wb quickly snaps back to its
> stable state of a zero scale count.
>
> The patch registers two sysfs entries. The first one, 'wb_window_usec',
> defines the window of monitoring. The second one, 'wb_lat_usec',
> sets the latency target for the window. It defaults to 2 msec for
> non-rotational storage, and 75 msec for rotational storage. Setting
> this value to '0' disables blk-wb. Generally, a user would not have
> to touch these settings.
>
> We don't enable WBT on devices that are managed with CFQ, and have
> a non-root block cgroup attached. If we have a proportional share setup
> on this particular disk, then the wbt throttling will interfere with
> that. We don't have a strong need for wbt for that case, since we will
> rely on CFQ doing that for us.

Just one nit: Aren't you missing a wbt_exit() call for the legacy block
layer? I don't see where that happens.

								Honza
-- 
Jan Kara
SUSE Labs, CR
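
For readers following the quoted description, here is a rough Python sketch of the per-window scale-count logic. It is not the kernel code. The class and method names are hypothetical, and the halving of the depth per step and the 1/sqrt shrink of the window are assumptions made for illustration. It also assumes a good window moves the scale count down, which is what would let it go negative.

```python
import math
from dataclasses import dataclass


@dataclass
class WbThrottle:
    """Toy model of the quoted description; all names are hypothetical."""
    target_usec: int        # latency target ('wb_lat_usec' in the quoted text)
    base_window_usec: int   # monitoring window ('wb_window_usec')
    max_depth: int          # assumed unthrottled queue depth
    scale: int = 0          # scale count; zero is the stable state

    def window_usec(self) -> int:
        # Only positive scale counts shrink the window (per the quoted text).
        # The 1/sqrt(scale + 1) factor is an assumption borrowed from CoDel.
        if self.scale > 0:
            return int(self.base_window_usec / math.sqrt(self.scale + 1))
        return self.base_window_usec

    def depth(self) -> int:
        # Assumed rule: halve the depth per positive step, never below 1.
        # Zero or negative scale counts leave the depth at its maximum here.
        if self.scale > 0:
            return max(1, self.max_depth >> self.scale)
        return self.max_depth

    def end_of_window(self, min_latency_usec: int) -> None:
        # Bad window: minimum latency exceeded the target, so throttle harder.
        if min_latency_usec > self.target_usec:
            self.scale += 1
        else:
            # Good window: step the other way (assumption, see above).
            self.scale -= 1
```

With a 2 msec target, one bad window halves the assumed depth and shortens the next window, while later good windows step the count back down without touching the window size once it is at or below zero.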