Date: Thu, 28 Sep 2017 09:19:49 -0400
From: John Stoffel
To: Jens Axboe
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, hannes@cmpxchg.org, jack@suse.cz, torvalds@linux-foundation.org
Subject: Re: [PATCH 0/12 v3] Writeback improvements
Message-ID: <20170928131949.GA4384@quad.stoffel.home>
References: <1506543239-31470-1-git-send-email-axboe@kernel.dk>
In-Reply-To: <1506543239-31470-1-git-send-email-axboe@kernel.dk>

On Wed, Sep 27, 2017 at 02:13:47PM -0600, Jens Axboe wrote:
> We've had some issues with writeback in presence of memory reclaim
> at Facebook, and this patch set attempts to fix it up. The real
> functional change for that issue is patch 10. The rest are cleanups,
> as well as the removal of doing non-range cyclic writeback. The users
> of that was sync_inodes_sb() and wakeup_flusher_threads(), both of
> which writeback all of the dirty pages.

So does this patch set make things faster? Less bursty? Does it make
writeout take longer, but with less spikes? What is the performance
impact of this change? I hate to be a pain, but this just smacks of
arm waving and I'm sure FB doesn't make changes without data... :-)

> The basic idea is that we have callers that call
> wakeup_flusher_threads() with nr_pages == 0. This means 'writeback
> everything'. For memory reclaim situations, we can end up queuing
> a TON of these kinds of writeback units. This can cause softlockups
> and further memory issues, since we allocate huge amounts of
> struct wb_writeback_work to handle this writeback. Handle this
> situation more gracefully.

Do you push back on the callers or slow them down? Why do we even
allow callers to flush everything?

John