Date: Wed, 20 Sep 2017 19:11:46 -0400
From: Johannes Weiner
To: Jens Axboe
Cc: John Stoffel, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, clm@fb.com, jack@suse.cz
Subject: Re: [PATCH 0/6] More graceful flusher thread memory reclaim wakeup
Message-ID: <20170920230910.GA18540@cmpxchg.org>
References: <1505850787-18311-1-git-send-email-axboe@kernel.dk>
	<20170920192909.GA27517@quad.stoffel.home>
	<8a91a54e-e224-ad79-faac-3f8fe654246a@kernel.dk>
In-Reply-To: <8a91a54e-e224-ad79-faac-3f8fe654246a@kernel.dk>

[ Fixed up CC list. John, you're sending email with From: John Stoffel ]

On Wed, Sep 20, 2017 at 01:32:25PM -0600, Jens Axboe wrote:
> On 09/20/2017 01:29 PM, John Stoffel wrote:
> > On Tue, Sep 19, 2017 at 01:53:01PM -0600, Jens Axboe wrote:
> >> We've had some issues with writeback in the presence of memory
> >> reclaim at Facebook, and this patch set attempts to fix them. The
> >> real functional change is the last patch in the series; the first
> >> five are prep and cleanup patches.
> >>
> >> The basic idea is that we have callers that call
> >> wakeup_flusher_threads() with nr_pages == 0, which means "write
> >> back everything". In memory reclaim situations, we can end up
> >> queuing a TON of these kinds of writeback units. This can cause
> >> softlockups and further memory issues, since we allocate huge
> >> amounts of struct wb_writeback_work to handle this writeback.
> >> Handle this situation more gracefully.
> >
> > This looks nice, but do you have any numbers to show how this
> > improves things? I read the patches, but I don't know the code well
> > enough to comment on them in detail. I am interested in how this
> > improves writeback under pressure, if at all.
>
> Writeback itself should be about the same; the series is mostly about
> preventing softlockups and excessive memory usage under conditions
> where we are actively trying to reclaim/clean memory. It was bad
> enough to cause softlockups in writeback work processing while the
> list of pending writeback work units grew to insane lengths.

In numbers: we have seen situations where 600 million writeback work
items were queued up by reclaim under pressure. That's 35G worth of
work descriptors (roughly 60 bytes per struct wb_writeback_work), and
the machine was struggling to remain responsive due to a lack of
memory.

Once writeback against all outstanding dirty pages has been requested,
there really isn't a need to queue even a second work item; the job is
already being performed. We can queue the next one when it completes.
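
To make that last point concrete, here is a rough sketch of what such
a guard could look like. This is illustrative only, not the actual
patch from the series: the WB_start_all state bit and the
wb_start_writeback_all() helper are names assumed for the example,
layered on real kernel primitives (test_and_set_bit(), wb_queue_work(),
wb_has_dirty_io()).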
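
/*
 * Sketch only -- would live in fs/fs-writeback.c, where
 * struct wb_writeback_work and wb_queue_work() are defined.
 * WB_start_all is an assumed new bit in the wb->state bitmask.
 */
#include <linux/backing-dev.h>
#include <linux/writeback.h>
#include <linux/slab.h>

static void wb_start_writeback_all(struct bdi_writeback *wb,
				   enum wb_reason reason)
{
	struct wb_writeback_work *work;

	if (!wb_has_dirty_io(wb))
		return;

	/*
	 * A "writeback everything" item is already queued or in
	 * flight; it will clean everything that is dirty right now,
	 * so piggyback on it rather than allocating another
	 * descriptor. The cheap test_bit() avoids the atomic in the
	 * common contended case.
	 */
	if (test_bit(WB_start_all, &wb->state) ||
	    test_and_set_bit(WB_start_all, &wb->state))
		return;

	work = kzalloc(sizeof(*work), GFP_NOWAIT | __GFP_NOWARN);
	if (!work) {
		clear_bit(WB_start_all, &wb->state);
		return;
	}

	work->sync_mode	   = WB_SYNC_NONE;
	work->nr_pages	   = LONG_MAX;	/* "write back everything" */
	work->range_cyclic = 1;
	work->reason	   = reason;
	work->auto_free	   = 1;

	wb_queue_work(wb, work);
}

/*
 * The flusher clears WB_start_all when this work item completes, so
 * the next reclaim-driven wakeup can queue a fresh one.
 */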
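
The point of the test_and_set_bit() guard is that reclaim can fire off
as many of these wakeups as it likes, but at most one full-flush
descriptor exists per bdi_writeback at a time, bounding the memory
cost to a single allocation instead of hundreds of millions.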