Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753650AbbGBO0E (ORCPT ); Thu, 2 Jul 2015 10:26:04 -0400 Received: from imap.thunk.org ([74.207.234.97]:37073 "EHLO imap.thunk.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753460AbbGBO0B (ORCPT ); Thu, 2 Jul 2015 10:26:01 -0400 Date: Thu, 2 Jul 2015 10:25:51 -0400 From: "Theodore Ts'o" To: Michal Hocko Cc: Nikolay Borisov , Johannes Weiner , Andrew Morton , Dave Chinner , Marian Marinov , Hugh Dickins , linux-mm@kvack.org, LKML , linux-ext4@vger.kernel.org Subject: Re: [PATCH] mm, vmscan: Do not wait for page writeback for GFP_NOFS allocations Message-ID: <20150702142551.GB9456@thunk.org> Mail-Followup-To: Theodore Ts'o , Michal Hocko , Nikolay Borisov , Johannes Weiner , Andrew Morton , Dave Chinner , Marian Marinov , Hugh Dickins , linux-mm@kvack.org, LKML , linux-ext4@vger.kernel.org References: <1435677437-16717-1-git-send-email-mhocko@suse.cz> <20150701061731.GB6286@dhcp22.suse.cz> <20150701133715.GA6287@dhcp22.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150701133715.GA6287@dhcp22.suse.cz> User-Agent: Mutt/1.5.23 (2014-03-12) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: tytso@thunk.org X-SA-Exim-Scanned: No (on imap.thunk.org); SAEximRunCond expanded to false Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2016 Lines: 62 On Wed, Jul 01, 2015 at 03:37:15PM +0200, Michal Hocko wrote: > diff --git a/mm/vmscan.c b/mm/vmscan.c > index 37e90db1520b..6c44d424968e 100644 > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -995,7 +995,7 @@ static unsigned long shrink_page_list(struct list_head *page_list, > goto keep_locked; > > /* Case 3 above */ > - } else { > + } else if (sc->gfp_mask & __GFP_FS) { > wait_on_page_writeback(page); > } > } Um, I've just taken a closer look at this code now that I'm back from vacation, and I'm not sure this is right. This Case 3 code occurs inside an if (PageWriteback(page)) { ... } conditional, and if I'm not mistaken, if the flow of control exits this conditional, it is assumed that the page is *not* under writeback. This patch will assume the page has been cleaned if __GFP_FS is set, which could lead to a dirty page getting dropped, so I believe this is a bug. No? It would seem to me that a better fix would be to change the Case 2 handling: /* Case 2 above */ } else if (global_reclaim(sc) || - !PageReclaim(page) || !(sc->gfp_mask & __GFP_IO)) { + !PageReclaim(page) || !(sc->gfp_mask & __GFP_FS)) { /* * This is slightly racy - end_page_writeback() * might have just cleared PageReclaim, then * setting PageReclaim here end up interpreted * as PageReadahead - but that does not matter * enough to care. What we do want is for this * page to have PageReclaim set next time memcg * reclaim reaches the tests above, so it will * then wait_on_page_writeback() to avoid OOM; * and it's also appropriate in global reclaim. */ SetPageReclaim(page); nr_writeback++; goto keep_locked; Am I missing something? - Ted -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/