Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751100AbdCQMng (ORCPT ); Fri, 17 Mar 2017 08:43:36 -0400
Received: from mx1.redhat.com ([209.132.183.28]:37446 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751023AbdCQMnf (ORCPT ); Fri, 17 Mar 2017 08:43:35 -0400
DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 40721342C72
Authentication-Results: ext-mx09.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com
Authentication-Results: ext-mx09.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=aquini@redhat.com
DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com 40721342C72
Date: Fri, 17 Mar 2017 08:42:45 -0400
From: Rafael Aquini
To: "Huang, Ying"
Cc: Andrew Morton , Andi Kleen , Dave Hansen , Shaohua Li , Rik van Riel , Tim Chen , Michal Hocko , Mel Gorman , Aaron Lu , "Kirill A. Shutemov" , Gerald Schaefer , linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/5] mm, swap: Fix comment in __read_swap_cache_async
Message-ID: <20170317124244.GF956@xps>
References: <20170317064635.12792-1-ying.huang@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170317064635.12792-1-ying.huang@intel.com>
User-Agent: Mutt/1.7.1 (2016-10-04)
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Fri, 17 Mar 2017 12:42:52 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2829
Lines: 67

On Fri, Mar 17, 2017 at 02:46:19PM +0800, Huang, Ying wrote:
> From: Huang Ying
>
> The commit cbab0e4eec29 ("swap: avoid read_swap_cache_async() race to
> deadlock while waiting on discard I/O completion") fixed a deadlock in
> read_swap_cache_async(). Because at that time, in swap allocation
> path, a swap entry may be set as SWAP_HAS_CACHE, then wait for
> discarding to complete before the page for the swap entry is added to
> the swap cache. But in the commit 815c2c543d3a ("swap: make swap
> discard async"), the discarding for swap become asynchronous, waiting
> for discarding to complete will be done before the swap entry is set
> as SWAP_HAS_CACHE. So the comments in code is incorrect now. This
> patch fixes the comments.
>
> The cond_resched() added in the commit cbab0e4eec29 is not necessary
> now too. But if we added some sleep in swap allocation path in the
> future, there may be some hard to debug/reproduce deadlock bug. So it
> is kept.
>

^ this is a rather disconcerting way to describe why you left that part
behind, and I recollect telling you about it in a private discussion.

The fact is that __read_swap_cache_async() still races against
get_swap_page() with a way narrower window due to the async fashioned
SSD wear leveling done for swap nowadays and other changes made within
__read_swap_cache_async()'s while loop thus making that old deadlock
scenario very improbable to strike again.

All seems legit, apart from that last paragraph in the commit log message

Acked-by: Rafael Aquini

> Cc: Shaohua Li
> Cc: Rafael Aquini
> Signed-off-by: "Huang, Ying"
> ---
>  mm/swap_state.c | 12 +-----------
>  1 file changed, 1 insertion(+), 11 deletions(-)
>
> diff --git a/mm/swap_state.c b/mm/swap_state.c
> index 473b71e052a8..7bfb9bd1ca21 100644
> --- a/mm/swap_state.c
> +++ b/mm/swap_state.c
> @@ -360,17 +360,7 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
>  			/*
>  			 * We might race against get_swap_page() and stumble
>  			 * across a SWAP_HAS_CACHE swap_map entry whose page
> -			 * has not been brought into the swapcache yet, while
> -			 * the other end is scheduled away waiting on discard
> -			 * I/O completion at scan_swap_map().
> -			 *
> -			 * In order to avoid turning this transitory state
> -			 * into a permanent loop around this -EEXIST case
> -			 * if !CONFIG_PREEMPT and the I/O completion happens
> -			 * to be waiting on the CPU waitqueue where we are now
> -			 * busy looping, we just conditionally invoke the
> -			 * scheduler here, if there are some more important
> -			 * tasks to run.
> +			 * has not been brought into the swapcache yet.
>  			 */
>  			cond_resched();
>  			continue;
> --
> 2.11.0
>
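
For readers following the thread, the retry pattern around the retained
cond_resched() can be sketched in userspace C. This is only an
illustration under stated assumptions, not the kernel code: fake_slot,
fake_try_claim() and claim_slot() are hypothetical names that do not
exist in the kernel, sched_yield() stands in for cond_resched(), and a
simple countdown stands in for the other side eventually bringing the
page into the swap cache.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>

/*
 * Hypothetical stand-in for a swap_map slot. pending_lookups is the
 * number of further lookups that will still see SWAP_HAS_CACHE set
 * without the page being in the swap cache yet.
 */
struct fake_slot {
	int pending_lookups;
};

/*
 * Hypothetical claim attempt: returns -EEXIST while the other side
 * still owns the slot, 0 once the slot can be claimed.
 */
int fake_try_claim(struct fake_slot *slot)
{
	if (slot->pending_lookups > 0) {
		slot->pending_lookups--;
		return -EEXIST;
	}
	return 0;
}

/*
 * Retry until the slot is claimed, yielding the CPU on every -EEXIST
 * so that whoever must finish the transitory state gets a chance to
 * run. Returns the number of times we yielded.
 */
int claim_slot(struct fake_slot *slot)
{
	int yields = 0;

	for (;;) {
		int err = fake_try_claim(slot);

		if (err == -EEXIST) {
			sched_yield();	/* userspace stand-in for cond_resched() */
			yields++;
			continue;
		}
		assert(err == 0);
		return yields;
	}
}
```

Without the yield, a loop of this shape on a non-preemptible CPU could
spin forever if the task that has to clear the transitory state needs
that same CPU, which is the scenario the removed comment described.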