Received: by 2002:a05:6a10:af89:0:0:0:0 with SMTP id iu9csp872166pxb; Fri, 28 Jan 2022 12:00:11 -0800 (PST) X-Google-Smtp-Source: ABdhPJxw36wjvIB3jd14kjlVsuwMW8d1v+r4+G9RfdD+sbLiI5pL/u23EiEGThav1WPtlvJ/5kGM X-Received: by 2002:a05:6402:2789:: with SMTP id b9mr9551924ede.308.1643400010825; Fri, 28 Jan 2022 12:00:10 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1643400010; cv=none; d=google.com; s=arc-20160816; b=zvCgVnifG+/x0L3Hsl59GggWLlQqC6uCw0/FCZ/B286R7vZVILavy2pjT5xVbmHFap p8NpdFL159EPwZu3xKddvcrYpelqjlQUjZkN/0J9xZQPQM5SP/w9T2mPS88/+eTe/lzc LDt3SP6WX/Kh+MlypQYvraBEm9rvwGgfFq4rtGsqSBoyAvmr7MQRxzq3KGL/+GTaDfHa asrPOt9lkVGPYf0hG/sJ75apLhPj2WxsOnXqZLxyPQMxuh3eHWjAwEYBoUaQ/JBg6/zA 2fnV+Iv1Qzn9A2aI7wkX0ShhAd0EqNG6RLZA0CMWs07ycWpoRb5XeNVuiMVLHNA/RDgH v0/A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=vsZxj2n/FQH3vhW+ujW3pW+S6DvzEuQ7kVmpC7bpK4k=; b=DEcPN14tSS9GwxUps6AKj3LMnqovY05aBToStE6TYSLo4YeXda+8vGQV/jHeMDNy6c favq6Xo/aeVTMreOAoL85VHZLOUQWprUdZCM3sWItNG+ZXCSRbkALazEJCWgZkeNAba+ NrTDxcMkbZW0GRlvvfNIUaon7FaBDQiff0JD6cufJYxR6B6fDIl7q9uiTEF9mrUjgFMC vSpY7Vfkyi45iPBYb7vBXXksV1BAbeStNC3w7UWsFs4n4nblHYIiiA4/MQeKO4wh6Q2N XMctaOkmyMPRyrpROD3z+H2Humb+52uTgoqrgLZEvYoWICeK1bSz84fLqTp4GDZZdzvU YLzQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20210112 header.b=Cl0nOWq7; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id z26si3174032ejl.371.2022.01.28.11.59.33; Fri, 28 Jan 2022 12:00:10 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20210112 header.b=Cl0nOWq7; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S245393AbiA0VX4 (ORCPT + 99 others); Thu, 27 Jan 2022 16:23:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34060 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240581AbiA0VXz (ORCPT ); Thu, 27 Jan 2022 16:23:55 -0500 Received: from mail-ed1-x52c.google.com (mail-ed1-x52c.google.com [IPv6:2a00:1450:4864:20::52c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DA43BC061714 for ; Thu, 27 Jan 2022 13:23:54 -0800 (PST) Received: by mail-ed1-x52c.google.com with SMTP id j2so5500876edj.8 for ; Thu, 27 Jan 2022 13:23:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=vsZxj2n/FQH3vhW+ujW3pW+S6DvzEuQ7kVmpC7bpK4k=; b=Cl0nOWq7usXUtt3ClyuNJoGcfX4dh8CnVflfZzrcDXHlPXHL1bT6/PKwOHZhp24ISE T1nf5dlnF6DIfFFEhHEPhOL6qQdEe7QRCLsQWwfth+zmfqdRhl8xnCFXK8ipUX6EVYOA 6qJGWWuMFlfBFC3ToIQ8aHCFia2FZWzFjKlD9o3IXirOkv9DBp4YAwhr+N2CJf0ejrgw Tp/bVDl9CItryv66zQnjsJ4UjZp0xLCC8FFUSk1QByIKDkfJvdhkDLTlgobuOc820jFV /tYkNw1Ms1a5c10atzqsBH1pQlTsvRPA5xrJo2KA4bijTpvW2R0oI9GX5hzlKbETcS6P ePVg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=vsZxj2n/FQH3vhW+ujW3pW+S6DvzEuQ7kVmpC7bpK4k=; b=dHPbwrUUf9rMDE4ySxJ63beFE3B17GGN2Jz+HE0SjQzRh9EawO/SEnO/5CmzZjzVt/ cnA0zBVL5GTQvT6pZSF8hpTcPe0qtQEIHQ6nEzCT8NrkJrdPXx6Gsyn4z6uMYZh0WlaK KugDFDDrfpxLydx1+c3YmemKVSm+d/WBAx6TwX2PZI+xryHWAwmd1wGAJlKc6STQ3MPt UaOostYeSLeIFTsw2vr4vuJoeodCtq0TEiwUCpPB6bxnXtzdLP2PL5sIzvdzGDq0DIqP GOV8catDpuXHqeepmMtDIZhoZ08yHFYOhyqQ9V5GWifo7XXQOWRaFKVfhY7b2rVeY4cL iaAQ== X-Gm-Message-State: AOAM532bYEszodQWO54AOwVB10ml/uTeiOJ5UOZy6mgXqtgncgbcTdQ9 p5r7UdwxMQrVAy8tmV1fPcczedor4+bxRn0mMwI132eC35Q= X-Received: by 2002:a05:6402:270f:: with SMTP id y15mr5235743edd.409.1643318633482; Thu, 27 Jan 2022 13:23:53 -0800 (PST) MIME-Version: 1.0 References: <20220126095557.32392-1-david@redhat.com> <20220126095557.32392-7-david@redhat.com> In-Reply-To: <20220126095557.32392-7-david@redhat.com> From: Yang Shi Date: Thu, 27 Jan 2022 13:23:41 -0800 Message-ID: Subject: Re: [PATCH RFC v2 6/9] mm/khugepaged: remove reuse_swap_page() usage To: David Hildenbrand Cc: Linux Kernel Mailing List , Andrew Morton , Hugh Dickins , Linus Torvalds , David Rientjes , Shakeel Butt , John Hubbard , Jason Gunthorpe , Mike Kravetz , Mike Rapoport , "Kirill A . Shutemov" , Matthew Wilcox , Vlastimil Babka , Jann Horn , Michal Hocko , Nadav Amit , Rik van Riel , Roman Gushchin , Andrea Arcangeli , Peter Xu , Donald Dutile , Christoph Hellwig , Oleg Nesterov , Jan Kara , Liang Zhang , Linux MM Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jan 26, 2022 at 2:00 AM David Hildenbrand wrote: > > reuse_swap_page() currently indicates if we can write to an anon page > without COW. A COW is required if the page is shared by multiple > processes (either already mapped or via swap entries) or if there is > concurrent writeback that cannot tolerate concurrent page modifications. > > reuse_swap_page() doesn't check for pending references from other > processes that already unmapped the page, however, > is_refcount_suitable() essentially does the same thing in the context of > khugepaged. khugepaged is the last remaining user of reuse_swap_page() and > we want to remove that function. > > In the context of khugepaged, we are not actually going to write to the > page and we don't really care about other processes mapping the page: > for example, without swap, we don't care about shared pages at all. > > The current logic seems to be: > * Writable: -> Not shared, but might be in the swapcache. Nobody can > fault it in from the swapcache as there are no other swap entries. > * Readable and not in the swapcache: Might be shared (but nobody can > fault it in from the swapcache). > * Readable and in the swapcache: Might be shared and someone might be > able to fault it in from the swapcache. Make sure we're the exclusive > owner via reuse_swap_page(). > > Having to guess due to lack of comments and documentation, the current > logic really only wants to make sure that a page that might be shared > cannot be faulted in from the swapcache while khugepaged is active. > It's hard to guess why that is that case and if it's really still required, > but let's try keeping that logic unmodified. I don't think it could be faulted in while khugepaged is active since khugepaged does hold mmap_lock in write mode IIUC. So page fault is serialized against khugepaged. My wild guess is that collapsing shared pages was not supported before v5.8, so we need reuse_swap_page() to tell us if the page in swap cache is shared or not. But it is not true anymore. And khugepaged just allocates a THP then copy the data from base pages to huge page then replace PTEs to PMD, it doesn't change the content of the page, so I failed to see a problem by collapsing a shared page in swap cache. But I'm really not entirely sure, I may miss something... > > Instead of relying on reuse_swap_page(), let's unconditionally > try_to_free_swap(), special casing PageKsm(). try_to_free_swap() will fail > if there are still swap entries targeting the page or if the page is under > writeback. > > After a successful try_to_free_swap() that page cannot be readded to the > swapcache because we're keeping the page locked and removed from the LRU > until we actually perform the copy. So once we succeeded removing a page > from the swapcache, it cannot be re-added until we're done copying. Add a > comment stating that. > > Signed-off-by: David Hildenbrand > --- > mm/khugepaged.c | 16 +++++++++++++--- > 1 file changed, 13 insertions(+), 3 deletions(-) > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c > index 35f14d0a00a6..bc0ff598e98f 100644 > --- a/mm/khugepaged.c > +++ b/mm/khugepaged.c > @@ -683,10 +683,10 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma, > goto out; > } > if (!pte_write(pteval) && PageSwapCache(page) && > - !reuse_swap_page(page)) { > + (PageKsm(page) || !try_to_free_swap(page))) { > /* > - * Page is in the swap cache and cannot be re-used. > - * It cannot be collapsed into a THP. > + * Possibly shared page cannot be removed from the > + * swapache. It cannot be collapsed into a THP. > */ > unlock_page(page); > result = SCAN_SWAP_CACHE_PAGE; > @@ -702,6 +702,16 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma, > result = SCAN_DEL_PAGE_LRU; > goto out; > } > + > + /* > + * We're holding the page lock and removed the page from the > + * LRU. Once done copying, we'll unlock and readd to the > + * LRU via release_pte_page(). If the page is still in the > + * swapcache, we're the exclusive owner. Due to the page lock > + * the page cannot be added to the swapcache until we're done > + * and consequently it cannot be faulted in from the swapcache > + * into another process. > + */ > mod_node_page_state(page_pgdat(page), > NR_ISOLATED_ANON + page_is_file_lru(page), > compound_nr(page)); > -- > 2.34.1 >