Message-ID: <595b8e80-96c0-dab6-5d13-652f0a0e40ec@suse.cz>
Date: Fri, 28 Jan 2022 13:53:15 +0100
To: David Hildenbrand, linux-kernel@vger.kernel.org
Cc: Andrew Morton, Hugh Dickins, Linus Torvalds, David Rientjes,
 Shakeel Butt, John Hubbard, Jason Gunthorpe, Mike Kravetz,
 Mike Rapoport, Yang Shi, "Kirill A . Shutemov", Matthew Wilcox,
 Jann Horn, Michal Hocko, Nadav Amit, Rik van Riel, Roman Gushchin,
 Andrea Arcangeli, Peter Xu, Donald Dutile, Christoph Hellwig,
 Oleg Nesterov, Jan Kara, Liang Zhang, linux-mm@kvack.org, Nadav Amit
References: <20220126095557.32392-1-david@redhat.com>
 <20220126095557.32392-2-david@redhat.com>
From: Vlastimil Babka
Subject: Re: [PATCH RFC v2 1/9] mm: optimize do_wp_page() for exclusive pages in the swapcache
In-Reply-To: <20220126095557.32392-2-david@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 1/26/22 10:55, David Hildenbrand wrote:
> Liang Zhang reported [1] that the current COW logic in do_wp_page() is
> sub-optimal when it comes to swap+read fault+write fault of anonymous
> pages that have a single user, visible via a performance degradation in
> the redis benchmark. Something similar was previously reported [2] by
> Nadav with a simple reproducer.

Can we make the description more self-contained? I.e. describe that
sub-optimal COW means we copy when it's not necessary, and this can happen
if swap-out is followed by a swap-in for read and then a write fault
(IIUC), because the swap cache reference increases page_count()...

> Let's optimize for pages that have been added to the swapcache but only
> have an exclusive owner. Try removing the swapcache reference if there is
> hope that we're the exclusive user.

Can we expect any downside for reclaim efficiency due to the more
aggressive removal from swapcache? Probably not, as we are doing the
removal when the page is about to get dirty, so we wouldn't be able to
reuse any previously swapped out content anyway. Maybe it's even
beneficial?

> We will fail removing the swapcache reference in two scenarios:
> (1) There are additional swap entries referencing the page: copying
>     instead of reusing is the right thing to do.
> (2) The page is under writeback: theoretically we might be able to reuse
>     in some cases, however, we cannot remove the additional reference
>     and will have to copy.
>
> Further, we might have additional references from the LRU pagevecs,
> which will force us to copy instead of being able to reuse. We'll try
> handling such references for some scenarios next. Concurrent writeback
> cannot be handled easily and we'll always have to copy.
>
> While at it, remove the superfluous page_mapcount() check: it's
> implicitly covered by the page_count() for ordinary anon pages.
>
> [1] https://lkml.kernel.org/r/20220113140318.11117-1-zhangliang5@huawei.com
> [2] https://lkml.kernel.org/r/0480D692-D9B2-429A-9A88-9BBA1331AC3A@gmail.com
>
> Reported-by: Liang Zhang
> Reported-by: Nadav Amit
> Signed-off-by: David Hildenbrand

Acked-by: Vlastimil Babka

> ---
>  mm/memory.c | 20 ++++++++++++++------
>  1 file changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index c125c4969913..bcd3b7c50891 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3291,19 +3291,27 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
>  	if (PageAnon(vmf->page)) {
>  		struct page *page = vmf->page;
>
> -		/* PageKsm() doesn't necessarily raise the page refcount */
> -		if (PageKsm(page) || page_count(page) != 1)
> +		/*
> +		 * We have to verify under page lock: these early checks are
> +		 * just an optimization to avoid locking the page and freeing
> +		 * the swapcache if there is little hope that we can reuse.
> +		 *
> +		 * PageKsm() doesn't necessarily raise the page refcount.
> +		 */
> +		if (PageKsm(page) || page_count(page) > 1 + PageSwapCache(page))
>  			goto copy;
>  		if (!trylock_page(page))
>  			goto copy;
> -		if (PageKsm(page) || page_mapcount(page) != 1 || page_count(page) != 1) {
> +		if (PageSwapCache(page))
> +			try_to_free_swap(page);
> +		if (PageKsm(page) || page_count(page) != 1) {
>  			unlock_page(page);
>  			goto copy;
>  		}
>  		/*
> -		 * Ok, we've got the only map reference, and the only
> -		 * page count reference, and the page is locked,
> -		 * it's dark out, and we're wearing sunglasses. Hit it.
> +		 * Ok, we've got the only page reference from our mapping
> +		 * and the page is locked, it's dark out, and we're wearing
> +		 * sunglasses. Hit it.
>  		 */
>  		unlock_page(page);
>  		wp_page_reuse(vmf);
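
To make the refcount reasoning in the hunk above easier to follow, here is a
minimal userspace sketch of the reuse-vs-copy decision before and after the
patch. It is a toy model, not kernel code: struct toy_page, its fields and
the toy_*() helpers are hypothetical stand-ins for struct page,
page_count(), PageSwapCache(), PageKsm() and try_to_free_swap(). The page
lock and the old page_mapcount() check are left out, and the "additional
swap entries" and writeback conditions are assumed to be the only reasons
the swapcache reference cannot be dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct page; every name here is hypothetical. */
struct toy_page {
	int refcount;          /* models page_count() */
	bool in_swapcache;     /* models PageSwapCache() */
	bool ksm;              /* models PageKsm() */
	bool under_writeback;  /* scenario (2) from the patch description */
	int extra_swap_refs;   /* scenario (1): other swap entries still in use */
};

/* Models try_to_free_swap(): drop the swapcache reference if possible. */
static bool toy_try_to_free_swap(struct toy_page *p)
{
	if (!p->in_swapcache)
		return false;
	if (p->under_writeback || p->extra_swap_refs > 0)
		return false;
	/* Model invariant: one reference from the mapping, one from the swapcache. */
	assert(p->refcount >= 2);
	p->in_swapcache = false;
	p->refcount--;
	return true;
}

/* Old logic: reuse only if we hold the single reference up front. */
static bool toy_can_reuse_old(const struct toy_page *p)
{
	return !p->ksm && p->refcount == 1;
}

/* New logic: tolerate one extra reference if it may belong to the swapcache. */
static bool toy_can_reuse_new(struct toy_page *p)
{
	/* Early check: bail out if there is little hope of reusing the page. */
	if (p->ksm || p->refcount > 1 + (p->in_swapcache ? 1 : 0))
		return false;
	/* The kernel takes the page lock here; the model skips it. */
	if (p->in_swapcache)
		toy_try_to_free_swap(p);
	/* Final check: reuse only if ours is the last remaining reference. */
	return !p->ksm && p->refcount == 1;
}
```

In this model, a page with refcount 2 that sits in the swapcache is copied
by the old logic but reused by the new one, while writeback, an extra swap
entry, or a third reference (e.g. from an LRU pagevec) still forces a copy.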