Date: Mon, 11 Feb 2019 09:17:18 -0500
From: "Michael S. Tsirkin"
To: Nitesh Narayan Lal
Cc: Alexander Duyck, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org, rkrcmar@redhat.com, alexander.h.duyck@linux.intel.com,
	x86@kernel.org, mingo@redhat.com, bp@alien8.de, hpa@zytor.com,
	pbonzini@redhat.com, tglx@linutronix.de, akpm@linux-foundation.org
Subject: Re: [RFC PATCH 4/4] mm: Add merge page notifier
Message-ID: <20190211091623-mutt-send-email-mst@kernel.org>
References: <20190204181118.12095.38300.stgit@localhost.localdomain>
	<20190204181558.12095.83484.stgit@localhost.localdomain>
	<20190209195325-mutt-send-email-mst@kernel.org>
	<7fcb61d6-64f0-f3ae-5e32-0e9f587fdd49@redhat.com>
In-Reply-To: <7fcb61d6-64f0-f3ae-5e32-0e9f587fdd49@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Feb 11, 2019 at 08:30:03AM -0500, Nitesh Narayan Lal wrote:
>
> On 2/9/19 7:57 PM, Michael S. Tsirkin wrote:
> > On Mon, Feb 04, 2019 at 10:15:58AM -0800, Alexander Duyck wrote:
> >> From: Alexander Duyck
> >>
> >> Because the implementation was limiting itself to only providing hints on
> >> pages huge TLB order sized or larger we introduced the possibility for free
> >> pages to slip past us because they are freed as something less than
> >> huge TLB in size and aggregated with buddies later.
> >>
> >> To address that I am adding a new call arch_merge_page which is called
> >> after __free_one_page has merged a pair of pages to create a higher order
> >> page. By doing this I am able to fill the gap and provide full coverage for
> >> all of the pages huge TLB order or larger.
> >>
> >> Signed-off-by: Alexander Duyck
> > Looks like this will be helpful whenever active free page
> > hints are added. So I think it's a good idea to
> > add a hook.
> >
> > However, could you split adding the hook to a separate
> > patch from the KVM hypercall based implementation?
> >
> > Then e.g. Nilal's patches could reuse it too.
> With the current design of my patch-set, if I use this hook to report
> free pages, I will end up making redundant hints for the same pfns.
>
> This is because the pages, once freed by the host, are returned back to
> the buddy.

Suggestions on how you'd like to fix this? You do need this if
you introduce a size cut-off, right?

> >
> >
> >> ---
> >>  arch/x86/include/asm/page.h |   12 ++++++++++++
> >>  arch/x86/kernel/kvm.c       |   28 ++++++++++++++++++++++++++++
> >>  include/linux/gfp.h         |    4 ++++
> >>  mm/page_alloc.c             |    2 ++
> >>  4 files changed, 46 insertions(+)
> >>
> >> diff --git a/arch/x86/include/asm/page.h b/arch/x86/include/asm/page.h
> >> index 4487ad7a3385..9540a97c9997 100644
> >> --- a/arch/x86/include/asm/page.h
> >> +++ b/arch/x86/include/asm/page.h
> >> @@ -29,6 +29,18 @@ static inline void arch_free_page(struct page *page, unsigned int order)
> >>  	if (static_branch_unlikely(&pv_free_page_hint_enabled))
> >>  		__arch_free_page(page, order);
> >>  }
> >> +
> >> +struct zone;
> >> +
> >> +#define HAVE_ARCH_MERGE_PAGE
> >> +void __arch_merge_page(struct zone *zone, struct page *page,
> >> +		       unsigned int order);
> >> +static inline void arch_merge_page(struct zone *zone, struct page *page,
> >> +				   unsigned int order)
> >> +{
> >> +	if (static_branch_unlikely(&pv_free_page_hint_enabled))
> >> +		__arch_merge_page(zone, page, order);
> >> +}
> >>  #endif
> >>
> >>  #include
> >> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> >> index 09c91641c36c..957bb4f427bb 100644
> >> --- a/arch/x86/kernel/kvm.c
> >> +++ b/arch/x86/kernel/kvm.c
> >> @@ -785,6 +785,34 @@ void __arch_free_page(struct page *page, unsigned int order)
> >>  		       PAGE_SIZE << order);
> >>  }
> >>
> >> +void __arch_merge_page(struct zone *zone, struct page *page,
> >> +		       unsigned int order)
> >> +{
> >> +	/*
> >> +	 * The merging logic has merged a set of buddies up to the
> >> +	 * KVM_PV_UNUSED_PAGE_HINT_MIN_ORDER. Since that is the case, take
> >> +	 * advantage of this moment to notify the hypervisor of the free
> >> +	 * memory.
> >> +	 */
> >> +	if (order != KVM_PV_UNUSED_PAGE_HINT_MIN_ORDER)
> >> +		return;
> >> +
> >> +	/*
> >> +	 * Drop zone lock while processing the hypercall. This
> >> +	 * should be safe as the page has not yet been added
> >> +	 * to the buddy list as of yet and all the pages that
> >> +	 * were merged have had their buddy/guard flags cleared
> >> +	 * and their order reset to 0.
> >> +	 */
> >> +	spin_unlock(&zone->lock);
> >> +
> >> +	kvm_hypercall2(KVM_HC_UNUSED_PAGE_HINT, page_to_phys(page),
> >> +		       PAGE_SIZE << order);
> >> +
> >> +	/* reacquire lock and resume freeing memory */
> >> +	spin_lock(&zone->lock);
> >> +}
> >> +
> >>  #ifdef CONFIG_PARAVIRT_SPINLOCKS
> >>
> >>  /* Kick a cpu by its apicid. Used to wake up a halted vcpu */
> >> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> >> index fdab7de7490d..4746d5560193 100644
> >> --- a/include/linux/gfp.h
> >> +++ b/include/linux/gfp.h
> >> @@ -459,6 +459,10 @@ static inline struct zonelist *node_zonelist(int nid, gfp_t flags)
> >>  #ifndef HAVE_ARCH_FREE_PAGE
> >>  static inline void arch_free_page(struct page *page, int order) { }
> >>  #endif
> >> +#ifndef HAVE_ARCH_MERGE_PAGE
> >> +static inline void
> >> +arch_merge_page(struct zone *zone, struct page *page, int order) { }
> >> +#endif
> >>  #ifndef HAVE_ARCH_ALLOC_PAGE
> >>  static inline void arch_alloc_page(struct page *page, int order) { }
> >>  #endif
> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >> index c954f8c1fbc4..7a1309b0b7c5 100644
> >> --- a/mm/page_alloc.c
> >> +++ b/mm/page_alloc.c
> >> @@ -913,6 +913,8 @@ static inline void __free_one_page(struct page *page,
> >>  		page = page + (combined_pfn - pfn);
> >>  		pfn = combined_pfn;
> >>  		order++;
> >> +
> >> +		arch_merge_page(zone, page, order);
> >>  	}
> >>  	if (max_order < MAX_ORDER) {
> >>  		/* If we are here, it means order is >= pageblock_order.
> --
> Regards
> Nitesh
>
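
Editor's illustration, not part of the thread: a minimal user-space sketch
of how a merge hook with an "order != minimum" filter behaves when small
pages coalesce. It is not kernel code and makes several assumptions: the
names sketch_*, the 16-page toy zone, and the value 2 standing in for
KVM_PV_UNUSED_PAGE_HINT_MIN_ORDER are all hypothetical, the zone lock is
not modelled, and the hypercall is replaced by a counter and a printf.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_NR_PAGES   16  /* toy zone size (assumption) */
#define SKETCH_MAX_ORDER   4  /* largest block is 1 << 4 pages (assumption) */
#define SKETCH_HINT_ORDER  2  /* hypothetical stand-in for the minimum hint order */

/* free_at[o][pfn] is true when a free block of order o starts at pfn. */
static bool free_at[SKETCH_MAX_ORDER + 1][SKETCH_NR_PAGES];
static unsigned int hint_count;
static unsigned long last_hint_pfn, last_hint_pages;

/* Forget all free blocks and recorded hints. */
static void sketch_reset(void)
{
	memset(free_at, 0, sizeof(free_at));
	hint_count = 0;
	last_hint_pfn = last_hint_pages = 0;
}

/* Stand-in for the hypercall: only record what would be reported. */
static void sketch_hint(unsigned long pfn, unsigned int order)
{
	hint_count++;
	last_hint_pfn = pfn;
	last_hint_pages = 1UL << order;
	printf("hint: pfn %lu, %lu pages\n", pfn, last_hint_pages);
}

/* Same order filter as the quoted __arch_merge_page(). */
static void sketch_merge_page(unsigned long pfn, unsigned int order)
{
	if (order != SKETCH_HINT_ORDER)
		return;
	sketch_hint(pfn, order);
}

/* Simplified buddy free: merge upward, calling the hook after each merge. */
static void sketch_free_one_page(unsigned long pfn, unsigned int order)
{
	assert(order <= SKETCH_MAX_ORDER);
	assert(pfn < SKETCH_NR_PAGES && (pfn & ((1UL << order) - 1)) == 0);

	while (order < SKETCH_MAX_ORDER) {
		unsigned long buddy_pfn = pfn ^ (1UL << order);

		if (!free_at[order][buddy_pfn])
			break;
		free_at[order][buddy_pfn] = false; /* take buddy off its list */
		pfn &= buddy_pfn;                  /* start of the combined block */
		order++;
		sketch_merge_page(pfn, order);
	}
	free_at[order][pfn] = true;
}
```

Calling sketch_reset() and then freeing pfns 0..7 one at a time at order 0
reports each 4-page block once, at the moment it first reaches the minimum
order. The later merge of those two blocks into an order-3 block passes
through the hook without a hint because of the order filter. A block freed
directly at the minimum order never reaches the hook; in the quoted series
that case is left to arch_free_page().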