Received: by 2002:a25:5b86:0:0:0:0:0 with SMTP id p128csp1050988ybb; Thu, 28 Mar 2019 18:29:46 -0700 (PDT) X-Google-Smtp-Source: APXvYqywezJikct7TkX1fVADkQ2n+CJ4S0lj0zd4RjAEZabOK8aqdDibTRJMh2iH7xmykS/cuWhU X-Received: by 2002:aa7:8a87:: with SMTP id a7mr45735693pfc.252.1553822986390; Thu, 28 Mar 2019 18:29:46 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1553822986; cv=none; d=google.com; s=arc-20160816; b=XPaEisa0AlW3OGtzvNdlapJrL6ETR8lkEYYCqQaBPfIYw7o4PzmCwcOfX0LI+KiYIo mIHYTAqCMBDDYNBLcCWrUT2tF8fMvaUD8JSLSyBKEKrXlUSpzkT3Z4QxfcRtUVU4/2Oc s0fn7Dtf+BV6rzWAj3onOLHnYRax4mqSArWLDO2LlO5FkIcJS4tYaZDunelyiQsIKhxK 28YuehmZpMt0BQJ32dW0/aoHLQHZToWggtTk5ZO1IV11lvDAfNRwV32UJqJgEXSFQ5Aw OVbyhrvVls0FzQTO1OgdADdj5i53h2xFd4rDwgnA8+R5vLByoQe2rjM3AWbCdPbjIaE1 ADEA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:from:subject:mime-version :message-id:date:dkim-signature; bh=4T5MUEH2ZwoDYdu9ANRKZqoThspT7kSM7OLP02B0X0U=; b=nCIb9UF/jdBt4uFFl221plhsfNWEUtZWHcxR6VxhLZgMpVr3pKYflBVNJA/BuEH6wH sg9A8jSUVZPRrrJSQDWmlRGLjV2SrcPEFYMfOi9pYDJgwKikiSsEQXApbUQuYvVzllkT mJ/54yS670TJEwWibzbP6McfMxCmDyugVwpKOabmrxh1ctUB7fui1ujxJbIqiLv5+M+T ATS6r7oopXZ+UKSMe1Pl2YQR6W23gnlsnpSxcUx+TYU88AxsVIpiOzzYYO5voV/NWpnB 2eK6z6iMdlN88ByYgW8KoO4GtdbYUfHPG71R3jIjprm9r78rdNJbrsxFCOZS8ss7WAXk SssQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=MCNu8JKS; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id y18si544227pfe.72.2019.03.28.18.29.28; Thu, 28 Mar 2019 18:29:46 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=MCNu8JKS; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728359AbfC2B2v (ORCPT + 99 others); Thu, 28 Mar 2019 21:28:51 -0400 Received: from mail-io1-f73.google.com ([209.85.166.73]:40220 "EHLO mail-io1-f73.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727675AbfC2B2v (ORCPT ); Thu, 28 Mar 2019 21:28:51 -0400 Received: by mail-io1-f73.google.com with SMTP id e72so550194iof.7 for ; Thu, 28 Mar 2019 18:28:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:message-id:mime-version:subject:from:to:cc; bh=4T5MUEH2ZwoDYdu9ANRKZqoThspT7kSM7OLP02B0X0U=; b=MCNu8JKS44wPWIpPESaw89q4D2UcNs/MhIuoQmsdXKySRXIQldKv70crIqexGYEGdG DqvNejtNzC3Euo8Y/Y3E6xkLNC1XPJvK2eC/eOgolWclv2eFd5YN0OV4AufeqO4b6o76 GMzb4pZSO6M2U/RH7vNxuuexqDgNPUNXjGlNxLbUsrN0uP02xbs51KEtn7HXtw9CFi2T XAStb8JepS5oASqG4EWul/lHaJmmrlrvgMzj1s162xTBqmnTlLdP1hNXTllD7umlIDSi qAdGJxIdbbPKbsAa4BSXrzRMnQZH+RYoXxDLRLCbWrzCHn1WwE9o26sQ/wD67LrzQp7j Jonw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:message-id:mime-version:subject:from:to:cc; bh=4T5MUEH2ZwoDYdu9ANRKZqoThspT7kSM7OLP02B0X0U=; b=ggV1ItNFoJ+hcBVyti5J8KmKT43/1gupYFuLjavr+XVa3A5GFpFzUNBMdYnWdX9gTa 6Knc3myjV+Ee3LuQu/z7P1UY/Y0EGkQrGbSU6zbG4KCqxxUfj0yd9F7hu4z/1vVsibY4 Ldj+Sv6ekocN9dMElaBTo1gUs1eqpvt3/yV3PnijjCaQMy35W3Lcua+3PP8Cm+XKlpb7 i4RzIabALQOiaiFr0EOH75vbaes4gJunXEmqaIk4Qhl1jvtkQhKs7nme68MUGBxHDhoA xqR54qremYR4V3phnIaIZfYi2oLAq+cp76uRX55CCM+PGB7uexnKYoS1mLxMWDT7HcuQ P0TA== X-Gm-Message-State: APjAAAX3YqPfwj+r+LJIM/sF1IcIg0XEt1TgKtj9eoKbJkqNYqI5kebS uGGumKNN7IPfZXB3GNPppJDLLc5N8xcOOg== X-Received: by 2002:a24:9884:: with SMTP id n126mr919316itd.30.1553822930204; Thu, 28 Mar 2019 18:28:50 -0700 (PDT) Date: Thu, 28 Mar 2019 18:28:36 -0700 Message-Id: <20190329012836.47013-1-shakeelb@google.com> Mime-Version: 1.0 X-Mailer: git-send-email 2.21.0.392.gf8f6787159e-goog Subject: [RFC PATCH] mm, kvm: account kvm_vcpu_mmap to kmemcg From: Shakeel Butt To: Johannes Weiner , Vladimir Davydov , Michal Hocko , Andrew Morton , Matthew Wilcox , Paolo Bonzini , Ben Gardon , "=?UTF-8?q?Radim=20Kr=C4=8Dm=C3=A1=C5=99?=" Cc: linux-mm@kvack.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Shakeel Butt Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org A VCPU of a VM can allocate upto three pages which can be mmap'ed by the user space application. At the moment this memory is not charged. On a large machine running large number of VMs (or small number of VMs having large number of VCPUs), this unaccounted memory can be very significant. So, this memory should be charged to a kmemcg. However that is not possible as these pages are mmapped to the userspace and PageKmemcg() was designed with the assumption that such pages will never be mmapped to the userspace. One way to solve this problem is by introducing an additional memcg charging API similar to mem_cgroup_[un]charge_skmem(). However skmem charging API usage is contained and shared and no new users are expected but the pages which can be mmapped and should be charged to kmemcg can and will increase. So, requiring the usage for such API will increase the maintenance burden. The simplest solution is to remove the assumption of no mmapping PageKmemcg() pages to user space. Signed-off-by: Shakeel Butt --- arch/s390/kvm/kvm-s390.c | 2 +- arch/x86/kvm/x86.c | 2 +- include/linux/page-flags.h | 26 ++++++++++++++++++-------- include/trace/events/mmflags.h | 9 ++++++++- virt/kvm/coalesced_mmio.c | 2 +- virt/kvm/kvm_main.c | 2 +- 6 files changed, 30 insertions(+), 13 deletions(-) diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index 4638303ba6a8..1a9e337ed5da 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -2953,7 +2953,7 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm, goto out; BUILD_BUG_ON(sizeof(struct sie_page) != 4096); - sie_page = (struct sie_page *) get_zeroed_page(GFP_KERNEL); + sie_page = (struct sie_page *) get_zeroed_page(GFP_KERNEL_ACCOUNT); if (!sie_page) goto out_free_cpu; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 65e4559eef2f..05c0c7eaa5c6 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9019,7 +9019,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu) else vcpu->arch.mp_state = KVM_MP_STATE_UNINITIALIZED; - page = alloc_page(GFP_KERNEL | __GFP_ZERO); + page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO); if (!page) { r = -ENOMEM; goto fail; diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h index 9f8712a4b1a5..b47a6a327d6a 100644 --- a/include/linux/page-flags.h +++ b/include/linux/page-flags.h @@ -78,6 +78,10 @@ * PG_hwpoison indicates that a page got corrupted in hardware and contains * data with incorrect ECC bits that triggered a machine check. Accessing is * not safe since it may cause another machine check. Don't touch! + * + * PG_kmemcg indicates that a kmem page is charged to a memcg. If kmemcg is + * enabled, the page allocator will set PageKmemcg() on pages allocated with + * __GFP_ACCOUNT. It gets cleared on page free. */ /* @@ -130,6 +134,9 @@ enum pageflags { #if defined(CONFIG_IDLE_PAGE_TRACKING) && defined(CONFIG_64BIT) PG_young, PG_idle, +#endif +#ifdef CONFIG_MEMCG_KMEM + PG_kmemcg, #endif __NR_PAGEFLAGS, @@ -289,6 +296,9 @@ static inline int Page##uname(const struct page *page) { return 0; } #define SETPAGEFLAG_NOOP(uname) \ static inline void SetPage##uname(struct page *page) { } +#define __SETPAGEFLAG_NOOP(uname) \ +static inline void __SetPage##uname(struct page *page) { } + #define CLEARPAGEFLAG_NOOP(uname) \ static inline void ClearPage##uname(struct page *page) { } @@ -427,6 +437,13 @@ TESTCLEARFLAG(Young, young, PF_ANY) PAGEFLAG(Idle, idle, PF_ANY) #endif +#ifdef CONFIG_MEMCG_KMEM +__PAGEFLAG(Kmemcg, kmemcg, PF_NO_TAIL) +#else +TESTPAGEFLAG_FALSE(kmemcg) +__SETPAGEFLAG_NOOP(kmemcg) +__CLEARPAGEFLAG_NOOP(kmemcg) +#endif /* * On an anonymous page mapped into a user virtual memory area, * page->mapping points to its anon_vma, not to a struct address_space; @@ -701,8 +718,7 @@ PAGEFLAG_FALSE(DoubleMap) #define PAGE_MAPCOUNT_RESERVE -128 #define PG_buddy 0x00000080 #define PG_offline 0x00000100 -#define PG_kmemcg 0x00000200 -#define PG_table 0x00000400 +#define PG_table 0x00000200 #define PageType(page, flag) \ ((page->page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE) @@ -743,12 +759,6 @@ PAGE_TYPE_OPS(Buddy, buddy) */ PAGE_TYPE_OPS(Offline, offline) -/* - * If kmemcg is enabled, the buddy allocator will set PageKmemcg() on - * pages allocated with __GFP_ACCOUNT. It gets cleared on page free. - */ -PAGE_TYPE_OPS(Kmemcg, kmemcg) - /* * Marks pages in use as page tables. */ diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h index a1675d43777e..d93b78eac5b9 100644 --- a/include/trace/events/mmflags.h +++ b/include/trace/events/mmflags.h @@ -79,6 +79,12 @@ #define IF_HAVE_PG_IDLE(flag,string) #endif +#ifdef CONFIG_MEMCG_KMEM +#define IF_HAVE_PG_KMEMCG(flag,string) ,{1UL << flag, string} +#else +#define IF_HAVE_PG_KMEMCG(flag,string) +#endif + #define __def_pageflag_names \ {1UL << PG_locked, "locked" }, \ {1UL << PG_waiters, "waiters" }, \ @@ -105,7 +111,8 @@ IF_HAVE_PG_MLOCK(PG_mlocked, "mlocked" ) \ IF_HAVE_PG_UNCACHED(PG_uncached, "uncached" ) \ IF_HAVE_PG_HWPOISON(PG_hwpoison, "hwpoison" ) \ IF_HAVE_PG_IDLE(PG_young, "young" ) \ -IF_HAVE_PG_IDLE(PG_idle, "idle" ) +IF_HAVE_PG_IDLE(PG_idle, "idle" ) \ +IF_HAVE_PG_KMEMCG(PG_kmemcg, "kmemcg" ) #define show_page_flags(flags) \ (flags) ? __print_flags(flags, "|", \ diff --git a/virt/kvm/coalesced_mmio.c b/virt/kvm/coalesced_mmio.c index 5294abb3f178..ebf1601de2a5 100644 --- a/virt/kvm/coalesced_mmio.c +++ b/virt/kvm/coalesced_mmio.c @@ -110,7 +110,7 @@ int kvm_coalesced_mmio_init(struct kvm *kvm) int ret; ret = -ENOMEM; - page = alloc_page(GFP_KERNEL | __GFP_ZERO); + page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO); if (!page) goto out_err; diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index f25aa98a94df..de6328dff251 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -306,7 +306,7 @@ int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id) vcpu->pre_pcpu = -1; INIT_LIST_HEAD(&vcpu->blocked_vcpu_list); - page = alloc_page(GFP_KERNEL | __GFP_ZERO); + page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO); if (!page) { r = -ENOMEM; goto fail; -- 2.21.0.392.gf8f6787159e-goog