Received: by 2002:a05:6a10:22f:0:0:0:0 with SMTP id 15csp2514963pxk; Mon, 14 Sep 2020 15:58:53 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxErNNExaHvi6TFJfwW9d8HZSDcjiusHtXCkVRE+Rly4Dkrgq4tfbfV4ypGi3eCeRtTxtEU X-Received: by 2002:a17:906:6d92:: with SMTP id h18mr16717424ejt.405.1600124333041; Mon, 14 Sep 2020 15:58:53 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1600124333; cv=none; d=google.com; s=arc-20160816; b=rg6yKPgJ3ICRlaFtryUTjQ6HT5O3qPS7FZucJ4hDP8dSUj6f/zFfcFR53deuZNHlYg DO5DNS/NngRX5k5i5S3QeSVoW41p4ReY9rzIKQ/IpEyNAMI0Bm4MvxTJZWhRirwFPycC Po31aGMBbHY863dfuN4vA7+DPs9zbvTp+Ojj04lgM4i/kNAsanUJfTEICpuY5szI5MOb 2o1qyqYEwfvl+KnUM06/qbKmT2N0Vi/Y85+MQixIpYUM5RvJYJFEFnouKMAniZ3otoiw QIWvvH5nOF5WWLxkx5MPQ8caOWLSFsfmxiDmj1sc/7e4WaVeHKEMZQJ/niNI3kdqrYV/ 9/NA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=WViTl0b8x+giGHFZh+rVEKrUPQoCKo+J/50WQCvSoeU=; b=GUu+hCg7WgCJK+53cV1CaU4J4MNXSjuvjNep3wFIUgH6Vu5NbjsofW3mVPNf3c0r8x 7gxkhCTUlokNaQwKWz+YZSoodI3w5+Y3aa5nXzDqdDaAhYyTeLvShEUK8xWF4Eb/cOOk 1MykZbKEm1BjNqGX5b+whKf4F9kfvwioYJFzzAEvvqyWOHKEg3r2HNsRVNfQxieJWMkR saLKHuIsvYxji82tGaTyS1NjJkf3H23wBzPEehMqVCvQvbE3CezYkrRgnEU5rfMw4h7I 6VAyo/GxqPh4xUbTq7hV3NQDeM6pefYfc2Kp+Y06OWoPr0CdEKizhgu+AUPrat97wJek uhCg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b="cRqX/Voi"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id me14si8238275ejb.397.2020.09.14.15.58.30; Mon, 14 Sep 2020 15:58:53 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b="cRqX/Voi"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725985AbgINW5i (ORCPT + 99 others); Mon, 14 Sep 2020 18:57:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60228 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725926AbgINW5f (ORCPT ); Mon, 14 Sep 2020 18:57:35 -0400 Received: from mail-lf1-x141.google.com (mail-lf1-x141.google.com [IPv6:2a00:1450:4864:20::141]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BEAE5C06174A for ; Mon, 14 Sep 2020 15:57:33 -0700 (PDT) Received: by mail-lf1-x141.google.com with SMTP id d15so997714lfq.11 for ; Mon, 14 Sep 2020 15:57:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=WViTl0b8x+giGHFZh+rVEKrUPQoCKo+J/50WQCvSoeU=; b=cRqX/VoiJlNpAl8z1euxYLDbFZouqQx0Mt33brymprI2vtjrByuRMCH42vuV/btT6h OV05p0Cytn9A45Ob2T8JxSWLWKRCcbNClVcFRUdmCfOa5zrsHmKyYMsHNY/6dZ9gV+g0 Z8IZWuFtKTlqacd2ikYCTZeFd+UdvXSiw1lsPVPZAE9nGhBRYsYW47ufhafLdKP5LGOU ++fzP4+9m0MsmjaQjTV9Ks/RJm7Nz6m0NC2meTZOjUOeMz4gwIKUxHViVTr8n5pMZ71D mf4ZaigkjBXHPrm1KGiAyFnK+ZPMozHVtg89PLTj6Al/4lCyQjxTgfjPoc3+LROM/sk1 5Rvg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=WViTl0b8x+giGHFZh+rVEKrUPQoCKo+J/50WQCvSoeU=; b=OHcOKFyqSNJ3I8sx7xFFDLtyDUmTwYJCx+/RSAhrV4HU1FbmPxwn/+w8eTG6W35vIQ HFv+eV0PksbTSzXwrtajBpuKzUcuBEVkHrNgy1hH2U678VeMuDdnIn+mz8tW4W5Pk897 0ehSZPjAF6D2LGOZ+o6K+Mv7ThzFSGkQbVmAKP7aiT+voX9aT5qlbpGPaAqgXg0sKD1+ uuks+vBNQwCyrIVosrtyPOuXNSMRzBd+hXB+iUcGA31U6Rc23Wtu8JWu3nHDOg9lYJfw pv/tw9MSv72o8CLTzkBDWlPiLL7P/KNChGHE1h6AwYnXWD+0/onB7dFRKeQZf9El8bkk OVMg== X-Gm-Message-State: AOAM530TVHXdZqS+nxKGqJM7TZEDMFYKR6AstiY+3LwpgCS8YT/8QIcS CHtVzPA8wnvjdJy7eR2r2dm9cu/z9A1m0TLvhh34jQ== X-Received: by 2002:a19:4084:: with SMTP id n126mr2701402lfa.54.1600124251790; Mon, 14 Sep 2020 15:57:31 -0700 (PDT) MIME-Version: 1.0 References: <20200913070010.44053-1-songmuchun@bytedance.com> In-Reply-To: From: Shakeel Butt Date: Mon, 14 Sep 2020 15:57:20 -0700 Message-ID: Subject: Re: [External] Re: [PATCH v3] mm: memcontrol: Add the missing numa_stat interface for cgroup v2 To: Muchun Song Cc: Tejun Heo , Li Zefan , Johannes Weiner , Jonathan Corbet , Michal Hocko , Vladimir Davydov , Andrew Morton , Roman Gushchin , Cgroups , linux-doc@vger.kernel.org, LKML , Linux MM , kernel test robot Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Sep 14, 2020 at 9:55 AM Muchun Song wrote: > > On Tue, Sep 15, 2020 at 12:07 AM Shakeel Butt wrote: > > > > On Sun, Sep 13, 2020 at 12:01 AM Muchun Song wrote: > > > > > > In the cgroup v1, we have a numa_stat interface. This is useful for > > > providing visibility into the numa locality information within an > > > memcg since the pages are allowed to be allocated from any physical > > > node. One of the use cases is evaluating application performance by > > > combining this information with the application's CPU allocation. > > > But the cgroup v2 does not. So this patch adds the missing information. > > > > > > Signed-off-by: Muchun Song > > > Suggested-by: Shakeel Butt > > > Reported-by: kernel test robot > > > --- > > [snip] > > > + > > > +static struct numa_stat numa_stats[] = { > > > + { "anon", PAGE_SIZE, NR_ANON_MAPPED }, > > > + { "file", PAGE_SIZE, NR_FILE_PAGES }, > > > + { "kernel_stack", 1024, NR_KERNEL_STACK_KB }, > > > + { "shmem", PAGE_SIZE, NR_SHMEM }, > > > + { "file_mapped", PAGE_SIZE, NR_FILE_MAPPED }, > > > + { "file_dirty", PAGE_SIZE, NR_FILE_DIRTY }, > > > + { "file_writeback", PAGE_SIZE, NR_WRITEBACK }, > > > +#ifdef CONFIG_TRANSPARENT_HUGEPAGE > > > + /* > > > + * The ratio will be initialized in numa_stats_init(). Because > > > + * on some architectures, the macro of HPAGE_PMD_SIZE is not > > > + * constant(e.g. powerpc). > > > + */ > > > + { "anon_thp", 0, NR_ANON_THPS }, > > > +#endif > > > + { "inactive_anon", PAGE_SIZE, NR_INACTIVE_ANON }, > > > + { "active_anon", PAGE_SIZE, NR_ACTIVE_ANON }, > > > + { "inactive_file", PAGE_SIZE, NR_INACTIVE_FILE }, > > > + { "active_file", PAGE_SIZE, NR_ACTIVE_FILE }, > > > + { "unevictable", PAGE_SIZE, NR_UNEVICTABLE }, > > > + { "slab_reclaimable", 1, NR_SLAB_RECLAIMABLE_B }, > > > + { "slab_unreclaimable", 1, NR_SLAB_UNRECLAIMABLE_B }, > > > +}; > > > + > > > +static int __init numa_stats_init(void) > > > +{ > > > + int i; > > > + > > > + for (i = 0; i < ARRAY_SIZE(numa_stats); i++) { > > > +#ifdef CONFIG_TRANSPARENT_HUGEPAGE > > > + if (numa_stats[i].idx == NR_ANON_THPS) > > > + numa_stats[i].ratio = HPAGE_PMD_SIZE; > > > +#endif > > > + } > > > > The for loop seems excessive but I don't really have a good alternative. > > Yeah, I also have no good alternative. The numa_stats is only initialized > once. So there may be no problem :). > > > > > > + > > > + return 0; > > > +} > > > +pure_initcall(numa_stats_init); > > > + > > > +static unsigned long memcg_node_page_state(struct mem_cgroup *memcg, > > > + unsigned int nid, > > > + enum node_stat_item idx) > > > +{ > > > + VM_BUG_ON(nid >= nr_node_ids); > > > + return lruvec_page_state(mem_cgroup_lruvec(memcg, NODE_DATA(nid)), idx); > > > +} > > > + > > > +static const char *memory_numa_stat_format(struct mem_cgroup *memcg) > > > +{ > > > + int i; > > > + struct seq_buf s; > > > + > > > + /* Reserve a byte for the trailing null */ > > > + seq_buf_init(&s, kmalloc(PAGE_SIZE, GFP_KERNEL), PAGE_SIZE - 1); > > > + if (!s.buffer) > > > + return NULL; > > > + > > > + for (i = 0; i < ARRAY_SIZE(numa_stats); i++) { > > > + int nid; > > > + > > > + seq_buf_printf(&s, "%s", numa_stats[i].name); > > > + for_each_node_state(nid, N_MEMORY) { > > > + u64 size; > > > + > > > + size = memcg_node_page_state(memcg, nid, > > > + numa_stats[i].idx); > > > + size *= numa_stats[i].ratio; > > > + seq_buf_printf(&s, " N%d=%llu", nid, size); > > > + } > > > + seq_buf_putc(&s, '\n'); > > > + } > > > + > > > + /* The above should easily fit into one page */ > > > + if (WARN_ON_ONCE(seq_buf_putc(&s, '\0'))) > > > + s.buffer[PAGE_SIZE - 1] = '\0'; > > > > I think you should follow Michal's recommendation at > > http://lkml.kernel.org/r/20200914115724.GO16999@dhcp22.suse.cz > > Here is different, because the seq_buf_putc(&s, '\n') will not add \0 unless > we use seq_buf_puts(&s, "\n"). > Why a separate memory_numa_stat_format()? For memory_stat_format(), it is called from two places. There is no need to have a separate memory_numa_stat_format(). Similarly why not just call seq_printf() instead of formatting into a seq_buf?