Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
References: <20240220214558.3377482-1-souravpanda@google.com>
 <20240220214558.3377482-2-souravpanda@google.com>
In-Reply-To: <20240220214558.3377482-2-souravpanda@google.com>
From: Pasha Tatashin
Date: Wed, 13 Mar 2024 18:40:03 -0400
Subject: Re: [PATCH v9 1/1] mm: report
 per-page metadata information
To: Sourav Panda
Cc: corbet@lwn.net, gregkh@linuxfoundation.org, rafael@kernel.org,
 akpm@linux-foundation.org, mike.kravetz@oracle.com,
 muchun.song@linux.dev, rppt@kernel.org, david@redhat.com,
 rdunlap@infradead.org, chenlinxuan@uniontech.com,
 yang.yang29@zte.com.cn, tomas.mudrunka@gmail.com, bhelgaas@google.com,
 ivan@cloudflare.com, yosryahmed@google.com, hannes@cmpxchg.org,
 shakeelb@google.com, kirill.shutemov@linux.intel.com,
 wangkefeng.wang@huawei.com, adobriyan@gmail.com, vbabka@suse.cz,
 Liam.Howlett@oracle.com, surenb@google.com,
 linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-doc@vger.kernel.org, linux-mm@kvack.org, willy@infradead.org,
 weixugc@google.com
Content-Type: text/plain; charset="UTF-8"

On Tue, Feb 20, 2024 at 4:46 PM Sourav Panda wrote:
>
> Adds two new per-node fields, namely nr_memmap and nr_memmap_boot,
> to /sys/devices/system/node/nodeN/vmstat and a global Memmap field
> to /proc/meminfo. This information can be used by users to see how
> much memory is being used by per-page metadata, which can vary
> depending on build configuration, machine architecture, and system
> use.
>
> Per-page metadata is the amount of memory that Linux needs in order to
> manage memory at the page granularity. The majority of such memory is
> used by "struct page" and "page_ext" data structures. In contrast to
> most other memory consumption statistics, per-page metadata might not
> be included in MemTotal. For example, MemTotal does not include memblock
> allocations but includes buddy allocations. In this patch, exported
> field nr_memmap in /sys/devices/system/node/nodeN/vmstat would
> exclusively track buddy allocations while nr_memmap_boot would
> exclusively track memblock allocations. Furthermore, Memmap in
> /proc/meminfo would exclusively track buddy allocations allowing it to
> be compared against MemTotal.
>
> This memory depends on build configurations, machine architectures, and
> the way the system is used:
>
> Build configuration may include extra fields into "struct page",
> and enable / disable "page_ext".
> Machine architecture defines base page sizes. For example, 4K x86,
> 8K SPARC, 64K ARM64 (optionally), etc. The per-page metadata
> overhead is smaller on machines with larger page sizes.
> System use can change per-page overhead by using vmemmap
> optimizations with hugetlb pages, and emulated pmem devdax pages.
> Also, boot parameters can determine whether page_ext needs
> to be allocated. This memory can be part of MemTotal or be outside
> MemTotal depending on whether the memory was hot-plugged, booted with,
> or hugetlb memory was returned to the system.
>
> Utility for userspace:
>
> Application Optimization: Depending on the kernel version and command
> line options, the kernel would relinquish a different number of pages
> (that contain struct pages) when a hugetlb page is reserved (e.g., 0, 6
> or 7 for a 2MB hugepage). The userspace application would want to know
> the exact savings achieved through page metadata deallocation without
> dealing with the intricacies of the kernel.
>
> Observability: Struct page overhead can only be calculated on-paper at
> boot time (e.g., 1.5% machine capacity). Beyond boot, once hugepages are
> reserved or memory is hotplugged, the computation becomes complex.
> Per-page metrics will help explain part of the system memory overhead,
> which shall help guide memory optimizations and memory cgroup sizing.
>
> Debugging: Tracking the changes or absolute value in struct pages can
> help detect anomalies as they can be correlated with other metrics in
> the machine (e.g., MemTotal, number of huge pages, etc).
>
> page_ext overheads: Some kernel features such as page_owner and
> page_table_check that use page_ext can be optionally enabled via kernel
> parameters.
> Having the total per-page metadata information helps users
> precisely measure impact.
>
> Suggested-by: Pasha Tatashin
> Signed-off-by: Sourav Panda

Reviewed-by: Pasha Tatashin
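
For reference, the on-paper numbers quoted in the description (about 1.5%
of machine capacity, and 0, 6 or 7 pages relinquished per 2MB hugepage)
can be reproduced with a short calculation. This is only a sketch under
stated assumptions: it assumes a 64-byte struct page and a 4K base page,
which is typical for x86-64 but is not stated in the patch, and the
function names are hypothetical, chosen for illustration only.

```python
# Rough, on-paper estimate of struct page overhead.
# Assumptions (not taken from the patch): sizeof(struct page) == 64 bytes
# and a 4 KiB base page, as is typical on x86-64.

STRUCT_PAGE_SIZE = 64    # bytes, assumed
BASE_PAGE_SIZE = 4096    # bytes, assumed


def struct_page_overhead_percent(struct_page_size=STRUCT_PAGE_SIZE,
                                 base_page_size=BASE_PAGE_SIZE):
    """Share of memory consumed by struct page, as a percentage."""
    return 100.0 * struct_page_size / base_page_size


def vmemmap_pages_per_hugepage(hugepage_size,
                               struct_page_size=STRUCT_PAGE_SIZE,
                               base_page_size=BASE_PAGE_SIZE):
    """Base pages needed to hold the struct pages describing one hugepage."""
    nr_struct_pages = hugepage_size // base_page_size
    return nr_struct_pages * struct_page_size // base_page_size


def vmemmap_bytes_saved(nr_hugepages, pages_freed_per_hugepage,
                        base_page_size=BASE_PAGE_SIZE):
    """Bytes of per-page metadata returned when vmemmap pages are freed."""
    return nr_hugepages * pages_freed_per_hugepage * base_page_size


if __name__ == "__main__":
    two_mb = 2 * 1024 * 1024
    print(f"struct page overhead: {struct_page_overhead_percent():.4f}%")
    print(f"vmemmap pages per 2MB hugepage: "
          f"{vmemmap_pages_per_hugepage(two_mb)}")
    # The description mentions 0, 6 or 7 pages freed per 2MB hugepage.
    for freed in (0, 6, 7):
        saved_mib = vmemmap_bytes_saved(1024, freed) / (1024 * 1024)
        print(f"1024 hugepages, {freed} pages freed each: {saved_mib:.0f} MiB")
```

Under these assumptions a 2MB hugepage is described by 8 base pages of
struct pages, so freeing 6 or 7 of them leaves 2 or 1 in place. Once
hugepages are reserved or memory is hot-plugged, the estimate above stops
matching reality, which is where the exported counters come in.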