Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
References: <20240220214558.3377482-1-souravpanda@google.com> <20240220214558.3377482-2-souravpanda@google.com>
From: Pasha Tatashin
Date: Tue, 19 Mar 2024 10:25:30 -0400
Subject: Re:
 [PATCH v9 1/1] mm: report per-page metadata information
To: akpm@linux-foundation.org
Cc: Sourav Panda, corbet@lwn.net, gregkh@linuxfoundation.org,
 rafael@kernel.org, mike.kravetz@oracle.com, muchun.song@linux.dev,
 rppt@kernel.org, david@redhat.com, rdunlap@infradead.org,
 chenlinxuan@uniontech.com, yang.yang29@zte.com.cn,
 tomas.mudrunka@gmail.com, bhelgaas@google.com, ivan@cloudflare.com,
 yosryahmed@google.com, hannes@cmpxchg.org, shakeelb@google.com,
 kirill.shutemov@linux.intel.com, wangkefeng.wang@huawei.com,
 adobriyan@gmail.com, vbabka@suse.cz, Liam.Howlett@oracle.com,
 surenb@google.com, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-mm@kvack.org, willy@infradead.org, weixugc@google.com
Content-Type: text/plain; charset="UTF-8"

On Wed, Mar 13, 2024 at 6:40 PM Pasha Tatashin wrote:
>
> On Tue, Feb 20, 2024 at 4:46 PM Sourav Panda wrote:
> >
> > Adds two new per-node fields, namely nr_memmap and nr_memmap_boot,
> > to /sys/devices/system/node/nodeN/vmstat and a global Memmap field
> > to /proc/meminfo. This information can be used by users to see how
> > much memory is being used by per-page metadata, which can vary
> > depending on build configuration, machine architecture, and system
> > use.
> >
> > Per-page metadata is the amount of memory that Linux needs in order to
> > manage memory at the page granularity. The majority of such memory is
> > used by "struct page" and "page_ext" data structures. In contrast to
> > most other memory consumption statistics, per-page metadata might not
> > be included in MemTotal. For example, MemTotal does not include memblock
> > allocations but includes buddy allocations. In this patch, exported
> > field nr_memmap in /sys/devices/system/node/nodeN/vmstat would
> > exclusively track buddy allocations while nr_memmap_boot would
> > exclusively track memblock allocations.
> > Furthermore, Memmap in
> > /proc/meminfo would exclusively track buddy allocations allowing it to
> > be compared against MemTotal.
> >
> > This memory depends on build configurations, machine architectures, and
> > the way system is used:
> >
> > Build configuration may include extra fields into "struct page",
> > and enable / disable "page_ext"
> > Machine architecture defines base page sizes. For example 4K x86,
> > 8K SPARC, 64K ARM64 (optionally), etc. The per-page metadata
> > overhead is smaller on machines with larger page sizes.
> > System use can change per-page overhead by using vmemmap
> > optimizations with hugetlb pages, and emulated pmem devdax pages.
> > Also, boot parameters can determine whether page_ext is needed
> > to be allocated. This memory can be part of MemTotal or be outside
> > MemTotal depending on whether the memory was hot-plugged, booted with,
> > or hugetlb memory was returned back to the system.
> >
> > Utility for userspace:
> >
> > Application Optimization: Depending on the kernel version and command
> > line options, the kernel would relinquish a different number of pages
> > (that contain struct pages) when a hugetlb page is reserved (e.g., 0, 6
> > or 7 for a 2MB hugepage). The userspace application would want to know
> > the exact savings achieved through page metadata deallocation without
> > dealing with the intricacies of the kernel.
> >
> > Observability: Struct page overhead can only be calculated on-paper at
> > boot time (e.g., 1.5% machine capacity). Beyond boot once hugepages are
> > reserved or memory is hotplugged, the computation becomes complex.
> > Per-page metrics will help explain part of the system memory overhead,
> > which shall help guide memory optimizations and memory cgroup sizing.
> >
> > Debugging: Tracking the changes or absolute value in struct pages can
> > help detect anomalies as they can be correlated with other metrics in
> > the machine (e.g., memtotal, number of huge pages, etc).
> >
> > page_ext overheads: Some kernel features such as page_owner
> > page_table_check that use page_ext can be optionally enabled via kernel
> > parameters. Having the total per-page metadata information helps users
> > precisely measure impact.

Hi Andrew,

Can you please give this patch another look? Does it require more reviews
before you can take it in?

Thank you,
Pasha