X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
References: <20240405164920.2844-1-mcassell411@gmail.com>
From: Matthew Cassell
Date: Fri, 12 Apr 2024
 15:48:16 -0500
Subject: Re: [PATCH] Documentation/admin-guide/sysctl/vm.rst adding the importance of NUMA-node count to documentation
To: Vratislav Bendel
Cc: corbet@lwn.net, akpm@linux-foundation.org, rppt@kernel.org, linux-mm@kvack.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="UTF-8"

Thanks for the feedback. Here is a quick outline I came up with based on
your advice:

[...] (original content)

Keep in mind that enabling bits in zone_reclaim_mode makes the most sense
for topologies consisting of multiple NUMA nodes. In addition to vanilla
zone_reclaim (clean and unmapped pages), there are additional bits that
expand which pages are eligible to be reclaimed and dictate scan_control
policy during the reclaim process. The page allocator will attempt to
reclaim memory locally in accordance with these bits before attempting to
allocate on remote nodes.

Allow dirty pages to become candidates for memory reclaim::

    echo 2 > /proc/sys/vm/zone_reclaim_mode

[...] (original content)

Allow mapped pages to become candidates for memory reclaim::

    echo 4 > /proc/sys/vm/zone_reclaim_mode

[...] (original content)

I'm trying to strike a balance between keeping the original content, being
descriptive, and not going into encyclopedia mode. My motivation was to
stress the importance of NUMA-node count and to describe the additional
bits in more detail, per your advice. I added the echo snippets to segue
into the more aggressive options. Any thoughts on the above?

On Thu, Apr 11, 2024 at 2:54 AM Vratislav Bendel wrote:
>
> On Fri, Apr 5, 2024 at 6:49 PM Matthew Cassell wrote:
> >
> > If any bits are set in node_reclaim_mode (tunable via
> > /proc/sys/vm/zone_reclaim_mode) within get_pages_from_freelist(), then
> > page allocations start getting early access to reclaim via the
> > node_reclaim() code path when memory pressure increases.
> > This behavior provides the most optimization for multiple NUMA node
> > machines. The above is mentioned in:
> >
> > Commit 9eeff2395e3cfd05c9b2e6 ("[PATCH] Zone reclaim: Reclaim logic")
> > states "Zone reclaim is of particular importance for NUMA machines. It
> > can be more beneficial to reclaim a page than taking the performance
> > penalties that come with allocating a page on a REMOTE zone."
> >
> > While the pros/cons of staying on node versus allocating remotely are
> > mentioned in commit histories and mailing lists. It isn't specifically
> > mentioned in Documentation/ and isn't possible with a lone node. Imagine a
> > situation where CONFIG_NUMA=y (the default on most major distributions)
> > and only a single NUMA node exists. The latter is an oxymoron
> > (single-node == uniform memory access). Informing the user via vm.rst that
> > the most bang for their buck is when multiple nodes exist seems helpful.
> >
>
> I agree that the documentation could be improved to better express the
> implications and relevance of setting zone_reclaim_mode bits.
>
> Though I would suggest to go a step further and also elaborate on
> those "additional actions",
> for example something like:
> "The page allocator will attempt to reclaim memory within the zone,
> depending on the bits set,
> before looking for free pages in other zones, namely on remote memory nodes."
>
> > Signed-off-by: Matthew Cassell
> > ---
> >  Documentation/admin-guide/sysctl/vm.rst | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
> > index c59889de122b..10270548af2a 100644
> > --- a/Documentation/admin-guide/sysctl/vm.rst
> > +++ b/Documentation/admin-guide/sysctl/vm.rst
> > @@ -1031,7 +1031,8 @@ Consider enabling one or more zone_reclaim mode bits if it's known that the
> >  workload is partitioned such that each partition fits within a NUMA node
> >  and that accessing remote memory would cause a measurable performance
> >  reduction. The page allocator will take additional actions before
> > -allocating off node pages.
> > +allocating off node pages. Keep in mind enabling bits in zone_reclaim_mode
> > +makes the most sense for topologies consisting of multiple NUMA nodes.
> >
> >  Allowing zone reclaim to write out pages stops processes that are
> >  writing large amounts of data from dirtying pages on other nodes. Zone
> > --
> > 2.34.1
> >
>
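
For readers following the thread, here is a rough sketch (not part of the
patch) of how the zone_reclaim_mode bitmask discussed above could be decoded,
and how one might check whether more than one NUMA node is present. The
function names and label strings are hypothetical. The meaning of bits 2 and 4
is taken from the echo examples in this thread; treating bit 1 as plain zone
reclaim and looking for nodeN directories under /sys/devices/system/node are
assumptions based on the usual kernel layout.

```python
import re
from pathlib import Path

# Bit values and labels (hypothetical wording). Bits 2 and 4 follow the
# thread; bit 1 as "vanilla" zone reclaim is an assumption.
RECLAIM_BITS = {
    1: "zone reclaim enabled (clean, unmapped pages)",
    2: "dirty pages may be written out and reclaimed",
    4: "mapped pages may be reclaimed",
}


def decode_zone_reclaim_mode(value):
    """Return the labels of all bits set in a zone_reclaim_mode value."""
    return [label for bit, label in RECLAIM_BITS.items() if value & bit]


def count_numa_nodes(sysfs_dir="/sys/devices/system/node"):
    """Count nodeN entries under sysfs_dir; return 0 if it does not exist."""
    root = Path(sysfs_dir)
    if not root.is_dir():
        return 0
    return sum(1 for p in root.iterdir() if re.fullmatch(r"node\d+", p.name))


def reclaim_mode_is_useful(value, nodes):
    """Per the thread: setting bits mainly pays off with multiple nodes."""
    return value != 0 and nodes > 1
```

On a live system, the value could be read from
/proc/sys/vm/zone_reclaim_mode, converted with int(), and passed to
decode_zone_reclaim_mode() together with the result of count_numa_nodes().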