Date: Thu, 21 Dec
2023 20:38:40 -0800 (PST)
From: David Rientjes
To: Dan Schatzberg
cc: Johannes Weiner, Roman Gushchin, Yosry Ahmed, Huan Yang,
    linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-mm@kvack.org,
    Tejun Heo, Zefan Li, Jonathan Corbet, Michal Hocko, Shakeel Butt,
    Muchun Song, Andrew Morton, Kefeng Wang, SeongJae Park,
    "Vishal Moola (Oracle)", Nhat Pham, Yue Zhao
Subject: Re: [PATCH v5 2/2] mm: add swapiness= arg to memory.reclaim
In-Reply-To: <20231220152653.3273778-3-schatzberg.dan@gmail.com>
References: <20231220152653.3273778-1-schatzberg.dan@gmail.com>
    <20231220152653.3273778-3-schatzberg.dan@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII

On Wed, 20 Dec 2023, Dan Schatzberg wrote:

> Allow proactive reclaimers to submit an additional swappiness=
> argument to memory.reclaim. This overrides the global or per-memcg
> swappiness setting for that reclaim attempt.
>
> For example:
>
>   echo "2M swappiness=0" > /sys/fs/cgroup/memory.reclaim
>
> will perform reclaim on the rootcg with a swappiness setting of 0 (no
> swap) regardless of the vm.swappiness sysctl setting.
>
> Userspace proactive reclaimers use the memory.reclaim interface to
> trigger reclaim. The memory.reclaim interface does not allow for any way
> to affect the balance of file vs anon during proactive reclaim. The only
> approach is to adjust the vm.swappiness setting. However, there are a
> few reasons we look to control the balance of file vs anon during
> proactive reclaim, separately from reactive reclaim:
>
> * Swapout should be limited to manage SSD write endurance. In near-OOM
>   situations we are fine with lots of swap-out to avoid OOMs. As these are
>   typically rare events, they have relatively little impact on write
>   endurance. However, proactive reclaim runs continuously and so its
>   impact on SSD write endurance is more significant. Therefore it is
>   desirable to control swap-out for proactive reclaim separately from
>   reactive reclaim.
>
> * Some userspace OOM killers like systemd-oomd[1] support OOM killing on
>   swap exhaustion. This makes sense if the swap exhaustion is triggered
>   due to reactive reclaim but less so if it is triggered due to proactive
>   reclaim (e.g. one could see OOMs when free memory is ample but anon is
>   just particularly cold). Therefore, it's desirable to have proactive
>   reclaim reduce or stop swap-out before the threshold at which OOM
>   killing occurs.
>
> In the case of Meta's Senpai proactive reclaimer, we adjust
> vm.swappiness before writes to memory.reclaim[2]. This has been in
> production for nearly two years and has addressed our needs to control
> proactive vs reactive reclaim behavior but is still not ideal for a
> number of reasons:
>
> * vm.swappiness is a global setting; adjusting it can race/interfere
>   with other system administration that wishes to control vm.swappiness.
>   In our case, we need to disable Senpai before adjusting vm.swappiness.
>
> * vm.swappiness is stateful - so a crash or restart of Senpai can leave
>   a misconfigured setting. This requires some additional management to
>   record the "desired" setting and ensure Senpai always adjusts to it.
>
> With this patch, we avoid these downsides of adjusting vm.swappiness
> globally.
>
> [1]https://www.freedesktop.org/software/systemd/man/latest/systemd-oomd.service.html
> [2]https://github.com/facebookincubator/oomd/blob/main/src/oomd/plugins/Senpai.cpp#L585-L598
>
> Signed-off-by: Dan Schatzberg

Acked-by: David Rientjes
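
For readers of the archive, a minimal sketch of how a userspace reclaimer
might issue a single request in the format quoted above. The helper name
reclaim_with_swappiness is hypothetical and not part of the patch, and the
0..200 bound is an assumption that mirrors the documented vm.swappiness
range rather than anything stated in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not from the patch): write one proactive reclaim
# request with a per-request swappiness to a cgroup's memory.reclaim file.
# Usage: reclaim_with_swappiness <cgroup-dir> <amount> <swappiness>
reclaim_with_swappiness() {
    cgroup_dir=$1
    amount=$2
    swappiness=$3

    # Accept only non-negative integers.
    case $swappiness in
        ''|*[!0-9]*)
            echo "swappiness must be a non-negative integer" >&2
            return 2
            ;;
    esac

    # Assumption: same 0..200 range as the vm.swappiness sysctl.
    if [ "$swappiness" -gt 200 ]; then
        echo "swappiness out of range: $swappiness" >&2
        return 2
    fi

    # Same request format as the changelog example: "<amount> swappiness=<n>".
    printf '%s swappiness=%s\n' "$amount" "$swappiness" \
        > "$cgroup_dir/memory.reclaim"
}
```

Calling reclaim_with_swappiness /sys/fs/cgroup 2M 0 would issue the same
write as the echo example in the changelog. Because the value travels with
each request, nothing global needs to be saved and restored around the
write, which is the downside of the vm.swappiness approach described above.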