From: Yosry Ahmed
Date: Mon, 31 Oct 2022 09:46:00 -0700
Subject: Re: [PATCH] mm: vmscan: split khugepaged stats from direct reclaim stats
To: Johannes Weiner
Cc: Yang Shi, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Eric Bergen

On Mon, Oct 31, 2022 at 9:00 AM Johannes Weiner wrote:
>
> On Fri, Oct 28, 2022 at 10:41:17AM -0700, Yosry Ahmed wrote:
> > On Fri, Oct 28, 2022 at 7:39 AM Johannes Weiner wrote:
> > > pgscan_user: User-requested reclaim.
> > > Could be confusing if we ever
> > > have an in-kernel proactive reclaim driver - unless that would then go
> > > to another counter (new or kswapd).
> > >
> > > pgscan_ext: Reclaim activity from extraordinary/external
> > > requests. External as in: outside the allocation context.
> >
> > I imagine if the kernel is doing proactive reclaim on its own, we
> > might want a separate counter for that anyway to monitor what the
> > kernel is doing. So maybe pgscan_user sounds nice for now, but I also
> > like that the latter explicitly says "this is external to the
> > allocation context". But we can just go with pgscan_user and document
> > it properly.
>
> Yes, I think you're right. pgscan_user sounds good to me.
>
> > How would khugepaged fit in this story? Seems like it would be part of
> > pgscan_ext but not pgscan_user. I imagine we also don't want to
> > pollute proactive reclaim counters with khugepaged reclaim (or other
> > non-direct reclaim).
> >
> > Maybe pgscan_user and pgscan_kernel/pgscan_indirect for things like khugepaged?
> > The problem with pgscan_kernel/indirect is that if we add a proactive
> > reclaim kthread in the future it would technically fit there but we
> > would want a separate counter for it.
> >
> > I am honestly not sure where to put khugepaged. The reasons I don't
> > like a dedicated counter for khugepaged are:
> > - What if other kthreads like khugepaged start doing the same, do we
> > add one counter per-thread?
>
> It's unlikely there will be more.
>
> The reason khugepaged doesn't rely on kswapd is unique to THP
> allocations: they can require an exorbitant amount of work to
> assemble, but due to fragmentation those requests may fail
> permanently. We don't want to burden a shared facility like kswapd
> with large amounts of speculative work on behalf of what are (still*)
> cornercase requests.
>
> This isn't true for other allocations.
> We do have __GFP_NORETRY sites
> here and there that rather fall back early than put in the full amount
> of work; but overall we expect allocations to succeed - and kswapd to
> be able to balance for them!!** - because the alternative tends to be
> OOMs, or drivers and workloads aborting on -ENOMEM.
>
> (* As we evolve the allocator and normalize huge page requests
> (folios), kswapd may also eventually balance for THPs again. IOW,
> it's more likely for this exception to disappear again than it is
> that we'll see more of them.)
>
> (** This is also why it's no big deal if other kthreads that rely on
> kswapd contribute to direct reclaim stats. First, it's highly
> error prone to determine on a case by case basis whether userspace
> could be waiting behind that direct reclaim - as Yang Shi's
> writeback example demonstrates. Second, if kswapd is overwhelmed,
> it's likely to impact userspace *anyway*! The benefit of this
> classification work is questionable.)

Thanks for the explanation :)

>
> > - What if we deprecate khugepaged (or such threads)? Seems more likely
> > than deprecating kswapd.
>
> If that happens, we can remove the counter again. The bar isn't as
> high for vmstat as it is for other ABI, and we've updated it plenty of
> times to reflect changes in the MM implementation.

Good to know! I thought we'd be stuck with it forever.

>
> > Looks like we want a stat that would group all of this reclaim coming
> > from non-direct kthreads, but would not include a future proactive
> > reclaim kthread.
>
> I think the desire to generalize overcomplicates things here in a way
> that isn't actually meaningful.
>
> Think of direct reclaim stats as a signal that either a) kswapd is
> broken or b) memory pressure is high enough to cause latencies in the
> class of requests that are of interest to userspace. This is true for
> all cases but khugepaged.

Agreed. I believe moving forward with pgscan_user and
pgscan_khugepaged style stats makes sense.

Thanks, Johannes!