Date: Tue, 13 Dec 2022 09:51:27 +0100
From: Michal Hocko
To: "Huang, Ying"
Cc: Mina Almasry, Tejun Heo, Zefan Li, Johannes Weiner, Jonathan Corbet,
    Roman Gushchin, Shakeel Butt, Muchun Song, Andrew Morton, Yang Shi,
    Yosry Ahmed, weixugc@google.com, fvdl@google.com, bagasdotme@gmail.com,
    cgroups@vger.kernel.org, linux-doc@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v3] mm: Add nodes= arg to memory.reclaim
References: <20221202223533.1785418-1-almasrymina@google.com>
    <87k02volwe.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <87k02volwe.fsf@yhuang6-desk2.ccr.corp.intel.com>

On Tue 13-12-22 14:30:57, Huang, Ying wrote:
> Mina Almasry writes:
[...]
> After these discussion, I think the solution maybe use different
> interfaces for "proactive demote" and "proactive reclaim".  That is,
> reconsider "memory.demote".  In this way, we will always uncharge the
> cgroup for "memory.reclaim".
> This avoid the possible confusion there.
> And, because demotion is considered aging, we don't need to disable
> demotion for "memory.reclaim", just don't count it.

As already pointed out in my previous email, we should really think more
about future requirements. Do we add a memory.promote interface when
there is a request to implement NUMA balancing in userspace? Maybe yes,
but maybe node balancing should be more generic than something bound to
memory tiering, and should apply to a more fine-grained nodemask control.

Fundamentally, we already have APIs to age (MADV_COLD, MADV_FREE),
reclaim (MADV_PAGEOUT, MADV_DONTNEED) and prioritize (MADV_WILLNEED:
swap in or read ahead), all of which are per mm/file. Their primary
usability issue is that they are process centric, which requires a very
deep understanding of the process mm layout, so they are not really
usable for larger scale orchestration. The important part of those
interfaces is that they do not talk about demotion, because that is an
implementation detail. I think we want to follow that model at least.

From a higher level POV, I believe we really need an interface to
age & reclaim and to balance memory among nodes. Are there more higher
level usecases?
-- 
Michal Hocko
SUSE Labs
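
To make the "process centric" point about the madvise() hints concrete, here is a minimal sketch, assuming Linux and Python 3.8 or later (for `mmap.madvise`). The helper names `try_advise` and `demo` are hypothetical, and the numeric fallbacks for MADV_COLD and MADV_PAGEOUT are the Linux uapi values, used only if the Python build does not export them. Note that the caller has to own the mapping and know exactly which range to advise, which is the layout knowledge an external orchestrator usually lacks.

```python
import mmap

# Assumption: Linux uapi values, used only when this Python build
# does not export the constants itself.
MADV_COLD = getattr(mmap, "MADV_COLD", 20)
MADV_PAGEOUT = getattr(mmap, "MADV_PAGEOUT", 21)

LENGTH = 16 * mmap.PAGESIZE


def try_advise(mapping, advice, name):
    """Apply one madvise() hint; return True if the kernel accepted it."""
    try:
        mapping.madvise(advice)
        return True
    except OSError as err:
        # Older kernels or restricted sandboxes may reject some hints.
        print(f"{name}: rejected ({err})")
        return False


def demo():
    # Private anonymous mapping, so MADV_DONTNEED really discards the data.
    region = mmap.mmap(-1, LENGTH, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    region[:] = b"\x01" * LENGTH  # touch every page so it is populated

    results = {
        # age: mark the pages as cold, contents are kept
        "MADV_COLD": try_advise(region, MADV_COLD, "MADV_COLD"),
        # reclaim: ask for the pages to be paged out, contents are kept
        "MADV_PAGEOUT": try_advise(region, MADV_PAGEOUT, "MADV_PAGEOUT"),
        # prioritize: ask for the pages to be brought back in
        "MADV_WILLNEED": try_advise(region, mmap.MADV_WILLNEED, "MADV_WILLNEED"),
        # reclaim: drop the pages; later reads see zero-filled memory
        "MADV_DONTNEED": try_advise(region, mmap.MADV_DONTNEED, "MADV_DONTNEED"),
    }
    first_byte = region[0]
    region.close()
    return results, first_byte


if __name__ == "__main__":
    accepted, first_byte = demo()
    print(accepted, "first byte after hints:", first_byte)
```

None of the hints mentions demotion or a target node; the kernel decides how to age or reclaim the range, which is the property the mail argues a cgroup-level interface should keep.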