From: Mina Almasry
Date: Fri, 9 Dec 2022 13:39:44 -0800
Subject: Re: [PATCH v3] [mm-unstable] mm: Fix memcg reclaim on memory tiered systems
To: Michal Hocko
Cc: Wei Xu, Andrew Morton, Johannes Weiner, Roman Gushchin, Shakeel Butt, Muchun Song, Huang Ying, Yang Shi, Yosry Ahmed, fvdl@google.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org

On Fri, Dec 9, 2022 at 1:16 PM Michal Hocko wrote:
>
> On Fri 09-12-22 08:41:47, Wei Xu wrote:
> > On Fri, Dec 9, 2022 at 12:08 AM Michal Hocko wrote:
> > >
> > > On Thu 08-12-22 16:59:36, Wei Xu wrote:
> > > [...]
> > > > > What I really mean is to add demotion nodes to the nodemask along with
> > > > > the set of nodes you want to reclaim from. To me that sounds like a
> > > > > more natural interface allowing for all sorts of usecases:
> > > > > - free up demotion targets (only specify demotion nodes in the mask)
> > > > > - control where to demote (e.g. select specific demotion target(s))
> > > > > - do not demote at all (skip demotion nodes from the node mask)
> > > >
> > > > For clarification, do you mean to add another argument (e.g.
> > > > demotion_nodes) in addition to the "nodes" argument?
> > >
> > > No, nodes=mask argument should control the domain where the memory
> > > reclaim should happen. That includes both aging and the reclaim. If the
> > > mask doesn't contain any lower tier node then no demotion will happen.
> > > If only a subset of lower tiers are specified then only those could be
> > > used for the demotion process. Or put it otherwise, the nodemask is not
> > > only used to filter out zonelists during reclaim it also restricts
> > > migration targets.
> > >
> > > Is this more clear now?
> >

I think putting the demotion sources and demotion targets in the same
nodemask is a bit confusing and prone to error. IIUC the user puts
both the demotion source and the demotion target in the nodemask, and
the kernel infers which is which depending on whether the node is a
top-tier node or a bottom-tier node. I think this will become
ambiguous in the future. What happens when the machine has N memory
tiers and the user specifies a node in a middle tier in the nodemask?
Does that mean the user wants demotion from or to this node? Middle
memory tiers can act as both...

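To make the ambiguity concrete, here is a toy sketch in Python. It is
not kernel code: the node IDs, tier numbers, and helper names are all
made up for illustration, and it only classifies a node by where its
tier sits relative to the others.

```python
# Toy model, not kernel code: node IDs and tier numbers are hypothetical.
# A lower tier number means faster memory (tier 0 is the top tier).
NODE_TIER = {0: 0, 1: 1, 2: 2}


def possible_roles(node, node_tier=NODE_TIER):
    """Roles a node could play in demotion, judged by its tier alone."""
    tiers = set(node_tier.values())
    roles = set()
    if node_tier[node] < max(tiers):
        roles.add("source")  # a slower tier exists to demote to
    if node_tier[node] > min(tiers):
        roles.add("target")  # a faster tier exists to demote from
    return roles


def ambiguous_nodes(mask, node_tier=NODE_TIER):
    """Nodes in the mask whose role cannot be inferred from tier alone."""
    return sorted(n for n in mask if len(possible_roles(n, node_tier)) > 1)
```

With two tiers every node has exactly one possible role, so inferring
it from the mask works. With three or more tiers, any middle-tier node
in the mask has two.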
I think if your goal is to constrain demotion targets then a much
clearer and more future-proof way is to simply add a second arg to
memory.reclaim, "allowed_demotion_targets=".

> > In that case, how can we request demotion only from toptier nodes
> > (without counting any reclaimed bytes from other nodes), which is our
> > memory tiering use case?
>
> I am not sure I follow. Could you be more specific please?
>
> > Besides, when both toptier and demotion nodes are specified, the
> > demoted pages should only be counted as aging and not be counted
> > towards the requested bytes of try_to_free_mem_cgroup_pages(), which
> > is what this patch tries to address.
>
> This should be addressed by
> http://lkml.kernel.org/r/Y5B1K5zAE0PkjFZx@dhcp22.suse.cz, no?

I think I provided a test case in [1] showing very clearly that this
breaks one of our use cases, i.e. the use case where the user is
asking to demote X bytes from the top tier nodes to the lower tier
nodes. I would not like to proceed with a fix that breaks one of our
use cases. I believe the fix in this patch caters to all existing
users, and we should take it over a fix that breaks use cases.

[1] https://lore.kernel.org/all/CAHS8izMKK107wVFSJvg36nQ=WzXd8_cjYBtR0p47L+XLYUSsqA@mail.gmail.com/

> --
> Michal Hocko
> SUSE Labs