Message-ID: <9a0130ce-6528-6652-5a8e-3612c5de2d96@bytedance.com>
Date: Tue, 27 Sep 2022 21:07:02 +0800
Subject: Re: [External] Re: [RFC] proc: Add a new isolated /proc/pid/mempolicy type.
From: Abel Wu
To: Michal Hocko
Cc: Zhongkun He, corbet@lwn.net, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org
References: <20220926091033.340-1-hezhongkun.hzk@bytedance.com> <24b20953-eca9-eef7-8e60-301080a17d2d@bytedance.com> <7ac9abce-4458-982b-6c04-f9569a78c0da@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 9/27/22 6:49 PM, Michal Hocko wrote:
> On Tue 27-09-22 11:20:54, Abel Wu wrote:
> [...]
>>>> Btw.in order to add per-thread-group mempolicy, is it possible to add
>>>> mempolicy in mm_struct?
>>>
>>> I dunno. This would make the mempolicy interface even more confusing.
>>> Per mm behavior makes a lot of sense but we already do have per-thread
>>> semantic so I would stick to it rather than introducing a new semantic.
>>>
>>> Why is this really important?
>>
>> We want soft control on memory footprint of background jobs by applying
>> NUMA preferences when necessary, so the impact on different NUMA nodes
>> can be managed to some extent. These NUMA preferences are given by the
>> control panel, and it might not be suitable to overwrite the tasks with
>> specific memory policies already (or vice versa).
>
> Maybe the answer is somehow implicit but I do not really see any
> argument for the per thread-group semantic here. In other words why a
> new interface has to cover more than the local [sg]et_mempolicy?
> I can see convenience as one potential argument. Also if there is a
> requirement to change the policy in atomic way then this would require a
> single syscall.

Convenience is not our major concern. A well-tuned workload can have
specific memory policies for different tasks/vmas in one process, and
this can be achieved by set_mempolicy()/mbind() respectively. Other
workloads are not tuned this way: they don't care where their memory
resides, so the impact they have on co-located workloads may vary
across NUMA nodes.

The control panel, which has full knowledge of workload profiling,
may want to influence the behavior of processes that have no memory
policy of their own by giving them NUMA preferences, to better serve
the co-located jobs.

So in this scenario, a process's memory policy can be assigned
dynamically by two parties:

 a) the process itself, through set_mempolicy()/mbind()
 b) the control panel, for which no API is available right now

Since the two policies should not fight each other, it sounds
reasonable to introduce a new syscall that assigns a memory policy
to a process through struct mm_struct.
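
For reference, path (a) above can be sketched from userspace roughly as
follows. This is only an illustration, not code from the RFC: it uses the
raw set_mempolicy(2) and mbind(2) syscalls so that it does not depend on
libnuma, the helper names (nodemask_for, prefer_node_for_thread,
bind_range_to_node) are made up for the example, and it assumes a
single-word nodemask (node IDs below the number of bits in an unsigned
long).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Policy modes, with the values used by the kernel's uapi
 * <linux/mempolicy.h>; defined locally to avoid needing that header. */
#define MPOL_DEFAULT   0
#define MPOL_PREFERRED 1
#define MPOL_BIND      2

/* Hypothetical helper: one-word nodemask with only `node` set. */
unsigned long nodemask_for(unsigned int node)
{
    assert(node < 8 * sizeof(unsigned long));
    return 1UL << node;
}

/*
 * Per-thread policy: future allocations of the calling thread prefer
 * `node`. Returns 0 on success or a negative errno value.
 */
int prefer_node_for_thread(unsigned int node)
{
    unsigned long mask = nodemask_for(node);
    /* The kernel decrements maxnode before reading the mask, so pass
     * bits + 1 (this mirrors what libnuma does). */
    long rc = syscall(SYS_set_mempolicy, MPOL_PREFERRED, &mask,
                      8 * sizeof(mask) + 1);
    return rc == 0 ? 0 : -errno;
}

/*
 * Per-VMA policy: pages backing [addr, addr + len) must come from
 * `node`. `addr` has to be page-aligned. Returns 0 or a negative errno.
 */
int bind_range_to_node(void *addr, size_t len, unsigned int node)
{
    unsigned long mask = nodemask_for(node);
    long rc = syscall(SYS_mbind, addr, len, MPOL_BIND, &mask,
                      8 * sizeof(mask) + 1, 0U);
    return rc == 0 ? 0 : -errno;
}
```

A thread would call prefer_node_for_thread(1) to steer its own future
allocations, and bind_range_to_node(buf, len, 0) on a page-aligned mapping
to set a per-VMA policy. Both calls only act on the caller's own task or
address space, and may fail with -EPERM in restricted sandboxes. Path (b)
has no counterpart in the sketch because no such interface exists yet.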