Date: Mon, 9 May 2022 13:48:39 +0200
From: Michal Hocko
To: CGEL
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, willy@infradead.org,
    shy828301@gmail.com, roman.gushchin@linux.dev, shakeelb@google.com,
    linmiaohe@huawei.com, william.kucharski@oracle.com, peterx@redhat.com,
    hughd@google.com, vbabka@suse.cz, songmuchun@bytedance.com,
    surenb@google.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    cgroups@vger.kernel.org, Yang Yang
Subject: Re: [PATCH] mm/memcg: support control THP behaviour in cgroup
References: <20220505033814.103256-1-xu.xin16@zte.com.cn>
 <6275d3e7.1c69fb81.1d62.4504@mx.google.com>
 <6278fa75.1c69fb81.9c598.f794@mx.google.com>
In-Reply-To: <6278fa75.1c69fb81.9c598.f794@mx.google.com>

On Mon 09-05-22 11:26:43, CGEL wrote:
> On Mon, May 09, 2022 at 12:00:28PM +0200, Michal Hocko wrote:
> > On Sat 07-05-22 02:05:25, CGEL wrote:
> > [...]
> > > If there are many containers running on one host and some of them have
> > > high performance requirements, the administrator could turn on THP for
> > > them:
> > > # docker run -it --thp-enabled=always
> > > Then all the processes in those containers will always use THP.
> > > While other containers turn off THP with:
> > > # docker run -it --thp-enabled=never
> >
> > I do not know. The THP config space is already too confusing and complex
> > and this just adds on top. E.g. is the behavior of the knob hierarchical?
> > What is the policy if the parent memcg says madvise while the child says
> > always? How does the per-application configuration align with all that
> > (e.g. the memcg policy is madvise but the application says never via
> > prctl while it still madvises some ranges - e.g. via a library)?
>
> The cgroup THP behavior is aligned with the host and totally independent,
> just like /sys/fs/cgroup/memory.swappiness. That means if one cgroup
> configures 'always' for THP, it does not affect the host or other cgroups.
> This makes it simple for users to understand and control.

All controls in cgroup v2 should be hierarchical. This is really required
for a proper delegation semantic.

> If the memcg policy is madvise but the application says never, then, just
> like on the host, the result is no THP for that application.
>
> > > By doing this we could improve important containers' performance with
> > > a smaller THP footprint.
> >
> > Do we really want to provide something like THP based QoS? To me it
> > sounds like a bad idea, and if the justification is "it might be useful"
> > then I would say no. So you really need to come up with a very good
> > usecase to promote this further.
>
> At least on some 5G (communication technology) machines it is useful to
> provide THP based QoS. Those 5G machines use a micro-service software
> architecture; in other words, one service application runs in one
> container.

I am not really sure I understand. If this is one application per
container (cgroup), then why do you need a per-group setting? Is the
application a set of different processes which are only very loosely
coupled?

> Containers become the suitable management unit, not the whole host. And
> some performance sensitive containers need THP to provide low latency
> communication.
> But if we use THP with 'always', it will consume more memory (on our
> machine about 10% of total memory). And unnecessary huge pages will
> increase memory pressure, add latency for minor page faults, and add
> overhead when splitting huge pages or coalescing normal sized pages into
> huge pages.

It is still not really clear to me how you achieve that the whole workload
in the said container has the same THP requirements.
-- 
Michal Hocko
SUSE Labs