Received: by 2002:a05:6a10:9848:0:0:0:0 with SMTP id x8csp33344pxf; Tue, 6 Apr 2021 14:09:50 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxgf2h0IBgbCFSspV29zhi2omBBjAXIPZ/umyA+3KUJg3g3qVgjzAy0ubj0wp/BE2p7uej4 X-Received: by 2002:a05:6402:128a:: with SMTP id w10mr302978edv.277.1617743390056; Tue, 06 Apr 2021 14:09:50 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1617743390; cv=none; d=google.com; s=arc-20160816; b=l5Z8Q/pQG4uWZOFylBjGzvSBlpCr1fSGnqYxsu2Jn8SB8rI2p8F/oWxLbqt2GZ3qEI kW8K/tRC+Spk0nRde9FJCJgDaGsuwZQZQhLbEqJ5R08rM9zL/bUJKcOmw3IF5JUzcFNE gDyruHEZGPORrPJl2v+8mwhcbCHs3gdC26dMB1tHek80lRikwi/lKGAw+Y5GUOdxCoS4 i5MmK+wRf5tSRsm9dYnVyYIFDyi3bYbZX4HvLmda8F+Oif0Pyk34JSS5bUjuU2xtL7ZT RMhvop0QMPZyuXR7JxOHlOGgo2vED2VRHNS9PFJ/Rrp1sheiet2EZd/ag0WShnShSF5d KnQQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=3UA2QLgE5D0ijsyIx8P+uuJ+qKQ15FjapR/N/wkS3FM=; b=tdYzsEKcATH8zZ4DqOchp3i/Hw8j7BzPal99EGPMkqBXJwiTzUFYQvLZBkjRNBfXhJ DlcLYgb7N91UxSVlrZi9tuksnDs504QwCMPV596whn9NgrxwhO0oWIJ+Qnd9MfrUdZ5Q LHL7N0pFcSDUOA1j7HMP+cjzX7gLtBpSTMmiYtOGT2InSiTX4GRrsn6IpukxynXgOM++ iwfg+2mxIAN4mkUlytdQgGk4wJp2/y0r6D9FMuwZZucP1R9FY2DSFBMVmwfPNoPnxwz5 oslTLB43g4Zev7xRSoXzsD7uxc/r5QuXt7l9nDYlxHneC1Qn0aqjuNN8KnHXQtJZVATP F6Xg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=HF72u0MO; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id p16si8255817edy.252.2021.04.06.14.09.25; Tue, 06 Apr 2021 14:09:50 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=HF72u0MO; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1343748AbhDFMKz (ORCPT + 99 others); Tue, 6 Apr 2021 08:10:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40962 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1343756AbhDFMKx (ORCPT ); Tue, 6 Apr 2021 08:10:53 -0400 Received: from mail-wm1-x331.google.com (mail-wm1-x331.google.com [IPv6:2a00:1450:4864:20::331]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A874CC06174A for ; Tue, 6 Apr 2021 05:10:45 -0700 (PDT) Received: by mail-wm1-x331.google.com with SMTP id o6so4309919wmq.3 for ; Tue, 06 Apr 2021 05:10:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=3UA2QLgE5D0ijsyIx8P+uuJ+qKQ15FjapR/N/wkS3FM=; b=HF72u0MOJvcyb2iO031+3c68hjNTeovFQNLtmtnt5xqUF1rnl9HuJPHYt5suzoXNS0 XGZaD4/jxQsphiOqCQ7iwY+xVYPkdS/lkbMeDHiHVVsO2ZhBbaxKz/R9tJcLJVw5qyKg wH8RjFs6SbFvit+GIlaly8tMXLlJxWDnkFGCOo1q9cvbp0r9VqQzi3KqtUC3Bzf4Hp+9 7rCvU/b58AauosJJ+qvqsWEZBDrc1Rk7er7tyPgLkisjbxAqgRi6V4Xn54K4K5lpuQtr dkJWd0eWbKIhUf7hs+Ch4z2XyTMWUK86+nbI/TqALTF24exGJ9lnQmaju2WAqmH0rAof qZ8g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=3UA2QLgE5D0ijsyIx8P+uuJ+qKQ15FjapR/N/wkS3FM=; b=Du+YtiXAcJlBjUkQW2cOOydgAeF+smRj7bjM74xy1s587U8bn5Ftd+zTPkP5ClL+7F O7oHSv7rJLW8vtCpUhxg/opCaYIIK1o7f18dBbgxDJWSwdv7cF+DKAHwl1qN8kaElAfb TJYiImIjTiac4BwIjLilTm48YJ/56D2vPmiU1W+UE/u0io0cQBU7c4fdBHYU335lHGhp FwQwvEqXDAcFJheLIwOb0zcQGth7Z8CLe6A7sMrS9arw6i9mLWUbAGWNb3/c3SrG1UqL NkdQaMMFPSCKU2B5MPkz64kZomC7WpRADHwU6wKdEoUt1yyEvRwFX/V6rwVabj8RxW+p 3Izg== X-Gm-Message-State: AOAM533FLTIIglM5mJBG6inY2sf44hJsuXXIugrEi4UGwzeatlmsAuLf w57HHu7QZ+Y14XKV1pamjhgXJMjdWr4gKQ== X-Received: by 2002:a1c:c918:: with SMTP id f24mr3980300wmb.12.1617711044267; Tue, 06 Apr 2021 05:10:44 -0700 (PDT) Received: from google.com (105.168.195.35.bc.googleusercontent.com. [35.195.168.105]) by smtp.gmail.com with ESMTPSA id s21sm2641882wmc.10.2021.04.06.05.10.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 06 Apr 2021 05:10:43 -0700 (PDT) Date: Tue, 6 Apr 2021 12:10:41 +0000 From: Quentin Perret To: Pavankumar Kondeti Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, Tejun Heo , Zefan Li , Johannes Weiner , Wei Wang Subject: Re: [PATCH] cgroup: Relax restrictions on kernel threads moving out of root cpu cgroup Message-ID: References: <1617706753-25349-1-git-send-email-pkondeti@codeaurora.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1617706753-25349-1-git-send-email-pkondeti@codeaurora.org> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Pavan, On Tuesday 06 Apr 2021 at 16:29:13 (+0530), Pavankumar Kondeti wrote: > In Android GKI, CONFIG_FAIR_GROUP_SCHED is enabled [1] to help prioritize > important work. Given that CPU shares of root cgroup can't be changed, > leaving the tasks inside root cgroup will give them higher share > compared to the other tasks inside important cgroups. This is mitigated > by moving all tasks inside root cgroup to a different cgroup after > Android is booted. However, there are many kernel tasks stuck in the > root cgroup after the boot. > > We see all kworker threads are in the root cpu cgroup. This is because, > tasks with PF_NO_SETAFFINITY flag set are forbidden from cgroup migration. > This restriction is in place to avoid kworkers getting moved to a cpuset > which conflicts with kworker affinity. Relax this restriction by explicitly > checking if the task is moving out of a cpuset cgroup. This allows kworkers > to be moved out root cpu cgroup. > > We also see kthreadd_task and any kernel thread created after the Android boot > also stuck in the root cgroup. The current code prevents kthreadd_task moving > out root cgroup to avoid the possibility of creating new RT kernel threads > inside a cgroup with no RT runtime allocated. Apply this restriction when tasks > are moving out of cpu cgroup under CONFIG_RT_GROUP_SCHED. This allows all > kernel threads to be moved out of root cpu cgroup if the kernel does not > enable RT group scheduling. OK, so IIUC this only works with cgroup v1 -- the unified hierarchy in v2 forces you to keep cpu and cpuset in 'sync'. But that should be fine, so this looks like a nice improvement to me. > [1] https://android.googlesource.com/kernel/common/+/f08f049de11c15a4251cb1db08cf0bee20bd9b59 > > Signed-off-by: Pavankumar Kondeti > --- > kernel/cgroup/cgroup-internal.h | 3 ++- > kernel/cgroup/cgroup-v1.c | 2 +- > kernel/cgroup/cgroup.c | 24 +++++++++++++++++++----- > 3 files changed, 22 insertions(+), 7 deletions(-) > > diff --git a/kernel/cgroup/cgroup-internal.h b/kernel/cgroup/cgroup-internal.h > index bfbeabc..a96ed9a 100644 > --- a/kernel/cgroup/cgroup-internal.h > +++ b/kernel/cgroup/cgroup-internal.h > @@ -232,7 +232,8 @@ int cgroup_migrate(struct task_struct *leader, bool threadgroup, > int cgroup_attach_task(struct cgroup *dst_cgrp, struct task_struct *leader, > bool threadgroup); > struct task_struct *cgroup_procs_write_start(char *buf, bool threadgroup, > - bool *locked) > + bool *locked, > + struct cgroup *dst_cgrp) > __acquires(&cgroup_threadgroup_rwsem); > void cgroup_procs_write_finish(struct task_struct *task, bool locked) > __releases(&cgroup_threadgroup_rwsem); > diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c > index a575178..d674a6c 100644 > --- a/kernel/cgroup/cgroup-v1.c > +++ b/kernel/cgroup/cgroup-v1.c > @@ -497,7 +497,7 @@ static ssize_t __cgroup1_procs_write(struct kernfs_open_file *of, > if (!cgrp) > return -ENODEV; > > - task = cgroup_procs_write_start(buf, threadgroup, &locked); > + task = cgroup_procs_write_start(buf, threadgroup, &locked, cgrp); > ret = PTR_ERR_OR_ZERO(task); > if (ret) > goto out_unlock; > diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c > index 9153b20..41864a8 100644 > --- a/kernel/cgroup/cgroup.c > +++ b/kernel/cgroup/cgroup.c > @@ -2744,7 +2744,8 @@ int cgroup_attach_task(struct cgroup *dst_cgrp, struct task_struct *leader, > } > > struct task_struct *cgroup_procs_write_start(char *buf, bool threadgroup, > - bool *locked) > + bool *locked, > + struct cgroup *dst_cgrp) > __acquires(&cgroup_threadgroup_rwsem) > { > struct task_struct *tsk; > @@ -2784,15 +2785,28 @@ struct task_struct *cgroup_procs_write_start(char *buf, bool threadgroup, > tsk = tsk->group_leader; > > /* > + * RT kthreads may be born in a cgroup with no rt_runtime allocated. > + * Just say no. > + */ > +#ifdef CONFIG_RT_GROUP_SCHED > + if (tsk->no_cgroup_migration && (dst_cgrp->root->subsys_mask & (1U << cpu_cgrp_id))) { > + tsk = ERR_PTR(-EINVAL); > + goto out_unlock_threadgroup; > + } > +#endif > + > + /* > * kthreads may acquire PF_NO_SETAFFINITY during initialization. > * If userland migrates such a kthread to a non-root cgroup, it can > - * become trapped in a cpuset, or RT kthread may be born in a > - * cgroup with no rt_runtime allocated. Just say no. > + * become trapped in a cpuset. Just say no. > */ > - if (tsk->no_cgroup_migration || (tsk->flags & PF_NO_SETAFFINITY)) { > +#ifdef CONFIG_CPUSETS > + if ((tsk->no_cgroup_migration || (tsk->flags & PF_NO_SETAFFINITY)) && > + (dst_cgrp->root->subsys_mask & (1U << cpuset_cgrp_id))) { > tsk = ERR_PTR(-EINVAL); > goto out_unlock_threadgroup; > } > +#endif Nit: maybe move this #ifdefery out to a header? Thanks, Quentin