From: ebiederm@xmission.com (Eric W. Biederman)
To: Michal Hocko
Cc: Andrew Morton, Johannes Weiner, Kirill Tkhai, peterz@infradead.org,
 viro@zeniv.linux.org.uk, mingo@kernel.org, paulmck@linux.vnet.ibm.com,
 keescook@chromium.org, riel@redhat.com, tglx@linutronix.de,
 kirill.shutemov@linux.intel.com, marcos.souza.org@gmail.com,
 hoeun.ryu@gmail.com, pasha.tatashin@oracle.com, gs051095@gmail.com,
 dhowells@redhat.com, rppt@linux.vnet.ibm.com, linux-kernel@vger.kernel.org,
 Balbir Singh, Tejun Heo, Oleg Nesterov
References: <20180504145435.GA26573@redhat.com> <87y3gzfmjt.fsf@xmission.com>
 <20180504162209.GB26573@redhat.com> <871serfk77.fsf@xmission.com>
 <87tvrncoyc.fsf_-_@xmission.com> <20180510121418.GD5325@dhcp22.suse.cz>
 <20180522125757.GL20020@dhcp22.suse.cz> <87wovu889o.fsf@xmission.com>
 <20180524111002.GB20441@dhcp22.suse.cz>
 <20180524141635.c99b7025a73a709e179f92a2@linux-foundation.org>
 <20180530121721.GD27180@dhcp22.suse.cz> <87wovjacrh.fsf@xmission.com>
 <87wovj8e1d.fsf_-_@xmission.com> <87y3fywodn.fsf_-_@xmission.com>
Date: Fri, 01 Jun 2018 09:53:09 -0500
In-Reply-To: <87y3fywodn.fsf_-_@xmission.com> (Eric W. Biederman's message of "Fri, 01 Jun 2018 09:52:04 -0500")
Message-ID: <87sh66wobu.fsf_-_@xmission.com>
Subject:
 [RFC][PATCH 1/2] memcg: Ensure every task that uses an mm is in the same memory cgroup
X-Mailing-List: linux-kernel@vger.kernel.org

From a userspace perspective the cgroup of an mm is determined by which
cgroup the task belongs to. Since in practice an mm can only belong to a
single memory cgroup, having multiple tasks with the same mm in
different memory cgroups is not well defined.

Avoid the difficulties of dealing with crazy semantics and restrict all
tasks that share a single mm to the same memory cgroup.

This is accomplished by adding a new function mem_cgroup_mm_can_attach
that checks this condition with a straightforward algorithm. In the
worst case it is O(N^2). In the common case it should be O(N) in the
number of tasks being migrated. Typically only a single process, and
thus a single mm, is being migrated, and the function is optimized for
that case.

There are users of mmget, such as proc, that can create references to
an mm that this function cannot find. This results in an unnecessary
migration failure. It does not break the invariant that every task of
an mm stays in the same memory cgroup, so this condition is annoying
but harmless.

This requires multi-threaded mm's to be migrated using the procs file.

This effectively forbids migrating processes whose mm is shared with
other processes, although enabling the control file might work.

Signed-off-by: "Eric W.
 Biederman"
---
 mm/memcontrol.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 50 insertions(+), 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 21c13f4768eb..078ef562bb90 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4798,6 +4798,50 @@ static void mem_cgroup_clear_mc(void)
 	mmput(mm);
 }
 
+static int mem_cgroup_mm_can_attach(struct cgroup_taskset *tset)
+{
+	struct cgroup_subsys_state *css, *tcss;
+	struct task_struct *p, *t;
+	struct mm_struct *mm = NULL;
+	int ret = -EINVAL;
+
+	/*
+	 * Ensure all references for every mm can be accounted for by
+	 * tasks that are being migrated.
+	 */
+	rcu_read_lock();
+	cgroup_taskset_for_each(p, css, tset) {
+		int mm_users, mm_count;
+
+		/* Does this task have an mm that has not been visited? */
+		if (!p->mm || (p->flags & PF_KTHREAD) || (p->mm == mm))
+			continue;
+
+		mm = p->mm;
+		mm_users = atomic_read(&mm->mm_users);
+		if (mm_users == 1)
+			continue;
+
+		mm_count = 0;
+		cgroup_taskset_for_each(t, tcss, tset) {
+			if (t->mm != mm)
+				continue;
+			mm_count++;
+		}
+		/*
+		 * If there are no non-task references mm_users will
+		 * be stable as holding cgroup_thread_rwsem for write
+		 * blocks fork and exit.
+		 */
+		if (mm_count != mm_users)
+			goto out;
+	}
+	ret = 0;
+out:
+	rcu_read_unlock();
+	return ret;
+}
+
 static int mem_cgroup_can_attach(struct cgroup_taskset *tset)
 {
 	struct cgroup_subsys_state *css;
@@ -4806,7 +4850,12 @@ static int mem_cgroup_can_attach(struct cgroup_taskset *tset)
 	struct task_struct *leader, *p;
 	struct mm_struct *mm;
 	unsigned long move_flags;
-	int ret = 0;
+	int ret;
+
+	/* Is every task of every mm in tset being moved? */
+	ret = mem_cgroup_mm_can_attach(tset);
+	if (ret)
+		return ret;
 
 	/* charge immigration isn't supported on the default hierarchy */
 	if (cgroup_subsys_on_dfl(memory_cgrp_subsys))
-- 
2.14.1
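
For anyone who wants to poke at the counting logic outside the kernel,
here is a minimal userspace sketch of the check described above. It is
not part of the patch. struct model_mm, struct model_task,
model_mm_can_attach and model_selftest are hypothetical names, a plain
array stands in for the cgroup_taskset, and RCU and cgroup_thread_rwsem
are ignored entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct mm_struct. */
struct model_mm {
	int mm_users;		/* reference count, modelled on mm->mm_users */
};

/* Hypothetical userspace stand-in for struct task_struct. */
struct model_task {
	struct model_mm *mm;	/* NULL for a task without an mm */
	int is_kthread;		/* stands in for p->flags & PF_KTHREAD */
};

/*
 * Return 0 if every reference to every mm used in the set is accounted
 * for by a task in the set, -EINVAL otherwise.
 */
static int model_mm_can_attach(const struct model_task *tset, size_t n)
{
	const struct model_mm *mm = NULL;
	size_t i, j;

	for (i = 0; i < n; i++) {
		const struct model_task *p = &tset[i];
		int mm_count = 0;

		/* Skip tasks with no mm, kernel threads, and repeats of the previous mm. */
		if (!p->mm || p->is_kthread || p->mm == mm)
			continue;

		mm = p->mm;
		/* Common case: a single user, so no other task can share this mm. */
		if (mm->mm_users == 1)
			continue;

		/* Slow path: count the tasks in the set that use this mm. */
		for (j = 0; j < n; j++)
			if (tset[j].mm == mm)
				mm_count++;

		if (mm_count != mm->mm_users)
			return -EINVAL;
	}
	return 0;
}

/* Small self-check covering the three cases the changelog talks about. */
static void model_selftest(void)
{
	struct model_mm shared = { .mm_users = 2 };
	struct model_mm single = { .mm_users = 1 };
	struct model_mm pinned = { .mm_users = 3 };	/* two tasks plus one extra reference */
	struct model_task all[] = { { &shared, 0 }, { &shared, 0 }, { &single, 0 } };
	struct model_task partial[] = { { &shared, 0 }, { &single, 0 } };
	struct model_task extra[] = { { &pinned, 0 }, { &pinned, 0 } };

	/* Every sharer of the mm is in the set: migration is allowed. */
	assert(model_mm_can_attach(all, 3) == 0);
	/* One sharer is left behind: migration is refused. */
	assert(model_mm_can_attach(partial, 2) == -EINVAL);
	/* A reference not held by any task in the set: refused as well. */
	assert(model_mm_can_attach(extra, 2) == -EINVAL);
}
```

The third case in model_selftest corresponds to the mmget situation
from the changelog: the extra reference makes the counts disagree, so
the migration is refused even though no task is left behind.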