Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752583AbcJFIcg (ORCPT ); Thu, 6 Oct 2016 04:32:36 -0400 Received: from mail.linuxfoundation.org ([140.211.169.12]:46949 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751107AbcJFIcd (ORCPT ); Thu, 6 Oct 2016 04:32:33 -0400 From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Tejun Heo , Evgeny Vereshchagin , "Serge E. Hallyn" , Aditya Kali , "Eric W. Biederman" Subject: [PATCH 4.7 002/141] cgroup: fix invalid controller enable rejections with cgroup namespace Date: Thu, 6 Oct 2016 10:27:18 +0200 Message-Id: <20161006074448.715403162@linuxfoundation.org> X-Mailer: git-send-email 2.10.0 In-Reply-To: <20161006074448.608056610@linuxfoundation.org> References: <20161006074448.608056610@linuxfoundation.org> User-Agent: quilt/0.64 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3102 Lines: 87 4.7-stable review patch. If anyone has any objections, please let me know. ------------------ From: Tejun Heo commit 9157056da8f8c4a6305f15619e269f164b63a6de upstream. On the v2 hierarchy, "cgroup.subtree_control" rejects controller enables if the cgroup has processes in it. The enforcement of this logic assumes that the cgroup wouldn't have any css_sets associated with it if there are no tasks in the cgroup, which is no longer true since a79a908fd2b0 ("cgroup: introduce cgroup namespaces"). When a cgroup namespace is created, it pins the css_set of the creating task to use it as the root css_set of the namespace. This extra reference stays as long as the namespace is around and makes "cgroup.subtree_control" think that the namespace root cgroup is not empty even when it is and thus reject controller enables. Fix it by making cgroup_subtree_control() walk and test emptiness of each css_set instead of testing whether the list_head is empty. While at it, update the comment of cgroup_task_count() to indicate that the returned value may be higher than the number of tasks, which has always been true due to temporary references and doesn't break anything. Signed-off-by: Tejun Heo Reported-by: Evgeny Vereshchagin Cc: Serge E. Hallyn Cc: Aditya Kali Cc: Eric W. Biederman Fixes: a79a908fd2b0 ("cgroup: introduce cgroup namespaces") Link: https://github.com/systemd/systemd/pull/3589#issuecomment-249089541 Signed-off-by: Greg Kroah-Hartman --- kernel/cgroup.c | 29 +++++++++++++++++++++++++---- 1 file changed, 25 insertions(+), 4 deletions(-) --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -3452,9 +3452,28 @@ static ssize_t cgroup_subtree_control_wr * Except for the root, subtree_control must be zero for a cgroup * with tasks so that child cgroups don't compete against tasks. */ - if (enable && cgroup_parent(cgrp) && !list_empty(&cgrp->cset_links)) { - ret = -EBUSY; - goto out_unlock; + if (enable && cgroup_parent(cgrp)) { + struct cgrp_cset_link *link; + + /* + * Because namespaces pin csets too, @cgrp->cset_links + * might not be empty even when @cgrp is empty. Walk and + * verify each cset. + */ + spin_lock_irq(&css_set_lock); + + ret = 0; + list_for_each_entry(link, &cgrp->cset_links, cset_link) { + if (css_set_populated(link->cset)) { + ret = -EBUSY; + break; + } + } + + spin_unlock_irq(&css_set_lock); + + if (ret) + goto out_unlock; } /* save and update control masks and prepare csses */ @@ -3905,7 +3924,9 @@ void cgroup_file_notify(struct cgroup_fi * cgroup_task_count - count the number of tasks in a cgroup. * @cgrp: the cgroup in question * - * Return the number of tasks in the cgroup. + * Return the number of tasks in the cgroup. The returned number can be + * higher than the actual number of tasks due to css_set references from + * namespace roots and temporary usages. */ static int cgroup_task_count(const struct cgroup *cgrp) {