From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Tejun Heo, Oleg Nesterov
Subject: [PATCH 4.19 41/45] cgroup: Implement css_task_iter_skip()
Date: Thu, 8 Aug 2019 21:05:27 +0200
Message-Id: <20190808190456.193657578@linuxfoundation.org>
In-Reply-To: <20190808190453.827571908@linuxfoundation.org>
References: <20190808190453.827571908@linuxfoundation.org>

From: Tejun Heo

commit b636fd38dc40113f853337a7d2a6885ad23b8811 upstream.

When a task is moved out of a cset, task iterators pointing to the
task are advanced using the normal css_task_iter_advance() call. This
is fine but we'll be tracking dying tasks on csets and thus moving
tasks from cset->tasks to (to be added) cset->dying_tasks. When we
remove a task from cset->tasks, if we advance the iterators, they may
move over to the next cset before we had the chance to add the task
back on the dying list, which can allow the task to escape iteration.

This patch separates out skipping from advancing. Skipping only moves
the affected iterators to the next pointer rather than fully advancing
it and the following advancing will recognize that the cursor has
already been moved forward and do the rest of advancing. This ensures
that when a task moves from one list to another in its cset, as long
as it moves in the right direction, it's always visible to iteration.

This doesn't cause any visible behavior changes.

Signed-off-by: Tejun Heo
Cc: Oleg Nesterov
Signed-off-by: Greg Kroah-Hartman

---
 include/linux/cgroup.h |    3 ++
 kernel/cgroup/cgroup.c |   60 +++++++++++++++++++++++++++++--------------------
 2 files changed, 39 insertions(+), 24 deletions(-)

--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -43,6 +43,9 @@
 /* walk all threaded css_sets in the domain */
 #define CSS_TASK_ITER_THREADED		(1U << 1)
 
+/* internal flags */
+#define CSS_TASK_ITER_SKIPPED		(1U << 16)
+
 /* a css_task_iter should be treated as an opaque object */
 struct css_task_iter {
 	struct cgroup_subsys		*ss;
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -212,7 +212,8 @@ static struct cftype cgroup_base_files[]
 
 static int cgroup_apply_control(struct cgroup *cgrp);
 static void cgroup_finalize_control(struct cgroup *cgrp, int ret);
-static void css_task_iter_advance(struct css_task_iter *it);
+static void css_task_iter_skip(struct css_task_iter *it,
+			       struct task_struct *task);
 static int cgroup_destroy_locked(struct cgroup *cgrp);
 static struct cgroup_subsys_state *css_create(struct cgroup *cgrp,
 					      struct cgroup_subsys *ss);
@@ -775,6 +776,21 @@ static void css_set_update_populated(str
 		cgroup_update_populated(link->cgrp, populated);
 }
 
+/*
+ * @task is leaving, advance task iterators which are pointing to it so
+ * that they can resume at the next position. Advancing an iterator might
+ * remove it from the list, use safe walk. See css_task_iter_skip() for
+ * details.
+ */
+static void css_set_skip_task_iters(struct css_set *cset,
+				    struct task_struct *task)
+{
+	struct css_task_iter *it, *pos;
+
+	list_for_each_entry_safe(it, pos, &cset->task_iters, iters_node)
+		css_task_iter_skip(it, task);
+}
+
 /**
  * css_set_move_task - move a task from one css_set to another
  * @task: task being moved
@@ -800,22 +816,9 @@ static void css_set_move_task(struct tas
 		css_set_update_populated(to_cset, true);
 
 	if (from_cset) {
-		struct css_task_iter *it, *pos;
-
 		WARN_ON_ONCE(list_empty(&task->cg_list));
 
-		/*
-		 * @task is leaving, advance task iterators which are
-		 * pointing to it so that they can resume at the next
-		 * position. Advancing an iterator might remove it from
-		 * the list, use safe walk. See css_task_iter_advance*()
-		 * for details.
-		 */
-		list_for_each_entry_safe(it, pos, &from_cset->task_iters,
-					 iters_node)
-			if (it->task_pos == &task->cg_list)
-				css_task_iter_advance(it);
-
+		css_set_skip_task_iters(from_cset, task);
 		list_del_init(&task->cg_list);
 		if (!css_set_populated(from_cset))
 			css_set_update_populated(from_cset, false);
@@ -4183,10 +4186,19 @@ static void css_task_iter_advance_css_se
 	list_add(&it->iters_node, &cset->task_iters);
 }
 
-static void css_task_iter_advance(struct css_task_iter *it)
+static void css_task_iter_skip(struct css_task_iter *it,
+			       struct task_struct *task)
 {
-	struct list_head *next;
+	lockdep_assert_held(&css_set_lock);
+
+	if (it->task_pos == &task->cg_list) {
+		it->task_pos = it->task_pos->next;
+		it->flags |= CSS_TASK_ITER_SKIPPED;
+	}
+}
 
+static void css_task_iter_advance(struct css_task_iter *it)
+{
 	lockdep_assert_held(&css_set_lock);
 repeat:
 	if (it->task_pos) {
@@ -4195,15 +4207,15 @@ repeat:
 		 * consumed first and then ->mg_tasks. After ->mg_tasks,
 		 * we move onto the next cset.
 		 */
-		next = it->task_pos->next;
-
-		if (next == it->tasks_head)
-			next = it->mg_tasks_head->next;
+		if (it->flags & CSS_TASK_ITER_SKIPPED)
+			it->flags &= ~CSS_TASK_ITER_SKIPPED;
+		else
+			it->task_pos = it->task_pos->next;
 
-		if (next == it->mg_tasks_head)
+		if (it->task_pos == it->tasks_head)
+			it->task_pos = it->mg_tasks_head->next;
+		if (it->task_pos == it->mg_tasks_head)
 			css_task_iter_advance_css_set(it);
-		else
-			it->task_pos = next;
 	} else {
 		/* called from start, proceed to the first cset */
 		css_task_iter_advance_css_set(it);
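
For readers who want to see the skip/advance split in isolation, here is
a minimal userspace sketch. It is not kernel code. It assumes a single
circular list instead of the cset ->tasks / ->mg_tasks pair, takes no
locks, and every name in it (toy_iter, toy_iter_skip, toy_iter_advance,
ITER_SKIPPED, the demo_* helpers) is hypothetical. It only shows the idea
from the changelog: skipping moves the cursor one step and sets a flag,
and the next advance consumes that flag instead of stepping again, then
does the remaining end-of-list handling.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modeled on list_head. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_del_init(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

#define ITER_SKIPPED (1U << 16)	/* plays the role of CSS_TASK_ITER_SKIPPED */

/* Hypothetical, simplified iterator: one list, no csets, no locking. */
struct toy_iter {
	struct list_head *head;	/* list being walked */
	struct list_head *pos;	/* current cursor, NULL once iteration is done */
	unsigned int flags;
};

/* If the cursor sits on @victim, step past it and remember that we did. */
static void toy_iter_skip(struct toy_iter *it, struct list_head *victim)
{
	if (it->pos == victim) {
		it->pos = it->pos->next;
		it->flags |= ITER_SKIPPED;
	}
}

/* A pending skip already moved the cursor, so only clear the flag then. */
static void toy_iter_advance(struct toy_iter *it)
{
	assert(it->pos != NULL);
	if (it->flags & ITER_SKIPPED)
		it->flags &= ~ITER_SKIPPED;
	else
		it->pos = it->pos->next;
	/* "Rest of advancing": the sketch just ends at the list head. */
	if (it->pos == it->head)
		it->pos = NULL;
}

/* Remove the middle entry under the cursor; expect the cursor on c. */
static int demo_skip_middle(void)
{
	struct list_head head, a, b, c;
	struct toy_iter it;

	list_init(&head);
	list_add_tail(&a, &head);
	list_add_tail(&b, &head);
	list_add_tail(&c, &head);
	it.head = &head;
	it.pos = &b;
	it.flags = 0;

	toy_iter_skip(&it, &a);		/* cursor is not on a: no effect */
	toy_iter_skip(&it, &b);		/* cursor moves to c, flag set */
	list_del_init(&b);
	toy_iter_advance(&it);		/* consumes the flag, stays on c */
	return it.pos == &c && it.flags == 0;
}

/* Remove the last entry under the cursor; expect iteration to finish. */
static int demo_skip_last(void)
{
	struct list_head head, a, c;
	struct toy_iter it;

	list_init(&head);
	list_add_tail(&a, &head);
	list_add_tail(&c, &head);
	it.head = &head;
	it.pos = &c;
	it.flags = 0;

	toy_iter_skip(&it, &c);		/* cursor lands on the list head */
	list_del_init(&c);
	toy_iter_advance(&it);		/* sees the head, ends iteration */
	return it.pos == NULL && it.flags == 0;
}
```

Calling demo_skip_middle() and demo_skip_last() from a small main() is
enough to exercise both paths. In the sketch, as in the patch, the skip
itself never decides where iteration continues. That decision is left to
the following advance, which is what leaves room to re-add the entry to
another list before the cursor crosses a list boundary.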