From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Tejun Heo, Oleg Nesterov, Topi Miettinen
Subject: [PATCH 4.14 30/33] cgroup: Include dying leaders with live threads in PROCS iterations
Date: Thu, 8 Aug 2019 21:05:37 +0200
Message-Id: <20190808190455.151177736@linuxfoundation.org>
X-Mailer: git-send-email 2.22.0
In-Reply-To:
<20190808190453.582417307@linuxfoundation.org>
References: <20190808190453.582417307@linuxfoundation.org>
User-Agent: quilt/0.66

From: Tejun Heo

commit c03cd7738a83b13739f00546166969342c8ff014 upstream.

CSS_TASK_ITER_PROCS currently iterates live group leaders; however,
this means that a process with a dying leader and live threads will be
skipped.  IOW, cgroup.procs might be empty while cgroup.threads isn't,
which is confusing to say the least.

Fix it by making cset track dying tasks and include dying leaders with
live threads in PROCS iteration.

Signed-off-by: Tejun Heo
Reported-and-tested-by: Topi Miettinen
Cc: Oleg Nesterov
Signed-off-by: Greg Kroah-Hartman

---
 include/linux/cgroup-defs.h |    1 +
 include/linux/cgroup.h      |    1 +
 kernel/cgroup/cgroup.c      |   44 +++++++++++++++++++++++++++++++++++++-------
 3 files changed, 39 insertions(+), 7 deletions(-)

--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -201,6 +201,7 @@ struct css_set {
 	 */
 	struct list_head tasks;
 	struct list_head mg_tasks;
+	struct list_head dying_tasks;
 
 	/* all css_task_iters currently walking this cset */
 	struct list_head task_iters;
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -59,6 +59,7 @@ struct css_task_iter {
 	struct list_head *task_pos;
 	struct list_head *tasks_head;
 	struct list_head *mg_tasks_head;
+	struct list_head *dying_tasks_head;
 
 	struct css_set *cur_cset;
 	struct css_set *cur_dcset;
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -643,6 +643,7 @@ struct css_set init_css_set = {
 	.dom_cset		= &init_css_set,
 	.tasks			= LIST_HEAD_INIT(init_css_set.tasks),
 	.mg_tasks		= LIST_HEAD_INIT(init_css_set.mg_tasks),
+	.dying_tasks		= LIST_HEAD_INIT(init_css_set.dying_tasks),
 	.task_iters		= LIST_HEAD_INIT(init_css_set.task_iters),
 	.threaded_csets		= LIST_HEAD_INIT(init_css_set.threaded_csets),
 	.cgrp_links		= LIST_HEAD_INIT(init_css_set.cgrp_links),
@@ -1107,6 +1108,7 @@ static struct css_set *find_css_set(stru
 	cset->dom_cset = cset;
 	INIT_LIST_HEAD(&cset->tasks);
 	INIT_LIST_HEAD(&cset->mg_tasks);
+	INIT_LIST_HEAD(&cset->dying_tasks);
 	INIT_LIST_HEAD(&cset->task_iters);
 	INIT_LIST_HEAD(&cset->threaded_csets);
 	INIT_HLIST_NODE(&cset->hlist);
@@ -4046,15 +4048,18 @@ static void css_task_iter_advance_css_se
 			it->task_pos = NULL;
 			return;
 		}
-	} while (!css_set_populated(cset));
+	} while (!css_set_populated(cset) && list_empty(&cset->dying_tasks));
 
 	if (!list_empty(&cset->tasks))
 		it->task_pos = cset->tasks.next;
-	else
+	else if (!list_empty(&cset->mg_tasks))
 		it->task_pos = cset->mg_tasks.next;
+	else
+		it->task_pos = cset->dying_tasks.next;
 
 	it->tasks_head = &cset->tasks;
 	it->mg_tasks_head = &cset->mg_tasks;
+	it->dying_tasks_head = &cset->dying_tasks;
 
 	/*
 	 * We don't keep css_sets locked across iteration steps and thus
@@ -4093,6 +4098,8 @@ static void css_task_iter_skip(struct cs
 
 static void css_task_iter_advance(struct css_task_iter *it)
 {
+	struct task_struct *task;
+
 	lockdep_assert_held(&css_set_lock);
 repeat:
 	if (it->task_pos) {
@@ -4109,17 +4116,32 @@ repeat:
 		if (it->task_pos == it->tasks_head)
 			it->task_pos = it->mg_tasks_head->next;
 		if (it->task_pos == it->mg_tasks_head)
+			it->task_pos = it->dying_tasks_head->next;
+		if (it->task_pos == it->dying_tasks_head)
 			css_task_iter_advance_css_set(it);
 	} else {
 		/* called from start, proceed to the first cset */
 		css_task_iter_advance_css_set(it);
 	}
 
-	/* if PROCS, skip over tasks which aren't group leaders */
-	if ((it->flags & CSS_TASK_ITER_PROCS) && it->task_pos &&
-	    !thread_group_leader(list_entry(it->task_pos, struct task_struct,
-					    cg_list)))
-		goto repeat;
+	if (!it->task_pos)
+		return;
+
+	task = list_entry(it->task_pos, struct task_struct, cg_list);
+
+	if (it->flags & CSS_TASK_ITER_PROCS) {
+		/* if PROCS, skip over tasks which aren't group leaders */
+		if (!thread_group_leader(task))
+			goto repeat;
+
+		/* and dying leaders w/o live member threads */
+		if (!atomic_read(&task->signal->live))
+			goto repeat;
+	} else {
+		/* skip all dying ones */
+		if (task->flags & PF_EXITING)
+			goto repeat;
+	}
 }
 
 /**
@@ -5552,6 +5574,7 @@ void cgroup_exit(struct task_struct *tsk
 	if (!list_empty(&tsk->cg_list)) {
 		spin_lock_irq(&css_set_lock);
 		css_set_move_task(tsk, cset, NULL, false);
+		list_add_tail(&tsk->cg_list, &cset->dying_tasks);
 		cset->nr_tasks--;
 		spin_unlock_irq(&css_set_lock);
 	} else {
@@ -5572,6 +5595,13 @@ void cgroup_release(struct task_struct *
 	do_each_subsys_mask(ss, ssid, have_release_callback) {
 		ss->release(task);
 	} while_each_subsys_mask();
+
+	if (use_task_css_set_links) {
+		spin_lock_irq(&css_set_lock);
+		css_set_skip_task_iters(task_css_set(task), task);
+		list_del_init(&task->cg_list);
+		spin_unlock_irq(&css_set_lock);
+	}
 }
 
 void cgroup_free(struct task_struct *task)