From: carver4lio@163.com
To: mingo@redhat.com
Cc: juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, linux-kernel@vger.kernel.org, carver4lio@163.com, Hailong Liu
Subject: [PATCH] sched/rt:fix the missing of rt_rq runtime check in rt-period timer
Date: Sat, 5 Dec 2020 11:38:01 +0800
Message-Id: <20201205033801.6924-1-carver4lio@163.com>
X-Mailer: git-send-email 2.17.1

From: Hailong Liu

On a system with isolated CPUs, rq->rd->span is split into two different
parts: one for the isolated CPUs and another for the non-isolated CPUs.

When CONFIG_RT_GROUP_SCHED is enabled, the sched_rt_period_timer handler
updates rt_time and rt_runtime for every CPU in rq(this_cpu)->rd->span.
CPUs outside this_cpu's rd->span are therefore missed by the handler
when CONFIG_RT_GROUP_SCHED is enabled and isolated CPUs are present in
the system.

For example, the problem can be triggered as follows on my 8-core machine:
1 enable CONFIG_RT_GROUP_SCHED=y, and boot the kernel with the command line
  "isolcpus=4-7"
2 create a child group and initialize it:
  mount -t cgroup -o cpu cpu /sys/fs/cgroup
  mkdir /sys/fs/cgroup/child0
  echo 950000 > /sys/fs/cgroup/child0/cpu.rt_runtime_us
3 run two rt-loop tasks, assume their pids are $pid1 and $pid2
4 bind one rt task to the isolated CPUs:
  taskset -p 0xf0 $pid2
5 add the tasks created above to the child cpu group:
  echo $pid1 > /sys/fs/cgroup/child0/tasks
  echo $pid2 > /sys/fs/cgroup/child0/tasks
6 check what happened:
  "top": one of the tasks shows no CPU usage, but its state is "R"
  "kill": the task on the problematic rt_rq can't be killed

This patch fixes the problem by making the period timer walk
cpu_online_mask for every task group when CONFIG_RT_GROUP_SCHED is
enabled, instead of only for the root task group.

Signed-off-by: Hailong Liu
---
 kernel/sched/rt.c | 15 +++------------
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 49ec096a8..c5c39695c 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -855,19 +855,10 @@ static int do_sched_rt_period_timer(struct rt_bandwidth *rt_b, int overrun)
 	int i, idle = 1, throttled = 0;
 	const struct cpumask *span;
 
-	span = sched_rt_period_mask();
 #ifdef CONFIG_RT_GROUP_SCHED
-	/*
-	 * FIXME: isolated CPUs should really leave the root task group,
-	 * whether they are isolcpus or were isolated via cpusets, lest
-	 * the timer run on a CPU which does not service all runqueues,
-	 * potentially leaving other CPUs indefinitely throttled. If
-	 * isolation is really required, the user will turn the throttle
-	 * off to kill the perturbations it causes anyway. Meanwhile,
-	 * this maintains functionality for boot and/or troubleshooting.
-	 */
-	if (rt_b == &root_task_group.rt_bandwidth)
-		span = cpu_online_mask;
+	span = cpu_online_mask;
+#else
+	span = sched_rt_period_mask();
 #endif
 	for_each_cpu(i, span) {
 		int enqueue = 0;
-- 
2.17.1
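
To illustrate why the choice of span matters, here is a minimal userspace
sketch, not kernel code. It assumes an 8-CPU machine booted with
"isolcpus=4-7" and models each per-CPU rt_rq as a single throttle flag.
All names (toy_rt_rq, toy_period_timer, TOY_ONLINE_MASK,
TOY_HOUSEKEEPING_SPAN) are hypothetical, and the mask 0x0f is only an
assumption about what rd->span looks like from a non-isolated CPU.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_CPUS 8

/* Hypothetical masks for an 8-CPU machine booted with "isolcpus=4-7". */
#define TOY_ONLINE_MASK       0xffu /* CPUs 0-7, stands in for cpu_online_mask */
#define TOY_HOUSEKEEPING_SPAN 0x0fu /* CPUs 0-3, assumed rd->span seen from CPU 0 */

/* Toy stand-in for a per-CPU rt_rq; only the throttle flag is modelled. */
struct toy_rt_rq {
	int throttled;
};

/*
 * Toy period-timer pass: unthrottle every rt_rq whose CPU is in 'span'
 * and return how many rt_rqs are still throttled afterwards.
 * The real handler also balances rt_time against rt_runtime; that
 * accounting is deliberately left out of this sketch.
 */
int toy_period_timer(struct toy_rt_rq *rqs, unsigned int span)
{
	int cpu, still_throttled = 0;

	assert(rqs != NULL);

	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		if (span & (1u << cpu))
			rqs[cpu].throttled = 0;	/* CPU is visited by the timer */
		still_throttled += rqs[cpu].throttled;
	}
	return still_throttled;
}
```

With the rt_rqs of CPU 1 and CPU 5 throttled, a pass over
TOY_HOUSEKEEPING_SPAN leaves CPU 5 throttled because the walk never
reaches it, while a pass over TOY_ONLINE_MASK clears both. That is the
difference between walking rd->span and walking cpu_online_mask for a
child task group.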