Message-ID: <1a322df842e0dc5646ef1198ea0bbe668d94646e.camel@posteo.de>
Subject: Re: SCHED_DEADLINE with CPU affinity
From: Philipp Stanner
To: Juri Lelli
Cc: linux-kernel@vger.kernel.org, Hagen Pfeifer, mingo@redhat.com,
    peterz@infradead.org, vincent.guittot@linaro.org,
    dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com,
    mgorman@suse.de
Date: Tue, 24 Dec 2019 11:03:29 +0100
In-Reply-To: <20191120085024.GB23227@localhost.localdomain>
References: <1574202052.1931.17.camel@posteo.de>
    <20191120085024.GB23227@localhost.localdomain>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 20.11.2019, 09:50 +0100 Juri Lelli wrote:
> Hi Philipp,

Hey Juri,

thanks so far; we indeed could make it work with exclusive CPU-sets.

> On 19/11/19 23:20, Philipp Stanner wrote:
> > from implementing our intended architecture.
> >
> > Now, the questions we're having are:
> >
> >  1. Why does the kernel do this, what is the problem with scheduling
> >     with SCHED_DEADLINE on a certain core? In contrast, how is it
> >     handled when you have single core systems etc.? Why this
> >     artificial limitation?
>
> Please have also a look (you only mentioned manpage so, in case you
> missed it) at
>
> https://elixir.bootlin.com/linux/latest/source/Documentation/scheduler/sched-deadline.rst#L667
>
> and the document in general should hopefully give you the answer about
> why we need admission control and current limitations regarding
> affinities.
>
> >  2. How can we possibly implement this? We don't want to use
> >     SCHED_FIFO, because out-of-control tasks would freeze the entire
> >     container.
>
> I experimented myself a bit with this kind of setup in the past and I
> think I made it work by pre-configuring exclusive cpusets (similarly as
> what detailed in the doc above) and then starting containers inside such
> exclusive sets with podman run --cgroup-parent option.
>
> I don't have proper instructions yet for how to do this (plan to put
> them together soon-ish), but please see if you can make it work with
> this hint.

I fear I have not yet quite understood why this "workaround" leads to
(presumably) the same results as sched_setaffinity() would. From what I
have read, I understand it as follows: for SCHED_DEADLINE, admission
control tries to guarantee that the requested policy can be executed. To
do so, it analyzes the current workload, taking especially the number of
cores into account.

Now, with a pre-configured set, the kernel knows which tasks will run on
which cores, and is therefore able to judge whether a process can be
deadline scheduled or not. But when using the default way, you could
start your processes as SCHED_OTHER, set SCHED_DEADLINE as policy, and
later many of them could suddenly call sched_setaffinity(), desiring to
run on the same core and thereby provoking collisions.

Is my understanding of the situation correct?

Merry Christmas,
P.
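
To make the admission-control reasoning above concrete, here is a minimal
sketch of the utilization test that sched-deadline.rst describes: a new
task is admitted only if the summed runtime/period of all deadline tasks
in a root domain stays within M * (rt_runtime / rt_period), with M the
number of CPUs in that domain. This is a simplified model in Python, not
kernel code. The class RootDomain and its methods are hypothetical names
chosen for illustration, and the 950000/1000000 limit is assumed to be
the stock value of sched_rt_runtime_us / sched_rt_period_us.

```python
from fractions import Fraction

# Assumed defaults of /proc/sys/kernel/sched_rt_runtime_us and
# /proc/sys/kernel/sched_rt_period_us on a stock kernel.
RT_RUNTIME_US = 950_000
RT_PERIOD_US = 1_000_000


class RootDomain:
    """Hypothetical model of one root domain, e.g. an exclusive cpuset."""

    def __init__(self, num_cpus):
        self.num_cpus = num_cpus
        self.tasks = []  # admitted tasks as (runtime_us, period_us)

    def total_utilization(self):
        # Sum of runtime/period over all tasks admitted so far.
        return sum((Fraction(r, p) for r, p in self.tasks), Fraction(0))

    def capacity(self):
        # Bandwidth available to deadline tasks in this domain.
        return self.num_cpus * Fraction(RT_RUNTIME_US, RT_PERIOD_US)

    def admit(self, runtime_us, period_us):
        # Simplified test: reject if the new total would exceed capacity.
        new_total = self.total_utilization() + Fraction(runtime_us, period_us)
        if new_total > self.capacity():
            return False
        self.tasks.append((runtime_us, period_us))
        return True


if __name__ == "__main__":
    one_cpu = RootDomain(num_cpus=1)  # e.g. an exclusive single-core cpuset
    for runtime_us, period_us in [(5_000, 10_000), (4_000, 10_000), (1_000, 10_000)]:
        print(runtime_us, period_us, one_cpu.admit(runtime_us, period_us))
```

In this model the test is only meaningful because the set of CPUs behind
a domain is fixed. If individual tasks could shrink their own affinity
mask after admission, the per-domain sum would no longer say anything
about the load on a single core, which is the collision scenario
described above.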