Received: by 2002:a25:868d:0:0:0:0:0 with SMTP id z13csp1662630ybk; Mon, 11 May 2020 01:04:24 -0700 (PDT) X-Google-Smtp-Source: APiQypLmIRoMWctKtiP6f+uvEUTFlvyFjxJzT4Xfi0p/aelan59hQLQaqB8Lje/k+sPErFU10O4e X-Received: by 2002:aa7:c492:: with SMTP id m18mr12129181edq.346.1589184264053; Mon, 11 May 2020 01:04:24 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1589184264; cv=none; d=google.com; s=arc-20160816; b=cQljM3gyc27AxWuM/ODItCP5j3Z2rzK+s5crNV0op8GJGP93eB6OhDO9tOzo+xW0my JQ9R89g4mIi97+8SV/LVU1xuS/OpaF39dsuF9rVN27lG9xCuvgiM/OZc2xt7v/gfkGoQ CT5nKeEUCCE64lX6oPtAhh5VGjhHE17sDEJcZitWjzuZF5DNYjGvJDrBKnx4faMtsd4G aHa0Duio3fexKvBazCV5xGpeJFFFHfD/9k//59lW4/KoQVmpLvIQ+APWbWSjPU+TKzjo TjnSI/Qu19zUfS0vmtuzaTsbUhzkXRJoXPNP0H7HNbpPmaKbVApqn81gzy/zEBmzfZMS EgFg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :dkim-signature; bh=p4xy/gFUAuwiSRnEJ/0Zyb9V6KmjnNYS+I/N0r7gbyo=; b=sBDwngNbPHVW/iZSWyXC+UPTDM0EW5SosgtGeJlg0LbGhjru9IANA7j8wtizsFrSSm nKjhBcWRd648gKSk0KdJmPMPJ3Yq5MuiGEWJg4HdpURS+K4LwhNRT856XYJczxGfSXh7 2Ot46Fiu3u/AmCGN/mthPcOjrHtJ/uwSZ1MhdxgL7ZTvWbjYpMnR4u67Sd+zddx3ZgwQ mRl3bfH5XcTU0J/sAlrPIx+mOmtPCq4LWMCedW0Rf4uqzKnnuy3xF9nqEq7TyAMLTLJL FAVAmd4VAdER5yuDqlHsI4K1I9G4ORTinAWj4B5qjY50mIYY18eCsLkG4NqmWutfv68A D/Rw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="IRs5hd/g"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id oz15si5648464ejb.24.2020.05.11.01.04.00; Mon, 11 May 2020 01:04:24 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="IRs5hd/g"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728573AbgEKIBb (ORCPT + 99 others); Mon, 11 May 2020 04:01:31 -0400 Received: from us-smtp-delivery-1.mimecast.com ([205.139.110.120]:49192 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726082AbgEKIBa (ORCPT ); Mon, 11 May 2020 04:01:30 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1589184088; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=p4xy/gFUAuwiSRnEJ/0Zyb9V6KmjnNYS+I/N0r7gbyo=; b=IRs5hd/gwLvjv0t2XXUd+rPxQlxvAjfl0qEVYFqFNnanDJj1meq1wP0mkB5q7i9eVvm/9Y M5y9/mqXbqn78Uz0DBUOXsCJ7y7rdJ3DFtzhDBwr/WgC63wmNS40OluPLo1C9cpHbZ3v2n F8hv8VmVDck6FNMcsCx6IuAxPiTTWd8= Received: from mail-wr1-f70.google.com (mail-wr1-f70.google.com [209.85.221.70]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-82-7U-PuGQLOKOHm-DsJ7HoAw-1; Mon, 11 May 2020 04:01:27 -0400 X-MC-Unique: 7U-PuGQLOKOHm-DsJ7HoAw-1 Received: by mail-wr1-f70.google.com with SMTP id y7so4841421wrd.12 for ; Mon, 11 May 2020 01:01:26 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=p4xy/gFUAuwiSRnEJ/0Zyb9V6KmjnNYS+I/N0r7gbyo=; b=fHj87fP4xxJkA3fd9WMoi+5haEkP1ae1i5cUKN+FAXQzw2ufWJr3o2t5HH5Z/m6p0h XCvZYS0PiljJb5arFZ7Cqg4qvipQirdF7Qe09f5zky8WFfreZXrdoXyRlO6QJusOnzIg UdBCKPdh+jH+2YUcwX6q5uvnK+78W7IEI7WiKhUi+ydH05c1sSI/k1PBDGtsbYq3Hulo VAzE9Cn5ql0Pnbar8hK/XNyeKevLNCcm8y6YZIBAJXGm2tvwa1nZ/zIHyOb4ghN9Q8F+ y4oNbZTKpKbWVKyiF/cTG4PHGCR/h9PQN5mhZAGAZcXZLQer7+mHkYCJslEnCvH4+htk Cj8w== X-Gm-Message-State: AGi0PuZ4nly8x3qYSJtjK56etoaJJDiq1vGdVEKBDofhKNlfDJF9H5a6 TvcSuFvTiPdP/HdrqQZJv+XJ3H+uY0USZRx9eL4jPOHJAuNyzn24tCvn3hEx/DWJ44Y7yFbPUDa uVNuO9SkNTFyXUWRP4lU2IjV7 X-Received: by 2002:a7b:c399:: with SMTP id s25mr29022667wmj.169.1589184085594; Mon, 11 May 2020 01:01:25 -0700 (PDT) X-Received: by 2002:a7b:c399:: with SMTP id s25mr29022632wmj.169.1589184085258; Mon, 11 May 2020 01:01:25 -0700 (PDT) Received: from localhost.localdomain ([151.29.188.60]) by smtp.gmail.com with ESMTPSA id x5sm17344688wro.12.2020.05.11.01.01.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 11 May 2020 01:01:24 -0700 (PDT) Date: Mon, 11 May 2020 10:01:22 +0200 From: Juri Lelli To: Dietmar Eggemann Cc: Ingo Molnar , Peter Zijlstra , Vincent Guittot , Steven Rostedt , Luca Abeni , Daniel Bristot de Oliveira , Wei Wang , Quentin Perret , Alessio Balsini , Pavan Kondeti , Patrick Bellasi , Morten Rasmussen , Valentin Schneider , Qais Yousef , linux-kernel@vger.kernel.org Subject: Re: [PATCH v2 3/6] sched/deadline: Add dl_bw_capacity() Message-ID: <20200511080122.GI264022@localhost.localdomain> References: <20200427083709.30262-1-dietmar.eggemann@arm.com> <20200427083709.30262-4-dietmar.eggemann@arm.com> <23bde551-0d91-4bfe-ce65-40af9fbfdef9@arm.com> <20200506123738.GJ17381@localhost.localdomain> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 06/05/20 17:09, Dietmar Eggemann wrote: > On 06/05/2020 14:37, Juri Lelli wrote: > > On 06/05/20 12:54, Dietmar Eggemann wrote: > >> On 27/04/2020 10:37, Dietmar Eggemann wrote: > > [...] > > >> There is an issue w/ excl. cpusets and cpuset.sched_load_balance=0. The > >> latter is needed to demonstrate the problem since DL task affinity can't > >> be altered. > >> > >> A CPU in such a cpuset has its rq attached to def_root_domain which does > >> not have its 'sum_cpu_capacity' properly set. > > > > Hummm, but if sched_load_balance is disabled it means that we've now got > > a subset of CPUs which (from a DL AC pow) are partitioned. So, I'd tend > > Yes, the CPUs of the cpuset w/ cpuset.sched_load_balance=0 (cpuset B in > the example). > > > to say that we actually want to check new tasks bw requirement against > > the available bandwidth of the particular CPU they happen to be running > > (and will continue to run) when setscheduler is called. > > By 'available bandwidth of the particular CPU' you refer to > '\Sum_{cpu_rq(i)->rd->span} CPU capacity', right? No. I was referring to the single CPU capacity. The capacity of the CPU where a task is running when setscheduler is called for it (and DL AC performed). See below, maybe more clear why I wondered about this case.. > This is what this fix tries to achieve. Regardless whether cpu_rq(i)->rd > is a 'real' rd or the def_root_domain, dl_bw_capacity() will now always > return '\Sum_{cpu_rq(i)->rd->span} CPU capacity' > > > If then load balance is enabled again, AC check we did above should > > still be valid for all tasks admitted in the meantime, no? > > Example (w/ this fix) on Juno [L b b L L L], capacity_orig_of(L)=446 : > > mkdir /sys/fs/cgroup/cpuset/A > echo 0 > /sys/fs/cgroup/cpuset/A/cpuset.mems > echo 1 > /sys/fs/cgroup/cpuset/A/cpuset.cpu_exclusive > echo 0-2 > /sys/fs/cgroup/cpuset/A/cpuset.cpus > > mkdir /sys/fs/cgroup/cpuset/B > echo 0 > /sys/fs/cgroup/cpuset/B/cpuset.mems > echo 1 > /sys/fs/cgroup/cpuset/B/cpuset.cpu_exclusive > echo 3-5 > /sys/fs/cgroup/cpuset/B/cpuset.cpus > > echo 0 > /sys/fs/cgroup/cpuset/B/cpuset.sched_load_balance > echo 0 > /sys/fs/cgroup/cpuset/cpuset.sched_load_balance > > echo $$ > /sys/fs/cgroup/cpuset/B/tasks > chrt -d --sched-runtime 8000 --sched-period 16000 -p 0 $$ > > ... > [ 144.920102] __dl_bw_capacity CPU3 rd->span=3-5 return 1338 > [ 144.925607] sched_dl_overflow: [bash 1999] task_cpu(p)=3 cap=1338 cpus_ptr=3-5 So, here you are checking new task bw against 1338 which is 3*L capacity. However, since load balance is disabled at this point for 3-5, once admitted the task will only be able to run on CPU 3. Now, if more tasks on CPU 3 are admitted the same way (up to 1338) I believe they will start to experience deadline misses because only 446 will be actually available to them until load balance is enabled below and they are then free to migrate to CPUs 4 and 5. Does it makes sense? > [ 144.932841] __dl_bw_capacity CPU3 rd->span=3-5 return 1338 > ... > > echo 1 > /sys/fs/cgroup/cpuset/B/cpuset.sched_load_balance > > echo $$ > /sys/fs/cgroup/cpuset/B/tasks > chrt -d --sched-runtime 8000 --sched-period 16000 -p 0 $$ > > ... > [ 254.367982] __dl_bw_capacity CPU5 rd->span=3-5 return 1338 > [ 254.373487] sched_dl_overflow: [bash 2052] task_cpu(p)=5 cap=1338 cpus_ptr=3-5 > [ 254.380721] __dl_bw_capacity CPU5 rd->span=3-5 return 1338 > ... > > Regardless of 'B/cpuset.sched_load_balance' > '\Sum_{cpu_rq(i)->rd->span} CPU capacity' stays 1338 (3*446) > > So IMHO, DL AC check stays intact. >