In-Reply-To: <20170825163754.08bda23f@luca>
References: <1502918443-30169-1-git-send-email-mathieu.poirier@linaro.org> <20170822142136.3604336e@luca> <20170825163754.08bda23f@luca>
From: Mathieu Poirier
Date: Fri, 25 Aug 2017 14:29:26 -0600
Subject: Re: [PATCH 0/7] sched/deadline: fix cpusets bandwidth accounting
To: Luca Abeni
Cc: Ingo Molnar, Peter Zijlstra, tj@kernel.org, vbabka@suse.cz, Li Zefan, akpm@linux-foundation.org, weiyongjun1@huawei.com, Juri Lelli, Steven Rostedt, Claudio Scordino, Daniel Bristot de Oliveira, "linux-kernel@vger.kernel.org", Tommaso Cucinotta
X-Mailing-List: linux-kernel@vger.kernel.org

On 25 August 2017 at 08:37, Luca Abeni wrote:
> Hi Mathieu,
>
> On Wed, 23 Aug 2017 13:47:13 -0600
> Mathieu Poirier wrote:
>
>> On 22 August 2017 at 06:21, Luca Abeni wrote:
>> > Hi Mathieu,
>>
>> Good day to you,
>>
>> >
>> > On Wed, 16 Aug 2017 15:20:36 -0600
>> > Mathieu Poirier wrote:
>> >
>> >> This is a renewed attempt at fixing a problem reported by Steve Rostedt [1]
>> >> where DL bandwidth accounting is not recomputed after CPUset and CPUhotplug
>> >> operations. When CPUhotplug and some CPUset manipulation take place root
>> >> domains are destroyed and new ones created, losing at the same time DL
>> >> accounting pertaining to utilisation.
>> >
>> > Thanks for looking at this longstanding issue! I am just back from
>> > vacations; in the next days I'll try your patches.
>> > Do you have some kind of scripts for reproducing the issue
>> > automatically? (I see that in the original email Steven described how
>> > to reproduce it manually; I just wonder if anyone already scripted the
>> > test).
>>
>> I didn't bother scripting it since it is so easy to do. I'm eager to
>> see how things work out on your end.
>
> I ran some tests with your patchset, and I confirm that it fixes the
> issue originally pointed out by Steven.
>

Good, at least it's a start.

> But I still need to run some more tests (I'll continue on Monday).
>
> I think I found an issue by:
> 1) creating two disjoint cpusets (CPUs 0 and 1 in the first cpuset,
>    CPUs 2 and 3 in the second one) and setting sched_load_balance to 0
> 2) starting a task in one of the two cpusets, and making it
>    SCHED_DEADLINE <--- up to here, everything looks fine
> 3) setting sched_load_balance to 1 <--- At this point, I think there is
>    a bug: the system has only one root domain, and the task utilization
>    is summed to it... But the task affinity mask is still the one of
>    the "old root domain" that was associated with the cpuset where the
>    task is executing.

I can reproduce the problem on my side as well. This is how CPUset
works and the expected behaviour. For normal tasks it isn't a problem
but I agree with you that for DL tasks, we need to address this.

>
> I still need to run some experiments about this.

Thanks for the time,
Mathieu

>
>
>
> Thanks,
> Luca
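
For readers following the three steps above, the mismatch can be sketched
as a toy model in plain Python. This is not kernel code and not part of
the patchset: RootDomain, Task, merge() and affinity_covers() are
hypothetical names made up for illustration, and the 95% cap assumes the
default sched_rt_runtime_us / sched_rt_period_us ratio. It only mimics
the bookkeeping: a per-root-domain bandwidth sum checked against
cap * number of CPUs, and a task affinity mask that is left untouched
when the root domains are merged.

```python
from dataclasses import dataclass
from fractions import Fraction

# Assumption: default admission cap of 95% per CPU
# (sched_rt_runtime_us / sched_rt_period_us = 950000 / 1000000).
DL_CAP = Fraction(95, 100)


@dataclass
class Task:
    """Hypothetical DL task: runtime/period reservation plus an affinity mask."""
    name: str
    runtime_us: int
    period_us: int
    affinity: frozenset

    @property
    def bw(self):
        # Bandwidth (utilization) of the reservation.
        return Fraction(self.runtime_us, self.period_us)


@dataclass
class RootDomain:
    """Toy stand-in for a root domain: a CPU span plus admitted DL bandwidth."""
    cpus: frozenset
    total_bw: Fraction = Fraction(0)

    def admit(self, task):
        # Admission test: the sum must stay within cap * number of CPUs.
        if self.total_bw + task.bw > DL_CAP * len(self.cpus):
            return False
        self.total_bw += task.bw
        return True


def merge(a, b):
    """Step 3: with load balancing back on, a single root domain remains."""
    return RootDomain(a.cpus | b.cpus, a.total_bw + b.total_bw)


def affinity_covers(task, rd):
    """Illustrative check: does the task's affinity span the whole root domain?"""
    return rd.cpus <= task.affinity


# Step 1: two disjoint cpusets, sched_load_balance = 0 -> two root domains.
rd_a = RootDomain(frozenset({0, 1}))
rd_b = RootDomain(frozenset({2, 3}))

# Step 2: a DL task (10 ms every 100 ms) starts in the first cpuset.
task = Task("dl-task", 10_000, 100_000, rd_a.cpus)
admitted = rd_a.admit(task)
consistent_before = affinity_covers(task, rd_a)

# Step 3: sched_load_balance = 1 -> one root domain over CPUs 0-3.
merged = merge(rd_a, rd_b)
consistent_after = affinity_covers(task, merged)

print(admitted, consistent_before)
print(merged.total_bw, sorted(merged.cpus), consistent_after)
```

In this model the task's 1/10 of bandwidth ends up accounted against
the merged four-CPU domain while its affinity mask still names only
CPUs 0 and 1, which is the inconsistency described in step 3.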