Date: Fri, 11 Aug 2017 15:09:57 -0400
From: Luiz Capitulino
To: Frederic Weisbecker
Cc: LKML, Peter Zijlstra, Chris Metcalf, Thomas Gleixner, Christoph Lameter, "Paul E . McKenney", Ingo Molnar, Mike Galbraith, Rik van Riel, Wanpeng Li
Subject: Re: [RFC PATCH 7/9] housekeeping: Use own boot option, independant from nohz
Message-ID: <20170811123927.33e094f3@redhat.com>
In-Reply-To: <1500643290-25842-8-git-send-email-fweisbec@gmail.com>
References: <1500643290-25842-1-git-send-email-fweisbec@gmail.com> <1500643290-25842-8-git-send-email-fweisbec@gmail.com>
Organization: Red Hat

On Fri, 21 Jul 2017 15:21:28 +0200
Frederic Weisbecker wrote:

> The housekeeping is currently driven by nohz_full where any CPU that
> is not in the nohz_full range is considered as a housekeeper. This is
> a design mistake because nohz is just a detail among all the existing
> isolation features. Nohz shouldn't imply anything else than tick related
> things.
>
> We rather want to drive all the isolation features from the housekeeping
> subsystem which is responsible for all the work that can be either
> affined (unpinned workqueues, timers, kthreads, ...) or offloaded
> (scheduler tick, ...).

That makes a lot of sense. I think this is moving in the right
direction. I have a comment below though.

>
> Let's start with a boot option to define the housekeepers. We should be
> able to further enhance that through cpusets.
>
> Signed-off-by: Frederic Weisbecker
> Cc: Chris Metcalf
> Cc: Rik van Riel
> Cc: Peter Zijlstra
> Cc: Thomas Gleixner
> Cc: Mike Galbraith
> Cc: Ingo Molnar
> Cc: Christoph Lameter
> Cc: Paul E. McKenney
> Cc: Wanpeng Li
> Cc: Luiz Capitulino
> ---
>  include/linux/housekeeping.h |  2 --
>  init/main.c                  |  2 --
>  kernel/housekeeping.c        | 22 ++++++++++------------
>  3 files changed, 10 insertions(+), 16 deletions(-)
>
> diff --git a/include/linux/housekeeping.h b/include/linux/housekeeping.h
> index 320cc2b..ba769c8 100644
> --- a/include/linux/housekeeping.h
> +++ b/include/linux/housekeeping.h
> @@ -11,7 +11,6 @@ extern int housekeeping_any_cpu(void);
>  extern const struct cpumask *housekeeping_cpumask(void);
>  extern void housekeeping_affine(struct task_struct *t);
>  extern bool housekeeping_test_cpu(int cpu);
> -extern void __init housekeeping_init(void);
>
>  #else
>
> @@ -26,7 +25,6 @@ static inline const struct cpumask *housekeeping_cpumask(void)
>  }
>
>  static inline void housekeeping_affine(struct task_struct *t) { }
> -static inline void housekeeping_init(void) { }
>  #endif /* CONFIG_NO_HZ_FULL */
>
>  static inline bool housekeeping_cpu(int cpu)
> diff --git a/init/main.c b/init/main.c
> index 9904a1e..9789ab7 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -46,7 +46,6 @@
>  #include
>  #include
>  #include
> -#include
>  #include
>  #include
>  #include
> @@ -608,7 +607,6 @@ asmlinkage __visible void __init start_kernel(void)
>  	early_irq_init();
>  	init_IRQ();
>  	tick_init();
> -	housekeeping_init();
>  	rcu_init_nohz();
>  	init_timers();
>  	hrtimers_init();
> diff --git a/kernel/housekeeping.c b/kernel/housekeeping.c
> index f8be7e6..a54765d 100644
> --- a/kernel/housekeeping.c
> +++ b/kernel/housekeeping.c
> @@ -45,23 +45,21 @@ bool housekeeping_test_cpu(int cpu)
>  	return true;
>  }
>
> -void __init housekeeping_init(void)
> +/* Parse the boot-time housekeeping CPU list from the kernel parameters. */
> +static int __init housekeeping_setup(char *str)
>  {
> -	if (!tick_nohz_full_enabled())
> -		return;
> -
> -	if (!alloc_cpumask_var(&housekeeping_mask, GFP_KERNEL)) {
> -		WARN(1, "NO_HZ: Can't allocate not-full dynticks cpumask\n");
> -		cpumask_clear(tick_nohz_full_mask);
> -		tick_nohz_full_running = false;
> -		return;
> +	alloc_bootmem_cpumask_var(&housekeeping_mask);
> +	if (cpulist_parse(str, housekeeping_mask) < 0) {
> +		pr_warn("Housekeeping: Incorrect cpumask\n");
> +		free_bootmem_cpumask_var(housekeeping_mask);
> +		return 1;
>  	}
>
> -	cpumask_andnot(housekeeping_mask,
> -		       cpu_possible_mask, tick_nohz_full_mask);
> -
>  	static_branch_enable(&housekeeping_overriden);
>
>  	/* We need at least one CPU to handle housekeeping work */
>  	WARN_ON_ONCE(cpumask_empty(housekeeping_mask));
> +
> +	return 1;
>  }
> +__setup("housekeeping=", housekeeping_setup);

Am I right that from now on nohz_full= users will also have to specify
housekeeping= in order to get nohz_full working? If that's correct, then
won't this patch break nohz_full for existing setups?
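
To make the compatibility concern concrete, here is a rough user-space
sketch (not kernel code, and not tested against this series) of a
fallback that might keep existing nohz_full= setups working: when
housekeeping= is absent, derive the mask from nohz_full= the way the
removed cpumask_andnot() call did. The type cpumask_model_t and the
function pick_housekeeping_mask() are made up for illustration; real
code would use struct cpumask and the cpumask_*() helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model: one bit per CPU, up to 64 CPUs. */
typedef uint64_t cpumask_model_t;

/*
 * Pick the housekeeping mask (illustrative only):
 * - housekeeping= given: use it, restricted to possible CPUs.
 * - only nohz_full= given: every possible CPU outside the nohz_full
 *   range is a housekeeper, as before this patch.
 * - neither given: every possible CPU is a housekeeper.
 */
static cpumask_model_t pick_housekeeping_mask(cpumask_model_t possible,
					      bool hk_given,
					      cpumask_model_t hk,
					      bool nohz_given,
					      cpumask_model_t nohz_full)
{
	/* Assumption of the model: at least one CPU is possible. */
	assert(possible != 0);

	if (hk_given)
		return hk & possible;
	if (nohz_given)
		return possible & ~nohz_full;	/* mirrors the removed cpumask_andnot() */
	return possible;
}
```

With 16 possible CPUs and nohz_full=15 alone, this model yields CPUs
0-14 as housekeepers, which is what the old code would have computed.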

Also, I just gave this series a try and got this:

[ 0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-4.13.0-rc4+ root=/dev/mapper/rhel_virtlab508-root ro crashkernel=auto rd.lvm.lv=rhel_virtlab508/root rd.lvm.lv=rhel_virtlab508/swap console=ttyS1,115200 LANG=en_US.UTF-8 housekeeping=0,2,4,6,8,10,12,14,1 isolcpus=15 nohz_full=15 intel_pstate=disable
[ 0.000000] static_key_slow_inc used before call to jump_label_init
[ 0.000000] ------------[ cut here ]------------
[ 0.000000] WARNING: CPU: 0 PID: 0 at kernel/jump_label.c:108 static_key_slow_inc+0x86/0xa0
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.13.0-rc4+ #2
[ 0.000000] Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.2.6 06/08/2015
[ 0.000000] task: ffffffffb6010480 task.stack: ffffffffb6000000
[ 0.000000] RIP: 0010:static_key_slow_inc+0x86/0xa0
[ 0.000000] RSP: 0000:ffffffffb6003d98 EFLAGS: 00010046 ORIG_RAX: 0000000000000000
[ 0.000000] RAX: 0000000000000037 RBX: ffffffffb66aa780 RCX: ffffffffb6061308
[ 0.000000] RDX: 0000000000000000 RSI: 0000000000000082 RDI: 0000000000000002
[ 0.000000] RBP: ffffffffb6003da0 R08: 6b5f636974617473 R09: 00000000000001e4
[ 0.000000] R10: 776f6c735f79656b R11: 0000000000000000 R12: ffff972c3ffd1cfe
[ 0.000000] R13: ffffffffffffffff R14: 0000000000000000 R15: 000000000000000d
[ 0.000000] FS: 0000000000000000(0000) GS:ffff97282ea00000(0000) knlGS:0000000000000000
[ 0.000000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.000000] CR2: ffff972905974000 CR3: 0000000545209000 CR4: 00000000000406b0
[ 0.000000] Call Trace:
[ 0.000000] static_key_enable+0x1d/0x30
[ 0.000000] housekeeping_setup+0x5a/0x7e
[ 0.000000] unknown_bootoption+0x8b/0x19a
[ 0.000000] parse_args+0x224/0x3b0
[ 0.000000] ? set_init_arg+0x5a/0x5a
[ 0.000000] start_kernel+0x209/0x4cd
[ 0.000000] ? set_init_arg+0x5a/0x5a
[ 0.000000] ? early_idt_handler_array+0x120/0x120
[ 0.000000] x86_64_start_reservations+0x24/0x26
[ 0.000000] x86_64_start_kernel+0x14c/0x16f
[ 0.000000] secondary_startup_64+0x9f/0x9f
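
The warning looks like an ordering problem: going by the trace,
housekeeping_setup() runs from parse_args() in start_kernel(), and the
message says jump_label_init() has not run yet at that point. One
possible way out, sketched below as a user-space model rather than a
kernel patch, is to only record the request in the __setup() handler
and flip the static key from a later init hook. All model_* names and
the three flags are hypothetical stand-ins, not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; none are real kernel symbols. */
static bool jump_label_ready;	/* set once the modelled jump_label_init() ran */
static bool hk_requested;	/* housekeeping= was seen on the command line */
static bool hk_key_enabled;	/* models the housekeeping_overriden static key */

/* Models the __setup() handler, which runs during parameter parsing. */
static void model_housekeeping_setup(void)
{
	/* Only record the request; do not touch the static key this early. */
	hk_requested = true;
}

/* Models jump_label_init(). */
static void model_jump_label_init(void)
{
	jump_label_ready = true;
}

/* Models a later init hook that must run after jump_label_init(). */
static void model_housekeeping_init(void)
{
	assert(jump_label_ready);
	if (hk_requested)
		hk_key_enabled = true;
}
```

In this model the key stays off after the setup handler returns and is
only enabled once the later hook runs, so the ordering that triggers
the WARNING above would not occur.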