X-Mailing-List: linux-kernel@vger.kernel.org
Date: Wed, 10 Apr 2024 14:26:07 +1000
Subject: Re: Nohz_full on boot CPU is broken (was: Re: [PATCH v2 1/1] wq: Avoid using isolated cpus' timers on queue_delayed_work)
From: "Nicholas Piggin"
To: "Oleg Nesterov", "Frederic Weisbecker"
Cc: "Tejun Heo", "Leonardo Bras", "Thomas Gleixner", "Peter Zijlstra", "Ingo Molnar", "Lai Jiangshan", "Junyao Zhao", "Chris von Recklinghausen"
References: <20240130010046.2730139-2-leobras@redhat.com> <20240402105847.GA24832@redhat.com> <20240403203814.GD31764@redhat.com> <20240405140449.GB22839@redhat.com> <20240407130914.GA10796@redhat.com> <20240409130727.GC29396@redhat.com>
In-Reply-To: <20240409130727.GC29396@redhat.com>

On Tue Apr 9, 2024 at 11:07 PM AEST, Oleg Nesterov wrote:
> On 04/09, Frederic Weisbecker wrote:
> >
> > On Sun, Apr 07, 2024 at 03:09:14PM +0200, Oleg Nesterov wrote:
> > > Well, the changelog says
> > >
> > > 	nohz_full has been trialed at a large supercomputer site and found to
> > > 	significantly reduce jitter. In order to deploy it in production, they
> > > 	need CPU0 to be nohz_full
> > >
> > > so I guess this feature has users.

It was the Summit/Sierra supercomputers, which I suppose are still
using it. IIRC it was an existing job scheduler system they had which
ran housekeeping work on the highest numbered core in a socket and
allocated jobs from the lowest number. We certainly asked if they could
change that, but apparently that was difficult.

I'm surprised nobody ran into it on x86 though. Maybe the system had
more jitter (SMT4 doesn't help), so maybe isolcpus= wasn't needed in
other cases.

The other thing is that powerpc can boot on an arbitrary CPU number.
So if the boot CPU must be in the HK mask, that could randomly break
your boot config.

> > >
> > > But after the commit aae17ebb53cd3da ("workqueue: Avoid using isolated cpus'
> > > timers on queue_delayed_work") the kernel will crash at boot time if the boot
> > > CPU is nohz_full.
> > >
> > Right but there are many possible calls to housekeeping on boot before the
> > first housekeeper becomes online.
>
> Well, it seems that other callers are more or less fine in this respect...
> At least the kernel boots fine with that commit reverted.
>
> But a) I didn't try to really check, and b) this doesn't matter.
>
> I agree, and that is why I never blamed this change in queue_delayed_work().
>
> OK, you seem to agree with the patch below, I'll write the changelog/comment
> and send it "officially".
>
> Or did I misunderstand you?

Thanks for this. Taking a while to page this back in: the intention is
for housekeeping to be done by the boot CPU until a housekeeper is
awake, so returning smp_processor_id() seems like the right thing to do
here for ephemeral jobs like timers and work, provided that the CPU /
mask is not stored somewhere long term by the caller. That would be a
concern for things that set an affinity, like kthread, sched, maybe
managed irqs, and such.

There are not many callers of housekeeping_any_cpu(), so that's easy
enough to verify. But similar functions like housekeeping_cpumask() and
others could be an issue or at least a foot-gun; I'm not sure how well
I convinced myself of those.

Could you test like this?

  WARN_ON_ONCE(system_state == SYSTEM_RUNNING || type != HK_TYPE_TIMER);

With a comment to say other ephemeral mask types could be exempted if
needed.

It would also be nice to warn for cases that would be bugs if the boot
CPU was not in the HK mask. Could that be done by having a
housekeepers_online() call after smp_init() (maybe at the start of
sched_init_smp()) that could verify there is at least one online, and
set a flag that could be used to create warnings?

Thanks,
Nick

>
> Oleg.
>
>
> diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
> index 373d42c707bc..e912555c6fc8 100644
> --- a/kernel/sched/isolation.c
> +++ b/kernel/sched/isolation.c
> @@ -46,7 +46,11 @@ int housekeeping_any_cpu(enum hk_type type)
>  			if (cpu < nr_cpu_ids)
>  				return cpu;
>  
> -			return cpumask_any_and(housekeeping.cpumasks[type], cpu_online_mask);
> +			cpu = cpumask_any_and(housekeeping.cpumasks[type], cpu_online_mask);
> +			if (cpu < nr_cpu_ids)
> +				return cpu;
> +
> +			WARN_ON_ONCE(system_state == SYSTEM_RUNNING);
>  		}
>  	}
>  	return smp_processor_id();
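
To illustrate the fallback order and the warning condition discussed in
this thread, here is a minimal user-space sketch in C. It is not kernel
code and not the patch itself. The 64-bit masks, current_cpu, warn_count
and the sketch_* function names are hypothetical stand-ins for
housekeeping.cpumasks[], cpu_online_mask, smp_processor_id() and
WARN_ON_ONCE(). It assumes the requested type has a housekeeping mask
configured, it skips the NUMA-closest lookup that comes first in the
real function, and it counts every warning rather than warning once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins; these are not the kernel's definitions. */
enum hk_type { HK_TYPE_TIMER, HK_TYPE_MISC, HK_TYPE_MAX };
enum system_states { SYSTEM_BOOTING, SYSTEM_RUNNING };

#define NR_CPU_IDS 64

static uint64_t hk_cpumasks[HK_TYPE_MAX];	/* bit n set: CPU n is a housekeeper */
static uint64_t cpu_online_mask_bits;		/* bit n set: CPU n is online */
static enum system_states system_state = SYSTEM_BOOTING;
static int current_cpu;				/* stand-in for smp_processor_id() */
static int warn_count;				/* stand-in for WARN_ON_ONCE(); counts every hit */

/* Lowest set bit, or NR_CPU_IDS when the mask is empty. */
int mask_first(uint64_t mask)
{
	for (int cpu = 0; cpu < NR_CPU_IDS; cpu++)
		if (mask & (UINT64_C(1) << cpu))
			return cpu;
	return NR_CPU_IDS;
}

/*
 * Prefer an online housekeeper. If none is online, fall back to the
 * current CPU, and warn unless this is a boot-time request for an
 * ephemeral type (only HK_TYPE_TIMER is exempted in this sketch).
 */
int sketch_housekeeping_any_cpu(enum hk_type type)
{
	int cpu;

	assert(type < HK_TYPE_MAX);

	cpu = mask_first(hk_cpumasks[type] & cpu_online_mask_bits);
	if (cpu < NR_CPU_IDS)
		return cpu;

	if (system_state == SYSTEM_RUNNING || type != HK_TYPE_TIMER) {
		warn_count++;
		fprintf(stderr, "no online housekeeper for type %d\n", (int)type);
	}
	return current_cpu;
}

/*
 * Check meant to run once secondary CPUs are up: every configured
 * housekeeping mask should contain at least one online CPU.
 */
bool sketch_housekeepers_online(void)
{
	for (int type = 0; type < HK_TYPE_MAX; type++) {
		if (hk_cpumasks[type] &&
		    !(hk_cpumasks[type] & cpu_online_mask_bits))
			return false;
	}
	return true;
}
```

With CPU 3 as the only timer housekeeper and only CPU 0 online during
boot, sketch_housekeeping_any_cpu(HK_TYPE_TIMER) returns the current CPU
without a warning. Once CPU 3 is online it returns 3. If no housekeeper
is online while system_state is SYSTEM_RUNNING, it still returns the
current CPU but bumps warn_count.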