Date: Fri, 5 Apr 2024 23:52:03 +0200
From: Frederic Weisbecker
To: Oleg Nesterov, Nick Piggin
Cc: Tejun Heo, Leonardo Bras, Thomas Gleixner, Peter Zijlstra, Ingo Molnar,
    Lai Jiangshan, linux-kernel@vger.kernel.org, Junyao Zhao,
    Chris von Recklinghausen
Subject: Nohz_full on boot CPU is broken (was: Re: [PATCH v2 1/1] wq: Avoid using isolated cpus' timers on queue_delayed_work)
References: <20240130010046.2730139-2-leobras@redhat.com>
    <20240402105847.GA24832@redhat.com>
    <20240403203814.GD31764@redhat.com>
    <20240405140449.GB22839@redhat.com>
In-Reply-To: <20240405140449.GB22839@redhat.com>

+Cc Nick

On Fri, Apr 05, 2024 at 04:04:49PM +0200, Oleg Nesterov wrote:
> On 04/03, Oleg Nesterov wrote:
> >
> > > > OTOH, Documentation/timers/no_hz.rst says
> > > >
> > > >     Therefore, the
> > > >     boot CPU is prohibited from entering adaptive-ticks mode.  Specifying a
> > > >     "nohz_full=" mask that includes the boot CPU will result in a boot-time
> > > >     error message, and the boot CPU will be removed from the mask.
> > > >
> > > > and this doesn't match the reality.
> > >
> > > Don't some archs allow the boot CPU to go down too tho? If so, this doesn't
> > > really solve the problem, right?
> >
> > I do not know. But I thought about this too.
> >
> > In the context of this discussion we do not care if the boot CPU goes down.
> > But we need at least one housekeeping CPU after cpu_down(). The comment in
> > cpu_down_maps_locked() says
> >
> >     Also keep at least one housekeeping cpu onlined
> >
> > but it checks HK_TYPE_DOMAIN, and I do not know (and it is too late for me
> > to try to read the code ;) if housekeeping.cpumasks[HK_TYPE_TIMER] can get
> > empty or not.
>
> This nearly killed me, but I managed to convince myself we shouldn't worry
> about cpu_down().
>
> HK_FLAG_TIMER implies HK_FLAG_TICK.
>
> HK_FLAG_TICK implies tick_nohz_full_setup() which sets
> tick_nohz_full_mask = non_housekeeping_mask.
>
> When tick_setup_device() is called on a housekeeping CPU it does
>
>     else if (tick_do_timer_boot_cpu != -1 &&
>                     !tick_nohz_full_cpu(cpu)) {
>         tick_take_do_timer_from_boot();
>         tick_do_timer_boot_cpu = -1;
>
>
> and this sets tick_do_timer_cpu = first-housekeeping-cpu.
>
> cpu_down(tick_do_timer_cpu) will fail, tick_nohz_cpu_down() will nack it.
>
> So cpu_down() can't make housekeeping.cpumasks[HK_FLAG_TIMER] empty and I
> still think that the change below is the right approach.
>
> But probably WARN_ON() in housekeeping_any_cpu() makes sense anyway.
>
> What do you think?

Good analysis of this nasty housekeeping vs. tick code. I have promised so many
times to clean up this mess, but things keep piling up.

It is indeed possible for the boot CPU to be a nohz_full CPU and, as you can
see, it's only half-working. It has been this way ever since:

    08ae95f4fd3b (nohz_full: Allow the boot CPU to be nohz_full)

I wish I had nacked it before it got merged, especially as the changelog
mentions that the user could have solved this by modifying their setup...

I would love to revert that now, but I don't know if anyone uses this and has
it working by chance somewhere... Should we continue to support a broken
feature? Can we break user ABI if it's already half-broken?

Anyway, during boot it's possible for
housekeeping_mask(HK_TYPE_TIMER) & cpu_online_mask to be empty.

After boot, though (provided any CPU from housekeeping_mask(HK_TYPE_TIMER)
has actually booted, which isn't even guaranteed if maxcpus= is passed...),
the first online housekeeping CPU can't go down, as you spotted.

Thanks.
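
For illustration only, the two situations above (an empty intersection during
boot, and the timekeeping CPU refusing to go down afterwards) can be modelled
in a small user-space sketch. This is not kernel code: the toy_* names, the
32-bit mask type and the 8-CPU layout are all made up for the example, and the
offline check only mirrors the "tick_nohz_cpu_down() will nack it" behaviour
described in the quoted analysis.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 8	/* hypothetical small system */

/* One bit per CPU; a toy stand-in for the kernel's struct cpumask. */
typedef uint32_t toy_cpumask_t;

static toy_cpumask_t toy_mask_of(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	return (toy_cpumask_t)1u << cpu;
}

/*
 * Return the first CPU that is both housekeeping and online, or -1 if
 * the intersection is empty (the boot-time window discussed above).
 */
static int toy_housekeeping_any_online(toy_cpumask_t hk, toy_cpumask_t online)
{
	toy_cpumask_t both = hk & online;

	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
		if (both & toy_mask_of(cpu))
			return (int)cpu;
	return -1;
}

/*
 * Toy version of the offline veto: the CPU that currently owns the
 * timekeeping duty (tick_do_timer_cpu in the real code) may not go down.
 */
static bool toy_cpu_down_allowed(int cpu, int tick_do_timer_cpu)
{
	return cpu != tick_do_timer_cpu;
}
```

With nohz_full=0-3 on this toy system the housekeeping mask is 0xF0. While
only the boot CPU is online (online mask 0x01) the lookup returns -1. Once
CPU 4 has booted it returns 4, and if CPU 4 then owns the timekeeping duty,
toy_cpu_down_allowed(4, 4) is false. If the housekeeping CPUs never come up
(online mask 0x03, as could happen with maxcpus=2), the lookup stays at -1.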