Received: by 2002:a05:6358:1087:b0:cb:c9d3:cd90 with SMTP id j7csp1350832rwi; Thu, 13 Oct 2022 12:24:36 -0700 (PDT) X-Google-Smtp-Source: AMsMyM66PgBLl+GXzSrMbO5aEpu8riRpgO3UO/7iY3whUYr1Gi/8sne2g56GMAUam+IBQXCreajM X-Received: by 2002:a05:6402:320e:b0:45c:ae50:dbca with SMTP id g14-20020a056402320e00b0045cae50dbcamr1165146eda.104.1665689076099; Thu, 13 Oct 2022 12:24:36 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1665689076; cv=none; d=google.com; s=arc-20160816; b=kr48EnypzZqfzIMvf3h0Atghk0U1MJMKjbswrs7/hJls8B54Ns4p2tVbxAuJr0Zp1y Msf1LGVFHHJF73Zti0Uq7pvUIQnHNa0tKmbIO+d06JaenDlrN0ORADdaF/F3oL2HUIKz QhYoiIOsIohGM4BOCDd5vP4AiyvN/oc6SmgMIV6Mkt+HDsDwOa/+7hxKAM+drn5B08/e NkQrZgS35eUSalQL8lV5xoea9vtVg2Umh7VgaB71psmpghNLqvVpo65eaQZpmRuAQHt9 ZaE5HsvBNe4t8iAlpTbnw6X7/qFA5QBdM4W6xPag2ByBjyHOyGZUQOS5jZz0QvoR4+/U ATtA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:references:message-id:in-reply-to :subject:cc:to:from:date:dkim-signature; bh=BumlHx0TbfQLcFeagrlufzZWok4cptrV+jGQa6OSx7M=; b=OMJ5y/lfUbZQfKLDfKPWXci7yA+4mT3HNHmxIgvIGq/zBjmw4FMI0DClKydZqac7D8 dVK3H8KMH4NlLEOlQgwQv72j5tM4b4074hXS6AsgIgwbmbK3bzRVSBWOTvjV127uhZbK hRbZZlmbb8Tc1iFmksWwV2676H080JyIBhEQKiDACkkU+Qy3M2hssi6a9C0MxvlruzaG uVcdz6SH3omOYSxbsyAYPy2pRPirPYEwmfG97Z+3Zw4iaLLOgqDb0AfVl7dit+vNkHas tVxCli5hLya9ZJyUd+zEoSN1OHwPQ15FCMjJQfnxh3S73VzIKzPW2fGwyUrqEvgfw9i8 bK1A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=jBSO50SW; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id hc18-20020a170907169200b0078de57da349si522950ejc.463.2022.10.13.12.24.08; Thu, 13 Oct 2022 12:24:36 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=jBSO50SW; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231899AbiJMSoP (ORCPT + 99 others); Thu, 13 Oct 2022 14:44:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58620 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232101AbiJMSnX (ORCPT ); Thu, 13 Oct 2022 14:43:23 -0400 Received: from mail-qt1-x835.google.com (mail-qt1-x835.google.com [IPv6:2607:f8b0:4864:20::835]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 157EB9F749 for ; Thu, 13 Oct 2022 11:41:28 -0700 (PDT) Received: by mail-qt1-x835.google.com with SMTP id g11so2297148qts.1 for ; Thu, 13 Oct 2022 11:41:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=mime-version:references:message-id:in-reply-to:subject:cc:to:from :date:from:to:cc:subject:date:message-id:reply-to; bh=BumlHx0TbfQLcFeagrlufzZWok4cptrV+jGQa6OSx7M=; b=jBSO50SWb8vc1i7NmBGUyugVxbPggZjjIygLfoGLl9bOi49sv61bD7+5ZXWvgavehS tS2iwjZwEvrq4Cu+9JFTgPvWLc4KdS8m0x2udE+zdkYnv+sM2TC48MgW4DJ+vwUZkmfG 2kz+pK1Ctcla80XguQLCzbdLRIbbdm3wWrQ+wvqDvqAePIUZt6wzJP7Eprb9GlDs/8uc tPfg/zbd3J1JAzIFEcZksI+tVdWLgbBUCvjCa8WpMvztsVbrCABfY5R0z+R0D7YXu9Y+ geVl4DlFebZKd/A+isMioID9EE2B3i9o037iSdMRuAcu3trjI7S/rpMPmXlIfuIY/h7w P2mw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=mime-version:references:message-id:in-reply-to:subject:cc:to:from :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=BumlHx0TbfQLcFeagrlufzZWok4cptrV+jGQa6OSx7M=; b=tCxJzWSY333A/GkNVz0HB9XmRGBUUopvQwXIxRegZ4d+PyTvfryV6dVsW9uY2HkPcd 7lx/J0yYwFVfRX4UYzdT7LCzCpuFI3UiMmsQo4Jh1MlrAHuP+kOwK17u6kdYbFQN6aOA Bv7SMrMJvRNJKEKMhsQMyLPQ+owVydJIPRaDx4prqF6HQYuR0D5wCaEWCC5ckn994RiK 8G5d61V/0S9RWVfS1kcHlkuWTyDMqHTOqpPCP/a69xo7SZGE4ntKaClfHSl8/zXLAoSy EcsPci9ik/jHRiDEvDLjzibQyptiYPFCdihch+RMsG20SM4m6MyOBvGlUUyhrwehD0yA Q3uw== X-Gm-Message-State: ACrzQf3LFMzNFp19bko1swC/4Qf66B4ApKZn4JwxtSophpzZuzVYXDPC +i4mAFhJ32LVWhQ4/MVFQahS9Q== X-Received: by 2002:ac8:5f8d:0:b0:39c:beab:fc0d with SMTP id j13-20020ac85f8d000000b0039cbeabfc0dmr1042180qta.683.1665686456983; Thu, 13 Oct 2022 11:40:56 -0700 (PDT) Received: from ripple.attlocal.net (172-10-233-147.lightspeed.sntcca.sbcglobal.net. [172.10.233.147]) by smtp.gmail.com with ESMTPSA id y10-20020a37f60a000000b006e2d087fd63sm336493qkj.63.2022.10.13.11.40.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 13 Oct 2022 11:40:56 -0700 (PDT) Date: Thu, 13 Oct 2022 11:40:40 -0700 (PDT) From: Hugh Dickins X-X-Sender: hugh@ripple.attlocal.net To: Sasha Levin cc: Hugh Dickins , linux-kernel@vger.kernel.org, stable@vger.kernel.org, Jan Kara , Keith Busch , Jens Axboe , linux-block@vger.kernel.org Subject: Re: [PATCH AUTOSEL 6.0 64/67] sbitmap: fix lockup while swapping In-Reply-To: Message-ID: References: <20221013001554.1892206-1-sashal@kernel.org> <20221013001554.1892206-64-sashal@kernel.org> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Spam-Status: No, score=-17.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, ENV_AND_HDR_SPF_MATCH,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS, USER_IN_DEF_DKIM_WL,USER_IN_DEF_SPF_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 13 Oct 2022, Sasha Levin wrote: > On Wed, Oct 12, 2022 at 06:08:50PM -0700, Hugh Dickins wrote: > >On Wed, 12 Oct 2022, Sasha Levin wrote: > > > >> From: Hugh Dickins > >> > >> [ Upstream commit 30514bd2dd4e86a3ecfd6a93a3eadf7b9ea164a0 ] > >> > >> Commit 4acb83417cad ("sbitmap: fix batched wait_cnt accounting") > >> is a big improvement: without it, I had to revert to before commit > >> 040b83fcecfb ("sbitmap: fix possible io hung due to lost wakeup") > >> to avoid the high system time and freezes which that had introduced. > >> > >> Now okay on the NVME laptop, but 4acb83417cad is a disaster for heavy > >> swapping (kernel builds in low memory) on another: soon locking up in > >> sbitmap_queue_wake_up() (into which __sbq_wake_up() is inlined), cycling > >> around with waitqueue_active() but wait_cnt 0 . Here is a backtrace, > >> showing the common pattern of outer sbitmap_queue_wake_up() interrupted > >> before setting wait_cnt 0 back to wake_batch (in some cases other CPUs > >> are idle, in other cases they're spinning for a lock in dd_bio_merge()): > >> > >> sbitmap_queue_wake_up < sbitmap_queue_clear < blk_mq_put_tag < > >> __blk_mq_free_request < blk_mq_free_request < __blk_mq_end_request < > >> scsi_end_request < scsi_io_completion < scsi_finish_command < > >> scsi_complete < blk_complete_reqs < blk_done_softirq < __do_softirq < > >> __irq_exit_rcu < irq_exit_rcu < common_interrupt < asm_common_interrupt < > >> _raw_spin_unlock_irqrestore < __wake_up_common_lock < __wake_up < > >> sbitmap_queue_wake_up < sbitmap_queue_clear < blk_mq_put_tag < > >> __blk_mq_free_request < blk_mq_free_request < dd_bio_merge < > >> blk_mq_sched_bio_merge < blk_mq_attempt_bio_merge < blk_mq_submit_bio < > >> __submit_bio < submit_bio_noacct_nocheck < submit_bio_noacct < > >> submit_bio < __swap_writepage < swap_writepage < pageout < > >> shrink_folio_list < evict_folios < lru_gen_shrink_lruvec < > >> shrink_lruvec < shrink_node < do_try_to_free_pages < try_to_free_pages < > >> __alloc_pages_slowpath < __alloc_pages < folio_alloc < vma_alloc_folio < > >> do_anonymous_page < __handle_mm_fault < handle_mm_fault < > >> do_user_addr_fault < exc_page_fault < asm_exc_page_fault > >> > >> See how the process-context sbitmap_queue_wake_up() has been interrupted, > >> after bringing wait_cnt down to 0 (and in this example, after doing its > >> wakeups), before advancing wake_index and refilling wake_cnt: an > >> interrupt-context sbitmap_queue_wake_up() of the same sbq gets stuck. > >> > >> I have almost no grasp of all the possible sbitmap races, and their > >> consequences: but __sbq_wake_up() can do nothing useful while wait_cnt 0, > >> so it is better if sbq_wake_ptr() skips on to the next ws in that case: > >> which fixes the lockup and shows no adverse consequence for me. > >> > >> The check for wait_cnt being 0 is obviously racy, and ultimately can lead > >> to lost wakeups: for example, when there is only a single waitqueue with > >> waiters. However, lost wakeups are unlikely to matter in these cases, > >> and a proper fix requires redesign (and benchmarking) of the batched > >> wakeup code: so let's plug the hole with this bandaid for now. > >> > >> Signed-off-by: Hugh Dickins > >> Reviewed-by: Jan Kara > >> Reviewed-by: Keith Busch > >> Link: > >> https://lore.kernel.org/r/9c2038a7-cdc5-5ee-854c-fbc6168bf16@google.com > >> Signed-off-by: Jens Axboe > >> Signed-off-by: Sasha Levin > > > >Whoa! NAK to this 6.0 backport, and to the 5.19, 5.15, 5.10, 5.4 > >AUTOSEL backports of the same commit. I never experienced such a > >lockup on those releases. Or have I missed announcements of stable > >backports of the whole series of 6.1-rc commits to which this one > >is a fix? (I hope not.) > > Happy to drop it. Thanks. > > >I'm happy for my NAK to be overruled by Jens or Jan or Keith, > >if they see virtue in this commit, beyond what I'm aware of: > >but as it stands, it looks like AUTOSEL out of control again - > >it found the word "fix", and found that the commit applies cleanly, > >so thinks it must be a good stable addition. Not necessarily so! > > I'm a bit confused: the subject of the patch is "fix lockup while > swapping" and the body describes a lockup and that this patch "fixes the > lockup and shows no adverse consequence". What am I missing? You are missing that it was a commit to (Jens's branch for) 6.1, and that the problematic commits called out in the comments above were to (Jens's branch for) 6.1. It had no Cc stable tag, and that was intentional, because it was a fix for 6.1 alone. Perhaps it would have helped AUTOSEL to exclude it, if it had a Fixes tag, pointing to one of those 6.1 commits: the initial version did, but in review we had agreed that it was unclear which commit was being fixed. (I've been choosing my words a little carefully above, because the sbitmap wakeup situation was not perfect before or after any of these 6.1 commits, and Jan hopes to improve it in future. For all I know, this little commit might be an improvement in 6.0, or a disaster in 6.0: it has neither been needed nor tested there.) Hugh