Date: Mon, 26 Sep 2022 21:02:22 -0700 (PDT)
From: Hugh Dickins
To: Hillf Danton
Cc: Hugh Dickins, Keith Busch, Jan Kara, Jens Axboe, Yu Kuai, Liu Song,
    linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH next] sbitmap: fix lockup while swapping
In-Reply-To: <20220924023047.1410-1-hdanton@sina.com>
Message-ID: <5880722-767c-16db-fc3-df50a12754b9@google.com>
References: <20220921164012.s7lvklp2qk6occcg@quack3> <20220923144303.fywkmgnkg6eken4x@quack3> <391b1763-7146-857-e3b6-dc2a8e797162@google.com> <20220924023047.1410-1-hdanton@sina.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII

On Sat, 24 Sep 2022, Hillf Danton wrote:
>
> I think the lockup can be avoided by
> a) either advancing wake_index as early as I can [1],
> b) or doing wakeup in case of zero wait_cnt to kill all cases of
>    waitqueue_active().
>
> Only for thoughts now.

Thanks Hillf: I gave your __sbq_wake_up() patch below several tries,
and as far as I could tell, it works just as well as my one-liner.

But I don't think it's what we would want to do: doesn't it increment
wake_index on every call to __sbq_wake_up()? whereas I thought it was
intended to be incremented only after wake_batch calls (thinking in
terms of nr 1).

I'll not be surprised if your advance-wake_index-earlier idea ends up
as a part of the solution: but mainly I agree with Jan that the whole
code needs a serious redesign (or perhaps the whole design needs a
serious recode). So I didn't give your version more thought.

Hugh

>
> Hillf
>
> [1] https://lore.kernel.org/lkml/afe5b403-4e37-80fd-643d-79e0876a7047@linux.alibaba.com/
>
> +++ b/lib/sbitmap.c
> @@ -613,6 +613,16 @@ static bool __sbq_wake_up(struct sbitmap
>  	if (!ws)
>  		return false;
>
> +	do {
> +		/* open code sbq_index_atomic_inc(&sbq->wake_index) to avoid race */
> +		int old = atomic_read(&sbq->wake_index);
> +		int new = sbq_index_inc(old);
> +
> +		/* try another ws if someone else takes care of this one */
> +		if (old != atomic_cmpxchg(&sbq->wake_index, old, new))
> +			return true;
> +	} while (0);
> +
>  	cur = atomic_read(&ws->wait_cnt);
>  	do {
>  		/*
> @@ -620,7 +630,7 @@ static bool __sbq_wake_up(struct sbitmap
>  		 * function again to wakeup a new batch on a different 'ws'.
>  		 */
>  		if (cur == 0)
> -			return true;
> +			goto out;
>  		sub = min(*nr, cur);
>  		wait_cnt = cur - sub;
>  	} while (!atomic_try_cmpxchg(&ws->wait_cnt, &cur, wait_cnt));
>
> @@ -634,6 +644,7 @@ static bool __sbq_wake_up(struct sbitmap
>
>  	*nr -= sub;
>
> +out:
>  	/*
>  	 * When wait_cnt == 0, we have to be particularly careful as we are
>  	 * responsible to reset wait_cnt regardless whether we've actually
> @@ -661,12 +672,6 @@ static bool __sbq_wake_up(struct sbitmap
>  	 */
>  	smp_mb__before_atomic();
>
> -	/*
> -	 * Increase wake_index before updating wait_cnt, otherwise concurrent
> -	 * callers can see valid wait_cnt in old waitqueue, which can cause
> -	 * invalid wakeup on the old waitqueue.
> -	 */
> -	sbq_index_atomic_inc(&sbq->wake_index);
>  	atomic_set(&ws->wait_cnt, wake_batch);
>
>  	return ret || *nr;