Date: Wed, 7 Sep 2022 18:41:50 +0200
From: Jan Kara
To: Keith Busch
Cc: Jan Kara, Yu Kuai, axboe@kernel.dk, osandov@fb.com,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	yukuai3@huawei.com, yi.zhang@huawei.com
Subject: Re: [PATCH] sbitmap: fix possible io hung due to lost wakeup
Message-ID: <20220907164150.tykjl3jsctjddcnq@quack3>
References: <20220803121504.212071-1-yukuai1@huaweicloud.com>
	<20220907102318.pdpzpmhah2m3ptbn@quack3>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed 07-09-22 08:13:40, Keith Busch wrote:
> On Wed, Sep 07, 2022 at 12:23:18PM +0200, Jan Kara wrote:
> > On Tue 06-09-22 15:27:51, Keith Busch wrote:
> > > On Wed, Aug 03, 2022 at 08:15:04PM +0800, Yu Kuai wrote:
> > > >  	wait_cnt = atomic_dec_return(&ws->wait_cnt);
> > > > -	if (wait_cnt <= 0) {
> > > > -		int ret;
> > > > +	/*
> > > > +	 * For concurrent callers of this, callers should call this function
> > > > +	 * again to wakeup a new batch on a different 'ws'.
> > > > +	 */
> > > > +	if (wait_cnt < 0 || !waitqueue_active(&ws->wait))
> > > > +		return true;
> > >
> > > If wait_cnt is '0', but the waitqueue_active happens to be false due to racing
> > > with add_wait_queue(), this returns true so the caller will retry.
> >
> > Well, note that sbq_wake_ptr() called to obtain 'ws' did waitqueue_active()
> > check. So !waitqueue_active() should really happen only if waiter was woken
> > up by someone else or so. Not that it would matter much but I wanted to
> > point it out.
> >
> > > The next atomic_dec will set the current waitstate wait_cnt < 0, which
> > > also forces an early return true. When does the wake up happen, or
> > > wait_cnt and wait_index get updated in that case?
> >
> > I guess your concern could be rephrased as: Who's going to ever set
> > ws->wait_cnt to value > 0 if we ever exit with wait_cnt == 0 due to
> > !waitqueue_active() condition?
> >
> > And that is a good question and I think that's a bug in this patch. I think
> > we need something like:
> >
> > 	...
> > 	/*
> > 	 * For concurrent callers of this, callers should call this function
> > 	 * again to wakeup a new batch on a different 'ws'.
> > 	 */
> > 	if (wait_cnt < 0)
> > 		return true;
> > 	/*
> > 	 * If we decremented queue without waiters, retry to avoid lost
> > 	 * wakeups.
> > 	 */
> > 	if (wait_cnt > 0)
> > 		return !waitqueue_active(&ws->wait);
>
> I'm not sure about this part. We've already decremented, so the freed bit is
> accounted for against the batch. Returning true here may double-count the freed
> bit, right?

Yes, we may wake up waiters unnecessarily frequently. But that's a performance
issue at worst and only if it happens frequently. So I don't think it matters
in practice (famous last words ;).

								Honza
-- 
Jan Kara
SUSE Labs, CR
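
For anyone following the thread, the return logic in the quoted "we need
something like" snippet can be modelled in userspace roughly as below. This is
only a sketch under stated assumptions, not kernel code: struct model_ws,
model_ws_init(), model_wake_step() and the MODEL_* names are made up for
illustration, has_waiters stands in for waitqueue_active(&ws->wait), and the
wait_cnt == 0 case (elided as "..." in the thread) is assumed to be the path
that wakes a batch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the wait state; not the kernel struct. */
struct model_ws {
	atomic_int wait_cnt;	/* frees left before the next batch wakeup */
	bool has_waiters;	/* stands in for waitqueue_active(&ws->wait) */
};

enum model_result {
	MODEL_DONE,		/* "return false": nothing more for the caller to do */
	MODEL_RETRY,		/* "return true": caller should call again */
	MODEL_WAKE_BATCH,	/* assumed: the elided path that wakes a batch */
};

static void model_ws_init(struct model_ws *ws, int wake_batch, bool has_waiters)
{
	assert(wake_batch > 0);
	atomic_init(&ws->wait_cnt, wake_batch);
	ws->has_waiters = has_waiters;
}

static enum model_result model_wake_step(struct model_ws *ws)
{
	/* Value after the decrement, as atomic_dec_return() would give. */
	int wait_cnt = atomic_fetch_sub(&ws->wait_cnt, 1) - 1;

	/* A concurrent caller already took this batch: retry. */
	if (wait_cnt < 0)
		return MODEL_RETRY;

	/* Decremented a queue that has no waiters: retry to avoid a lost wakeup. */
	if (wait_cnt > 0)
		return ws->has_waiters ? MODEL_DONE : MODEL_RETRY;

	return MODEL_WAKE_BATCH;
}
```

In this model, a step on a queue initialised with a batch of 3 and no waiters
takes the count from 3 to 2 and still asks the caller to retry. The retry will
decrement some counter again for the same freed bit, which is the double-count
raised above and the source of the extra wakeups.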