Subject: Re: [PATCH v2] sbitmap: fix io hung due to race on sbitmap_word::cleared
To: Ming Lei, Yang Yang
Cc: Jens Axboe, Andrew Morton, Pavel Begunkov, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, "yukuai (C)"
References: <20240604031124.2261-1-yang.yang@vivo.com>
From: Yu Kuai
Date: Tue, 4 Jun 2024 14:12:22 +0800

Hi,

On 2024/06/04 11:25, Ming Lei wrote:
> On Tue, Jun 4, 2024 at 11:12 AM Yang Yang wrote:
>>
>> Configuration for sbq:
>>   depth=64, wake_batch=6, shift=6, map_nr=1
>>
>> 1. There are 64 requests in progress:
>>   map->word = 0xFFFFFFFFFFFFFFFF
>> 2. After all the 64 requests complete, and no more requests come:
>>   map->word = 0xFFFFFFFFFFFFFFFF, map->cleared = 0xFFFFFFFFFFFFFFFF
>> 3. Now two tasks try to allocate requests:
>>   T1:                                       T2:
>>   __blk_mq_get_tag                          .
>>   __sbitmap_queue_get                       .
>>   sbitmap_get                               .
>>   sbitmap_find_bit                          .
>>   sbitmap_find_bit_in_word                  .
>>   __sbitmap_get_word  -> nr=-1              __blk_mq_get_tag
>>   sbitmap_deferred_clear                    __sbitmap_queue_get
>>   /* map->cleared=0xFFFFFFFFFFFFFFFF */     sbitmap_find_bit
>>   if (!READ_ONCE(map->cleared))             sbitmap_find_bit_in_word
>>     return false;                           __sbitmap_get_word -> nr=-1
>>   mask = xchg(&map->cleared, 0)             sbitmap_deferred_clear
>>   atomic_long_andnot()                      /* map->cleared=0 */
>>                                             if (!(map->cleared))
>>                                               return false;
>>                                             /*
>>                                              * map->cleared is cleared by T1
>>                                              * T2 fail to acquire the tag
>>                                              */
>>
>> 4. T2 is the sole tag waiter. When T1 puts the tag, T2 cannot be woken
>> up due to the wake_batch being set at 6. If no more requests come, T1
>> will wait here indefinitely.
>>
>> To fix this issue, simply revert commit 661d4f55a794 ("sbitmap:
>> remove swap_lock"), which causes this issue.
>
> I'd suggest to add the following words in commit log:
>
> Check on ->cleared and update on both ->cleared and ->word need to be
> done atomically, and using spinlock could be the simplest solution.
>
> Otherwise, the patch looks fine for me.

Maybe I'm a noob, but I'm confused about how this can fix the problem;
it looks like the race condition doesn't change.

In sbitmap_find_bit_in_word:

1) __sbitmap_get_word reads word;
2) sbitmap_deferred_clear clears cleared;
3) sbitmap_deferred_clear updates word;

2) and 3) are done atomically, while 1) can still run concurrently
with 3):

t1:
sbitmap_find_bit_in_word
 __sbitmap_get_word
 -> read old word, return -1
                t2:
                sbitmap_find_bit_in_word
                 __sbitmap_get_word
                 -> read old word, return -1
 sbitmap_deferred_clear
 -> clear cleared and update word
                 sbitmap_deferred_clear
                 -> cleared is cleared, fail

BTW, I still think it's fine to fix this problem by trying
__sbitmap_get_word() at least one more time if __sbitmap_get_word()
failed.

Thanks,
Kuai

>
> Thanks,
>
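
The t1/t2 interleaving above, and the "try one more time" idea, can be
replayed with a small userspace model. This is only a sketch under
assumptions: struct toy_word, toy_get_word(), toy_deferred_clear() and
replay_t2() are hypothetical names, not kernel code, and the steps run
sequentially in the order of the diagram instead of on real concurrent
tasks, so no atomics or locking are modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct sbitmap_word. */
struct toy_word {
	uint64_t word;    /* bit set: tag is allocated */
	uint64_t cleared; /* bit set: tag was freed but not yet folded into word */
};

/* Models __sbitmap_get_word(): claim the first zero bit, or return -1. */
static int toy_get_word(struct toy_word *map)
{
	for (int nr = 0; nr < 64; nr++) {
		uint64_t bit = UINT64_C(1) << nr;

		if (!(map->word & bit)) {
			map->word |= bit;
			return nr;
		}
	}
	return -1;
}

/* Models sbitmap_deferred_clear(): fold cleared bits back into word. */
static bool toy_deferred_clear(struct toy_word *map)
{
	uint64_t mask = map->cleared;

	if (!mask)
		return false; /* nothing to fold, someone else may have done it */
	map->cleared = 0;
	map->word &= ~mask;
	return true;
}

/*
 * Replays the interleaving from the diagram and returns the tag t2 ends
 * up with. When retry is true, t2 calls toy_get_word() once more even
 * though its own deferred clear found nothing to do.
 */
int replay_t2(bool retry)
{
	/* State after all 64 requests completed (step 2 of the commit log). */
	struct toy_word map = { .word = UINT64_MAX, .cleared = UINT64_MAX };

	int t1 = toy_get_word(&map); /* t1: reads old word, gets -1 */
	int t2 = toy_get_word(&map); /* t2: reads old word, gets -1 */

	/* t1: clears cleared, updates word, then allocates successfully. */
	if (t1 < 0 && toy_deferred_clear(&map))
		t1 = toy_get_word(&map);

	/* t2: cleared is already 0, so the deferred clear reports failure. */
	bool t2_cleared = toy_deferred_clear(&map);

	if (t2 < 0 && (t2_cleared || retry))
		t2 = toy_get_word(&map);

	return t2;
}
```

In this model replay_t2(false) returns -1 although 63 bits are free,
which is the failure in the diagram, and replay_t2(true) returns 1. It
does not cover every ordering the real code can hit.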