Subject: Re: [PATCH] fscache: Use wake_up_var() to wake up pending volume acquisition
To: linux-cachefs@redhat.com, David Howells
Cc: Jeff Layton, linux-erofs@lists.ozlabs.org, linux-kernel@vger.kernel.org, "houtao1@huawei.com"
References: <20221128031929.3918348-1-houtao@huaweicloud.com>
In-Reply-To: <20221128031929.3918348-1-houtao@huaweicloud.com>
From: Hou Tao
Message-ID: <42b33792-50e9-77d7-4d3e-ac5ce1adeda6@huaweicloud.com>
Date: Fri, 9 Dec 2022 19:17:52 +0800

Hi David,

Could you please pick it up for v6.2?

On 11/28/2022 11:19 AM, Hou Tao wrote:
> From: Hou Tao
>
> The freeing of relinquished volume will wake up the pending volume
> acquisition by using wake_up_bit(), however it is mismatched with
> wait_var_event() used in fscache_wait_on_volume_collision() and it will
> never wake up the waiter in the wait-queue because these two functions
> operate on different wait-queues.
>
> According to the implementation in fscache_wait_on_volume_collision(),
> if the wake-up of pending acquisition is delayed longer than 20 seconds
> (e.g., due to the delay of on-demand fd closing), the first
> wait_var_event_timeout() will timeout and the following wait_var_event()
> will hang forever as shown below:
>
>   FS-Cache: Potential volume collision new=00000024 old=00000022
>   ......
>   INFO: task mount:1148 blocked for more than 122 seconds.
>         Not tainted 6.1.0-rc6+ #1
>   task:mount state:D stack:0 pid:1148 ppid:1
>   Call Trace:
>    __schedule+0x2f6/0xb80
>    schedule+0x67/0xe0
>    fscache_wait_on_volume_collision.cold+0x80/0x82
>    __fscache_acquire_volume+0x40d/0x4e0
>    erofs_fscache_register_volume+0x51/0xe0 [erofs]
>    erofs_fscache_register_fs+0x19c/0x240 [erofs]
>    erofs_fc_fill_super+0x746/0xaf0 [erofs]
>    vfs_get_super+0x7d/0x100
>    get_tree_nodev+0x16/0x20
>    erofs_fc_get_tree+0x20/0x30 [erofs]
>    vfs_get_tree+0x24/0xb0
>    path_mount+0x2fa/0xa90
>    do_mount+0x7c/0xa0
>    __x64_sys_mount+0x8b/0xe0
>    do_syscall_64+0x30/0x60
>    entry_SYSCALL_64_after_hwframe+0x46/0xb0
>
> Fixing it by using wake_up_var() instead of wake_up_bit(). In addition
> because waitqueue_active() is used in wake_up_var() and clear_bit()
> doesn't imply any memory barrier, so do smp_mb__after_atomic() before
> invoking wake_up_var().
>
> Fixes: 62ab63352350 ("fscache: Implement volume registration")
> Signed-off-by: Hou Tao
> ---
>  fs/fscache/volume.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/fs/fscache/volume.c b/fs/fscache/volume.c
> index ab8ceddf9efa..cf8293bb1aca 100644
> --- a/fs/fscache/volume.c
> +++ b/fs/fscache/volume.c
> @@ -348,7 +348,12 @@ static void fscache_wake_pending_volume(struct fscache_volume *volume,
>  		if (fscache_volume_same(cursor, volume)) {
>  			fscache_see_volume(cursor, fscache_volume_see_hash_wake);
>  			clear_bit(FSCACHE_VOLUME_ACQUIRE_PENDING, &cursor->flags);
> -			wake_up_bit(&cursor->flags, FSCACHE_VOLUME_ACQUIRE_PENDING);
> +			/*
> +			 * Paired with barrier in wait_var_event(). Check
> +			 * waitqueue_active() and wake_up_var() for details.
> +			 */
> +			smp_mb__after_atomic();
> +			wake_up_var(&cursor->flags);
>  			return;
>  		}
>  	}
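
For readers who want to see why the old wake-up never reached the waiter, here is a rough userspace sketch of the mismatch described in the quoted changelog. It is a toy model, not kernel code: the toy_* names, the single waiter slot, and the bit number used in the usage note are all made up for illustration. It assumes, as I read the kernel's wait-bit code, that a "var" waiter is identified by its address with no bit number (modelled here as -1), while a "bit" wake-up carries the address plus a real bit number, so the two never match. The real implementation hashes waiters into a shared table of wait queues and differs in detail.

```c
#include <stddef.h>

/* Toy model only: one waiter slot instead of a hashed wait-queue table. */
struct toy_key {
	const void *addr;	/* address being waited on */
	int bit_nr;		/* bit number, or -1 for a "var" wait (assumption) */
};

struct toy_waiter {
	struct toy_key key;
	int woken;
};

/* Models wait_var_event(): the waiter registers with key (var, -1). */
static void toy_wait_var(struct toy_waiter *w, const void *var)
{
	w->key.addr = var;
	w->key.bit_nr = -1;
	w->woken = 0;
}

/* Wake the waiter only if both parts of the key match. Returns 1 if woken. */
static int toy_wake(struct toy_waiter *w, const void *addr, int bit_nr)
{
	if (w->key.addr != addr || w->key.bit_nr != bit_nr)
		return 0;
	w->woken = 1;
	return 1;
}

/* Models wake_up_bit(word, bit): the wake-up key carries a real bit number. */
static int toy_wake_up_bit(struct toy_waiter *w, const void *word, int bit)
{
	return toy_wake(w, word, bit);
}

/* Models wake_up_var(var): the wake-up key uses -1, like the var waiter. */
static int toy_wake_up_var(struct toy_waiter *w, const void *var)
{
	return toy_wake(w, var, -1);
}
```

With a waiter registered through toy_wait_var(&w, &flags), a call to toy_wake_up_bit(&w, &flags, 3) returns 0 and leaves the waiter asleep, while toy_wake_up_var(&w, &flags) returns 1 and wakes it. The memory-barrier half of the patch is not modelled here.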