Subject: Re: [PATCH -next 3/4] block/rq_qos: use a global mutex to protect rq_qos apis
To: Tejun Heo , Yu Kuai
Cc: hch@infradead.org, josef@toxicpanda.com, axboe@kernel.dk,
 cgroups@vger.kernel.org, linux-block@vger.kernel.org,
 linux-kernel@vger.kernel.org, yi.zhang@huawei.com, yangerkun@huawei.com,
 "yukuai (C)"
References: <20230104085354.2343590-1-yukuai1@huaweicloud.com>
 <20230104085354.2343590-4-yukuai1@huaweicloud.com>
From: Yu Kuai
Message-ID: <31e57528-39a5-84ed-8ea0-5c61bab00541@huaweicloud.com>
Date: Mon, 9 Jan 2023 09:38:07 +0800
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On 2023/01/07 2:23, Tejun Heo wrote:
> Hello,
>
> On Fri, Jan 06, 2023 at 09:33:26AM +0800, Yu Kuai wrote:
>>> wbt's lazy init is tied to one of the block device sysfs files, right? So,
>>> it *should* already be protected against device removal.
>>
>> That seems not true, I don't think q->sysfs_lock can protect that,
>> consider that queue_wb_lat_store() doesn't check if del_gendisk() is
>> called or not:
>>
>> t1: wbt lazy init                t2: remove device
>> queue_attr_store
>>                                  del_gendisk
>>                                  blk_unregister_queue
>>                                   mutex_lock(&q->sysfs_lock)
>>                                   ...
>>                                   mutex_unlock(&q->sysfs_lock);
>>                                  rq_qos_exit
>>  mutex_lock(&q->sysfs_lock);
>>   queue_wb_lat_store
>>    wbt_init
>>     rq_qos_add
>>  mutex_unlock(&q->sysfs_lock);
>
> So, it's not sysfs_lock but sysfs file deletion. When a kernfs, which backs
> sysfs, file is removed, it disables future operations and drains all
> inflight ones before returning, so if you remove the interface files before
> cleaning up the object that it interacts with, you don't have to worry about
> racing against file operations as none can be in flight at that point.

OK, thanks for the explanation. I'll look into this and try to find out
how it works.

>
>> I tried to confirm that by adding following delay:
>>
>> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
>> index 93d9e9c9a6ea..101c33cb0a2b 100644
>> --- a/block/blk-sysfs.c
>> +++ b/block/blk-sysfs.c
>> @@ -11,6 +11,7 @@
>>  #include
>>  #include
>>  #include
>> +#include
>>
>>  #include "blk.h"
>>  #include "blk-mq.h"
>> @@ -734,6 +735,8 @@ queue_attr_store(struct kobject *kobj, struct attribute
>> *attr,
>>         if (!entry->store)
>>                 return -EIO;
>>
>> +       msleep(10000);
>> +
>>         mutex_lock(&q->sysfs_lock);
>>         res = entry->store(q, page, length);
>>         mutex_unlock(&q->sysfs_lock);
>>
>> And then do the following test:
>>
>> 1) echo 10000 > /sys/block/sdb/queue/wbt_lat_usec &
>> 2) echo 1 > /sys/block/sda/device/delete
>>
>> Then, following bug is triggered:
>>
>> [ 51.923642] BUG: unable to handle page fault for address:
>> ffffffffffffffed
>> [ 51.924294] #PF: supervisor read access in kernel mode
>> [ 51.924773] #PF: error_code(0x0000) - not-present page
>> [ 51.925252] PGD 1820b067 P4D 1820b067 PUD 1820d067 PMD 0
>> [ 51.925754] Oops: 0000 [#1] PREEMPT SMP
>> [ 51.926123] CPU: 1 PID: 539 Comm: bash Tainted: G W
>> 6.2.0-rc1-next-202212267
>> [ 51.927124] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>> ?-20190727_073836-b4
>> [ 51.928334] RIP: 0010:__rq_qos_issue+0x30/0x60
>
> This indicates that we aren't getting the destruction order right. It could
> be that there are other reasons why the ordering is like this and we might
> have to synchronize separately.
>
> Sorry that I've been asking you to go round and round but block device
> add/remove paths have always been really tricky and we wanna avoid adding
> more complications if at all possible. Can you see why the device is being
> destroyed before the queue attr is removed?

Of course, I'm glad to help. I'll let you know if I make any progress.

Thanks,
Kuai