From: Yu Kuai
To: bvanassche@acm.org, hch@lst.de, kbusch@kernel.org, ming.lei@redhat.com,
    axboe@kernel.dk
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
    yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com,
    yangerkun@huawei.com
Subject: [PATCH RFC v2 0/8] blk-mq: improve tag fair sharing
Date: Sat, 21 Oct 2023 23:47:58 +0800
Message-Id: <20231021154806.4019417-1-yukuai1@huaweicloud.com>

Current implementation:
 - a counter active_queues records how
   many queues/hctxs are sharing tags; it is updated when new IO is
   issued and cleared in blk_mq_timeout_work().
 - if active_queues is more than 1, tags are shared fairly among the
   nodes;

New implementation:
 - a new field 'available_tags' is added to each node; it is calculated
   in the slow path, so the fast path is not affected, patch 5;
 - a new counter 'busy_queues' is added to blk_mq_tags; it is updated
   when getting a driver tag fails and is also cleared in
   blk_mq_timeout_work(). Tag sharing is then based on 'busy_queues'
   instead of 'active_queues', patch 6,7;
 - a new counter 'busy_count' is added to each node to record how many
   times the node failed to get a driver tag; it is used to judge
   whether a node is busy and needs more tags, patch 8;
 - a new timer is added to blk_mq_tags; it starts when any node fails
   to get a driver tag, and the timer function borrows tags and returns
   borrowed tags, patch 8;

A simple test, 32 tags with two shared nodes:

[global]
ioengine=libaio
iodepth=2
bs=4k
direct=1
rw=randrw
group_reporting

[sda]
numjobs=32
filename=/dev/sda

[sdb]
numjobs=1
filename=/dev/sdb

Test result (monitor new debugfs entry):

time    active          available
        sda     sdb     sda     sdb
0       0       0       32      32
1       16      2       16      16      -> start fair sharing
2       19      2       20      16
3       24      2       24      16
4       26      2       28      16      -> borrow 32/8=4 tags each round
5       28      2       28      16      -> save at least 4 tags for sdb

Yu Kuai (8):
  blk-mq: factor out a structure from blk_mq_tags
  blk-mq: factor out a structure to store information for tag sharing
  blk-mq: add a helper to initialize shared_tag_info
  blk-mq: support to track active queues from blk_mq_tags
  blk-mq: precalculate available tags for hctx_may_queue()
  blk-mq: add new helpers blk_mq_driver_tag_busy/idle()
  blk-mq-tag: delay tag sharing until fail to get driver tag
  blk-mq-tag: allow shared queue/hctx to get more driver tags

 block/blk-core.c       |   2 -
 block/blk-mq-debugfs.c |  30 +++++-
 block/blk-mq-tag.c     | 226 +++++++++++++++++++++++++++++++++++++++--
 block/blk-mq.c         |  12 ++-
 block/blk-mq.h         |  64 +++++++-----
 include/linux/blk-mq.h |  36 +++++--
 include/linux/blkdev.h |  11 +-
 7 files changed, 328 insertions(+), 53 deletions(-)

-- 
2.39.2
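
For readers who want to follow the 'available' column for sda in the test result, here is a minimal user-space model in Python. It is only a sketch, not the code in the patches: fair_share(), borrow_round() and simulate() are hypothetical names, and the borrowing rule (grow the cap by total/8 per timer round, keep 4 tags back for each other node) is inferred from the annotations in the result table rather than taken from patch 8. The floor of 4 tags in fair_share() is likewise an assumption of this model.

```python
def fair_share(total_tags: int, sharers: int, min_tags: int = 4) -> int:
    """Per-node cap under plain fair sharing (assumed model).

    ceil(total_tags / sharers), but never below min_tags. With a single
    sharer the node may use every tag.
    """
    if sharers <= 1:
        return total_tags
    return max((total_tags + sharers - 1) // sharers, min_tags)


def borrow_round(available: int, total_tags: int, other_nodes: int,
                 step_divisor: int = 8, reserve: int = 4) -> int:
    """One timer round for a busy node (rule inferred from the table).

    The cap grows by total_tags / step_divisor, but `reserve` tags stay
    set aside for every other node sharing the tag set.
    """
    step = total_tags // step_divisor
    ceiling = total_tags - reserve * other_nodes
    return min(available + step, ceiling)


def simulate(total_tags: int = 32, rounds: int = 4) -> list[int]:
    """Cap of the busy node over time, starting when fair sharing begins."""
    caps = [fair_share(total_tags, 2)]
    for _ in range(rounds):
        caps.append(borrow_round(caps[-1], total_tags, other_nodes=1))
    return caps


if __name__ == "__main__":
    print(simulate())
```

With the defaults (32 tags, two nodes) the model starts at 16 and grows by 4 per round until it stops at 28, which lines up with rows 1 to 5 of the sda 'available' column. It does not model returning borrowed tags or the 'busy_count' heuristic.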