Subject: Re: [PATCH] block: kyber: make kyber more friendly with merging
To: Omar Sandoval
Cc: axboe@kernel.dk, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
References: <1527000509-2619-1-git-send-email-jianchao.w.wang@oracle.com> <20180522200214.GF9536@vader>
From: "jianchao.wang"
Date: Wed, 23 May 2018 09:47:05 +0800
In-Reply-To: <20180522200214.GF9536@vader>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Omar

Thanks for your kind response.

On 05/23/2018 04:02 AM, Omar Sandoval wrote:
> On Tue, May 22, 2018 at 10:48:29PM +0800, Jianchao Wang wrote:
>> Currently, kyber is very unfriendly with merging. kyber depends
>> on ctx rq_list to do merging, however, most of time, it will not
>> leave any requests in ctx rq_list.
>> This is because even if tokens
>> of one domain is used up, kyber will try to dispatch requests
>> from other domain and flush the rq_list there.
>
> That's a great catch, I totally missed this.
>
> This approach does end up duplicating a lot of code with the blk-mq core
> even after Jens' change, so I'm curious if you tried other approaches.
> One idea I had is to try the bio merge against the kqd->rqs lists. Since
> that's per-queue, the locking overhead might be too high. Alternatively,

Yes, I wrote a patch like that earlier, trying the bio merge against kqd->rqs directly.
That patch looks even simpler.
However, because khd->lock has to be taken on every bio merge attempt, there may be
high contention on khd->lock when the cpu-hctx mapping is not 1:1.

> you could keep the software queues as-is but add our own version of
> flush_busy_ctxs() that only removes requests of the domain that we want.
> If one domain gets backed up, that might get messy with long iterations,
> though.

Yes, I also considered this approach :)
But the long iterations over every ctx->rq_list look really inefficient.

>
> Regarding this approach, a couple of comments below.
...
>>  }
>> @@ -379,12 +414,33 @@ static void kyber_exit_sched(struct elevator_queue *e)
>>  static int kyber_init_hctx(struct blk_mq_hw_ctx *hctx, unsigned int hctx_idx)
>>  {
>>  	struct kyber_hctx_data *khd;
>> +	struct kyber_queue_data *kqd = hctx->queue->elevator->elevator_data;
>>  	int i;
>> +	int sd;
>>
>>  	khd = kmalloc_node(sizeof(*khd), GFP_KERNEL, hctx->numa_node);
>>  	if (!khd)
>>  		return -ENOMEM;
>>
>> +	khd->kcqs = kmalloc_array_node(nr_cpu_ids, sizeof(void *),
>> +			GFP_KERNEL, hctx->numa_node);
>> +	if (!khd->kcqs)
>> +		goto err_khd;
>
> Why the double indirection of a percpu allocation per hardware queue
> here? With, say, 56 cpus and that many hardware queues, that's 3136
> pointers, which seems like overkill. Can't you just use the percpu array
> in the kqd directly, or make it per-hardware queue instead?

Oops, I forgot to change nr_cpu_ids to hctx->nr_ctx.
The mapping between cpus and hctxs has already been set up when kyber_init_hctx is invoked,
so we just need to allocate hctx->nr_ctx * struct kyber_ctx_queue per khd.

...
>> +static int bio_sched_domain(const struct bio *bio)
>> +{
>> +	unsigned int op = bio->bi_opf;
>> +
>> +	if ((op & REQ_OP_MASK) == REQ_OP_READ)
>> +		return KYBER_READ;
>> +	else if ((op & REQ_OP_MASK) == REQ_OP_WRITE && op_is_sync(op))
>> +		return KYBER_SYNC_WRITE;
>> +	else
>> +		return KYBER_OTHER;
>> +}
>
> Please add a common helper for rq_sched_domain() and bio_sched_domain()
> instead of duplicating the logic.
>

Yes, I will do it in the next version.

Thanks
Jianchao
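
For illustration only, the common helper requested above could look roughly
like the untested sketch below. The helper name kyber_sched_domain() is
hypothetical and not taken from the patch. The REQ_* values, the two struct
definitions and the simplified op_is_sync() are userspace stand-ins so the
sketch compiles outside the kernel tree; they are assumptions, not the
kernel's definitions.

```c
#include <stdbool.h>

/*
 * Userspace stand-ins so the sketch builds outside the kernel tree.
 * The numeric values are placeholders chosen for this example only.
 */
#define REQ_OP_MASK	0xffu
#define REQ_OP_READ	0u
#define REQ_OP_WRITE	1u
#define REQ_SYNC	(1u << 11)

enum {
	KYBER_READ,
	KYBER_SYNC_WRITE,
	KYBER_OTHER,
};

/* Minimal stand-ins; only the op/flags word matters for this sketch. */
struct bio { unsigned int bi_opf; };
struct request { unsigned int cmd_flags; };

/* Simplified stand-in for op_is_sync(): reads and REQ_SYNC-flagged ops. */
static bool op_is_sync(unsigned int op)
{
	return (op & REQ_OP_MASK) == REQ_OP_READ || (op & REQ_SYNC);
}

/* Hypothetical common helper: map an op/flags word to a kyber domain. */
static int kyber_sched_domain(unsigned int op)
{
	if ((op & REQ_OP_MASK) == REQ_OP_READ)
		return KYBER_READ;
	else if ((op & REQ_OP_MASK) == REQ_OP_WRITE && op_is_sync(op))
		return KYBER_SYNC_WRITE;
	else
		return KYBER_OTHER;
}

/* Both callers become thin wrappers that only pick the right field. */
static int rq_sched_domain(const struct request *rq)
{
	return kyber_sched_domain(rq->cmd_flags);
}

static int bio_sched_domain(const struct bio *bio)
{
	return kyber_sched_domain(bio->bi_opf);
}
```

With this shape the classification logic lives in one place, and the
request and bio variants differ only in which field they read.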