From: Li Feng <fengli@smartx.com>
To: Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg,
	linux-nvme@lists.infradead.org (open list:NVM EXPRESS DRIVER),
	linux-kernel@vger.kernel.org (open list)
Cc: lifeng1519@gmail.com, Li Feng <fengli@smartx.com>
Subject: [PATCH v2] nvme/tcp: Add support to set the tcp worker cpu affinity
Date: Thu, 13 Apr 2023 21:29:41 +0800
Message-Id: <20230413132941.2489795-1-fengli@smartx.com>
In-Reply-To: <20230413062339.2454616-1-fengli@smartx.com>
References: <20230413062339.2454616-1-fengli@smartx.com>

The default worker affinity policy is to use all online CPUs, i.e. CPUs
0 through N-1. However, if some of those CPUs are busy with other work,
nvme-tcp performance suffers.

This patch adds a module parameter to set the CPU affinity of the
nvme-tcp socket worker threads. The parameter is a comma-separated list
of CPU numbers and ranges. The list is parsed and the resulting cpumask
is used to set the affinity of the socket worker threads. If the list
is empty or parsing fails, the default affinity (all online CPUs) is
used.

Signed-off-by: Li Feng <fengli@smartx.com>
---
V2:
- Fix missing static reported by lkp

 drivers/nvme/host/tcp.c | 54 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 53 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 49c9e7bc9116..47748de5159b 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -31,6 +31,18 @@ static int so_priority;
 module_param(so_priority, int, 0644);
 MODULE_PARM_DESC(so_priority, "nvme tcp socket optimize priority");
 
+/* Support for specifying the CPU affinity for the nvme-tcp socket worker
+ * threads. This is a comma-separated list of CPU numbers. The list is
+ * parsed and the resulting cpumask is used to set the affinity of the
+ * socket worker threads. If the list is empty or the parsing fails, the
+ * default affinity is used.
+ */
+static char *cpu_affinity_list;
+module_param(cpu_affinity_list, charp, 0644);
+MODULE_PARM_DESC(cpu_affinity_list, "nvme tcp socket worker cpu affinity list");
+
+static struct cpumask cpu_affinity_mask;
+
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 /* lockdep can detect a circular dependency of the form
  * sk_lock -> mmap_lock (page fault) -> fs locks -> sk_lock
@@ -1483,6 +1495,41 @@ static bool nvme_tcp_poll_queue(struct nvme_tcp_queue *queue)
 		ctrl->io_queues[HCTX_TYPE_POLL];
 }
 
+static ssize_t update_cpu_affinity(const char *buf)
+{
+	cpumask_var_t new_value;
+	cpumask_var_t dst_value;
+	int err = 0;
+
+	if (!zalloc_cpumask_var(&new_value, GFP_KERNEL))
+		return -ENOMEM;
+
+	err = bitmap_parselist(buf, cpumask_bits(new_value), nr_cpumask_bits);
+	if (err)
+		goto free_new_cpumask;
+
+	if (!zalloc_cpumask_var(&dst_value, GFP_KERNEL)) {
+		err = -ENOMEM;
+		goto free_new_cpumask;
+	}
+
+	/*
+	 * If new_value has no intersection with cpu_online_mask, dst_value
+	 * stays empty and cpu_affinity_mask keeps its previous value.
+	 */
+	if (cpumask_and(dst_value, new_value, cpu_online_mask))
+		cpu_affinity_mask = *dst_value;
+
+	free_cpumask_var(dst_value);
+
+free_new_cpumask:
+	free_cpumask_var(new_value);
+	if (err)
+		pr_err("failed to update cpu affinity mask, bad affinity list [%s], err %d\n",
+		       buf, err);
+	return err;
+}
+
 static void nvme_tcp_set_queue_io_cpu(struct nvme_tcp_queue *queue)
 {
 	struct nvme_tcp_ctrl *ctrl = queue->ctrl;
@@ -1496,7 +1543,12 @@ static void nvme_tcp_set_queue_io_cpu(struct nvme_tcp_queue *queue)
 	else if (nvme_tcp_poll_queue(queue))
 		n = qid - ctrl->io_queues[HCTX_TYPE_DEFAULT] -
 				ctrl->io_queues[HCTX_TYPE_READ] - 1;
-	queue->io_cpu = cpumask_next_wrap(n - 1, cpu_online_mask, -1, false);
+
+	if (!cpu_affinity_list || update_cpu_affinity(cpu_affinity_list) != 0) {
+		/* Fall back to the default affinity: all online CPUs. */
+		cpu_affinity_mask = *cpu_online_mask;
+	}
+	queue->io_cpu = cpumask_next_wrap(n - 1, &cpu_affinity_mask, -1, false);
 }
 
 static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid)
-- 
2.40.0
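
For reference, a brief usage sketch (the CPU values below are made up
for illustration; the sysfs path follows from the 0644 charp
module_param() above, which exposes the parameter as writable under
/sys/module/nvme_tcp/parameters/):

  # Pin the nvme-tcp socket workers to CPUs 0-3 and 8 at load time.
  # The value uses bitmap_parselist() syntax: comma-separated CPU
  # numbers and ranges.
  modprobe nvme-tcp cpu_affinity_list=0-3,8

  # The list can also be changed at runtime; a new value takes effect
  # the next time queue io_cpus are assigned, i.e. on queue allocation
  # (for example, when a controller is (re)connected).
  echo 0-3,8 > /sys/module/nvme_tcp/parameters/cpu_affinity_list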