From: Pratyush Yadav
To: Keith Busch
CC: Christoph Hellwig, Sagi Grimberg, Jens Axboe
Subject: Re: [PATCH] nvme-pci: do not set the NUMA node of device if it has none
References: <20230725110622.129361-1-ptyadav@amazon.de> <50a125da-95c8-3b9b-543a-016c165c745d@grimberg.me> <20230726131408.GA15909@lst.de>
Date: Wed, 26 Jul 2023 21:32:33 +0200
In-Reply-To: (Keith Busch's message of "Wed, 26 Jul 2023 10:17:20 -0600")
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jul 26 2023, Keith Busch wrote:

> On Wed, Jul 26, 2023 at 05:30:33PM +0200, Pratyush Yadav wrote:
>> On Wed, Jul 26 2023, Christoph Hellwig wrote:
>> > On Wed, Jul 26, 2023 at 10:58:36AM +0300, Sagi Grimberg wrote:
>> >>>> For example, AWS EC2's i3.16xlarge instance does not expose NUMA
>> >>>> information for the NVMe devices. This means all NVMe devices have
>> >>>> NUMA_NO_NODE by default. Without this patch, random 4k read performance
>> >>>> measured via fio on CPUs from node 1 (around 165k IOPS) is almost 50%
>> >>>> less than CPUs from node 0 (around 315k IOPS). With this patch, CPUs on
>> >>>> both nodes get similar performance (around 315k IOPS).
>> >>>
>> >>> irqbalance doesn't work with this driver though: the interrupts are
>> >>> managed by the kernel. Is there some other reason to explain the perf
>> >>> difference?
>>
>> Hmm, I did not know that. I have not gone and looked at the code but I
>> think the same reasoning should hold, just with s/irqbalance/kernel. If
>> the kernel IRQ balancer sees the device is on node 0, it would deliver
>> its interrupts to CPUs on node 0.
>>
>> In my tests I can see that the interrupts for NVME queues are sent only
>> to CPUs from node 0 without this patch. With this patch CPUs from both
>> nodes get the interrupts.
>
> Could you send the output of:
>
>   numactl --hardware

$ numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
node 0 size: 245847 MB
node 0 free: 245211 MB
node 1 cpus: 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
node 1 size: 245932 MB
node 1 free: 245328 MB
node distances:
node   0   1
  0:  10  21
  1:  21  10

>
> and then with and without your patch:
>
>   for i in $(cat /proc/interrupts | grep nvme0 | sed "s/^ *//g" | cut -d":" -f 1); do \
>     cat /proc/irq/$i/{smp,effective}_affinity_list; \
>   done

Without my patch:

$ for i in $(cat /proc/interrupts | grep nvme0 | sed "s/^ *//g" | cut -d":" -f 1); do \
>     cat /proc/irq/$i/{smp,effective}_affinity_list; \
> done
40
40
33
33
44
44
9
9
32
32
2
2
6
6
11
11
1
1
35
35
39
39
13
13
42
42
46
46
41
41
46
46
15
15
5
5
43
43
0
0
14
14
8
8
12
12
7
7
10
10
47
47
38
38
36
36
3
3
34
34
45
45
5
5

With my patch:

$ for i in $(cat /proc/interrupts | grep nvme0 | sed "s/^ *//g" | cut -d":" -f 1); do \
>     cat /proc/irq/$i/{smp,effective}_affinity_list; \
> done
9
9
15
15
5
5
23
23
38
38
52
52
21
21
36
36
13
13
56
56
44
44
42
42
31
31
48
48
5
5
3
3
1
1
11
11
28
28
18
18
34
34
29
29
58
58
46
46
54
54
59
59
32
32
7
7
56
56
62
62
49
49
57
57

--
Regards,
Pratyush Yadav

Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Managing directors: Christian Schlaeger, Jonathan Weiss
Registered at the Charlottenburg Local Court (Amtsgericht Charlottenburg) under HRB 149173 B
Registered office: Berlin
VAT ID: DE 289 237 879
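
As a rough cross-check of the affinity lists above, the target CPUs can be tallied per NUMA node. The following is only a sketch: the helper names `node_of_cpu` and `count_per_node` are hypothetical, the CPU-to-node layout is hard-coded from the `numactl --hardware` output in this mail, and each IRQ is listed once because its smp and effective affinity values are identical in both runs.

```shell
#!/usr/bin/env bash
# Sketch: tally IRQ target CPUs per NUMA node.
# Assumption: the layout below is copied from the numactl output in this
# thread (node 0: CPUs 0-15 and 32-47; node 1: CPUs 16-31 and 48-63).
# On any other machine these ranges will differ.

# Print the NUMA node (0 or 1) that owns the given CPU id.
# Any CPU id outside the node 0 ranges is reported as node 1.
node_of_cpu() {
    local cpu=$1
    if (( (cpu >= 0 && cpu <= 15) || (cpu >= 32 && cpu <= 47) )); then
        echo 0
    else
        echo 1
    fi
}

# Count how many of the given CPU ids belong to each node.
count_per_node() {
    local n0=0 n1=0 cpu
    for cpu in "$@"; do
        if [ "$(node_of_cpu "$cpu")" -eq 0 ]; then
            n0=$((n0 + 1))
        else
            n1=$((n1 + 1))
        fi
    done
    echo "node0=$n0 node1=$n1"
}

# One effective_affinity_list value per nvme0 IRQ, as reported above.
without_patch="40 33 44 9 32 2 6 11 1 35 39 13 42 46 41 46 15 5 43 0 14 8 12 7 10 47 38 36 3 34 45 5"
with_patch="9 15 5 23 38 52 21 36 13 56 44 42 31 48 5 3 1 11 28 18 34 29 58 46 54 59 32 7 56 62 49 57"

# The lists are intentionally unquoted so each CPU id becomes one argument.
echo "without patch: $(count_per_node $without_patch)"
echo "with patch:    $(count_per_node $with_patch)"
```

With the numbers from this mail, the tally comes out as 32 IRQs on node 0 and none on node 1 without the patch, and 16 on each node with it, which matches the observation that node 1 CPUs only receive NVMe interrupts once the patch is applied.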