Subject: Re: [Patch v4 1/3] lib: Restrict cpumask_local_spread to houskeeping CPUs
To: Jesse Brandeburg, Marcelo Tosatti, Thomas Gleixner, frederic@kernel.org
Cc: Robin Murphy, linux-kernel@vger.kernel.org, linux-api@vger.kernel.org,
    juri.lelli@redhat.com, abelits@marvell.com, bhelgaas@google.com,
    linux-pci@vger.kernel.org, rostedt@goodmis.org, mingo@kernel.org,
    peterz@infradead.org, davem@davemloft.net, akpm@linux-foundation.org,
    sfr@canb.auug.org.au, stephen@networkplumber.org, rppt@linux.vnet.ibm.com,
    jinyuqi@huawei.com, zhangshaokun@hisilicon.com
References: <20200625223443.2684-1-nitesh@redhat.com>
 <20200625223443.2684-2-nitesh@redhat.com>
 <3e9ce666-c9cd-391b-52b6-3471fe2be2e6@arm.com>
 <20210127121939.GA54725@fuller.cnet>
 <87r1m5can2.fsf@nanos.tec.linutronix.de>
 <20210128165903.GB38339@fuller.cnet>
 <87h7n0de5a.fsf@nanos.tec.linutronix.de>
 <20210204181546.GA30113@fuller.cnet>
 <20210204190647.GA32868@fuller.cnet>
 <87y2g26tnt.fsf@nanos.tec.linutronix.de>
 <7780ae60-efbd-2902-caaa-0249a1f277d9@redhat.com>
 <07c04bc7-27f0-9c07-9f9e-2d1a450714ef@redhat.com>
 <20210406102207.0000485c@intel.com>
From: Nitesh Narayan Lal
Organization: Red Hat Inc.
Message-ID: <1a044a14-0884-eedb-5d30-28b4bec24b23@redhat.com>
Date: Wed, 7 Apr 2021 11:18:09 -0400
In-Reply-To: <20210406102207.0000485c@intel.com>

On 4/6/21 1:22 PM, Jesse Brandeburg wrote:
> Continuing a thread from a bit ago...
>
> Nitesh Narayan Lal wrote:
>
>>> After a little more digging, I found out why the cpumask_local_spread
>>> change affects the general/initial smp_affinity for certain device IRQs.
>>>
>>> After the introduction of the commit:
>>>
>>>     e2e64a932 genirq: Set initial affinity in irq_set_affinity_hint()
>>>
>> Continuing the conversation about the above commit and adding Jesse.
>> I was trying to understand the problem that the commit message describes:
>> "The default behavior of the kernel is somewhat undesirable as all
>> requested interrupts end up on CPU0 after registration." I have also been
>> trying to reproduce this behavior without the patch, but I failed to do
>> so, maybe because I am missing something here.
>>
>> @Jesse Can you please explain? FWIU, IRQ affinity should be decided
>> based on the default affinity mask.

Thanks, Jesse, for responding.

> The original issue, as seen, was that if you rmmod/insmod a driver
> *without* irqbalance running, the default irq mask is -1, which means
> any CPU. The older kernels (this issue was patched in 2014) used to use
> that affinity mask, but the value programmed into all the interrupt
> registers ("actual affinity") would end up delivering all interrupts to
> CPU0,

So does that mean the affinity mask for the IRQs was different from where
the IRQs were actually delivered? Or was the affinity mask itself for the
IRQs changed to 0 instead of -1 after the rmmod/insmod?

I did a quick test on top of 5.12.0-rc6 by comparing the i40e IRQ affinity
mask before removing the kernel module and after doing rmmod+insmod, and I
didn't find any difference.

> and if the machine was under incoming traffic load when the driver
> loaded, CPU0 would start to poll among all the different netdev queues,
> all on CPU0.
>
> The above then leads to the condition that the device is stuck polling
> even if the affinity gets updated from user space, and the polling will
> continue until traffic stops.
>
>> The problem with the commit is that when we overwrite the affinity mask
>> based on the hinting mask, we completely ignore the default SMP affinity
>> mask. If we do want to overwrite the affinity based on the hint mask, we
>> should at least consider the default SMP affinity.
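For anyone following along, here is roughly the shape of
irq_set_affinity_hint() after that commit. This is paraphrased from
kernel/irq/manage.c and trimmed for brevity, so treat it as a sketch
rather than the verbatim source:

	int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
	{
		unsigned long flags;
		struct irq_desc *desc = irq_get_desc_lock(irq, &flags,
							  IRQ_GET_DESC_CHECK_GLOBAL);

		if (!desc)
			return -EINVAL;

		/* Record the hint so it can be read back via /proc and irqbalance. */
		desc->affinity_hint = m;
		irq_put_desc_unlock(desc, flags);

		/*
		 * The part e2e64a932 added: apply the hint as the initial
		 * affinity. Note that nothing on this path consults
		 * irq_default_affinity (/proc/irq/default_smp_affinity).
		 */
		if (m)
			__irq_set_affinity(irq, m, false);

		return 0;
	}

The hint path applies the driver-supplied mask directly, which is why the
default mask never gets a say for these IRQs.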
For the issue where the IRQs don't follow the default_smp_affinity mask
because of this patch, the following are the steps by which it can be
easily reproduced with the latest Linux kernel:

# Kernel
5.12.0-rc6+

# Other parameters in the cmdline
isolcpus=2-39,44-79 nohz=on nohz_full=2-39,44-79 rcu_nocbs=2-39,44-79

# cat /proc/irq/default_smp_affinity
0000,00000f00,00000003   [corresponds to the HK CPUs 0, 1, 40, 41, 42 and 43]

# Create VFs and check the IRQ affinity mask

/proc/irq/1423/iavf-ens1f1v3-TxRx-3     3
/proc/irq/1424/iavf-0000:3b:0b.0:mbx    0 40 42
/proc/irq/1425/iavf-ens1f1v8-TxRx-0     0
/proc/irq/1426/iavf-ens1f1v8-TxRx-1     1
/proc/irq/1427/iavf-ens1f1v8-TxRx-2     2
/proc/irq/1428/iavf-ens1f1v8-TxRx-3     3
...
/proc/irq/1475/iavf-ens1f1v15-TxRx-0    0
/proc/irq/1476/iavf-ens1f1v15-TxRx-1    1
/proc/irq/1477/iavf-ens1f1v15-TxRx-2    2
/proc/irq/1478/iavf-ens1f1v15-TxRx-3    3
/proc/irq/1479/iavf-0000:3b:0a.0:mbx    0 40 42
...
/proc/irq/240/iavf-ens1f1v3-TxRx-0      0
/proc/irq/248/iavf-ens1f1v3-TxRx-1      1
/proc/irq/249/iavf-ens1f1v3-TxRx-2      2

Trace dump:
-----------
..
11551082: NetworkManager-1734 [040] 8167.465719: vector_activate: irq=1478 is_managed=0 can_reserve=1 reserve=0
11551090: NetworkManager-1734 [040] 8167.465720: vector_alloc: irq=1478 vector=65 reserved=1 ret=0
11551093: NetworkManager-1734 [040] 8167.465721: vector_update: irq=1478 vector=65 cpu=42 prev_vector=0 prev_cpu=0
11551097: NetworkManager-1734 [040] 8167.465721: vector_config: irq=1478 vector=65 cpu=42 apicdest=0x00000200
11551357: NetworkManager-1734 [040] 8167.465768: vector_alloc: irq=1478 vector=46 reserved=0 ret=0
11551360: NetworkManager-1734 [040] 8167.465769: vector_update: irq=1478 vector=46 cpu=3 prev_vector=65 prev_cpu=42
11551364: NetworkManager-1734 [040] 8167.465770: vector_config: irq=1478 vector=46 cpu=3 apicdest=0x00040100
..

As we can see in the above trace, the initial affinity for IRQ 1478 was
correctly set as per the default_smp_affinity mask, which includes CPU 42;
however, later on, it is updated to CPU 3, which is returned from
cpumask_local_spread().

> Maybe the right thing is to fix which CPUs are passed in as the valid
> mask, or make sure the kernel cross checks that what the driver asks
> for is a "valid CPU"?
>

Sure, if we can still reproduce the problem that your patch was fixing,
then maybe we can consider adding a new API like cpumask_local_spread_irq,
in which we would consider the default_smp_affinity mask as well before
returning the CPU.
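Just to make that idea concrete, below is an untested sketch of what such
a cpumask_local_spread_irq() could look like. The name, and the choice of
irq_default_affinity (the mask behind /proc/irq/default_smp_affinity) as
the restricting mask, are hypothetical; it simply narrows the candidate
CPUs before doing the usual NUMA-aware spread:

	/*
	 * Untested sketch, not existing kernel API: behave like
	 * cpumask_local_spread(), but only hand out CPUs that are also
	 * present in the default IRQ affinity mask.
	 */
	unsigned int cpumask_local_spread_irq(unsigned int i, int node)
	{
		unsigned int cpu;
		cpumask_var_t mask;

		if (!zalloc_cpumask_var(&mask, GFP_KERNEL))
			return cpumask_local_spread(i, node);	/* fall back */

		/* Candidates = online CPUs that are in irq_default_affinity. */
		cpumask_and(mask, cpu_online_mask, irq_default_affinity);
		if (cpumask_empty(mask))
			cpumask_copy(mask, cpu_online_mask);

		/* Wrap the index so it always lands on some candidate CPU. */
		i %= cpumask_weight(mask);

		if (node == NUMA_NO_NODE) {
			for_each_cpu(cpu, mask)
				if (i-- == 0)
					goto out;
		} else {
			/* Prefer CPUs local to @node, then fall back to the rest. */
			for_each_cpu_and(cpu, cpumask_of_node(node), mask)
				if (i-- == 0)
					goto out;

			for_each_cpu(cpu, mask) {
				if (cpumask_test_cpu(cpu, cpumask_of_node(node)))
					continue;
				if (i-- == 0)
					goto out;
			}
		}
		cpu = cpumask_first(mask);	/* not reached */
	out:
		free_cpumask_var(mask);
		return cpu;
	}

Whether irq_default_affinity is the right mask to consult here, versus the
housekeeping mask that the original patch used, is of course part of what
would need to be settled.

--
Thanks
Nitesh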