Received: by 2002:ab2:6203:0:b0:1f5:f2ab:c469 with SMTP id o3csp874051lqt; Fri, 19 Apr 2024 13:07:33 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCV0b2nTW0Jzi8pvXtjgA0LkvN1E827z+xB155s0pl4qqtbANm3IV18rYC5+yGD4g9T0ed1bwoS7c3AwfV2CeLoDRj2GEsYW9c+XrQBHAA== X-Google-Smtp-Source: AGHT+IG1lHa8+DRpHOZGNpVt2Zo1oGRWtBazr/CZqqcGQqMOCCLSmcQ8k9I7dBZ3xWEazlYO2K0A X-Received: by 2002:a05:6214:1086:b0:699:2df2:b106 with SMTP id o6-20020a056214108600b006992df2b106mr3295808qvr.23.1713557252970; Fri, 19 Apr 2024 13:07:32 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1713557252; cv=pass; d=google.com; s=arc-20160816; b=XCviAYWrlGEUoWagmSagXgbh2d5j0kzc0nHC/wYomyv0vp73pZzksQh7nS+ITu9b5a 6klRJ+Lp+f+CcdfrZFBtKT/qIbdqYRC4KplCyF4JkbAV7NOn+tQzrbYiqB4+q+P0ufAX U8GXaRMGHlgcI7bgugVfeALFd9sI674WtZVkwPAkBIUPeJW76dX8JlTDKz/eMOMwwJcH SJ364QLnltC49Z2s3m288gtDpw1Na/SrW9FFR096KqCAPanL25J1e82C+tb3Rh7bOYpG K7vGBl3FJpqaItUfUpKyITEUKo9CQnCZ7hcMxVknUP3UiZrgQZRa8/MavPmf0M5OaQqQ sZXg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:list-unsubscribe:list-subscribe:list-id:precedence :references:message-id:subject:cc:to:from:date:dkim-signature; bh=rpfCc45JwcQ9ePG/Nna/L5FJ2hy83XCL3in+ok+QqqU=; fh=9chnAObGko98d4QjNMST/tS4hYQMZqp032WYjm8SnJ8=; b=R0qqKJBn/D1Oih1EN6fm0wQdWKOVHSDZKZUndu69ab7SIfqBBgqcSlzvGc+/AaieUm EcSl2zuo9SfE6MioN1un419bofHN7Q2NL5BqRAvKTxtGBfvp3oZ8e5loK2kUD3BE/wMq VFq+WOc/XZh2Y/c0Az0dxVQyKfuVuqztnxoBtXuCPvhCB2HSeu/e8Olxz6IpU4lQJK71 8eeTU4y9g+2PWRZv2yc6yZtseXW3KRg4crbyO8bpkin5wJFs6z1SIJ7ZcywH41mR4QRH N33zR77ZqDAep/qpVnId8LZpB3Mglwxt9lp1uyllTh2BLyQg5QO3+s0FJlEs7ssf4782 69Qw==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=QfiCvfRU; arc=pass (i=1 dkim=pass dkdomain=kernel.org); spf=pass (google.com: domain of linux-kernel+bounces-151929-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-151929-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [2604:1380:45d1:ec00::1]) by mx.google.com with ESMTPS id fn10-20020ad45d6a000000b0069b1f1c4be7si4729398qvb.166.2024.04.19.13.07.32 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 19 Apr 2024 13:07:32 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-151929-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) client-ip=2604:1380:45d1:ec00::1; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=QfiCvfRU; arc=pass (i=1 dkim=pass dkdomain=kernel.org); spf=pass (google.com: domain of linux-kernel+bounces-151929-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-151929-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id A2D941C20CDE for ; Fri, 19 Apr 2024 20:07:32 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 0D4EB13CF9E; Fri, 19 Apr 2024 20:07:22 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="QfiCvfRU" Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0718D10A03; Fri, 19 Apr 2024 20:07:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1713557241; cv=none; b=EWCfgw6IHu+P8Vqjw1DAtj2EMaQ7D08Y8jVSc0yEMlkw+jRPsu2okUhWyHKUKvXnj5SAWtFBL3lDaHC4ZXUZ1ws60qcwJ+DuQtPPHlXqO2d7ml2LH0mQXd61q5uKEaHiYL11R3yvz+kf1MzuDW85kSEpO3Ilb359dWXeolTlSpI= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1713557241; c=relaxed/simple; bh=ycgfHMlt+UlcaG4ketR/yuNNw2hvmApeoBvZ3KPmBP4=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=KC5eSQxMEInqzmma7a4O3BA0Untk4pIsTRhkS1/ArH+IXfvBu/q+AC8ujkY7JSvXQQ5vVQQz6fQxXDE5ueseOTA8uFrslKZYQvVxHR0iSXikYfPCdlG97nkKRYvVVdR/n7+PZ2dZE2ButJYnaS342Cq1nf9cd/JmKSJTRBAAVSM= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=QfiCvfRU; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id F1759C072AA; Fri, 19 Apr 2024 20:07:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1713557240; bh=ycgfHMlt+UlcaG4ketR/yuNNw2hvmApeoBvZ3KPmBP4=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=QfiCvfRU8afAJ78s4+eQn6qzBAimsQ2168PoqAIHVelV/LRVyalukWuZ6HVlfP68B FAeSC1pSGwpOwealSFcBOFuDOIU7XREj+dbocpnEGmodtiP0831dOBTYtVLkqHaFy/ a6O81VtJbB5ZZVd7SYwZuKvDZnRFJ6DdxWqL9vTOVXvHFIpXlzjGeJwGjzloMikfA3 8Ogdj9mHkI+ra9Ew6t8V0IiTht/NZ1wxXIy8cHyS+GUIe0JhxNXwvEBnsqo2AWnAhu RkbLosw5jKAGFaadrKUV2+dhdXWWbgbC/+XS9+imfzkke0f7phdt6hOdEbVMkhsjJi JCzeiah7YRVZA== Date: Fri, 19 Apr 2024 17:07:17 -0300 From: Arnaldo Carvalho de Melo To: Thomas Gleixner Cc: Jacob Pan , LKML , X86 Kernel , Peter Zijlstra , iommu@lists.linux.dev, Lu Baolu , kvm@vger.kernel.org, Dave Hansen , Joerg Roedel , "H. Peter Anvin" , Borislav Petkov , Ingo Molnar , Paul Luse , Dan Williams , Jens Axboe , Raj Ashok , "Tian, Kevin" , maz@kernel.org, seanjc@google.com, Robin Murphy , jim.harris@samsung.com, a.manzanares@samsung.com, Bjorn Helgaas , guang.zeng@intel.com, robert.hoo.linux@gmail.com, kan.liang@intel.com, "Kleen, Andi" Subject: Re: [PATCH v2 05/13] x86/irq: Reserve a per CPU IDT vector for posted MSIs Message-ID: References: <20240415134354.67c9d1d1@jacob-builder> <87jzkuxaqv.ffs@tglx> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <87jzkuxaqv.ffs@tglx> On Fri, Apr 19, 2024 at 06:00:24AM +0200, Thomas Gleixner wrote: > On Mon, Apr 15 2024 at 13:43, Jacob Pan wrote: > > On Mon, 15 Apr 2024 11:53:58 -0700, Jacob Pan wrote: > >> On Thu, 11 Apr 2024 18:51:14 +0200, Thomas Gleixner wrote: > >> > If we really care then we do it proper for _all_ of them. Something like > >> > the uncompiled below. There is certainly a smarter way to do the build > >> > thing, but my kbuild foo is rusty. > >> I too had the concern of the wasting system vectors, but did not know how > >> to fix it. But now your code below works well. Tested without KVM in > >> .config to show the gaps: > >> > >> In VECTOR IRQ domain. > >> > >> BEFORE: > >> System: 46: 0-31,50,235-236,244,246-255 > >> > >> AFTER: > >> System: 46: 0-31,50,241-242,245-255 > >> > >> The only gap is MANAGED_IRQ_SHUTDOWN_VECTOR(243), which is expected on a > >> running system. > >> > >> Verified in irqvectors.s: .ascii "->MANAGED_IRQ_SHUTDOWN_VECTOR $243 > >> > >> POSTED MSI/first system vector moved up from 235 to 241 for this case. > >> > >> Will try to let tools/arch/x86/include/asm/irq_vectors.h also use it > >> instead of manually copy over each time. Any suggestions greatly > >> appreciated. > >> > > On a second thought, if we make system IRQ vector determined at compile > > time based on different CONFIG options, will it break userspace tools such > > as perf? More importantly the rule of not breaking userspace. The rule for tools/perf is "don't impose _any requirement_ on the kernel developers, they don't have to test if any change they do outside of tools/ will break something inside tools/." > tools/arch/x86/include/asm/irq_vectors.h is only used to generate the > list of system vectors for pretty output. And your change already broke > that. Yeah, I even moved that from tools/arch/x86/include/asm/irq_vectors.h to tools/perf/trace/beauty/arch/x86/include/asm/irq_vectors.h (for next merge window). Having it in tools/arch/x86/include/asm/irq_vectors.h was a bad decision as it, as you mentinoned, is only used to generate string tables: ⬢[acme@toolbox perf-tools-next]$ tools/perf/trace/beauty/tracepoints/x86_irq_vectors.sh static const char *x86_irq_vectors[] = { [0x02] = "NMI", [0x80] = "IA32_SYSCALL", [0xec] = "LOCAL_TIMER", [0xed] = "HYPERV_STIMER0", [0xee] = "HYPERV_REENLIGHTENMENT", [0xef] = "MANAGED_IRQ_SHUTDOWN", [0xf0] = "POSTED_INTR_NESTED", [0xf1] = "POSTED_INTR_WAKEUP", [0xf2] = "POSTED_INTR", [0xf3] = "HYPERVISOR_CALLBACK", [0xf4] = "DEFERRED_ERROR", [0xf6] = "IRQ_WORK", [0xf7] = "X86_PLATFORM_IPI", [0xf8] = "REBOOT", [0xf9] = "THRESHOLD_APIC", [0xfa] = "THERMAL_APIC", [0xfb] = "CALL_FUNCTION_SINGLE", [0xfc] = "CALL_FUNCTION", [0xfd] = "RESCHEDULE", [0xfe] = "ERROR_APIC", [0xff] = "SPURIOUS_APIC", }; ⬢[acme@toolbox perf-tools-next]$ Used in: root@number:~# perf trace -a -e irq_vectors:irq_work_entry/max-stack=32/ --max-events=1 0.000 kworker/u57:0-/9912 irq_vectors:irq_work_entry(vector: IRQ_WORK) __sysvec_irq_work ([kernel.kallsyms]) __sysvec_irq_work ([kernel.kallsyms]) sysvec_irq_work ([kernel.kallsyms]) asm_sysvec_irq_work ([kernel.kallsyms]) _raw_spin_unlock_irqrestore ([kernel.kallsyms]) dma_fence_wait_timeout ([kernel.kallsyms]) intel_atomic_commit_tail ([kernel.kallsyms]) process_one_work ([kernel.kallsyms]) worker_thread ([kernel.kallsyms]) kthread ([kernel.kallsyms]) ret_from_fork ([kernel.kallsyms]) ret_from_fork_asm ([kernel.kallsyms]) root@number:~# But as the original cset introducing this explains, these irq_vectors: tracepoins operate on just one of the vectors, so irq_work_entry(vector: IRQ_WORK), irq_vectors:reschedule_exit(vector: RESCHEDULE), etc. > The obvious solution to that is to expose that list in sysfs for > consumption by perf. nah, the best thing these days is stop using 'int' for vector and use 'enum irq_vector', then since we have BTF we can use that to do the enum -> string translation, like with (using /sys/kernel/btf/vmlinux, that is pretty much available everywhere these days): root@number:~# pahole clocksource_ids enum clocksource_ids { CSID_GENERIC = 0, CSID_ARM_ARCH_COUNTER = 1, CSID_MAX = 2, }; root@number:~# pahole skb_drop_reason | head enum skb_drop_reason { SKB_NOT_DROPPED_YET = 0, SKB_CONSUMED = 1, SKB_DROP_REASON_NOT_SPECIFIED = 2, SKB_DROP_REASON_NO_SOCKET = 3, SKB_DROP_REASON_PKT_TOO_SMALL = 4, SKB_DROP_REASON_TCP_CSUM = 5, SKB_DROP_REASON_SOCKET_FILTER = 6, SKB_DROP_REASON_UDP_CSUM = 7, SKB_DROP_REASON_NETFILTER_DROP = 8, root@number:~# Then its easy to go from 0 to CSID_GENERIC, etc. ⬢[acme@toolbox pahole]$ perf stat -e cycles pahole skb_drop_reason > /dev/null Performance counter stats for 'pahole skb_drop_reason': 6,095,427 cpu_atom/cycles:u/ (2.82%) 103,694,633 cpu_core/cycles:u/ (97.18%) 0.039031759 seconds time elapsed 0.016028000 seconds user 0.023007000 seconds sys ⬢[acme@toolbox pahole]$ - Arnaldo > But we don't have to do any of that right away. It's an orthogonal > issue. Just waste the extra system vector to start with and then we can > add the compile time dependend change on top if we really care about > gaining back the vectors. > > Thanks, > > tglx