Date: Tue, 25 Oct 2022 11:08:13 +0300
From: Lennert Buytenhek
To: David Woodhouse, Lu Baolu
Cc: Joerg Roedel, Will Deacon, Robin Murphy, iommu@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: [PATCH,RFC] iommu/vt-d: Convert dmar_fault IRQ to a threaded IRQ

Under a high enough I/O page fault load, the dmar_fault hardirq handler
can end up starving other tasks that wanted to run on the CPU that the
IRQ is being
routed to.  On an i7-6700 CPU this seems to happen at around 2.5 million
I/O page faults per second, and at a fraction of that rate on some of the
lower-end CPUs that we use.

An I/O page fault rate of 2.5 million per second may seem like a very
high number, but when we get an I/O page fault for every cache line
touched by a DMA operation, this rate can be the result of a confused
PCIe device DMAing to RAM at just 2.5M faults/sec * 64 bytes = 160
MB/sec, which is not an unusual rate for a device to be DMAing to RAM
at.  And, in fact, when we do see PCIe devices getting confused like
this, this sort of I/O page fault rate is not uncommon.

A peripheral device continuously DMAing to RAM at 160 MB/sec like this
is inarguably a bug, either in the device's kernel driver or in its
firmware, and should be fixed there, but it's the sort of bug that
iommu/vt-d could be handling better than it currently does, and there
is a fairly simple way to achieve that.

This patch changes the dmar_fault IRQ handler into a threaded IRQ
handler.  This is a fairly minimal code change, and comes with the
advantage that Intel IOMMU I/O page fault handling work is now subject
to RT throttling, which allows it to be kept under control using the
sched_rt_period_us / sched_rt_runtime_us parameters.  iommu/amd already
uses a threaded IRQ handler for its I/O page fault reporting, and so it
already has this advantage.

When IRQ remapping is enabled, iommu/vt-d will try to set up its
dmar_fault IRQ handler from start_kernel() -> x86_late_time_init() ->
apic_intr_mode_init() -> apic_bsp_setup() ->
irq_remap_enable_fault_handling() -> enable_drhd_fault_handling(),
which happens before kthreadd is started, and trying to set up a
threaded IRQ handler this early on will oops.
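For reference, the bandwidth arithmetic above, and the CPU share that RT
throttling permits, work out as follows.  This is just a sanity check of
the numbers in this message; the 64-byte cache line and the sysctl values
shown are assumptions (the usual upstream defaults), not part of the patch:

```python
# Assumption: one I/O page fault per 64-byte cache line touched by DMA.
faults_per_sec = 2_500_000          # observed saturation point on an i7-6700
cache_line_bytes = 64
dma_rate = faults_per_sec * cache_line_bytes
print(dma_rate // 10**6, "MB/sec")  # -> 160 MB/sec

# Assumption: usual upstream defaults for the RT throttling sysctls.
# RT throttling caps realtime work (which includes a threaded IRQ
# handler) at runtime/period of each CPU:
sched_rt_period_us = 1_000_000      # default kernel.sched_rt_period_us
sched_rt_runtime_us = 950_000       # default kernel.sched_rt_runtime_us
print(f"{sched_rt_runtime_us / sched_rt_period_us:.0%}")  # -> 95%
```

So even with default settings, a runaway fault storm would leave 5% of
each second for non-RT tasks on that CPU, and the budget can be tightened
further by lowering sched_rt_runtime_us.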
However, there doesn't seem to be a reason why iommu/vt-d needs to set
up its fault reporting IRQ handler this early, and if we remove the IRQ
setup code from enable_drhd_fault_handling(), the IRQ will be registered
instead from pci_iommu_init() -> intel_iommu_init() -> init_dmars(),
which seems to work just fine.

Suggested-by: Scarlett Gourley
Suggested-by: James Sewart
Suggested-by: Jack O'Sullivan
Signed-off-by: Lennert Buytenhek
---
 drivers/iommu/intel/dmar.c | 27 ++-------------------------
 1 file changed, 2 insertions(+), 25 deletions(-)

diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 5a8f780e7ffd..d0871fe9d04d 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -2043,7 +2043,8 @@ int dmar_set_interrupt(struct intel_iommu *iommu)
 		return -EINVAL;
 	}
 
-	ret = request_irq(irq, dmar_fault, IRQF_NO_THREAD, iommu->name, iommu);
+	ret = request_threaded_irq(irq, NULL, dmar_fault, IRQF_ONESHOT,
+				   iommu->name, iommu);
 	if (ret)
 		pr_err("Can't request irq\n");
 	return ret;
@@ -2051,30 +2052,6 @@ int dmar_set_interrupt(struct intel_iommu *iommu)
 
 int __init enable_drhd_fault_handling(void)
 {
-	struct dmar_drhd_unit *drhd;
-	struct intel_iommu *iommu;
-
-	/*
-	 * Enable fault control interrupt.
-	 */
-	for_each_iommu(iommu, drhd) {
-		u32 fault_status;
-		int ret = dmar_set_interrupt(iommu);
-
-		if (ret) {
-			pr_err("DRHD %Lx: failed to enable fault interrupt, ret %d\n",
-			       (unsigned long long)drhd->reg_base_addr, ret);
-			return -1;
-		}
-
-		/*
-		 * Clear any previous faults.
-		 */
-		dmar_fault(iommu->irq, iommu);
-		fault_status = readl(iommu->reg + DMAR_FSTS_REG);
-		writel(fault_status, iommu->reg + DMAR_FSTS_REG);
-	}
-
 	return 0;
 }
-- 
2.37.3