Date: Fri, 14 Jun 2024 21:42:56 -0700
Message-ID: <20240615044307.359980-1-swine@google.com>
Subject: [PATCH] FIXUP: genirq: defuse spurious-irq timebomb
From: Pete Swain
To: linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, Pete Swain, stable@vger.kernel.org
X-Mailer: git-send-email 2.45.2.627.g7a2c4fd464-goog

The flapping-irq detector still has a timebomb.

A pathological workload, or test script, can arm the spurious-irq
timebomb described in 4f27c00bf80f ("Improve behaviour of spurious IRQ
detect").

This leads to irqs being moved to the much slower polled mode, despite
the actual unhandled-irq rate being well under the 99.9k/100k threshold
that the code appears to check.

How?
  - A queued completion handler, like nvme, services events as they
    appear in the queue, even if the irq corresponding to the event
    has not yet been seen.

  - Queues are frequently empty, so "spurious" irqs are seen whenever
    the last pass of a threaded handler's
        while (events_queued()) process_them();
    loop ends with those events' irqs posted while the thread was
    scanning. In this case the while() has consumed the last event(s),
    so the next handler says IRQ_NONE.

  - In each run of "unhandled" irqs, exactly one IRQ_NONE response is
    promoted from IRQ_NONE to IRQ_HANDLED by note_interrupt()'s
    SPURIOUS_DEFERRED logic.

  - Any run of 2+ unhandled irqs will increment irqs_unhandled. The
    time_after() check in note_interrupt() resets irqs_unhandled to 1
    after an idle period, but if irqs are never spaced more than HZ/10
    apart, irqs_unhandled keeps growing.

  - During processing of long completion queues, the non-threaded
    handlers will return IRQ_WAKE_THREAD, for potentially thousands of
    per-event irqs. These bypass note_interrupt()'s irq_count++ logic,
    so they do not count as handled and do not invoke the flapping-irq
    logic.

  - When the _counted_ irq_count reaches the 100k threshold, it is
    possible for irqs_unhandled > 99.9k to force a move to polling
    mode, even though many millions of _WAKE_THREAD irqs have been
    handled without being counted.

Solution: include IRQ_WAKE_THREAD events in irq_count. Only when
IRQ_NONE responses outweigh (IRQ_HANDLED + IRQ_WAKE_THREAD) by the
existing 99.9k/100k ratio will an irq be moved to polling mode.

Fixes: 4f27c00bf80f ("Improve behaviour of spurious IRQ detect")
Cc: stable@vger.kernel.org
Signed-off-by: Pete Swain
---
 kernel/irq/spurious.c | 68 +++++++++++++++++++++----------------------
 1 file changed, 34 insertions(+), 34 deletions(-)

diff --git a/kernel/irq/spurious.c b/kernel/irq/spurious.c
index 02b2daf07441..ac596c8dc4b1 100644
--- a/kernel/irq/spurious.c
+++ b/kernel/irq/spurious.c
@@ -321,44 +321,44 @@ void note_interrupt(struct irq_desc *desc, irqreturn_t action_ret)
 			 */
 			if (!(desc->threads_handled_last & SPURIOUS_DEFERRED)) {
 				desc->threads_handled_last |= SPURIOUS_DEFERRED;
-				return;
-			}
-			/*
-			 * Check whether one of the threaded handlers
-			 * returned IRQ_HANDLED since the last
-			 * interrupt happened.
-			 *
-			 * For simplicity we just set bit 31, as it is
-			 * set in threads_handled_last as well. So we
-			 * avoid extra masking. And we really do not
-			 * care about the high bits of the handled
-			 * count. We just care about the count being
-			 * different than the one we saw before.
-			 */
-			handled = atomic_read(&desc->threads_handled);
-			handled |= SPURIOUS_DEFERRED;
-			if (handled != desc->threads_handled_last) {
-				action_ret = IRQ_HANDLED;
-				/*
-				 * Note: We keep the SPURIOUS_DEFERRED
-				 * bit set. We are handling the
-				 * previous invocation right now.
-				 * Keep it for the current one, so the
-				 * next hardware interrupt will
-				 * account for it.
-				 */
-				desc->threads_handled_last = handled;
 			} else {
 				/*
-				 * None of the threaded handlers felt
-				 * responsible for the last interrupt
+				 * Check whether one of the threaded handlers
+				 * returned IRQ_HANDLED since the last
+				 * interrupt happened.
 				 *
-				 * We keep the SPURIOUS_DEFERRED bit
-				 * set in threads_handled_last as we
-				 * need to account for the current
-				 * interrupt as well.
+				 * For simplicity we just set bit 31, as it is
+				 * set in threads_handled_last as well. So we
+				 * avoid extra masking. And we really do not
+				 * care about the high bits of the handled
+				 * count. We just care about the count being
+				 * different than the one we saw before.
 				 */
-				action_ret = IRQ_NONE;
+				handled = atomic_read(&desc->threads_handled);
+				handled |= SPURIOUS_DEFERRED;
+				if (handled != desc->threads_handled_last) {
+					action_ret = IRQ_HANDLED;
+					/*
+					 * Note: We keep the SPURIOUS_DEFERRED
+					 * bit set. We are handling the
+					 * previous invocation right now.
+					 * Keep it for the current one, so the
+					 * next hardware interrupt will
+					 * account for it.
+					 */
+					desc->threads_handled_last = handled;
+				} else {
+					/*
+					 * None of the threaded handlers felt
+					 * responsible for the last interrupt
+					 *
+					 * We keep the SPURIOUS_DEFERRED bit
+					 * set in threads_handled_last as we
+					 * need to account for the current
+					 * interrupt as well.
+					 */
+					action_ret = IRQ_NONE;
+				}
 			}
 		} else {
 			/*
-- 
2.45.2.627.g7a2c4fd464-goog