From: Thomas Gleixner
To: Guilherme Piccoli
Cc: Pingfan Liu, LKML, Peter Zijlstra, Jisheng Zhang, Andrew Morton,
    Petr Mladek, Marc Zyngier, Linus Walleij, afzal mohammed, Lina Iyer,
    "Gustavo A. R. Silva", Maulik Shah, Al Viro, Jonathan Corbet,
    Pawan Gupta, Mike Kravetz, Oliver Neukum, linux-doc@vger.kernel.org,
    Kexec Mailing List, Bjorn Helgaas
Subject: Re: [PATCH 0/3] warn and suppress irqflood
Date: Mon, 26 Oct 2020 22:21:36 +0100
Message-ID: <87o8ko3cpr.fsf@nanos.tec.linutronix.de>

On Mon, Oct 26 2020 at 17:28, Guilherme Piccoli wrote:
> On Mon, Oct 26, 2020 at 4:59 PM Thomas Gleixner wrote:
>> It gets flooded right at the point where the crash kernel enables
>> interrupts in start_kernel(). At that point there is no device driver
>> and no interrupt requested. All you can see on the console for this is
>>
>>   "common_interrupt: $VECTOR.$CPU No irq handler for vector"
>>
>> And contrary to Liu's patches which try to disable a requested interrupt
>> if too many of them arrive, the kernel cannot do anything because there
>> is nothing to disable in your case. That's why you needed to do the MSI
>> disable magic in the early PCI quirks which run before interrupts get
>> enabled.
>
> Wow, thank you very much for this great explanation (without a
> reproducer) - it's nice to hear somebody that deeply understands the
> code! And double thanks for CCing Bjorn.

Understanding the code is only half of the picture.
You need to understand how the hardware works or not :)

> So, I don't want to hijack Liu's thread, but do you think it makes
> sense to have my approach as a (debug) parameter to prevent such a
> degenerate case?

At least it makes sense to some extent even if it's incomplete. What
bothers me is that it'd be x86 specific while the issue is pretty much
architecture independent. I don't think that the APIC is special in that
regard. Rogue MSIs should be able to bring down pretty much all
architectures.

> Or could we have something in core IRQ code to prevent irq flooding in
> such scenarios, something "stronger" than disabling MSIs (APIC-level,
> likely)?

For your case? No. The APIC cannot be protected against rogue MSIs. The
only cure is to disable interrupts or disable MSIs on all PCI[E] devices
early on. Disabling interrupts is not so much of an option obviously :)

Thanks,

        tglx
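
For readers who have not looked at what "disable MSIs on a PCI[E] device"
means at the register level, here is a rough user-space sketch. It is not
the early PCI quirk discussed in this thread and not kernel code: it walks
the capability list of a mock 256-byte configuration space held in a plain
byte array and clears the MSI and MSI-X enable bits. The function name
mock_disable_msi() and the helpers are made up for illustration; the
register offsets and capability IDs are the standard PCI ones. A real early
quirk would have to use the kernel's early config space accessors and run
before interrupts get enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard PCI configuration space offsets, bits and capability IDs. */
#define PCI_STATUS             0x06
#define PCI_STATUS_CAP_LIST    0x10   /* device implements a capability list */
#define PCI_CAPABILITY_LIST    0x34   /* offset of the first capability */
#define PCI_CAP_ID_MSI         0x05
#define PCI_CAP_ID_MSIX        0x11
#define PCI_MSI_FLAGS          2      /* Message Control, relative to the cap */
#define PCI_MSI_FLAGS_ENABLE   0x0001
#define PCI_MSIX_FLAGS_ENABLE  0x8000

#define MOCK_CFG_SIZE          256    /* assumption: conventional config space only */

/* Hypothetical helpers: little-endian 16-bit access to the mock array. */
static uint16_t cfg_read16(const uint8_t *cfg, unsigned int off)
{
    return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

static void cfg_write16(uint8_t *cfg, unsigned int off, uint16_t val)
{
    cfg[off] = (uint8_t)(val & 0xff);
    cfg[off + 1] = (uint8_t)(val >> 8);
}

/*
 * Walk the capability list of one mock device and clear the enable bit of
 * every MSI / MSI-X capability that is currently enabled.
 * Returns the number of capabilities that were switched off.
 */
int mock_disable_msi(uint8_t *cfg)
{
    unsigned int pos;
    int ttl = 48;       /* bound the walk in case the list loops */
    int cleared = 0;

    assert(cfg != NULL);

    if (!(cfg_read16(cfg, PCI_STATUS) & PCI_STATUS_CAP_LIST))
        return 0;

    pos = cfg[PCI_CAPABILITY_LIST] & 0xfc;
    while (pos >= 0x40 && pos <= MOCK_CFG_SIZE - 4 && ttl-- > 0) {
        uint8_t id = cfg[pos];
        uint16_t flags = cfg_read16(cfg, pos + PCI_MSI_FLAGS);

        if (id == PCI_CAP_ID_MSI && (flags & PCI_MSI_FLAGS_ENABLE)) {
            cfg_write16(cfg, pos + PCI_MSI_FLAGS,
                        (uint16_t)(flags & ~PCI_MSI_FLAGS_ENABLE));
            cleared++;
        } else if (id == PCI_CAP_ID_MSIX && (flags & PCI_MSIX_FLAGS_ENABLE)) {
            cfg_write16(cfg, pos + PCI_MSI_FLAGS,
                        (uint16_t)(flags & ~PCI_MSIX_FLAGS_ENABLE));
            cleared++;
        }
        pos = cfg[pos + 1] & 0xfc;   /* next capability pointer */
    }
    return cleared;
}
```

To try it, fill a zeroed 256-byte array with a status register that has the
capability-list bit set, a capability pointer at 0x34 and one or two
capabilities with their enable bits set, then call mock_disable_msi() on it.
The point of the sketch is only that the fix lives in the device's config
space, which is why it has to be applied to every device before the crash
kernel enables interrupts.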