Received: by 2002:ac0:a582:0:0:0:0:0 with SMTP id m2-v6csp3599858imm; Fri, 19 Oct 2018 13:27:16 -0700 (PDT) X-Google-Smtp-Source: ACcGV60RuGTv/8168LnTroNBKQtvido1T7fBoo/bfkEmt24yEl0bpVSX0OJR2mF6OIlnfGGs8zph X-Received: by 2002:a17:902:708a:: with SMTP id z10-v6mr6936077plk.330.1539980835994; Fri, 19 Oct 2018 13:27:15 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1539980835; cv=none; d=google.com; s=arc-20160816; b=gn3ugMXVQory6rD+al9gub+Zn5+hf/sKRMv0MBlUZmVY4HNRvh8CnItJtppVZ5H6Lg 1ZOiHgEN0iovBf/La4ij2Nc0HFJjMUXC7T6gKk4audsunczcWGxeSsffuTTsiE8bq0rn UiVQP0xuCJUt6LRAyb4PKzgnFH5gKFqVfJmdHJlO55gIIDyyE1yDXrCJLSMXkn3CezLA bz+/iShsTcWjtImRZtlUOf1i1Ihw5MjIZKj6hSe4kA9B/1Cfj7LPw71fJ3HgcZoN4Wz0 UHd+TnSzG91UQjr5x29J6H4WWPUUoy8HPRj+DVi+X6oiq088N8WLeP2vlWpKEUhLu43i jClw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:user-agent:references :message-id:in-reply-to:subject:cc:to:from:date; bh=UnusqL5JOrudWGcgk2NmxTurw9DHWWcK5K4JJox//1Q=; b=x6/nqH5BlI6KsJmO+faAflc+iI3huFEkIKJ/2r7LfuGIEAzHjNIVUlkOj3FDCZ3Gjf 9ZjF6WjKUkBzDgULROH6bO2eZSMAlDMrxMaonhTdQUO/It3vNxnUNN8xBWhci89uDcdv ZIBZBc/OCLyUFVPmAo3xZRsLRqEyv28UCLX5w68yU+AE1IXiz0Pcnkr7L8at0jSbNfSy MtJPmvVZBLFe7ne7P6RxkX383VonUQGR+20w5CmoCEn6kvDNA/EHFeREF6xIJUyT5SvS eFTRPgpXsxXmBbeBo+kAZYgq2Swo45WXCawBTg5DPB5fqPFS+RIqtjESSTeV1EMdN0w1 pKfQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id z67-v6si19300510pfz.5.2018.10.19.13.27.00; Fri, 19 Oct 2018 13:27:15 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727295AbeJTEeH (ORCPT + 99 others); Sat, 20 Oct 2018 00:34:07 -0400 Received: from Galois.linutronix.de ([146.0.238.70]:40158 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726282AbeJTEeH (ORCPT ); Sat, 20 Oct 2018 00:34:07 -0400 Received: from p5492fe24.dip0.t-ipconnect.de ([84.146.254.36] helo=nanos) by Galois.linutronix.de with esmtpsa (TLS1.2:DHE_RSA_AES_256_CBC_SHA256:256) (Exim 4.80) (envelope-from ) id 1gDbLj-0005t8-1y; Fri, 19 Oct 2018 22:26:23 +0200 Date: Fri, 19 Oct 2018 22:26:22 +0200 (CEST) From: Thomas Gleixner To: David Laight cc: "casey.fitzpatrick@timesys.com" , "lukas@wunner.de" , "mingo@kernel.org" , "m.duckeck@kunbus.de" , "hpa@zytor.com" , "akshay.bhat@timesys.com" , "linux-kernel@vger.kernel.org" , "linux-tip-commits@vger.kernel.org" Subject: RE: [tip:irq/core] genirq: Fix race on spurious interrupt detection In-Reply-To: Message-ID: References: <1dfd8bbd16163940648045495e3e9698e63b50ad.1539867047.git.lukas@wunner.de> <2bbbe6ba27424b3b83d7f22bafea13ad@AcuMS.aculab.com> User-Agent: Alpine 2.21 (DEB 202 2017-01-01) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Linutronix-Spam-Score: -1.0 X-Linutronix-Spam-Level: - X-Linutronix-Spam-Status: No , -1.0 points, 5.0 required, ALL_TRUSTED=-1,SHORTCIRCUIT=-0.0001 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org David, On Fri, 19 Oct 2018, Thomas Gleixner wrote: > On Fri, 19 Oct 2018, David Laight wrote: > > From: Lukas Wunner > > > Sent: 19 October 2018 16:34 > > > > > > genirq: Fix race on spurious interrupt detection > > > > > > Commit 1e77d0a1ed74 ("genirq: Sanitize spurious interrupt detection of > > > threaded irqs") made detection of spurious interrupts work for threaded > > > handlers by: > > > > > > a) incrementing a counter every time the thread returns IRQ_HANDLED, and > > > b) checking whether that counter has increased every time the thread is > > > woken. > > > > That seems horribly broken. > > What is it trying to achieve? > > > > There are (at least) two common cases where IRQ_HANDLED doesn't get returned. > > (Unless the driver always returns it to avoid the message.) > > > > 1) The IOW that causes the hardware to drop a level sensitive IRQ is posted > > on the bus (etc) and happens late enough that the IRQ line is still > > asserted when the iret executes. > > If this happens all the time you need to flush the IOW, but if only > > occasionally it doesn't matter and you don't want a message. > > > > 2) Typically an ethernet driver ISR has to enable the interrupt and then > > check the ring for work before returning from the interrupt. > > If a packet arrives at this time it might be processed by the 'old' > > ISR invocation but still generate another interrupt. > > If no more packets arrive the second ISR invocation will find no work. > > Again this is normal behaviour. > > (Deferring everything with NAPI might make this not happen - but other > > interrupts end up working the same way.) > > > > If you are really trying to detect 'stuck' interrupts then you probably > > want to count un-handled ones and zero the count on handled ones. > > I'm also pretty sure you don't need an atomic counter. > > Care to look at the logic which handles all of this including the > interaction with threaded interrupt handlers? Sorry for being dense. Wanted to postpone and hit send ... The thing is that the spurious detector works perfectly fine for the scenarios you described since years. It does proper handled vs. unhandled accounting. What Lukas is addressing is not the normal interrupt case, it's the threaded handler case where we did the accounting check after reenabling the interrupt, which causes spurious detection to trigger occasionally. That's not leading to disabling the interrupt unless you enforce it with a gazillion of printks. The atomic counter is necessary in that case because for certain scenarios the hard interrupt handler can race with the threaded handler legitimately. Thanks, tglx