Received: by 2002:a05:6a10:8c0a:0:0:0:0 with SMTP id go10csp3112312pxb; Sun, 31 Jan 2021 04:45:54 -0800 (PST) X-Google-Smtp-Source: ABdhPJzJQ+9MhNqaPH8N4NjcRxG1saRMPbToH3qyj2cKP0jJYXkon3nfMn8hVkmSWebQ7vy1qu24 X-Received: by 2002:a05:6402:556:: with SMTP id i22mr13913644edx.56.1612097154624; Sun, 31 Jan 2021 04:45:54 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1612097154; cv=none; d=google.com; s=arc-20160816; b=X91nEnCevdhZFpCw5seBy6px5FYzFE6N3sHCyuTZtQfRNlZmS7K61usMsgUm2NqlBI QKVwTWsKzhLrOLkJ8Ijs/7EE0x4TSZSvK/CfGvNiQQG8Le0HKlRxOBZD6fLS+VYdFs/x Zj2/wWlF7vGyqtJVqHApXKGMxJ8p9T7YkzTBB6bKrpLe2Z5qLCto96s1kPKiaKl9sTyD OyygY9nCoVRXG+sN/dXwwZ2Qn/SJjjyDXtT529WFQXnEbpzeGaRRofXzbH888Xc9Cm2m JPCrJRFjIm0jp0v6X35AENwSowWPrFMX+ZpXoHvdrcKMo7A2WZ8aRc9W3lQ69NURWoJn NKrQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:subject:cc:to:from:date; bh=bxD/rwlXKrpjbUMZWW5bu7pWWONVtYsKM+9y8fJivuc=; b=OcBXDRISxL2sR0BGrMHIQ/ogeCF03LmqoJVhpbBn26SgH/7cMNfJds3CwRnvSO6WZX wFazH+WYrzndGNTGMP6jCzJ3fGFKX7dwPbZhKpmqKa6l98uZvP3DTsGAIJGoY4n0ynAx l0bXKsIZPn7jqVXOUYz+DdWeNLOeHGO/dv8r2hFH3KeKl2+SJIaVJhEV2fxJPyZHNyAQ Rj87UBs7OflH1KJjtZ8dzGYYcEegv8fLArQg25xmo4uwji7XE6kx09zHDuiz/B9js8pA qG7lnkkRKlVO8S5aL6aiYVutuMtkLhLtWhfva7BRh4Ddw+vyjJEF99jarkxAHY61UaU9 aJnA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id y62si11214855ede.330.2021.01.31.04.45.30; Sun, 31 Jan 2021 04:45:54 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231848AbhAaMFs convert rfc822-to-8bit (ORCPT + 99 others); Sun, 31 Jan 2021 07:05:48 -0500 Received: from mail.kernel.org ([198.145.29.99]:52412 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231314AbhAaLUk (ORCPT ); Sun, 31 Jan 2021 06:20:40 -0500 Received: from archlinux (cpc108967-cmbg20-2-0-cust86.5-4.cable.virginm.net [81.101.6.87]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 8B54564E25; Sun, 31 Jan 2021 11:08:15 +0000 (UTC) Date: Sun, 31 Jan 2021 11:08:11 +0000 From: Jonathan Cameron To: Fabrice Gasnier Cc: Ahmad Fatoum , Olivier Moysan , Fabrice Gasnier , Lars-Peter Clausen , Holger Assmann , Maxime Coquelin , , Peter Meerwald-Stadler , , , Thomas Gleixner , Lucas Stach , , , Alexandre Torgue Subject: Re: [Linux-stm32] [PATCH] iio: adc: stm32-adc: fix erroneous handling of spurious IRQs Message-ID: <20210131110811.7d923ff6@archlinux> In-Reply-To: <156c5b73-2f66-653f-4512-e271a10ddd5e@foss.st.com> References: <20210112152441.20665-1-a.fatoum@pengutronix.de> <20210116175333.4d8684c5@archlinux> <47b0905a-4496-2f21-3b17-91988aa88e91@pengutronix.de> <7668b126-d77c-7339-029f-50333d03fbd9@foss.st.com> <156c5b73-2f66-653f-4512-e271a10ddd5e@foss.st.com> X-Mailer: Claws Mail 3.17.8 (GTK+ 2.24.33; x86_64-pc-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8BIT Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 26 Jan 2021 16:52:37 +0100 Fabrice Gasnier wrote: > On 1/22/21 1:18 PM, Ahmad Fatoum wrote: > > Hello Fabrice, > > > > On 19.01.21 18:56, Fabrice Gasnier wrote: > >> On 1/18/21 12:42 PM, Ahmad Fatoum wrote: > >>> Hello Jonathan, > >>> > >>> On 16.01.21 18:53, Jonathan Cameron wrote: > >>>> On Tue, 12 Jan 2021 16:24:42 +0100 > >>>> Ahmad Fatoum wrote: > >>>> > >>>>> 1c6c69525b40 ("genirq: Reject bogus threaded irq requests") makes sure > >>>>> that threaded IRQs either > >>>>> - have IRQF_ONESHOT set > >>>>> - don't have the default just return IRQ_WAKE_THREAD primary handler > >>>>> > >>>>> This is necessary because level-triggered interrupts need to be masked, > >>>>> either at device or irqchip, to avoid an interrupt storm. > >>>>> > >>>>> For spurious interrupts, the STM32 ADC driver still does this bogus > >>>>> request though: > >>>>> - It doesn't set IRQF_ONESHOT > >>>>> - Its primary handler just returns IRQ_WAKE_THREAD if the interrupt > >>>>> is unexpected, i.e. !(status & enabled_mask) > >>>> This seems 'unusual'. If this is a spurious interrupt we should be > >>>> returning IRQ_NONE and letting the spurious interrupt protection > >>>> stuff kick in. > >>>> > >>>> The only reason I can see that it does this is print an error message. > >>>> I'm not sure why we need to go into the thread to do that given > >>>> it's not supposed to happen. If we need that message at all, I'd > >>>> suggest doing it in the interrupt handler then return IRQ_NONE; > >>> As described, I run into the spurious IRQ case, so I think the message is > >>> still useful (until that's properly fixed), but yes, it should've returned > >>> IRQ_NONE in that case. > >>> > >>> With these changes, IRQF_ONESHOT shouldn't be necessary, but in practice > >>> the driver doesn't function correctly with the primary IRQ handler threaded. > >>> > >>> Olivier, Fabrice: Are you aware of this problem? > >> > >> > >> Hi Ahmad, Jonathan, > >> > >> I wasn't aware of this up to now. I confirm we've the same behavior at > >> our end with threadirqs=1. > >> > >> Olivier and I started to look at this. Indeed, the IRQF_ONESHOT makes > >> the issue to disappear. > >> I'm not sure 100% that's for the above reasons. Please let me share some > >> piece of logs, analysis and thoughts. > > > > Thanks for looking at this. > > > >> I may miss it but, the patch "genirq: Reject bogus threaded irq > >> requests" seems to handle the case where no HW handler is provided, but > >> only the threaded part? > > > > There is still a primary handler, but that one does only do IRQ_WAKE_THREAD, > > so I assumed that would be equivalent to what the driver is doing in the > > spurious IRQ case. > > > >> In the stm32-adc both are provided. Also the IRQ domain in > >> stm32-adc-core maybe a key here ? > > > > Oh, missed completely that the stm32-adc-core does the interrupt routing. > > Hi Ahmad, Jonathan, > > The interrupt routing is done in the core by using "dummy_irq_chip". > > Currently (with threadirqs=1), irq_mask and irq_unmask callbacks are > called, but this makes a "noop": > struct irq_chip dummy_irq_chip = { > ... > .irq_mask = noop, > .irq_unmask = noop, > ... > > That's the reason for the hw irq storm until the primary threaded > handler can run. > > I see no easy way to mask the irq from the core driver. e.g. enable bits > are in "child" registers. > The child adc driver already clear/set them at will (in IER: EOC/OVR bits). > > Please find other considerations here after > > > > >> We did some testing, ftrace and observed following behavior for one > >> capture (a single cat in_voltage..._raw) : > >> > >> in stm32-adc-core, as IRQ source is still active until the IRQ thread > >> can execute: > >> - stm32_adc_irq_handler <-- generic_handle_irq > >> - stm32_adc_irq_handler <-- generic_handle_irq > >> - stm32_adc_irq_handler <-- generic_handle_irq > >> ... > >> > >> - sched_switch to the 1st IRQ thread > >> - stm32_adc_irq_handler <-- generic_handle_irq (again until DR get read) > >> > >> - stm32_adc_isr <-- irq_forced_thread_fn (from stm32-adc) > >>   DR read, clears the active flag > >> - stm32_adc_isr <-- irq_forced_thread_fn > >>   wakes the 2nd IRQ thread to print an error (unexpected...) > >> > >> sched_switch to the 2nd IRQ thread that prints the message. > >> > >> - stm32_adc_threaded_isr <-- irq_thread_fn > >> > >> > >> So my understanding is: the cause seems to be the concurrency between > >> > >> - stm32_adc_irq_handler() storm calls in stm32-adc-core > >> - stm32_adc_isr() call to clear the cause (forced into a thread with > >> threadirqs=1). > > > > I can't follow here. Where does stm32_adc_isr() clear the IRQ cause? > > The 'eoc' end of conversion flag is cleared by reading data register. > > > I assumed it can't be isr_ovr.mask, because that's checked in the > > primary handler. > > > >> To properly work, the stm32_adc_irq_handler() should be masked in between. > >> > >> As you explain, this works in this case: the call to stm32_adc_isr (in > >> stm32-adc) isn't longer forced threaded with IRQF_ONESHOT. > >> > >> It looks like IRQF_NO_THREAD for forced threading would have similar > >> effect? Maybe the same would be applicable here ? (I haven't tested...) > > > > I guess IRQF_NO_THREAD is meant for use with request_irq and > > IRQF_ONESHOT for request_threaded_irq? > > Thanks for pointing this. So I guess IRQF_ONESHOT can be used here, so > we don't hit the "noop" irq_mask/unmask. > > Some other considerations are: some stm32 ADC have a single IRQ line > (stm32f4/stm32h7) but stm32mp1 have one irq line per "child" adc. > So I'd prefer to keep this approach, to prevent using a threaded primary > handler. Also, there's not always a fifo in stm32 adc (so tight timing > to read data in PIO mode). > > Ahmad, Jonathan, > > Is it reasonable to improve the commit message to summarize the root > cause, explain the rationale, and keep IRQF_ONESHOT (e.g. current patch ?) ? As long as we also tidy up the IRQ_NONE for the error path, as mentioned earlier in this thread I think I'm fine with that. Maybe also add comments in the code, as well as the commit description seeing as this is going to look a little unusual.. Jonathan > > Best Regards, > Fabrice > > > > >> Hopefully this helps and is similar to what you observed. > > > > Cheers, > > Ahmad > > > >> > >> Thanks and best regards, > >> Fabrice > >> > >>> > >>> Cheers, > >>> Ahmad > >>> > >>>>> - stm32mp151.dtsi describes the ADC interrupt as level-triggered > >>>>> > >>>>> Fix this by setting IRQF_ONESHOT to have the irqchip mask the IRQ > >>>>> until the IRQ thread has finished. > >>>>> > >>>>> IRQF_ONESHOT also has the effect that the primary handler is no longer > >>>>> forced into a thread. This makes the issue with spurious interrupts > >>>>> interrupts disappear when reading the ADC on a threadirqs=1 kernel. > >>>>> This used to result in following kernel error message: > >>>>> > >>>>> iio iio:device1: Unexpected IRQ: IER=0x00000000, ISR=0x0000100e > >>>>> or > >>>>> iio iio:device1: Unexpected IRQ: IER=0x00000004, ISR=0x0000100a > >>>>> > >>>>> But with this patch applied (or threaded IRQs disabled), this no longer > >>>>> occurs. > >>>>> > >>>>> Cc: Lucas Stach > >>>>> Reported-by: Holger Assmann > >>>>> Fixes: 695e2f5c289b ("iio: adc: stm32-adc: fix a regression when using dma and irq") > >>>>> Signed-off-by: Ahmad Fatoum > >>>>> --- > >>>>> drivers/iio/adc/stm32-adc.c | 2 +- > >>>>> 1 file changed, 1 insertion(+), 1 deletion(-) > >>>>> > >>>>> diff --git a/drivers/iio/adc/stm32-adc.c b/drivers/iio/adc/stm32-adc.c > >>>>> index c067c994dae2..7e0e21c79ac8 100644 > >>>>> --- a/drivers/iio/adc/stm32-adc.c > >>>>> +++ b/drivers/iio/adc/stm32-adc.c > >>>>> @@ -1910,7 +1910,7 @@ static int stm32_adc_probe(struct platform_device *pdev) > >>>>> > >>>>> ret = devm_request_threaded_irq(&pdev->dev, adc->irq, stm32_adc_isr, > >>>>> stm32_adc_threaded_isr, > >>>>> - 0, pdev->name, indio_dev); > >>>>> + IRQF_ONESHOT, pdev->name, indio_dev); > >>>>> if (ret) { > >>>>> dev_err(&pdev->dev, "failed to request IRQ\n"); > >>>>> return ret; > >>>> > >> > >