Date: Wed, 08 Mar 2023 14:53:48 +0000
Message-ID: <86zg8nxpj7.wl-maz@kernel.org>
From: Marc Zyngier
To: Cyril Brulebois
Cc: Johan Hovold, Thomas Gleixner, x86@kernel.org,
    platform-driver-x86@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org, linux-mips@vger.kernel.org,
    linux-kernel@vger.kernel.org, stable@vger.kernel.org,
    Dmitry Torokhov, Jon Hunter, Hsin-Yi Wang, Mark-PK Tsai
Subject: Re: [PATCH v6 06/20] irqdomain: Fix mapping-creation race
In-Reply-To: <20230308144105.di552lbogqv2s7fk@mraw.org>
References: <20230213104302.17307-1-johan+linaro@kernel.org>
    <20230213104302.17307-7-johan+linaro@kernel.org>
    <20230308144105.di552lbogqv2s7fk@mraw.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 08 Mar 2023 14:41:05 +0000, Cyril Brulebois wrote:
>
> Hi Johan,
>
> And thanks so much for this patch series.
>
> Johan Hovold (2023-02-13):
> > Parallel probing of devices that share interrupts (e.g. when a driver
> > uses asynchronous probing) can currently result in two mappings for the
> > same hardware interrupt to be created due to missing serialisation.
> >
> > Make sure to hold the irq_domain_mutex when creating mappings so that
> > looking for an existing mapping before creating a new one is done
> > atomically.
>
> Just for information: This patch fixes a long-standing regression
> regarding Raspberry Pi devices, which have been failing to boot (at
> least reliably) due to MMC timeouts for a long while; I think that
> started between v5.17 and v5.19, but I couldn't bisect at the time
> (I was already chasing some other regression).
>
> Example bug report:
>   https://bugs.debian.org/1019700
>
> Before trying to pinpoint when the regression appeared, I've checked
> these versions, with a Debian testing userspace as of 2023-03-07:
>  - v6.1.12: affected.
>  - v6.2: affected.
>  - v6.3-rc1: not affected.
>
> A bisect between v6.2 and v6.3-rc1 led me to this patch specifically.
> Seeing how it's part of a patch series, and how previous patches are
> preliminary ones, I've checked that cherry-picking the first 6 patches
> on top of v6.1.15 indeed fixes the problem there too, and it does
> (git cherry-pick v6.2-rc4..601363cc08da25747feb87c55573dd54de91d66a).
>
>
> With the following systems:
>  - Pi 4 B, using external storage (SD card),
>  - CM4 Lite on CM4 IO Board, using external storage (SD card),
>  - CM4 on CM4 IO Board, using internal storage (eMMC),
>
> I've been able to verify that v6.1.12 (baseline in Debian testing)
> triggers this MMC timeout issue, while v6.1.15 + the aforementioned
> range of cherry-picked commits no longer triggers this issue.
>
> (Methodology: cold boot then reboot 20 times, monitoring via serial
> console to keep HDMI output out of the equation; affected systems stop
> booting after 1-4 boots; unaffected systems boot and reboot just fine
> all the time.)
>
>
> This looks like a critical bugfix for Raspberry Pi users.
>
> Seeing the stable@ mention is about 4.8, I suppose this is going to be
> considered for a wide range of kernels already… but I'm happy to dig
> into this further to pinpoint when the regression appeared, if that's
> helpful.

If you have an interest in these patches being backported, may I
suggest you look at the backporting failures that have been
reported[1]?

Note that now that 4.9 is out of the picture, nothing is going to be
backported past 4.14.

Thanks,

	M.

[1] https://lore.kernel.org/r/167812853717924@kroah.com

-- 
Without deviation from the norm, progress is not possible.
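
For readers following the thread, the serialisation described in the quoted
patch description (look for an existing mapping and create a new one under a
single lock) can be sketched in userspace. This is only a toy Python
illustration under stated assumptions, not the kernel code: MappingTable,
create_mapping and probe_in_parallel are hypothetical names, and a
threading.Lock merely stands in for irq_domain_mutex.

```python
import threading


class MappingTable:
    """Toy stand-in for an interrupt domain: maps hwirq -> virq (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()  # plays the role of irq_domain_mutex
        self._map = {}
        self._next_virq = 1
        self.allocations = 0  # how many new mappings were actually created

    def create_mapping(self, hwirq):
        # Lookup and creation happen under one lock, so two callers can
        # never both miss the lookup and both allocate a mapping.
        with self._lock:
            virq = self._map.get(hwirq)
            if virq is not None:
                return virq
            virq = self._next_virq
            self._next_virq += 1
            self.allocations += 1
            self._map[hwirq] = virq
            return virq


def probe_in_parallel(table, hwirq, n_threads=8):
    """Simulate drivers probing concurrently and mapping the same hwirq."""
    results = []
    results_lock = threading.Lock()
    barrier = threading.Barrier(n_threads)  # release all threads together

    def probe():
        barrier.wait()
        virq = table.create_mapping(hwirq)
        with results_lock:
            results.append(virq)

    threads = [threading.Thread(target=probe) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


table = MappingTable()
virqs = probe_in_parallel(table, hwirq=42)
```

With the lock held across both steps, every simulated probe gets the same
virq and only one mapping is allocated. Moving the lookup outside the lock
would reopen the window in which two probes each create their own mapping.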