Date: Wed, 28 Feb 2024 07:19:56 +0100
From: "Linux regression tracking (Thorsten Leemhuis)"
Reply-To: Linux regressions mailing list
Subject: Re: [PATCH net v3] net: stmmac: protect updates of 64-bit statistics counters
To: Eric Dumazet, "David S. Miller", Paolo Abeni, Jakub Kicinski
Cc: Jisheng Zhang, Petr Tesarik, Alexandre Torgue, Jose Abreu,
 Maxime Coquelin, Chen-Yu Tsai, Jernej Skrabec, Samuel Holland,
 "open list:STMMAC ETHERNET DRIVER",
 "moderated list:ARM/STM32 ARCHITECTURE",
 "moderated list:ARM/STM32 ARCHITECTURE", open list,
 "open list:ARM/Allwinner sunXi SoC support", Marc Haber, Andrew Lunn,
 Florian Fainelli, stable@vger.kernel.org,
 Linux kernel regressions list, alexis.lothore@bootlin.com, Guenter Roeck
X-Mailing-List: linux-kernel@vger.kernel.org
References: <20240203190927.19669-1-petr@tesarici.cz>
 <20d94512-c4f2-49f7-ac97-846dc24a6730@roeck-us.net>

Net maintainers, chiming in here, as it seems handling this regression
stalled.

On 13.02.24 16:52, Eric Dumazet wrote:
> On Tue, Feb 13, 2024 at 4:26 PM Guenter Roeck wrote:
>> On Tue, Feb 13, 2024 at 03:51:35PM +0100, Eric Dumazet wrote:
>>> On Tue, Feb 13, 2024 at 3:29 PM Jisheng Zhang wrote:
>>>> On Sun, Feb 11, 2024 at 08:30:21PM -0800, Guenter Roeck wrote:
>>>>> On Sat, Feb 03, 2024 at 08:09:27PM +0100, Petr Tesarik wrote:
>>>>>> As explained by a comment in , write side of struct
>>>>>> u64_stats_sync must ensure mutual exclusion, or one seqcount update could
>>>>>> be lost on 32-bit platforms, thus blocking readers forever. Such lockups
>>>>>> have been observed in real world after stmmac_xmit() on one CPU raced with
>>>>>> stmmac_napi_poll_tx() on another CPU.
>>>>>>
>>>>>> To fix the issue without introducing a new lock, split the statics into
>>>>>> three parts:
>>>>>>
>>>>>> 1. fields updated only under the tx queue lock,
>>>>>> 2. fields updated only during NAPI poll,
>>>>>> 3. fields updated only from interrupt context,
>>>>>>
>>>>>> Updates to fields in the first two groups are already serialized through
>>>>>> other locks. It is sufficient to split the existing struct u64_stats_sync
>>>>>> so that each group has its own.
>>>>>>
>>>>>> Note that tx_set_ic_bit is updated from both contexts. Split this counter
>>>>>> so that each context gets its own, and calculate their sum to get the total
>>>>>> value in stmmac_get_ethtool_stats().
>>>>>>
>>>>>> For the third group, multiple interrupts may be processed by different CPUs
>>>>>> at the same time, but interrupts on the same CPU will not nest. Move fields
>>>>>> from this group to a newly created per-cpu struct stmmac_pcpu_stats.
>>>>>>
>>>>>> Fixes: 133466c3bbe1 ("net: stmmac: use per-queue 64 bit statistics where necessary")
>>>>>> Link: https://lore.kernel.org/netdev/Za173PhviYg-1qIn@torres.zugschlus.de/t/
>>>>>> Cc: stable@vger.kernel.org
>>>>>> Signed-off-by: Petr Tesarik
>>>>>
>>>>> This patch results in a lockdep splat. Backtrace and bisect results attached.
>>>>>
>>>>> ---
>>>>> [ 33.736728] ================================
>>>>> [ 33.736805] WARNING: inconsistent lock state
>>>>> [ 33.736953] 6.8.0-rc4 #1 Tainted: G N
>>>>> [ 33.737080] --------------------------------
>>>>> [ 33.737155] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
>>>>> [ 33.737309] kworker/0:2/39 [HC1[1]:SC0[2]:HE0:SE0] takes:
>>>>> [ 33.737459] ef792074 (&syncp->seq#2){?...}-{0:0}, at: sun8i_dwmac_dma_interrupt+0x9c/0x28c
>>>>> [ 33.738206] {HARDIRQ-ON-W} state was registered at:
>>>>> [ 33.738318] lock_acquire+0x11c/0x368
>>>>> [ 33.738431] __u64_stats_update_begin+0x104/0x1ac
>>>>> [ 33.738525] stmmac_xmit+0x4d0/0xc58
>>>>
>>>> interesting lockdep splat...
>>>> stmmac_xmit() operates on txq_stats->q_syncp, while the
>>>> sun8i_dwmac_dma_interrupt() operates on pcpu's priv->xstats.pcpu_stats
>>>> they are different syncp. so how does lockdep splat happen.
>>>
>>> Right, I do not see anything obvious yet.
>>
>> Wild guess: I think it maybe saying that due to
>>
>> inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
>>
>> the critical code may somehow be interrupted and, while handling the
>> interrupt, try to acquire the same lock again.
>
> This should not happen, the 'syncp' are different. They have different
> lockdep classes.
>
> One is exclusively used from hard irq context.
>
> The second one only used from BH context.

Alexis Lothoré hit this now as well, see yesterday's report in this
thread; apart from that nothing seems to have happened for two weeks now.
The change recently made it to some stable/longterm kernels, too. Makes
me wonder:

What's the plan forward here? Is this considered to be a false positive?
Or a real problem? Or a kind of situation along the lines of "that
commit should not cause the problem we are seeing, so it might have
exposed an older bug in the code, but nobody looked closer yet to check"?
Or something else?

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.