Date: Wed, 6 Jul 2022 11:39:40 +0100
From: Kajetan Puchalski
To: Will Deacon
Cc: Florian Westphal, Pablo Neira Ayuso, Jozsef Kadlecsik, "David S. Miller",
    Eric Dumazet, Jakub Kicinski, Paolo Abeni, Mel Gorman,
    lukasz.luba@arm.com, dietmar.eggemann@arm.com, mark.rutland@arm.com,
    broonie@kernel.org, netfilter-devel@vger.kernel.org,
    coreteam@netfilter.org, netdev@vger.kernel.org, stable@vger.kernel.org,
    regressions@lists.linux.dev, linux-kernel@vger.kernel.org,
    peterz@infradead.org, kajetan.puchalski@arm.com
Subject: Re: [Regression] stress-ng udp-flood causes kernel panic on Ampere Altra
References: <20220701200110.GA15144@breakpoint.cc>
    <20220702205651.GB15144@breakpoint.cc>
    <20220705105749.GA711@willie-the-truck>
    <20220705110724.GB711@willie-the-truck>
    <20220705112449.GA931@willie-the-truck>
In-Reply-To: <20220705112449.GA931@willie-the-truck>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jul 05, 2022 at 12:24:49PM +0100, Will Deacon wrote:
> > > Sorry, but I have absolutely no context here. We have a handy document
> > > describing the differences between atomic_t and refcount_t:
> > >
> > >     Documentation/core-api/refcount-vs-atomic.rst
> > >
> > > What else do you need to know?
> >
> > Hmm, and I see a tonne of *_inc_not_zero() conversions in 719774377622
> > ("netfilter: conntrack: convert to refcount_t api") which mean that you
> > no longer have ordering to subsequent reads in the absence of an address
> > dependency.
>
> I think the patch above needs auditing with the relaxed behaviour in mind,
> but for the specific crash reported here possibly something like the diff
> below?
>
> Will
>
> --->8
>
> diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
> index 082a2fd8d85b..5ad9fcc84269 100644
> --- a/net/netfilter/nf_conntrack_core.c
> +++ b/net/netfilter/nf_conntrack_core.c
> @@ -1394,6 +1394,7 @@ static unsigned int early_drop_list(struct net *net,
>                  * already fired or someone else deleted it. Just drop ref
>                  * and move to next entry.
>                  */
> +               smp_rmb();      /* XXX: Why? */
>                 if (net_eq(nf_ct_net(tmp), net) &&
>                     nf_ct_is_confirmed(tmp) &&
>                     nf_ct_delete(tmp, 0, 0))
>

Just to follow up, I think you're right: the patch in question should be
audited further for other missing memory barrier issues. This one smp_rmb()
helps a lot, i.e. it lets the test run for at least an hour or two, but an
overnight 6-hour test still resulted in the same crash somewhere along the
way, so it looks like it's not the only barrier that's needed.
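
For readers following the thread, the ordering concern quoted above can be
sketched in userspace C11. This is only an analogy, not kernel code: the
struct and the helpers entry_get_relaxed(), entry_put() and
entry_read_status() are hypothetical names invented for illustration, and
atomic_thread_fence(memory_order_acquire) is used as a rough stand-in for
the smp_rmb() in the quoted diff. The idea is that a relaxed
"increment unless zero" gives no ordering against later loads, so a barrier
is placed between taking the reference and reading the object's fields.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted object; NOT the kernel's struct nf_conn. */
struct entry {
    atomic_uint refs;     /* 0 means the object is being freed or recycled */
    atomic_ulong status;  /* stand-in for fields read after taking a reference */
};

/*
 * Relaxed "increment unless zero". On success nothing here orders the
 * increment against loads that the caller performs afterwards.
 */
static inline bool entry_get_relaxed(struct entry *e)
{
    unsigned int old = atomic_load_explicit(&e->refs, memory_order_relaxed);

    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value and we retry. */
        if (atomic_compare_exchange_weak_explicit(&e->refs, &old, old + 1,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed))
            return true;
    }
    return false; /* count was zero: do not touch the object */
}

/* Drop a reference; release ordering publishes our earlier accesses. */
static inline void entry_put(struct entry *e)
{
    atomic_fetch_sub_explicit(&e->refs, 1, memory_order_release);
}

/*
 * Take a reference, then read 'status' only after an acquire fence, so the
 * read cannot be satisfied ahead of the refcount check (assumed analogue of
 * the barrier placement in the quoted diff).
 */
static inline bool entry_read_status(struct entry *e, unsigned long *out)
{
    if (!entry_get_relaxed(e))
        return false;

    atomic_thread_fence(memory_order_acquire);
    *out = atomic_load_explicit(&e->status, memory_order_relaxed);

    entry_put(e);
    return true;
}
```

A caller would initialise refs to 1 and status to some value, then call
entry_read_status(); it returns false without reading status once refs has
dropped to zero. A single-threaded run cannot demonstrate the race itself,
which is consistent with the report above that the crash only shows up
after hours of stress testing.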