Message-ID: <696dc58209707ce364616430673998d0124a9a31.camel@linuxfoundation.org>
Subject: Re: [PATCH] cgroup1: fix leaked context root causing sporadic NULL deref in LTP
From: Richard Purdie
To: Mark Brown, Tejun Heo
Cc: Paul Gortmaker, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, Al Viro, Zefan Li, Johannes Weiner, stable@vger.kernel.org
Date: Wed, 30 Jun 2021 23:31:06 +0100
In-Reply-To: <20210630161036.GA43693@sirena.org.uk>
References: <20210616125157.438837-1-paul.gortmaker@windriver.com> <20210630161036.GA43693@sirena.org.uk>

On Wed, 2021-06-30 at 17:10 +0100, Mark Brown wrote:
> On Wed, Jun 16, 2021 at 11:23:34AM -0400, Tejun Heo wrote:
> > On Wed, Jun 16, 2021 at 08:51:57AM -0400, Paul Gortmaker wrote:
>
> > > A fix would be to not leave the stale reference in fc->root as follows:
> > >
> > >    --------------
> > >                   dput(fc->root);
> > >   +               fc->root = NULL;
> > >                   deactivate_locked_super(sb);
> > >    --------------
> > >
> > > ...but then we are just open-coding a duplicate of fc_drop_locked() so we
> > > simply use that instead.
>
> > As this is unlikely to be a real-world problem both in probability and
> > circumstances, I'm applying this to cgroup/for-5.14 instead of
> > cgroup/for-5.13-fixes.
>
> FWIW at Arm we've started seeing what appears to be this issue blow up
> very frequently in some of our internal LTP CI runs against -next, seems
> to be mostly on lower end platforms. We seem to have started finding it
> at roughly the same time that the Yocto people did, I guess some other
> change made it more likely to trigger. Not exactly real world usage
> obviously but it's creating quite a bit of noise in testing which is
> disruptive so it'd be good to get it into -next as a fix.

It is a horrible bug to debug as you end up with "random" failures on the
systems which are hard to pin down. Along with the RCU stall hangs it was
all a bit of a nightmare.

Out of interest are you also seeing the proc01 test hang on a non-blocking
read of /proc/kmsg periodically?

https://bugzilla.yoctoproject.org/show_bug.cgi?id=14460

I've not figured out a way to reproduce it at will yet and it seems strace
was enough to unblock it. It seems arm specific.

Cheers,

Richard
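
For readers following the quoted patch description, here is a rough userspace sketch of why clearing the cached pointer matters. It is not kernel code: `struct obj`, `struct ctx`, `obj_put()`, `drop_stale()`, `drop_and_clear()`, `ctx_teardown()` and `final_refs()` are all hypothetical names made up for illustration, and a plain integer counter stands in for the real reference counting. The assumption is simply that some later teardown step releases whatever the context still points at, so a stale pointer leads to a second, unbalanced put.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted object (e.g. a dentry). */
struct obj {
    int refs;
};

/* Hypothetical stand-in for a context that caches one reference. */
struct ctx {
    struct obj *root;
};

/* Drop one reference; a NULL pointer is ignored. */
static void obj_put(struct obj *o)
{
    if (o)
        o->refs--;
}

/* Buggy error path: drops the reference but leaves c->root stale. */
static void drop_stale(struct ctx *c)
{
    obj_put(c->root);
}

/* Fixed error path: drops the reference and clears the cached pointer. */
static void drop_and_clear(struct ctx *c)
{
    obj_put(c->root);
    c->root = NULL;
}

/* Later teardown releases whatever the context still claims to hold. */
static void ctx_teardown(struct ctx *c)
{
    obj_put(c->root);
    c->root = NULL;
}

/* Run an error path followed by teardown and report the final count. */
static int final_refs(void (*error_path)(struct ctx *))
{
    struct obj o = { .refs = 1 };
    struct ctx c = { .root = &o };

    error_path(&c);
    ctx_teardown(&c);
    return o.refs;
}
```

In this model, `final_refs(drop_and_clear)` ends balanced at zero, while `final_refs(drop_stale)` goes negative because teardown puts the same reference a second time. In the kernel the equivalent imbalance would show up as a use-after-free or NULL dereference rather than a tidy negative number.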