Date: Tue, 16 Feb 2021 06:18:20 -0500
From: Brian Foster
To: Donald Buczek
Cc: Dave Chinner, linux-xfs@vger.kernel.org, Linux Kernel Mailing List, it+linux-xfs@molgen.mpg.de
Subject: Re: [PATCH] xfs: Wake CIL push waiters more reliably
Message-ID: <20210216111820.GA534175@bfoster>
In-Reply-To: <8416da5f-e8e5-8ec6-df3e-5ca89339359c@molgen.mpg.de>

On Mon, Feb 15, 2021 at 02:36:38PM +0100, Donald Buczek wrote:
> On 13.01.21 22:53, Dave Chinner wrote:
> > [...]
> > I agree that a throttling fix is needed, but I'm trying to
> > understand the scope and breadth of the problem first instead of
> > jumping the gun and making the wrong fix for the wrong reasons that
> > just papers over the underlying problems that the throttling bug has
> > made us aware of...
>
> Are you still working on this?
>
> If it takes more time to understand the potential underlying problem, the fix for the problem at hand should be applied.
>
> This is a real world problem, accidentally found in the wild. It appears very rarely, but it freezes a filesystem or the whole system. It exists in 5.7, 5.8, 5.9, 5.10 and 5.11 and is caused by c7f87f3984cf ("xfs: fix use-after-free on CIL context on shutdown") which silently added a condition to the wakeup. The condition is based on a wrong assumption.
>
> Why is this "papering over"? If a reminder was needed, there were better ways than randomly hanging the system.
>
> Why is
>
>     if (ctx->space_used >= XLOG_CIL_BLOCKING_SPACE_LIMIT(log))
>         wake_up_all(&cil->xc_push_wait);
>
> , which doesn't work reliably, preferable to
>
>     if (waitqueue_active(&cil->xc_push_wait))
>         wake_up_all(&cil->xc_push_wait);
>
> which does?
>

JFYI, Dave followed up with a patch a couple weeks or so ago:

https://lore.kernel.org/linux-xfs/20210128044154.806715-5-david@fromorbit.com/

Brian

> Best
> Donald
>
> > Cheers,
> >
> > Dave
>
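For readers following the two wakeup conditions quoted above, here is a rough user-space sketch of how they can differ. It is not kernel code: `struct cil_model`, `BLOCKING_LIMIT` and every `model_*` function are hypothetical stand-ins. It also assumes one possible scenario that this message does not spell out, namely that the accounted space drops back below the limit after a task has already gone to sleep.

```c
#include <stdbool.h>

/* Hypothetical stand-in for XLOG_CIL_BLOCKING_SPACE_LIMIT(log). */
#define BLOCKING_LIMIT 100

/* Hypothetical model, not the kernel's data structures. */
struct cil_model {
	int space_used;	/* stand-in for ctx->space_used */
	int nr_waiters;	/* tasks sleeping on the modeled xc_push_wait */
};

/*
 * A committer changes the accounted space by delta and goes to sleep
 * if the context is at or over the limit afterwards.
 */
static void model_commit(struct cil_model *cil, int delta)
{
	cil->space_used += delta;
	if (cil->space_used >= BLOCKING_LIMIT)
		cil->nr_waiters++;
}

/* Push that wakes waiters only when the space threshold is still met. */
static void model_push_threshold(struct cil_model *cil)
{
	if (cil->space_used >= BLOCKING_LIMIT)
		cil->nr_waiters = 0;
	cil->space_used = 0;
}

/* Push that wakes whenever somebody is waiting (the role of waitqueue_active()). */
static void model_push_if_waiters(struct cil_model *cil)
{
	if (cil->nr_waiters > 0)
		cil->nr_waiters = 0;
	cil->space_used = 0;
}

/*
 * Replay one sequence: a commit crosses the limit and sleeps, a later
 * commit shrinks the accounted space (ASSUMED to be possible), then the
 * push runs. Returns how many waiters are left asleep after the push.
 */
static int model_stranded_waiters(bool use_threshold)
{
	struct cil_model cil = { 0, 0 };

	model_commit(&cil, 120);	/* 120 >= 100: this task sleeps */
	model_commit(&cil, -30);	/* back to 90: below the limit */

	if (use_threshold)
		model_push_threshold(&cil);
	else
		model_push_if_waiters(&cil);

	return cil.nr_waiters;
}
```

In this model, `model_stranded_waiters(true)` leaves the sleeping task behind, while `model_stranded_waiters(false)` wakes it. The real code also has locking and memory-ordering concerns around `waitqueue_active()` that a single-threaded sketch like this cannot show.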