Received: by 2002:a05:6a10:2785:0:0:0:0 with SMTP id ia5csp848985pxb; Wed, 13 Jan 2021 18:14:47 -0800 (PST) X-Google-Smtp-Source: ABdhPJwO8h3ECDnFIbWLTGFUfMvdES5sKw2vAtlKCOsPrpQcyE/FQdpKUUpJMaCjTZpBog3M6a1b X-Received: by 2002:a17:906:af15:: with SMTP id lx21mr2979518ejb.6.1610590487286; Wed, 13 Jan 2021 18:14:47 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1610590487; cv=none; d=google.com; s=arc-20160816; b=vh27ySPLN0Kq9dIDst6Fev3f49yRSblcYw1bg4sRwtTdhW/xapKjXIl+mLM0zkxgJf S36gV5WzgXjvxLSYj79bBVR74ntYa6VXtbuQnmYspQcdZRk94YuVqbhEuL8MwZmuQ+am TFryRivkfRPIJFp/6Tpt551lYQ6gSSjTOwBUInBAZxN6QYf/uEpCtAHhw0DKG9gBPvQi 7eaTORxBR2AoT8jSGfGy4JK6IQD3U8BLo6M73XLuKAAhrpe8mxstWL9m0fqTzDf/ZR2E yVcvZgZCGi/lst638yMYK8tSUfEJkF3jzT3d3tZTCbf9/+a7NNBm/nWrkNq8ocRsGnyv cfPA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date; bh=5A83sgZxh6dZv5GPabB5/p/JW4vMBf2Kg7xRLgYgfSs=; b=Kd3eX01R/CXHrkJ4BYr5GRTUid5El3k4DdP2Tpcu4WUIoHpCGYt9OkNMBtgrvIZUj8 NJoOs01j2lK9+NTtB0Qsc//0hUNRPbpN3V4haXviAqhpKeYKfMbK07FHXnWa2Nxm5Wmk Qc/qEXjBwJPkSD8aZqZ+pBCQTZtpHsINezl+EXlQ9Cl11XjjQTFH6HqMpHGGBmHpF4Fb tJZHjyJbfIyvKO+6LHug8lrOyDauBe0mvYAJyzU6cSI3/dYzkXnUxFuG1XvyV3nSyDD/ 3OvCvZPR4DVYc+HCnFHz2N6otYb9u1MxA3b/wT6WG+I2okjZUi3d/ungbBT89limy/9O SE9g== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id cy7si1477801edb.462.2021.01.13.18.14.24; Wed, 13 Jan 2021 18:14:47 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728459AbhANCN1 (ORCPT + 99 others); Wed, 13 Jan 2021 21:13:27 -0500 Received: from mail110.syd.optusnet.com.au ([211.29.132.97]:39735 "EHLO mail110.syd.optusnet.com.au" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729159AbhAMVqA (ORCPT ); Wed, 13 Jan 2021 16:46:00 -0500 Received: from dread.disaster.area (pa49-179-167-107.pa.nsw.optusnet.com.au [49.179.167.107]) by mail110.syd.optusnet.com.au (Postfix) with ESMTPS id BB84511042C; Thu, 14 Jan 2021 08:44:37 +1100 (AEDT) Received: from dave by dread.disaster.area with local (Exim 4.92.3) (envelope-from ) id 1kznwS-006ADD-F8; Thu, 14 Jan 2021 08:44:36 +1100 Date: Thu, 14 Jan 2021 08:44:36 +1100 From: Dave Chinner To: Brian Foster Cc: Donald Buczek , linux-xfs@vger.kernel.org, Linux Kernel Mailing List , it+linux-xfs@molgen.mpg.de Subject: Re: [PATCH] xfs: Wake CIL push waiters more reliably Message-ID: <20210113214436.GH331610@dread.disaster.area> References: <1705b481-16db-391e-48a8-a932d1f137e7@molgen.mpg.de> <20201229235627.33289-1-buczek@molgen.mpg.de> <20201230221611.GC164134@dread.disaster.area> <20210104162353.GA254939@bfoster> <20210107215444.GG331610@dread.disaster.area> <20210108165657.GC893097@bfoster> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20210108165657.GC893097@bfoster> X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.3 cv=YKPhNiOx c=1 sm=1 tr=0 cx=a_idp_d a=+wqVUQIkAh0lLYI+QRsciw==:117 a=+wqVUQIkAh0lLYI+QRsciw==:17 a=kj9zAlcOel0A:10 a=EmqxpYm9HcoA:10 a=7-415B0cAAAA:8 a=HcGIDgX5mWfwRt8FoaAA:9 a=CjuIK1q_8ugA:10 a=biEYGPWJfzWAr4FL6Ov7:22 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Jan 08, 2021 at 11:56:57AM -0500, Brian Foster wrote: > On Fri, Jan 08, 2021 at 08:54:44AM +1100, Dave Chinner wrote: > > e.g. we run the first transaction into the CIL, it steals the sapce > > needed for the cil checkpoint headers for the transaciton. Then if > > the space returned by the item formatting is negative (because it is > > in the AIL and being relogged), the CIL checkpoint now doesn't have > > the space reserved it needs to run a checkpoint. That transaction is > > a sync transaction, so it forces the log, and now we push the CIL > > without sufficient reservation to write out the log headers and the > > items we just formatted.... > > > > Hmmm... that seems like an odd scenario because I'd expect the space > usage delta to reflect what might or might not have already been added > to the CIL context, not necessarily the AIL. IOW, shouldn't a negative > delta only occur for items being relogged while still CIL resident > (regardless of AIL residency)? > > From a code standpoint, the way a particular log item delta comes out > negative is from having a shadow lv size smaller than the ->li_lv size. > Thus, xlog_cil_insert_format_items() subtracts the currently formatted > lv size from the delta, formats the current state of the item, and > xfs_cil_prepare_item() adds the new (presumably smaller) size to the > delta. We reuse ->li_lv in this scenario so both it and the shadow > buffer remain, but a CIL push pulls ->li_lv from all log items and > chains them to the CIL context for writing, so I don't see how we could > have an item return a negative delta on an empty CIL. Hm? In theory, yes. But like I said, I've made a bunch of assumptions that this won't happen, and so without actually auditting the code I'm not actually sure that it won't. i.e. I need to go check what happens with items being marked stale, how shadow buffer reallocation interacts with items that end up being smaller than the existing buffer, etc. I've paged a lot of this detail out of my brain, so until I spend the time to go over it all again I'm not going to be able to answer definitively. > (I was also wondering whether repeated smaller relogs of an item could > be a vector for this to go wrong, but it looks like > xlog_cil_insert_format_items() always uses the formatted size of the > current buffer...). That's my nagging doubt about all this - that there's an interaction of this nature that I haven't considered due to the assumptions I've made that allows underflow to occur. That would be much worse than the current situation of hanging on a missing wakeup when the CIL is full and used_space goes backwards.... Cheers, Dave. -- Dave Chinner david@fromorbit.com