Received: by 2002:a05:6a10:f3d0:0:0:0:0 with SMTP id a16csp1311773pxv; Fri, 25 Jun 2021 09:58:39 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwjFgLE+oM7bXDIGVKD4LUFNg3i7AzwsnUfXg7klpZg8ILJJXOcnwQF/w3NKrSzUsFfKHGA X-Received: by 2002:a50:fb14:: with SMTP id d20mr16000382edq.187.1624640319357; Fri, 25 Jun 2021 09:58:39 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1624640319; cv=none; d=google.com; s=arc-20160816; b=F22emyyncq+KL7c8EPK5gMMAK6zUEOzKJ34iHe3enUz94NXChDgXwU6exV079/eoDo jggdKynzSiIlYSgo0YXhJEpA34yH3Ls7cN9KYQ9H+FofZx+cTtvWbxb1YCSrHjz/2bBg 4MeTGeg3koFmYy4S25FtHHbVoQLzt9sEddHjCdpK2WTzYnSS7wBkSKLYNDOmIsTKkOTJ tca1+O8aEFyTu6RGHG6cP5Xf6OaLLuPlDgZUra2fP55nYQZ55b9lFdy0lnWXj89etZu2 qhdzPG4x92Wo5KaJWuy2R7gZBrbhqPIVIWMoKvtqUem1Qciw1JeQIhqYv7EzfSRX5+Zo gb1g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:date:cc:to:from:subject :message-id:dkim-signature; bh=BCRmeW+55KO9J/qPLknSih+VPwq2hewhwXdQL6zwn1o=; b=1Eqx3wuSLHUlz8H1NSfIy/QMlWxzKzYt2ucH24uosvBPplYS7Lyn9ShJFvCnVDeii2 aWUc7R8nMtjA6sz2dYB8YX2DQXPQPBeVsc+gby/5RqoFIKlpjrlHJpjUfbzhRVooSeBT yz6qUZ4jYBe/Ah7naMFIc99Ai0bEJVRgRUG47/7i3rrg5Nrbqoq/qtdFmeTV25UtX87z +p701KMFqDkc51SXhMboz2d+rf7zhqC5PVA2tpNcY5maPv0FraJ2ujSyoqPoM9IwUb96 fqUIrGb1g61cvnB1h0ZOuG73qnPeVOGdRr4Onk/JQTPgRBMl+J821tOQjvDHpHkyRn7b 08qQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=VGzSbCsB; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id a15si7122284edr.5.2021.06.25.09.58.15; Fri, 25 Jun 2021 09:58:39 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=VGzSbCsB; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230082AbhFYQ5H (ORCPT + 99 others); Fri, 25 Jun 2021 12:57:07 -0400 Received: from mail.kernel.org ([198.145.29.99]:36318 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230037AbhFYQ5H (ORCPT ); Fri, 25 Jun 2021 12:57:07 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id B6F206193F; Fri, 25 Jun 2021 16:54:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1624640086; bh=sEkPgxqfk0VrpshYLRmOVtcvJwj/6++q9U6NcXl8Xug=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=VGzSbCsBsKJxTZsiJ6Ug20dJFEb5OndynFvYA0troZL60sa7Uh8DxWK8wYz4roeEK xnW0cQL4YRDaA0va9cJza66CI/kCFHAfhl5cB4eWh7nhi04GCUjSms7/i4mVQULNYH FY/xxZvBUaHVGF3FW1U/fSjaeHrLnc0j2AVjkts5lmTetTmqRXUkcPPjr/v/5Fxrmo ag9fHA44Oy7i2f7QwXjHxjUDqwVsJ9rVT8mPGBGlfExQAkD1i4MxaDEc1dYFffHdNu wbeSMWQz7gp3EvoJ9frBoEfV2WJEjhDNDxckg1HU9wRocr2HSxBQP7HM+peFxVE75+ j8XB3NCviidLw== Message-ID: Subject: Re: [RFC PATCH] ceph: reduce contention in ceph_check_delayed_caps() From: Jeff Layton To: Luis Henriques , Ilya Dryomov Cc: ceph-devel@vger.kernel.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org Date: Fri, 25 Jun 2021 12:54:44 -0400 In-Reply-To: <20210625154559.8148-1-lhenriques@suse.de> References: <20210625154559.8148-1-lhenriques@suse.de> Content-Type: text/plain; charset="ISO-8859-15" User-Agent: Evolution 3.40.2 (3.40.2-1.fc34) MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, 2021-06-25 at 16:45 +0100, Luis Henriques wrote: > Function ceph_check_delayed_caps() is called from the mdsc->delayed_work > workqueue and it can be kept looping for quite some time if caps keep being > added back to the mdsc->cap_delay_list. This may result in the watchdog > tainting the kernel with the softlockup flag. > > This patch re-arranges the loop through the caps list so that it initially > removes all the caps from list, adding them to a temporary list. And then, with > less locking contention, it will eventually call the ceph_check_caps() for each > inode. Any caps added to the list in the meantime will be handled in the next > run. > > Cc: stable@vger.kernel.org > Signed-off-by: Luis Henriques > --- > Hi Jeff! > > So, I've not based this patch on top of your patchset that gets rid of > ceph_async_iput() so that it will make it easier to backport it for stable > kernels. Of course I'm not 100% this classifies as stable material. > > Other than that, I've been testing this patch and I couldn't see anything > breaking. Let me know what you think. > > (I *think* I've seen a tracker bug for this in the past but I couldn't > find it. I guess it could be added as a 'Link:' tag.) > > Cheers, > -- > Luis > > fs/ceph/caps.c | 13 +++++++++---- > 1 file changed, 9 insertions(+), 4 deletions(-) > > diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c > index a5e93b185515..727e41e3b939 100644 > --- a/fs/ceph/caps.c > +++ b/fs/ceph/caps.c > @@ -4229,6 +4229,7 @@ void ceph_check_delayed_caps(struct ceph_mds_client *mdsc) > { > struct inode *inode; > struct ceph_inode_info *ci; > + LIST_HEAD(caps_list); > > dout("check_delayed_caps\n"); > spin_lock(&mdsc->cap_delay_lock); > @@ -4239,19 +4240,23 @@ void ceph_check_delayed_caps(struct ceph_mds_client *mdsc) > if ((ci->i_ceph_flags & CEPH_I_FLUSH) == 0 && > time_before(jiffies, ci->i_hold_caps_max)) > break; > - list_del_init(&ci->i_cap_delay_list); > + list_move_tail(&ci->i_cap_delay_list, &caps_list); > + } > + spin_unlock(&mdsc->cap_delay_lock); > > + while (!list_empty(&caps_list)) { > + ci = list_first_entry(&caps_list, > + struct ceph_inode_info, > + i_cap_delay_list); > + list_del_init(&ci->i_cap_delay_list); > inode = igrab(&ci->vfs_inode); > if (inode) { > - spin_unlock(&mdsc->cap_delay_lock); > dout("check_delayed_caps on %p\n", inode); > ceph_check_caps(ci, 0, NULL); > /* avoid calling iput_final() in tick thread */ > ceph_async_iput(inode); > - spin_lock(&mdsc->cap_delay_lock); > } > } > - spin_unlock(&mdsc->cap_delay_lock); > } > > /* I'm not sure this approach is viable, unfortunately. Once you've dropped the cap_delay_lock, then nothing protects the i_cap_delay_list head anymore. So you could detach these objects and put them on the private list, and then once you drop the spinlock another task could find one of them and (e.g.) call __cap_delay_requeue on it, potentially corrupting your list. I think we'll need to come up with a different way to do this... -- Jeff Layton