Received: by 2002:a19:771d:0:0:0:0:0 with SMTP id s29csp1261406lfc; Wed, 1 Jun 2022 13:25:46 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwPRMg7amEaF91Bo45Y4tJ2IrAsCB2IDagdRrFZVbV6qQLep6YSovO5OJMwQkkmDCn7Uhf2 X-Received: by 2002:a17:902:ce8c:b0:163:a21d:9157 with SMTP id f12-20020a170902ce8c00b00163a21d9157mr1211487plg.123.1654115146694; Wed, 01 Jun 2022 13:25:46 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1654115146; cv=none; d=google.com; s=arc-20160816; b=Q/IFVNmVvzTieQ08KaORFMAbpGEPBUnBtZk3ych/m2ohg83SIYcYNbviN9CUi34Xa2 JueBhJBPgkgqEpOhg5x3oeBubxCUHR0hDF47Aokelu1godlh1CM7GoO2QILm2aLKzjtr DLFCq9K+eERieztEuO7V5SY5fyUVgej6MXPMXYQup46Zkw/+vKg3D3VrMODjRPwGa7xu uTzd3G2sch8O3jHZm2gWfKFoFcPyv4htyM5VOS1CbECJTvLhunOpJusECl8yR0SYH7HP X3XdxO4sLQghGLFL9bjCX2mpttFOIa8J8kpEi6dFAlSAyNMJ9pvyZML857qQ4zCNasNM jn2Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=7cBQ5UAi34wf6hVucTSFDDQUv/4kX0jsV9pUciZ6rdI=; b=SJA0DAwNot0RBOs5gxgrCkciAwQB1In6FzS/mBnXvZHTM6C5aCOAu+7AVdwSLH/OPV t2vPg5edtgSLM5ApNCYu2xueWAEk20QmtxqErQ3bY5wuEGfthFAQ5kCKyzQ+mJkQY1KE mvDSm8w89YgR/kSMUobvzG8qlaA6IvVm4y1uc83fMv+e0JHePLDuuEOmmEemIQaMmK1S Xrd3j0UbWHw7OGnuC5wbORrGW3ZQd+PndFFDXopxPW+FRJXneFeu/5JWnwf390HPAK7w 3JnLqTfA4DFC5Gw4hrSGsxQkd/yU32rjzWwDzTEzdgG/Z26a9hswB+63j707h8WCkRRH yzhg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@crudebyte.com header.s=kylie header.b="v/Oug87J"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=crudebyte.com Return-Path: Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net. [2620:137:e000::1:18]) by mx.google.com with ESMTPS id g12-20020a170902868c00b0015ab2254cd6si3306909plo.49.2022.06.01.13.25.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 01 Jun 2022 13:25:46 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) client-ip=2620:137:e000::1:18; Authentication-Results: mx.google.com; dkim=pass header.i=@crudebyte.com header.s=kylie header.b="v/Oug87J"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=crudebyte.com Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id BF0C81D92F5; Wed, 1 Jun 2022 12:37:29 -0700 (PDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1354839AbiFAO33 (ORCPT + 99 others); Wed, 1 Jun 2022 10:29:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53060 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1354673AbiFAO3Q (ORCPT ); Wed, 1 Jun 2022 10:29:16 -0400 Received: from kylie.crudebyte.com (kylie.crudebyte.com [5.189.157.229]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A778AF599 for ; Wed, 1 Jun 2022 07:28:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=crudebyte.com; s=kylie; h=Content-Type:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Content-ID:Content-Description; bh=7cBQ5UAi34wf6hVucTSFDDQUv/4kX0jsV9pUciZ6rdI=; b=v/Oug87JxQcoMZTq3LO5us+JPV 3kAT9ABt6auUcZ2aaPD7rpo39mgPpZgel1sOdkfx9PO2Z9TsJCUMa+7QJneNadzG2E3d8Oe2lVYmQ 09l1oS8AX0ksw0sHQN5Hjq5vCtfhtFOibWgqS53lIGYAh7zkLIZtQe8E+fZuRInWfLjk0wUAYv4cv U8SNd58hPLmxua+mGLhJarCB1RVjI35R5ec9OheLiKLz+B475BbYrshD+isfSMzQooixfKz2YbLZK oqYz0Hs7AvGkeoMJm/cPiCtSc6G+qQVQmGFji+dRDhh5CEd2yYGABpheeoXZo8DOyRSUkrTR7ow3r 8nZP1F+5Qj3sxQ+yBsQaB9t67MNOzg90yA9Dg1ehiAhoe689MF3Li8y9/HMV8fjfSOMsW2k8pnlQg tttTPqfEHElnMb0/WreuE2VsF5ohnugk8shS8E5NvYa18XeD0WJCBs2rBQIIO425yZHDKLN+e0UTc GFSwZVrpCI5CsZBXWe1GV+mJiAiPsj7qdLmvLmtzIcxmyxlk3RSLBgnsJ/NgjA4ma+VhXSHprMUlB Y3T00gV6PjEyyBuPOHMYAF27b0TcWuakKDz2DTZ1c5DWcXAruiueGfWH3b1Fh1rE8vwts1os4ConN ErRRwlqM77qI6b8CVzTQKUQhguvEJFJlCqdGZqNYM=; From: Christian Schoenebeck To: Tyler Hicks Cc: Eric Van Hensbergen , Latchesar Ionkov , Dominique Martinet , Jianyong Wu , v9fs-developer@lists.sourceforge.net, linux-kernel@vger.kernel.org Subject: Re: [PATCH v2 1/5] 9p: Fix refcounting during full path walks for fid lookups Date: Wed, 01 Jun 2022 16:28:49 +0200 Message-ID: <1849605.JvGbLJQp6r@silver> In-Reply-To: <20220531142829.GA6868@sequoia> References: <20220527000003.355812-1-tyhicks@linux.microsoft.com> <43525959.9j6oIFhYhY@silver> <20220531142829.GA6868@sequoia> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,RDNS_NONE,SPF_HELO_NONE,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Dienstag, 31. Mai 2022 16:28:29 CEST Tyler Hicks wrote: > On 2022-05-30 19:14:43, Christian Schoenebeck wrote: > > On Freitag, 27. Mai 2022 01:59:59 CEST Tyler Hicks wrote: > > > Decrement the refcount of the parent dentry's fid after walking > > > each path component during a full path walk for a lookup. Failure to do > > > so can lead to fids that are not clunked until the filesystem is > > > > > > unmounted, as indicated by this warning: > > > 9pnet: found fid 3 not clunked > > > > That explains why I saw so many fids not being clunked with recent Linux > > kernel versions while doing some 9p protocol debugging with QEMU recently. > > In addition to this refcounting bug, there's another one that I noticed > while running fstests. My series does not fix it and I haven't had a > chance to look into it more. The generic/531 test triggers it. > > https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git/tree/tests/generic/5 > 31 > > > The improper refcounting after walking resulted in open(2) returning > > > -EIO on any directories underneath the mount point when using the virtio > > > transport. When using the fd transport, there's no apparent issue until > > > the filesytem is unmounted and the warning above is emitted to the logs. > > > > Actually I never saw that open() = -EIO error. Do you have a reproducer? > > The reproducer that I have is binary only (fairly large and runs a bunch > of different tests) and is used to regression test the Windows Subsystem > for Linux 2 (WSL2) host <-> guest filesystem sharing. Now that I think > about it, I'm not sure if the open() = -EIO error happens with other 9p > servers. > > I can try to tease out the exact sequence of filesystem operations from > this test binary but it might take me a bit. It looks like it has to do > with switching UIDs, which could make sense because different users may > not be connected to the filesystem yet (the conditional block that does > p9_client_attach() and v9fs_fid_add()). > > > > In some cases, the user may not yet be attached to the filesystem and a > > > new root fid, associated with the user, is created and attached to the > > > root dentry before the full path walk is performed. Increment the new > > > root fid's refcount to two in that situation so that it can be safely > > > decremented to one after it is used for the walk operation. The new fid > > > will still be attached to the root dentry when > > > v9fs_fid_lookup_with_uid() returns so a final refcount of one is > > > correct/expected. > > > > > > Fixes: 6636b6dcc3db ("9p: add refcount to p9_fid struct") > > > Cc: stable@vger.kernel.org > > > Signed-off-by: Tyler Hicks > > > --- > > > > > > fs/9p/fid.c | 17 +++++------------ > > > 1 file changed, 5 insertions(+), 12 deletions(-) > > > > > > diff --git a/fs/9p/fid.c b/fs/9p/fid.c > > > index 79df61fe0e59..5a469b79c1ee 100644 > > > --- a/fs/9p/fid.c > > > +++ b/fs/9p/fid.c > > > @@ -152,7 +152,7 @@ static struct p9_fid > > > *v9fs_fid_lookup_with_uid(struct > > > dentry *dentry, const unsigned char **wnames, *uname; > > > > > > int i, n, l, clone, access; > > > struct v9fs_session_info *v9ses; > > > > > > - struct p9_fid *fid, *old_fid = NULL; > > > + struct p9_fid *fid, *old_fid; > > > > > > v9ses = v9fs_dentry2v9ses(dentry); > > > access = v9ses->flags & V9FS_ACCESS_MASK; > > > > > > @@ -194,13 +194,12 @@ static struct p9_fid > > > *v9fs_fid_lookup_with_uid(struct > > > dentry *dentry, if (IS_ERR(fid)) > > > > > > return fid; > > > > > > + refcount_inc(&fid->count); > > > > > > v9fs_fid_add(dentry->d_sb->s_root, fid); > > > > > > } > > > /* If we are root ourself just return that */ > > > > > > - if (dentry->d_sb->s_root == dentry) { > > > - refcount_inc(&fid->count); > > > + if (dentry->d_sb->s_root == dentry) > > > > > > return fid; > > > > > > - } > > > > Hmm, wouldn't it then be possible that the root fid is returned with > > refcount being 2 here? > > Yes and I think that's correct. One refcount taken for adding the root > fid to the root dentry and another refcount taken for the original > purpose of the lookup. > > Reverting this portion of the change and re-testing with the reproducer > triggers a refcount underflow. Right, I still have some knowledge gaps in the kernel's 9p code base. I was actually rather confused about p9_client_clunk() which I just realized actually does the refcount decrement and then conditionally sends out the 'Tclunk' message on refcount zero only. So yes, it looks fine to me: Reviewed-by: Christian Schoenebeck > > > /* > > > > > > * Do a multipath walk with attached root. > > > * When walking parent we need to make sure we > > > > > > @@ -212,6 +211,7 @@ static struct p9_fid > > > *v9fs_fid_lookup_with_uid(struct > > > dentry *dentry, fid = ERR_PTR(n); > > > > > > goto err_out; > > > > > > } > > > > > > + old_fid = fid; > > > > > > clone = 1; > > > i = 0; > > > while (i < n) { > > > > > > @@ -221,15 +221,8 @@ static struct p9_fid > > > *v9fs_fid_lookup_with_uid(struct > > > dentry *dentry, * walk to ensure none of the patch component change > > > > > > */ > > > > > > fid = p9_client_walk(fid, l, &wnames[i], clone); > > > > > > + p9_client_clunk(old_fid); > > > > > > if (IS_ERR(fid)) { > > > > > > - if (old_fid) { > > > - /* > > > - * If we fail, clunk fid which are > > > > mapping > > > > > - * to path component and not the last > > > > component > > > > > - * of the path. > > > - */ > > > - p9_client_clunk(old_fid); > > > - } > > > > > > kfree(wnames); > > > goto err_out; > > > > > > } > > > > So this is the actual fix mentioned in the commit log. Makes sense. > > I think the refcount_inc() change for the root fid is an important and > required part of the fix. > > > Nitpicking: Wouldn't it be a bit cleaner to set old_fid solely within the > > while loop and just before overwriting fid? And as we now have bumped to > > -std=C11, probably making old_fid a local variable within loop scope only? > > You're right that it would be cleaner for the purposes of this single > patch. In a followup patch in this series, I start tracking the root fid > with a root_fid variable and that requires "old_fid = root_fid" before > we enter the loop and then "old_fid = fid" inside of the loop. s/while/for/ would do the trick I guess. Not a big deal though. Best regards, Christian Schoenebeck