2024-03-15 15:10:13

by Eric Van Hensbergen

[permalink] [raw]
Subject: [GIT PULL] fs/9p patches for 6.9 merge window

The following changes since commit 6613476e225e090cc9aad49be7fa504e290dd33d:

Linux 6.8-rc1 (2024-01-21 14:11:32 -0800)

are available in the Git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs.git tags/9p-for-6.9

for you to fetch changes up to be57855f505003c5cafff40338d5d0f23b00ba4d:

fs/9p: fix dups even in uncached mode (2024-01-26 16:46:56 +0000)

----------------------------------------------------------------
fs/9p changes for the 6.9 merge window

This pull request includes a number of patches
addressing improvements in the cache portions of the 9p
client.

The biggest improvements have to do with fixing handling
of inodes and eliminating duplicate structures and unnecessary
allocation/release of inode structures and many associated
unnecessary protocol traffic. This also dramatically
reduced code complexity across the code and sets us up to add
proper temporal cache capabilities.

Signed-off-by: Eric Van Hensbergen <[email protected]>

----------------------------------------------------------------
Eric Van Hensbergen (8):
fs/9p: switch vfsmount to use v9fs_get_new_inode
fs/9p: convert mkdir to use get_new_inode
fs/9p: remove walk and inode allocation from symlink
fs/9p: Eliminate redundant non-cache path in mknod
fs/9p: Eliminate now unused v9fs_get_inode
fs/9p: rework qid2ino logic
fs/9p: simplify iget to remove unnecessary paths
fs/9p: fix dups even in uncached mode

fs/9p/v9fs.h | 31 +++++-----------------------
fs/9p/v9fs_vfs.h | 11 ++++++----
fs/9p/vfs_dir.c | 4 ++--
fs/9p/vfs_inode.c | 150 +++++++++++++++++++--------------------------------------------------------------------------------------------------------------------
fs/9p/vfs_inode_dotl.c | 194 ++++++++++++++++++++++++++++++++-----------------------------------------------------------------------------------------------------------------------------------------------
fs/9p/vfs_super.c | 45 +----------------------------------------
6 files changed, 71 insertions(+), 364 deletions(-)


Attachments:
(No filename) (2.11 kB)
signature.asc (849.00 B)
Download all attachments

2024-03-15 17:18:17

by Linus Torvalds

[permalink] [raw]
Subject: Re: [GIT PULL] fs/9p patches for 6.9 merge window

On Fri, 15 Mar 2024 at 08:10, Eric Van Hensbergen <[email protected]> wrote:
>
> fs/9p changes for the 6.9 merge window

Entirely tangential, but your pgp key drives me insane, and it finally
drove me over the edge.

One of your uid's has your own name mis-spelled. This is not new.

Please tell me there's a reason for it.

Linus

2024-03-15 17:29:00

by pr-tracker-bot

[permalink] [raw]
Subject: Re: [GIT PULL] fs/9p patches for 6.9 merge window

The pull request you sent on Fri, 15 Mar 2024 15:10:03 +0000:

> git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs.git tags/9p-for-6.9

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/c442a42363b2ce5c3eb2b0ff1e052ee956f0a29f

Thank you!

--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

2024-04-08 14:16:34

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [GIT PULL] fs/9p patches for 6.9 merge window

Hello,

the commit 724a08450f74 ("fs/9p: simplify iget to remove unnecessary paths")
from this PR breaks my setup.

On the host:

$ cat /etc/fstab
/dev/mapper/volgrp-root / ext4 defaults 1 1
/dev/mapper/volgrp-home /home ext4 defaults 1 2

$ qemu-system-x86_64 -kernel ./arch/x86/boot/bzImage \
-append 'rw rootfstype=9p rootflags=version=9p2000.L,trans=virtio' \
-fsdev local,id=FSID_1,path=/home/oleg/TEST_GUEST/,security_model=none \
-device virtio-9p-pci,fsdev=FSID_1,mount_tag=/dev/root \
-fsdev local,id=FSID_2,path=/,security_model=none,readonly \
-device virtio-9p-pci,fsdev=FSID_2,mount_tag=hostfs \
-nographic -serial mon:stdio

In the guest:

# mount -t 9p hostfs /host -o version=9p2000.L,trans=virtio,access=any

Now, before this patch:

/# cd /host
/host# strace -e stat perl -e '-d "home"'
...
stat("home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
+++ exited with 0 +++
/host# cd home
/host/home#

After this patch:

# cd /host
/host# strace -e stat perl -e '-d "home"'
...
stat("home", 0x1397298) = -1 ELOOP (Too many levels of symbolic links)
+++ exited with 0 +++
/host# cd home
-bash: cd: home: Too many levels of symbolic links
/host# dmesg
...
[ 54.255756] VFS: Lookup of 'home' in 9p 9p would have caused loop
[ 72.190399] VFS: Lookup of 'home' in 9p 9p would have caused loop
[ 72.191535] VFS: Lookup of 'home' in 9p 9p would have caused loop
[ 72.192488] VFS: Lookup of 'home' in 9p 9p would have caused loop

Oleg.


On 03/15, Eric Van Hensbergen wrote:
>
> The following changes since commit 6613476e225e090cc9aad49be7fa504e290dd33d:
>
> Linux 6.8-rc1 (2024-01-21 14:11:32 -0800)
>
> are available in the Git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs.git tags/9p-for-6.9
>
> for you to fetch changes up to be57855f505003c5cafff40338d5d0f23b00ba4d:
>
> fs/9p: fix dups even in uncached mode (2024-01-26 16:46:56 +0000)
>
> ----------------------------------------------------------------
> fs/9p changes for the 6.9 merge window
>
> This pull request includes a number of patches
> addressing improvements in the cache portions of the 9p
> client.
>
> The biggest improvements have to do with fixing handling
> of inodes and eliminating duplicate structures and unnecessary
> allocation/release of inode structures and many associated
> unnecessary protocol traffic. This also dramatically
> reduced code complexity across the code and sets us up to add
> proper temporal cache capabilities.
>
> Signed-off-by: Eric Van Hensbergen <[email protected]>
>
> ----------------------------------------------------------------
> Eric Van Hensbergen (8):
> fs/9p: switch vfsmount to use v9fs_get_new_inode
> fs/9p: convert mkdir to use get_new_inode
> fs/9p: remove walk and inode allocation from symlink
> fs/9p: Eliminate redundant non-cache path in mknod
> fs/9p: Eliminate now unused v9fs_get_inode
> fs/9p: rework qid2ino logic
> fs/9p: simplify iget to remove unnecessary paths
> fs/9p: fix dups even in uncached mode
>
> fs/9p/v9fs.h | 31 +++++-----------------------
> fs/9p/v9fs_vfs.h | 11 ++++++----
> fs/9p/vfs_dir.c | 4 ++--
> fs/9p/vfs_inode.c | 150 +++++++++++++++++++--------------------------------------------------------------------------------------------------------------------
> fs/9p/vfs_inode_dotl.c | 194 ++++++++++++++++++++++++++++++++-----------------------------------------------------------------------------------------------------------------------------------------------
> fs/9p/vfs_super.c | 45 +----------------------------------------
> 6 files changed, 71 insertions(+), 364 deletions(-)


2024-04-10 17:25:00

by Eric Van Hensbergen

[permalink] [raw]
Subject: Re: [GIT PULL] fs/9p patches for 6.9 merge window

April 8, 2024 at 9:14 AM, "Oleg Nesterov" <[email protected]> wrote:
>
> Hello,
>
> the commit 724a08450f74 ("fs/9p: simplify iget to remove unnecessary paths")
>
> from this PR breaks my setup.
>

Thanks for the bisect and detailed reproduction instructions. I am looking at this now and it seems to be related to another problem reported by Kent Overstreet where he was seeing symlink loop reports that were disrupting his testing environment. Once I'm able to reproduce, I'll try and get a patch out later today, if not I may revert the commit to keep from disrupting folks' testing environments.

-eric

2024-04-10 19:20:25

by Eric Van Hensbergen

[permalink] [raw]
Subject: Re: [GIT PULL] fs/9p patches for 6.9 merge window

April 10, 2024 at 12:20 PM, "Eric Van Hensbergen" <[email protected]> wrote:
> April 8, 2024 at 9:14 AM, "Oleg Nesterov" <[email protected]> wrote:
> > Hello,
> > the commit 724a08450f74 ("fs/9p: simplify iget to remove unnecessary paths")
> > from this PR breaks my setup.
> >
>
> Thanks for the bisect and detailed reproduction instructions. I am looking at this now and it seems to be related to another problem reported by Kent Overstreet where he was seeing symlink loop reports that were disrupting his testing environment. Once I'm able to reproduce, I'll try and get a patch out later today, if not I may revert the commit to keep from disrupting folks' testing environments.
>

Okay, I think I understand this one, unfortunately it doesn't appear obviously related to Kent's report if what I believe is correct. I think I've reproduced the problem, fundamentally, since you have two mount points you are exporting together. I believe we are getting an inode number collision which was being hidden by the "always create a new inode on lookup" inefficiency in v9fs_vfs_lookup. You could probably verify that for me by stating the /home directory and the / directory on the server side of your setup. When I created two partitions and mounted one inside the other the "home" and the "root" both had inode 2 and I got -ELOOP when trying to access.

The underlying cause in the patch series is that I was trying to maintain inodes versus constantly creating and deleting them -- and I simplified the inode lookup to be based purely on inode number (versus checking against times, type, i_generation, etc.) -- so collisions are much more likely to happen. If qemu detects that this is a possibility it usually prints something:
qemu-system-aarch64: warning: 9p: Multiple devices detected in same VirtFS export, which might lead to file ID collisions and severe misbehaviours on guest! You should either use a separate export for each device shared from host or use virtfs option 'multidevs=remap'!

I can confirm that multidevs=remap in qemu does appear to avoid the problem, but this doesn't help the fact that we have a broader regression on our hand that used to work. It'd be useful to see if mutlidevs=remap also deals with your problem @Kent, but while I think its the same underlying problem, I don't think it may be the same solution.

I think I have to give up on relying on qid.path/inode-number as unique and maintain our own client view of those -- but this will cause problems with the way several parts of the code currently operate where we need to lookup inode by a unique identifier from the server (which is currently conveyed by qids).

Now that I understand the problem, I think I can work thorugh a partial revert which will go back to a more complex match in vfs_lookup (which examines several other fields from the server beyond qid.path) -- but maybe I can do it in such a way which will still avoid keeping duplicate inode structs around in memory.

This didn't show up in my regressions because I always export a single file system where the inodes are always unique.

Thanks for your patience while I work through this - now that i can reproduce the problem, I'm fairly certain I can get a patch together this week to test to see if it solves the regressions.

-eric

2024-04-11 09:30:56

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [GIT PULL] fs/9p patches for 6.9 merge window

On 04/10, Eric Van Hensbergen wrote:
>
> April 10, 2024 at 12:20 PM, "Eric Van Hensbergen" <[email protected]> wrote:
> > April 8, 2024 at 9:14 AM, "Oleg Nesterov" <[email protected]> wrote:
> > > Hello,
> > > the commit 724a08450f74 ("fs/9p: simplify iget to remove unnecessary paths")
> > > from this PR breaks my setup.
> >

> I think
> I've reproduced the problem, fundamentally, since you have two mount
> points you are exporting together. I believe we are getting an inode
> number collision which was being hidden by the "always create a new inode
> on lookup" inefficiency in v9fs_vfs_lookup. You could probably verify
> that for me by stating the /home directory and the / directory on the
> server side of your setup.

Yes, yes,

$ ls -ldi / /home
2 dr-xr-xr-x. 18 root root 4096 Jan 4 2016 /
2 drwxr-xr-x. 7 root root 4096 Dec 20 2022 /home

that is why I showed you the relevant parts of my /etc/fstab

> If qemu detects that this is a possibility it usually
> prints something: qemu-system-aarch64: warning: 9p: Multiple devices
> detected in same VirtFS export,

My qemu is quite old, it doesn't. But I tested this on another machine
and yes, the newer qemu does warn.

(annoyingly, I had to redirect stderr to the file to see this warning,
it is cleared by the booting kernel otherwise).

> I can confirm that multidevs=remap in qemu does appear to avoid the
> problem,

Confirm!

I didn't know about this option (and again, my old qemu doesn't support
it), but everything seems to work just fine with the newer qemu and
multidevs=remap. Thanks!

So I think this regression is minor.

> now that i can
> reproduce the problem, I'm fairly certain I can get a patch together
> this week to test to see if it solves the regressions.

Thanks ;)

Oleg.