Now that 9p support for macOS hosts has just landed in QEMU 7.0, and with
support for Windows hosts on the horizon [1], the question is how to deal with
case-insensitive host filesystems, which are very common on those two systems.
I ran some tests, e.g. trying to set up a 9p root fs Linux installation on a
macOS host as described in the QEMU HOWTO [2]. At a certain point the
debootstrap script fails when trying to unpack the 'libpam-runtime'
package, because it tries to create this symlink:
/usr/share/man/man7/PAM.7.gz -> /usr/share/man/man7/pam.7.gz
which fails with EEXIST on a case-insensitive APFS. Unfortunately you can't
easily switch an existing APFS partition to case-sensitivity. It requires
reformatting the entire partition, losing all your data, etc.
So I did a quick test with QEMU as outlined below, trying to simply let the 9p
server "eat" EEXIST errors in such cases, but then I realized that most of the
time it would not even get that far: the Linux client first sends a
'Twalk' request to check whether the target symlink entry already exists, and
as it gets a positive response from the 9p server (again, due to
case-insensitivity) the client stops right there without even trying to send a
'Tsymlink' request.
So maybe it's better to handle case-insensitivity entirely on the client side?
I've read that some generic "case fold" code has landed in the Linux kernel
recently that might do the trick?
Should the 9p server give the 9p client a hint that it's a case-insensitive fs?
And if yes, once for the entire exported fs or rather for each directory (as
there might be submounts on the host)?
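
To make the per-directory variant a bit more concrete, here is a minimal sketch
of how a server could probe a single directory. It is only an illustration:
p9_probe_dir_icase() and the probe file name are made up for this example and
are not part of QEMU, and the probe needs write access to the directory.

```c
#define _POSIX_C_SOURCE 200809L
#include <ctype.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of QEMU): probe whether the directory behind
 * dirfd folds case. Returns 1 if it is case-insensitive, 0 if it is
 * case-sensitive, -1 on error (errno is set).
 */
static int p9_probe_dir_icase(int dirfd)
{
    char lower[64], upper[64];
    struct stat st1, st2;
    size_t i;
    int fd, ret, saved_errno;

    /* Probe name is an assumption; it only needs to contain letters. */
    snprintf(lower, sizeof(lower), ".p9-icase-probe-%ld", (long)getpid());
    for (i = 0; lower[i]; i++) {
        upper[i] = (char)toupper((unsigned char)lower[i]);
    }
    upper[i] = '\0';

    /* Create the lower-case entry; fail if something is already there. */
    fd = openat(dirfd, lower, O_CREAT | O_EXCL | O_WRONLY | O_NOFOLLOW, 0600);
    if (fd < 0) {
        return -1;
    }

    if (fstat(fd, &st1) < 0) {
        ret = -1;
    } else if (fstatat(dirfd, upper, &st2, AT_SYMLINK_NOFOLLOW) == 0) {
        /* The upper-case name resolves: same inode means case folding. */
        ret = st1.st_dev == st2.st_dev && st1.st_ino == st2.st_ino;
    } else if (errno == ENOENT) {
        ret = 0;
    } else {
        ret = -1;
    }

    /* Clean up the probe file without clobbering errno. */
    saved_errno = errno;
    close(fd);
    unlinkat(dirfd, lower, 0);
    errno = saved_errno;
    return ret;
}
```

A server could call this once per directory fid and cache the result. On macOS,
pathconf() with _PC_CASE_SENSITIVE would presumably be a cheaper way to get the
same answer without touching the directory.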
[1] https://lore.kernel.org/all/[email protected]/
[2] https://wiki.qemu.org/Documentation/9p_root_fs
---
hw/9pfs/9p-local.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 54 insertions(+)
diff --git a/hw/9pfs/9p-local.c b/hw/9pfs/9p-local.c
index d42ce6d8b8..d6cb45c758 100644
--- a/hw/9pfs/9p-local.c
+++ b/hw/9pfs/9p-local.c
@@ -39,6 +39,10 @@
#endif
#endif
#include <sys/ioctl.h>
+#ifdef CONFIG_DARWIN
+#include <glib.h>
+#include <glib/gprintf.h>
+#endif
#ifndef XFS_SUPER_MAGIC
#define XFS_SUPER_MAGIC 0x58465342
@@ -57,6 +61,18 @@ typedef struct {
int mountfd;
} LocalData;
+#ifdef CONFIG_DARWIN
+
+/* Compare strings case-insensitive (assuming UTF-8 encoding). */
+static int p9_stricmp(const char *a, const char *b)
+{
+ g_autofree gchar *cia = g_utf8_casefold(a, -1);
+ g_autofree gchar *cib = g_utf8_casefold(b, -1);
+ return g_utf8_collate(cia, cib);
+}
+
+#endif
+
int local_open_nofollow(FsContext *fs_ctx, const char *path, int flags,
mode_t mode)
{
@@ -931,6 +947,25 @@ static int local_symlink(FsContext *fs_ctx, const char *oldpath,
fs_ctx->export_flags & V9FS_SM_NONE) {
err = symlinkat(oldpath, dirfd, name);
if (err) {
+#ifdef CONFIG_DARWIN
+ if (errno == EEXIST) {
+ printf(" -> symlinkat(oldpath='%s', dirfd=%d, name='%s') = EEXIST\n", oldpath, dirfd, name);
+ }
+ if (errno == EEXIST &&
+ strcmp(oldpath, name) && !p9_stricmp(oldpath, name))
+ {
+ struct stat st1, st2;
+ const int cur_errno = errno;
+ if (!fstatat(dirfd, oldpath, &st1, AT_SYMLINK_NOFOLLOW) &&
+ !fstatat(dirfd, name, &st2, AT_SYMLINK_NOFOLLOW) &&
+ st1.st_dev == st2.st_dev && st1.st_ino == st2.st_ino)
+ {
+ printf(" -> iCASE SAME\n");
+ err = 0;
+ }
+ errno = cur_errno;
+ }
+#endif
goto out;
}
err = fchownat(dirfd, name, credp->fc_uid, credp->fc_gid,
@@ -983,6 +1018,25 @@ static int local_link(FsContext *ctx, V9fsPath *oldpath,
ret = linkat(odirfd, oname, ndirfd, name, 0);
if (ret < 0) {
+#ifdef CONFIG_DARWIN
+ if (errno == EEXIST) {
+ printf(" -> linkat(odirfd=%d, oname='%s', ndirfd=%d, name='%s') = EEXIST\n", odirfd, oname, ndirfd, name);
+ }
+ if (errno == EEXIST &&
+ strcmp(oname, name) && !p9_stricmp(oname, name))
+ {
+ struct stat st1, st2;
+ const int cur_errno = errno;
+ if (!fstatat(odirfd, oname, &st1, AT_SYMLINK_NOFOLLOW) &&
+ !fstatat(ndirfd, name, &st2, AT_SYMLINK_NOFOLLOW) &&
+ st1.st_dev == st2.st_dev && st1.st_ino == st2.st_ino)
+ {
+ printf(" -> iCASE SAME\n");
+ ret = 0;
+ }
+ errno = cur_errno;
+ }
+#endif
goto out_close;
}
--
2.32.0 (Apple Git-132)
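
Both hunks above repeat the same "do these two names resolve to the same
inode?" check. As a rough sketch, that comparison could be factored into one
helper. The name p9_same_entry() is hypothetical and not part of the patch; the
logic is just the fstatat() comparison from the hunks.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>

/*
 * Hypothetical helper (not in the patch): true if (dirfd_a, name_a) and
 * (dirfd_b, name_b) refer to the same inode on the same device. A trailing
 * symlink is not followed. Any lookup failure counts as "not the same".
 */
static bool p9_same_entry(int dirfd_a, const char *name_a,
                          int dirfd_b, const char *name_b)
{
    struct stat a, b;

    if (fstatat(dirfd_a, name_a, &a, AT_SYMLINK_NOFOLLOW) < 0 ||
        fstatat(dirfd_b, name_b, &b, AT_SYMLINK_NOFOLLOW) < 0) {
        return false;
    }
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}
```

Like the original code, a caller that wants to preserve the EEXIST errno would
still have to save and restore it around the call, since fstatat() may
overwrite it.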
Christian Schoenebeck wrote on Fri, Apr 22, 2022 at 08:02:46PM +0200:
> So maybe it's better to handle case-insensitivity entirely on client side?
> I've read that some generic "case fold" code has landed in the Linux kernel
> recently that might do the trick?
I haven't tried, but setting S_CASEFOLD on every inode's i_flags might do
what you want client-side.
That's easy enough to test and could be a mount option.
Even with that it's possible to do a direct open without readdir first
if one knows the path, and that would only be case-insensitive if the
backing server is case-insensitive, so just setting the option
and expecting it to work all the time might be a little bit
optimistic... I guess that should be an optimization at best.
Ideally the server should tell the client they are casefolded somehow,
but 9p doesn't have any capability/mount time negotiation besides msize
so that's difficult with the current protocol.
--
Dominique | Asmadeus
On Friday, 22 April 2022 21:57:40 CEST Dominique Martinet wrote:
> Christian Schoenebeck wrote on Fri, Apr 22, 2022 at 08:02:46PM +0200:
> > So maybe it's better to handle case-insensitivity entirely on client side?
> > I've read that some generic "case fold" code has landed in the Linux kernel
> > recently that might do the trick?
>
> I haven't tried, but setting S_CASEFOLD on every inode's i_flags might do
> what you want client-side.
> That's easy enough to test and could be a mount option.
I just made a quick test using:
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 08f48b70a741..5d8e77daed53 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -257,6 +257,7 @@ int v9fs_init_inode(struct v9fs_session_info *v9ses,
inode->i_atime = inode->i_mtime = inode->i_ctime = current_time(inode);
inode->i_mapping->a_ops = &v9fs_addr_operations;
inode->i_private = NULL;
+ inode->i_flags |= S_CASEFOLD;
switch (mode & S_IFMT) {
case S_IFIFO:
Unfortunately that did not help much. I still get an EEXIST error, e.g. when
trying 'ln -s foo FOO'.
I am not sure though whether there are more places in the code to touch, or
whether that's even the expected behaviour with S_CASEFOLD for some reason.
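
For anyone who wants to repeat the 'ln -s foo FOO' test without a shell, a
small sketch of the same sequence in C follows. The function name
p9_try_icase_symlink() is made up for this example. It creates 'foo' in the
given directory, then tries to create the symlink 'FOO' pointing at it, and
returns 0 or the errno of the failing call.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical reproducer (not part of any patch here): mimic
 * 'touch foo; ln -s foo FOO' inside dirfd.
 * Returns 0 on success, otherwise the errno of the failing call
 * (EEXIST would be the case discussed in this thread).
 */
static int p9_try_icase_symlink(int dirfd)
{
    int fd = openat(dirfd, "foo", O_CREAT | O_WRONLY, 0600);

    if (fd < 0) {
        return errno;
    }
    close(fd);

    /* Symlink named "FOO" whose target is the relative path "foo". */
    if (symlinkat("foo", dirfd, "FOO") < 0) {
        return errno;
    }
    return 0;
}
```

Running it against a directory on the 9p mount and against a local
case-sensitive directory should make the difference in behaviour visible.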
> Even with that it's possible to do a direct open without readdir first
> if one knows the path, and that would only be case-insensitive if the
> backing server is case-insensitive, so just setting the option
> and expecting it to work all the time might be a little bit
> optimistic... I guess that should be an optimization at best.
>
> Ideally the server should tell the client they are casefolded somehow,
> but 9p doesn't have any capability/mount time negotiation besides msize
> so that's difficult with the current protocol.