[adding Ted, the ext4 list, fsdevel, and api, because why not?]
On Tue, Nov 22, 2022 at 10:33:57AM +1100, Dave Chinner wrote:
> On Thu, Nov 17, 2022 at 03:58:06PM -0800, Darrick J. Wong wrote:
> > On Fri, Nov 18, 2022 at 08:51:25AM +1100, Dave Chinner wrote:
> > > On Thu, Nov 17, 2022 at 12:37:33PM -0800, Darrick J. Wong wrote:
> > > > On Wed, Nov 09, 2022 at 02:23:35PM -0800, Catherine Hoang wrote:
> > > > > Add support for the fsuuid command to retrieve the UUID of a mounted
> > > > > filesystem.
> > > > >
> > > > > Signed-off-by: Catherine Hoang <[email protected]>
> > > > > ---
<snip to the good part>
> > > > > diff --git a/spaceman/fsuuid.c b/spaceman/fsuuid.c
> > > > > new file mode 100644
> > > > > index 00000000..be12c1ad
> > > > > --- /dev/null
> > > > > +++ b/spaceman/fsuuid.c
> > > > > @@ -0,0 +1,63 @@
> > > > > +// SPDX-License-Identifier: GPL-2.0
> > > > > +/*
> > > > > + * Copyright (c) 2022 Oracle.
> > > > > + * All Rights Reserved.
> > > > > + */
> > > > > +
> > > > > +#include "libxfs.h"
> > > > > +#include "libfrog/fsgeom.h"
> > > > > +#include "libfrog/paths.h"
> > > > > +#include "command.h"
> > > > > +#include "init.h"
> > > > > +#include "space.h"
> > > > > +#include <sys/ioctl.h>
> > > > > +
> > > > > +#ifndef FS_IOC_GETFSUUID
> > > > > +#define FS_IOC_GETFSUUID _IOR('f', 44, struct fsuuid)
> > > > > +#define UUID_SIZE 16
> > > > > +struct fsuuid {
> > > > > + __u32 fsu_len;
> > > > > + __u32 fsu_flags;
> > > > > + __u8 fsu_uuid[];
> > > >
> > > > This is a flex array ^^ which has no size. struct fsuuid therefore
> > > > has a size of 8 bytes (i.e. enough to cover the two u32 fields) and no
> > > > more. It's assumed that the caller will allocate the memory for
> > > > fsu_uuid...
> > > >
> > > > > +};
> > > > > +#endif
> > > > > +
> > > > > +static cmdinfo_t fsuuid_cmd;
> > > > > +
> > > > > +static int
> > > > > +fsuuid_f(
> > > > > + int argc,
> > > > > + char **argv)
> > > > > +{
> > > > > + struct fsuuid fsuuid;
> > > > > + int error;
> > > >
> > > > ...which makes this usage a problem, because we've not reserved any
> > > > space on the stack to hold the UUID. The kernel will blindly assume
> > > > that there are fsuuid.fsu_len bytes after fsuuid and write to them,
> > > > which will clobber something on the stack.
> > > >
> > > > If you're really unlucky, the C compiler will put the fsuuid right
> > > > before the call frame, which is how stack smashing attacks work. It
> > > > might also lay out bp[] immediately afterwards, which will give you
> > > > weird results as the unparse function overwrites its source buffer. The
> > > > C compiler controls the stack layout, which means this can go bad in
> > > > subtle ways.
> > > >
> > > > Either way, gcc complains about this (albeit in an opaque manner)...
> > > >
> > > > In file included from ../include/xfs.h:9,
> > > > from ../include/libxfs.h:15,
> > > > from fsuuid.c:7:
> > > > In function ‘platform_uuid_unparse’,
> > > > inlined from ‘fsuuid_f’ at fsuuid.c:45:3:
> > > > ../include/xfs/linux.h:100:9: error: ‘uuid_unparse’ reading 16 bytes from a region of size 0 [-Werror=stringop-overread]
> > > > 100 | uuid_unparse(*uu, buffer);
> > > > | ^~~~~~~~~~~~~~~~~~~~~~~~~
> > > > ../include/xfs/linux.h: In function ‘fsuuid_f’:
> > > > ../include/xfs/linux.h:100:9: note: referencing argument 1 of type ‘const unsigned char *’
> > > > In file included from ../include/xfs/linux.h:13,
> > > > from ../include/xfs.h:9,
> > > > from ../include/libxfs.h:15,
> > > > from fsuuid.c:7:
> > > > /usr/include/uuid/uuid.h:107:13: note: in a call to function ‘uuid_unparse’
> > > > 107 | extern void uuid_unparse(const uuid_t uu, char *out);
> > > > | ^~~~~~~~~~~~
> > > > cc1: all warnings being treated as errors
> > > >
> > > > ...so please allocate the struct fsuuid object dynamically.
> > >
> > > So, follow common convention and you'll get it wrong, eh? That's a
> > > score of -4 on Rusty's API Design scale.
> > >
> > > http://sweng.the-davies.net/Home/rustys-api-design-manifesto
> > >
> > > Flex arrays in user APIs like this just look plain dangerous to me.
> > >
> > > Really, this says that the FSUUID API should have a fixed length
> > > buffer size defined in the API and the length used can be anything
> > > up to the maximum.
> > >
> > > We already have this being added for the ioctl API:
> > >
> > > #define UUID_SIZE 16
> > >
> > > So why isn't the API definition this:
> > >
> > > struct fsuuid {
> > > __u32 fsu_len;
> > > __u32 fsu_flags;
> > > __u8 fsu_uuid[UUID_SIZE];
> > > };
> > >
> > > Or if we want to support larger ID structures:
> > >
> > > #define MAX_FSUUID_SIZE 256
> > >
> > > struct fsuuid {
> > > __u32 fsu_len;
> > > __u32 fsu_flags;
> > > __u8 fsu_uuid[MAX_FSUUID_SIZE];
> > > };
> > >
> > > Then the structure can be safely placed on the stack, which means
> > > "the obvious use is (probably) the correct one" (a score of 7 on
> > > Rusty's API Design scale). It also gives the kernel a fixed upper
> > > bound that it can use to validate the incoming fsu_len variable
> > > against...
> >
> > Too late now, this already shipped in 6.0. Changing the struct size
> > would change the ioctl number, which is a totally new API. This was
> > already discussed back in July on fsdevel/api.
>
> It is certainly not too late - if we are going to lift this to the
> VFS, then we can simply make it a new ioctl. The horrible ext4 ioctl
> can be left to rot in ext4 and nobody else ever needs to care that
> it exists.

You're wrong. This was discussed **multiple times** this summer on
the fsdevel and API lists. You had plenty of opportunity to make these
suggestions about the design, and yet you did not:

https://lore.kernel.org/linux-api/[email protected]/
https://lore.kernel.org/linux-api/[email protected]/
https://lore.kernel.org/linux-api/[email protected]/
https://lore.kernel.org/linux-api/[email protected]/

Jeremy built the functionality and followed the customary process,
sending four separate revisions for reviews. He adapted his code based
on our feedback about how to future-proof it by adding an explicit
length parameter, and got it merged into ext4 in 6.0-rc1.

Now you want Catherine and I to tear down his work and initiate a design
review of YET ANOTHER NEW IOCTL just so the API can hit this one design
point you care about, and then convince Ted to go back and redo all the
work that has already been done. All this to extract 16 bytes from the
kernel in a slightly different style than the existing XFS fsgeometry
ioctl.

This was /supposed/ to be a simple way for a less experienced staffer to
gain some experience wiring up an existing ioctl. And, well, I hope she
doesn't take away that developing for Linux is institutionally broken
and frustrating, because that's what I've taken away from the last 2+
years of being here.

--D

> -Dave.
> --
> Dave Chinner
> [email protected]

On Mon, Nov 21, 2022 at 10:21:57PM -0800, Darrick J. Wong wrote:
> [adding Ted, the ext4 list, fsdevel, and api, because why not?]
>
> On Tue, Nov 22, 2022 at 10:33:57AM +1100, Dave Chinner wrote:
> > On Thu, Nov 17, 2022 at 03:58:06PM -0800, Darrick J. Wong wrote:
> > > On Fri, Nov 18, 2022 at 08:51:25AM +1100, Dave Chinner wrote:
> > > > On Thu, Nov 17, 2022 at 12:37:33PM -0800, Darrick J. Wong wrote:
> > > > > On Wed, Nov 09, 2022 at 02:23:35PM -0800, Catherine Hoang wrote:
> > > > > > Add support for the fsuuid command to retrieve the UUID of a mounted
> > > > > > filesystem.
> > > > > >
> > > > > > Signed-off-by: Catherine Hoang <[email protected]>
> > > > > > ---
>
> <snip to the good part>
> > > > > If you're really unlucky, the C compiler will put the fsuuid right
> > > > > before the call frame, which is how stack smashing attacks work. It
> > > > > might also lay out bp[] immediately afterwards, which will give you
> > > > > weird results as the unparse function overwrites its source buffer. The
> > > > > C compiler controls the stack layout, which means this can go bad in
> > > > > subtle ways.
> > > > >
> > > > > Either way, gcc complains about this (albeit in an opaque manner)...
> > > > >
> > > > > In file included from ../include/xfs.h:9,
> > > > > from ../include/libxfs.h:15,
> > > > > from fsuuid.c:7:
> > > > > In function ‘platform_uuid_unparse’,
> > > > > inlined from ‘fsuuid_f’ at fsuuid.c:45:3:
> > > > > ../include/xfs/linux.h:100:9: error: ‘uuid_unparse’ reading 16 bytes from a region of size 0 [-Werror=stringop-overread]
> > > > > 100 | uuid_unparse(*uu, buffer);
> > > > > | ^~~~~~~~~~~~~~~~~~~~~~~~~
> > > > > ../include/xfs/linux.h: In function ‘fsuuid_f’:
> > > > > ../include/xfs/linux.h:100:9: note: referencing argument 1 of type ‘const unsigned char *’
> > > > > In file included from ../include/xfs/linux.h:13,
> > > > > from ../include/xfs.h:9,
> > > > > from ../include/libxfs.h:15,
> > > > > from fsuuid.c:7:
> > > > > /usr/include/uuid/uuid.h:107:13: note: in a call to function ‘uuid_unparse’
> > > > > 107 | extern void uuid_unparse(const uuid_t uu, char *out);
> > > > > | ^~~~~~~~~~~~
> > > > > cc1: all warnings being treated as errors
> > > > >
> > > > > ...so please allocate the struct fsuuid object dynamically.
> > > >
> > > > So, follow common convention and you'll get it wrong, eh? That's a
> > > > score of -4 on Rusty's API Design scale.
> > > >
> > > > http://sweng.the-davies.net/Home/rustys-api-design-manifesto
> > > >
> > > > Flex arrays in user APIs like this just look plain dangerous to me.
> > > >
> > > > Really, this says that the FSUUID API should have a fixed length
> > > > buffer size defined in the API and the length used can be anything
> > > > up to the maximum.
> > > >
> > > > We already have this being added for the ioctl API:
> > > >
> > > > #define UUID_SIZE 16
> > > >
> > > > So why isn't the API definition this:
> > > >
> > > > struct fsuuid {
> > > > __u32 fsu_len;
> > > > __u32 fsu_flags;
> > > > __u8 fsu_uuid[UUID_SIZE];
> > > > };
> > > >
> > > > Or if we want to support larger ID structures:
> > > >
> > > > #define MAX_FSUUID_SIZE 256
> > > >
> > > > struct fsuuid {
> > > > __u32 fsu_len;
> > > > __u32 fsu_flags;
> > > > __u8 fsu_uuid[MAX_FSUUID_SIZE];
> > > > };
> > > >
> > > > Then the structure can be safely placed on the stack, which means
> > > > "the obvious use is (probably) the correct one" (a score of 7 on
> > > > Rusty's API Design scale). It also gives the kernel a fixed upper
> > > > bound that it can use to validate the incoming fsu_len variable
> > > > against...
> > >
> > > Too late now, this already shipped in 6.0. Changing the struct size
> > > would change the ioctl number, which is a totally new API. This was
> > > already discussed back in July on fsdevel/api.
> >
> > It is certainly not too late - if we are going to lift this to the
> > VFS, then we can simply make it a new ioctl. The horrible ext4 ioctl
> > can be left to rot in ext4 and nobody else ever needs to care that
> > it exists.
>
> You're wrong. This was discussed **multiple times** this summer on
> the fsdevel and API lists. You had plenty of opportunity to make these
> suggestions about the design, and yet you did not:
>
> https://lore.kernel.org/linux-api/[email protected]/
> https://lore.kernel.org/linux-api/[email protected]/
> https://lore.kernel.org/linux-api/[email protected]/
> https://lore.kernel.org/linux-api/[email protected]/

There's good reason for that: this was posted and reviewed as *an
EXT4 specific API*. Why are you expecting XFS developers to closely
review a patchset that was titled "Add ioctls to get/set the ext4
superblock uuid."?

There was -no reason- for me to pay attention to it, and I have
enough to keep up with without having to care about the minutiae of
what ext4 internal information is being exposed to userspace.

However, now it's being proposed as a *generic VFS API*, and so it's
now important enough for developers from other filesystems to look
at this ioctl API.

> Jeremy built the functionality and followed the customary process,
> sending four separate revisions for reviews. He adapted his code based
> on our feedback about how to future-proof it by adding an explicit
> length parameter, and got it merged into ext4 in 6.0-rc1.

*As an EXT4 modification*, not a generic VFS ioctl.

> Now you want Catherine and I to tear down his work and initiate a design
> review of YET ANOTHER NEW IOCTL just so the API can hit this one design
> point you care about, and then convince Ted to go back and redo all the
> work that has already been done. All this to extract 16 bytes from the
> kernel in a slightly different style than the existing XFS fsgeometry
> ioctl.

I'm not asking you to tear anything down. Just leave the ext4 ioctl
as it is currently defined and nothing existing breaks or needs
reworking.

All I'm asking is that instead of lifting the ext4 ioctl verbatim,
you lift it with a fixed maximum size for the uuid data array to
replace the flex array. It's a *trivial change to make*, and yes, I
know that this means it's not the same as the ext4 ioctl.
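
To make the difference concrete, here is a minimal userspace sketch,
not the actual UAPI. The names fsuuid_flex, fsuuid_fixed,
fsuuid_flex_alloc() and fsuuid_fixed_init() are hypothetical and exist
only for illustration; the field names and UUID_SIZE come from the
patch quoted above. It makes no ioctl call, it only shows how much
memory each layout gives the caller.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define UUID_SIZE 16

/* Flex-array layout, as in the quoted patch (hypothetical name). */
struct fsuuid_flex {
	uint32_t fsu_len;
	uint32_t fsu_flags;
	uint8_t  fsu_uuid[];	/* contributes nothing to sizeof() */
};

/* Fixed-size layout along the lines suggested above (hypothetical name). */
struct fsuuid_fixed {
	uint32_t fsu_len;
	uint32_t fsu_flags;
	uint8_t  fsu_uuid[UUID_SIZE];
};

/* A stack variable of the flex type has room for the header only. */
static_assert(sizeof(struct fsuuid_flex) == 8, "header only");
static_assert(sizeof(struct fsuuid_fixed) == 8 + UUID_SIZE, "header + uuid");

/*
 * Flex layout: the caller has to reserve the trailing bytes itself.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static struct fsuuid_flex *fsuuid_flex_alloc(uint32_t len)
{
	struct fsuuid_flex *fsu = calloc(1, sizeof(*fsu) + len);

	if (fsu)
		fsu->fsu_len = len;
	return fsu;
}

/* Fixed layout: any plain variable of this type is already big enough. */
static void fsuuid_fixed_init(struct fsuuid_fixed *fsu)
{
	memset(fsu, 0, sizeof(*fsu));
	fsu->fsu_len = sizeof(fsu->fsu_uuid);
}
```

With the fixed layout, "struct fsuuid_fixed fsu;" on the stack followed
by fsuuid_fixed_init(&fsu) is the whole setup. With the flex layout the
same declaration compiles but leaves no space for the UUID, so the
caller has to go through something like fsuuid_flex_alloc(UUID_SIZE)
and remember to free() the result.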

But, really, who cares that it will be a different ioctl? Nobody but
ext4 utilities will be using the ext4 ioctl, and we expect generic
block/fs utilities and applications to use the VFS definition of the
ioctl, not the ext4 specific one.

> This was /supposed/ to be a simple way for a less experienced staffer to
> gain some experience wiring up an existing ioctl. And, well, I hope she
> doesn't take away that developing for Linux is institutionally broken
> and frustrating, because that's what I've taken away from the last 2+
> years of being here.

When we lift stuff from filesystem specific scope (where few people
care about API warts) to generic VFS scope that the whole world is
expected to see, use and understand, you should expect a larger
number of experienced developers to scrutinise it. The wider scope
of the API means the "acceptability bar" is set higher.

Just because the code change is simple, it doesn't mean the issues
surrounding the code change are simple or straightforward. Just
because it went through a review on the ext4 list it doesn't mean
the API or implementation is flawless.

The point I'm making is that lifting fs ioctl APIs verbatim is a
*known broken process* that leads to future pain fixing all the
problems inherited from the original fs specific API and
implementation. If we want to lift functionality to be generic VFS
UAPI and at the time of lifting we find problems with the UAPI
and/or implementation, then we need to fix the problems before we
expose the new VFS API to the entire world.

Repeat past mistakes, or learn from them. Your choice...

-Dave.
--
Dave Chinner
[email protected]