2023-02-10 09:58:36

by Longlong Xia

[permalink] [raw]
Subject: [PATCH -next v2 0/3] cleanup of devtmpfs_*_node()

In one test, when modprobe zram, no zram device was found in the /dev.
But don't see any errors printed in jouranls/dmesg. Later we found out
that the reason was that device_add() did not check its return value when
calling devtmpfs_create_node().

The test steps:
1. Set the SElinux label of /dev to user_home_t
2. modprobe zram num_devices=1000

v1 -> v2:
- New patch 1 to add error handling for devtmpfs_create_node().
- use dev_err() to replace pr_err_ratelimited in [2].
- only remove return value of devtmpfs_delete_node() in [3].

Longlong Xia (3):
driver core: add error handling for devtmpfs_create_node()
devtmpfs: add debug info to handle()
devtmpfs: remove return value of devtmpfs_delete_node()

drivers/base/base.h | 4 ++--
drivers/base/core.c | 6 +++++-
drivers/base/devtmpfs.c | 20 ++++++++++++++------
3 files changed, 21 insertions(+), 9 deletions(-)

--
2.25.1



2023-02-10 09:58:42

by Longlong Xia

[permalink] [raw]
Subject: [PATCH -next v2 1/3] driver core: add error handling for devtmpfs_create_node()

In some cases, devtmpfs_create_node() can return error value.
So, make use of it.

Signed-off-by: Longlong Xia <[email protected]>
---
drivers/base/core.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 7dab705f2937..aaa3088e5456 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -3405,7 +3405,9 @@ int device_add(struct device *dev)
if (error)
goto SysEntryError;

- devtmpfs_create_node(dev);
+ error = devtmpfs_create_node(dev);
+ if (error)
+ goto DevtmpfsError;
}

/* Notify clients of device addition. This call must come
@@ -3461,6 +3463,8 @@ int device_add(struct device *dev)
done:
put_device(dev);
return error;
+ DevtmpfsError:
+ device_remove_sys_dev_entry(dev);
SysEntryError:
if (MAJOR(dev->devt))
device_remove_file(dev, &dev_attr_dev);
--
2.25.1


2023-02-10 09:58:47

by Longlong Xia

[permalink] [raw]
Subject: [PATCH -next v2 2/3] devtmpfs: add debug info to handle()

Because handle() is the core function for processing devtmpfs requests,
Let's add some debug info in handle() to help users know why failed.

Signed-off-by: Longlong Xia <[email protected]>
---
drivers/base/devtmpfs.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c
index ae72d4ba8547..7789b7be4ee5 100644
--- a/drivers/base/devtmpfs.c
+++ b/drivers/base/devtmpfs.c
@@ -389,10 +389,18 @@ static __initdata DECLARE_COMPLETION(setup_done);
static int handle(const char *name, umode_t mode, kuid_t uid, kgid_t gid,
struct device *dev)
{
+ int ret;
+
if (mode)
- return handle_create(name, mode, uid, gid, dev);
+ ret = handle_create(name, mode, uid, gid, dev);
else
- return handle_remove(name, dev);
+ ret = handle_remove(name, dev);
+
+ if (ret)
+ dev_err(dev, "failed to %s %s, ret = %d\n",
+ mode ? "create" : "remove", name, ret);
+
+ return ret;
}

static void __noreturn devtmpfs_work_loop(void)
--
2.25.1


2023-02-10 09:58:51

by Longlong Xia

[permalink] [raw]
Subject: [PATCH -next v2 3/3] devtmpfs: remove return value of devtmpfs_delete_node()

The only caller of device_del() does not check the return value. And
there's nothing we can do when cleaning things up on a remove path.
Let's make it a void function.

Signed-off-by: Longlong Xia <[email protected]>
---
drivers/base/base.h | 4 ++--
drivers/base/devtmpfs.c | 8 ++++----
2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/base/base.h b/drivers/base/base.h
index 0e806f641079..f7996bf8d28b 100644
--- a/drivers/base/base.h
+++ b/drivers/base/base.h
@@ -201,10 +201,10 @@ void device_pm_move_to_tail(struct device *dev);

#ifdef CONFIG_DEVTMPFS
int devtmpfs_create_node(struct device *dev);
-int devtmpfs_delete_node(struct device *dev);
+void devtmpfs_delete_node(struct device *dev);
#else
static inline int devtmpfs_create_node(struct device *dev) { return 0; }
-static inline int devtmpfs_delete_node(struct device *dev) { return 0; }
+static inline void devtmpfs_delete_node(struct device *dev) { return 0; }
#endif

void software_node_notify(struct device *dev);
diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c
index 7789b7be4ee5..e5c4d3e98ec9 100644
--- a/drivers/base/devtmpfs.c
+++ b/drivers/base/devtmpfs.c
@@ -147,22 +147,22 @@ int devtmpfs_create_node(struct device *dev)
return devtmpfs_submit_req(&req, tmp);
}

-int devtmpfs_delete_node(struct device *dev)
+void devtmpfs_delete_node(struct device *dev)
{
const char *tmp = NULL;
struct req req;

if (!thread)
- return 0;
+ return;

req.name = device_get_devnode(dev, NULL, NULL, NULL, &tmp);
if (!req.name)
- return -ENOMEM;
+ return;

req.mode = 0;
req.dev = dev;

- return devtmpfs_submit_req(&req, tmp);
+ devtmpfs_submit_req(&req, tmp);
}

static int dev_mkdir(const char *name, umode_t mode)
--
2.25.1


2023-02-10 10:27:17

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH -next v2 0/3] cleanup of devtmpfs_*_node()

On Fri, Feb 10, 2023 at 09:54:41AM +0000, Longlong Xia wrote:
> In one test, when modprobe zram, no zram device was found in the /dev.
> But don't see any errors printed in jouranls/dmesg. Later we found out
> that the reason was that device_add() did not check its return value when
> calling devtmpfs_create_node().
>
> The test steps:
> 1. Set the SElinux label of /dev to user_home_t
> 2. modprobe zram num_devices=1000
>
> v1 -> v2:
> - New patch 1 to add error handling for devtmpfs_create_node().
> - use dev_err() to replace pr_err_ratelimited in [2].
> - only remove return value of devtmpfs_delete_node() in [3].
>
> Longlong Xia (3):
> driver core: add error handling for devtmpfs_create_node()
> devtmpfs: add debug info to handle()
> devtmpfs: remove return value of devtmpfs_delete_node()
>
> drivers/base/base.h | 4 ++--
> drivers/base/core.c | 6 +++++-
> drivers/base/devtmpfs.c | 20 ++++++++++++++------
> 3 files changed, 21 insertions(+), 9 deletions(-)
>
> --
> 2.25.1
>

Thanks for the quick resend, nice work, all now queued up!

greg k-h

2023-02-14 00:14:51

by Nathan Chancellor

[permalink] [raw]
Subject: Re: [PATCH -next v2 1/3] driver core: add error handling for devtmpfs_create_node()

Hi Longlong,

On Fri, Feb 10, 2023 at 09:54:42AM +0000, Longlong Xia wrote:
> In some cases, devtmpfs_create_node() can return error value.
> So, make use of it.
>
> Signed-off-by: Longlong Xia <[email protected]>
> ---
> drivers/base/core.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 7dab705f2937..aaa3088e5456 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -3405,7 +3405,9 @@ int device_add(struct device *dev)
> if (error)
> goto SysEntryError;
>
> - devtmpfs_create_node(dev);
> + error = devtmpfs_create_node(dev);
> + if (error)
> + goto DevtmpfsError;
> }
>
> /* Notify clients of device addition. This call must come
> @@ -3461,6 +3463,8 @@ int device_add(struct device *dev)
> done:
> put_device(dev);
> return error;
> + DevtmpfsError:
> + device_remove_sys_dev_entry(dev);
> SysEntryError:
> if (MAJOR(dev->devt))
> device_remove_file(dev, &dev_attr_dev);
> --
> 2.25.1
>

After this change in -next as commit 31b4b6730fd4 ("driver core: add
error handling for devtmpfs_create_node()"), my test machines failed to
boot after the rootfs could not be mounted. I added some logging to see
which device was failing, which triggers a few times with the exact same
message:

device: 'btrfs-control': devtmpfs_create_node() failed, err = -17

with -17 being -EEXIST. I am not sure why this device is getting
registered more than once, it appears to occur during module insertion
though, as I am able to get to systemd starting within the initrd.

Should this particular return value be downgraded to a warning so that
the device still loads or should the driver be fixed? I have cc'd the
btrfs mailing list, in case they have any input.

Cheers,
Nathan

2023-02-14 00:33:44

by Nathan Chancellor

[permalink] [raw]
Subject: Re: [PATCH -next v2 1/3] driver core: add error handling for devtmpfs_create_node()

On Mon, Feb 13, 2023 at 05:14:27PM -0700, Nathan Chancellor wrote:
> Hi Longlong,
>
> On Fri, Feb 10, 2023 at 09:54:42AM +0000, Longlong Xia wrote:
> > In some cases, devtmpfs_create_node() can return error value.
> > So, make use of it.
> >
> > Signed-off-by: Longlong Xia <[email protected]>
> > ---
> > drivers/base/core.c | 6 +++++-
> > 1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > index 7dab705f2937..aaa3088e5456 100644
> > --- a/drivers/base/core.c
> > +++ b/drivers/base/core.c
> > @@ -3405,7 +3405,9 @@ int device_add(struct device *dev)
> > if (error)
> > goto SysEntryError;
> >
> > - devtmpfs_create_node(dev);
> > + error = devtmpfs_create_node(dev);
> > + if (error)
> > + goto DevtmpfsError;
> > }
> >
> > /* Notify clients of device addition. This call must come
> > @@ -3461,6 +3463,8 @@ int device_add(struct device *dev)
> > done:
> > put_device(dev);
> > return error;
> > + DevtmpfsError:
> > + device_remove_sys_dev_entry(dev);
> > SysEntryError:
> > if (MAJOR(dev->devt))
> > device_remove_file(dev, &dev_attr_dev);
> > --
> > 2.25.1
> >
>
> After this change in -next as commit 31b4b6730fd4 ("driver core: add
> error handling for devtmpfs_create_node()"), my test machines failed to
> boot after the rootfs could not be mounted. I added some logging to see
> which device was failing, which triggers a few times with the exact same
> message:
>
> device: 'btrfs-control': devtmpfs_create_node() failed, err = -17
>
> with -17 being -EEXIST. I am not sure why this device is getting
> registered more than once, it appears to occur during module insertion
> though, as I am able to get to systemd starting within the initrd.
>
> Should this particular return value be downgraded to a warning so that
> the device still loads or should the driver be fixed? I have cc'd the
> btrfs mailing list, in case they have any input.

Apologies, I see now that the second patch in this series is logging for
this, sorry for not using or noticing that sooner.

I am now looking at some of my other test machines and I see messages
similar to the one above for other devices:

[ 16.268685] sound timer: failed to create snd/timer, ret = -17
[ 16.274666] ALSA: unable to register timer device (-17)
[ 16.500253] sound timer: failed to create snd/timer, ret = -17
[ 16.506124] ALSA: unable to register timer device (-17)

[ 18.273067] misc rfkill: failed to create rfkill, ret = -17
[ 18.464836] misc rfkill: failed to create rfkill, ret = -17
[ 19.366149] misc tun: failed to create net/tun, ret = -17
[ 19.371613] tun: Can't register misc device 200
[ 19.552362] misc tun: failed to create net/tun, ret = -17
[ 19.557839] tun: Can't register misc device 200
[ 23.124993] misc tun: failed to create net/tun, ret = -17
[ 23.130460] tun: Can't register misc device 200
[ 23.299267] misc tun: failed to create net/tun, ret = -17
[ 23.299319] tun: Can't register misc device 200
[ 27.050363] misc tun: failed to create net/tun, ret = -17
[ 27.055824] tun: Can't register misc device 200
[ 27.181496] misc tun: failed to create net/tun, ret = -17

So it seems like this is not a btrfs specific problem?

Cheers,
Nathan

2023-02-14 07:59:42

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH -next v2 1/3] driver core: add error handling for devtmpfs_create_node()

On Mon, Feb 13, 2023 at 05:33:06PM -0700, Nathan Chancellor wrote:
> On Mon, Feb 13, 2023 at 05:14:27PM -0700, Nathan Chancellor wrote:
> > Hi Longlong,
> >
> > On Fri, Feb 10, 2023 at 09:54:42AM +0000, Longlong Xia wrote:
> > > In some cases, devtmpfs_create_node() can return error value.
> > > So, make use of it.
> > >
> > > Signed-off-by: Longlong Xia <[email protected]>
> > > ---
> > > drivers/base/core.c | 6 +++++-
> > > 1 file changed, 5 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > > index 7dab705f2937..aaa3088e5456 100644
> > > --- a/drivers/base/core.c
> > > +++ b/drivers/base/core.c
> > > @@ -3405,7 +3405,9 @@ int device_add(struct device *dev)
> > > if (error)
> > > goto SysEntryError;
> > >
> > > - devtmpfs_create_node(dev);
> > > + error = devtmpfs_create_node(dev);
> > > + if (error)
> > > + goto DevtmpfsError;
> > > }
> > >
> > > /* Notify clients of device addition. This call must come
> > > @@ -3461,6 +3463,8 @@ int device_add(struct device *dev)
> > > done:
> > > put_device(dev);
> > > return error;
> > > + DevtmpfsError:
> > > + device_remove_sys_dev_entry(dev);
> > > SysEntryError:
> > > if (MAJOR(dev->devt))
> > > device_remove_file(dev, &dev_attr_dev);
> > > --
> > > 2.25.1
> > >
> >
> > After this change in -next as commit 31b4b6730fd4 ("driver core: add
> > error handling for devtmpfs_create_node()"), my test machines failed to
> > boot after the rootfs could not be mounted. I added some logging to see
> > which device was failing, which triggers a few times with the exact same
> > message:
> >
> > device: 'btrfs-control': devtmpfs_create_node() failed, err = -17
> >
> > with -17 being -EEXIST. I am not sure why this device is getting
> > registered more than once, it appears to occur during module insertion
> > though, as I am able to get to systemd starting within the initrd.
> >
> > Should this particular return value be downgraded to a warning so that
> > the device still loads or should the driver be fixed? I have cc'd the
> > btrfs mailing list, in case they have any input.
>
> Apologies, I see now that the second patch in this series is logging for
> this, sorry for not using or noticing that sooner.
>
> I am now looking at some of my other test machines and I see messages
> similar to the one above for other devices:
>
> [ 16.268685] sound timer: failed to create snd/timer, ret = -17
> [ 16.274666] ALSA: unable to register timer device (-17)
> [ 16.500253] sound timer: failed to create snd/timer, ret = -17
> [ 16.506124] ALSA: unable to register timer device (-17)
>
> [ 18.273067] misc rfkill: failed to create rfkill, ret = -17
> [ 18.464836] misc rfkill: failed to create rfkill, ret = -17
> [ 19.366149] misc tun: failed to create net/tun, ret = -17
> [ 19.371613] tun: Can't register misc device 200
> [ 19.552362] misc tun: failed to create net/tun, ret = -17
> [ 19.557839] tun: Can't register misc device 200
> [ 23.124993] misc tun: failed to create net/tun, ret = -17
> [ 23.130460] tun: Can't register misc device 200
> [ 23.299267] misc tun: failed to create net/tun, ret = -17
> [ 23.299319] tun: Can't register misc device 200
> [ 27.050363] misc tun: failed to create net/tun, ret = -17
> [ 27.055824] tun: Can't register misc device 200
> [ 27.181496] misc tun: failed to create net/tun, ret = -17
>
> So it seems like this is not a btrfs specific problem?

Ok, let me go revert the series for now. Longlong, please feel free to
resubmit after testing this a bit better.

thanks,

greg k-h