2004-09-02 08:36:29

by Greg KH

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Tue, Aug 31, 2004 at 06:05:24PM -0400, Robert Love wrote:
> +int send_kevent(enum kevent type, struct kset *kset,
> + struct kobject *kobj, const char *signal);

Why is the kset needed? We can determine that from the kobject.

How about changing this to:
int send_kevent(struct kobject *kobj, struct attribute *attr);
which just tells userspace that a specific attribute needs to be read,
as something "important" has changed.

Will passing the attribute name be able to successfully handle the
"enum kevent" and "signal" combinations?

thanks,

greg k-h


2004-09-02 12:04:07

by Daniel Stekloff

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Thu, 2004-09-02 at 01:34, Greg KH wrote:
> On Tue, Aug 31, 2004 at 06:05:24PM -0400, Robert Love wrote:
> > +int send_kevent(enum kevent type, struct kset *kset,
> > + struct kobject *kobj, const char *signal);
>
> Why is the kset needed? We can determine that from the kobject.
>
> How about changing this to:
> int send_kevent(struct kobject *kobj, struct attribute *attr);
> which just tells userspace that a specific attribute needs to be read,
> as something "important" has changed.


Do all events require an attribute? What about the "overheating"
example? Would you need an attribute or would getting a "signal" for a
specific kobj be enough?

Binding an attribute to an event would at least tell you the name of the
attribute to check. Otherwise, how does an app know the name of the
attribute that changed? Or am I missing something?

Thanks,

Dan



> Will passing the attribute name be able to successfully handle the
> "enum kevent" and "signal" combinations?
>
> thanks,
>
> greg k-h
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2004-09-02 12:49:06

by Kay Sievers

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Thu, Sep 02, 2004 at 10:34:08AM +0200, Greg KH wrote:
> On Tue, Aug 31, 2004 at 06:05:24PM -0400, Robert Love wrote:
> > +int send_kevent(enum kevent type, struct kset *kset,
> > + struct kobject *kobj, const char *signal);
>
> Why is the kset needed? We can determine that from the kobject.

I expect it's because:
fill_kobj_path(struct kset *kset, struct kobject *kobj, char *path, int length)
get_kobj_path_length(struct kset *kset, struct kobject *kobj)

and therefore the exported:
kobject_get_path(struct kset *kset, struct kobject *kobj, int gfp_mask)

are all passing the kset. If they all are not needed, they can go too?

> How about changing this to:
> int send_kevent(struct kobject *kobj, struct attribute *attr);
> which just tells userspace that a specific attribute needs to be read,
> as something "important" has changed.

Hmm, in most cases this will work. But in mandates the creation of an
attribute instead of the lazy signal string. Yes, it would be nicer in
the long run and closer to the idea, that the whole event data should be
available through sysfs, but it may be hard to reach this in some subsystems.

> Will passing the attribute name be able to successfully handle the
> "enum kevent" and "signal" combinations?

What should we do in the hotplug case? We may send a NULL attr for
the kset creation. But how can we distinguish between "add" and "remove"?
Just by looking if we find the sysfs dir? I'm not sure in this case.

Any ideas?

Thanks,
Kay

2004-09-02 13:26:06

by Kay Sievers

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Thu, Sep 02, 2004 at 05:02:46AM -0700, Daniel Stekloff wrote:
> On Thu, 2004-09-02 at 01:34, Greg KH wrote:
> > On Tue, Aug 31, 2004 at 06:05:24PM -0400, Robert Love wrote:
> > > +int send_kevent(enum kevent type, struct kset *kset,
> > > + struct kobject *kobj, const char *signal);
> >
> > Why is the kset needed? We can determine that from the kobject.
> >
> > How about changing this to:
> > int send_kevent(struct kobject *kobj, struct attribute *attr);
> > which just tells userspace that a specific attribute needs to be read,
> > as something "important" has changed.
>
>
> Do all events require an attribute? What about the "overheating"
> example? Would you need an attribute or would getting a "signal" for a
> specific kobj be enough?

Hmm, both can be the case at the moment. We need to decide, if we want to
mandate the use of a sysfs attribute instead of the string value as the signal.

> Binding an attribute to an event would at least tell you the name of the
> attribute to check. Otherwise, how does an app know the name of the
> attribute that changed? Or am I missing something?

That's a valid point, right. If we don't use "verbs" as the signal string,
and know what we can expect from that particular kind of event, we don't
know which attribute has changed.


The remaining questions are:

o Do we want the multicast - "channels" for the events? It may be nice, to
have it, if a application only interested in e.g. hotplug events, can get
only the interesting events?

o What kind of signal do we need? A lazy string, a well defined set like
ADD/REMOVE/CHANGE?
Or can we get rid of the whole signal? But how can we distinguish between
add and remove? Watching if the sysfs file comes or goes is not a option,
I think.

Thanks,
Kay

2004-09-02 16:26:41

by Robert Love

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Thu, 2004-09-02 at 10:34 +0200, Greg KH wrote:

> Why is the kset needed? We can determine that from the kobject.
>
> How about changing this to:
> int send_kevent(struct kobject *kobj, struct attribute *attr);
> which just tells userspace that a specific attribute needs to be read,
> as something "important" has changed.
>
> Will passing the attribute name be able to successfully handle the
> "enum kevent" and "signal" combinations?

We can drop the kset if you say we never need it. Why do all the kobj
get_path functions take a kset then? That is what confused me.

We can also drop the "enum kevent" if we decide we don't want to take
advantage of the multicasting of the netlink socket. The enum defines
what multicast group the netlink message is sent out in. I actually
have been talking to Kay about ditching it, and we are trying to figure
out if we ever _need_ it.

So that is 2 for 2. But ...

I don't dig replacing the signal string with an attribute. I think it
will really limit what we can do - having the signal as a verb
describing the event is really important. We might also not always have
an attribute. Kay's points are all valid.

So what if we had

int send_kevent(struct kobject *kobj, const char *signal);

Which was a way of notifying user-space of a change of "signal" on the
object "kobj" in sysfs.

If we ever wanted to send an attribute, we can do something like

const char *signal;

signal = attribute_to_string(attribute);
send_kevent(kobj, signal);
kfree(signal);

Dig that?

Best,

Robert Love


2004-09-02 16:29:47

by Robert Love

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Thu, 2004-09-02 at 15:26 +0200, Kay Sievers wrote:

> o What kind of signal do we need? A lazy string, a well defined set like
> ADD/REMOVE/CHANGE?
> Or can we get rid of the whole signal? But how can we distinguish between
> add and remove? Watching if the sysfs file comes or goes is not a option,
> I think.

I think (from our off-list discussions) that we really need the signal.
Agreed?

I do think that defining the signal to specific values makes sense, e.g.
KEVENT_ADD, KEVENT_REMOVE, KEVENT_MOUNTED, etc. We could also send the
attribute as a string.

To get around the hotplug issue that would occur without 'enum kevent',
as we discussed, we could have a "hotplug_added" signal or whatever.
Nothing wrong with that.

Cool or not?

Robert Love


2004-09-02 18:36:16

by Daniel Stekloff

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Thu, 2004-09-02 at 09:25, Robert Love wrote:
> On Thu, 2004-09-02 at 10:34 +0200, Greg KH wrote:
>
> > Why is the kset needed? We can determine that from the kobject.
> >
> > How about changing this to:
> > int send_kevent(struct kobject *kobj, struct attribute *attr);
> > which just tells userspace that a specific attribute needs to be read,
> > as something "important" has changed.
> >
> > Will passing the attribute name be able to successfully handle the
> > "enum kevent" and "signal" combinations?
>
> We can drop the kset if you say we never need it. Why do all the kobj
> get_path functions take a kset then? That is what confused me.
>
> We can also drop the "enum kevent" if we decide we don't want to take
> advantage of the multicasting of the netlink socket. The enum defines
> what multicast group the netlink message is sent out in. I actually
> have been talking to Kay about ditching it, and we are trying to figure
> out if we ever _need_ it.


The only problem I see is making an app sift through all of the events
to get to a specific type of event. They'd have to parse and match the
signal, that could be costly.

Will there be small single purpose applications listening on the netlink
socket or off dbus for specific signals? Or do you see this mainly
handled by a single kevent daemon?



> So that is 2 for 2. But ...
>
> I don't dig replacing the signal string with an attribute. I think it
> will really limit what we can do - having the signal as a verb
> describing the event is really important. We might also not always have
> an attribute. Kay's points are all valid.
>
> So what if we had
>
> int send_kevent(struct kobject *kobj, const char *signal);
>
> Which was a way of notifying user-space of a change of "signal" on the
> object "kobj" in sysfs.
>
> If we ever wanted to send an attribute, we can do something like
>
> const char *signal;
>
> signal = attribute_to_string(attribute);
> send_kevent(kobj, signal);
> kfree(signal);
>
> Dig that?


I thought you wanted to standardize the signals? Would you have a
delimiter in your signal to have a standard prefix to use to identify
the string in User Space?

Why not add the attribute name to the path you get from the kobj?

send_attribute_kevent(kobj, attr->name, signal);

and

send_kevent(kobj, signal);


Thanks,

Dan

2004-09-02 18:42:13

by Robert Love

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Thu, 2004-09-02 at 11:35 -0700, Daniel Stekloff wrote:

> The only problem I see is making an app sift through all of the events
> to get to a specific type of event. They'd have to parse and match the
> signal, that could be costly.

I'd hope that they were not that many events that it would ever be
costly.

But this should be mitigated by your event aggregator in user-space,
whether that is D-BUS or whatever else. If it is D-BUS, you could then
subscribe via D-BUS only to the events you care about.

> Will there be small single purpose applications listening on the netlink
> socket or off dbus for specific signals? Or do you see this mainly
> handled by a single kevent daemon?

Whatever you want to do.

I think they way we would set up our Linux desktop product would have D-
BUS listening on the kevent socket and propagating the events up the
stack (or have a separate daemon listen and funnel the events to D-BUS.
whatever).

Then applications up the stack would respond to and handle the D-BUS
signals.

Robert Love


2004-09-02 20:38:49

by Kay Sievers

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Thu, 2004-09-02 at 12:27 -0400, Robert Love wrote:
> On Thu, 2004-09-02 at 15:26 +0200, Kay Sievers wrote:
>
> > o What kind of signal do we need? A lazy string, a well defined set like
> > ADD/REMOVE/CHANGE?
> > Or can we get rid of the whole signal? But how can we distinguish between
> > add and remove? Watching if the sysfs file comes or goes is not a option,
> > I think.
>
> I think (from our off-list discussions) that we really need the signal.
> Agreed?
>
> I do think that defining the signal to specific values makes sense, e.g.
> KEVENT_ADD, KEVENT_REMOVE, KEVENT_MOUNTED, etc. We could also send the
> attribute as a string.
>
> To get around the hotplug issue that would occur without 'enum kevent',
> as we discussed, we could have a "hotplug_added" signal or whatever.
> Nothing wrong with that.

Hmm, the threads are a bit mixed up now, I will only reply to this one:

I think we can get rid of the kset at all, but let's see what Greg says
about it. Until that, I assume, we don't need it.

We don't depend on the multicast groups, they are only used cause the
concept was so nice, I think. I don't expect a lot of listeners on the
netlink socket, so we can live without it, if it makes something
easier.

Yes, in our current incarnation of kevent, we need the signal as a
string value to describe what actually happens with this kobject. It is
nice as it would be easy to implement a event, without touching to much
code in the subsystems.
I like that model, if we all agree to go this way. It would look like
Robert's example in the other mail:

int send_kevent(struct kobject *kobj,
const char *signal);


But Greg and Dan have valid points, that this won't work well for the
single attribute case, we possibly should cover, without encoding it in
the signal string.
If we may agree to export data only by creating an attribute in the
kobject instead of the signal "verb", it would look like this:

int send_kevent(struct kobject *kobj,
struct attribute *attr,
const char *signal)

The signal may be well defined as "add|remove|change"(like
the /sbin/hotplug ACTION= value today), which will mandate that we
create any data as attributes (no exception). We may allow NULL as the
attribute and send the event for the whole kobject this way.

For the mount example it would look like this:
send_kevent(bdev->kobj, &attr_owner, "change");

For cpu overheating:
send_kevent(cpu->kobj, &attr_temperature_state, "change");

For hotplug:
send_kevent(cpu->kobj, NULL, "add");
send_kevent(cpu->kobj, NULL, "remove");

This model may be a bit over-ambitious, but in the long run, it may be
the better one, as it doesn't require a signal dictionary and _any_
state is readable at any time from userspace.

It is more similar to our current /sbin/hotplug-notification, we use
e.g. for udev. There we get only the sysfs-path and the action for a
device that gets connected or disconnected.
The second model would conceptually only be a strong sysfs-state-change
notification for kobjects(sysfs-directory) or single kobject-attributes
(sysfs-attribute).

Any thoughts?

Thanks,
Kay


2004-09-04 07:51:51

by Greg KH

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

Sorry, I'm at a conference, so email access is flaky at times...

On Thu, Sep 02, 2004 at 12:25:21PM -0400, Robert Love wrote:
> On Thu, 2004-09-02 at 10:34 +0200, Greg KH wrote:
>
> > Why is the kset needed? We can determine that from the kobject.
> >
> > How about changing this to:
> > int send_kevent(struct kobject *kobj, struct attribute *attr);
> > which just tells userspace that a specific attribute needs to be read,
> > as something "important" has changed.
> >
> > Will passing the attribute name be able to successfully handle the
> > "enum kevent" and "signal" combinations?
>
> We can drop the kset if you say we never need it. Why do all the kobj
> get_path functions take a kset then? That is what confused me.

Look at kobject_hotplug. We can determine the kset assigned to a
kobject with the logic in that function. Then we use the kset to get
the path and other good stuff that the hotplug call needs.

> We can also drop the "enum kevent" if we decide we don't want to take
> advantage of the multicasting of the netlink socket. The enum defines
> what multicast group the netlink message is sent out in. I actually
> have been talking to Kay about ditching it, and we are trying to figure
> out if we ever _need_ it.

Sounds fine to me.

> So that is 2 for 2. But ...
>
> I don't dig replacing the signal string with an attribute. I think it
> will really limit what we can do - having the signal as a verb
> describing the event is really important. We might also not always have
> an attribute. Kay's points are all valid.
>
> So what if we had
>
> int send_kevent(struct kobject *kobj, const char *signal);
>
> Which was a way of notifying user-space of a change of "signal" on the
> object "kobj" in sysfs.

Ok, I'll accept that, and I like it, it's simple, and yet powerful for
pretty much everything we'll need in the future.

In fact, I like that type of function so much, I wrote it a few years
ago:
void kobject_hotplug(const char *action, struct kobject *kobj)

You fell right into my trap :)

So, we're back to the original issue. Why is this kernel event system
different from the hotplug system? I would argue there isn't one,
becides the transport, as you seem to want everything that we currently
provide in the current kobject_hotplug() call.

But transports are important, I agree.

How about you just add the ability to send hotplug calls across netlink?
Make it so the kobject_hotplug() function does both the exec() call, and
a netlink call (based on a config option for those people who like to
configure such stuff.)

That way, programs who want to listen to netlink messages to get hotplug
events do so, and so does programs who want to do the /etc/hotplug.d/
type of notification?

Oh, and attributes. How about we just change kobject_hotplug() to be:
int kobject_hotplug(const char *action, struct kobject *kobj, struct attribute *attr);
and if we set attr to be a valid pointer, we make the DEVPATH paramater
contain the attribute name at the end of it.

I'd be glad to accept a patch that implements this.

Look acceptable?

thanks,

greg k-h

2004-09-05 02:18:58

by Kay Sievers

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

diff -Nru a/fs/super.c b/fs/super.c
--- a/fs/super.c 2004-09-05 03:54:23 +02:00
+++ b/fs/super.c 2004-09-05 03:54:23 +02:00
@@ -35,6 +35,7 @@
#include <linux/vfs.h>
#include <linux/writeback.h> /* for the emergency remount stuff */
#include <linux/idr.h>
+#include <linux/kobj_notify.h>
#include <asm/uaccess.h>


@@ -633,6 +634,16 @@
return (void *)s->s_bdev == data;
}

+static void notify_device_claim(const char *action, struct block_device *bdev)
+{
+ if (bdev->bd_disk) {
+ if (bdev->bd_part)
+ kobj_notify(action, &bdev->bd_part->kobj, NULL);
+ else
+ kobj_notify(action, &bdev->bd_disk->kobj, NULL);
+ }
+}
+
struct super_block *get_sb_bdev(struct file_system_type *fs_type,
int flags, const char *dev_name, void *data,
int (*fill_super)(struct super_block *, void *, int))
@@ -675,8 +686,10 @@
up_write(&s->s_umount);
deactivate_super(s);
s = ERR_PTR(error);
- } else
+ } else {
s->s_flags |= MS_ACTIVE;
+ notify_device_claim("claim", bdev);
+ }
}

return s;
@@ -691,6 +704,8 @@
void kill_block_super(struct super_block *sb)
{
struct block_device *bdev = sb->s_bdev;
+
+ notify_device_claim("release", bdev);
generic_shutdown_super(sb);
set_blocksize(bdev, sb->s_old_blocksize);
close_bdev_excl(bdev);
@@ -717,6 +732,7 @@
return ERR_PTR(error);
}
s->s_flags |= MS_ACTIVE;
+
return s;
}

diff -Nru a/include/linux/kobj_notify.h b/include/linux/kobj_notify.h
--- /dev/null Wed Dec 31 16:00:00 196900
+++ b/include/linux/kobj_notify.h 2004-09-05 03:54:23 +02:00
@@ -0,0 +1,31 @@
+#ifndef _KOBJ_NOTIFY_H_
+#define _KOBJ_NOTIFY_H_
+
+#ifdef __KERNEL__
+#ifdef CONFIG_KOBJ_NOTIFY
+
+extern int kobj_notify(const char *signal, struct kobject *kobj,
+ struct attribute *attr);
+
+extern int kobj_notify_atomic(const char *signal, struct kobject *kobj,
+ struct attribute *attr);
+
+#else
+
+static inline int kobj_notify(const char *signal, struct kobject *kobj,
+ struct attribute *attr)
+{
+ return 0;
+}
+
+static inline int kobj_notify_atomic(const char *signal, struct kobject *kobj,
+ struct attribute *attr)
+{
+ return 0;
+}
+
+
+#endif /* CONFIG_KOBJ_NOTIFY */
+#endif /* __KERNEL__ */
+
+#endif /* _KOBJ_NOTIFY_H_ */
diff -Nru a/include/linux/netlink.h b/include/linux/netlink.h
--- a/include/linux/netlink.h 2004-09-05 03:54:23 +02:00
+++ b/include/linux/netlink.h 2004-09-05 03:54:23 +02:00
@@ -17,6 +17,7 @@
#define NETLINK_ROUTE6 11 /* af_inet6 route comm channel */
#define NETLINK_IP6_FW 13
#define NETLINK_DNRTMSG 14 /* DECnet routing messages */
+#define NETLINK_KOBJ_NOTIFY 15 /* Kernel messages to userspace */
#define NETLINK_TAPBASE 16 /* 16 to 31 are ethertap */

#define MAX_LINKS 32
diff -Nru a/init/Kconfig b/init/Kconfig
--- a/init/Kconfig 2004-09-05 03:54:23 +02:00
+++ b/init/Kconfig 2004-09-05 03:54:23 +02:00
@@ -195,6 +195,21 @@
agent" (/sbin/hotplug) to load modules and set up software needed
to use devices as you hotplug them.

+config KOBJ_NOTIFY
+ bool "Kernel Events Layer"
+ depends on NET && HOTPLUG
+ default y
+ help
+ This option enables the kernel events layer, which is a simple
+ mechanism for kernel-to-user communication over a netlink socket.
+ The goal of the kernel events layer is to provide a simple and
+ efficient events system, that notifies userspace about kobject state
+ changes e.g. hotplug events, power state changes or block device
+ claiming (mount/unmount).
+
+ Say Y, unless you are building a system requiring minimal memory
+ consumption.
+
config IKCONFIG
bool "Kernel .config support"
---help---
diff -Nru a/kernel/Makefile b/kernel/Makefile
--- a/kernel/Makefile 2004-09-05 03:54:23 +02:00
+++ b/kernel/Makefile 2004-09-05 03:54:23 +02:00
@@ -24,6 +24,7 @@
obj-$(CONFIG_AUDIT) += audit.o
obj-$(CONFIG_AUDITSYSCALL) += auditsc.o
obj-$(CONFIG_KPROBES) += kprobes.o
+obj-$(CONFIG_KOBJ_NOTIFY) += kobj_notify.o

ifneq ($(CONFIG_IA64),y)
# According to Alan Modra <[email protected]>, the -fno-omit-frame-pointer is
diff -Nru a/kernel/kobj_notify.c b/kernel/kobj_notify.c
--- /dev/null Wed Dec 31 16:00:00 196900
+++ b/kernel/kobj_notify.c 2004-09-05 03:54:23 +02:00
@@ -0,0 +1,88 @@
+/*
+ * kernel/kobj_notify.c - sysfs event delivery via netlink socket
+ *
+ * Copyright (C) 2004 Red Hat, Inc. All rights reserved.
+ * Copyright (C) 2004 Novell, Inc. All rights reserved.
+ *
+ * Licensed under the GNU GPL v2.
+ *
+ * Authors:
+ * Robert Love <[email protected]>
+ * Kay Sievers <[email protected]>
+ * Arjan van de Ven <[email protected]>
+ */
+
+#include <linux/spinlock.h>
+#include <linux/socket.h>
+#include <linux/skbuff.h>
+#include <linux/netlink.h>
+#include <linux/string.h>
+#include <linux/kobject.h>
+#include <net/sock.h>
+
+static struct sock *kobj_notify_sock = NULL;
+
+static int do_kobj_notify(const char *signal, struct kobject *kobj,
+ struct attribute *attr, int gfp_mask)
+{
+ const char *path;
+ struct sk_buff *skb;
+ char *buffer;
+ int len;
+
+ if (!kobj_notify_sock)
+ return -EIO;
+
+ path = kobject_get_path(NULL, kobj, gfp_mask);
+ if (!path)
+ return -ENOMEM;
+
+ len = strlen(signal) + 1;
+ len += strlen(path) + 1;
+ if (attr)
+ len += strlen(attr->name) + 1;
+
+ skb = alloc_skb(len, gfp_mask);
+ if (!skb)
+ return -ENOMEM;
+
+ buffer = skb_put(skb, len);
+ if (attr)
+ sprintf(buffer, "%s\n%s/%s", signal, path, attr->name);
+ else
+ sprintf(buffer, "%s\n%s", signal, path);
+ kfree(path);
+
+ return netlink_broadcast(kobj_notify_sock, skb, 0, 1, gfp_mask);
+}
+
+int kobj_notify(const char *signal, struct kobject *kobj,
+ struct attribute *attr)
+{
+ return do_kobj_notify(signal, kobj, attr, GFP_KERNEL);
+}
+
+EXPORT_SYMBOL(kobj_notify);
+
+int kobj_notify_atomic(const char *signal, struct kobject *kobj,
+ struct attribute *attr)
+{
+ return do_kobj_notify(signal, kobj, attr, GFP_ATOMIC);
+}
+
+EXPORT_SYMBOL(kobj_notify_atomic);
+
+static int __init kobj_notify_init(void)
+{
+ kobj_notify_sock = netlink_kernel_create(NETLINK_KOBJ_NOTIFY, NULL);
+
+ if (!kobj_notify_sock) {
+ printk(KERN_ERR
+ "kobj_notify: unable to create netlink socket!\n");
+ return -ENODEV;
+ }
+
+ return 0;
+}
+
+core_initcall(kobj_notify_init);
diff -Nru a/lib/kobject.c b/lib/kobject.c
--- a/lib/kobject.c 2004-09-05 03:54:23 +02:00
+++ b/lib/kobject.c 2004-09-05 03:54:23 +02:00
@@ -13,6 +13,7 @@
#undef DEBUG

#include <linux/kobject.h>
+#include <linux/kobj_notify.h>
#include <linux/string.h>
#include <linux/module.h>
#include <linux/stat.h>
@@ -202,6 +203,8 @@
goto exit;
}
}
+
+ kobj_notify(action, kobj, NULL);

pr_debug ("%s: %s %s %s %s %s %s %s\n", __FUNCTION__, argv[0], argv[1],
envp[0], envp[1], envp[2], envp[3], envp[4]);


Attachments:
(No filename) (2.74 kB)
knotify-07.patch (6.58 kB)
Download all attachments

2004-09-05 02:58:15

by Robert Love

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Sat, 2004-09-04 at 02:54 +0200, Greg KH wrote:

> So, we're back to the original issue. Why is this kernel event system
> different from the hotplug system? I would argue there isn't one,
> becides the transport, as you seem to want everything that we currently
> provide in the current kobject_hotplug() call.
>
> But transports are important, I agree.
>
> How about you just add the ability to send hotplug calls across netlink?
> Make it so the kobject_hotplug() function does both the exec() call, and
> a netlink call (based on a config option for those people who like to
> configure such stuff.)

This smells.

Look, I agree that unifying the two ideas and transports as much as
possible is the right way to proceed. But the fact is, as you said,
transports _are_ important. And simply always sending out a hotplug
event _and_ a netlink event is silly and superfluous. We need to make
up our minds.

I don't think anyone argues that netlink makes sense for these low
priority asynchronous events.

I'd prefer to integrate the two approaches as much as possible, but keep
the two transports separate. Use hotplug for hotplug events as we do
now and use kevent, which is over netlink, for the new events we want to
add.

Maybe always do the kevent from the hotplug, but definitely do not do
the hotplug from all kevents. It is redundant and extra overhead.

Doing both simultaneous begs the question of why have both. Picking the
right tool for the job is, well, the right tool for the job.

Robert Love


2004-09-05 03:01:49

by Robert Love

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Sun, 2004-09-05 at 04:18 +0200, Kay Sievers wrote:

> If we add this to the kobject_hotplug function, how do we prevent the
> execution of /sbin/hotplug for ksets that have positive hotplug filters?

Ah, another good point.

> So I've created a new function for now:
> int kobj_notify(const char *signal, struct kobject *kobj, struct attribute *attr)
> which can be used from the subsystems. This function is also called for
> the normal /sbin/hotplug event. (The subsystems may provide a additional
> environment for the /sbin/hotplug events, this is ignored by now.)

This is basically the last patch I posted, with the removel of "enum
kevent" and the addition of struct attribute *attr".

Which is exactly what I want. ;-)

Best,

Robert Love


2004-09-05 03:59:42

by Robert Love

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Sat, 2004-09-04 at 02:54 +0200, Greg KH wrote:

> But transports are important, I agree.

Oh, I have another thought (see my previous email first).

The proper course of action based on your suggestion is to cleanly
abstract the concept of the "backend transport" from the notifier, and
offer a compile-time option of hotplug, netlink, and/or whatever else.
Make the transport pluggable and configurable. Do it cleanly. Yada
yada.

But that is a lot of code and a lot of work. More than I think is
warranted, right?

Accepting that the above is the clean and proper way to do what you say,
let's carry it through. What is the ideal situation? People pick
either hotplug or netlink or foo as their transport. Why pick more than
one? Most people select hotplug because that is there now and works.
Maybe in the future people would choose netlink and move to that. This
is all ideally.

In practice, however, we get people enabling both hotplug and netlink,
because they need hotplug for hotplug and want netlink for the new
kevent stuff. So this approach leads to no one ever picking the ideal.

What we want is people using hotplug for hotplug, and kevent over
netlink for the event stuff.

So why stick the two together?

We have kobject_hotplug() and kobject_notify() and everything makes
sense.

Robert Love


2004-09-05 07:35:44

by Arjan van de Ven

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer


> Look, I agree that unifying the two ideas and transports as much as
> possible is the right way to proceed. But the fact is, as you said,
> transports _are_ important. And simply always sending out a hotplug
> event _and_ a netlink event is silly and superfluous. We need to make
> up our minds.

in addition I consider the 2 *uses* ortogonal. Hotplug historically has
been 1) reasonably heavy weight and 2) concerned with hardware changes.
General events such as say, "thermal throttling started" could of course
be munged into hotplug but to me that is something artificial.
Saying "look ma, the interfaces have the same types, so it has to be one
function" is imo the wrong thing to do. The usage is different (although
I will admit there is an overlapping area). The cost of sending a
message is different.


Attachments:
signature.asc (189.00 B)
This is a digitally signed message part

2004-09-05 12:18:13

by Kay Sievers

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Sat, Sep 04, 2004 at 10:58:08PM -0400, Robert Love wrote:
> On Sat, 2004-09-04 at 02:54 +0200, Greg KH wrote:
>
> > So, we're back to the original issue. Why is this kernel event system
> > different from the hotplug system? I would argue there isn't one,
> > becides the transport, as you seem to want everything that we currently
> > provide in the current kobject_hotplug() call.
> >
> > But transports are important, I agree.
> >
> > How about you just add the ability to send hotplug calls across netlink?
> > Make it so the kobject_hotplug() function does both the exec() call, and
> > a netlink call (based on a config option for those people who like to
> > configure such stuff.)
>
> This smells.
>
> Look, I agree that unifying the two ideas and transports as much as
> possible is the right way to proceed. But the fact is, as you said,
> transports _are_ important. And simply always sending out a hotplug
> event _and_ a netlink event is silly and superfluous. We need to make
> up our minds.
>
> I don't think anyone argues that netlink makes sense for these low
> priority asynchronous events.
>
> I'd prefer to integrate the two approaches as much as possible, but keep
> the two transports separate. Use hotplug for hotplug events as we do
> now and use kevent, which is over netlink, for the new events we want to
> add.
>
> Maybe always do the kevent from the hotplug, but definitely do not do
> the hotplug from all kevents. It is redundant and extra overhead.
>
> Doing both simultaneous begs the question of why have both. Picking the
> right tool for the job is, well, the right tool for the job.

Yes, it doesn't make much sense to pipe the kevents through
/sbin/hotplug, but I definitely want the hotplug-events over netlink,
to get rid of the SEQNUM reorder nightmare and unpredictable delay of
the execution of the /etc/hotplug.d/ helpers, we currently can't handle
very well with HAL.

I expect, that we need to have both transports (for hotplug), cause early
boot is unable to react to netlink messages.
The /sbin/hotplug can easily switched off, after some "advanced event daemon"
is running by: "echo -n "" > /proc/sys/kernel/hotplug".

What about moving the /sbin/hotplug execution from lib/kobject.c to
kernel/kobj_notify.c and merge it with our netlink code? This would
separate the "object-storage" from the "object-notification" which is
nice. And we would have a single place to implement all kind of event
transports.
If anybody invents some other kind of transport it can go into
kobj_notify.c. All transports can share some code and are highly
configurable then.

kernel/kobj_notify.c would export:
kobject_hotplug(const char *action, struct kobject *kobj)
kobj_notify (const char *signal, struct kobject *kobj, struct attribute *attr)

lib/kobject.c just calls kobject_hotplug() and we get the event on both
transports at the same time. All other events just use kobj_notify() which
doesn't do /sbin/hotplug.

How does it sound?

Thanks,
Kay

2004-09-06 02:06:38

by Kay Sievers

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

diff -Nru a/fs/super.c b/fs/super.c
--- a/fs/super.c 2004-09-06 03:47:59 +02:00
+++ b/fs/super.c 2004-09-06 03:47:59 +02:00
@@ -35,6 +35,7 @@
#include <linux/vfs.h>
#include <linux/writeback.h> /* for the emergency remount stuff */
#include <linux/idr.h>
+#include <linux/kobject_uevent.h>
#include <asm/uaccess.h>


@@ -633,6 +634,16 @@
return (void *)s->s_bdev == data;
}

+static void bdev_claim_uevent(const char *action, struct block_device *bdev)
+{
+ if (bdev->bd_disk) {
+ if (bdev->bd_part)
+ kobject_uevent(action, &bdev->bd_part->kobj, NULL);
+ else
+ kobject_uevent(action, &bdev->bd_disk->kobj, NULL);
+ }
+}
+
struct super_block *get_sb_bdev(struct file_system_type *fs_type,
int flags, const char *dev_name, void *data,
int (*fill_super)(struct super_block *, void *, int))
@@ -675,8 +686,10 @@
up_write(&s->s_umount);
deactivate_super(s);
s = ERR_PTR(error);
- } else
+ } else {
s->s_flags |= MS_ACTIVE;
+ bdev_claim_uevent("claim", bdev);
+ }
}

return s;
@@ -691,6 +704,8 @@
void kill_block_super(struct super_block *sb)
{
struct block_device *bdev = sb->s_bdev;
+
+ bdev_claim_uevent("release", bdev);
generic_shutdown_super(sb);
set_blocksize(bdev, sb->s_old_blocksize);
close_bdev_excl(bdev);
diff -Nru a/include/linux/kobject.h b/include/linux/kobject.h
--- a/include/linux/kobject.h 2004-09-06 03:47:59 +02:00
+++ b/include/linux/kobject.h 2004-09-06 03:47:59 +02:00
@@ -61,7 +61,7 @@

extern void kobject_hotplug(const char *action, struct kobject *);

-extern char * kobject_get_path(struct kset *, struct kobject *, int);
+extern char * kobject_get_path(struct kobject *, int);

struct kobj_type {
void (*release)(struct kobject *);
diff -Nru a/include/linux/kobject_uevent.h b/include/linux/kobject_uevent.h
--- /dev/null Wed Dec 31 16:00:00 196900
+++ b/include/linux/kobject_uevent.h 2004-09-06 03:47:59 +02:00
@@ -0,0 +1,39 @@
+#ifndef _KOBJECT_UEVENT_H_
+#define _KOBJECT_UEVENT_H_
+
+#ifdef __KERNEL__
+
+#ifdef CONFIG_HOTPLUG
+/* counter to tag the hotplug event, read only except for the kobject uevent */
+extern u64 hotplug_seqnum;
+extern void kobject_hotplug(const char *action, struct kobject *kobj);
+#else
+static inline void kobject_hotplug(const char *action, struct kobject *kobj)
+{
+ return;
+}
+#endif
+
+
+#ifdef CONFIG_KOBJECT_UEVENT
+extern int kobject_uevent(const char *signal, struct kobject *kobj,
+ struct attribute *attr);
+
+extern int kobject_uevent_atomic(const char *signal, struct kobject *kobj,
+ struct attribute *attr);
+#else
+static inline int kobject_uevent(const char *signal, struct kobject *kobj,
+ struct attribute *attr)
+{
+ return 0;
+}
+
+static inline int kobject_uevent_atomic(const char *signal, struct kobject *kobj,
+ struct attribute *attr)
+{
+ return 0;
+}
+#endif
+
+#endif /* __KERNEL__ */
+#endif /* _KOBJECT_UEVENT_H_ */
diff -Nru a/include/linux/netlink.h b/include/linux/netlink.h
--- a/include/linux/netlink.h 2004-09-06 03:47:59 +02:00
+++ b/include/linux/netlink.h 2004-09-06 03:47:59 +02:00
@@ -17,6 +17,7 @@
#define NETLINK_ROUTE6 11 /* af_inet6 route comm channel */
#define NETLINK_IP6_FW 13
#define NETLINK_DNRTMSG 14 /* DECnet routing messages */
+#define NETLINK_KOBJECT_UEVENT 15 /* Kernel messages to userspace */
#define NETLINK_TAPBASE 16 /* 16 to 31 are ethertap */

#define MAX_LINKS 32
diff -Nru a/init/Kconfig b/init/Kconfig
--- a/init/Kconfig 2004-09-06 03:47:59 +02:00
+++ b/init/Kconfig 2004-09-06 03:47:59 +02:00
@@ -195,6 +195,22 @@
agent" (/sbin/hotplug) to load modules and set up software needed
to use devices as you hotplug them.

+config KOBJECT_UEVENT
+ bool "Kernel Userspace Events"
+ depends on NET && HOTPLUG
+ default y
+ help
+ This option enables the kernel userspace events layer, which is a
+ simple mechanism for kernel-to-user communication over a netlink
+ socket.
+ The goal of the kernel userspace events layer is to provide a simple
+ and efficient events system, that notifies userspace about kobject
+ state changes e.g. hotplug events, power state transitions or block
+ device claiming (mount/unmount).
+
+ Say Y, unless you are building a system requiring minimal memory
+ consumption.
+
config IKCONFIG
bool "Kernel .config support"
---help---
diff -Nru a/kernel/Makefile b/kernel/Makefile
--- a/kernel/Makefile 2004-09-06 03:47:59 +02:00
+++ b/kernel/Makefile 2004-09-06 03:47:59 +02:00
@@ -24,6 +24,7 @@
obj-$(CONFIG_AUDIT) += audit.o
obj-$(CONFIG_AUDITSYSCALL) += auditsc.o
obj-$(CONFIG_KPROBES) += kprobes.o
+obj-$(CONFIG_HOTPLUG) += kobject_uevent.o

ifneq ($(CONFIG_IA64),y)
# According to Alan Modra <[email protected]>, the -fno-omit-frame-pointer is
diff -Nru a/kernel/kobject_uevent.c b/kernel/kobject_uevent.c
--- /dev/null Wed Dec 31 16:00:00 196900
+++ b/kernel/kobject_uevent.c 2004-09-06 03:47:59 +02:00
@@ -0,0 +1,233 @@
+/*
+ * kernel/kobj_uevent.c - kernel userspace event delivery
+ *
+ * Copyright (C) 2004 Red Hat, Inc. All rights reserved.
+ * Copyright (C) 2004 Novell, Inc. All rights reserved.
+ *
+ * Licensed under the GNU GPL v2.
+ *
+ * Authors:
+ * Robert Love <[email protected]>
+ * Kay Sievers <[email protected]>
+ * Arjan van de Ven <[email protected]>
+ */
+
+#include <linux/spinlock.h>
+#include <linux/socket.h>
+#include <linux/skbuff.h>
+#include <linux/netlink.h>
+#include <linux/string.h>
+#include <linux/kobject.h>
+#include <linux/kobject_uevent.h>
+#include <net/sock.h>
+
+#ifdef CONFIG_HOTPLUG
+u64 hotplug_seqnum;
+static spinlock_t sequence_lock = SPIN_LOCK_UNLOCKED;
+#endif
+
+#ifdef CONFIG_KOBJECT_UEVENT
+static struct sock *uevent_sock = NULL;
+#endif
+
+static int send_uevent(const char *signal, const char *obj,
+ const void *buf, int buflen, int gfp_mask)
+{
+ struct sk_buff *skb;
+ char *pos;
+ int len;
+
+ if (!uevent_sock)
+ return -EIO;
+
+ len = strlen(signal) + 1;
+ len += strlen(obj) + 1;
+ len += buflen;
+
+ skb = alloc_skb(len, gfp_mask);
+ if (!skb)
+ return -ENOMEM;
+
+ pos = skb_put(skb, len);
+
+ pos += sprintf(pos, "%s@%s", signal, obj) + 1;
+ memcpy(pos, buf, buflen);
+
+ return netlink_broadcast(uevent_sock, skb, 0, 1, gfp_mask);
+}
+
+static int do_kobject_uevent(const char *signal, struct kobject *kobj,
+ struct attribute *attr, int gfp_mask)
+{
+ char *path;
+ char *attrpath;
+ int len;
+ int rc = -ENOMEM;
+
+ path = kobject_get_path(kobj, gfp_mask);
+ if (!path)
+ return -ENOMEM;
+
+ if (attr) {
+ len = strlen(path);
+ len += strlen(attr->name) + 2;
+ attrpath = kmalloc(len, gfp_mask);
+ if (!attrpath)
+ goto exit;
+ sprintf(attrpath, "%s/%s", path, attr->name);
+ rc = send_uevent(signal, attrpath, NULL, 0, gfp_mask);
+ kfree(attrpath);
+ } else {
+ rc = send_uevent(signal, path, NULL, 0, gfp_mask);
+ }
+
+exit:
+ kfree(path);
+ return rc;
+}
+
+int kobject_uevent(const char *signal, struct kobject *kobj,
+ struct attribute *attr)
+{
+ return do_kobject_uevent(signal, kobj, attr, GFP_KERNEL);
+}
+
+EXPORT_SYMBOL(kobject_uevent);
+
+int kobject_uevent_atomic(const char *signal, struct kobject *kobj,
+ struct attribute *attr)
+{
+ return do_kobject_uevent(signal, kobj, attr, GFP_ATOMIC);
+}
+
+EXPORT_SYMBOL(kobject_uevent_atomic);
+
+
+#define BUFFER_SIZE 1024 /* should be enough memory for the env */
+#define NUM_ENVP 32 /* number of env pointers */
+void kobject_hotplug(const char *action, struct kobject *kobj)
+{
+ char *argv [3];
+ char **envp = NULL;
+ char *buffer = NULL;
+ char *scratch;
+ int i = 0;
+ int retval;
+ char *kobj_path = NULL;
+ char *name = NULL;
+ u64 seq;
+ struct kobject * top_kobj = kobj;
+ struct kset * kset;
+
+ if (!top_kobj->kset && top_kobj->parent) {
+ do {
+ top_kobj = top_kobj->parent;
+ } while (!top_kobj->kset && top_kobj->parent);
+ }
+
+ if (top_kobj->kset && top_kobj->kset->hotplug_ops)
+ kset = top_kobj->kset;
+ else
+ return;
+
+ /* If the kset has a filter operation, call it.
+ Skip the event, if the filter returns zero. */
+ if (kset->hotplug_ops->filter) {
+ if (!kset->hotplug_ops->filter(kset, kobj))
+ return;
+ }
+
+ pr_debug ("%s\n", __FUNCTION__);
+
+ envp = kmalloc(NUM_ENVP * sizeof (char *), GFP_KERNEL);
+ if (!envp)
+ return;
+ memset (envp, 0x00, NUM_ENVP * sizeof (char *));
+
+ buffer = kmalloc(BUFFER_SIZE, GFP_KERNEL);
+ if (!buffer)
+ goto exit;
+
+ if (kset->hotplug_ops->name)
+ name = kset->hotplug_ops->name(kset, kobj);
+ if (name == NULL)
+ name = kset->kobj.name;
+
+ argv [0] = hotplug_path;
+ argv [1] = name;
+ argv [2] = NULL;
+
+ /* minimal command environment */
+ envp [i++] = "HOME=/";
+ envp [i++] = "PATH=/sbin:/bin:/usr/sbin:/usr/bin";
+
+ scratch = buffer;
+
+ envp [i++] = scratch;
+ scratch += sprintf(scratch, "ACTION=%s", action) + 1;
+
+ spin_lock(&sequence_lock);
+ seq = ++hotplug_seqnum;
+ spin_unlock(&sequence_lock);
+
+ kobj_path = kobject_get_path(kobj, GFP_KERNEL);
+ if (!kobj_path)
+ goto exit;
+
+ envp [i++] = scratch;
+ scratch += sprintf (scratch, "DEVPATH=%s", kobj_path) + 1;
+
+ envp [i++] = scratch;
+ scratch += sprintf(scratch, "SEQNUM=%lld", seq) + 1;
+
+ envp [i++] = scratch;
+ scratch += sprintf(scratch, "SUBSYSTEM=%s", name) + 1;
+
+ if (kset->hotplug_ops->hotplug) {
+ /* have the kset specific function add its stuff */
+ retval = kset->hotplug_ops->hotplug (kset, kobj,
+ &envp[i], NUM_ENVP - i, scratch,
+ BUFFER_SIZE - (scratch - buffer));
+ if (retval) {
+ pr_debug ("%s - hotplug() returned %d\n",
+ __FUNCTION__, retval);
+ goto exit;
+ }
+ }
+
+ pr_debug ("%s: %s %s %s %s %s %s %s\n", __FUNCTION__, argv[0], argv[1],
+ envp[0], envp[1], envp[2], envp[3], envp[4]);
+
+ send_uevent(action, kobj_path, buffer, scratch - buffer, GFP_KERNEL);
+
+ if (!hotplug_path[0])
+ goto exit;
+
+ retval = call_usermodehelper (argv[0], argv, envp, 0);
+ if (retval)
+ pr_debug ("%s - call_usermodehelper returned %d\n",
+ __FUNCTION__, retval);
+
+exit:
+ kfree(kobj_path);
+ kfree(buffer);
+ kfree(envp);
+ return;
+}
+
+EXPORT_SYMBOL(kobject_hotplug);
+
+static int __init kobject_uevent_init(void)
+{
+ uevent_sock = netlink_kernel_create(NETLINK_KOBJECT_UEVENT, NULL);
+
+ if (!uevent_sock) {
+ printk(KERN_ERR
+ "kobject_uevent: unable to create netlink socket!\n");
+ return -ENODEV;
+ }
+
+ return 0;
+}
+
+core_initcall(kobject_uevent_init);
diff -Nru a/lib/kobject.c b/lib/kobject.c
--- a/lib/kobject.c 2004-09-06 03:47:59 +02:00
+++ b/lib/kobject.c 2004-09-06 03:47:59 +02:00
@@ -13,6 +13,7 @@
#undef DEBUG

#include <linux/kobject.h>
+#include <linux/kobject_uevent.h>
#include <linux/string.h>
#include <linux/module.h>
#include <linux/stat.h>
@@ -63,7 +64,7 @@
return container_of(entry,struct kobject,entry);
}

-static int get_kobj_path_length(struct kset *kset, struct kobject *kobj)
+static int get_kobj_path_length(struct kobject *kobj)
{
int length = 1;
struct kobject * parent = kobj;
@@ -79,7 +80,7 @@
return length;
}

-static void fill_kobj_path(struct kset *kset, struct kobject *kobj, char *path, int length)
+static void fill_kobj_path(struct kobject *kobj, char *path, int length)
{
struct kobject * parent;

@@ -103,146 +104,21 @@
* @kobj: kobject in question, with which to build the path
* @gfp_mask: the allocation type used to allocate the path
*/
-char * kobject_get_path(struct kset *kset, struct kobject *kobj, int gfp_mask)
+char * kobject_get_path(struct kobject *kobj, int gfp_mask)
{
char *path;
int len;

- len = get_kobj_path_length(kset, kobj);
+ len = get_kobj_path_length(kobj);
path = kmalloc(len, gfp_mask);
if (!path)
return NULL;
memset(path, 0x00, len);
- fill_kobj_path(kset, kobj, path, len);
+ fill_kobj_path(kobj, path, len);

return path;
}

-#ifdef CONFIG_HOTPLUG
-
-#define BUFFER_SIZE 1024 /* should be enough memory for the env */
-#define NUM_ENVP 32 /* number of env pointers */
-static unsigned long sequence_num;
-static spinlock_t sequence_lock = SPIN_LOCK_UNLOCKED;
-
-static void kset_hotplug(const char *action, struct kset *kset,
- struct kobject *kobj)
-{
- char *argv [3];
- char **envp = NULL;
- char *buffer = NULL;
- char *scratch;
- int i = 0;
- int retval;
- char *kobj_path = NULL;
- char *name = NULL;
- unsigned long seq;
-
- /* If the kset has a filter operation, call it. If it returns
- failure, no hotplug event is required. */
- if (kset->hotplug_ops->filter) {
- if (!kset->hotplug_ops->filter(kset, kobj))
- return;
- }
-
- pr_debug ("%s\n", __FUNCTION__);
-
- if (!hotplug_path[0])
- return;
-
- envp = kmalloc(NUM_ENVP * sizeof (char *), GFP_KERNEL);
- if (!envp)
- return;
- memset (envp, 0x00, NUM_ENVP * sizeof (char *));
-
- buffer = kmalloc(BUFFER_SIZE, GFP_KERNEL);
- if (!buffer)
- goto exit;
-
- if (kset->hotplug_ops->name)
- name = kset->hotplug_ops->name(kset, kobj);
- if (name == NULL)
- name = kset->kobj.name;
-
- argv [0] = hotplug_path;
- argv [1] = name;
- argv [2] = NULL;
-
- /* minimal command environment */
- envp [i++] = "HOME=/";
- envp [i++] = "PATH=/sbin:/bin:/usr/sbin:/usr/bin";
-
- scratch = buffer;
-
- envp [i++] = scratch;
- scratch += sprintf(scratch, "ACTION=%s", action) + 1;
-
- spin_lock(&sequence_lock);
- seq = sequence_num++;
- spin_unlock(&sequence_lock);
-
- envp [i++] = scratch;
- scratch += sprintf(scratch, "SEQNUM=%ld", seq) + 1;
-
- kobj_path = kobject_get_path(kset, kobj, GFP_KERNEL);
- if (!kobj_path)
- goto exit;
-
- envp [i++] = scratch;
- scratch += sprintf (scratch, "DEVPATH=%s", kobj_path) + 1;
-
- if (kset->hotplug_ops->hotplug) {
- /* have the kset specific function add its stuff */
- retval = kset->hotplug_ops->hotplug (kset, kobj,
- &envp[i], NUM_ENVP - i, scratch,
- BUFFER_SIZE - (scratch - buffer));
- if (retval) {
- pr_debug ("%s - hotplug() returned %d\n",
- __FUNCTION__, retval);
- goto exit;
- }
- }
-
- pr_debug ("%s: %s %s %s %s %s %s %s\n", __FUNCTION__, argv[0], argv[1],
- envp[0], envp[1], envp[2], envp[3], envp[4]);
- retval = call_usermodehelper (argv[0], argv, envp, 0);
- if (retval)
- pr_debug ("%s - call_usermodehelper returned %d\n",
- __FUNCTION__, retval);
-
-exit:
- kfree(kobj_path);
- kfree(buffer);
- kfree(envp);
- return;
-}
-
-void kobject_hotplug(const char *action, struct kobject *kobj)
-{
- struct kobject * top_kobj = kobj;
-
- /* If this kobj does not belong to a kset,
- try to find a parent that does. */
- if (!top_kobj->kset && top_kobj->parent) {
- do {
- top_kobj = top_kobj->parent;
- } while (!top_kobj->kset && top_kobj->parent);
- }
-
- if (top_kobj->kset && top_kobj->kset->hotplug_ops)
- kset_hotplug(action, top_kobj->kset, kobj);
-}
-#else
-void kobject_hotplug(const char *action, struct kobject *kobj)
-{
- return;
-}
-#endif /* CONFIG_HOTPLUG */
-
-/**
- * kobject_init - initialize object.
- * @kobj: object in question.
- */
void kobject_init(struct kobject * kobj)
{
kref_init(&kobj->kref);
@@ -654,7 +530,6 @@
EXPORT_SYMBOL(kobject_add);
EXPORT_SYMBOL(kobject_del);
EXPORT_SYMBOL(kobject_rename);
-EXPORT_SYMBOL(kobject_hotplug);

EXPORT_SYMBOL(kset_register);
EXPORT_SYMBOL(kset_unregister);


Attachments:
(No filename) (3.35 kB)
kuevents-01.patch (14.76 kB)
Download all attachments

2004-09-10 23:55:41

by Greg KH

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

Ok, thanks to a respin of the patch by Kay, and some further tweaks by
me to make the patch a bit smaller (and make the functions
EXPORT_SYMBOL_GPL, Robert and Kay, speak up if you don't like that
change), I've commited the following patch to my trees, and it should
show up in the next -mm release

(Andrew, you can drop your other patch in your stack that contained a
version of this.)

thanks,

greg k-h

----------------------
Subject: Kobject Userspace Event Notification

Implemetation of userspace events through a netlink socket. The kernel events
layer provides the functionality to raise an event from a given kobject
represented by its sysfs-path and a signal string to describe the type of
event.

Currently, kobject additions and removals are signalized to userspace by forking
the /sbin/hotplug helper. This patch moves this special case of userspace-event
out of the kobject core to the new kobject_uevent implementation. This makes it
possible to send all hotplug messages also through the new netlink transport.

Possible new users of the kernel userspace functionality are filesystem
mount events (block device claim/release) or simple device state transitions
(cpu overheating).

To send an event, the user needs to pass the kobject, a optional
sysfs-attribute and the signal string to the following function:

kobject_uevent(const char *signal,
struct kobject *kobj,
struct attribute *attr)

Example:
kobject_uevent("overheating", &cpu->kobj, NULL);

The message itself is sent over multicast netlink socket, which makes
it possible for userspace to listen with multiple applications for the same
messages.

Signed-off-by: Robert Love <[email protected]>
Signed-off-by: Kay Sievers <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

diff -Nru a/include/linux/kobject.h b/include/linux/kobject.h
--- a/include/linux/kobject.h 2004-09-10 16:45:59 -07:00
+++ b/include/linux/kobject.h 2004-09-10 16:45:59 -07:00
@@ -62,9 +62,7 @@
extern struct kobject * kobject_get(struct kobject *);
extern void kobject_put(struct kobject *);

-extern void kobject_hotplug(const char *action, struct kobject *);
-
-extern char * kobject_get_path(struct kset *, struct kobject *, int);
+extern char * kobject_get_path(struct kobject *, int);

struct kobj_type {
void (*release)(struct kobject *);
@@ -236,6 +234,34 @@

extern int subsys_create_file(struct subsystem * , struct subsys_attribute *);
extern void subsys_remove_file(struct subsystem * , struct subsys_attribute *);
+
+
+#ifdef CONFIG_HOTPLUG
+extern void kobject_hotplug(const char *action, struct kobject *kobj);
+#else
+static inline void kobject_hotplug(const char *action, struct kobject *kobj) { }
+#endif
+
+
+#ifdef CONFIG_KOBJECT_UEVENT
+extern int kobject_uevent(const char *signal, struct kobject *kobj,
+ struct attribute *attr);
+
+extern int kobject_uevent_atomic(const char *signal, struct kobject *kobj,
+ struct attribute *attr);
+#else
+static inline int kobject_uevent(const char *signal, struct kobject *kobj,
+ struct attribute *attr)
+{
+ return 0;
+}
+
+static inline int kobject_uevent_atomic(const char *signal, struct kobject *kobj,
+ struct attribute *attr)
+{
+ return 0;
+}
+#endif

#endif /* __KERNEL__ */
#endif /* _KOBJECT_H_ */
diff -Nru a/include/linux/netlink.h b/include/linux/netlink.h
--- a/include/linux/netlink.h 2004-09-10 16:45:59 -07:00
+++ b/include/linux/netlink.h 2004-09-10 16:45:59 -07:00
@@ -17,6 +17,7 @@
#define NETLINK_ROUTE6 11 /* af_inet6 route comm channel */
#define NETLINK_IP6_FW 13
#define NETLINK_DNRTMSG 14 /* DECnet routing messages */
+#define NETLINK_KOBJECT_UEVENT 15 /* Kernel messages to userspace */
#define NETLINK_TAPBASE 16 /* 16 to 31 are ethertap */

#define MAX_LINKS 32
diff -Nru a/init/Kconfig b/init/Kconfig
--- a/init/Kconfig 2004-09-10 16:45:59 -07:00
+++ b/init/Kconfig 2004-09-10 16:45:59 -07:00
@@ -195,6 +195,25 @@
agent" (/sbin/hotplug) to load modules and set up software needed
to use devices as you hotplug them.

+config KOBJECT_UEVENT
+ bool "Kernel Userspace Events"
+ depends on NET
+ default y
+ help
+ This option enables the kernel userspace event layer, which is a
+ simple mechanism for kernel-to-user communication over a netlink
+ socket.
+ The goal of the kernel userspace events layer is to provide a simple
+ and efficient events system, that notifies userspace about kobject
+ state changes. This will enable applications to just listen for
+ events instead of polling system devices and files.
+ Hotplug events (kobject addition and removal) are also available on
+ the netlink socket in addition to the execution of /sbin/hotplug if
+ CONFIG_HOTPLUG is enabled.
+
+ Say Y, unless you are building a system requiring minimal memory
+ consumption.
+
config IKCONFIG
bool "Kernel .config support"
---help---
diff -Nru a/lib/Makefile b/lib/Makefile
--- a/lib/Makefile 2004-09-10 16:45:59 -07:00
+++ b/lib/Makefile 2004-09-10 16:45:59 -07:00
@@ -6,7 +6,7 @@
lib-y := errno.o ctype.o string.o vsprintf.o cmdline.o \
bust_spinlocks.o rbtree.o radix-tree.o dump_stack.o \
kobject.o kref.o idr.o div64.o parser.o int_sqrt.o \
- bitmap.o extable.o
+ bitmap.o extable.o kobject_uevent.o

lib-$(CONFIG_RWSEM_GENERIC_SPINLOCK) += rwsem-spinlock.o
lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem.o
diff -Nru a/lib/kobject.c b/lib/kobject.c
--- a/lib/kobject.c 2004-09-10 16:45:59 -07:00
+++ b/lib/kobject.c 2004-09-10 16:45:59 -07:00
@@ -63,7 +63,7 @@
return container_of(entry,struct kobject,entry);
}

-static int get_kobj_path_length(struct kset *kset, struct kobject *kobj)
+static int get_kobj_path_length(struct kobject *kobj)
{
int length = 1;
struct kobject * parent = kobj;
@@ -79,7 +79,7 @@
return length;
}

-static void fill_kobj_path(struct kset *kset, struct kobject *kobj, char *path, int length)
+static void fill_kobj_path(struct kobject *kobj, char *path, int length)
{
struct kobject * parent;

@@ -99,146 +99,24 @@
* kobject_get_path - generate and return the path associated with a given kobj
* and kset pair. The result must be freed by the caller with kfree().
*
- * @kset: kset in question, with which to build the path
* @kobj: kobject in question, with which to build the path
* @gfp_mask: the allocation type used to allocate the path
*/
-char * kobject_get_path(struct kset *kset, struct kobject *kobj, int gfp_mask)
+char *kobject_get_path(struct kobject *kobj, int gfp_mask)
{
char *path;
int len;

- len = get_kobj_path_length(kset, kobj);
+ len = get_kobj_path_length(kobj);
path = kmalloc(len, gfp_mask);
if (!path)
return NULL;
memset(path, 0x00, len);
- fill_kobj_path(kset, kobj, path, len);
+ fill_kobj_path(kobj, path, len);

return path;
}

-#ifdef CONFIG_HOTPLUG
-
-u64 hotplug_seqnum;
-#define BUFFER_SIZE 1024 /* should be enough memory for the env */
-#define NUM_ENVP 32 /* number of env pointers */
-static spinlock_t sequence_lock = SPIN_LOCK_UNLOCKED;
-
-static void kset_hotplug(const char *action, struct kset *kset,
- struct kobject *kobj)
-{
- char *argv [3];
- char **envp = NULL;
- char *buffer = NULL;
- char *scratch;
- int i = 0;
- int retval;
- char *kobj_path = NULL;
- char *name = NULL;
- unsigned long seq;
-
- /* If the kset has a filter operation, call it. If it returns
- failure, no hotplug event is required. */
- if (kset->hotplug_ops->filter) {
- if (!kset->hotplug_ops->filter(kset, kobj))
- return;
- }
-
- pr_debug ("%s\n", __FUNCTION__);
-
- if (!hotplug_path[0])
- return;
-
- envp = kmalloc(NUM_ENVP * sizeof (char *), GFP_KERNEL);
- if (!envp)
- return;
- memset (envp, 0x00, NUM_ENVP * sizeof (char *));
-
- buffer = kmalloc(BUFFER_SIZE, GFP_KERNEL);
- if (!buffer)
- goto exit;
-
- if (kset->hotplug_ops->name)
- name = kset->hotplug_ops->name(kset, kobj);
- if (name == NULL)
- name = kset->kobj.name;
-
- argv [0] = hotplug_path;
- argv [1] = name;
- argv [2] = NULL;
-
- /* minimal command environment */
- envp [i++] = "HOME=/";
- envp [i++] = "PATH=/sbin:/bin:/usr/sbin:/usr/bin";
-
- scratch = buffer;
-
- envp [i++] = scratch;
- scratch += sprintf(scratch, "ACTION=%s", action) + 1;
-
- spin_lock(&sequence_lock);
- seq = ++hotplug_seqnum;
- spin_unlock(&sequence_lock);
-
- envp [i++] = scratch;
- scratch += sprintf(scratch, "SEQNUM=%ld", seq) + 1;
-
- kobj_path = kobject_get_path(kset, kobj, GFP_KERNEL);
- if (!kobj_path)
- goto exit;
-
- envp [i++] = scratch;
- scratch += sprintf (scratch, "DEVPATH=%s", kobj_path) + 1;
-
- if (kset->hotplug_ops->hotplug) {
- /* have the kset specific function add its stuff */
- retval = kset->hotplug_ops->hotplug (kset, kobj,
- &envp[i], NUM_ENVP - i, scratch,
- BUFFER_SIZE - (scratch - buffer));
- if (retval) {
- pr_debug ("%s - hotplug() returned %d\n",
- __FUNCTION__, retval);
- goto exit;
- }
- }
-
- pr_debug ("%s: %s %s %s %s %s %s %s\n", __FUNCTION__, argv[0], argv[1],
- envp[0], envp[1], envp[2], envp[3], envp[4]);
- retval = call_usermodehelper (argv[0], argv, envp, 0);
- if (retval)
- pr_debug ("%s - call_usermodehelper returned %d\n",
- __FUNCTION__, retval);
-
-exit:
- kfree(kobj_path);
- kfree(buffer);
- kfree(envp);
- return;
-}
-
-void kobject_hotplug(const char *action, struct kobject *kobj)
-{
- struct kobject * top_kobj = kobj;
-
- /* If this kobj does not belong to a kset,
- try to find a parent that does. */
- if (!top_kobj->kset && top_kobj->parent) {
- do {
- top_kobj = top_kobj->parent;
- } while (!top_kobj->kset && top_kobj->parent);
- }
-
- if (top_kobj->kset && top_kobj->kset->hotplug_ops)
- kset_hotplug(action, top_kobj->kset, kobj);
-}
-#else
-void kobject_hotplug(const char *action, struct kobject *kobj)
-{
- return;
-}
-#endif /* CONFIG_HOTPLUG */
-
/**
* kobject_init - initialize object.
* @kobj: object in question.
@@ -654,7 +532,6 @@
EXPORT_SYMBOL(kobject_add);
EXPORT_SYMBOL(kobject_del);
EXPORT_SYMBOL(kobject_rename);
-EXPORT_SYMBOL(kobject_hotplug);

EXPORT_SYMBOL(kset_register);
EXPORT_SYMBOL(kset_unregister);
diff -Nru a/lib/kobject_uevent.c b/lib/kobject_uevent.c
--- /dev/null Wed Dec 31 16:00:00 196900
+++ b/lib/kobject_uevent.c 2004-09-10 16:45:59 -07:00
@@ -0,0 +1,264 @@
+/*
+ * kernel userspace event delivery
+ *
+ * Copyright (C) 2004 Red Hat, Inc. All rights reserved.
+ * Copyright (C) 2004 Novell, Inc. All rights reserved.
+ * Copyright (C) 2004 IBM, Inc. All rights reserved.
+ *
+ * Licensed under the GNU GPL v2.
+ *
+ * Authors:
+ * Robert Love <[email protected]>
+ * Kay Sievers <[email protected]>
+ * Arjan van de Ven <[email protected]>
+ * Greg Kroah-Hartman <[email protected]>
+ */
+
+#include <linux/spinlock.h>
+#include <linux/socket.h>
+#include <linux/skbuff.h>
+#include <linux/netlink.h>
+#include <linux/string.h>
+#include <linux/kobject.h>
+#include <net/sock.h>
+
+#ifdef CONFIG_KOBJECT_UEVENT
+static struct sock *uevent_sock;
+
+/**
+ * send_uevent - notify userspace by sending event trough netlink socket
+ *
+ * @signal: signal name
+ * @obj: object path (kobject)
+ * @buf: buffer used to pass auxiliary data like the hotplug environment
+ * @buflen:
+ * gfp_mask:
+ */
+static int send_uevent(const char *signal, const char *obj, const void *buf,
+ int buflen, int gfp_mask)
+{
+ struct sk_buff *skb;
+ char *pos;
+ int len;
+
+ if (!uevent_sock)
+ return -EIO;
+
+ len = strlen(signal) + 1;
+ len += strlen(obj) + 1;
+ len += buflen;
+
+ skb = alloc_skb(len, gfp_mask);
+ if (!skb)
+ return -ENOMEM;
+
+ pos = skb_put(skb, len);
+
+ pos += sprintf(pos, "%s@%s", signal, obj) + 1;
+ memcpy(pos, buf, buflen);
+
+ return netlink_broadcast(uevent_sock, skb, 0, 1, gfp_mask);
+}
+
+static int do_kobject_uevent(const char *signal, struct kobject *kobj,
+ struct attribute *attr, int gfp_mask)
+{
+ char *path;
+ char *attrpath;
+ int len;
+ int rc = -ENOMEM;
+
+ path = kobject_get_path(kobj, gfp_mask);
+ if (!path)
+ return -ENOMEM;
+
+ if (attr) {
+ len = strlen(path);
+ len += strlen(attr->name) + 2;
+ attrpath = kmalloc(len, gfp_mask);
+ if (!attrpath)
+ goto exit;
+ sprintf(attrpath, "%s/%s", path, attr->name);
+ rc = send_uevent(signal, attrpath, NULL, 0, gfp_mask);
+ kfree(attrpath);
+ } else {
+ rc = send_uevent(signal, path, NULL, 0, gfp_mask);
+ }
+
+exit:
+ kfree(path);
+ return rc;
+}
+
+/**
+ * kobject_uevent - notify userspace by sending event through netlink socket
+ *
+ * @signal: signal name
+ * @kobj: struct kobject that the event is happening to
+ * @attr: optional struct attribute the event belongs to
+ */
+int kobject_uevent(const char *signal, struct kobject *kobj,
+ struct attribute *attr)
+{
+ return do_kobject_uevent(signal, kobj, attr, GFP_KERNEL);
+}
+EXPORT_SYMBOL_GPL(kobject_uevent);
+
+int kobject_uevent_atomic(const char *signal, struct kobject *kobj,
+ struct attribute *attr)
+{
+ return do_kobject_uevent(signal, kobj, attr, GFP_ATOMIC);
+}
+
+EXPORT_SYMBOL_GPL(kobject_uevent_atomic);
+
+static int __init kobject_uevent_init(void)
+{
+ uevent_sock = netlink_kernel_create(NETLINK_KOBJECT_UEVENT, NULL);
+
+ if (!uevent_sock) {
+ printk(KERN_ERR
+ "kobject_uevent: unable to create netlink socket!\n");
+ return -ENODEV;
+ }
+
+ return 0;
+}
+
+core_initcall(kobject_uevent_init);
+
+#else
+static inline int send_uevent(const char *signal, const char *obj,
+ const void *buf, int buflen, int gfp_mask)
+{
+ return 0;
+}
+
+#endif /* CONFIG_KOBJECT_UEVENT */
+
+
+#ifdef CONFIG_HOTPLUG
+u64 hotplug_seqnum;
+static spinlock_t sequence_lock = SPIN_LOCK_UNLOCKED;
+
+#define BUFFER_SIZE 1024 /* should be enough memory for the env */
+#define NUM_ENVP 32 /* number of env pointers */
+/**
+ * kobject_hotplug - notify userspace by executing /sbin/hotplug
+ *
+ * @action: action that is happening (usually "ADD" or "REMOVE")
+ * @kobj: struct kobject that the action is happening to
+ */
+void kobject_hotplug(const char *action, struct kobject *kobj)
+{
+ char *argv [3];
+ char **envp = NULL;
+ char *buffer = NULL;
+ char *scratch;
+ int i = 0;
+ int retval;
+ char *kobj_path = NULL;
+ char *name = NULL;
+ u64 seq;
+ struct kobject *top_kobj = kobj;
+ struct kset *kset;
+
+ if (!top_kobj->kset && top_kobj->parent) {
+ do {
+ top_kobj = top_kobj->parent;
+ } while (!top_kobj->kset && top_kobj->parent);
+ }
+
+ if (top_kobj->kset && top_kobj->kset->hotplug_ops)
+ kset = top_kobj->kset;
+ else
+ return;
+
+ /* If the kset has a filter operation, call it.
+ Skip the event, if the filter returns zero. */
+ if (kset->hotplug_ops->filter) {
+ if (!kset->hotplug_ops->filter(kset, kobj))
+ return;
+ }
+
+ pr_debug ("%s\n", __FUNCTION__);
+
+ envp = kmalloc(NUM_ENVP * sizeof (char *), GFP_KERNEL);
+ if (!envp)
+ return;
+ memset (envp, 0x00, NUM_ENVP * sizeof (char *));
+
+ buffer = kmalloc(BUFFER_SIZE, GFP_KERNEL);
+ if (!buffer)
+ goto exit;
+
+ if (kset->hotplug_ops->name)
+ name = kset->hotplug_ops->name(kset, kobj);
+ if (name == NULL)
+ name = kset->kobj.name;
+
+ argv [0] = hotplug_path;
+ argv [1] = name;
+ argv [2] = NULL;
+
+ /* minimal command environment */
+ envp [i++] = "HOME=/";
+ envp [i++] = "PATH=/sbin:/bin:/usr/sbin:/usr/bin";
+
+ scratch = buffer;
+
+ envp [i++] = scratch;
+ scratch += sprintf(scratch, "ACTION=%s", action) + 1;
+
+ kobj_path = kobject_get_path(kobj, GFP_KERNEL);
+ if (!kobj_path)
+ goto exit;
+
+ envp [i++] = scratch;
+ scratch += sprintf (scratch, "DEVPATH=%s", kobj_path) + 1;
+
+ spin_lock(&sequence_lock);
+ seq = ++hotplug_seqnum;
+ spin_unlock(&sequence_lock);
+
+ envp [i++] = scratch;
+ scratch += sprintf(scratch, "SEQNUM=%lld", seq) + 1;
+
+ envp [i++] = scratch;
+ scratch += sprintf(scratch, "SUBSYSTEM=%s", name) + 1;
+
+ if (kset->hotplug_ops->hotplug) {
+ /* have the kset specific function add its stuff */
+ retval = kset->hotplug_ops->hotplug (kset, kobj,
+ &envp[i], NUM_ENVP - i, scratch,
+ BUFFER_SIZE - (scratch - buffer));
+ if (retval) {
+ pr_debug ("%s - hotplug() returned %d\n",
+ __FUNCTION__, retval);
+ goto exit;
+ }
+ }
+
+ pr_debug ("%s: %s %s %s %s %s %s %s\n", __FUNCTION__, argv[0], argv[1],
+ envp[0], envp[1], envp[2], envp[3], envp[4]);
+
+ send_uevent(action, kobj_path, buffer, scratch - buffer, GFP_KERNEL);
+
+ if (!hotplug_path[0])
+ goto exit;
+
+ retval = call_usermodehelper (argv[0], argv, envp, 0);
+ if (retval)
+ pr_debug ("%s - call_usermodehelper returned %d\n",
+ __FUNCTION__, retval);
+
+exit:
+ kfree(kobj_path);
+ kfree(buffer);
+ kfree(envp);
+ return;
+}
+EXPORT_SYMBOL(kobject_hotplug);
+#endif /* CONFIG_HOTPLUG */
+
+

2004-09-11 00:19:12

by Tim Hockin

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Fri, Sep 10, 2004 at 04:54:09PM -0700, Greg KH wrote:
> To send an event, the user needs to pass the kobject, a optional
> sysfs-attribute and the signal string to the following function:
>
> kobject_uevent(const char *signal,
> struct kobject *kobj,
> struct attribute *attr)

Sorry I missed the flare up of this topic. What about events for which
there is no associated kobject?

why is the kobject argument not first? Seems weird..

What happened to a formatted string argument? The signal argument can
become the pre-formatted string, and someone can provide a wrapper
that takes a printf() like format and args.
kobject_uevent_printf(kobj, "something bad: 0x%08x", err);

Tim

2004-09-11 00:49:08

by Greg KH

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Fri, Sep 10, 2004 at 05:18:49PM -0700, Tim Hockin wrote:
> On Fri, Sep 10, 2004 at 04:54:09PM -0700, Greg KH wrote:
> > To send an event, the user needs to pass the kobject, a optional
> > sysfs-attribute and the signal string to the following function:
> >
> > kobject_uevent(const char *signal,
> > struct kobject *kobj,
> > struct attribute *attr)
>
> Sorry I missed the flare up of this topic. What about events for which
> there is no associated kobject?

Tough, no event for them :)

> why is the kobject argument not first? Seems weird..

Yeah, it is a bit "odd", but it follows my old kobject_hotplug() way.

> What happened to a formatted string argument? The signal argument can
> become the pre-formatted string, and someone can provide a wrapper
> that takes a printf() like format and args.
> kobject_uevent_printf(kobj, "something bad: 0x%08x", err);

Use an attribute, and have userspace read that formatted argument if
need be. This keeps the kernel interface much simpler, and doesn't
allow you to abuse it for things it is not intended for (like error
reporting stuff...)

thanks,

greg k-h

2004-09-11 01:24:52

by Daniel Stekloff

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Fri, 2004-09-10 at 17:48, Greg KH wrote:
> On Fri, Sep 10, 2004 at 05:18:49PM -0700, Tim Hockin wrote:
> > On Fri, Sep 10, 2004 at 04:54:09PM -0700, Greg KH wrote:
> > > To send an event, the user needs to pass the kobject, a optional
> > > sysfs-attribute and the signal string to the following function:
> > >
> > > kobject_uevent(const char *signal,
> > > struct kobject *kobj,
> > > struct attribute *attr)
> >
> > Sorry I missed the flare up of this topic. What about events for which
> > there is no associated kobject?
>
> Tough, no event for them :)
>
> > why is the kobject argument not first? Seems weird..
>
> Yeah, it is a bit "odd", but it follows my old kobject_hotplug() way.
>
> > What happened to a formatted string argument? The signal argument can
> > become the pre-formatted string, and someone can provide a wrapper
> > that takes a printf() like format and args.
> > kobject_uevent_printf(kobj, "something bad: 0x%08x", err);
>
> Use an attribute, and have userspace read that formatted argument if
> need be. This keeps the kernel interface much simpler, and doesn't
> allow you to abuse it for things it is not intended for (like error
> reporting stuff...)


Not to be cheeky, but cpu "overheating" isn't an error event? <grin>



2004-09-11 01:46:42

by Tim Hockin

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Fri, Sep 10, 2004 at 05:48:27PM -0700, Greg KH wrote:
> need be. This keeps the kernel interface much simpler, and doesn't
> allow you to abuse it for things it is not intended for (like error
> reporting stuff...)

Errm, not for error reporting? So the "driver hardening" and fault
logging people shouldn't use this?

2004-09-11 04:08:38

by Robert Love

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Fri, 2004-09-10 at 16:54 -0700, Greg KH wrote:
> Ok, thanks to a respin of the patch by Kay, and some further tweaks by
> me to make the patch a bit smaller (and make the functions
> EXPORT_SYMBOL_GPL, Robert and Kay, speak up if you don't like that
> change), I've commited the following patch to my trees, and it should
> show up in the next -mm release

Greg, looks good to me. Thank you much for picking this up.

I am currently out of town (foo camp) so I cannot really test it, but I
did a quick patch review and it looks right on.

> (Andrew, you can drop your other patch in your stack that contained a
> version of this.)

Nod.

> +/**
> + * send_uevent - notify userspace by sending event trough netlink socket

s/trough/through/ ;-)

...

Kay, any thoughts?

We should probably add at least _some_ user. The filesystem mount
events are good, since we want to add those to HAL.

Again, thanks.

Robert Love


2004-09-11 04:44:15

by Robert Love

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Fri, 2004-09-10 at 18:23 -0700, Daniel Stekloff wrote:

> Not to be cheeky, but cpu "overheating" isn't an error event? <grin>

I think the key importance is event vs. error. E.g., "overheating" is a
state, something that can be handled in user-space. A driver piping out
"error 501" is, well, something different.

Personally, I am not against using kevent for driver events/errors, but
the exact scope needs to be decided by the community. You do need a
kevent, though.

Robert Love


2004-09-11 11:36:51

by Dave Jones

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Fri, Sep 10, 2004 at 05:48:27PM -0700, Greg KH wrote:

> > What happened to a formatted string argument? The signal argument can
> > become the pre-formatted string, and someone can provide a wrapper
> > that takes a printf() like format and args.
> > kobject_uevent_printf(kobj, "something bad: 0x%08x", err);
>
> Use an attribute, and have userspace read that formatted argument if
> need be. This keeps the kernel interface much simpler, and doesn't
> allow you to abuse it for things it is not intended for (like error
> reporting stuff...)

Erm, no. This will just encourage folks to sprintf to a buffer first
and pass the result to kobject_uevent_printf().

nitpick: Also, if this isn't taking formatted input, shouldn't the name of the
function lose the 'f' ?

Dave

2004-09-11 16:54:44

by Greg KH

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Sat, Sep 11, 2004 at 12:09:35AM -0400, Robert Love wrote:
> > +/**
> > + * send_uevent - notify userspace by sending event trough netlink socket
>
> s/trough/through/ ;-)

Heh, give something for the "spelling nit-pickers" to submit a patch
against :)

> We should probably add at least _some_ user. The filesystem mount
> events are good, since we want to add those to HAL.

True, anyone want to send me a patch with a user of this?

thanks,

greg k-h

2004-09-11 16:58:30

by Greg KH

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Fri, Sep 10, 2004 at 06:45:43PM -0700, Tim Hockin wrote:
> On Fri, Sep 10, 2004 at 05:48:27PM -0700, Greg KH wrote:
> > need be. This keeps the kernel interface much simpler, and doesn't
> > allow you to abuse it for things it is not intended for (like error
> > reporting stuff...)
>
> Errm, not for error reporting? So the "driver hardening" and fault
> logging people shouldn't use this?

The "driver hardening" people hopefully have been scared away from Linux
so they never show their face around here again. And if they do, no,
this is not what they want to do (what they need to do is get a clue...)

The fault logging people should also not use this, as this is for
notifying userspace that an event has happened. And yes, "overheat" is
an example of a event that some people consider a "fault" :)

thanks,

greg k-h

2004-09-11 18:16:50

by Greg KH

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Sat, Sep 11, 2004 at 12:35:25PM +0100, Dave Jones wrote:
> On Fri, Sep 10, 2004 at 05:48:27PM -0700, Greg KH wrote:
>
> > > What happened to a formatted string argument? The signal argument can
> > > become the pre-formatted string, and someone can provide a wrapper
> > > that takes a printf() like format and args.
> > > kobject_uevent_printf(kobj, "something bad: 0x%08x", err);
> >
> > Use an attribute, and have userspace read that formatted argument if
> > need be. This keeps the kernel interface much simpler, and doesn't
> > allow you to abuse it for things it is not intended for (like error
> > reporting stuff...)
>
> Erm, no. This will just encourage folks to sprintf to a buffer first
> and pass the result to kobject_uevent_printf().

Yeah, I agree. I think we need to standardize on the "events", and make
it hard to misspell them. I'll work on that on Monday.

> nitpick: Also, if this isn't taking formatted input, shouldn't the name of the
> function lose the 'f' ?

The function name is kobject_uevent(), no 'f' in sight :)

thanks,

greg k-h

2004-09-13 14:47:54

by Kay Sievers

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

Kobject Userspace Event Notification for blockdevice mount and unmount events

Send notification over the new netlink socket to let userspace know that
the filesystem code claims/releases the superblock on an blockdevice.
This way, userspace can get rid of constantly polling /proc/mounts to
watch for filesystem changes.

Signed-off-by: Kay Sievers <[email protected]>
---
===== fs/super.c 1.122 vs edited =====
--- 1.122/fs/super.c 2004-07-11 11:23:26 +02:00
+++ edited/fs/super.c 2004-09-13 16:16:36 +02:00
@@ -35,6 +35,7 @@
#include <linux/vfs.h>
#include <linux/writeback.h> /* for the emergency remount stuff */
#include <linux/idr.h>
+#include <linux/kobject.h>
#include <asm/uaccess.h>


@@ -633,6 +634,16 @@ static int test_bdev_super(struct super_
return (void *)s->s_bdev == data;
}

+static void bdev_uevent(const char *action, struct block_device *bdev)
+{
+ if (bdev->bd_disk) {
+ if (bdev->bd_part)
+ kobject_uevent(action, &bdev->bd_part->kobj, NULL);
+ else
+ kobject_uevent(action, &bdev->bd_disk->kobj, NULL);
+ }
+}
+
struct super_block *get_sb_bdev(struct file_system_type *fs_type,
int flags, const char *dev_name, void *data,
int (*fill_super)(struct super_block *, void *, int))
@@ -675,8 +686,10 @@ struct super_block *get_sb_bdev(struct f
up_write(&s->s_umount);
deactivate_super(s);
s = ERR_PTR(error);
- } else
+ } else {
s->s_flags |= MS_ACTIVE;
+ bdev_uevent("mount", bdev);
+ }
}

return s;
@@ -691,6 +704,8 @@ EXPORT_SYMBOL(get_sb_bdev);
void kill_block_super(struct super_block *sb)
{
struct block_device *bdev = sb->s_bdev;
+
+ bdev_uevent("umount", bdev);
generic_shutdown_super(sb);
set_blocksize(bdev, sb->s_old_blocksize);
close_bdev_excl(bdev);


Attachments:
(No filename) (903.00 B)
uevent-fs-01.patch (1.70 kB)
Download all attachments

2004-09-15 00:08:42

by Greg KH

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Mon, Sep 13, 2004 at 04:45:53PM +0200, Kay Sievers wrote:
> On Sat, Sep 11, 2004 at 09:53:00AM -0700, Greg KH wrote:
> > On Sat, Sep 11, 2004 at 12:09:35AM -0400, Robert Love wrote:
> > > > +/**
> > > > + * send_uevent - notify userspace by sending event trough netlink socket
> > >
> > > s/trough/through/ ;-)
> >
> > Heh, give something for the "spelling nit-pickers" to submit a patch
> > against :)
> >
> > > We should probably add at least _some_ user. The filesystem mount
> > > events are good, since we want to add those to HAL.
> >
> > True, anyone want to send me a patch with a user of this?
>
> Do we agree on the model that the signal is a simple verb and we keep
> only a small dictionary of _static_ signal strings and no fancy compositions?

I agree with this. And because of that, we should enforce that, and
help prevent typos in the signals. So, here's a patch that does just
that, making it a lot harder to mess up (although you still can, as
enumerated types aren't checked by the compiler...) This patch booted
on my test box.

Anyone object to me adding this patch? If not, I'll fix up Kay's patch
for mounting to use this interface and add both of them.

> And we should reserve the "add" and "remove" only for the hotplug events.

I don't know, the firmware objects already use "add" for an event. I
didn't put a check in the kobject_uevent() calls to prevent the add and
remove, but now it's a lot easier to do so if you think it's necessary.

thanks,

greg k-h

-----------

kevent: force users to use a type, not a string to help prevent typos
and so that we all can keep track of the different event types much
easier.


===== drivers/base/firmware_class.c 1.13 vs edited =====
--- 1.13/drivers/base/firmware_class.c 2004-09-13 09:46:22 -07:00
+++ edited/drivers/base/firmware_class.c 2004-09-14 16:33:20 -07:00
@@ -420,7 +420,7 @@
add_timer(&fw_priv->timeout);
}

- kobject_hotplug("add", &class_dev->kobj);
+ kobject_hotplug(&class_dev->kobj, KOBJ_ADD);
wait_for_completion(&fw_priv->completion);
set_bit(FW_STATUS_DONE, &fw_priv->status);

===== include/linux/kobject.h 1.30 vs edited =====
--- 1.30/include/linux/kobject.h 2004-09-13 12:58:51 -07:00
+++ edited/include/linux/kobject.h 2004-09-14 16:24:00 -07:00
@@ -236,27 +236,33 @@
extern void subsys_remove_file(struct subsystem * , struct subsys_attribute *);


+enum kobject_action {
+ KOBJ_ADD = 0x00,
+ KOBJ_REMOVE = 0x01,
+ KOBJ_MAX_ACTION,
+};
+
#ifdef CONFIG_HOTPLUG
-extern void kobject_hotplug(const char *action, struct kobject *kobj);
+extern void kobject_hotplug(struct kobject *kobj, enum kobject_action action);
#else
-static inline void kobject_hotplug(const char *action, struct kobject *kobj) { }
+static inline void kobject_hotplug(struct kobject *kobj, enum kobject_action action) { }
#endif


#ifdef CONFIG_KOBJECT_UEVENT
-extern int kobject_uevent(const char *signal, struct kobject *kobj,
+extern int kobject_uevent(struct kobject *kobj, enum kobject_action action,
struct attribute *attr);

-extern int kobject_uevent_atomic(const char *signal, struct kobject *kobj,
+extern int kobject_uevent_atomic(struct kobject *kobj, enum kobject_action action,
struct attribute *attr);
#else
-static inline int kobject_uevent(const char *signal, struct kobject *kobj,
+static inline int kobject_uevent(struct kobject *kobj, enum kobject_action action,
struct attribute *attr)
{
return 0;
}

-static inline int kobject_uevent_atomic(const char *signal, struct kobject *kobj,
+static inline int kobject_uevent_atomic(struct kobject *kobj, enum kobject_action action,
struct attribute *attr)
{
return 0;
===== lib/kobject.c 1.55 vs edited =====
--- 1.55/lib/kobject.c 2004-09-13 09:54:04 -07:00
+++ edited/lib/kobject.c 2004-09-14 16:20:21 -07:00
@@ -185,7 +185,7 @@
if (parent)
kobject_put(parent);
} else {
- kobject_hotplug("add", kobj);
+ kobject_hotplug(kobj, KOBJ_ADD);
}

return error;
@@ -299,7 +299,7 @@

void kobject_del(struct kobject * kobj)
{
- kobject_hotplug("remove", kobj);
+ kobject_hotplug(kobj, KOBJ_REMOVE);
sysfs_remove_dir(kobj);
unlink(kobj);
}
===== lib/kobject_uevent.c 1.2 vs edited =====
--- 1.2/lib/kobject_uevent.c 2004-09-11 18:50:05 -07:00
+++ edited/lib/kobject_uevent.c 2004-09-14 16:28:21 -07:00
@@ -22,6 +22,20 @@
#include <linux/kobject.h>
#include <net/sock.h>

+/* these match up with the values for enum kobject_action */
+static char *actions[] = {
+ "add", /* 0x00 */
+ "remove", /* 0x01 */
+};
+
+static char *action_to_string(enum kobject_action action)
+{
+ if (action >= KOBJ_MAX_ACTION)
+ return NULL;
+ else
+ return actions[action];
+}
+
#ifdef CONFIG_KOBJECT_UEVENT
static struct sock *uevent_sock;

@@ -60,11 +74,12 @@
return netlink_broadcast(uevent_sock, skb, 0, 1, gfp_mask);
}

-static int do_kobject_uevent(const char *signal, struct kobject *kobj,
+static int do_kobject_uevent(struct kobject *kobj, enum kobject_action action,
struct attribute *attr, int gfp_mask)
{
char *path;
char *attrpath;
+ char *signal;
int len;
int rc = -ENOMEM;

@@ -72,6 +87,10 @@
if (!path)
return -ENOMEM;

+ signal = action_to_string(action);
+ if (!signal)
+ return -EINVAL;
+
if (attr) {
len = strlen(path);
len += strlen(attr->name) + 2;
@@ -97,17 +116,17 @@
* @kobj: struct kobject that the event is happening to
* @attr: optional struct attribute the event belongs to
*/
-int kobject_uevent(const char *signal, struct kobject *kobj,
+int kobject_uevent(struct kobject *kobj, enum kobject_action action,
struct attribute *attr)
{
- return do_kobject_uevent(signal, kobj, attr, GFP_KERNEL);
+ return do_kobject_uevent(kobj, action, attr, GFP_KERNEL);
}
EXPORT_SYMBOL_GPL(kobject_uevent);

-int kobject_uevent_atomic(const char *signal, struct kobject *kobj,
+int kobject_uevent_atomic(struct kobject *kobj, enum kobject_action action,
struct attribute *attr)
{
- return do_kobject_uevent(signal, kobj, attr, GFP_ATOMIC);
+ return do_kobject_uevent(kobj, action, attr, GFP_ATOMIC);
}

EXPORT_SYMBOL_GPL(kobject_uevent_atomic);
@@ -149,7 +168,7 @@
* @action: action that is happening (usually "ADD" or "REMOVE")
* @kobj: struct kobject that the action is happening to
*/
-void kobject_hotplug(const char *action, struct kobject *kobj)
+void kobject_hotplug(struct kobject *kobj, enum kobject_action action)
{
char *argv [3];
char **envp = NULL;
@@ -159,6 +178,7 @@
int retval;
char *kobj_path = NULL;
char *name = NULL;
+ char *action_string;
u64 seq;
struct kobject *top_kobj = kobj;
struct kset *kset;
@@ -183,6 +203,10 @@

pr_debug ("%s\n", __FUNCTION__);

+ action_string = action_to_string(action);
+ if (!action_string)
+ return;
+
envp = kmalloc(NUM_ENVP * sizeof (char *), GFP_KERNEL);
if (!envp)
return;
@@ -208,7 +232,7 @@
scratch = buffer;

envp [i++] = scratch;
- scratch += sprintf(scratch, "ACTION=%s", action) + 1;
+ scratch += sprintf(scratch, "ACTION=%s", action_string) + 1;

kobj_path = kobject_get_path(kobj, GFP_KERNEL);
if (!kobj_path)
@@ -242,7 +266,7 @@
pr_debug ("%s: %s %s %s %s %s %s %s\n", __FUNCTION__, argv[0], argv[1],
envp[0], envp[1], envp[2], envp[3], envp[4]);

- send_uevent(action, kobj_path, buffer, scratch - buffer, GFP_KERNEL);
+ send_uevent(action_string, kobj_path, buffer, scratch - buffer, GFP_KERNEL);

if (!hotplug_path[0])
goto exit;

2004-09-15 01:09:00

by Kay Sievers

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Tue, Sep 14, 2004 at 05:07:53PM -0700, Greg KH wrote:
> On Mon, Sep 13, 2004 at 04:45:53PM +0200, Kay Sievers wrote:
> > Do we agree on the model that the signal is a simple verb and we keep
> > only a small dictionary of _static_ signal strings and no fancy compositions?
>
> I agree with this. And because of that, we should enforce that, and
> help prevent typos in the signals. So, here's a patch that does just
> that, making it a lot harder to mess up (although you still can, as
> enumerated types aren't checked by the compiler...) This patch booted
> on my test box.
>
> Anyone object to me adding this patch? If not, I'll fix up Kay's patch
> for mounting to use this interface and add both of them.

I like it, so the printf is gone :) Fine with me.

> > And we should reserve the "add" and "remove" only for the hotplug events.
>
> I don't know, the firmware objects already use "add" for an event.

Yes, but isn't the firmware event a real hotplug event? I just want to
be sure, that it's easy to recognize the hotplug events from userspace.

> I didn't put a check in the kobject_uevent() calls to prevent the add and
> remove, but now it's a lot easier to do so if you think it's necessary.

Don't think that this is needed. I will add somthing to the kobject
documentation, if it's stable and merged.

Thanks,
Kay

2004-09-15 01:12:11

by Tim Hockin

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, Sep 15, 2004 at 03:09:01AM +0200, Kay Sievers wrote:
> On Tue, Sep 14, 2004 at 05:07:53PM -0700, Greg KH wrote:
> > On Mon, Sep 13, 2004 at 04:45:53PM +0200, Kay Sievers wrote:
> > > Do we agree on the model that the signal is a simple verb and we keep
> > > only a small dictionary of _static_ signal strings and no fancy compositions?

I don't have any concrete examples right now, but it seems that this is
being locked down pretty tightly for no real reason...

Just a passing thought.

2004-09-15 01:19:43

by Robert Love

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Tue, 2004-09-14 at 17:07 -0700, Greg KH wrote:

> I don't know, the firmware objects already use "add" for an event. I
> didn't put a check in the kobject_uevent() calls to prevent the add and
> remove, but now it's a lot easier to do so if you think it's necessary.

I have no problem with this, either, so long as we are not too anal or
strict about adding new actions.

In other words, I like the safety and typo prevention that this gives
us, but want to keep things flexible and easy.

Best,

Robert Love


2004-09-15 02:10:33

by Robert Love

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Tue, 2004-09-14 at 18:11 -0700, Tim Hockin wrote:

Hey, Tim.

> I don't have any concrete examples right now, but it seems that this is
> being locked down pretty tightly for no real reason...
>
> Just a passing thought.

I am fearful of the overly strict lock down, too. I mean, we already
ditched the entire payload.

But so long as you can always add a new action, what complaint do you
have? In other words, all this does is force the use of the enum, which
ensures that we try to reuse existing actions, prevent typos, and so on.

Cool?

Robert Love


2004-09-15 03:17:16

by Tim Hockin

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Tue, Sep 14, 2004 at 10:10:29PM -0400, Robert Love wrote:
> > I don't have any concrete examples right now, but it seems that this is
> > being locked down pretty tightly for no real reason...
> >
> > Just a passing thought.
>
> I am fearful of the overly strict lock down, too. I mean, we already
> ditched the entire payload.

Yeah, it's much reduced in flexibility from where it started.

> But so long as you can always add a new action, what complaint do you
> have? In other words, all this does is force the use of the enum, which
> ensures that we try to reuse existing actions, prevent typos, and so on.

Well, it will be what it will be, I think. I know several people who
wanted it to be more than it is turning out to be, but that's not
unexpected. Of course we can cope with what it is.

What I think we'll find is that fringe users will hack around it. It will
become a documentum that the "insert" event of a Foo really means
something else. People will adapt to the limited "verbs" and overload
them to mean whatever it is that they need.

As much as we all like to malign "driver hardening", there is a *lot* that
can be done to make drivers more robust and to report better diagnostics
and failure events.

I'd like to have a standardized way to spit things like ECC errors up to
userspace, but I don't think that's what Greg K-H wants these used for.

I'd like to ACPI events move to a standardized event system, but they
*require* a data payload.

There are *way* too many places (IMHO) where we throw a printk() and punt,
or do something which is less than ideal. If I had my druthers, we would
examine most places that call printk() at runtime (not startup, etc) and
figure out if an event makes more sense.

This model serves well for "eth0 has a link" and "hda1 was mounted" sorts
of events. [Though namespaces make mounting a lot of fun. Which namespace
was it mounted on? Why should my app in namespace X see an event about
namespace Y?]

If that is all it's good for, then it is better than nothing, though not
as good as it might be.

Tim

2004-09-15 03:43:28

by Greg KH

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Tue, Sep 14, 2004 at 08:17:06PM -0700, Tim Hockin wrote:
> On Tue, Sep 14, 2004 at 10:10:29PM -0400, Robert Love wrote:
> > > I don't have any concrete examples right now, but it seems that this is
> > > being locked down pretty tightly for no real reason...
> > >
> > > Just a passing thought.
> >
> > I am fearful of the overly strict lock down, too. I mean, we already
> > ditched the entire payload.
>
> Yeah, it's much reduced in flexibility from where it started.

That's because the original proposal was not very well defined at all.

> > But so long as you can always add a new action, what complaint do you
> > have? In other words, all this does is force the use of the enum, which
> > ensures that we try to reuse existing actions, prevent typos, and so on.
>
> Well, it will be what it will be, I think. I know several people who
> wanted it to be more than it is turning out to be, but that's not
> unexpected. Of course we can cope with what it is.

Who are those people? What did people want it to be that it now is not?
Will they not speak up publicly? If not, how do they expect it to
address things they want?

> What I think we'll find is that fringe users will hack around it. It will
> become a documentum that the "insert" event of a Foo really means
> something else. People will adapt to the limited "verbs" and overload
> them to mean whatever it is that they need.

Since when did I ever state that the verbs would be "limited"? One of
the original issues that people were worried about was the possiblity of
a typo in a verb that we would be forced to live with. The patch I just
proposed fixes that issue.

All new "verbs" will be submitted to the same kind of review that any
other portion of the kernel code would be subjected to. Nothing wrong
with that, right?

> As much as we all like to malign "driver hardening", there is a *lot* that
> can be done to make drivers more robust and to report better diagnostics
> and failure events.

I agree. But this interface is not designed or intended for that.

> I'd like to have a standardized way to spit things like ECC errors up to
> userspace, but I don't think that's what Greg K-H wants these used for.

You are correct. I also would like to see a way ECC and other types of
errors and diagnostics be sent to userspace in a common and unified
manner. But I have yet to see a proposal to do this that is acceptable.

> I'd like to ACPI events move to a standardized event system, but they
> *require* a data payload.

Great, that also would be wonderful. Create such a event system for
ACPI if you desire. I think the ACPI developers are still working on
bigger issues currently.

> There are *way* too many places (IMHO) where we throw a printk() and punt,
> or do something which is less than ideal. If I had my druthers, we would
> examine most places that call printk() at runtime (not startup, etc) and
> figure out if an event makes more sense.

Please do.

> This model serves well for "eth0 has a link" and "hda1 was mounted" sorts
> of events. [Though namespaces make mounting a lot of fun. Which namespace
> was it mounted on? Why should my app in namespace X see an event about
> namespace Y?]

Yeah, namespaces are interesting :)

> If that is all it's good for, then it is better than nothing, though not
> as good as it might be.

Patches are always welcome.

thanks,

greg k-h

2004-09-15 03:45:30

by Greg KH

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Tue, Sep 14, 2004 at 09:19:27PM -0400, Robert Love wrote:
> On Tue, 2004-09-14 at 17:07 -0700, Greg KH wrote:
>
> > I don't know, the firmware objects already use "add" for an event. I
> > didn't put a check in the kobject_uevent() calls to prevent the add and
> > remove, but now it's a lot easier to do so if you think it's necessary.
>
> I have no problem with this, either, so long as we are not too anal or
> strict about adding new actions.

That's fine. I'll dump them into a separate header file to make it a
bit easier to find and add new ones to.

> In other words, I like the safety and typo prevention that this gives
> us, but want to keep things flexible and easy.

Heh, you want it all, don't you :)

greg k-h

2004-09-15 03:50:11

by Greg KH

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, Sep 15, 2004 at 03:09:01AM +0200, Kay Sievers wrote:
> On Tue, Sep 14, 2004 at 05:07:53PM -0700, Greg KH wrote:
> > On Mon, Sep 13, 2004 at 04:45:53PM +0200, Kay Sievers wrote:
> > > Do we agree on the model that the signal is a simple verb and we keep
> > > only a small dictionary of _static_ signal strings and no fancy compositions?
> >
> > I agree with this. And because of that, we should enforce that, and
> > help prevent typos in the signals. So, here's a patch that does just
> > that, making it a lot harder to mess up (although you still can, as
> > enumerated types aren't checked by the compiler...) This patch booted
> > on my test box.
> >
> > Anyone object to me adding this patch? If not, I'll fix up Kay's patch
> > for mounting to use this interface and add both of them.
>
> I like it, so the printf is gone :) Fine with me.
>
> > > And we should reserve the "add" and "remove" only for the hotplug events.
> >
> > I don't know, the firmware objects already use "add" for an event.
>
> Yes, but isn't the firmware event a real hotplug event? I just want to
> be sure, that it's easy to recognize the hotplug events from userspace.

It's a "real" one in that it is generated :)

request_firmware() causes the hotplug event to happen, so that we know
to load firmware to the device that requested it.

> > I didn't put a check in the kobject_uevent() calls to prevent the add and
> > remove, but now it's a lot easier to do so if you think it's necessary.
>
> Don't think that this is needed. I will add somthing to the kobject
> documentation, if it's stable and merged.

Ok, I'll add this patch and wait for the documentation :)

thanks,

greg k-h

2004-09-15 04:48:44

by Tim Hockin

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Tue, Sep 14, 2004 at 08:42:29PM -0700, Greg KH wrote:
> > Yeah, it's much reduced in flexibility from where it started.
>
> That's because the original proposal was not very well defined at all.

True, I raised plenty of concerns about it in the first draft, too :)

> > Well, it will be what it will be, I think. I know several people who
> > wanted it to be more than it is turning out to be, but that's not
> > unexpected. Of course we can cope with what it is.
>
> Who are those people? What did people want it to be that it now is not?
> Will they not speak up publicly? If not, how do they expect it to
> address things they want?

Well, I knew several groups at Sun who could have benefitted from "driver
hardening" event-logging stuff. Things like IPMI, evlog, etc are what are
being used today.

> > What I think we'll find is that fringe users will hack around it. It will
> > become a documentum that the "insert" event of a Foo really means
> > something else. People will adapt to the limited "verbs" and overload
> > them to mean whatever it is that they need.
>
> Since when did I ever state that the verbs would be "limited"? One of
> the original issues that people were worried about was the possiblity of
> a typo in a verb that we would be forced to live with. The patch I just
> proposed fixes that issue.

Sorry, don't get me wrong - I fully agree with amking it an enum or some
such. In fact, I think I suggested that in the first draft, too. But
adding a dozen enum verbs, of which each are used once in one single
driver is just not going to fly. The barrier to creating a new "verb" is
not high, but there is a slight barrier. Rather than deal with the
barrier, I fear (and I could be wrong) that people will just overload
existing verbs.

> > As much as we all like to malign "driver hardening", there is a *lot* that
> > can be done to make drivers more robust and to report better diagnostics
> > and failure events.
>
> I agree. But this interface is not designed or intended for that.

Right. I originally asked Robert if there was some way to make this
interface capable of handling that, too. Maybe the answer is merely "no,
not this API".

> > I'd like to have a standardized way to spit things like ECC errors up to
> > userspace, but I don't think that's what Greg K-H wants these used for.
>
> You are correct. I also would like to see a way ECC and other types of
> errors and diagnostics be sent to userspace in a common and unified
> manner. But I have yet to see a proposal to do this that is acceptable.

Well, let's open that discussion, then! :) What requirements do you see?

Basically, they need to be exactly like this, except there needs to be
some amount of buffering of messages (somehow) and they need a data
payload.

For buffering, I could live with "all events with no listener get
printk()ed and they get buffered in dmesg". Now all we need to solve is
the payload issue.

Really, other than payload, why NOT use this API for ECC and driver
faults?

> > I'd like to ACPI events move to a standardized event system, but they
> > *require* a data payload.
>
> Great, that also would be wonderful. Create such a event system for
> ACPI if you desire. I think the ACPI developers are still working on
> bigger issues currently.

ACPI *has* it's own event system. It's fine, but it's Yet Another Event
System. I'd love to see it use this. It has mostly the same behaviors,
except it has a data payload (a string and couple unsigned ints, if I
recall). If this API can't handle that, then we have to keep ACPI's
current event system. Wouldn't it be nicer to remove code and encourage
migrating towards standard(ish) APIs?

Again, other than payload, why NOT use this API for ACPI?

> > There are *way* too many places (IMHO) where we throw a printk() and punt,
> > or do something which is less than ideal. If I had my druthers, we would
> > examine most places that call printk() at runtime (not startup, etc) and
> > figure out if an event makes more sense.
>
> Please do.

I'll put it on my list. The requirements that all kevents have a kobject
and that verbs be single words with no payload makes it hard to do,
though.

> > This model serves well for "eth0 has a link" and "hda1 was mounted" sorts
> > of events. [Though namespaces make mounting a lot of fun. Which namespace
> > was it mounted on? Why should my app in namespace X see an event about
> > namespace Y?]
>
> Yeah, namespaces are interesting :)

I started a paper on why namespaces should be ripped out of Linux or else
they should be actually thought out and implemented properly. Maybe one
day I'll finish it.

Tim

2004-09-15 05:09:43

by Greg KH

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Tue, Sep 14, 2004 at 09:48:30PM -0700, Tim Hockin wrote:
> On Tue, Sep 14, 2004 at 08:42:29PM -0700, Greg KH wrote:
> > > Well, it will be what it will be, I think. I know several people who
> > > wanted it to be more than it is turning out to be, but that's not
> > > unexpected. Of course we can cope with what it is.
> >
> > Who are those people? What did people want it to be that it now is not?
> > Will they not speak up publicly? If not, how do they expect it to
> > address things they want?
>
> Well, I knew several groups at Sun who could have benefitted from "driver
> hardening" event-logging stuff. Things like IPMI, evlog, etc are what are
> being used today.

Yes, it's a wide range of stuff that all should be consolidated.

But I haven't seen many people step up to do the work, that's my biggest
complaint. I know I don't have the time to do it either :)

> > > What I think we'll find is that fringe users will hack around it. It will
> > > become a documentum that the "insert" event of a Foo really means
> > > something else. People will adapt to the limited "verbs" and overload
> > > them to mean whatever it is that they need.
> >
> > Since when did I ever state that the verbs would be "limited"? One of
> > the original issues that people were worried about was the possiblity of
> > a typo in a verb that we would be forced to live with. The patch I just
> > proposed fixes that issue.
>
> Sorry, don't get me wrong - I fully agree with amking it an enum or some
> such. In fact, I think I suggested that in the first draft, too. But
> adding a dozen enum verbs, of which each are used once in one single
> driver is just not going to fly. The barrier to creating a new "verb" is
> not high, but there is a slight barrier. Rather than deal with the
> barrier, I fear (and I could be wrong) that people will just overload
> existing verbs.

We'll see how it gets used. If people start to do this, we'll
reconsider the kernel code. The interface to userspace will still be
the same, so we aren't painting ourselves into a corner right now.

> > > As much as we all like to malign "driver hardening", there is a *lot* that
> > > can be done to make drivers more robust and to report better diagnostics
> > > and failure events.
> >
> > I agree. But this interface is not designed or intended for that.
>
> Right. I originally asked Robert if there was some way to make this
> interface capable of handling that, too. Maybe the answer is merely "no,
> not this API".

"No, not this API."

Seriously, that's not what this interface is for. This is a simple
event notification interface.

> > > I'd like to have a standardized way to spit things like ECC errors up to
> > > userspace, but I don't think that's what Greg K-H wants these used for.
> >
> > You are correct. I also would like to see a way ECC and other types of
> > errors and diagnostics be sent to userspace in a common and unified
> > manner. But I have yet to see a proposal to do this that is acceptable.
>
> Well, let's open that discussion, then! :) What requirements do you see?
>
> Basically, they need to be exactly like this, except there needs to be
> some amount of buffering of messages (somehow) and they need a data
> payload.

Sounds good to me.

> For buffering, I could live with "all events with no listener get
> printk()ed and they get buffered in dmesg". Now all we need to solve is
> the payload issue.
>
> Really, other than payload, why NOT use this API for ECC and driver
> faults?

The payload is a pretty good reason why to not use this right now. No
one has proposed a way to handle such a payload in a sane manner.

> > > I'd like to ACPI events move to a standardized event system, but they
> > > *require* a data payload.
> >
> > Great, that also would be wonderful. Create such a event system for
> > ACPI if you desire. I think the ACPI developers are still working on
> > bigger issues currently.
>
> ACPI *has* it's own event system. It's fine, but it's Yet Another Event
> System.

Yeah, it's pretty old too, that's why it is the way it is.

> I'd love to see it use this. It has mostly the same behaviors,
> except it has a data payload (a string and couple unsigned ints, if I
> recall). If this API can't handle that, then we have to keep ACPI's
> current event system. Wouldn't it be nicer to remove code and encourage
> migrating towards standard(ish) APIs?
>
> Again, other than payload, why NOT use this API for ACPI?

Again, the payload is the big thing, right?

thanks,

greg k-h

2004-09-15 06:22:42

by Tim Hockin

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Tue, Sep 14, 2004 at 10:09:04PM -0700, Greg KH wrote:
> > Well, I knew several groups at Sun who could have benefitted from "driver
> > hardening" event-logging stuff. Things like IPMI, evlog, etc are what are
> > being used today.
>
> Yes, it's a wide range of stuff that all should be consolidated.
>
> But I haven't seen many people step up to do the work, that's my biggest
> complaint. I know I don't have the time to do it either :)

Yeah, Sun wasn't really good about committing people to fix things that
it needed, they just like to find things they need.

> > driver is just not going to fly. The barrier to creating a new "verb" is
> > not high, but there is a slight barrier. Rather than deal with the
> > barrier, I fear (and I could be wrong) that people will just overload
> > existing verbs.
>
> We'll see how it gets used. If people start to do this, we'll
> reconsider the kernel code. The interface to userspace will still be
> the same, so we aren't painting ourselves into a corner right now.

True, and a fair answer.

> > > > As much as we all like to malign "driver hardening", there is a *lot* that
> > > > can be done to make drivers more robust and to report better diagnostics
> > > > and failure events.
> > >
> > > I agree. But this interface is not designed or intended for that.
> >
> > Right. I originally asked Robert if there was some way to make this
> > interface capable of handling that, too. Maybe the answer is merely "no,
> > not this API".
>
> Seriously, that's not what this interface is for. This is a simple
> event notification interface.

Well, this API is not far from "good enough". It's meant to be a "simple
event system" but with a few expansions, it can be a full-featured event
system :) And yes, I know the term "feature creep".

> > > You are correct. I also would like to see a way ECC and other types of
> > > errors and diagnostics be sent to userspace in a common and unified
> > > manner. But I have yet to see a proposal to do this that is acceptable.
> >
> > Well, let's open that discussion, then! :) What requirements do you see?
> >
> > Basically, they need to be exactly like this, except there needs to be
> > some amount of buffering of messages (somehow) and they need a data
> > payload.
>
> Sounds good to me.

So what if the actual event system had a payload, and simple events don't
use it, and complex events do? Or what if there were an exactly analogous
API for messages with payloads?

> > Really, other than payload, why NOT use this API for ECC and driver
> > faults?
>
> The payload is a pretty good reason why to not use this right now. No
> one has proposed a way to handle such a payload in a sane manner.

What's insane about a string payload? Or rather, what are your objections
to saying that the payload string format is entirely dependant on the
{source, event} tuple?

ACPI events might come out of a kobject "/sys/devices/acpi" with an event
"event" and payload "button/power 00000000 00000001" or whatever the
actual values work out to be.

What's insane about that? Currently we have a separate /proc/acpi/event
file which spits out "button/power 00000000 00000001".

> > ACPI *has* it's own event system. It's fine, but it's Yet Another Event
> > System.
>
> Yeah, it's pretty old too, that's why it is the way it is.

But semantically, it's the same as this new API (I think), just less
elegant.

> > Again, other than payload, why NOT use this API for ACPI?
>
> Again, the payload is the big thing, right?

Yes, the payload is the big thing (that I see). I'm not sure if you're
posing this as in "See, it needs a payload so we don't want it." or "If we
find a way to do a payload sanely, will you shut up?".

:)

Cheers,
Tim

2004-09-15 06:47:14

by Jan Dittmer

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

Tim Hockin wrote:
>
> ACPI events might come out of a kobject "/sys/devices/acpi" with an event
> "event" and payload "button/power 00000000 00000001" or whatever the
> actual values work out to be.
>
> What's insane about that? Currently we have a separate /proc/acpi/event
> file which spits out "button/power 00000000 00000001".
>

What's wrong about fixing acpi to have something like
/sys/devices/acpi/buttons/power/, that spits out the event?
Just curious...

Jan

2004-09-15 06:49:17

by Tim Hockin

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, Sep 15, 2004 at 08:45:57AM +0200, Jan Dittmer wrote:
> Tim Hockin wrote:
> >
> >ACPI events might come out of a kobject "/sys/devices/acpi" with an event
> >"event" and payload "button/power 00000000 00000001" or whatever the
> >actual values work out to be.
> >
> >What's insane about that? Currently we have a separate /proc/acpi/event
> >file which spits out "button/power 00000000 00000001".
> >
>
> What's wrong about fixing acpi to have something like
> /sys/devices/acpi/buttons/power/, that spits out the event?
> Just curious...

You'd still need to spit out a payload with the status. Interesting idea
for the evolution of acpi, though...

2004-09-15 06:53:33

by Jan Dittmer

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

Tim Hockin wrote:
> On Wed, Sep 15, 2004 at 08:45:57AM +0200, Jan Dittmer wrote:
>
>>Tim Hockin wrote:
>>
>>>ACPI events might come out of a kobject "/sys/devices/acpi" with an event
>>>"event" and payload "button/power 00000000 00000001" or whatever the
>>>actual values work out to be.
>>>
>>>What's insane about that? Currently we have a separate /proc/acpi/event
>>>file which spits out "button/power 00000000 00000001".
>>>
>>
>>What's wrong about fixing acpi to have something like
>>/sys/devices/acpi/buttons/power/, that spits out the event?
>>Just curious...
>
>
> You'd still need to spit out a payload with the status. Interesting idea
> for the evolution of acpi, though...

Well, what's the status supposed to mean? Is this something like
buttondown, buttonup, ...? Couldn't this be the 'event'. I mean that it
is an event is kind of selfexplaining if it comes through the event layer.

Jan

2004-09-15 08:16:47

by Karol Kozimor

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wednesday 15 of September 2004 08:45, Jan Dittmer wrote:
> What's wrong about fixing acpi to have something like
> /sys/devices/acpi/buttons/power/, that spits out the event?
> Just curious...

Well, the fact that you'd have to somehow:
1) pass the list of all the drivers that register notify handlers to the
userspace
2) make userspace daemons hold ~10 sysfs nodes open

Not that it's not feasible, but that's simply ugly IMHO.
Best regards,

--
Karol Kozimor
[email protected]

2004-09-15 09:10:29

by Andrew Grover

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Tue, 14 Sep 2004 21:48:30 -0700, Tim Hockin <[email protected]> wrote:
> ACPI *has* it's own event system. It's fine, but it's Yet Another Event
> System. I'd love to see it use this. It has mostly the same behaviors,
> except it has a data payload (a string and couple unsigned ints, if I
> recall). If this API can't handle that, then we have to keep ACPI's
> current event system. Wouldn't it be nicer to remove code and encourage
> migrating towards standard(ish) APIs?
>
> Again, other than payload, why NOT use this API for ACPI?

IIRC the two possible future destinations for ACPI events are this,
and the input layer. There are some ACPI events that clearly should go
through this mechanism (e.g. thermal), some the input layer (e.g.
weird laptop extra keys), and maybe some in between? I know David
Bronaugh was looking into this a few weeks ago, maybe he'll pop back
up.

My 2c -- Andy

2004-09-15 13:14:55

by Kay Sievers

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Tue, Sep 14, 2004 at 11:21:29PM -0700, Tim Hockin wrote:
> On Tue, Sep 14, 2004 at 10:09:04PM -0700, Greg KH wrote:
> > > > > As much as we all like to malign "driver hardening", there is a *lot* that
> > > > > can be done to make drivers more robust and to report better diagnostics
> > > > > and failure events.
> > > >
> > > > I agree. But this interface is not designed or intended for that.
> > >
> > > Right. I originally asked Robert if there was some way to make this
> > > interface capable of handling that, too. Maybe the answer is merely "no,
> > > not this API".
> >
> > Seriously, that's not what this interface is for. This is a simple
> > event notification interface.
>
> Well, this API is not far from "good enough". It's meant to be a "simple
> event system" but with a few expansions, it can be a full-featured event
> system :) And yes, I know the term "feature creep".
>
> > > > You are correct. I also would like to see a way ECC and other types of
> > > > errors and diagnostics be sent to userspace in a common and unified
> > > > manner. But I have yet to see a proposal to do this that is acceptable.
> > >
> > > Well, let's open that discussion, then! :) What requirements do you see?
> > >
> > > Basically, they need to be exactly like this, except there needs to be
> > > some amount of buffering of messages (somehow) and they need a data
> > > payload.
> >
> > Sounds good to me.
>
> So what if the actual event system had a payload, and simple events don't
> use it, and complex events do? Or what if there were an exactly analogous
> API for messages with payloads?
>
> > > Really, other than payload, why NOT use this API for ECC and driver
> > > faults?
> >
> > The payload is a pretty good reason why to not use this right now. No
> > one has proposed a way to handle such a payload in a sane manner.
>
> What's insane about a string payload? Or rather, what are your objections
> to saying that the payload string format is entirely dependant on the
> {source, event} tuple?
>
> ACPI events might come out of a kobject "/sys/devices/acpi" with an event
> "event" and payload "button/power 00000000 00000001" or whatever the
> actual values work out to be.
>
> What's insane about that? Currently we have a separate /proc/acpi/event
> file which spits out "button/power 00000000 00000001".
>
> > > ACPI *has* it's own event system. It's fine, but it's Yet Another Event
> > > System.
> >
> > Yeah, it's pretty old too, that's why it is the way it is.
>
> But semantically, it's the same as this new API (I think), just less
> elegant.
>
> > > Again, other than payload, why NOT use this API for ACPI?
> >
> > Again, the payload is the big thing, right?
>
> Yes, the payload is the big thing (that I see). I'm not sure if you're
> posing this as in "See, it needs a payload so we don't want it." or "If we
> find a way to do a payload sanely, will you shut up?".

The problem here is that the current picture is merely a sysfs notification
(like a hotplug event is) and we really don't want to use that for error
handling and other things, that needs a reliable form of communcation.
Here we should consider a new transactional interface which can also carry
binary snapshot data from the drivers.
This is a complete different use and should have something like a event
filesystem (relayfs?) or similar and not in any kind something like the here
proposed printk fallback.

For the hotplug event we use a kind of "payload". It's just the already
computed list of '\0'-terminated environment strings, cause we need to know,
what subsystem we are working on and need the hotplug sequence number.
But hotplug is some special kind of event and every event is still strongly
tight to a sysfs device and all "real data" is available from the device
itself.

The same for the mount event, we don't want to carry any specific data,
cause the real data is already available in /proc/mounts, we just want to
know about the device state change. Then we can reread that data, instead
of constantly polling for it.

Yeah, for the ACPI event, I see that it would be possible to use the
payload model, like we use for the hotplug event, but if we open that in
general to the public, I expect something like printk over netlink.
>From my point of view, the kobject, an event is generated for, should be
a "real" device and never a whole subsystem. As long as ACPI can _not_ send
events for specific sysfs devices, I'm not for using kobject_uevent.

Thanks,
Kay

2004-09-15 15:49:01

by Tim Hockin

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, Sep 15, 2004 at 10:19:50AM +0200, Karol Kozimor wrote:
> On Wednesday 15 of September 2004 08:45, Jan Dittmer wrote:
> > What's wrong about fixing acpi to have something like
> > /sys/devices/acpi/buttons/power/, that spits out the event?
> > Just curious...
>
> Well, the fact that you'd have to somehow:
> 1) pass the list of all the drivers that register notify handlers to the
> userspace
> 2) make userspace daemons hold ~10 sysfs nodes open

I *think* the suggestion was to add sysfs nodes for ACPI objects so that
the assosicated kobject could be the "source" argument to the uevent API.
Not that each kobject would have an event stream of it's own :)

I'm not deeply into ACPI, so I'm not sure if it is possible to enumerate
all event sources, or if GPEs would come from nowhere...

2004-09-15 16:26:30

by Jan Dittmer

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

Tim Hockin wrote:
> On Wed, Sep 15, 2004 at 10:19:50AM +0200, Karol Kozimor wrote:
>
>>On Wednesday 15 of September 2004 08:45, Jan Dittmer wrote:
>>
>>>What's wrong about fixing acpi to have something like
>>>/sys/devices/acpi/buttons/power/, that spits out the event?
>>>Just curious...
>>
>>Well, the fact that you'd have to somehow:
>>1) pass the list of all the drivers that register notify handlers to the
>>userspace
>>2) make userspace daemons hold ~10 sysfs nodes open
>
>
> I *think* the suggestion was to add sysfs nodes for ACPI objects so that
> the assosicated kobject could be the "source" argument to the uevent API.
> Not that each kobject would have an event stream of it's own :)

Yepp.

> I'm not deeply into ACPI, so I'm not sure if it is possible to enumerate
> all event sources, or if GPEs would come from nowhere...

Well if you do a find /sys | grep acpi, there are basically all nodes
already there - just no values in there :-(
I think Andrew's suggestions (in this same thread) make more sense, ie.
deliver button events via the input layer...

Jan

2004-09-15 18:59:43

by Robert Love

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, 2004-09-15 at 02:07 -0700, Andrew Grover wrote:

> IIRC the two possible future destinations for ACPI events are this,
> and the input layer. There are some ACPI events that clearly should go
> through this mechanism (e.g. thermal), some the input layer (e.g.
> weird laptop extra keys), and maybe some in between? I know David
> Bronaugh was looking into this a few weeks ago, maybe he'll pop back
> up.

Hey, Andy.

I'd like, if possible, to have ACPI use kevents. ACPI makes sense as a
user.

The first question is whether or not ACPI could use the current kevent
model. The current reactionary response is that kevent is too limited,
but that misses the point a bit. The point is that you create kobjects
and better integrate with sysfs, and then kevent becomes trivial.

So if ACPI better took advantage of sysfs, would kevents be sufficient?

Robert Love


2004-09-15 20:11:34

by Greg KH

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Tue, Sep 14, 2004 at 08:44:55PM -0700, Greg KH wrote:
> On Tue, Sep 14, 2004 at 09:19:27PM -0400, Robert Love wrote:
> > On Tue, 2004-09-14 at 17:07 -0700, Greg KH wrote:
> >
> > > I don't know, the firmware objects already use "add" for an event. I
> > > didn't put a check in the kobject_uevent() calls to prevent the add and
> > > remove, but now it's a lot easier to do so if you think it's necessary.
> >
> > I have no problem with this, either, so long as we are not too anal or
> > strict about adding new actions.
>
> That's fine. I'll dump them into a separate header file to make it a
> bit easier to find and add new ones to.

And here's the patch I applied to my trees and will show up in the next
-mm release.

I'll go convert Kay's mount patch to the new interface and add it too.

(and yes, I've added a "change" event that drivers and such can use to
tell userspace they they need to go look at a specific sysfs file. I
figured that this was going to be the next action request so I might as
well add it now...)

thanks,

greg k-h

-----

kevent: standardize on the event types

This prevents any potential typos from happening.

Signed-off-by: Greg Kroah-Hartman <[email protected]>

diff -Nru a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
--- a/drivers/base/firmware_class.c 2004-09-15 11:52:49 -07:00
+++ b/drivers/base/firmware_class.c 2004-09-15 11:52:49 -07:00
@@ -420,7 +420,7 @@
add_timer(&fw_priv->timeout);
}

- kobject_hotplug("add", &class_dev->kobj);
+ kobject_hotplug(&class_dev->kobj, KOBJ_ADD);
wait_for_completion(&fw_priv->completion);
set_bit(FW_STATUS_DONE, &fw_priv->status);

diff -Nru a/include/linux/kobject.h b/include/linux/kobject.h
--- a/include/linux/kobject.h 2004-09-15 11:52:49 -07:00
+++ b/include/linux/kobject.h 2004-09-15 11:52:49 -07:00
@@ -22,6 +22,7 @@
#include <linux/sysfs.h>
#include <linux/rwsem.h>
#include <linux/kref.h>
+#include <linux/kobject_uevent.h>
#include <asm/atomic.h>

#define KOBJ_NAME_LEN 20
@@ -235,32 +236,10 @@
extern int subsys_create_file(struct subsystem * , struct subsys_attribute *);
extern void subsys_remove_file(struct subsystem * , struct subsys_attribute *);

-
#ifdef CONFIG_HOTPLUG
-extern void kobject_hotplug(const char *action, struct kobject *kobj);
-#else
-static inline void kobject_hotplug(const char *action, struct kobject *kobj) { }
-#endif
-
-
-#ifdef CONFIG_KOBJECT_UEVENT
-extern int kobject_uevent(const char *signal, struct kobject *kobj,
- struct attribute *attr);
-
-extern int kobject_uevent_atomic(const char *signal, struct kobject *kobj,
- struct attribute *attr);
+extern void kobject_hotplug(struct kobject *kobj, enum kobject_action action);
#else
-static inline int kobject_uevent(const char *signal, struct kobject *kobj,
- struct attribute *attr)
-{
- return 0;
-}
-
-static inline int kobject_uevent_atomic(const char *signal, struct kobject *kobj,
- struct attribute *attr)
-{
- return 0;
-}
+static inline void kobject_hotplug(struct kobject *kobj, enum kobject_action action) { }
#endif

#endif /* __KERNEL__ */
diff -Nru a/include/linux/kobject_uevent.h b/include/linux/kobject_uevent.h
--- /dev/null Wed Dec 31 16:00:00 196900
+++ b/include/linux/kobject_uevent.h 2004-09-15 11:52:49 -07:00
@@ -0,0 +1,50 @@
+/*
+ * kobject_uevent.h - list of kobject user events that can be generated
+ *
+ * Copyright (C) 2004 IBM Corp.
+ * Copyright (C) 2004 Greg Kroah-Hartman <[email protected]>
+ *
+ * This file is released under the GPLv2.
+ *
+ */
+
+#ifndef _KOBJECT_EVENT_H_
+#define _KOBJECT_EVENT_H_
+
+/*
+ * If you add an action here, you must also add the proper string to the
+ * lib/kobject_uevent.c file.
+ */
+
+enum kobject_action {
+ KOBJ_ADD = 0x00, /* add event, for hotplug */
+ KOBJ_REMOVE = 0x01, /* remove event, for hotplug */
+ KOBJ_CHANGE = 0x02, /* a sysfs attribute file has changed */
+ KOBJ_MOUNT = 0x03, /* mount event for block devices */
+ KOBJ_MAX_ACTION, /* must be last action listed */
+};
+
+
+#ifdef CONFIG_KOBJECT_UEVENT
+int kobject_uevent(struct kobject *kobj,
+ enum kobject_action action,
+ struct attribute *attr);
+int kobject_uevent_atomic(struct kobject *kobj,
+ enum kobject_action action,
+ struct attribute *attr);
+#else
+static inline int kobject_uevent(struct kobject *kobj,
+ enum kobject_action action,
+ struct attribute *attr)
+{
+ return 0;
+}
+static inline int kobject_uevent_atomic(struct kobject *kobj,
+ enum kobject_action action,
+ struct attribute *attr)
+{
+ return 0;
+}
+#endif
+
+#endif
diff -Nru a/lib/kobject.c b/lib/kobject.c
--- a/lib/kobject.c 2004-09-15 11:52:49 -07:00
+++ b/lib/kobject.c 2004-09-15 11:52:49 -07:00
@@ -185,7 +185,7 @@
if (parent)
kobject_put(parent);
} else {
- kobject_hotplug("add", kobj);
+ kobject_hotplug(kobj, KOBJ_ADD);
}

return error;
@@ -299,7 +299,7 @@

void kobject_del(struct kobject * kobj)
{
- kobject_hotplug("remove", kobj);
+ kobject_hotplug(kobj, KOBJ_REMOVE);
sysfs_remove_dir(kobj);
unlink(kobj);
}
diff -Nru a/lib/kobject_uevent.c b/lib/kobject_uevent.c
--- a/lib/kobject_uevent.c 2004-09-15 11:52:49 -07:00
+++ b/lib/kobject_uevent.c 2004-09-15 11:52:49 -07:00
@@ -19,9 +19,29 @@
#include <linux/skbuff.h>
#include <linux/netlink.h>
#include <linux/string.h>
+#include <linux/kobject_uevent.h>
#include <linux/kobject.h>
#include <net/sock.h>

+/*
+ * These must match up with the values for enum kobject_action
+ * as found in include/linux/kobject_uevent.h
+ */
+static char *actions[] = {
+ "add", /* 0x00 */
+ "remove", /* 0x01 */
+ "change", /* 0x02 */
+ "mount", /* 0x03 */
+};
+
+static char *action_to_string(enum kobject_action action)
+{
+ if (action >= KOBJ_MAX_ACTION)
+ return NULL;
+ else
+ return actions[action];
+}
+
#ifdef CONFIG_KOBJECT_UEVENT
static struct sock *uevent_sock;

@@ -60,11 +80,12 @@
return netlink_broadcast(uevent_sock, skb, 0, 1, gfp_mask);
}

-static int do_kobject_uevent(const char *signal, struct kobject *kobj,
+static int do_kobject_uevent(struct kobject *kobj, enum kobject_action action,
struct attribute *attr, int gfp_mask)
{
char *path;
char *attrpath;
+ char *signal;
int len;
int rc = -ENOMEM;

@@ -72,6 +93,10 @@
if (!path)
return -ENOMEM;

+ signal = action_to_string(action);
+ if (!signal)
+ return -EINVAL;
+
if (attr) {
len = strlen(path);
len += strlen(attr->name) + 2;
@@ -97,17 +122,17 @@
* @kobj: struct kobject that the event is happening to
* @attr: optional struct attribute the event belongs to
*/
-int kobject_uevent(const char *signal, struct kobject *kobj,
+int kobject_uevent(struct kobject *kobj, enum kobject_action action,
struct attribute *attr)
{
- return do_kobject_uevent(signal, kobj, attr, GFP_KERNEL);
+ return do_kobject_uevent(kobj, action, attr, GFP_KERNEL);
}
EXPORT_SYMBOL_GPL(kobject_uevent);

-int kobject_uevent_atomic(const char *signal, struct kobject *kobj,
+int kobject_uevent_atomic(struct kobject *kobj, enum kobject_action action,
struct attribute *attr)
{
- return do_kobject_uevent(signal, kobj, attr, GFP_ATOMIC);
+ return do_kobject_uevent(kobj, action, attr, GFP_ATOMIC);
}

EXPORT_SYMBOL_GPL(kobject_uevent_atomic);
@@ -149,7 +174,7 @@
* @action: action that is happening (usually "ADD" or "REMOVE")
* @kobj: struct kobject that the action is happening to
*/
-void kobject_hotplug(const char *action, struct kobject *kobj)
+void kobject_hotplug(struct kobject *kobj, enum kobject_action action)
{
char *argv [3];
char **envp = NULL;
@@ -159,6 +184,7 @@
int retval;
char *kobj_path = NULL;
char *name = NULL;
+ char *action_string;
u64 seq;
struct kobject *top_kobj = kobj;
struct kset *kset;
@@ -183,6 +209,10 @@

pr_debug ("%s\n", __FUNCTION__);

+ action_string = action_to_string(action);
+ if (!action_string)
+ return;
+
envp = kmalloc(NUM_ENVP * sizeof (char *), GFP_KERNEL);
if (!envp)
return;
@@ -208,7 +238,7 @@
scratch = buffer;

envp [i++] = scratch;
- scratch += sprintf(scratch, "ACTION=%s", action) + 1;
+ scratch += sprintf(scratch, "ACTION=%s", action_string) + 1;

kobj_path = kobject_get_path(kobj, GFP_KERNEL);
if (!kobj_path)
@@ -242,7 +272,7 @@
pr_debug ("%s: %s %s %s %s %s %s %s\n", __FUNCTION__, argv[0], argv[1],
envp[0], envp[1], envp[2], envp[3], envp[4]);

- send_uevent(action, kobj_path, buffer, scratch - buffer, GFP_KERNEL);
+ send_uevent(action_string, kobj_path, buffer, scratch - buffer, GFP_KERNEL);

if (!hotplug_path[0])
goto exit;

2004-09-15 20:18:01

by Robert Love

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, 2004-09-15 at 12:40 -0700, Greg KH wrote:

> And here's the patch I applied to my trees and will show up in the next
> -mm release.
>
> I'll go convert Kay's mount patch to the new interface and add it too.

I think you want an "unmount" signal, too.

> (and yes, I've added a "change" event that drivers and such can use to
> tell userspace they they need to go look at a specific sysfs file. I
> figured that this was going to be the next action request so I might as
> well add it now...)

Good.

Robert Love


2004-09-15 20:31:04

by Robert Love

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, 2004-09-15 at 13:22 -0700, Tim Hockin wrote:

> So do we actually plan to handle namespaces at all wrt kevents?

I don't see why we have to.

A mount event should really just cause a rescan of the mount table.

Robert Love


2004-09-15 20:34:54

by Tim Hockin

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, Sep 15, 2004 at 04:26:25PM -0400, Robert Love wrote:
> On Wed, 2004-09-15 at 13:22 -0700, Tim Hockin wrote:
>
> > So do we actually plan to handle namespaces at all wrt kevents?
>
> I don't see why we have to.
>
> A mount event should really just cause a rescan of the mount table.

In which namespace? All of them? Is that an information leak that might
be hazardous (I'm bad with security stuff).

2004-09-15 20:37:09

by Robert Love

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, 2004-09-15 at 13:31 -0700, Tim Hockin wrote:

> > A mount event should really just cause a rescan of the mount table.
>
> In which namespace? All of them? Is that an information leak that might
> be hazardous (I'm bad with security stuff).

You can only see your own namespace. So e.g. /proc/mtab is your name
space's mount table and you can rescan that when receiving the mount
signal.

That is sort of the coolness of this approach - push the payload out to
user-space, and we don't have to worry about things like information
leak, name spaces, security, etc. We are just providing a notify
mechanism for state information that should already be exposed through
sysfs.

So there is no information leak.

Robert Love


2004-09-15 20:43:11

by Kay Sievers

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, 2004-09-15 at 13:31 -0700, Tim Hockin wrote:
> On Wed, Sep 15, 2004 at 04:26:25PM -0400, Robert Love wrote:
> > On Wed, 2004-09-15 at 13:22 -0700, Tim Hockin wrote:
> >
> > > So do we actually plan to handle namespaces at all wrt kevents?
> >
> > I don't see why we have to.
> >
> > A mount event should really just cause a rescan of the mount table.
>
> In which namespace? All of them? Is that an information leak that might
> be hazardous (I'm bad with security stuff).

In the namespace of sysfs :).
It's just the physical device.

Thanks,
Kay

2004-09-15 20:58:07

by Tim Hockin

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, Sep 15, 2004 at 04:33:34PM -0400, Robert Love wrote:
> > In which namespace? All of them? Is that an information leak that might
> > be hazardous (I'm bad with security stuff).
>
> You can only see your own namespace. So e.g. /proc/mtab is your name
> space's mount table and you can rescan that when receiving the mount
> signal.

> So there is no information leak.

Are you not sending it with some specific device as the source? Or is it
just coming from some abstract root kobject?

2004-09-15 21:10:32

by Robert Love

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, 2004-09-15 at 13:56 -0700, Tim Hockin wrote:

> If you're notifying me of mounts and unmounts, I really want to know about
> all of them, not just mounts that have a hard local device. I'd rather
> get "something was mounted" and be forced to probe that (that's a leak,
> too, but less important).

If its a big deal, we can use a generic kobject instead of the physical
device, but I don't think it is a big deal.

It is technically not a security issue, anyhow, since the kevent
requires root to read. But I suppose most uses are going to shovel all
the events onto a user visible bus and we don't want to do arbitration
in user-space.

Robert Love


2004-09-15 21:14:55

by Kay Sievers

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, Sep 15, 2004 at 01:56:43PM -0700, Tim Hockin wrote:
> On Wed, Sep 15, 2004 at 04:49:18PM -0400, Robert Love wrote:
> > On Wed, 2004-09-15 at 13:47 -0700, Tim Hockin wrote:
> >
> > > Are you not sending it with some specific device as the source? Or is it
> > > just coming from some abstract root kobject?
> >
> > It comes the the physical device.
> >
> > Is there really a specific issue that you are seeing?
>
> Well, two.
>
> 1) If you send me an event "/dev/hda3 mounted", but it was for some other
> namespace, you just leaked potentially useful information.

You can listen only as root!
All information is already in /proc/mounts.

> I'm no security expert, but that seems to me to be a gratuitous leak.

I don't aggree on the second part :)

> Maybe it's just another example of why namespaces need to go away.
>
> 2) If you send me an event "/dev/hda3 mounted" do I also get an event when
> I loopback mount /tmp/rh9.0-1.iso or when I bind mount /foo to /bar or
> when I mount server:/export/home on /home?

You get an event if fs-code claims/relases a genhd. It's a claim/release
event to be more precise. Only the first mount will emit a event and the
last umount.


thaks,
Kay

2004-09-15 21:01:28

by Tim Hockin

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, Sep 15, 2004 at 04:49:18PM -0400, Robert Love wrote:
> On Wed, 2004-09-15 at 13:47 -0700, Tim Hockin wrote:
>
> > Are you not sending it with some specific device as the source? Or is it
> > just coming from some abstract root kobject?
>
> It comes the the physical device.
>
> Is there really a specific issue that you are seeing?

Well, two.

1) If you send me an event "/dev/hda3 mounted", but it was for some other
namespace, you just leaked potentially useful information.

I'm no security expert, but that seems to me to be a gratuitous leak.
Maybe it's just another example of why namespaces need to go away.

2) If you send me an event "/dev/hda3 mounted" do I also get an event when
I loopback mount /tmp/rh9.0-1.iso or when I bind mount /foo to /bar or
when I mount server:/export/home on /home?

If you're notifying me of mounts and unmounts, I really want to know about
all of them, not just mounts that have a hard local device. I'd rather
get "something was mounted" and be forced to probe that (that's a leak,
too, but less important).

Tim

2004-09-15 21:36:43

by Greg KH

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Tue, Sep 14, 2004 at 11:21:29PM -0700, Tim Hockin wrote:
> On Tue, Sep 14, 2004 at 10:09:04PM -0700, Greg KH wrote:
> > > > > As much as we all like to malign "driver hardening", there is a *lot* that
> > > > > can be done to make drivers more robust and to report better diagnostics
> > > > > and failure events.
> > > >
> > > > I agree. But this interface is not designed or intended for that.
> > >
> > > Right. I originally asked Robert if there was some way to make this
> > > interface capable of handling that, too. Maybe the answer is merely "no,
> > > not this API".
> >
> > Seriously, that's not what this interface is for. This is a simple
> > event notification interface.
>
> Well, this API is not far from "good enough". It's meant to be a "simple
> event system" but with a few expansions, it can be a full-featured event
> system :) And yes, I know the term "feature creep".

Ok, for now, no payload.

The kevent interface is well defined, bounded and simple. Let's not try
to muddy it up with arbitraty blobs being passed as a payload (and for
those people who want fimware binary crap in event log messages, please,
just use a sysfs file for access and send a kevent with KOBJ_CHANGE
action.)

The event logging people are working on figuring out what actually needs
to be logged in the first place. Let's let them finialize that, and
then let them propose a way to get that standardized information out of
the kernel (hint, I don't think that netlink is going to cut it for
them...)

thanks,

greg k-h

2004-09-15 21:36:44

by Robert Love

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, 2004-09-15 at 14:21 -0700, Greg KH wrote:

> Doh! You're right. Here's Kay's patch ported to the new interface, and
> adding a umount event type. I've applied it to my trees.

I am actually thinking that Kay's approach is less than ideal, since it
does not catch all mounts and unmounts.

Robert Love


2004-09-15 21:36:46

by Robert Love

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, 2004-09-15 at 14:23 -0700, Greg KH wrote:

> We aren't giving absolute /dev entries here, that's the beauty of the
> kobject tree :)

Not that I agree, but I don't think it is the absolute /dev entries that
bother him: it is the fact that knowledge of the mount itself is an
information leak.

Which it is. As root, in my name space, I should rest in the knowledge
that my mounts are secret, I guess. But I just do not see it as a big
problem.

Robert Love


2004-09-15 21:40:30

by Robert Love

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, 2004-09-15 at 14:34 -0700, Tim Hockin wrote:

> It's a can of worms, is what it is. And I'm not sure what a good fix
> would be. Would it just be enough to send a generic "mount-table changed"
> event, and let userspace figure out the rest?

"Can of worms" is a tough description for something that there is no
practical security issue for, just a lot of hand waving. No one even
uses name spaces.

Anyhow, I already said that we could send out a generic kobject instead
of the one tied to the specific device.

> Or really, why is the kernel broadcasting a mount, which originated in
> userland. Couldn't mount (or a mount wrapper) do that? It's already
> running in the right namespace...

In practice stuff like that never works. Besides, it is not mount(1)
that we want to wrap but the mount(2) system call. And, uh, I'd rather
stab myself than try to get that patch by Uli.

Robert Love


2004-09-15 21:44:01

by Robert Love

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, 2004-09-15 at 23:38 +0200, Kay Sievers wrote:

> Anyone can watch the refcount on the fs-modules, they increment on every
> device claim. Is that a leak in your eyes too :)

Haha, Kay, rocking point.

Robert Love


2004-09-15 21:52:17

by Robert Love

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, 2004-09-15 at 14:46 -0700, Tim Hockin wrote:

> The "big deal" is that namespaces are not *ever* considered when things
> like this go on.
>
> We keep adding things that don't think namepsaces are a big deal, and
> we're painting ourselves into a corner where NONE of the advanced
> functionality will work in the face of namespaces.

This stuff does not break name spaces, though. Not at all. Root gets
the event and it or some user rescans their mount table.

I can use name spaces with this and they do not break.

You do have an argument with respect to the information leak, but I've
never looked at name spaces as a method of hiding what is mounted.

Robert Love


2004-09-15 21:53:46

by Tim Hockin

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, Sep 15, 2004 at 05:39:29PM -0400, Robert Love wrote:
> On Wed, 2004-09-15 at 23:38 +0200, Kay Sievers wrote:
>
> > Anyone can watch the refcount on the fs-modules, they increment on every
> > device claim. Is that a leak in your eyes too :)
>
> Haha, Kay, rocking point.

Not if I don't build my fs as a module? And yeah, in the strictest sense
it IS a leak if a user can do that.

Now, I'm not *really* concerned withs ecurity, I just want to make sure we
don't flippantly disregard it.

I'd be happy if the actual block device was not part of the event.

What worries me more is that only some mount/umount calls get events, and
not all of them.

2004-09-15 21:49:28

by Tim Hockin

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, Sep 15, 2004 at 02:35:26PM -0700, Greg KH wrote:
> Hm, this is an issue that I and the FreeBSD author of jail were talking
> about a week or so ago when I was describing why we did udev in
> userspace and the hotplug stuff. He pointed out the namespace issue as
> the biggest problem for FreeBSD to be able to do the same kind of thing
> that we are doing.
>
> But I agree, I don't think it's a big deal right now either.

The "big deal" is that namespaces are not *ever* considered when things
like this go on.

We keep adding things that don't think namepsaces are a big deal, and
we're painting ourselves into a corner where NONE of the advanced
functionality will work in the face of namespaces.

2004-09-15 21:57:41

by Greg KH

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, Sep 15, 2004 at 02:49:19PM -0700, Tim Hockin wrote:
>
> What worries me more is that only some mount/umount calls get events, and
> not all of them.

That should be fixed, and I think Robert has already said they will fix
that.

thanks,

greg k-h


--

2004-09-15 21:41:31

by Kay Sievers

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, 2004-09-15 at 17:26 -0400, Robert Love wrote:
> On Wed, 2004-09-15 at 14:23 -0700, Greg KH wrote:
>
> > We aren't giving absolute /dev entries here, that's the beauty of the
> > kobject tree :)
>
> Not that I agree, but I don't think it is the absolute /dev entries that
> bother him: it is the fact that knowledge of the mount itself is an
> information leak.
>
> Which it is. As root, in my name space, I should rest in the knowledge
> that my mounts are secret, I guess. But I just do not see it as a big
> problem.

Anyone can watch the refcount on the fs-modules, they increment on every
device claim. Is that a leak in your eyes too :)

Kay

2004-09-15 21:40:02

by Greg KH

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, Sep 15, 2004 at 05:26:29PM -0400, Robert Love wrote:
> On Wed, 2004-09-15 at 14:23 -0700, Greg KH wrote:
>
> > We aren't giving absolute /dev entries here, that's the beauty of the
> > kobject tree :)
>
> Not that I agree, but I don't think it is the absolute /dev entries that
> bother him: it is the fact that knowledge of the mount itself is an
> information leak.
>
> Which it is. As root, in my name space, I should rest in the knowledge
> that my mounts are secret, I guess. But I just do not see it as a big
> problem.

Hm, this is an issue that I and the FreeBSD author of jail were talking
about a week or so ago when I was describing why we did udev in
userspace and the hotplug stuff. He pointed out the namespace issue as
the biggest problem for FreeBSD to be able to do the same kind of thing
that we are doing.

But I agree, I don't think it's a big deal right now either.

thanks,

greg k-h

2004-09-15 21:40:01

by Greg KH

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, Sep 15, 2004 at 05:26:54PM -0400, Robert Love wrote:
> On Wed, 2004-09-15 at 14:21 -0700, Greg KH wrote:
>
> > Doh! You're right. Here's Kay's patch ported to the new interface, and
> > adding a umount event type. I've applied it to my trees.
>
> I am actually thinking that Kay's approach is less than ideal, since it
> does not catch all mounts and unmounts.

Ok, well if you two agree that something else should be done, send me a
patch against this last one :)

thanks,

greg k-h

2004-09-15 21:40:00

by Kay Sievers

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, 2004-09-15 at 17:26 -0400, Robert Love wrote:
> On Wed, 2004-09-15 at 14:21 -0700, Greg KH wrote:
>
> > Doh! You're right. Here's Kay's patch ported to the new interface, and
> > adding a umount event type. I've applied it to my trees.
>
> I am actually thinking that Kay's approach is less than ideal, since it
> does not catch all mounts and unmounts.

I was only looking from the sysfs side and want the event for a block
device. A generic mount notification was not on the plan that time and
will also not fit in the current kobject picture. We may name it "claim"
and "release" if you like that more.

Kay

2004-09-15 21:40:00

by Tim Hockin

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, Sep 15, 2004 at 05:26:29PM -0400, Robert Love wrote:
> On Wed, 2004-09-15 at 14:23 -0700, Greg KH wrote:
>
> > We aren't giving absolute /dev entries here, that's the beauty of the
> > kobject tree :)
>
> Not that I agree, but I don't think it is the absolute /dev entries that
> bother him: it is the fact that knowledge of the mount itself is an
> information leak.
>
> Which it is. As root, in my name space, I should rest in the knowledge
> that my mounts are secret, I guess. But I just do not see it as a big
> problem.

It's a can of worms, is what it is. And I'm not sure what a good fix
would be. Would it just be enough to send a generic "mount-table changed"
event, and let userspace figure out the rest?

Or really, why is the kernel broadcasting a mount, which originated in
userland. Couldn't mount (or a mount wrapper) do that? It's already
running in the right namespace...

2004-09-15 21:31:27

by Greg KH

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, Sep 15, 2004 at 01:56:43PM -0700, Tim Hockin wrote:
> On Wed, Sep 15, 2004 at 04:49:18PM -0400, Robert Love wrote:
> > On Wed, 2004-09-15 at 13:47 -0700, Tim Hockin wrote:
> >
> > > Are you not sending it with some specific device as the source? Or is it
> > > just coming from some abstract root kobject?
> >
> > It comes the the physical device.
> >
> > Is there really a specific issue that you are seeing?
>
> Well, two.
>
> 1) If you send me an event "/dev/hda3 mounted", but it was for some other
> namespace, you just leaked potentially useful information.

No, we are sending an event that says:
block/hda/hda3 was mounted.

Now it's up to you, in your namespace to figure out where sysfs is
mounted, and then what, if any /dev device corrisponds to that sysfs
block device (using udevinfo in your namespace if you so desire.)

We aren't giving absolute /dev entries here, that's the beauty of the
kobject tree :)

thanks,

greg k-h

2004-09-15 21:31:23

by Greg KH

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, Sep 15, 2004 at 04:10:43PM -0400, Robert Love wrote:
> On Wed, 2004-09-15 at 12:40 -0700, Greg KH wrote:
>
> > And here's the patch I applied to my trees and will show up in the next
> > -mm release.
> >
> > I'll go convert Kay's mount patch to the new interface and add it too.
>
> I think you want an "unmount" signal, too.

Doh! You're right. Here's Kay's patch ported to the new interface, and
adding a umount event type. I've applied it to my trees.

thanks,

greg k-h

-----


kevent: add block mount and umount support

Send notification over the new netlink socket to let userspace know that
the filesystem code claims/releases the superblock on an blockdevice.
This way, userspace can get rid of constantly polling /proc/mounts to
watch for filesystem changes.

Signed-off-by: Kay Sievers <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

diff -Nru a/fs/super.c b/fs/super.c
--- a/fs/super.c 2004-09-15 14:15:54 -07:00
+++ b/fs/super.c 2004-09-15 14:15:54 -07:00
@@ -35,6 +35,7 @@
#include <linux/vfs.h>
#include <linux/writeback.h> /* for the emergency remount stuff */
#include <linux/idr.h>
+#include <linux/kobject.h>
#include <asm/uaccess.h>


@@ -633,6 +634,16 @@
return (void *)s->s_bdev == data;
}

+static void bdev_uevent(struct block_device *bdev, enum kobject_action action)
+{
+ if (bdev->bd_disk) {
+ if (bdev->bd_part)
+ kobject_uevent(&bdev->bd_part->kobj, action, NULL);
+ else
+ kobject_uevent(&bdev->bd_disk->kobj, action, NULL);
+ }
+}
+
struct super_block *get_sb_bdev(struct file_system_type *fs_type,
int flags, const char *dev_name, void *data,
int (*fill_super)(struct super_block *, void *, int))
@@ -675,8 +686,10 @@
up_write(&s->s_umount);
deactivate_super(s);
s = ERR_PTR(error);
- } else
+ } else {
s->s_flags |= MS_ACTIVE;
+ bdev_uevent(bdev, KOBJ_MOUNT);
+ }
}

return s;
@@ -691,6 +704,8 @@
void kill_block_super(struct super_block *sb)
{
struct block_device *bdev = sb->s_bdev;
+
+ bdev_uevent(bdev, KOBJ_UMOUNT);
generic_shutdown_super(sb);
set_blocksize(bdev, sb->s_old_blocksize);
close_bdev_excl(bdev);
diff -Nru a/include/linux/kobject_uevent.h b/include/linux/kobject_uevent.h
--- a/include/linux/kobject_uevent.h 2004-09-15 14:15:54 -07:00
+++ b/include/linux/kobject_uevent.h 2004-09-15 14:15:54 -07:00
@@ -21,6 +21,7 @@
KOBJ_REMOVE = 0x01, /* remove event, for hotplug */
KOBJ_CHANGE = 0x02, /* a sysfs attribute file has changed */
KOBJ_MOUNT = 0x03, /* mount event for block devices */
+ KOBJ_UMOUNT = 0x04, /* umount event for block devices */
KOBJ_MAX_ACTION, /* must be last action listed */
};

diff -Nru a/lib/kobject_uevent.c b/lib/kobject_uevent.c
--- a/lib/kobject_uevent.c 2004-09-15 14:15:54 -07:00
+++ b/lib/kobject_uevent.c 2004-09-15 14:15:54 -07:00
@@ -32,6 +32,7 @@
"remove", /* 0x01 */
"change", /* 0x02 */
"mount", /* 0x03 */
+ "umount", /* 0x04 */
};

static char *action_to_string(enum kobject_action action)

2004-09-15 20:56:27

by Robert Love

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, 2004-09-15 at 13:47 -0700, Tim Hockin wrote:

> Are you not sending it with some specific device as the source? Or is it
> just coming from some abstract root kobject?

It comes the the physical device.

Is there really a specific issue that you are seeing?

Robert Love


2004-09-15 20:26:49

by Tim Hockin

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, Sep 15, 2004 at 04:10:43PM -0400, Robert Love wrote:
> On Wed, 2004-09-15 at 12:40 -0700, Greg KH wrote:
>
> > And here's the patch I applied to my trees and will show up in the next
> > -mm release.
> >
> > I'll go convert Kay's mount patch to the new interface and add it too.
>
> I think you want an "unmount" signal, too.

So do we actually plan to handle namespaces at all wrt kevents?

2004-09-16 01:22:31

by Herbert Poetzl

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, Sep 15, 2004 at 05:38:39PM -0400, Robert Love wrote:
> On Wed, 2004-09-15 at 14:34 -0700, Tim Hockin wrote:
>
> > It's a can of worms, is what it is. And I'm not sure what a good fix
> > would be. Would it just be enough to send a generic "mount-table changed"
> > event, and let userspace figure out the rest?
>
> "Can of worms" is a tough description for something that there is no
> practical security issue for, just a lot of hand waving. No one even
> uses name spaces.

ah, sorry, that is wrong, we (linux-vserver)
_do_ use namespaces extensively, and probably
other 'jail' solutions will use it too ...

best,
Herbert

> Anyhow, I already said that we could send out a generic kobject instead
> of the one tied to the specific device.
>
> > Or really, why is the kernel broadcasting a mount, which originated in
> > userland. Couldn't mount (or a mount wrapper) do that? It's already
> > running in the right namespace...
>
> In practice stuff like that never works. Besides, it is not mount(1)
> that we want to wrap but the mount(2) system call. And, uh, I'd rather
> stab myself than try to get that patch by Uli.
>
> Robert Love
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2004-09-16 04:08:56

by Greg KH

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Thu, Sep 16, 2004 at 03:21:04AM +0200, Herbert Poetzl wrote:
> On Wed, Sep 15, 2004 at 05:38:39PM -0400, Robert Love wrote:
> > On Wed, 2004-09-15 at 14:34 -0700, Tim Hockin wrote:
> >
> > > It's a can of worms, is what it is. And I'm not sure what a good fix
> > > would be. Would it just be enough to send a generic "mount-table changed"
> > > event, and let userspace figure out the rest?
> >
> > "Can of worms" is a tough description for something that there is no
> > practical security issue for, just a lot of hand waving. No one even
> > uses name spaces.
>
> ah, sorry, that is wrong, we (linux-vserver)
> _do_ use namespaces extensively, and probably
> other 'jail' solutions will use it too ...

Great. So, how do you handle the /sbin/hotplug call today? How would
you want to handle this kevent notifier?

thanks,

greg k-h

2004-09-16 14:11:52

by Herbert Poetzl

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Wed, Sep 15, 2004 at 09:08:21PM -0700, Greg KH wrote:
> On Thu, Sep 16, 2004 at 03:21:04AM +0200, Herbert Poetzl wrote:
> > On Wed, Sep 15, 2004 at 05:38:39PM -0400, Robert Love wrote:
> > > On Wed, 2004-09-15 at 14:34 -0700, Tim Hockin wrote:
> > >
> > > > It's a can of worms, is what it is. And I'm not sure what a good fix
> > > > would be. Would it just be enough to send a generic "mount-table changed"
> > > > event, and let userspace figure out the rest?
> > >
> > > "Can of worms" is a tough description for something that there is no
> > > practical security issue for, just a lot of hand waving. No one even
> > > uses name spaces.
> >
> > ah, sorry, that is wrong, we (linux-vserver)
> > _do_ use namespaces extensively, and probably
> > other 'jail' solutions will use it too ...
>
> Great.
> So, how do you handle the /sbin/hotplug call today?

most of the time, not at all, but if, then in the
'initial' namespace where other userspace helpers
are handled too (means on the host)

> How would you want to handle this kevent notifier?

if there was a notifier telling about mounts - real
and virtual - then they would make sense _inside_
the respective namespace they happen .. e.g.

usb-device attached:
helper is called on host, and does some stuff
result is a mount of some device, which happens
on the host/initial namespace, notifier happens
there ...

process in namespace does --bind mount:
this might be interesting for the host too, but
probably it is more useful for the namespace
where it happened ...

but keep in mind, that might not be true for the
usual, we put our sendmail in a separate namespace

best,
Herbert

> thanks,
>
> greg k-h
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2004-09-16 15:21:57

by Greg KH

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Thu, Sep 16, 2004 at 04:10:08PM +0200, Herbert Poetzl wrote:
> On Wed, Sep 15, 2004 at 09:08:21PM -0700, Greg KH wrote:
> > On Thu, Sep 16, 2004 at 03:21:04AM +0200, Herbert Poetzl wrote:
> > > On Wed, Sep 15, 2004 at 05:38:39PM -0400, Robert Love wrote:
> > > > On Wed, 2004-09-15 at 14:34 -0700, Tim Hockin wrote:
> > > >
> > > > > It's a can of worms, is what it is. And I'm not sure what a good fix
> > > > > would be. Would it just be enough to send a generic "mount-table changed"
> > > > > event, and let userspace figure out the rest?
> > > >
> > > > "Can of worms" is a tough description for something that there is no
> > > > practical security issue for, just a lot of hand waving. No one even
> > > > uses name spaces.
> > >
> > > ah, sorry, that is wrong, we (linux-vserver)
> > > _do_ use namespaces extensively, and probably
> > > other 'jail' solutions will use it too ...
> >
> > Great.
> > So, how do you handle the /sbin/hotplug call today?
>
> most of the time, not at all, but if, then in the
> 'initial' namespace where other userspace helpers
> are handled too (means on the host)

Ok, then you could handle the kevent message the same way, right?

> > How would you want to handle this kevent notifier?
>
> if there was a notifier telling about mounts - real
> and virtual - then they would make sense _inside_
> the respective namespace they happen .. e.g.
>
> usb-device attached:
> helper is called on host, and does some stuff
> result is a mount of some device, which happens
> on the host/initial namespace, notifier happens
> there ...
>
> process in namespace does --bind mount:
> this might be interesting for the host too, but
> probably it is more useful for the namespace
> where it happened ...

But in which namespace did it happen? That's probaby the hard part to
determine, right?

thanks,

greg k-h

2004-09-16 18:37:08

by Herbert Poetzl

[permalink] [raw]
Subject: Re: [patch] kernel sysfs events layer

On Thu, Sep 16, 2004 at 08:08:05AM -0700, Greg KH wrote:
> On Thu, Sep 16, 2004 at 04:10:08PM +0200, Herbert Poetzl wrote:
> > On Wed, Sep 15, 2004 at 09:08:21PM -0700, Greg KH wrote:
> > > On Thu, Sep 16, 2004 at 03:21:04AM +0200, Herbert Poetzl wrote:
> > > > On Wed, Sep 15, 2004 at 05:38:39PM -0400, Robert Love wrote:
> > > > > On Wed, 2004-09-15 at 14:34 -0700, Tim Hockin wrote:
> > > > >
> > > > > > It's a can of worms, is what it is. And I'm not sure what a good fix
> > > > > > would be. Would it just be enough to send a generic "mount-table changed"
> > > > > > event, and let userspace figure out the rest?
> > > > >
> > > > > "Can of worms" is a tough description for something that there is no
> > > > > practical security issue for, just a lot of hand waving. No one even
> > > > > uses name spaces.
> > > >
> > > > ah, sorry, that is wrong, we (linux-vserver)
> > > > _do_ use namespaces extensively, and probably
> > > > other 'jail' solutions will use it too ...
> > >
> > > Great.
> > > So, how do you handle the /sbin/hotplug call today?
> >
> > most of the time, not at all, but if, then in the
> > 'initial' namespace where other userspace helpers
> > are handled too (means on the host)
>
> Ok, then you could handle the kevent message the same way, right?

yes probably ...

> > > How would you want to handle this kevent notifier?
> >
> > if there was a notifier telling about mounts - real
> > and virtual - then they would make sense _inside_
> > the respective namespace they happen .. e.g.
> >
> > usb-device attached:
> > helper is called on host, and does some stuff
> > result is a mount of some device, which happens
> > on the host/initial namespace, notifier happens
> > there ...
> >
> > process in namespace does --bind mount:
> > this might be interesting for the host too, but
> > probably it is more useful for the namespace
> > where it happened ...
>
> But in which namespace did it happen? That's probaby the hard part to
> determine, right?

well, in the mount --bind case, not really, because
the current process already identifies a namespace
which could be used for that ...

best,
Herbert

> thanks,
>
> greg k-h
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/