This patch adds a fork connector in the do_fork() routine. It sends a
netlink datagram when enabled. The message can be read by a user space
application. By this way, the user space application is alerted when a
fork occurs.
It uses the userspace <-> kernelspace connector that works on top of
the netlink protocol. The fork connector is enabled or disabled by
sending a message to the connector. The unique sequence number of
messages can be used to check if a message is lost.
When the fork connector is disabled, the "lat_proc fork" test returns
a value equal to:
Process fork+exit: 154.6944 microseconds
When the fork connector is enabled, the same test returns:
Process fork+exit: 165.6667 microseconds
So the overhead (the construction and the sending of the message) is
around 7%.
This patch applies to 2.6.11-mm4. Some other patches are needed that
fix problems in the connector.c file. At least, you need to apply the
patch provided in the second email.
Signed-off-by: Guillaume Thouvenin <[email protected]>
---
drivers/connector/Kconfig | 11 +++++
drivers/connector/Makefile | 1
drivers/connector/cn_fork.c | 85 ++++++++++++++++++++++++++++++++++++++++++++
include/linux/connector.h | 4 ++
kernel/fork.c | 44 ++++++++++++++++++++++
5 files changed, 145 insertions(+)
Index: linux-2.6.11-mm4-cnfork/drivers/connector/Kconfig
===================================================================
--- linux-2.6.11-mm4-cnfork.orig/drivers/connector/Kconfig 2005-03-16 14:21:46.000000000 +0100
+++ linux-2.6.11-mm4-cnfork/drivers/connector/Kconfig 2005-03-16 14:34:41.000000000 +0100
@@ -10,4 +10,15 @@
Connector support can also be built as a module. If so, the module
will be called cn.ko.
+config FORK_CONNECTOR
+ bool "Enable fork connector"
+ depends on CONNECTOR=y
+ default y
+ ---help---
+ It adds a connector in kernel/fork.c:do_fork() function. When a fork
+ occurs, netlink is used to transfer information about the parent and
+ its child. This information can be used by a user space application.
+ The fork connector can be enable/disable by sending a message to the
+ connector with the corresponding group id.
+
endmenu
Index: linux-2.6.11-mm4-cnfork/drivers/connector/Makefile
===================================================================
--- linux-2.6.11-mm4-cnfork.orig/drivers/connector/Makefile 2005-03-16 14:21:46.000000000 +0100
+++ linux-2.6.11-mm4-cnfork/drivers/connector/Makefile 2005-03-16 14:34:41.000000000 +0100
@@ -1,2 +1,3 @@
obj-$(CONFIG_CONNECTOR) += cn.o
+obj-$(CONFIG_FORK_CONNECTOR) += cn_fork.o
cn-objs := cn_queue.o connector.o
Index: linux-2.6.11-mm4-cnfork/drivers/connector/cn_fork.c
===================================================================
--- linux-2.6.11-mm4-cnfork.orig/drivers/connector/cn_fork.c 2003-01-30 11:24:37.000000000 +0100
+++ linux-2.6.11-mm4-cnfork/drivers/connector/cn_fork.c 2005-03-16 14:34:41.000000000 +0100
@@ -0,0 +1,85 @@
+/*
+ * cn_fork.c
+ *
+ * 2005 Copyright (c) Guillaume Thouvenin <[email protected]>
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/init.h>
+
+#include <linux/connector.h>
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Guillaume Thouvenin <[email protected]>");
+MODULE_DESCRIPTION("Enable or disable the usage of the fork connector");
+
+int cn_fork_enable = 0;
+struct cb_id cb_fork_id = { CN_IDX_FORK, CN_VAL_FORK };
+
+/**
+ * cn_fork_callback - enable or disable the fork connector
+ * @data: message send by the connector
+ *
+ * The callback allows to enable or disable the sending of information
+ * about fork in the do_fork() routine. To enable the fork, the user
+ * space application must send the integer 1 in the data part of the
+ * message. To disable the fork connector, it must send the integer 0.
+ */
+static void cn_fork_callback(void *data)
+{
+ struct cn_msg *msg = (struct cn_msg *)data;
+
+ if (cn_already_initialized && (msg->len == sizeof(cn_fork_enable)))
+ memcpy(&cn_fork_enable, msg->data, sizeof(cn_fork_enable));
+}
+
+/**
+ * cn_fork_init - initialization entry point
+ *
+ * This routine will be run at kernel boot time because this driver is
+ * built in the kernel. It adds the connector callback to the connector
+ * driver.
+ */
+static int cn_fork_init(void)
+{
+ int err;
+
+ err = cn_add_callback(&cb_fork_id, "cn_fork", &cn_fork_callback);
+ if (err) {
+ printk(KERN_WARNING "Failed to register cn_fork\n");
+ return -EINVAL;
+ }
+
+ printk(KERN_NOTICE "cn_fork is registered\n");
+ return 0;
+}
+
+/**
+ * cn_fork_exit - exit entry point
+ *
+ * As this driver is always statically compiled into the kernel the
+ * cn_fork_exit has no effect.
+ */
+static void cn_fork_exit(void)
+{
+ cn_del_callback(&cb_fork_id);
+}
+
+module_init(cn_fork_init);
+module_exit(cn_fork_exit);
Index: linux-2.6.11-mm4-cnfork/include/linux/connector.h
===================================================================
--- linux-2.6.11-mm4-cnfork.orig/include/linux/connector.h 2005-03-16 14:21:49.000000000 +0100
+++ linux-2.6.11-mm4-cnfork/include/linux/connector.h 2005-03-16 14:34:41.000000000 +0100
@@ -28,6 +28,8 @@
#define CN_VAL_KOBJECT_UEVENT 0x0000
#define CN_IDX_SUPERIO 0xaabb /* SuperIO subsystem */
#define CN_VAL_SUPERIO 0xccdd
+#define CN_IDX_FORK 0xfeed /* fork events */
+#define CN_VAL_FORK 0xbeef
#define CONNECTOR_MAX_MSG_SIZE 1024
@@ -133,6 +135,8 @@
};
extern int cn_already_initialized;
+extern int cn_fork_enable;
+extern struct cb_id cb_fork_id;
int cn_add_callback(struct cb_id *, char *, void (* callback)(void *));
void cn_del_callback(struct cb_id *);
Index: linux-2.6.11-mm4-cnfork/kernel/fork.c
===================================================================
--- linux-2.6.11-mm4-cnfork.orig/kernel/fork.c 2005-03-16 14:21:49.000000000 +0100
+++ linux-2.6.11-mm4-cnfork/kernel/fork.c 2005-03-16 14:34:41.000000000 +0100
@@ -41,6 +41,7 @@
#include <linux/profile.h>
#include <linux/rmap.h>
#include <linux/acct.h>
+#include <linux/connector.h>
#include <asm/pgtable.h>
#include <asm/pgalloc.h>
@@ -63,6 +64,47 @@
EXPORT_SYMBOL(tasklist_lock);
+#ifdef CONFIG_FORK_CONNECTOR
+
+#define CN_FORK_INFO_SIZE 64
+#define CN_FORK_MSG_SIZE (sizeof(struct cn_msg) + CN_FORK_INFO_SIZE)
+
+static inline void fork_connector(pid_t parent, pid_t child)
+{
+ static DEFINE_SPINLOCK(cn_fork_lock);
+ static __u32 seq; /* used to test if message is lost */
+
+ if (cn_fork_enable) {
+ struct cn_msg *msg;
+
+ __u8 buffer[CN_FORK_MSG_SIZE];
+
+ msg = (struct cn_msg *)buffer;
+
+ memcpy(&msg->id, &cb_fork_id, sizeof(msg->id));
+ spin_lock(&cn_fork_lock);
+ msg->seq = seq++;
+ spin_unlock(&cn_fork_lock);
+ msg->ack = 0; /* not used */
+ /*
+ * size of data is the number of characters
+ * printed plus one for the trailing '\0'
+ */
+ /* just fill the data part with '\0' */
+ memset(msg->data, '\0', CN_FORK_INFO_SIZE);
+ msg->len = scnprintf(msg->data, CN_FORK_INFO_SIZE-1,
+ "%i %i", parent, child) + 1;
+
+ cn_netlink_send(msg, CN_IDX_FORK);
+ }
+}
+#else
+static inline void fork_connector(pid_t parent, pid_t child)
+{
+ return;
+}
+#endif /* CONFIG_FORK_CONNECTOR */
+
int nr_processes(void)
{
int cpu;
@@ -1253,6 +1295,8 @@
if (unlikely (current->ptrace & PT_TRACE_VFORK_DONE))
ptrace_notify ((PTRACE_EVENT_VFORK_DONE << 8) | SIGTRAP);
}
+
+ fork_connector(current->pid, p->pid);
} else {
free_pidmap(pid);
pid = PTR_ERR(p);
On Thursday, March 17, 2005 1:04 am, Guillaume Thouvenin wrote:
> +static inline void fork_connector(pid_t parent, pid_t child)
> +{
> + static DEFINE_SPINLOCK(cn_fork_lock);
> + static __u32 seq; /* used to test if message is lost */
> +
> + if (cn_fork_enable) {
> + struct cn_msg *msg;
> +
> + __u8 buffer[CN_FORK_MSG_SIZE];
> +
> + msg = (struct cn_msg *)buffer;
> +
> + memcpy(&msg->id, &cb_fork_id, sizeof(msg->id));
> + spin_lock(&cn_fork_lock);
> + msg->seq = seq++;
> + spin_unlock(&cn_fork_lock);
As I mentioned before, this won't work very well on a large CPU count system.
cn_fork_lock will be taken by each CPU everytime it does a fork, meaning that
forks will be very slow if lots of CPUs are doing them at the same time. Is
there a more scalable way to ensure message delivery?
Jesse
On Thursday, March 17, 2005 1:38 pm, Evgeniy Polyakov wrote:
> The most significant part there - is requirement to store
> u32 seq in each CPU's cache and thus flush cacheline +
> invalidate/get from mem on each other cpus
> each time it is accessed, which is a big price.
Same thing has to happen with the lock. To put it simply, writing global
variables from multiple CPUs with anything other than very low frequency is
bad.
> It is totally Guillaume's work - so he decides,
> I would recomend per cpu counters and processor's
> id in each message.
> And of course userspace should take care of misordered
> messages.
> I personally prefer such mechanism.
Yep, I agree. Hopefully Guillaume will too :)
Jesse
On Thu, 17 Mar 2005 08:56:57 -0800
Jesse Barnes <[email protected]> wrote:
> On Thursday, March 17, 2005 1:04 am, Guillaume Thouvenin wrote:
> > +static inline void fork_connector(pid_t parent, pid_t child)
> > +{
> > + static DEFINE_SPINLOCK(cn_fork_lock);
> > + static __u32 seq; /* used to test if message is lost */
> > +
> > + if (cn_fork_enable) {
> > + struct cn_msg *msg;
> > +
> > + __u8 buffer[CN_FORK_MSG_SIZE];
> > +
> > + msg = (struct cn_msg *)buffer;
> > +
> > + memcpy(&msg->id, &cb_fork_id, sizeof(msg->id));
> > + spin_lock(&cn_fork_lock);
> > + msg->seq = seq++;
> > + spin_unlock(&cn_fork_lock);
>
> As I mentioned before, this won't work very well on a large CPU count system.
> cn_fork_lock will be taken by each CPU everytime it does a fork, meaning that
> forks will be very slow if lots of CPUs are doing them at the same time. Is
Maybe... But..., concider ppc system,
each lock is about 10 instructions(or even less),
increment with return is about 3-5 instructions, unlock -
barrier() and setting.
The whole fork syscall contains too bigger number
of instruction(do_fork() itself is more than 500,
and it is not counting number of instructions in
functions that are called from do_fork())
to care about 20 idle on each CPU,
even if there are 512 of them.
The most significant part there - is requirement to store
u32 seq in each CPU's cache and thus flush cacheline +
invalidate/get from mem on each other cpus
each time it is accessed, which is a big price.
> there a more scalable way to ensure message delivery?
It is totally Guillaume's work - so he decides,
I would recomend per cpu counters and processor's
id in each message.
And of course userspace should take care of misordered
messages.
I personally prefer such mechanism.
Guillaume?
> Jesse
Evgeniy Polyakov
Only failure makes us experts. -- Theo de Raadt
On Thu, 2005-03-17 at 14:05 -0800, Jesse Barnes wrote:
> On Thursday, March 17, 2005 1:38 pm, Evgeniy Polyakov wrote:
> > The most significant part there - is requirement to store
> > u32 seq in each CPU's cache and thus flush cacheline +
> > invalidate/get from mem on each other cpus
> > each time it is accessed, which is a big price.
>
> Same thing has to happen with the lock. To put it simply, writing global
> variables from multiple CPUs with anything other than very low frequency is
> bad.
>
> > It is totally Guillaume's work - so he decides,
> > I would recomend per cpu counters and processor's
> > id in each message.
> > And of course userspace should take care of misordered
> > messages.
> > I personally prefer such mechanism.
>
> Yep, I agree. Hopefully Guillaume will too :)
Sure, I agree and I will implement the per cpu counters solution with a
processor id in each message.
Thank you very much Jesse and Evgeniy for your great help,
Best regards,
Guillaume
ChangeLog:
- Remove the global cn_fork_lock and replace it by a per CPU
counter.
- The processor ID has been added in the data part of the message.
Thus datas sent in a message are: "CPU_ID PARENT_PID CHILD_PID"
Those modifications were done to be more scalable because, as
mentioned by Jesse Barnes, the global cn_fork_lock won't work well on a
large CPU system.
This patch applies to 2.6.11-mm4.
Regards,
Guillaume
---
drivers/connector/Kconfig | 11 +++++
drivers/connector/Makefile | 1
drivers/connector/cn_fork.c | 85 ++++++++++++++++++++++++++++++++++++++++++++
include/linux/connector.h | 4 ++
kernel/fork.c | 44 ++++++++++++++++++++++
5 files changed, 145 insertions(+)
Index: linux-2.6.11-mm4-cnfork/drivers/connector/Kconfig
===================================================================
--- linux-2.6.11-mm4-cnfork.orig/drivers/connector/Kconfig 2005-03-16 14:21:46.000000000 +0100
+++ linux-2.6.11-mm4-cnfork/drivers/connector/Kconfig 2005-03-16 14:34:41.000000000 +0100
@@ -10,4 +10,15 @@
Connector support can also be built as a module. If so, the module
will be called cn.ko.
+config FORK_CONNECTOR
+ bool "Enable fork connector"
+ depends on CONNECTOR=y
+ default y
+ ---help---
+ It adds a connector in kernel/fork.c:do_fork() function. When a fork
+ occurs, netlink is used to transfer information about the parent and
+ its child. This information can be used by a user space application.
+ The fork connector can be enable/disable by sending a message to the
+ connector with the corresponding group id.
+
endmenu
Index: linux-2.6.11-mm4-cnfork/drivers/connector/Makefile
===================================================================
--- linux-2.6.11-mm4-cnfork.orig/drivers/connector/Makefile 2005-03-16 14:21:46.000000000 +0100
+++ linux-2.6.11-mm4-cnfork/drivers/connector/Makefile 2005-03-16 14:34:41.000000000 +0100
@@ -1,2 +1,3 @@
obj-$(CONFIG_CONNECTOR) += cn.o
+obj-$(CONFIG_FORK_CONNECTOR) += cn_fork.o
cn-objs := cn_queue.o connector.o
Index: linux-2.6.11-mm4-cnfork/drivers/connector/cn_fork.c
===================================================================
--- linux-2.6.11-mm4-cnfork.orig/drivers/connector/cn_fork.c 2003-01-30 11:24:37.000000000 +0100
+++ linux-2.6.11-mm4-cnfork/drivers/connector/cn_fork.c 2005-03-16 14:34:41.000000000 +0100
@@ -0,0 +1,85 @@
+/*
+ * cn_fork.c
+ *
+ * 2005 Copyright (c) Guillaume Thouvenin <[email protected]>
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/init.h>
+
+#include <linux/connector.h>
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Guillaume Thouvenin <[email protected]>");
+MODULE_DESCRIPTION("Enable or disable the usage of the fork connector");
+
+int cn_fork_enable = 0;
+struct cb_id cb_fork_id = { CN_IDX_FORK, CN_VAL_FORK };
+
+/**
+ * cn_fork_callback - enable or disable the fork connector
+ * @data: message send by the connector
+ *
+ * The callback allows to enable or disable the sending of information
+ * about fork in the do_fork() routine. To enable the fork, the user
+ * space application must send the integer 1 in the data part of the
+ * message. To disable the fork connector, it must send the integer 0.
+ */
+static void cn_fork_callback(void *data)
+{
+ struct cn_msg *msg = (struct cn_msg *)data;
+
+ if (cn_already_initialized && (msg->len == sizeof(cn_fork_enable)))
+ memcpy(&cn_fork_enable, msg->data, sizeof(cn_fork_enable));
+}
+
+/**
+ * cn_fork_init - initialization entry point
+ *
+ * This routine will be run at kernel boot time because this driver is
+ * built in the kernel. It adds the connector callback to the connector
+ * driver.
+ */
+static int cn_fork_init(void)
+{
+ int err;
+
+ err = cn_add_callback(&cb_fork_id, "cn_fork", &cn_fork_callback);
+ if (err) {
+ printk(KERN_WARNING "Failed to register cn_fork\n");
+ return -EINVAL;
+ }
+
+ printk(KERN_NOTICE "cn_fork is registered\n");
+ return 0;
+}
+
+/**
+ * cn_fork_exit - exit entry point
+ *
+ * As this driver is always statically compiled into the kernel the
+ * cn_fork_exit has no effect.
+ */
+static void cn_fork_exit(void)
+{
+ cn_del_callback(&cb_fork_id);
+}
+
+module_init(cn_fork_init);
+module_exit(cn_fork_exit);
Index: linux-2.6.11-mm4-cnfork/include/linux/connector.h
===================================================================
--- linux-2.6.11-mm4-cnfork.orig/include/linux/connector.h 2005-03-16 14:21:49.000000000 +0100
+++ linux-2.6.11-mm4-cnfork/include/linux/connector.h 2005-03-16 14:34:41.000000000 +0100
@@ -28,6 +28,8 @@
#define CN_VAL_KOBJECT_UEVENT 0x0000
#define CN_IDX_SUPERIO 0xaabb /* SuperIO subsystem */
#define CN_VAL_SUPERIO 0xccdd
+#define CN_IDX_FORK 0xfeed /* fork events */
+#define CN_VAL_FORK 0xbeef
#define CONNECTOR_MAX_MSG_SIZE 1024
@@ -133,6 +135,8 @@
};
extern int cn_already_initialized;
+extern int cn_fork_enable;
+extern struct cb_id cb_fork_id;
int cn_add_callback(struct cb_id *, char *, void (* callback)(void *));
void cn_del_callback(struct cb_id *);
Index: linux-2.6.11-mm4-cnfork/kernel/fork.c
===================================================================
--- linux-2.6.11-mm4-cnfork.orig/kernel/fork.c 2005-03-16 14:21:49.000000000 +0100
+++ linux-2.6.11-mm4-cnfork/kernel/fork.c 2005-03-21 13:13:25.000000000 +0100
@@ -41,6 +41,7 @@
#include <linux/profile.h>
#include <linux/rmap.h>
#include <linux/acct.h>
+#include <linux/connector.h>
#include <asm/pgtable.h>
#include <asm/pgalloc.h>
@@ -63,6 +64,47 @@
EXPORT_SYMBOL(tasklist_lock);
+#ifdef CONFIG_FORK_CONNECTOR
+
+#define CN_FORK_INFO_SIZE 64
+#define CN_FORK_MSG_SIZE (sizeof(struct cn_msg) + CN_FORK_INFO_SIZE)
+
+static DEFINE_PER_CPU(unsigned long, fork_counts);
+
+static inline void fork_connector(pid_t parent, pid_t child)
+{
+ if (cn_fork_enable) {
+ struct cn_msg *msg;
+ __u8 buffer[CN_FORK_MSG_SIZE];
+
+ msg = (struct cn_msg *)buffer;
+
+ memcpy(&msg->id, &cb_fork_id, sizeof(msg->id));
+
+ msg->ack = 0; /* not used */
+ msg->seq = get_cpu_var(fork_counts)++;
+
+ /*
+ * size of data is the number of characters
+ * printed plus one for the trailing '\0'
+ */
+ memset(msg->data, '\0', CN_FORK_INFO_SIZE);
+ msg->len = scnprintf(msg->data, CN_FORK_INFO_SIZE-1,
+ "%i %i %i",
+ smp_processor_id(), parent, child) + 1;
+
+ put_cpu_var(fork_counts);
+
+ cn_netlink_send(msg, CN_IDX_FORK);
+ }
+}
+#else
+static inline void fork_connector(pid_t parent, pid_t child)
+{
+ return;
+}
+#endif /* CONFIG_FORK_CONNECTOR */
+
int nr_processes(void)
{
int cpu;
@@ -1253,6 +1295,8 @@
if (unlikely (current->ptrace & PT_TRACE_VFORK_DONE))
ptrace_notify ((PTRACE_EVENT_VFORK_DONE << 8) | SIGTRAP);
}
+
+ fork_connector(current->pid, p->pid);
} else {
free_pidmap(pid);
pid = PTR_ERR(p);
On Mon, 2005-03-21 at 04:48, Guillaume Thouvenin wrote:
> ChangeLog:
>
> - Remove the global cn_fork_lock and replace it by a per CPU
> counter.
> - The processor ID has been added in the data part of the message.
> Thus datas sent in a message are: "CPU_ID PARENT_PID CHILD_PID"
>
> Those modifications were done to be more scalable because, as
> mentioned by Jesse Barnes, the global cn_fork_lock won't work well on a
> large CPU system.
>
> This patch applies to 2.6.11-mm4.
Guillaume,
If a bunch of applications are listening for fork events,
your patch allows any application to turn off the
fork event notification? Is this the right behavior?
Should'nt it turn off the fork-event notification when
the number of listeners become zero?
RP
On Mon, 2005-03-21 at 12:52 -0800, Ram wrote:
> On Mon, 2005-03-21 at 04:48, Guillaume Thouvenin wrote:
> > ChangeLog:
> >
> > - Remove the global cn_fork_lock and replace it by a per CPU
> > counter.
> > - The processor ID has been added in the data part of the message.
> > Thus datas sent in a message are: "CPU_ID PARENT_PID CHILD_PID"
> >
> > Those modifications were done to be more scalable because, as
> > mentioned by Jesse Barnes, the global cn_fork_lock won't work well on a
> > large CPU system.
> >
> > This patch applies to 2.6.11-mm4.
> Guillaume,
>
> If a bunch of applications are listening for fork events,
> your patch allows any application to turn off the
> fork event notification? Is this the right behavior?
>
> Should'nt it turn off the fork-event notification when
> the number of listeners become zero?
There is no number of listeners - netlink sockets provide multicast
dataflow.
[Although one can obtain that number].
As far as I can see, Guillaume's application is main management utility
-
it can turn on or off some feature, like "ip" can turn on or off
interfaces
without waiting when bounded processes decide to exit.
> RP
--
Evgeniy Polyakov
Crash is better than data corruption -- Arthur Grabowski
On Mon, 2005-03-21 at 12:52 -0800, Ram wrote:
> If a bunch of applications are listening for fork events,
> your patch allows any application to turn off the
> fork event notification? Is this the right behavior?
Yes it is. The main management is done by application so, if several
applications are listening for fork events you need to choose which one
will turn off the fork connector.
I want to keep this turn on/off mechanism simple but if it's needed I
can manage the variable "cn_fork_enable" as a counter. Thus the callback
could be something like:
static void cn_fork_callback(void *data)
{
int start;
struct cn_msg *msg = (struct cn_msg *)data;
if (cn_already_initialized && (msg->len == sizeof(cn_fork_enable))) {
memcpy(&start, msg->data, sizeof(cn_fork_enable));
if (start)
cn_fork_enable++;
else
cn_fork_enable > 0 ? cn_fork_enable-- : 0;
}
}
What do you think about this implementation?
Guillaume
Guillaume Thouvenin wrote:
> On Mon, 2005-03-21 at 12:52 -0800, Ram wrote:
>
>> If a bunch of applications are listening for fork events,
>> your patch allows any application to turn off the
>> fork event notification? Is this the right behavior?
>
>
> Yes it is. The main management is done by application so, if several
> applications are listening for fork events you need to choose which one
> will turn off the fork connector.
It is not practical. One listener never know who else are listening
to the fork connector.
We also need to protect yet another global variable "cn_fork_enable".
Also, in order to implement "cn_fork_enable" as a counter, we need
some sort of registration to prevent an application from sending
repeated "off" notification. One can only turn off its own switch.
Thanks,
- jay
>
> I want to keep this turn on/off mechanism simple but if it's needed I
> can manage the variable "cn_fork_enable" as a counter. Thus the callback
> could be something like:
>
> static void cn_fork_callback(void *data)
> {
> int start;
> struct cn_msg *msg = (struct cn_msg *)data;
>
> if (cn_already_initialized && (msg->len == sizeof(cn_fork_enable))) {
> memcpy(&start, msg->data, sizeof(cn_fork_enable));
> if (start)
> cn_fork_enable++;
> else
> cn_fork_enable > 0 ? cn_fork_enable-- : 0;
> }
> }
>
>
> What do you think about this implementation?
>
> Guillaume
On Mon, 2005-03-21 at 23:07, Guillaume Thouvenin wrote:
> On Mon, 2005-03-21 at 12:52 -0800, Ram wrote:
> > If a bunch of applications are listening for fork events,
> > your patch allows any application to turn off the
> > fork event notification? Is this the right behavior?
>
> Yes it is. The main management is done by application so, if several
> applications are listening for fork events you need to choose which one
> will turn off the fork connector.
>
> I want to keep this turn on/off mechanism simple but if it's needed I
> can manage the variable "cn_fork_enable" as a counter. Thus the callback
> could be something like:
>
> static void cn_fork_callback(void *data)
> {
> int start;
> struct cn_msg *msg = (struct cn_msg *)data;
>
> if (cn_already_initialized && (msg->len == sizeof(cn_fork_enable))) {
> memcpy(&start, msg->data, sizeof(cn_fork_enable));
> if (start)
> cn_fork_enable++;
> else
> cn_fork_enable > 0 ? cn_fork_enable-- : 0;
> }
> }
I think a better way is:
Providing a different connector channel called the administrator
channel which can be used only by a super-user, and gives you
the ability to switch on or off any connector channel including the
fork-connector channel.
For lack of better term I am using the word 'channel' to mean
something that carries events of particular type through the
connector-infrastructure.
RP
>
>
> What do you think about this implementation?
>
> Guillaume
>
On Mon, 2005-03-21 at 20:36, Evgeniy Polyakov wrote:
> On Mon, 2005-03-21 at 12:52 -0800, Ram wrote:
> > On Mon, 2005-03-21 at 04:48, Guillaume Thouvenin wrote:
> > > ChangeLog:
> > >
> > > - Remove the global cn_fork_lock and replace it by a per CPU
> > > counter.
> > > - The processor ID has been added in the data part of the message.
> > > Thus datas sent in a message are: "CPU_ID PARENT_PID CHILD_PID"
> > >
> > > Those modifications were done to be more scalable because, as
> > > mentioned by Jesse Barnes, the global cn_fork_lock won't work well on a
> > > large CPU system.
> > >
> > > This patch applies to 2.6.11-mm4.
> > Guillaume,
> >
> > If a bunch of applications are listening for fork events,
> > your patch allows any application to turn off the
> > fork event notification? Is this the right behavior?
> >
> > Should'nt it turn off the fork-event notification when
> > the number of listeners become zero?
>
> There is no number of listeners - netlink sockets provide multicast
> dataflow.
> [Although one can obtain that number].
>
> As far as I can see, Guillaume's application is main management utility
> -
Yes. agreed. But again nothing stops one of the many application
listening on the multicast events from stopping the fork-events.
Even though Guillame's application claims itself the main management
utility, nothing stops another application from assuming the management
role.
I think if the the connector infrastructure provides a administrative
kind of channel, (the one I mentioned in the reply to Guillame) that
should help get better control on the management aspects of the
event stream.
RP
>
> it can turn on or off some feature, like "ip" can turn on or off
> interfaces
> without waiting when bounded processes decide to exit.
> > RP
On Tue, 22 Mar 2005 10:26:19 -0800
Ram <[email protected]> wrote:
> On Mon, 2005-03-21 at 23:07, Guillaume Thouvenin wrote:
> > On Mon, 2005-03-21 at 12:52 -0800, Ram wrote:
> > > If a bunch of applications are listening for fork events,
> > > your patch allows any application to turn off the
> > > fork event notification? Is this the right behavior?
> >
> > Yes it is. The main management is done by application so, if several
> > applications are listening for fork events you need to choose which one
> > will turn off the fork connector.
> >
> > I want to keep this turn on/off mechanism simple but if it's needed I
> > can manage the variable "cn_fork_enable" as a counter. Thus the callback
> > could be something like:
> >
> > static void cn_fork_callback(void *data)
> > {
> > int start;
> > struct cn_msg *msg = (struct cn_msg *)data;
> >
> > if (cn_already_initialized && (msg->len == sizeof(cn_fork_enable))) {
> > memcpy(&start, msg->data, sizeof(cn_fork_enable));
> > if (start)
> > cn_fork_enable++;
> > else
> > cn_fork_enable > 0 ? cn_fork_enable-- : 0;
> > }
> > }
>
> I think a better way is:
>
> Providing a different connector channel called the administrator
> channel which can be used only by a super-user, and gives you
> the ability to switch on or off any connector channel including the
> fork-connector channel.
Only super-user can bind netlink socket to multicast group.
> For lack of better term I am using the word 'channel' to mean
> something that carries events of particular type through the
> connector-infrastructure.
I still do not see why it is needed.
Super-user can run ip command and turn network interface off
not waiting while apache or named exits or unbind.
In theory I can create some kind of userspace registration mechanism,
when userspace application reports it's pid to the connector,
and then it sends data to the specified pids, but does not
allow controlling from userspace.
But I really do not think it is a good idea to permit
non-priviledged userspace processes to know about deep
kernel internals through connector's messages.
> RP
>
>
> >
> >
> > What do you think about this implementation?
> >
> > Guillaume
> >
Evgeniy Polyakov
Only failure makes us experts. -- Theo de Raadt
On Tue, 2005-03-22 at 11:22, Evgeniy Polyakov wrote:
> On Tue, 22 Mar 2005 10:26:19 -0800
> Ram <[email protected]> wrote:
>
> > On Mon, 2005-03-21 at 23:07, Guillaume Thouvenin wrote:
> > > On Mon, 2005-03-21 at 12:52 -0800, Ram wrote:
> > > > If a bunch of applications are listening for fork events,
> > > > your patch allows any application to turn off the
> > > > fork event notification? Is this the right behavior?
> > >
> > > Yes it is. The main management is done by application so, if several
> > > applications are listening for fork events you need to choose which one
> > > will turn off the fork connector.
> > >
> > > I want to keep this turn on/off mechanism simple but if it's needed I
> > > can manage the variable "cn_fork_enable" as a counter. Thus the callback
> > > could be something like:
> > >
> > > static void cn_fork_callback(void *data)
> > > {
> > > int start;
> > > struct cn_msg *msg = (struct cn_msg *)data;
> > >
> > > if (cn_already_initialized && (msg->len == sizeof(cn_fork_enable))) {
> > > memcpy(&start, msg->data, sizeof(cn_fork_enable));
> > > if (start)
> > > cn_fork_enable++;
> > > else
> > > cn_fork_enable > 0 ? cn_fork_enable-- : 0;
> > > }
> > > }
> >
> > I think a better way is:
> >
> > Providing a different connector channel called the administrator
> > channel which can be used only by a super-user, and gives you
> > the ability to switch on or off any connector channel including the
> > fork-connector channel.
>
> Only super-user can bind netlink socket to multicast group.
ok. I did not realize that.
>
> > For lack of better term I am using the word 'channel' to mean
> > something that carries events of particular type through the
> > connector-infrastructure.
>
> I still do not see why it is needed.
> Super-user can run ip command and turn network interface off
> not waiting while apache or named exits or unbind.
>
> In theory I can create some kind of userspace registration mechanism,
> when userspace application reports it's pid to the connector,
> and then it sends data to the specified pids, but does not
> allow controlling from userspace.
> But I really do not think it is a good idea to permit
> non-priviledged userspace processes to know about deep
> kernel internals through connector's messages.
Yes. non-priviledged userspace processes should not know
any deep kernel internals through connector events.
I think what I am driving at is, an application that is critically
dependent on the fork-notification, suddenly stops receiving such
notification because some other application has switched off the
service without its notice.
the reason I am concerned is I am planning to feed this fork-events
to my in-kernel module. Side note: I would really like support for
in-kernel listners through connector infrastructure.
RP
>
> > RP
> >
> >
> > >
> > >
> > > What do you think about this implementation?
> > >
> > > Guillaume
> > >
>
>
> Evgeniy Polyakov
>
> Only failure makes us experts. -- Theo de Raadt
On Tue, 22 Mar 2005 11:18:07 -0800
Ram <[email protected]> wrote:
> > I still do not see why it is needed.
> > Super-user can run ip command and turn network interface off
> > not waiting while apache or named exits or unbind.
> >
> > In theory I can create some kind of userspace registration mechanism,
> > when userspace application reports it's pid to the connector,
> > and then it sends data to the specified pids, but does not
> > allow controlling from userspace.
> > But I really do not think it is a good idea to permit
> > non-priviledged userspace processes to know about deep
> > kernel internals through connector's messages.
>
> Yes. non-priviledged userspace processes should not know
> any deep kernel internals through connector events.
>
> I think what I am driving at is, an application that is critically
> dependent on the fork-notification, suddenly stops receiving such
> notification because some other application has switched off the
> service without its notice.
>
> the reason I am concerned is I am planning to feed this fork-events
> to my in-kernel module. Side note: I would really like support for
> in-kernel listners through connector infrastructure.
fork connector can be extended to send to the rest of the world
information that it was turned off or on.
The easiest way to listen inside the kernel for connector's
notifications is creating appropriate socket in userspace
and then obtain it through the following code which must be
run in context of the process that created above socket:
struct file *file;
struct inode *inode;
int err = 0;
file = fget(userspace_socket_number_in_the_current_process_context);
if (!file) {
return -ENODEV;
}
inode = file->f_dentry->d_inode;
if (inode->i_sock) {
sock = SOCKET_I(inode);
err = 0;
} else {
fput(file);
err = -ENODEV;
}
sock is struct sock, which then can be used for read/write
operations using
kernel_recvmsg(sock, &msg, &iov, 1, size, 0);
kernel_sendmsg(sock, &msg, &iov, 1, size);
with the following initialisation:
sock->sk->sk_allocation |= GFP_NOIO;
iov.iov_base = buf;
iov.iov_len = size;
msg.msg_name = NULL;
msg.msg_namelen = 0;
msg.msg_control = NULL;
msg.msg_controllen = 0;
msg.msg_namelen = 0;
msg.msg_flags = msg_flags | MSG_NOSIGNAL;
Of course it can be done directly from kernelspace without
touching process's structures, but this way is easier.
> RP
Evgeniy Polyakov
Only failure makes us experts. -- Theo de Raadt
On Tue, 2005-03-22 at 12:25, Evgeniy Polyakov wrote:
> On Tue, 22 Mar 2005 11:18:07 -0800
> Ram <[email protected]> wrote:
>
> > > I still do not see why it is needed.
> > > Super-user can run ip command and turn network interface off
> > > not waiting while apache or named exits or unbind.
> > >
> > > In theory I can create some kind of userspace registration mechanism,
> > > when userspace application reports it's pid to the connector,
> > > and then it sends data to the specified pids, but does not
> > > allow controlling from userspace.
> > > But I really do not think it is a good idea to permit
> > > non-priviledged userspace processes to know about deep
> > > kernel internals through connector's messages.
> >
> > Yes. non-priviledged userspace processes should not know
> > any deep kernel internals through connector events.
> >
> > I think what I am driving at is, an application that is critically
> > dependent on the fork-notification, suddenly stops receiving such
> > notification because some other application has switched off the
> > service without its notice.
> >
> > the reason I am concerned is I am planning to feed this fork-events
> > to my in-kernel module. Side note: I would really like support for
> > in-kernel listners through connector infrastructure.
>
> fork connector can be extended to send to the rest of the world
> information that it was turned off or on.
This will support the paradigm: "Let me know if this is switched off,
and I don't mind any arbitrary process switching it off". No
applications can seriously rely on this service, because there is this
*arbritray-application-being-able-to-control* component to it.
A ideal paradigm would be: "Don't switch it off, without my consent".
Essentially this will give applications enough confidence to rely on it
seriously.
However an inbetween paradigm that would acceptable is:
"Let me know if this is swtiched off provided only application that I
can trust-on has the power to manage it".
To me 2nd or 3rd paradigm seems acceptable, but not the 1st.
>
> The easiest way to listen inside the kernel for connector's
> notifications is creating appropriate socket in userspace
> and then obtain it through the following code which must be
> run in context of the process that created above socket:
>
> struct file *file;
> struct inode *inode;
> int err = 0;
>
> file = fget(userspace_socket_number_in_the_current_process_context);
> if (!file) {
> return -ENODEV;
> }
> inode = file->f_dentry->d_inode;
> if (inode->i_sock) {
> sock = SOCKET_I(inode);
> err = 0;
> } else {
> fput(file);
> err = -ENODEV;
> }
>
> sock is struct sock, which then can be used for read/write
> operations using
> kernel_recvmsg(sock, &msg, &iov, 1, size, 0);
> kernel_sendmsg(sock, &msg, &iov, 1, size);
>
> with the following initialisation:
>
> sock->sk->sk_allocation |= GFP_NOIO;
> iov.iov_base = buf;
> iov.iov_len = size;
> msg.msg_name = NULL;
> msg.msg_namelen = 0;
> msg.msg_control = NULL;
> msg.msg_controllen = 0;
> msg.msg_namelen = 0;
> msg.msg_flags = msg_flags | MSG_NOSIGNAL;
>
right. thanks for the code snippet. I should be coding something
around these lines.
>
> Of course it can be done directly from kernelspace without
> touching process's structures, but this way is easier.
Thanks,
RP
>
> > RP
>
> Evgeniy Polyakov
>
> Only failure makes us experts. -- Theo de Raadt
Ram wrote:
> On Tue, 2005-03-22 at 11:22, Evgeniy Polyakov wrote:
>
>>On Tue, 22 Mar 2005 10:26:19 -0800
>>Ram <[email protected]> wrote:
>>
>>
>>>On Mon, 2005-03-21 at 23:07, Guillaume Thouvenin wrote:
>>>
>>>>On Mon, 2005-03-21 at 12:52 -0800, Ram wrote:
>>>>
>>>>> If a bunch of applications are listening for fork events,
>>>>> your patch allows any application to turn off the
>>>>> fork event notification? Is this the right behavior?
>>>>
>>>>Yes it is. The main management is done by application so, if several
>>>>applications are listening for fork events you need to choose which one
>>>>will turn off the fork connector.
>>>>
>>>>I want to keep this turn on/off mechanism simple but if it's needed I
>>>>can manage the variable "cn_fork_enable" as a counter. Thus the callback
>>>>could be something like:
>>>>
>>>>static void cn_fork_callback(void *data)
>>>>{
>>>> int start;
>>>> struct cn_msg *msg = (struct cn_msg *)data;
>>>>
>>>> if (cn_already_initialized && (msg->len == sizeof(cn_fork_enable))) {
>>>> memcpy(&start, msg->data, sizeof(cn_fork_enable));
>>>> if (start)
>>>> cn_fork_enable++;
>>>> else
>>>> cn_fork_enable > 0 ? cn_fork_enable-- : 0;
>>>> }
>>>>}
>>>
>>>I think a better way is:
>>>
>>> Providing a different connector channel called the administrator
>>> channel which can be used only by a super-user, and gives you
>>> the ability to switch on or off any connector channel including the
>>> fork-connector channel.
>>
>>Only super-user can bind netlink socket to multicast group.
>
>
> ok. I did not realize that.
>
>
>>> For lack of better term I am using the word 'channel' to mean
>>> something that carries events of particular type through the
>>> connector-infrastructure.
>>
>>I still do not see why it is needed.
>>Super-user can run ip command and turn network interface off
>>not waiting while apache or named exits or unbind.
>>
>>In theory I can create some kind of userspace registration mechanism,
>>when userspace application reports it's pid to the connector,
>>and then it sends data to the specified pids, but does not
>>allow controlling from userspace.
>>But I really do not think it is a good idea to permit
>>non-priviledged userspace processes to know about deep
>>kernel internals through connector's messages.
>
>
> Yes. non-priviledged userspace processes should not know
> any deep kernel internals through connector events.
>
> I think what I am driving at is, an application that is critically
> dependent on the fork-notification, suddenly stops receiving such
> notification because some other application has switched off the
> service without its notice.
>
> the reason I am concerned is I am planning to feed this fork-events
> to my in-kernel module. Side note: I would really like support for
> in-kernel listners through connector infrastructure.
Listners in the kernel? Sound like a PAGG thing again!
Guillaume's stuff was originally an in-kernel module. CSA/Job
are kernel modules also. We all use hook management feature of PAGG,
which offer event notification to other kernel components.
Guillaume moved to connector approach and i am moving to that
direction too because Andrew likes to see in-kernel modules moved
to user space.
It is kind of funny to see someone trying to get a notification
through a connector and then feed it back to the kernel! This
does not sound right to me!
- jay
>
> RP
>
>
>>>RP
>>>
>>>
>>>
>>>>
>>>>What do you think about this implementation?
>>>>
>>>>Guillaume
>>>>
>>
>>
>> Evgeniy Polyakov
>>
>>Only failure makes us experts. -- Theo de Raadt
Evgeniy Polyakov wrote:
> On Tue, 22 Mar 2005 10:26:19 -0800
> Ram <[email protected]> wrote:
>
>
>>On Mon, 2005-03-21 at 23:07, Guillaume Thouvenin wrote:
>>
>>>On Mon, 2005-03-21 at 12:52 -0800, Ram wrote:
>>>
>>>> If a bunch of applications are listening for fork events,
>>>> your patch allows any application to turn off the
>>>> fork event notification? Is this the right behavior?
>>>
>>>Yes it is. The main management is done by application so, if several
>>>applications are listening for fork events you need to choose which one
>>>will turn off the fork connector.
>>>
>>>I want to keep this turn on/off mechanism simple but if it's needed I
>>>can manage the variable "cn_fork_enable" as a counter. Thus the callback
>>>could be something like:
>>>
>>>static void cn_fork_callback(void *data)
>>>{
>>> int start;
>>> struct cn_msg *msg = (struct cn_msg *)data;
>>>
>>> if (cn_already_initialized && (msg->len == sizeof(cn_fork_enable))) {
>>> memcpy(&start, msg->data, sizeof(cn_fork_enable));
>>> if (start)
>>> cn_fork_enable++;
>>> else
>>> cn_fork_enable > 0 ? cn_fork_enable-- : 0;
>>> }
>>>}
>>
>>I think a better way is:
>>
>> Providing a different connector channel called the administrator
>> channel which can be used only by a super-user, and gives you
>> the ability to switch on or off any connector channel including the
>> fork-connector channel.
>
>
> Only super-user can bind netlink socket to multicast group.
>
>
>> For lack of better term I am using the word 'channel' to mean
>> something that carries events of particular type through the
>> connector-infrastructure.
>
>
> I still do not see why it is needed.
> Super-user can run ip command and turn network interface off
> not waiting while apache or named exits or unbind.
You can turn off a network interface and turn it back on without
closing a network application and it will continue to work.
>
> In theory I can create some kind of userspace registration mechanism,
> when userspace application reports it's pid to the connector,
> and then it sends data to the specified pids, but does not
> allow controlling from userspace.
> But I really do not think it is a good idea to permit
> non-priviledged userspace processes to know about deep
> kernel internals through connector's messages.
I see this issue less a case of bad guys vs. good guys. I see it
as various components doing system related work, but there is
no mechanism of knowing who is on who is off by today's patch. A
service listening to the fork connector can request to turn off
cn_fork_enable on exit and inadquately affect other services/daemons
listening to the same connector. It is not acceptable in my opinion.
The idea of implementing fork connector enabling/disabling was so
that the kernel does not waste time writing to the sockets if no
application listening. If implementing such a feature costs
more than it saves, maybe the fork connector should simply always
send?
Thanks,
- jay
>
>
>>RP
>>
>>
>>
>>>
>>>What do you think about this implementation?
>>>
>>>Guillaume
>>>
>
>
>
> Evgeniy Polyakov
>
> Only failure makes us experts. -- Theo de Raadt
On Tue, 2005-03-22 at 12:42 -0800, Ram wrote:
> On Tue, 2005-03-22 at 12:25, Evgeniy Polyakov wrote:
> > On Tue, 22 Mar 2005 11:18:07 -0800
> > Ram <[email protected]> wrote:
> >
> > > > I still do not see why it is needed.
> > > > Super-user can run ip command and turn network interface off
> > > > not waiting while apache or named exits or unbind.
> > > >
> > > > In theory I can create some kind of userspace registration mechanism,
> > > > when userspace application reports it's pid to the connector,
> > > > and then it sends data to the specified pids, but does not
> > > > allow controlling from userspace.
> > > > But I really do not think it is a good idea to permit
> > > > non-priviledged userspace processes to know about deep
> > > > kernel internals through connector's messages.
> > >
> > > Yes. non-priviledged userspace processes should not know
> > > any deep kernel internals through connector events.
> > >
> > > I think what I am driving at is, an application that is critically
> > > dependent on the fork-notification, suddenly stops receiving such
> > > notification because some other application has switched off the
> > > service without its notice.
> > >
> > > the reason I am concerned is I am planning to feed this fork-events
> > > to my in-kernel module. Side note: I would really like support for
> > > in-kernel listners through connector infrastructure.
> >
> > fork connector can be extended to send to the rest of the world
> > information that it was turned off or on.
>
> This will support the paradigm: "Let me know if this is switched off,
> and I don't mind any arbitrary process switching it off". No
> applications can seriously rely on this service, because there is this
> *arbritray-application-being-able-to-control* component to it.
It is fork connector who will inform all it's listeners that it is not
working any more.
And it is only super-user who can turn it off.
> A ideal paradigm would be: "Don't switch it off, without my consent".
> Essentially this will give applications enough confidence to rely on it
> seriously.
Does ip command wait until apache or bind exited when trying to turn
network interface off? No. One can even unload network driver.
Because super-user must have ability to control the whole system.
> However an inbetween paradigm that would acceptable is:
> "Let me know if this is swtiched off provided only application that I
> can trust-on has the power to manage it".
I suppose fork connector [kernel module] is trusted enough application
that can inform external listeners about it's status.
But super-user still can turn it off by any program,
like one can turn network interface off using ip, ifconfig, your own
application that can, btw, do it using netlink socket.
> To me 2nd or 3rd paradigm seems acceptable, but not the 1st.
Ok, fork module can implement following controlling mechanism:
on insert time super-user provides pid of the controlling
program, later in fork callback it searches for the given pid,
sends some signal to it, and if it receives
magic message [with signal number for example] after it, then it is ok
to perform desired action.
I believe there are plenty of auth techniques that can be applied
here, but I strongly object against it.
What would happen with the system, if only one controlling daemon
is allowed to turn networking off, or change hard drive work modes,
or super-user could not be allowed to remove user's file using usual
rm comman, but not specially authentificated daemon?
Or if it should ask all user's application, do they agree
to remove user's files?
> Thanks,
> RP
>
> >
> > > RP
> >
> > Evgeniy Polyakov
> >
> > Only failure makes us experts. -- Theo de Raadt
--
Evgeniy Polyakov
Crash is better than data corruption -- Arthur Grabowski
On Tue, 2005-03-22 at 15:51 -0800, Jay Lan wrote:
> >>I think a better way is:
> >>
> >> Providing a different connector channel called the administrator
> >> channel which can be used only by a super-user, and gives you
> >> the ability to switch on or off any connector channel including the
> >> fork-connector channel.
> >
> >
> > Only super-user can bind netlink socket to multicast group.
> >
> >
> >> For lack of better term I am using the word 'channel' to mean
> >> something that carries events of particular type through the
> >> connector-infrastructure.
> >
> >
> > I still do not see why it is needed.
> > Super-user can run ip command and turn network interface off
> > not waiting while apache or named exits or unbind.
>
> You can turn off a network interface and turn it back on without
> closing a network application and it will continue to work.
And connector's listeners will continue to listen it's events.
But there will not be any event.
Like application will continue to be bound to the IP and listen
for incomming connection, but there will not be any connection.
> >
> > In theory I can create some kind of userspace registration mechanism,
> > when userspace application reports it's pid to the connector,
> > and then it sends data to the specified pids, but does not
> > allow controlling from userspace.
> > But I really do not think it is a good idea to permit
> > non-priviledged userspace processes to know about deep
> > kernel internals through connector's messages.
>
> I see this issue less a case of bad guys vs. good guys. I see it
> as various components doing system related work, but there is
> no mechanism of knowing who is on who is off by today's patch. A
> service listening to the fork connector can request to turn off
> cn_fork_enable on exit and inadquately affect other services/daemons
> listening to the same connector. It is not acceptable in my opinion.
It is super-user who only is allowed to turn it off and even listen for
events,
since only super-user is allowed to bind socket to multicast netlink
group.
Super-user should not be allowed to control it's system?
> The idea of implementing fork connector enabling/disabling was so
> that the kernel does not waste time writing to the sockets if no
> application listening. If implementing such a feature costs
> more than it saves, maybe the fork connector should simply always
> send?
With CBUS it costs nothing to always send notifications,
but I still do not see a point of disallowing super-user to stop
fork notifications using arbitrary application.
> Thanks,
> - jay
--
Evgeniy Polyakov
Crash is better than data corruption -- Arthur Grabowski
On Tue, 2005-03-22 at 10:15 -0800, Jay Lan wrote:
> Guillaume Thouvenin wrote:
> > On Mon, 2005-03-21 at 12:52 -0800, Ram wrote:
> >
> >> If a bunch of applications are listening for fork events,
> >> your patch allows any application to turn off the
> >> fork event notification? Is this the right behavior?
> >
> >
> > Yes it is. The main management is done by application so, if several
> > applications are listening for fork events you need to choose which one
> > will turn off the fork connector.
>
> It is not practical. One listener never know who else are listening
> to the fork connector.
>
> We also need to protect yet another global variable "cn_fork_enable".
>
> Also, in order to implement "cn_fork_enable" as a counter, we need
> some sort of registration to prevent an application from sending
> repeated "off" notification. One can only turn off its own switch.
I agree with the Evgeniy's ip example. I can provide a fork connector
interface utility to enable or disable the fork connector. Thus,
listeners don't have to start and stop the fork connector and they don't
have to worry about who else are listening the fork connector. It also
removes the need of registration.
I copy at the end of this email the code of the fork connector
controller application (called fcctl).
I need to add the possibility to get the state of the fork connector.
I also need to extend the fork connector to send to everyone information
when it is turned off/on.
Regards,
Guillaume
/*
* fcctl.c - Fork connector interface utility
*
* Used to turn on/off the fork connector
*/
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <linux/connector.h>
#include <linux/netlink.h>
#define MESSAGE_SIZE (sizeof(struct nlmsghdr) + \
sizeof(struct cn_msg) + \
sizeof(int))
#define FORK_CN_DISABLE 0
#define FORK_CN_ENABLE 1
#define FORK_CN_UNKNOWN 2
void show_help()
{
printf("usage: fcctl {on|off}\n\n");
}
int main(int argc, char **argv)
{
int sk_nl;
int err;
int action = FORK_CN_UNKNOWN;
struct sockaddr_nl sa_nl; /* info for the netlink interface */
char buff[128]; /* must be > MESSAGE_SIZE */
struct nlmsghdr *hdr;
struct cn_msg *mesg;
if (getuid() != 0) {
printf("Only root can start/stop the fork connector\n");
return 0;
}
if (argc != 2) {
show_help();
return 0;
}
if (strcmp(argv[1], "on") == 0)
action = FORK_CN_ENABLE;
else if (strcmp(argv[1], "off") == 0)
action = FORK_CN_DISABLE;
if (action == FORK_CN_UNKNOWN) {
show_help();
return 0;
}
/*
* Create an endpoint for communication. Use the kernel user
* interface device (PF_NETLINK) which is a datagram oriented
* service (SOCK_DGRAM). The protocol used is the netfilter/iptables
* ULOG protocol (NETLINK_NFLOG)
*/
sk_nl = socket(PF_NETLINK, SOCK_DGRAM, NETLINK_NFLOG);
if (sk_nl == -1) {
printf("socket sk_nl error");
return -1;
}
sa_nl.nl_family = AF_NETLINK;
sa_nl.nl_groups = CN_IDX_FORK;
sa_nl.nl_pid = getpid();
err = bind(sk_nl, (struct sockaddr *)&sa_nl, sizeof(sa_nl));
if (err == -1) {
printf("binding sk_nl error");
close(sk_nl);
return -1;
}
/* Clear the buffer */
memset(buff, '\0', sizeof(buff));
/* fill the message header */
hdr = (struct nlmsghdr *)buff;
hdr->nlmsg_len = MESSAGE_SIZE;
hdr->nlmsg_type = NLMSG_DONE;
hdr->nlmsg_flags = 0;
hdr->nlmsg_seq = 0;
hdr->nlmsg_pid = getpid();
/* the message */
mesg = (struct cn_msg *)NLMSG_DATA(hdr);
mesg->id.idx = CN_IDX_FORK;
mesg->id.val = CN_VAL_FORK;
mesg->seq = 0;
mesg->ack = 0;
mesg->len = sizeof(int);
mesg->data[0] = action;
send(sk_nl, hdr, hdr->nlmsg_len, 0);
close(sk_nl);
return 0;
}
On Tue, 2005-03-22 at 21:51, Evgeniy Polyakov wrote:
> On Wed, 2005-03-23 at 08:01 +0300, Evgeniy Polyakov wrote:
> > On Tue, 2005-03-22 at 15:51 -0800, Jay Lan wrote:
> >
>
> > > I see this issue less a case of bad guys vs. good guys. I see it
> > > as various components doing system related work, but there is
> > > no mechanism of knowing who is on who is off by today's patch. A
> > > service listening to the fork connector can request to turn off
> > > cn_fork_enable on exit and inadquately affect other services/daemons
> > > listening to the same connector. It is not acceptable in my opinion.
> >
> > It is super-user who only is allowed to turn it off and even listen for
> > events,
> > since only super-user is allowed to bind socket to multicast netlink
> > group.
> > Super-user should not be allowed to control it's system?
>
> BTW, super-user can unload fork connector module, and none listener
> will even know about it, it just stops to receive notification.
I see your point. Since the application has to be super-user to turn it
off, and since super-user applications are trusted not to mis-behave,
the current mechanism is relatively safe. I guess its the amount of
checks you put in place, to prevent inadvertent shooting-in-the-foot.
There is nothing one can do if the fork_connector module is yanked out.
However there is something one can do, to prevent any arbitrary
application from shutting down the fork-event stream. I think I can live
with the current mechanism, under the assumption that no fork-event
listner has a legitate reason to shut down the fork-event-stream.
RP