This patch adds a fork connector in the do_fork() routine. It sends a
netlink datagram when enabled. The message can be read by a user space
application. By this way, the user space application is alerted when a
fork occurs.
It uses the userspace <-> kernelspace connector that works on top of
the netlink protocol. The fork connector is enabled or disabled by
sending a message to the connector. The unique sequence number of
messages can be used to check if a message is lost. In this patch we use
a cn_fork_msg structure (thanks to Dean Gaudet) rather than zeroing 64
bytes of memory and doing conversions to decimal ascii as done before.
Also we move the fork_connector() inline code in the header file
include/linux/connector.h as suggested by Paul Jackson (thanks).
The fork connector is used by the Enhanced Linux System Accounting
project http://elsa.sourceforge.net
When the fork connector is disabled, the "lat_proc fork" test returns
a value equal to:
Process fork+exit: 150.2076 microseconds
When the fork connector is enabled, the same test returns:
Process fork+exit: 153.7778 microseconds
So the overhead (the construction and the sending of the message) is
around 2%. If we use the CBUS patch http://lkml.org/lkml/2005/3/20/14
instead of cn_netlink_send() routine we can reduce this overhead near to
0%. We're waiting for the inclusion of the CBUS in the -mm tree.
This patch applies to 2.6.12-rc1-mm4.
Signed-off-by: Guillaume Thouvenin <[email protected]>
---
drivers/connector/Kconfig | 11 ++++
drivers/connector/Makefile | 1
drivers/connector/cn_fork.c | 104 ++++++++++++++++++++++++++++++++++++++++++++
include/linux/connector.h | 53 ++++++++++++++++++++++
kernel/fork.c | 3 +
5 files changed, 172 insertions(+)
Index: linux-2.6.12-rc1-mm4-cnfork/drivers/connector/Kconfig
===================================================================
--- linux-2.6.12-rc1-mm4-cnfork.orig/drivers/connector/Kconfig 2005-03-31 14:56:02.000000000 +0200
+++ linux-2.6.12-rc1-mm4-cnfork/drivers/connector/Kconfig 2005-03-31 14:57:52.000000000 +0200
@@ -10,4 +10,15 @@ config CONNECTOR
Connector support can also be built as a module. If so, the module
will be called cn.ko.
+config FORK_CONNECTOR
+ bool "Enable fork connector"
+ depends on CONNECTOR=y
+ default y
+ ---help---
+ It adds a connector in kernel/fork.c:do_fork() function. When a fork
+ occurs, netlink is used to transfer information about the parent and
+ its child. This information can be used by a user space application.
+ The fork connector can be enable/disable by sending a message to the
+ connector with the corresponding group id.
+
endmenu
Index: linux-2.6.12-rc1-mm4-cnfork/drivers/connector/Makefile
===================================================================
--- linux-2.6.12-rc1-mm4-cnfork.orig/drivers/connector/Makefile 2005-03-31 14:56:02.000000000 +0200
+++ linux-2.6.12-rc1-mm4-cnfork/drivers/connector/Makefile 2005-03-31 14:57:52.000000000 +0200
@@ -1,2 +1,3 @@
obj-$(CONFIG_CONNECTOR) += cn.o
+obj-$(CONFIG_FORK_CONNECTOR) += cn_fork.o
cn-objs := cn_queue.o connector.o
Index: linux-2.6.12-rc1-mm4-cnfork/drivers/connector/cn_fork.c
===================================================================
--- linux-2.6.12-rc1-mm4-cnfork.orig/drivers/connector/cn_fork.c 2003-01-30 11:24:37.000000000 +0100
+++ linux-2.6.12-rc1-mm4-cnfork/drivers/connector/cn_fork.c 2005-03-31 14:57:52.000000000 +0200
@@ -0,0 +1,104 @@
+/*
+ * cn_fork.c
+ *
+ * 2005 Copyright (c) Guillaume Thouvenin <[email protected]>
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/init.h>
+
+#include <linux/connector.h>
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Guillaume Thouvenin <[email protected]>");
+MODULE_DESCRIPTION("Enable or disable the usage of the fork connector");
+
+int cn_fork_enable = 0;
+struct cb_id cb_fork_id = { CN_IDX_FORK, CN_VAL_FORK };
+
+static inline void cn_fork_send_status(void)
+{
+ /* TODO */
+ printk(KERN_INFO "cn_fork_enable == %d\n", cn_fork_enable);
+}
+
+/**
+ * cn_fork_callback - enable or disable the fork connector
+ * @data: message send by the connector
+ *
+ * The callback allows to enable or disable the sending of information
+ * about fork in the do_fork() routine. To enable the fork, the user
+ * space application must send the integer 1 in the data part of the
+ * message. To disable the fork connector, it must send the integer 0.
+ */
+static void cn_fork_callback(void *data)
+{
+ struct cn_msg *msg = (struct cn_msg *)data;
+ int action;
+
+ if (cn_already_initialized && (msg->len == sizeof(cn_fork_enable))) {
+ memcpy(&action, msg->data, sizeof(cn_fork_enable));
+ switch(action) {
+ case FORK_CN_START:
+ cn_fork_enable = 1;
+ break;
+ case FORK_CN_STOP:
+ cn_fork_enable = 0;
+ break;
+ case FORK_CN_STATUS:
+ cn_fork_send_status();
+ break;
+ }
+ }
+}
+
+/**
+ * cn_fork_init - initialization entry point
+ *
+ * This routine will be run at kernel boot time because this driver is
+ * built in the kernel. It adds the connector callback to the connector
+ * driver.
+ */
+static int cn_fork_init(void)
+{
+ int err;
+
+ err = cn_add_callback(&cb_fork_id, "cn_fork", &cn_fork_callback);
+ if (err) {
+ printk(KERN_WARNING "Failed to register cn_fork\n");
+ return -EINVAL;
+ }
+
+ printk(KERN_NOTICE "cn_fork is registered\n");
+ return 0;
+}
+
+/**
+ * cn_fork_exit - exit entry point
+ *
+ * As this driver is always statically compiled into the kernel the
+ * cn_fork_exit has no effect.
+ */
+static void cn_fork_exit(void)
+{
+ cn_del_callback(&cb_fork_id);
+}
+
+module_init(cn_fork_init);
+module_exit(cn_fork_exit);
Index: linux-2.6.12-rc1-mm4-cnfork/include/linux/connector.h
===================================================================
--- linux-2.6.12-rc1-mm4-cnfork.orig/include/linux/connector.h 2005-03-31 14:56:05.000000000 +0200
+++ linux-2.6.12-rc1-mm4-cnfork/include/linux/connector.h 2005-03-31 14:57:52.000000000 +0200
@@ -28,10 +28,16 @@
#define CN_VAL_KOBJECT_UEVENT 0x0000
#define CN_IDX_SUPERIO 0xaabb /* SuperIO subsystem */
#define CN_VAL_SUPERIO 0xccdd
+#define CN_IDX_FORK 0xfeed /* fork events */
+#define CN_VAL_FORK 0xbeef
#define CONNECTOR_MAX_MSG_SIZE 1024
+#define FORK_CN_STOP 0
+#define FORK_CN_START 1
+#define FORK_CN_STATUS 2
+
struct cb_id
{
__u32 idx;
@@ -64,6 +70,12 @@ struct cn_ctl_msg
__u8 data[0];
};
+struct cn_fork_msg
+{
+ int cpu;
+ int ppid;
+ int cpid;
+};
#ifdef __KERNEL__
@@ -133,6 +145,8 @@ struct cn_dev
};
extern int cn_already_initialized;
+extern int cn_fork_enable;
+extern struct cb_id cb_fork_id;
int cn_add_callback(struct cb_id *, char *, void (* callback)(void *));
void cn_del_callback(struct cb_id *);
@@ -146,5 +160,44 @@ void cn_queue_free_dev(struct cn_queue_d
int cn_cb_equal(struct cb_id *, struct cb_id *);
+#ifdef CONFIG_FORK_CONNECTOR
+
+#define CN_FORK_INFO_SIZE sizeof(struct cn_fork_msg)
+#define CN_FORK_MSG_SIZE (sizeof(struct cn_msg) + CN_FORK_INFO_SIZE)
+
+static DEFINE_PER_CPU(unsigned long, fork_counts);
+
+static inline void fork_connector(pid_t parent, pid_t child)
+{
+ if (cn_fork_enable) {
+ struct cn_msg *msg;
+ struct cn_fork_msg *forkmsg;
+ __u8 buffer[CN_FORK_MSG_SIZE];
+
+ msg = (struct cn_msg *)buffer;
+
+ memcpy(&msg->id, &cb_fork_id, sizeof(msg->id));
+
+ msg->ack = 0; /* not used */
+ msg->seq = get_cpu_var(fork_counts)++;
+
+ msg->len = CN_FORK_INFO_SIZE;
+ forkmsg = (struct cn_fork_msg *)msg->data;
+ forkmsg->cpu = smp_processor_id();
+ forkmsg->ppid = parent;
+ forkmsg->cpid = child;
+
+ put_cpu_var(fork_counts);
+
+ cn_netlink_send(msg, CN_IDX_FORK);
+ }
+}
+#else
+static inline void fork_connector(pid_t parent, pid_t child)
+{
+ return;
+}
+#endif /* CONFIG_FORK_CONNECTOR */
+
#endif /* __KERNEL__ */
#endif /* __CONNECTOR_H */
Index: linux-2.6.12-rc1-mm4-cnfork/kernel/fork.c
===================================================================
--- linux-2.6.12-rc1-mm4-cnfork.orig/kernel/fork.c 2005-03-31 14:56:05.000000000 +0200
+++ linux-2.6.12-rc1-mm4-cnfork/kernel/fork.c 2005-03-31 14:57:52.000000000 +0200
@@ -41,6 +41,7 @@
#include <linux/profile.h>
#include <linux/rmap.h>
#include <linux/acct.h>
+#include <linux/connector.h>
#include <asm/pgtable.h>
#include <asm/pgalloc.h>
@@ -1249,6 +1250,8 @@ long do_fork(unsigned long clone_flags,
if (unlikely (current->ptrace & PT_TRACE_VFORK_DONE))
ptrace_notify ((PTRACE_EVENT_VFORK_DONE << 8) | SIGTRAP);
}
+
+ fork_connector(current->pid, p->pid);
} else {
free_pidmap(pid);
pid = PTR_ERR(p);
Guillaume Thouvenin <[email protected]> wrote:
>
> This patch adds a fork connector in the do_fork() routine.
> ...
>
> The fork connector is used by the Enhanced Linux System Accounting
> project http://elsa.sourceforge.net
Does it also meet all the in-kernel requirements for other accounting
projects?
If not, what else do those other projects need?
Guillaume Thouvenin <[email protected]> wrote:
>
> This patch adds a fork connector in the do_fork() routine.
>
> +config FORK_CONNECTOR
> + bool "Enable fork connector"
> + depends on CONNECTOR=y
This kind of defeats connector's ability to be built as a module. Doing
select CONNECTOR
may be better here.
> +static void cn_fork_callback(void *data)
> +{
> + struct cn_msg *msg = (struct cn_msg *)data;
The cast is unnecessary.
>
> extern int cn_already_initialized;
> +extern int cn_fork_enable;
> +extern struct cb_id cb_fork_id;
Should these declarations be inside CONFIG_FORK_CONNECTOR?
> +
> +static DEFINE_PER_CPU(unsigned long, fork_counts);
> +
This will cause fork_counts to be defined in each compilation unit which
includes this header file. You should use DEFINE_PER_CPU in .c and
DECLARE_PER_CPU in .h.
Andrew Morton wrote:
>Guillaume Thouvenin <[email protected]> wrote:
>
>
>> This patch adds a fork connector in the do_fork() routine.
>>...
>>
>> The fork connector is used by the Enhanced Linux System Accounting
>>project http://elsa.sourceforge.net
>>
>>
>
>Does it also meet all the in-kernel requirements for other accounting
>projects?
>
>If not, what else do those other projects need?
>
>
Hi Andrew,
As the discussion in this thread of the past few days showed, this
patch intends to take care of process grouping, but not the
accounting data collection. Besides my concern of possibility of
data loss, this patch also provides CSA information to handle process
grouping as it intends to do. I plan to run some testing to see percentage
of data loss when system is under stress test, but improvement at
CBUS as Evgeniy indicated should help!
Please be advised that i still need an do_exit handling to save accounting
data. But, it is a separate issue.
Thanks,
- jay
On Thu, 2005-03-31 at 16:10 -0800, Jay Lan wrote:
> Andrew Morton wrote:
>
> >Guillaume Thouvenin <[email protected]> wrote:
> >
> >
> >> This patch adds a fork connector in the do_fork() routine.
> >>...
> >>
> >> The fork connector is used by the Enhanced Linux System Accounting
> >>project http://elsa.sourceforge.net
> >>
> >>
> >
> >Does it also meet all the in-kernel requirements for other accounting
> >projects?
> >
> >If not, what else do those other projects need?
> >
> >
> Hi Andrew,
>
> As the discussion in this thread of the past few days showed, this
> patch intends to take care of process grouping, but not the
> accounting data collection. Besides my concern of possibility of
> data loss, this patch also provides CSA information to handle process
> grouping as it intends to do. I plan to run some testing to see percentage
> of data loss when system is under stress test, but improvement at
> CBUS as Evgeniy indicated should help!
>
> Please be advised that i still need an do_exit handling to save accounting
> data. But, it is a separate issue.
My five copecks [or two cents]:
fork connector with CBUS [with theirs upto 2.5 % degradation
with huge disk writes per fork] are still much faster than any existing
accounting models.
But it is purely accounting project author to think about
accounting design though...
> Thanks,
> - jay
--
Evgeniy Polyakov
Crash is better than data corruption -- Arthur Grabowski
ChangeLog:
- Remove unnecessary catch in cn_fork_callback().
- Use reverse dependencies in Kconfig. By doing this, if fork
connector is enabled, the connector will be automatically
selected as built-in. It's the what we need.
- Move the cn_fork_enable and cb_fork_id declarations inside
CONFIG_FORK_CONNECTOR.
- Use DECLARE_PER_CPU in connector.h and DEFINE_PER_CPU in fork.c
to avoid fork_counts to be defined in each compilation unit which
includes include/linux/connector.h
- Add a description of the fork connector in the beginning of the
cn_fork.c
This patch applies to 2.6.12-rc1-mm4
Many thanks to Andrew for the great help,
Best regards,
Guillaume
Signed-off-by: Guillaume Thouvenin <[email protected]>
---
drivers/connector/Kconfig | 11 ++++
drivers/connector/Makefile | 1
drivers/connector/cn_fork.c | 113 ++++++++++++++++++++++++++++++++++++++++++++
include/linux/connector.h | 54 +++++++++++++++++++++
kernel/fork.c | 11 ++++
5 files changed, 190 insertions(+)
Index: linux-2.6.12-rc1-mm4-cnfork/drivers/connector/Kconfig
===================================================================
--- linux-2.6.12-rc1-mm4-cnfork.orig/drivers/connector/Kconfig 2005-03-31 14:56:02.000000000 +0200
+++ linux-2.6.12-rc1-mm4-cnfork/drivers/connector/Kconfig 2005-04-01 08:02:25.000000000 +0200
@@ -10,4 +10,15 @@ config CONNECTOR
Connector support can also be built as a module. If so, the module
will be called cn.ko.
+config FORK_CONNECTOR
+ bool "Enable fork connector"
+ select CONNECTOR
+ default y
+ ---help---
+ It adds a connector in kernel/fork.c:do_fork() function. When a fork
+ occurs, netlink is used to transfer information about the parent and
+ its child. This information can be used by a user space application.
+ The fork connector can be enable/disable by sending a message to the
+ connector with the corresponding group id.
+
endmenu
Index: linux-2.6.12-rc1-mm4-cnfork/drivers/connector/Makefile
===================================================================
--- linux-2.6.12-rc1-mm4-cnfork.orig/drivers/connector/Makefile 2005-03-31 14:56:02.000000000 +0200
+++ linux-2.6.12-rc1-mm4-cnfork/drivers/connector/Makefile 2005-03-31 14:57:52.000000000 +0200
@@ -1,2 +1,3 @@
obj-$(CONFIG_CONNECTOR) += cn.o
+obj-$(CONFIG_FORK_CONNECTOR) += cn_fork.o
cn-objs := cn_queue.o connector.o
Index: linux-2.6.12-rc1-mm4-cnfork/drivers/connector/cn_fork.c
===================================================================
--- linux-2.6.12-rc1-mm4-cnfork.orig/drivers/connector/cn_fork.c 2003-01-30 11:24:37.000000000 +0100
+++ linux-2.6.12-rc1-mm4-cnfork/drivers/connector/cn_fork.c 2005-04-01 12:48:47.000000000 +0200
@@ -0,0 +1,113 @@
+/*
+ * cn_fork.c - Fork connector
+ *
+ * Copyright (C) 2005 Guillaume Thouvenin <[email protected]>
+ *
+ * This module implements the fork connector. It allows to send a
+ * netlink datagram, when enabled, from the do_fork() routine. The
+ * message can be read by a user space application. By this way,
+ * the user space application is alerted when a fork occurs.
+ *
+ * It uses the userspace <-> kernelspace connector that works on top of
+ * the netlink protocol. The fork connector is enabled or disabled by
+ * sending a message to the connector. The unique sequence number of
+ * messages can be used to check if a message is lost.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/init.h>
+
+#include <linux/connector.h>
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Guillaume Thouvenin <[email protected]>");
+MODULE_DESCRIPTION("Enable or disable the usage of the fork connector");
+
+int cn_fork_enable = 0;
+struct cb_id cb_fork_id = { CN_IDX_FORK, CN_VAL_FORK };
+
+static inline void cn_fork_send_status(void)
+{
+ /* TODO: An informational line in log is maybe not enough... */
+ printk(KERN_INFO "cn_fork_enable == %d\n", cn_fork_enable);
+}
+
+/**
+ * cn_fork_callback - enable or disable the fork connector
+ * @data: message send by the connector
+ *
+ * The callback allows to enable or disable the sending of information
+ * about fork in the do_fork() routine. To enable the fork, the user
+ * space application must send the integer 1 in the data part of the
+ * message. To disable the fork connector, it must send the integer 0.
+ */
+static void cn_fork_callback(void *data)
+{
+ struct cn_msg *msg = data;
+ int action;
+
+ if (cn_already_initialized && (msg->len == sizeof(cn_fork_enable))) {
+ memcpy(&action, msg->data, sizeof(cn_fork_enable));
+ switch(action) {
+ case FORK_CN_START:
+ cn_fork_enable = 1;
+ break;
+ case FORK_CN_STOP:
+ cn_fork_enable = 0;
+ break;
+ case FORK_CN_STATUS:
+ cn_fork_send_status();
+ break;
+ }
+ }
+}
+
+/**
+ * cn_fork_init - initialization entry point
+ *
+ * This routine will be run at kernel boot time because this driver is
+ * built in the kernel. It adds the connector callback to the connector
+ * driver.
+ */
+static int cn_fork_init(void)
+{
+ int err;
+
+ err = cn_add_callback(&cb_fork_id, "cn_fork", &cn_fork_callback);
+ if (err) {
+ printk(KERN_WARNING "Failed to register cn_fork\n");
+ return -EINVAL;
+ }
+
+ printk(KERN_NOTICE "cn_fork is registered\n");
+ return 0;
+}
+
+/**
+ * cn_fork_exit - exit entry point
+ *
+ * As this driver is always statically compiled into the kernel the
+ * cn_fork_exit has no effect.
+ */
+static void cn_fork_exit(void)
+{
+ cn_del_callback(&cb_fork_id);
+}
+
+module_init(cn_fork_init);
+module_exit(cn_fork_exit);
Index: linux-2.6.12-rc1-mm4-cnfork/include/linux/connector.h
===================================================================
--- linux-2.6.12-rc1-mm4-cnfork.orig/include/linux/connector.h 2005-03-31 14:56:05.000000000 +0200
+++ linux-2.6.12-rc1-mm4-cnfork/include/linux/connector.h 2005-04-01 09:52:28.000000000 +0200
@@ -28,10 +28,16 @@
#define CN_VAL_KOBJECT_UEVENT 0x0000
#define CN_IDX_SUPERIO 0xaabb /* SuperIO subsystem */
#define CN_VAL_SUPERIO 0xccdd
+#define CN_IDX_FORK 0xfeed /* fork events */
+#define CN_VAL_FORK 0xbeef
#define CONNECTOR_MAX_MSG_SIZE 1024
+#define FORK_CN_STOP 0
+#define FORK_CN_START 1
+#define FORK_CN_STATUS 2
+
struct cb_id
{
__u32 idx;
@@ -64,6 +70,12 @@ struct cn_ctl_msg
__u8 data[0];
};
+struct cn_fork_msg
+{
+ int cpu; /* ID of the cpu where the fork occured */
+ int ppid; /* the parent PID */
+ int cpid; /* the child PID */
+};
#ifdef __KERNEL__
@@ -146,5 +158,47 @@ void cn_queue_free_dev(struct cn_queue_d
int cn_cb_equal(struct cb_id *, struct cb_id *);
+#ifdef CONFIG_FORK_CONNECTOR
+
+#define CN_FORK_INFO_SIZE sizeof(struct cn_fork_msg)
+#define CN_FORK_MSG_SIZE (sizeof(struct cn_msg) + CN_FORK_INFO_SIZE)
+
+extern int cn_fork_enable;
+extern struct cb_id cb_fork_id;
+
+DECLARE_PER_CPU(unsigned long, fork_counts);
+
+static inline void fork_connector(pid_t parent, pid_t child)
+{
+ if (cn_fork_enable) {
+ struct cn_msg *msg;
+ struct cn_fork_msg *forkmsg;
+ __u8 buffer[CN_FORK_MSG_SIZE];
+
+ msg = (struct cn_msg *)buffer;
+
+ memcpy(&msg->id, &cb_fork_id, sizeof(msg->id));
+
+ msg->ack = 0; /* not used */
+ msg->seq = get_cpu_var(fork_counts)++;
+
+ msg->len = CN_FORK_INFO_SIZE;
+ forkmsg = (struct cn_fork_msg *)msg->data;
+ forkmsg->cpu = smp_processor_id();
+ forkmsg->ppid = parent;
+ forkmsg->cpid = child;
+
+ put_cpu_var(fork_counts);
+
+ cn_netlink_send(msg, CN_IDX_FORK);
+ }
+}
+#else
+static inline void fork_connector(pid_t parent, pid_t child)
+{
+ return;
+}
+#endif /* CONFIG_FORK_CONNECTOR */
+
#endif /* __KERNEL__ */
#endif /* __CONNECTOR_H */
Index: linux-2.6.12-rc1-mm4-cnfork/kernel/fork.c
===================================================================
--- linux-2.6.12-rc1-mm4-cnfork.orig/kernel/fork.c 2005-03-31 14:56:05.000000000 +0200
+++ linux-2.6.12-rc1-mm4-cnfork/kernel/fork.c 2005-04-01 09:53:20.000000000 +0200
@@ -41,6 +41,7 @@
#include <linux/profile.h>
#include <linux/rmap.h>
#include <linux/acct.h>
+#include <linux/connector.h>
#include <asm/pgtable.h>
#include <asm/pgalloc.h>
@@ -63,6 +64,14 @@ DEFINE_PER_CPU(unsigned long, process_co
EXPORT_SYMBOL(tasklist_lock);
+#ifdef CONFIG_FORK_CONNECTOR
+/*
+ * fork_counts is used by the fork_connector() inline routine as
+ * the sequence number of the netlink message.
+ */
+static DEFINE_PER_CPU(unsigned long, fork_counts);
+#endif /* CONFIG_FORK_CONNECTOR */
+
int nr_processes(void)
{
int cpu;
@@ -1249,6 +1258,8 @@ long do_fork(unsigned long clone_flags,
if (unlikely (current->ptrace & PT_TRACE_VFORK_DONE))
ptrace_notify ((PTRACE_EVENT_VFORK_DONE << 8) | SIGTRAP);
}
+
+ fork_connector(current->pid, p->pid);
} else {
free_pidmap(pid);
pid = PTR_ERR(p);
Hi Evgeniy,
Should i be concerned about this bugcheck?
I have seen this happening a number of times, all with the same signature
in my testing. I ran a mix of AIM7, ubench, fork-test (continuously
fork new
processes), and another program reading from the fork connector socket.
Thanks,
- jay
cqueue/1[656]: bugcheck! 0 [1]
Modules linked in: nfs lockd sunrpc sg st sr_mod ipv6 usbcore
Pid: 656, CPU 1, comm: cqueue/1
psr : 00001010085a6010 ifs : 8000000000000289 ip :
[<a0000001005cee50>] Not tainted
ip is at __kfree_skb+0x1b0/0x220
unat: 0000000000000000 pfs : 0000000000000289 rsc : 0000000000000003
rnat: 0000000000000000 bsps: 0000000000000000 pr : 0000000000009641
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70433f
csd : 0000000000000000 ssd : 0000000000000000
b0 : a0000001005cee50 b6 : a00000010000e7e0 b7 : a0000001003ae440
f6 : 0fffbccccccccc8c00000 f7 : 0ffdaa200000000000000
f8 : 100008000000000000000 f9 : 10002a000000000000000
f10 : 0fffcccccccccc8c00000 f11 : 1003e0000000000000000
r1 : a000000100c0ec00 r2 : 0000000000004000 r3 : 0000000000004000
r8 : 0000000000000028 r9 : a0000001008eaac8 r10 : 0000000000000004
r11 : 0000000000000028 r12 : e00000307a99fd60 r13 : e00000307a998000
r14 : a000000100887c00 r15 : a000000100a24b18 r16 : a000000100a22e18
r17 : ffffffffffffffff r18 : a000000100887bec r19 : a000000100a9080f
r20 : 0000000000003517 r21 : 00000000000fffff r22 : 0000000000000034
r23 : 0000000000000034 r24 : a000000100a90810 r25 : 0000000000003518
r26 : 0000000000003518 r27 : 00000010085a6010 r28 : a000000100a90811
r29 : 0000000000003519 r30 : 0000000000000000 r31 : a000000100a24ae8
Call Trace:
[<a0000001000126a0>] show_stack+0x80/0xa0
sp=e00000307a99f920 bsp=e00000307a999078
[<a000000100012f00>] show_regs+0x840/0x880
sp=e00000307a99faf0 bsp=e00000307a999018
[<a000000100034890>] die+0x150/0x1c0
sp=e00000307a99fb00 bsp=e00000307a998fd0
[<a000000100034940>] die_if_kernel+0x40/0x60
sp=e00000307a99fb00 bsp=e00000307a998fa0
[<a000000100035d20>] ia64_bad_break+0x300/0x380
sp=e00000307a99fb00 bsp=e00000307a998f78
[<a00000010000b5e0>] ia64_leave_kernel+0x0/0x280
sp=e00000307a99fb90 bsp=e00000307a998f78
[<a0000001005cee50>] __kfree_skb+0x1b0/0x220
sp=e00000307a99fd60 bsp=e00000307a998f30
[<a00000010044f630>] kfree_skb+0x50/0xa0
sp=e00000307a99fd60 bsp=e00000307a998f10
[<a00000010044e400>] cn_queue_wrapper+0xe0/0x100
sp=e00000307a99fd60 bsp=e00000307a998ee8
[<a0000001000cb880>] worker_thread+0x3e0/0x520
sp=e00000307a99fd60 bsp=e00000307a998e60
[<a0000001000d7210>] kthread+0x290/0x300
sp=e00000307a99fdd0 bsp=e00000307a998e20
[<a000000100010a00>] kernel_thread_helper+0xe0/0x100
sp=e00000307a99fe30 bsp=e00000307a998df0
[<a000000100009120>] start_kernel_thread+0x20/0x40
sp=e00000307a99fe30 bsp=e00000307a998df0
BTW, when it happened last time, my program listening to the socket
complained about duplicate messages received.
Unmatched seq. Rcvd=1824062, expected=1824061 <===
Unmatched seq. Rcvd=1824062, expected=1824063 <===
Unmatched seq. Rcvd=1824348, expected=1824307
When my program received 1824062 while expecting 1824061
it adjusted itself to expect the next one being 1824063. But instead
the message of 1824062 arrived the second time.
- jay
Jay Lan wrote:
> Hi Evgeniy,
>
> Should i be concerned about this bugcheck?
>
> I have seen this happening a number of times, all with the same signature
> in my testing. I ran a mix of AIM7, ubench, fork-test (continuously
> fork new
> processes), and another program reading from the fork connector socket.
>
> Thanks,
> - jay
>
>
>
> cqueue/1[656]: bugcheck! 0 [1]
> Modules linked in: nfs lockd sunrpc sg st sr_mod ipv6 usbcore
>
> Pid: 656, CPU 1, comm: cqueue/1
> psr : 00001010085a6010 ifs : 8000000000000289 ip :
> [<a0000001005cee50>] Not tainted
> ip is at __kfree_skb+0x1b0/0x220
> unat: 0000000000000000 pfs : 0000000000000289 rsc : 0000000000000003
> rnat: 0000000000000000 bsps: 0000000000000000 pr : 0000000000009641
> ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70433f
> csd : 0000000000000000 ssd : 0000000000000000
> b0 : a0000001005cee50 b6 : a00000010000e7e0 b7 : a0000001003ae440
> f6 : 0fffbccccccccc8c00000 f7 : 0ffdaa200000000000000
> f8 : 100008000000000000000 f9 : 10002a000000000000000
> f10 : 0fffcccccccccc8c00000 f11 : 1003e0000000000000000
> r1 : a000000100c0ec00 r2 : 0000000000004000 r3 : 0000000000004000
> r8 : 0000000000000028 r9 : a0000001008eaac8 r10 : 0000000000000004
> r11 : 0000000000000028 r12 : e00000307a99fd60 r13 : e00000307a998000
> r14 : a000000100887c00 r15 : a000000100a24b18 r16 : a000000100a22e18
> r17 : ffffffffffffffff r18 : a000000100887bec r19 : a000000100a9080f
> r20 : 0000000000003517 r21 : 00000000000fffff r22 : 0000000000000034
> r23 : 0000000000000034 r24 : a000000100a90810 r25 : 0000000000003518
> r26 : 0000000000003518 r27 : 00000010085a6010 r28 : a000000100a90811
> r29 : 0000000000003519 r30 : 0000000000000000 r31 : a000000100a24ae8
>
> Call Trace:
> [<a0000001000126a0>] show_stack+0x80/0xa0
> sp=e00000307a99f920 bsp=e00000307a999078
> [<a000000100012f00>] show_regs+0x840/0x880
> sp=e00000307a99faf0 bsp=e00000307a999018
> [<a000000100034890>] die+0x150/0x1c0
> sp=e00000307a99fb00 bsp=e00000307a998fd0
> [<a000000100034940>] die_if_kernel+0x40/0x60
> sp=e00000307a99fb00 bsp=e00000307a998fa0
> [<a000000100035d20>] ia64_bad_break+0x300/0x380
> sp=e00000307a99fb00 bsp=e00000307a998f78
> [<a00000010000b5e0>] ia64_leave_kernel+0x0/0x280
> sp=e00000307a99fb90 bsp=e00000307a998f78
> [<a0000001005cee50>] __kfree_skb+0x1b0/0x220
> sp=e00000307a99fd60 bsp=e00000307a998f30
> [<a00000010044f630>] kfree_skb+0x50/0xa0
> sp=e00000307a99fd60 bsp=e00000307a998f10
> [<a00000010044e400>] cn_queue_wrapper+0xe0/0x100
> sp=e00000307a99fd60 bsp=e00000307a998ee8
> [<a0000001000cb880>] worker_thread+0x3e0/0x520
> sp=e00000307a99fd60 bsp=e00000307a998e60
> [<a0000001000d7210>] kthread+0x290/0x300
> sp=e00000307a99fdd0 bsp=e00000307a998e20
> [<a000000100010a00>] kernel_thread_helper+0xe0/0x100
> sp=e00000307a99fe30 bsp=e00000307a998df0
> [<a000000100009120>] start_kernel_thread+0x20/0x40
> sp=e00000307a99fe30 bsp=e00000307a998df0
>
>
>
On Thu, 2005-04-07 at 15:47 -0700, Jay Lan wrote:
> BTW, when it happened last time, my program listening to the socket
> complained about duplicate messages received.
>
> Unmatched seq. Rcvd=1824062, expected=1824061 <===
> Unmatched seq. Rcvd=1824062, expected=1824063 <===
> Unmatched seq. Rcvd=1824348, expected=1824307
>
> When my program received 1824062 while expecting 1824061
> it adjusted itself to expect the next one being 1824063. But instead
> the message of 1824062 arrived the second time.
Thank you for your report.
Could you reproduce a bug with the attached patch?
> - jay
* looking for [email protected]/connector--main--0--patch-38 to compare with
* comparing to [email protected]/connector--main--0--patch-38
M connector.c
* modified files
--- orig/drivers/connector/connector.c
+++ mod/drivers/connector/connector.c
@@ -121,7 +121,7 @@
NETLINK_CB(skb).dst_groups = groups;
uskb = skb_clone(skb, GFP_ATOMIC);
- if (uskb)
+ if (uskb && 0)
netlink_unicast(dev->nls, uskb, 0, 0);
netlink_broadcast(dev->nls, skb, 0, groups, GFP_ATOMIC);
@@ -158,7 +158,7 @@
}
spin_unlock_bh(&dev->cbdev->queue_lock);
- return found;
+ return (found)?0:-ENODEV;
}
/*
@@ -181,7 +181,6 @@
"requested msg->len=%u[%u], nlh->nlmsg_len=%u, skb->len=%u.\n",
msg->len, NLMSG_SPACE(msg->len + sizeof(*msg)),
nlh->nlmsg_len, skb->len);
- kfree_skb(skb);
return -EINVAL;
}
#if 0
@@ -215,17 +214,18 @@
skb->len, skb->data_len, skb->truesize, skb->protocol,
skb_cloned(skb), skb_shared(skb));
#endif
- while (skb->len >= NLMSG_SPACE(0)) {
+ if (skb->len >= NLMSG_SPACE(0)) {
nlh = (struct nlmsghdr *)skb->data;
+
if (nlh->nlmsg_len < sizeof(struct cn_msg) ||
skb->len < nlh->nlmsg_len ||
nlh->nlmsg_len > CONNECTOR_MAX_MSG_SIZE) {
-#if 0
+#if 1
printk(KERN_INFO "nlmsg_len=%u, sizeof(*nlh)=%u\n",
nlh->nlmsg_len, sizeof(*nlh));
#endif
kfree_skb(skb);
- break;
+ goto out;
}
len = NLMSG_ALIGN(nlh->nlmsg_len);
@@ -233,22 +233,11 @@
len = skb->len;
err = __cn_rx_skb(skb, nlh);
- if (err) {
-#if 0
- if (err < 0 && (nlh->nlmsg_flags & NLM_F_ACK))
- netlink_ack(skb, nlh, -err);
-#endif
- break;
- } else {
-#if 0
- if (nlh->nlmsg_flags & NLM_F_ACK)
- netlink_ack(skb, nlh, 0);
-#endif
- break;
- }
- skb_pull(skb, len);
+ if (err < 0)
+ kfree_skb(skb);
}
-
+
+out:
kfree_skb(__skb);
}
--
Evgeniy Polyakov
Crash is better than data corruption -- Arthur Grabowski
Could you give attached patch a try instead of previous one.
It adds gfp mask into cn_netlink_send() call also.
If you need updated CBUS sources, feel free to ask,
I will send updated sources with Andrew's comments resolved too.
I do not know exactly your connector version,
so patch will probably be applied with fuzz.
feel free to contact if it does not apply, I will send
the whole sources.
Thank you.
* looking for [email protected]/connector--main--0--patch-38 to compare with
* comparing to [email protected]/connector--main--0--patch-38
M connector.c
M connector.h
M cbus.c
* modified files
--- orig/drivers/connector/connector.c
+++ mod/drivers/connector/connector.c
@@ -70,7 +70,7 @@
* then it is new message.
*
*/
-void cn_netlink_send(struct cn_msg *msg, u32 __groups)
+void cn_netlink_send(struct cn_msg *msg, u32 __groups, int gfp_mask)
{
struct cn_callback_entry *n, *__cbq;
unsigned int size;
@@ -102,7 +102,7 @@
size = NLMSG_SPACE(sizeof(*msg) + msg->len);
- skb = alloc_skb(size, GFP_ATOMIC);
+ skb = alloc_skb(size, gfp_mask);
if (!skb) {
printk(KERN_ERR "Failed to allocate new skb with size=%u.\n", size);
return;
@@ -119,11 +119,11 @@
#endif
NETLINK_CB(skb).dst_groups = groups;
-
- uskb = skb_clone(skb, GFP_ATOMIC);
- if (uskb)
+#if 0
+ uskb = skb_clone(skb, gfp_mask);
+ if (uskb && 0)
netlink_unicast(dev->nls, uskb, 0, 0);
-
+#endif
netlink_broadcast(dev->nls, skb, 0, groups, GFP_ATOMIC);
return;
@@ -158,7 +158,7 @@
}
spin_unlock_bh(&dev->cbdev->queue_lock);
- return found;
+ return (found)?0:-ENODEV;
}
/*
@@ -181,7 +181,6 @@
"requested msg->len=%u[%u], nlh->nlmsg_len=%u, skb->len=%u.\n",
msg->len, NLMSG_SPACE(msg->len + sizeof(*msg)),
nlh->nlmsg_len, skb->len);
- kfree_skb(skb);
return -EINVAL;
}
#if 0
@@ -215,17 +214,18 @@
skb->len, skb->data_len, skb->truesize, skb->protocol,
skb_cloned(skb), skb_shared(skb));
#endif
- while (skb->len >= NLMSG_SPACE(0)) {
+ if (skb->len >= NLMSG_SPACE(0)) {
nlh = (struct nlmsghdr *)skb->data;
+
if (nlh->nlmsg_len < sizeof(struct cn_msg) ||
skb->len < nlh->nlmsg_len ||
nlh->nlmsg_len > CONNECTOR_MAX_MSG_SIZE) {
-#if 0
+#if 1
printk(KERN_INFO "nlmsg_len=%u, sizeof(*nlh)=%u\n",
nlh->nlmsg_len, sizeof(*nlh));
#endif
kfree_skb(skb);
- break;
+ goto out;
}
len = NLMSG_ALIGN(nlh->nlmsg_len);
@@ -233,22 +233,11 @@
len = skb->len;
err = __cn_rx_skb(skb, nlh);
- if (err) {
-#if 0
- if (err < 0 && (nlh->nlmsg_flags & NLM_F_ACK))
- netlink_ack(skb, nlh, -err);
-#endif
- break;
- } else {
-#if 0
- if (nlh->nlmsg_flags & NLM_F_ACK)
- netlink_ack(skb, nlh, 0);
-#endif
- break;
- }
- skb_pull(skb, len);
+ if (err < 0)
+ kfree_skb(skb);
}
-
+
+out:
kfree_skb(__skb);
}
@@ -310,7 +299,7 @@
m.ack = notify_event;
memcpy(&m.id, id, sizeof(m.id));
- cn_netlink_send(&m, ctl->group);
+ cn_netlink_send(&m, ctl->group, GFP_ATOMIC);
}
}
spin_unlock_bh(¬ify_lock);
--- orig/include/linux/connector.h
+++ mod/include/linux/connector.h
@@ -148,7 +148,7 @@
int cn_add_callback(struct cb_id *, char *, void (* callback)(void *));
void cn_del_callback(struct cb_id *);
-void cn_netlink_send(struct cn_msg *, u32);
+void cn_netlink_send(struct cn_msg *, u32, int);
int cn_queue_add_callback(struct cn_queue_dev *dev, struct cn_callback *cb);
void cn_queue_del_callback(struct cn_queue_dev *dev, struct cb_id *id);
--
Evgeniy Polyakov
Crash is better than data corruption -- Arthur Grabowski
My workarea was based on 2.6.12-rc1-mm4 plus Guilluame's patch.
Your patch caused 5 out of 8 hunks failure at connector.c
and one failure at connector.h.
Could you generate a new patch based on my version? A tar
file of complete source of drivers/connector would work
also. :)
Thanks!
- jay
Evgeniy Polyakov wrote:
> Could you give attached patch a try instead of previous one.
> It adds gfp mask into cn_netlink_send() call also.
> If you need updated CBUS sources, feel free to ask,
> I will send updated sources with Andrew's comments resolved too.
>
> I do not know exactly your connector version,
> so patch will probably be applied with fuzz.
>
> feel free to contact if it does not apply, I will send
> the whole sources.
>
> Thank you.
>
> * looking for [email protected]/connector--main--0--patch-38 to compare with
> * comparing to [email protected]/connector--main--0--patch-38
> M connector.c
> M connector.h
> M cbus.c
>
> * modified files
>
> --- orig/drivers/connector/connector.c
> +++ mod/drivers/connector/connector.c
> @@ -70,7 +70,7 @@
> * then it is new message.
> *
> */
> -void cn_netlink_send(struct cn_msg *msg, u32 __groups)
> +void cn_netlink_send(struct cn_msg *msg, u32 __groups, int gfp_mask)
> {
> struct cn_callback_entry *n, *__cbq;
> unsigned int size;
> @@ -102,7 +102,7 @@
>
> size = NLMSG_SPACE(sizeof(*msg) + msg->len);
>
> - skb = alloc_skb(size, GFP_ATOMIC);
> + skb = alloc_skb(size, gfp_mask);
> if (!skb) {
> printk(KERN_ERR "Failed to allocate new skb with size=%u.\n", size);
> return;
> @@ -119,11 +119,11 @@
> #endif
>
> NETLINK_CB(skb).dst_groups = groups;
> -
> - uskb = skb_clone(skb, GFP_ATOMIC);
> - if (uskb)
> +#if 0
> + uskb = skb_clone(skb, gfp_mask);
> + if (uskb && 0)
> netlink_unicast(dev->nls, uskb, 0, 0);
> -
> +#endif
> netlink_broadcast(dev->nls, skb, 0, groups, GFP_ATOMIC);
>
> return;
> @@ -158,7 +158,7 @@
> }
> spin_unlock_bh(&dev->cbdev->queue_lock);
>
> - return found;
> + return (found)?0:-ENODEV;
> }
>
> /*
> @@ -181,7 +181,6 @@
> "requested msg->len=%u[%u], nlh->nlmsg_len=%u, skb->len=%u.\n",
> msg->len, NLMSG_SPACE(msg->len + sizeof(*msg)),
> nlh->nlmsg_len, skb->len);
> - kfree_skb(skb);
> return -EINVAL;
> }
> #if 0
> @@ -215,17 +214,18 @@
> skb->len, skb->data_len, skb->truesize, skb->protocol,
> skb_cloned(skb), skb_shared(skb));
> #endif
> - while (skb->len >= NLMSG_SPACE(0)) {
> + if (skb->len >= NLMSG_SPACE(0)) {
> nlh = (struct nlmsghdr *)skb->data;
> +
> if (nlh->nlmsg_len < sizeof(struct cn_msg) ||
> skb->len < nlh->nlmsg_len ||
> nlh->nlmsg_len > CONNECTOR_MAX_MSG_SIZE) {
> -#if 0
> +#if 1
> printk(KERN_INFO "nlmsg_len=%u, sizeof(*nlh)=%u\n",
> nlh->nlmsg_len, sizeof(*nlh));
> #endif
> kfree_skb(skb);
> - break;
> + goto out;
> }
>
> len = NLMSG_ALIGN(nlh->nlmsg_len);
> @@ -233,22 +233,11 @@
> len = skb->len;
>
> err = __cn_rx_skb(skb, nlh);
> - if (err) {
> -#if 0
> - if (err < 0 && (nlh->nlmsg_flags & NLM_F_ACK))
> - netlink_ack(skb, nlh, -err);
> -#endif
> - break;
> - } else {
> -#if 0
> - if (nlh->nlmsg_flags & NLM_F_ACK)
> - netlink_ack(skb, nlh, 0);
> -#endif
> - break;
> - }
> - skb_pull(skb, len);
> + if (err < 0)
> + kfree_skb(skb);
> }
> -
> +
> +out:
> kfree_skb(__skb);
> }
>
> @@ -310,7 +299,7 @@
> m.ack = notify_event;
>
> memcpy(&m.id, id, sizeof(m.id));
> - cn_netlink_send(&m, ctl->group);
> + cn_netlink_send(&m, ctl->group, GFP_ATOMIC);
> }
> }
> spin_unlock_bh(¬ify_lock);
>
>
> --- orig/include/linux/connector.h
> +++ mod/include/linux/connector.h
> @@ -148,7 +148,7 @@
>
> int cn_add_callback(struct cb_id *, char *, void (* callback)(void *));
> void cn_del_callback(struct cb_id *);
> -void cn_netlink_send(struct cn_msg *, u32);
> +void cn_netlink_send(struct cn_msg *, u32, int);
>
> int cn_queue_add_callback(struct cn_queue_dev *dev, struct cn_callback *cb);
> void cn_queue_del_callback(struct cn_queue_dev *dev, struct cb_id *id);
>
>
>
>
>
Hi Evgeniy,
Forget about my previous request of a new patch.
The failures were straight forward enough to figure out.
- jay
Jay Lan wrote:
> My workarea was based on 2.6.12-rc1-mm4 plus Guilluame's patch.
>
> Your patch caused 5 out of 8 hunks failure at connector.c
> and one failure at connector.h.
>
> Could you generate a new patch based on my version? A tar
> file of complete source of drivers/connector would work
> also. :)
>
> Thanks!
> - jay
>
> Evgeniy Polyakov wrote:
>
>> Could you give attached patch a try instead of previous one.
>> It adds gfp mask into cn_netlink_send() call also.
>> If you need updated CBUS sources, feel free to ask, I will send
>> updated sources with Andrew's comments resolved too.
>>
>> I do not know exactly your connector version, so patch will probably
>> be applied with fuzz.
>>
>> feel free to contact if it does not apply, I will send
>> the whole sources.
>>
>> Thank you.
>>
>> * looking for [email protected]/connector--main--0--patch-38 to
>> compare with
>> * comparing to [email protected]/connector--main--0--patch-38
>> M connector.c
>> M connector.h
>> M cbus.c
>>
>> * modified files
>>
>> --- orig/drivers/connector/connector.c
>> +++ mod/drivers/connector/connector.c
>> @@ -70,7 +70,7 @@
>> * then it is new message.
>> *
>> */
>> -void cn_netlink_send(struct cn_msg *msg, u32 __groups)
>> +void cn_netlink_send(struct cn_msg *msg, u32 __groups, int gfp_mask)
>> {
>> struct cn_callback_entry *n, *__cbq;
>> unsigned int size;
>> @@ -102,7 +102,7 @@
>>
>> size = NLMSG_SPACE(sizeof(*msg) + msg->len);
>>
>> - skb = alloc_skb(size, GFP_ATOMIC);
>> + skb = alloc_skb(size, gfp_mask);
>> if (!skb) {
>> printk(KERN_ERR "Failed to allocate new skb with size=%u.\n",
>> size);
>> return;
>> @@ -119,11 +119,11 @@
>> #endif
>>
>> NETLINK_CB(skb).dst_groups = groups;
>> -
>> - uskb = skb_clone(skb, GFP_ATOMIC);
>> - if (uskb)
>> +#if 0
>> + uskb = skb_clone(skb, gfp_mask);
>> + if (uskb && 0)
>> netlink_unicast(dev->nls, uskb, 0, 0);
>> -
>> +#endif
>> netlink_broadcast(dev->nls, skb, 0, groups, GFP_ATOMIC);
>>
>> return;
>> @@ -158,7 +158,7 @@
>> }
>> spin_unlock_bh(&dev->cbdev->queue_lock);
>>
>> - return found;
>> + return (found)?0:-ENODEV;
>> }
>>
>> /*
>> @@ -181,7 +181,6 @@
>> "requested msg->len=%u[%u], nlh->nlmsg_len=%u,
>> skb->len=%u.\n",
>> msg->len, NLMSG_SPACE(msg->len + sizeof(*msg)),
>> nlh->nlmsg_len, skb->len);
>> - kfree_skb(skb);
>> return -EINVAL;
>> }
>> #if 0
>> @@ -215,17 +214,18 @@
>> skb->len, skb->data_len, skb->truesize, skb->protocol,
>> skb_cloned(skb), skb_shared(skb));
>> #endif
>> - while (skb->len >= NLMSG_SPACE(0)) {
>> + if (skb->len >= NLMSG_SPACE(0)) {
>> nlh = (struct nlmsghdr *)skb->data;
>> +
>> if (nlh->nlmsg_len < sizeof(struct cn_msg) ||
>> skb->len < nlh->nlmsg_len ||
>> nlh->nlmsg_len > CONNECTOR_MAX_MSG_SIZE) {
>> -#if 0
>> +#if 1
>> printk(KERN_INFO "nlmsg_len=%u, sizeof(*nlh)=%u\n",
>> nlh->nlmsg_len, sizeof(*nlh));
>> #endif
>> kfree_skb(skb);
>> - break;
>> + goto out;
>> }
>>
>> len = NLMSG_ALIGN(nlh->nlmsg_len);
>> @@ -233,22 +233,11 @@
>> len = skb->len;
>>
>> err = __cn_rx_skb(skb, nlh);
>> - if (err) {
>> -#if 0
>> - if (err < 0 && (nlh->nlmsg_flags & NLM_F_ACK))
>> - netlink_ack(skb, nlh, -err);
>> -#endif
>> - break;
>> - } else {
>> -#if 0
>> - if (nlh->nlmsg_flags & NLM_F_ACK)
>> - netlink_ack(skb, nlh, 0);
>> -#endif
>> - break;
>> - }
>> - skb_pull(skb, len);
>> + if (err < 0)
>> + kfree_skb(skb);
>> }
>> -
>> +
>> +out:
>> kfree_skb(__skb);
>> }
>>
>> @@ -310,7 +299,7 @@
>> m.ack = notify_event;
>>
>> memcpy(&m.id, id, sizeof(m.id));
>> - cn_netlink_send(&m, ctl->group);
>> + cn_netlink_send(&m, ctl->group, GFP_ATOMIC);
>> }
>> }
>> spin_unlock_bh(¬ify_lock);
>>
>>
>> --- orig/include/linux/connector.h
>> +++ mod/include/linux/connector.h
>> @@ -148,7 +148,7 @@
>>
>> int cn_add_callback(struct cb_id *, char *, void (* callback)(void *));
>> void cn_del_callback(struct cb_id *);
>> -void cn_netlink_send(struct cn_msg *, u32);
>> +void cn_netlink_send(struct cn_msg *, u32, int);
>>
>> int cn_queue_add_callback(struct cn_queue_dev *dev, struct
>> cn_callback *cb);
>> void cn_queue_del_callback(struct cn_queue_dev *dev, struct cb_id *id);
>>
>>
>>
>>
>>
>
On Fri, 08 Apr 2005 15:08:13 -0700
Jay Lan <[email protected]> wrote:
> Hi Evgeniy,
>
> Forget about my previous request of a new patch.
>
> The failures were straight forward enough to figure out.
Ok.
The latest sources are always awailable at
http://tservice.net.ru/~s0mbre/archive/connector
> - jay
>
> Jay Lan wrote:
> > My workarea was based on 2.6.12-rc1-mm4 plus Guilluame's patch.
> >
> > Your patch caused 5 out of 8 hunks failure at connector.c
> > and one failure at connector.h.
> >
> > Could you generate a new patch based on my version? A tar
> > file of complete source of drivers/connector would work
> > also. :)
> >
> > Thanks!
> > - jay
> >
> > Evgeniy Polyakov wrote:
> >
> >> Could you give attached patch a try instead of previous one.
> >> It adds gfp mask into cn_netlink_send() call also.
> >> If you need updated CBUS sources, feel free to ask, I will send
> >> updated sources with Andrew's comments resolved too.
> >>
> >> I do not know exactly your connector version, so patch will probably
> >> be applied with fuzz.
> >>
> >> feel free to contact if it does not apply, I will send
> >> the whole sources.
> >>
> >> Thank you.
> >>
> >> * looking for [email protected]/connector--main--0--patch-38 to
> >> compare with
> >> * comparing to [email protected]/connector--main--0--patch-38
> >> M connector.c
> >> M connector.h
> >> M cbus.c
> >>
> >> * modified files
> >>
> >> --- orig/drivers/connector/connector.c
> >> +++ mod/drivers/connector/connector.c
> >> @@ -70,7 +70,7 @@
> >> * then it is new message.
> >> *
> >> */
> >> -void cn_netlink_send(struct cn_msg *msg, u32 __groups)
> >> +void cn_netlink_send(struct cn_msg *msg, u32 __groups, int gfp_mask)
> >> {
> >> struct cn_callback_entry *n, *__cbq;
> >> unsigned int size;
> >> @@ -102,7 +102,7 @@
> >>
> >> size = NLMSG_SPACE(sizeof(*msg) + msg->len);
> >>
> >> - skb = alloc_skb(size, GFP_ATOMIC);
> >> + skb = alloc_skb(size, gfp_mask);
> >> if (!skb) {
> >> printk(KERN_ERR "Failed to allocate new skb with size=%u.\n",
> >> size);
> >> return;
> >> @@ -119,11 +119,11 @@
> >> #endif
> >>
> >> NETLINK_CB(skb).dst_groups = groups;
> >> -
> >> - uskb = skb_clone(skb, GFP_ATOMIC);
> >> - if (uskb)
> >> +#if 0
> >> + uskb = skb_clone(skb, gfp_mask);
> >> + if (uskb && 0)
> >> netlink_unicast(dev->nls, uskb, 0, 0);
> >> -
> >> +#endif
> >> netlink_broadcast(dev->nls, skb, 0, groups, GFP_ATOMIC);
> >>
> >> return;
> >> @@ -158,7 +158,7 @@
> >> }
> >> spin_unlock_bh(&dev->cbdev->queue_lock);
> >>
> >> - return found;
> >> + return (found)?0:-ENODEV;
> >> }
> >>
> >> /*
> >> @@ -181,7 +181,6 @@
> >> "requested msg->len=%u[%u], nlh->nlmsg_len=%u,
> >> skb->len=%u.\n",
> >> msg->len, NLMSG_SPACE(msg->len + sizeof(*msg)),
> >> nlh->nlmsg_len, skb->len);
> >> - kfree_skb(skb);
> >> return -EINVAL;
> >> }
> >> #if 0
> >> @@ -215,17 +214,18 @@
> >> skb->len, skb->data_len, skb->truesize, skb->protocol,
> >> skb_cloned(skb), skb_shared(skb));
> >> #endif
> >> - while (skb->len >= NLMSG_SPACE(0)) {
> >> + if (skb->len >= NLMSG_SPACE(0)) {
> >> nlh = (struct nlmsghdr *)skb->data;
> >> +
> >> if (nlh->nlmsg_len < sizeof(struct cn_msg) ||
> >> skb->len < nlh->nlmsg_len ||
> >> nlh->nlmsg_len > CONNECTOR_MAX_MSG_SIZE) {
> >> -#if 0
> >> +#if 1
> >> printk(KERN_INFO "nlmsg_len=%u, sizeof(*nlh)=%u\n",
> >> nlh->nlmsg_len, sizeof(*nlh));
> >> #endif
> >> kfree_skb(skb);
> >> - break;
> >> + goto out;
> >> }
> >>
> >> len = NLMSG_ALIGN(nlh->nlmsg_len);
> >> @@ -233,22 +233,11 @@
> >> len = skb->len;
> >>
> >> err = __cn_rx_skb(skb, nlh);
> >> - if (err) {
> >> -#if 0
> >> - if (err < 0 && (nlh->nlmsg_flags & NLM_F_ACK))
> >> - netlink_ack(skb, nlh, -err);
> >> -#endif
> >> - break;
> >> - } else {
> >> -#if 0
> >> - if (nlh->nlmsg_flags & NLM_F_ACK)
> >> - netlink_ack(skb, nlh, 0);
> >> -#endif
> >> - break;
> >> - }
> >> - skb_pull(skb, len);
> >> + if (err < 0)
> >> + kfree_skb(skb);
> >> }
> >> -
> >> +
> >> +out:
> >> kfree_skb(__skb);
> >> }
> >>
> >> @@ -310,7 +299,7 @@
> >> m.ack = notify_event;
> >>
> >> memcpy(&m.id, id, sizeof(m.id));
> >> - cn_netlink_send(&m, ctl->group);
> >> + cn_netlink_send(&m, ctl->group, GFP_ATOMIC);
> >> }
> >> }
> >> spin_unlock_bh(¬ify_lock);
> >>
> >>
> >> --- orig/include/linux/connector.h
> >> +++ mod/include/linux/connector.h
> >> @@ -148,7 +148,7 @@
> >>
> >> int cn_add_callback(struct cb_id *, char *, void (* callback)(void *));
> >> void cn_del_callback(struct cb_id *);
> >> -void cn_netlink_send(struct cn_msg *, u32);
> >> +void cn_netlink_send(struct cn_msg *, u32, int);
> >>
> >> int cn_queue_add_callback(struct cn_queue_dev *dev, struct
> >> cn_callback *cb);
> >> void cn_queue_del_callback(struct cn_queue_dev *dev, struct cb_id *id);
> >>
> >>
> >>
> >>
> >>
> >
Evgeniy Polyakov
Only failure makes us experts. -- Theo de Raadt
With the patch you provide to me, i did not see the bugcheck
at cn_queue_wrapper() at the console.
Unmatched sequence number messages still happened. We expect
to lose packets under system stressed situation, but i still
observed duplicate messages, which concerned me.
Unmatched seq. Rcvd=79477, expected=79478 <=== duplicate
Unmatched seq. Rcvd=713823, expected=713422 * loss of 401 msgs
Unmatched seq. Rcvd=80024, expected=79901 * loss of 123 msgs
Unmatched seq. Rcvd=93632, expected=93633 <=== duplicate
Unmatched seq. Rcvd=94718, expected=93970
Unmatched seq. Rcvd=743576, expected=743502
Unmatched seq. Rcvd=123506, expected=123507 <=== duplicate
Unmatched seq. Rcvd=773753, expected=773503
Unmatched seq. Rcvd=124111, expected=123938
Unmatched seq. Rcvd=157172, expected=157173 <=== duplicate
Unmatched seq. Rcvd=813024, expected=812913 <=== duplicate
Unmatched seq. Rcvd=813024, expected=813025 <=== duplicate
Unmatched seq. Rcvd=157830, expected=157501
Unmatched seq. Rcvd=158408, expected=158145
Unmatched seq. Rcvd=813678, expected=813438
The test system was a two cpu ia64.
Thanks,
- jay
Evgeniy Polyakov wrote:
> On Fri, 08 Apr 2005 15:08:13 -0700
> Jay Lan <[email protected]> wrote:
>
>
>>Hi Evgeniy,
>>
>>Forget about my previous request of a new patch.
>>
>>The failures were straight forward enough to figure out.
>
>
> Ok.
> The latest sources are always awailable at
> http://tservice.net.ru/~s0mbre/archive/connector
>
>
>>- jay
>>
>>Jay Lan wrote:
>>
>>>My workarea was based on 2.6.12-rc1-mm4 plus Guilluame's patch.
>>>
>>>Your patch caused 5 out of 8 hunks failure at connector.c
>>>and one failure at connector.h.
>>>
>>>Could you generate a new patch based on my version? A tar
>>>file of complete source of drivers/connector would work
>>>also. :)
>>>
>>>Thanks!
>>> - jay
>>>
>>>Evgeniy Polyakov wrote:
>>>
>>>
>>>>Could you give attached patch a try instead of previous one.
>>>>It adds gfp mask into cn_netlink_send() call also.
>>>>If you need updated CBUS sources, feel free to ask, I will send
>>>>updated sources with Andrew's comments resolved too.
>>>>
>>>>I do not know exactly your connector version, so patch will probably
>>>>be applied with fuzz.
>>>>
>>>>feel free to contact if it does not apply, I will send
>>>>the whole sources.
>>>>
>>>>Thank you.
>>>>
>>>>* looking for [email protected]/connector--main--0--patch-38 to
>>>>compare with
>>>>* comparing to [email protected]/connector--main--0--patch-38
>>>>M connector.c
>>>>M connector.h
>>>>M cbus.c
>>>>
>>>>* modified files
>>>>
>>>>--- orig/drivers/connector/connector.c
>>>>+++ mod/drivers/connector/connector.c
>>>>@@ -70,7 +70,7 @@
>>>> * then it is new message.
>>>> *
>>>> */
>>>>-void cn_netlink_send(struct cn_msg *msg, u32 __groups)
>>>>+void cn_netlink_send(struct cn_msg *msg, u32 __groups, int gfp_mask)
>>>> {
>>>> struct cn_callback_entry *n, *__cbq;
>>>> unsigned int size;
>>>>@@ -102,7 +102,7 @@
>>>>
>>>> size = NLMSG_SPACE(sizeof(*msg) + msg->len);
>>>>
>>>>- skb = alloc_skb(size, GFP_ATOMIC);
>>>>+ skb = alloc_skb(size, gfp_mask);
>>>> if (!skb) {
>>>> printk(KERN_ERR "Failed to allocate new skb with size=%u.\n",
>>>>size);
>>>> return;
>>>>@@ -119,11 +119,11 @@
>>>> #endif
>>>>
>>>> NETLINK_CB(skb).dst_groups = groups;
>>>>-
>>>>- uskb = skb_clone(skb, GFP_ATOMIC);
>>>>- if (uskb)
>>>>+#if 0
>>>>+ uskb = skb_clone(skb, gfp_mask);
>>>>+ if (uskb && 0)
>>>> netlink_unicast(dev->nls, uskb, 0, 0);
>>>>-
>>>>+#endif
>>>> netlink_broadcast(dev->nls, skb, 0, groups, GFP_ATOMIC);
>>>>
>>>> return;
>>>>@@ -158,7 +158,7 @@
>>>> }
>>>> spin_unlock_bh(&dev->cbdev->queue_lock);
>>>>
>>>>- return found;
>>>>+ return (found)?0:-ENODEV;
>>>> }
>>>>
>>>> /*
>>>>@@ -181,7 +181,6 @@
>>>> "requested msg->len=%u[%u], nlh->nlmsg_len=%u,
>>>>skb->len=%u.\n",
>>>> msg->len, NLMSG_SPACE(msg->len + sizeof(*msg)),
>>>> nlh->nlmsg_len, skb->len);
>>>>- kfree_skb(skb);
>>>> return -EINVAL;
>>>> }
>>>> #if 0
>>>>@@ -215,17 +214,18 @@
>>>> skb->len, skb->data_len, skb->truesize, skb->protocol,
>>>> skb_cloned(skb), skb_shared(skb));
>>>> #endif
>>>>- while (skb->len >= NLMSG_SPACE(0)) {
>>>>+ if (skb->len >= NLMSG_SPACE(0)) {
>>>> nlh = (struct nlmsghdr *)skb->data;
>>>>+
>>>> if (nlh->nlmsg_len < sizeof(struct cn_msg) ||
>>>> skb->len < nlh->nlmsg_len ||
>>>> nlh->nlmsg_len > CONNECTOR_MAX_MSG_SIZE) {
>>>>-#if 0
>>>>+#if 1
>>>> printk(KERN_INFO "nlmsg_len=%u, sizeof(*nlh)=%u\n",
>>>> nlh->nlmsg_len, sizeof(*nlh));
>>>> #endif
>>>> kfree_skb(skb);
>>>>- break;
>>>>+ goto out;
>>>> }
>>>>
>>>> len = NLMSG_ALIGN(nlh->nlmsg_len);
>>>>@@ -233,22 +233,11 @@
>>>> len = skb->len;
>>>>
>>>> err = __cn_rx_skb(skb, nlh);
>>>>- if (err) {
>>>>-#if 0
>>>>- if (err < 0 && (nlh->nlmsg_flags & NLM_F_ACK))
>>>>- netlink_ack(skb, nlh, -err);
>>>>-#endif
>>>>- break;
>>>>- } else {
>>>>-#if 0
>>>>- if (nlh->nlmsg_flags & NLM_F_ACK)
>>>>- netlink_ack(skb, nlh, 0);
>>>>-#endif
>>>>- break;
>>>>- }
>>>>- skb_pull(skb, len);
>>>>+ if (err < 0)
>>>>+ kfree_skb(skb);
>>>> }
>>>>-
>>>>+
>>>>+out:
>>>> kfree_skb(__skb);
>>>> }
>>>>
>>>>@@ -310,7 +299,7 @@
>>>> m.ack = notify_event;
>>>>
>>>> memcpy(&m.id, id, sizeof(m.id));
>>>>- cn_netlink_send(&m, ctl->group);
>>>>+ cn_netlink_send(&m, ctl->group, GFP_ATOMIC);
>>>> }
>>>> }
>>>> spin_unlock_bh(¬ify_lock);
>>>>
>>>>
>>>>--- orig/include/linux/connector.h
>>>>+++ mod/include/linux/connector.h
>>>>@@ -148,7 +148,7 @@
>>>>
>>>> int cn_add_callback(struct cb_id *, char *, void (* callback)(void *));
>>>> void cn_del_callback(struct cb_id *);
>>>>-void cn_netlink_send(struct cn_msg *, u32);
>>>>+void cn_netlink_send(struct cn_msg *, u32, int);
>>>>
>>>> int cn_queue_add_callback(struct cn_queue_dev *dev, struct
>>>>cn_callback *cb);
>>>> void cn_queue_del_callback(struct cn_queue_dev *dev, struct cb_id *id);
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>
>
> Evgeniy Polyakov
>
> Only failure makes us experts. -- Theo de Raadt
On Fri, 08 Apr 2005 20:31:20 -0700
Jay Lan <[email protected]> wrote:
> With the patch you provide to me, i did not see the bugcheck
> at cn_queue_wrapper() at the console.
>
> Unmatched sequence number messages still happened. We expect
> to lose packets under system stressed situation, but i still
> observed duplicate messages, which concerned me.
What tool and what version do you use
as message generator and receiver?
> Thanks,
> - jay
Evgeniy Polyakov
Only failure makes us experts. -- Theo de Raadt
#include <sys/signal.h>
#include <sys/time.h>
int main(int argc, char *argv[])
{
int pid;
int i = 0, max = 100000;
struct timeval tv0, tv1;
struct timezone tz;
long diff;
if (argc >= 2)
max = atoi(argv[1]);
signal(SIGCHLD, SIG_IGN);
gettimeofday(&tv0, &tz);
while (i++ < max) {
pid = fork();
if (pid == 0) {
sleep(1);
exit (0);
}
}
gettimeofday(&tv1, &tz);
diff = (tv1.tv_sec - tv0.tv_sec)*1000000 + (tv1.tv_usec - tv0.tv_usec);
printf("Average per process fork+exit time is %ld usecs [diff=%lu, max=%d].\n", diff/max, diff, max);
return 0;
}
On Sun, Apr 10, 2005 at 10:43:24PM -0700, Jay Lan ([email protected]) wrote:
> I based my listen program on the fclisten.c posted by Kaigai Kohei
> with my own modification. Unfortunately i lost my test machine in the
> lab. I will recreate the listen program Monday. The original listener
> did not validate sequence number. It also prints length of data and
> sequence number of every message it receives. My listener prints
> only out-of-sequence error messages.
>
> The fork generator fork-test.c was yours? I called it fork-test
> and let it run continuously in while-loop:
>
> # while 1
> # ./fork-test 10000000
> # sleep 1
> # end
>
> I let it do 10,000,000 times of fork continuously while the system
> is running AIM7 and/or ubench.
>
> The original fclisten.c and fork-test.c are attached for your reference.
It is pretty normal to see duplicated numbers in a fork test -
I observed it too, since counter is incremented without locks
we can catch situation when it is incremented simultaneously
on both processors, the latest version of the fork connector
from Guillaume contains processor id in the message and per cpu counters,
so one can destinguish messages which sequence numbers will flow
in a very similar way now.
> - jay
>
--
Evgeniy Polyakov
Evgeniy Polyakov <[email protected]> wrote:
>
> On Sun, Apr 10, 2005 at 10:43:24PM -0700, Jay Lan ([email protected]) wrote:
> > I based my listen program on the fclisten.c posted by Kaigai Kohei
> > with my own modification. Unfortunately i lost my test machine in the
> > lab. I will recreate the listen program Monday. The original listener
> > did not validate sequence number. It also prints length of data and
> > sequence number of every message it receives. My listener prints
> > only out-of-sequence error messages.
> >
> > The fork generator fork-test.c was yours? I called it fork-test
> > and let it run continuously in while-loop:
> >
> > # while 1
> > # ./fork-test 10000000
> > # sleep 1
> > # end
> >
> > I let it do 10,000,000 times of fork continuously while the system
> > is running AIM7 and/or ubench.
> >
> > The original fclisten.c and fork-test.c are attached for your reference.
>
> It is pretty normal to see duplicated numbers in a fork test -
> I observed it too, since counter is incremented without locks
> we can catch situation when it is incremented simultaneously
> on both processors, the latest version of the fork connector
> from Guillaume contains processor id in the message and per cpu counters,
> so one can destinguish messages which sequence numbers will flow
> in a very similar way now.
Oh come on, that's just daft. Evgeniy, put a lock in there and fix it up.
On Sun, Apr 10, 2005 at 11:51:24PM -0700, Andrew Morton ([email protected]) wrote:
> Evgeniy Polyakov <[email protected]> wrote:
> >
> > On Sun, Apr 10, 2005 at 10:43:24PM -0700, Jay Lan ([email protected]) wrote:
> > > I based my listen program on the fclisten.c posted by Kaigai Kohei
> > > with my own modification. Unfortunately i lost my test machine in the
> > > lab. I will recreate the listen program Monday. The original listener
> > > did not validate sequence number. It also prints length of data and
> > > sequence number of every message it receives. My listener prints
> > > only out-of-sequence error messages.
> > >
> > > The fork generator fork-test.c was yours? I called it fork-test
> > > and let it run continuously in while-loop:
> > >
> > > # while 1
> > > # ./fork-test 10000000
> > > # sleep 1
> > > # end
> > >
> > > I let it do 10,000,000 times of fork continuously while the system
> > > is running AIM7 and/or ubench.
> > >
> > > The original fclisten.c and fork-test.c are attached for your reference.
> >
> > It is pretty normal to see duplicated numbers in a fork test -
> > I observed it too, since counter is incremented without locks
> > we can catch situation when it is incremented simultaneously
> > on both processors, the latest version of the fork connector
> > from Guillaume contains processor id in the message and per cpu counters,
> > so one can destinguish messages which sequence numbers will flow
> > in a very similar way now.
>
> Oh come on, that's just daft. Evgeniy, put a lock in there and fix it up.
#ifndef FAST_AND_SUSPICIOUS
spin_lock(&fork_lock);
#endif
seq++;
#ifndef FAST_AND_SUSPICIOUS
spin_unlock(&fork_lock);
#endif
:)
--
Evgeniy Polyakov ( s0mbre )