2011-06-20 11:40:50

by Johannes Berg

[permalink] [raw]
Subject: [PATCH] netlink: advertise incomplete dumps

From: Johannes Berg <[email protected]>

Consider the following situation:
* a dump that would show 8 entries, four in the first
round, and four in the second
* between the first and second rounds, 6 entries are
removed
* now the second round will not show any entry, and
even if there is a sequence/generation counter the
application will not know

To solve this problem, add a new flag NLM_F_DUMP_INTR
to the netlink header that indicates the dump wasn't
consistent, this flag can also be set on the MSG_DONE
message that terminates the dump, and as such above
situation can be detected.

To achieve this, add a sequence counter to the netlink
callback struct. Of course, netlink code still needs
to use this new functionality. The correct way to do
that is to always set cb->seq when a dumpit callback
is invoked and call nl_dump_check_consistent() for
each new message. The core code will also call this
function for the final MSG_DONE message.

To make it usable with generic netlink, a new function
genlmsg_nlhdr() is needed to obtain the netlink header
from the genetlink user header.

Signed-off-by: Johannes Berg <[email protected]>
---
Should we merge this through wireless? From wireless-next it'll go into
net-next quickly, but the converse isn't true, so using it in wireless
would be harder if it was merged through net-next I think? Or John can
cherry-pick into wireless-testing?


include/linux/netlink.h | 2 ++
include/net/genetlink.h | 32 ++++++++++++++++++++++++++++++++
include/net/netlink.h | 24 ++++++++++++++++++++++++
net/netlink/af_netlink.c | 2 ++
4 files changed, 60 insertions(+)

--- a/include/linux/netlink.h 2011-06-20 08:42:41.000000000 +0200
+++ b/include/linux/netlink.h 2011-06-20 09:39:19.000000000 +0200
@@ -49,6 +49,7 @@ struct nlmsghdr {
#define NLM_F_MULTI 2 /* Multipart message, terminated by NLMSG_DONE */
#define NLM_F_ACK 4 /* Reply with ack, with zero or error code */
#define NLM_F_ECHO 8 /* Echo this request */
+#define NLM_F_DUMP_INTR 16 /* Dump was inconsistent due to sequence change */

/* Modifiers to GET request */
#define NLM_F_ROOT 0x100 /* specify tree root */
@@ -222,6 +223,7 @@ struct netlink_callback {
struct netlink_callback *cb);
int (*done)(struct netlink_callback *cb);
int family;
+ unsigned int prev_seq, seq;
long args[6];
};

--- a/net/netlink/af_netlink.c 2011-06-20 08:42:41.000000000 +0200
+++ b/net/netlink/af_netlink.c 2011-06-20 09:39:19.000000000 +0200
@@ -1693,6 +1693,8 @@ static int netlink_dump(struct sock *sk)
if (!nlh)
goto errout_skb;

+ nl_dump_check_consistent(cb, nlh);
+
memcpy(nlmsg_data(nlh), &len, sizeof(len));

if (sk_filter(sk, skb))
--- a/include/net/genetlink.h 2011-06-20 08:42:41.000000000 +0200
+++ b/include/net/genetlink.h 2011-06-20 13:36:54.000000000 +0200
@@ -160,6 +160,38 @@ static inline void *genlmsg_put(struct s
}

/**
+ * genlmsg_nlhdr - Obtain netlink header from user specified header
+ * @user_hdr: user header as returned from genlmsg_put()
+ * @family: generic netlink family
+ *
+ * Returns pointer to netlink header.
+ */
+static inline struct nlmsghdr *genlmsg_nlhdr(void *user_hdr,
+ struct genl_family *family)
+{
+ return (struct nlmsghdr *)((char *)user_hdr -
+ family->hdrsize -
+ GENL_HDRLEN -
+ NLMSG_HDRLEN);
+}
+
+/**
+ * genl_dump_check_consistent - check if sequence is consistent and advertise if not
+ * @cb: netlink callback structure that stores the sequence number
+ * @user_hdr: user header as returned from genlmsg_put()
+ * @family: generic netlink family
+ *
+ * Cf. nl_dump_check_consistent(), this just provides a wrapper to make it
+ * simpler to use with generic netlink.
+ */
+static inline void genl_dump_check_consistent(struct netlink_callback *cb,
+ void *user_hdr,
+ struct genl_family *family)
+{
+ nl_dump_check_consistent(cb, genlmsg_nlhdr(user_hdr, family));
+}
+
+/**
* genlmsg_put_reply - Add generic netlink header to a reply message
* @skb: socket buffer holding the message
* @info: receiver info
--- a/include/net/netlink.h 2011-06-20 08:42:41.000000000 +0200
+++ b/include/net/netlink.h 2011-06-20 09:39:19.000000000 +0200
@@ -638,6 +638,30 @@ static inline int nlmsg_unicast(struct s
nlmsg_ok(pos, rem); \
pos = nlmsg_next(pos, &(rem)))

+/**
+ * nl_dump_check_consistent - check if sequence is consistent and advertise if not
+ * @cb: netlink callback structure that stores the sequence number
+ * @nlh: netlink message header to write the flag to
+ *
+ * This function checks if the sequence (generation) number changed during dump
+ * and if it did, advertises it in the netlink message header.
+ *
+ * The correct way to use it is to set cb->seq to the generation counter when
+ * all locks for dumping have been acquired, and then call this function for
+ * each message that is generated.
+ *
+ * Note that due to initialisation concerns, 0 is an invalid sequence number
+ * and must not be used by code that uses this functionality.
+ */
+static inline void
+nl_dump_check_consistent(struct netlink_callback *cb,
+ struct nlmsghdr *nlh)
+{
+ if (cb->prev_seq && cb->seq != cb->prev_seq)
+ nlh->nlmsg_flags |= NLM_F_DUMP_INTR;
+ cb->prev_seq = cb->seq;
+}
+
/**************************************************************************
* Netlink Attributes
**************************************************************************/




2011-06-20 21:02:51

by David Miller

[permalink] [raw]
Subject: Re: [PATCH] netlink: advertise incomplete dumps

From: Johannes Berg <[email protected]>
Date: Mon, 20 Jun 2011 13:40:46 +0200

> Should we merge this through wireless? From wireless-next it'll go into
> net-next quickly, but the converse isn't true, so using it in wireless
> would be harder if it was merged through net-next I think? Or John can
> cherry-pick into wireless-testing?

I'm fine with this going in via the wireless tree, that way we'll
see a real user when it merged into net-next-2.6

Acked-by: David S. Miller <[email protected]>

2011-06-21 13:11:27

by Thomas Graf

[permalink] [raw]
Subject: [PATCH] rtnl: provide link dump consistency info

This patch adds a change sequence counter to each net namespace
which is bumped whenever a netdevice is added or removed from
the list. If such a change occurred while a link dump took place,
the dump will have the NLM_F_DUMP_INTR flag set in the first
message which has been interrupted and in all subsequent messages
of the same dump.

Note that links may still be modified or renamed while a dump is
taking place but we can guarantee for userspace to receive a
complete list of links and not miss any.

Testing:
I have added 500 VLAN netdevices to make sure the dump is split
over multiple messages. Then while continuously dumping links in
one process I also continuously deleted and re-added a dummy
netdevice in another process. Multiple dumps per seconds have
had the NLM_F_DUMP_INTR flag set.

I guess we can wait for Johannes patch to hit net-next via the
wireless tree. I just wanted to give this some testing right away.

Signed-off-by: Thomas Graf <[email protected]>

diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 2bf9ed9..ff5c680 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -62,6 +62,7 @@ struct net {
struct list_head dev_base_head;
struct hlist_head *dev_name_head;
struct hlist_head *dev_index_head;
+ unsigned int dev_base_seq; /* protected by rtnl_mutex */

/* core fib_rules */
struct list_head rules_ops;
diff --git a/net/core/dev.c b/net/core/dev.c
index 9c58c1e..97f30b4 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -199,6 +199,11 @@ static struct list_head ptype_all __read_mostly; /* Taps */
DEFINE_RWLOCK(dev_base_lock);
EXPORT_SYMBOL(dev_base_lock);

+static inline void dev_base_seq_inc(struct net *net)
+{
+ while (++net->dev_base_seq == 0);
+}
+
static inline struct hlist_head *dev_name_hash(struct net *net, const char *name)
{
unsigned hash = full_name_hash(name, strnlen(name, IFNAMSIZ));
@@ -237,6 +242,9 @@ static int list_netdevice(struct net_device *dev)
hlist_add_head_rcu(&dev->index_hlist,
dev_index_hash(net, dev->ifindex));
write_unlock_bh(&dev_base_lock);
+
+ dev_base_seq_inc(net);
+
return 0;
}

@@ -253,6 +261,8 @@ static void unlist_netdevice(struct net_device *dev)
hlist_del_rcu(&dev->name_hlist);
hlist_del_rcu(&dev->index_hlist);
write_unlock_bh(&dev_base_lock);
+
+ dev_base_seq_inc(dev_net(dev));
}

/*
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index e41e511..91f03c7 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -128,6 +128,7 @@ static __net_init int setup_net(struct net *net)
LIST_HEAD(net_exit_list);

atomic_set(&net->count, 1);
+ net->dev_base_seq = 1;

#ifdef NETNS_REFCNT_DEBUG
atomic_set(&net->use_count, 0);
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index abd936d..8d694b6 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1009,6 +1009,8 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb)
s_idx = cb->args[1];

rcu_read_lock();
+ cb->seq = net->dev_base_seq;
+
for (h = s_h; h < NETDEV_HASHENTRIES; h++, s_idx = 0) {
idx = 0;
head = &net->dev_index_head[h];
@@ -1020,6 +1022,8 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb)
cb->nlh->nlmsg_seq, 0,
NLM_F_MULTI) <= 0)
goto out;
+
+ nl_dump_check_consistent(cb, nlmsg_hdr(skb));
cont:
idx++;
}

2011-06-21 07:05:11

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH] netlink: advertise incomplete dumps

On Mon, 2011-06-20 at 13:59 -0700, David Miller wrote:
> From: Johannes Berg <[email protected]>
> Date: Mon, 20 Jun 2011 13:40:46 +0200
>
> > Should we merge this through wireless? From wireless-next it'll go into
> > net-next quickly, but the converse isn't true, so using it in wireless
> > would be harder if it was merged through net-next I think? Or John can
> > cherry-pick into wireless-testing?
>
> I'm fine with this going in via the wireless tree, that way we'll
> see a real user when it merged into net-next-2.6

Thomas said he also has updated some of the core netlink code, but I'll
also have a real user ready and the NFC subsystem is also using it and
those are coming in through wireless. Thanks!

johannes


2011-07-01 22:46:21

by David Miller

[permalink] [raw]
Subject: Re: [PATCH] rtnl: provide link dump consistency info

From: Thomas Graf <[email protected]>
Date: Tue, 21 Jun 2011 09:11:20 -0400

> This patch adds a change sequence counter to each net namespace
> which is bumped whenever a netdevice is added or removed from
> the list. If such a change occurred while a link dump took place,
> the dump will have the NLM_F_DUMP_INTR flag set in the first
> message which has been interrupted and in all subsequent messages
> of the same dump.
>
> Note that links may still be modified or renamed while a dump is
> taking place but we can guarantee for userspace to receive a
> complete list of links and not miss any.
>
> Testing:
> I have added 500 VLAN netdevices to make sure the dump is split
> over multiple messages. Then while continuously dumping links in
> one process I also continuously deleted and re-added a dummy
> netdevice in another process. Multiple dumps per seconds have
> had the NLM_F_DUMP_INTR flag set.
>
> I guess we can wait for Johannes patch to hit net-next via the
> wireless tree. I just wanted to give this some testing right away.
>
> Signed-off-by: Thomas Graf <[email protected]>

Applied.