Currently the network namespace work has gotten about as far as it can
without the ability to make sysctls that are per network namespace.
The technique we have been using for other namespaces, examining
current and replacing the ctl_table.data field depending on the
namespace instance that current->nsproxy refers to, is both ugly
and does not work for the network sysctls.
The case in handling the networking sysctls that does not work with
the existing ugly pointer munging technique is directories like
/proc/sys/net/ipv4/conf/ and /proc/sys/net/ipv4/neigh/ whose contents
vary depending on the networking devices present in the network
namespace.
Adding support to the sysctl infrastructure for registering
a sysctl table for a particular instance of a particular namespace
removes the need for magic sysctl methods, and allows the use
of the techniques for managing dynamic sysctl tables that have been
used for years in the network stack.
Herbert, we need this infrastructure most in net-2.6.25 (as not having
it is a current bottleneck to further development of the network
namespace), so these patches are against net-2.6.25.
Andrew, we also need this infrastructure in -mm so that we can take
advantage of it when implementing other
namespaces.
So I expect the sane way to deal with this patchset is to merge it into
both net-2.6.25 and -mm, and then Andrew can drop or disable the
patches once he bases -mm on a version of net-2.6.25 with
the changes.
Eric
There are a number of modules that register a sysctl table
somewhere deeply nested in the sysctl hierarchy, such as
fs/nfs, fs/xfs, dev/cdrom, etc.
They all specify several dummy ctl_tables for the path name.
This patch implements register_sysctl_paths, which takes
an additional path argument and makes up dummy sysctl nodes
for each component.
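As a rough userspace model of that chaining (not the kernel code itself), the
sketch below builds one dummy directory node per path component and hangs the
caller's table off the last one. The structs are simplified stand-ins with only
the fields needed here, calloc stands in for kzalloc, and build_path_chain is a
hypothetical name used for illustration only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins for the kernel structures (assumption: only the
 * fields needed to show the chaining are modelled). */
struct ctl_table {
	int ctl_name;
	const char *procname;
	unsigned int mode;
	struct ctl_table *child;
};

struct ctl_path {
	const char *procname;
	int ctl_name;
};

struct ctl_table_header {
	struct ctl_table *ctl_table;
};

/* Hypothetical helper: build one 2-element dummy table per path component
 * (entry + all-zero sentinel) and attach the caller's table at the end. */
static struct ctl_table_header *build_path_chain(const struct ctl_path *path,
						 struct ctl_table *table)
{
	struct ctl_table_header *header;
	struct ctl_table *new, **prevp;
	unsigned int n, npath;

	/* Count components; an all-zero entry terminates the path. */
	for (npath = 0; path[npath].ctl_name || path[npath].procname; ++npath)
		;

	/* One allocation: the header followed by 2 * npath table entries. */
	header = calloc(1, sizeof(*header) +
			   2 * npath * sizeof(struct ctl_table));
	if (!header)
		return NULL;

	new = (struct ctl_table *)(header + 1);
	prevp = &header->ctl_table;
	for (n = 0; n < npath; ++n, ++path) {
		new->procname = path->procname;
		new->ctl_name = path->ctl_name;
		new->mode = 0555;
		*prevp = new;
		prevp = &new->child;
		new += 2;	/* skip the sentinel left zeroed by calloc */
	}
	*prevp = table;
	return header;
}
```

Because the header and the dummy nodes share one allocation, a single free of
the header releases everything the helper created; the caller's own table is
never copied or freed by it.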
This patch was originally written by Olaf Kirch, then
brought to my attention and reworked some by Olaf Hering.
I have changed a few additional things, so the bugs are mine.
After converting all of the easy callers, Olaf Hering observed that
with allyesconfig and ARCH=i386 the patch reduces the final binary size by 9369 bytes.
.text +897
.data -7008

    text    data     bss      dec     hex filename
26959310 4045899 4718592 35723801 2211a19 ../vmlinux-vanilla
26960207 4038891 4718592 35717690 221023a ../O-allyesconfig/vmlinux

So this change is both a space savings and a code simplification.
CC: Olaf Kirch <[email protected]>
CC: Olaf Hering <[email protected]>
Signed-off-by: Eric W. Biederman <[email protected]>
---
include/linux/sysctl.h | 9 +++++
kernel/sysctl.c | 90 ++++++++++++++++++++++++++++++++++++++++--------
2 files changed, 84 insertions(+), 15 deletions(-)
diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index e99171f..eb522bf 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -1065,7 +1065,16 @@ struct ctl_table_header
struct completion *unregistering;
};
+/* struct ctl_path describes where in the hierarchy a table is added */
+struct ctl_path
+{
+ const char *procname;
+ int ctl_name;
+};
+
struct ctl_table_header *register_sysctl_table(struct ctl_table * table);
+struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path,
+ struct ctl_table *table);
void unregister_sysctl_table(struct ctl_table_header * table);
int sysctl_check_table(struct ctl_table *table);
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 0deed82..fa92e70 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1490,11 +1490,12 @@ static __init int sysctl_init(void)
core_initcall(sysctl_init);
/**
- * register_sysctl_table - register a sysctl hierarchy
+ * register_sysctl_paths - register a sysctl hierarchy
+ * @path: The path to the directory the sysctl table is in.
* @table: the top-level table structure
*
* Register a sysctl table hierarchy. @table should be a filled in ctl_table
- * array. An entry with a ctl_name of 0 terminates the table.
+ * array. A completely 0 filled entry terminates the table.
*
* The members of the &struct ctl_table structure are used as follows:
*
@@ -1557,28 +1558,80 @@ core_initcall(sysctl_init);
* This routine returns %NULL on a failure to register, and a pointer
* to the table header on success.
*/
-struct ctl_table_header *register_sysctl_table(struct ctl_table * table)
+struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path,
+ struct ctl_table *table)
{
- struct ctl_table_header *tmp;
- tmp = kmalloc(sizeof(struct ctl_table_header), GFP_KERNEL);
- if (!tmp)
+ struct ctl_table_header *header;
+ struct ctl_table *new, **prevp;
+ unsigned int n, npath;
+
+ /* Count the path components */
+ for (npath = 0; path[npath].ctl_name || path[npath].procname; ++npath)
+ ;
+
+ /*
+ * For each path component, allocate a 2-element ctl_table array.
+ * The first array element will be filled with the sysctl entry
+ * for this, the second will be the sentinel (ctl_name == 0).
+ *
+ * We allocate everything in one go so that we don't have to
+ * worry about freeing additional memory in unregister_sysctl_table.
+ */
+ header = kzalloc(sizeof(struct ctl_table_header) +
+ (2 * npath * sizeof(struct ctl_table)), GFP_KERNEL);
+ if (!header)
return NULL;
- tmp->ctl_table = table;
- INIT_LIST_HEAD(&tmp->ctl_entry);
- tmp->used = 0;
- tmp->unregistering = NULL;
- sysctl_set_parent(NULL, table);
- if (sysctl_check_table(tmp->ctl_table)) {
- kfree(tmp);
+
+ new = (struct ctl_table *) (header + 1);
+
+ /* Now connect the dots */
+ prevp = &header->ctl_table;
+ for (n = 0; n < npath; ++n, ++path) {
+ /* Copy the procname */
+ new->procname = path->procname;
+ new->ctl_name = path->ctl_name;
+ new->mode = 0555;
+
+ *prevp = new;
+ prevp = &new->child;
+
+ new += 2;
+ }
+ *prevp = table;
+
+ INIT_LIST_HEAD(&header->ctl_entry);
+ header->used = 0;
+ header->unregistering = NULL;
+ sysctl_set_parent(NULL, header->ctl_table);
+ if (sysctl_check_table(header->ctl_table)) {
+ kfree(header);
return NULL;
}
spin_lock(&sysctl_lock);
- list_add_tail(&tmp->ctl_entry, &root_table_header.ctl_entry);
+ list_add_tail(&header->ctl_entry, &root_table_header.ctl_entry);
spin_unlock(&sysctl_lock);
- return tmp;
+
+ return header;
}
/**
+ * register_sysctl_table - register a sysctl table hierarchy
+ * @table: the top-level table structure
+ *
+ * Register a sysctl table hierarchy. @table should be a filled in ctl_table
+ * array. A completely 0 filled entry terminates the table.
+ *
+ * See register_sysctl_paths for more details.
+ */
+struct ctl_table_header *register_sysctl_table(struct ctl_table * table)
+{
+ static const struct ctl_path null_path[] = { {} };
+
+ return register_sysctl_paths(null_path, table);
+}
+
+
+/**
* unregister_sysctl_table - unregister a sysctl table hierarchy
* @header: the header returned from register_sysctl_table
*
@@ -1600,6 +1653,12 @@ struct ctl_table_header *register_sysctl_table(struct ctl_table * table)
return NULL;
}
+struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path,
+ struct ctl_table *table)
+{
+ return NULL;
+}
+
void unregister_sysctl_table(struct ctl_table_header * table)
{
}
@@ -2658,6 +2717,7 @@ EXPORT_SYMBOL(proc_dostring);
EXPORT_SYMBOL(proc_doulongvec_minmax);
EXPORT_SYMBOL(proc_doulongvec_ms_jiffies_minmax);
EXPORT_SYMBOL(register_sysctl_table);
+EXPORT_SYMBOL(register_sysctl_paths);
EXPORT_SYMBOL(sysctl_intvec);
EXPORT_SYMBOL(sysctl_jiffies);
EXPORT_SYMBOL(sysctl_ms_jiffies);
--
1.5.3.rc6.17.g1911
By recording the ctl_table passed to register_sysctl_paths in the
ctl_table_header (as ctl_table_arg), we allow users of
register_sysctl_paths that build and dynamically allocate their
ctl_table to be simpler. They only need to remember the
ctl_table_header returned from register_sysctl_paths, from which
they can now find the ctl_table array they need to free.
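A minimal userspace sketch of the pattern this enables, assuming simplified
stand-in structs and hypothetical helper names (my_register, my_unregister)
rather than the kernel API: the caller keeps only the header and recovers its
dynamically allocated table from ctl_table_arg at teardown.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in (assumption): only two fields are modelled. */
struct ctl_table {
	const char *procname;
	int *data;
};

/* Stand-in header: ctl_table may point at generated path nodes, while
 * ctl_table_arg always points at the table the caller passed in. */
struct ctl_table_header {
	struct ctl_table *ctl_table;
	struct ctl_table *ctl_table_arg;
};

static int live_tables;	/* caller-allocated tables not yet freed */

/* Hypothetical registration: duplicate a template and register the copy. */
static struct ctl_table_header *my_register(const struct ctl_table *tmpl,
					    size_t n)
{
	struct ctl_table_header *header = calloc(1, sizeof(*header));
	struct ctl_table *table = calloc(n, sizeof(*table));
	size_t i;

	if (!header || !table) {
		free(header);
		free(table);
		return NULL;
	}
	for (i = 0; i < n; i++)
		table[i] = tmpl[i];
	header->ctl_table = table;	/* differs when a path is prepended */
	header->ctl_table_arg = table;
	live_tables++;
	return header;
}

/* Hypothetical teardown: fetch the table before the header goes away. */
static void my_unregister(struct ctl_table_header *header)
{
	struct ctl_table *table = header->ctl_table_arg;

	free(header);
	free(table);
	live_tables--;
}
```

The point of the ordering in my_unregister is that the table pointer has to be
read out before the header is released, since the header is the only place it
is remembered.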
Signed-off-by: Eric W. Biederman <[email protected]>
---
include/linux/sysctl.h | 1 +
kernel/sysctl.c | 1 +
2 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index eb522bf..8b2e9e0 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -1063,6 +1063,7 @@ struct ctl_table_header
struct list_head ctl_entry;
int used;
struct completion *unregistering;
+ struct ctl_table *ctl_table_arg;
};
/* struct ctl_path describes where in the hierarchy a table is added */
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index fa92e70..effae87 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1598,6 +1598,7 @@ struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path,
new += 2;
}
*prevp = table;
+ header->ctl_table_arg = table;
INIT_LIST_HEAD(&header->ctl_entry);
header->used = 0;
--
1.5.3.rc6.17.g1911
The user interface is register_net_sysctl_table and
unregister_net_sysctl_table. It is very much like the current
interface except that there is a network namespace parameter.
With this, any sysctl registered with register_net_sysctl_table
will only show up to tasks in the same network namespace.
All other sysctls continue to be globally visible.
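The visibility rule can be illustrated with a small userspace model. It
assumes stand-in types (a counter in place of the per-net header list) and
hypothetical function names; it only mirrors the idea of copying the caller's
nsproxy, overriding net_ns, and letting a lookup hook pick the list.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for kernel types; only what the sketch needs (assumption). */
struct net {
	int nr_headers;		/* models the per-net header list */
};

struct nsproxy {
	void *uts_ns;		/* placeholder for other namespace pointers */
	struct net *net_ns;
};

/* Models the lookup hook: pick the list from the namespaces passed in. */
static int *net_header_lookup(struct nsproxy *namespaces)
{
	return &namespaces->net_ns->nr_headers;
}

/* Hypothetical model of the registration path: copy the caller's nsproxy
 * on the stack, override net_ns, and register on the list the lookup hook
 * returns for that copy. The caller's own nsproxy is left untouched. */
static void model_register_net(struct nsproxy *current_ns, struct net *net)
{
	struct nsproxy namespaces = *current_ns;

	namespaces.net_ns = net;
	++*net_header_lookup(&namespaces);
}

/* What a task whose nsproxy is @ns would see. */
static int visible_net_headers(struct nsproxy *ns)
{
	return *net_header_lookup(ns);
}
```

In this model a task in one network namespace can register a table for
another namespace without ever becoming visible in its own view, which is the
behaviour the description above asks for.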
Signed-off-by: Eric W. Biederman <[email protected]>
---
include/net/net_namespace.h | 9 +++++++
net/sysctl_net.c | 57 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 66 insertions(+), 0 deletions(-)
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 4d0d634..235214c 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -25,6 +25,8 @@ struct net {
struct proc_dir_entry *proc_net_stat;
struct proc_dir_entry *proc_net_root;
+ struct list_head sysctl_table_headers;
+
struct net_device *loopback_dev; /* The loopback */
struct list_head dev_base_head;
@@ -144,4 +146,11 @@ extern void unregister_pernet_subsys(struct pernet_operations *);
extern int register_pernet_device(struct pernet_operations *);
extern void unregister_pernet_device(struct pernet_operations *);
+struct ctl_path;
+struct ctl_table;
+struct ctl_table_header;
+extern struct ctl_table_header *register_net_sysctl_table(struct net *net,
+ const struct ctl_path *path, struct ctl_table *table);
+extern void unregister_net_sysctl_table(struct ctl_table_header *header);
+
#endif /* __NET_NET_NAMESPACE_H */
diff --git a/net/sysctl_net.c b/net/sysctl_net.c
index cd4eafb..c50c793 100644
--- a/net/sysctl_net.c
+++ b/net/sysctl_net.c
@@ -14,6 +14,7 @@
#include <linux/mm.h>
#include <linux/sysctl.h>
+#include <linux/nsproxy.h>
#include <net/sock.h>
@@ -54,3 +55,59 @@ struct ctl_table net_table[] = {
#endif
{ 0 },
};
+
+static struct list_head *
+net_ctl_header_lookup(struct ctl_table_root *root, struct nsproxy *namespaces)
+{
+ return &namespaces->net_ns->sysctl_table_headers;
+}
+
+static struct ctl_table_root net_sysctl_root = {
+ .lookup = net_ctl_header_lookup,
+};
+
+static int sysctl_net_init(struct net *net)
+{
+ INIT_LIST_HEAD(&net->sysctl_table_headers);
+ return 0;
+}
+
+static void sysctl_net_exit(struct net *net)
+{
+ WARN_ON(!list_empty(&net->sysctl_table_headers));
+ return;
+}
+
+static struct pernet_operations sysctl_pernet_ops = {
+ .init = sysctl_net_init,
+ .exit = sysctl_net_exit,
+};
+
+static __init int sysctl_init(void)
+{
+ int ret;
+ ret = register_pernet_subsys(&sysctl_pernet_ops);
+ if (ret)
+ goto out;
+ register_sysctl_root(&net_sysctl_root);
+out:
+ return ret;
+}
+subsys_initcall(sysctl_init);
+
+struct ctl_table_header *register_net_sysctl_table(struct net *net,
+ const struct ctl_path *path, struct ctl_table *table)
+{
+ struct nsproxy namespaces;
+ namespaces = *current->nsproxy;
+ namespaces.net_ns = net;
+ return __register_sysctl_paths(&net_sysctl_root,
+ &namespaces, path, table);
+}
+EXPORT_SYMBOL_GPL(register_net_sysctl_table);
+
+void unregister_net_sysctl_table(struct ctl_table_header *header)
+{
+ return unregister_sysctl_table(header);
+}
+EXPORT_SYMBOL_GPL(unregister_net_sysctl_table);
--
1.5.3.rc6.17.g1911
This patch implements the basic infrastructure for per namespace sysctls.
A list of lists of sysctl headers is added, allowing each namespace to have
its own list of sysctl headers.
Each list of sysctl headers has a lookup function to find the first
sysctl header in the list, allowing the lists to have a per namespace
instance.
register_sysctl_root is added to tell sysctl.c about additional
lists of sysctl_headers. As all of the users are expected to be
in-kernel, no unregister function is provided.
sysctl_head_next is updated to walk through the list of lists.
__register_sysctl_paths is added to add a new sysctl table on
a non-default sysctl list.
The only intrusive part of this patch is propagating the information
needed to decide which list of sysctls to use for sysctl_check_table.
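The walk over the list of lists can be sketched in userspace roughly as
follows. This is a simplified model under stated assumptions: NULL-terminated
singly linked lists replace the kernel's circular list_head lists, there is no
locking or use counting, and the names (struct root, struct header,
nsproxy_model, head_next, count_visible) are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

struct root;
struct nsproxy_model;

/* One registered table; remembers which root it was registered on. */
struct header {
	const char *name;
	struct root *root;
	struct header *next;
};

/* Models the namespace data the lookup hook inspects. */
struct nsproxy_model {
	struct header *net_headers;	/* per-network-namespace list */
};

/* One list of headers, optionally chosen per namespace via lookup(). */
struct root {
	struct header *header_list;	/* default list */
	struct header *(*lookup)(struct root *root, struct nsproxy_model *ns);
	struct root *next;
};

static struct header *lookup_header_list(struct root *root,
					 struct nsproxy_model *ns)
{
	return root->lookup ? root->lookup(root, ns) : root->header_list;
}

/* Return the header after @prev (or the first one when @prev is NULL),
 * moving on to the next root whenever the current list is exhausted. */
static struct header *head_next(struct root *first, struct nsproxy_model *ns,
				struct header *prev)
{
	struct root *root = prev ? prev->root : first;
	struct header *h;

	if (!root)
		return NULL;
	h = prev ? prev->next : lookup_header_list(root, ns);
	while (!h && root->next) {
		root = root->next;
		h = lookup_header_list(root, ns);
	}
	return h;
}

/* Lookup hook for the network root: use the namespace's own list. */
static struct header *net_lookup(struct root *root, struct nsproxy_model *ns)
{
	(void)root;
	return ns->net_headers;
}

/* Count the headers visible through @ns across all roots. */
static size_t count_visible(struct root *first, struct nsproxy_model *ns)
{
	struct header *h;
	size_t n = 0;

	for (h = head_next(first, ns, NULL); h; h = head_next(first, ns, h))
		n++;
	return n;
}
```

Storing the root in each header is what lets the iterator resume from an
arbitrary header without extra cursor state, which matches the role of the new
root field in ctl_table_header.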
Signed-off-by: Eric W. Biederman <[email protected]>
---
include/linux/sysctl.h | 16 ++++++++-
kernel/sysctl.c | 93 ++++++++++++++++++++++++++++++++++++++++++------
kernel/sysctl_check.c | 25 +++++++------
3 files changed, 111 insertions(+), 23 deletions(-)
diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index 8b2e9e0..cd1da5c 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -951,7 +951,9 @@ enum
/* For the /proc/sys support */
struct ctl_table;
+struct nsproxy;
extern struct ctl_table_header *sysctl_head_next(struct ctl_table_header *prev);
+extern struct ctl_table_header *__sysctl_head_next(struct nsproxy *namespaces, struct ctl_table_header *prev);
extern void sysctl_head_finish(struct ctl_table_header *prev);
extern int sysctl_perm(struct ctl_table *table, int op);
@@ -1055,6 +1057,13 @@ struct ctl_table
void *extra2;
};
+struct ctl_table_root {
+ struct list_head root_list;
+ struct list_head header_list;
+ struct list_head *(*lookup)(struct ctl_table_root *root,
+ struct nsproxy *namespaces);
+};
+
/* struct ctl_table_header is used to maintain dynamic lists of
struct ctl_table trees. */
struct ctl_table_header
@@ -1064,6 +1073,7 @@ struct ctl_table_header
int used;
struct completion *unregistering;
struct ctl_table *ctl_table_arg;
+ struct ctl_table_root *root;
};
/* struct ctl_path describes where in the hierarchy a table is added */
@@ -1073,12 +1083,16 @@ struct ctl_path
int ctl_name;
};
+void register_sysctl_root(struct ctl_table_root *root);
+struct ctl_table_header *__register_sysctl_paths(
+ struct ctl_table_root *root, struct nsproxy *namespaces,
+ const struct ctl_path *path, struct ctl_table *table);
struct ctl_table_header *register_sysctl_table(struct ctl_table * table);
struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path,
struct ctl_table *table);
void unregister_sysctl_table(struct ctl_table_header * table);
-int sysctl_check_table(struct ctl_table *table);
+int sysctl_check_table(struct nsproxy *namespaces, struct ctl_table *table);
#else /* __KERNEL__ */
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index effae87..ad4b709 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -156,8 +156,16 @@ static int proc_dointvec_taint(struct ctl_table *table, int write, struct file *
#endif
static struct ctl_table root_table[];
-static struct ctl_table_header root_table_header =
- { root_table, LIST_HEAD_INIT(root_table_header.ctl_entry) };
+static struct ctl_table_root sysctl_table_root;
+static struct ctl_table_header root_table_header = {
+ .ctl_table = root_table,
+ .ctl_entry = LIST_HEAD_INIT(sysctl_table_root.header_list),
+ .root = &sysctl_table_root,
+};
+static struct ctl_table_root sysctl_table_root = {
+ .root_list = LIST_HEAD_INIT(sysctl_table_root.root_list),
+ .header_list = LIST_HEAD_INIT(root_table_header.ctl_entry),
+};
static struct ctl_table kern_table[];
static struct ctl_table vm_table[];
@@ -1300,12 +1308,27 @@ void sysctl_head_finish(struct ctl_table_header *head)
spin_unlock(&sysctl_lock);
}
-struct ctl_table_header *sysctl_head_next(struct ctl_table_header *prev)
+static struct list_head *
+lookup_header_list(struct ctl_table_root *root, struct nsproxy *namespaces)
{
+ struct list_head *header_list;
+ header_list = &root->header_list;
+ if (root->lookup)
+ header_list = root->lookup(root, namespaces);
+ return header_list;
+}
+
+struct ctl_table_header *__sysctl_head_next(struct nsproxy *namespaces,
+ struct ctl_table_header *prev)
+{
+ struct ctl_table_root *root;
+ struct list_head *header_list;
struct ctl_table_header *head;
struct list_head *tmp;
+
spin_lock(&sysctl_lock);
if (prev) {
+ head = prev;
tmp = &prev->ctl_entry;
unuse_table(prev);
goto next;
@@ -1319,14 +1342,38 @@ struct ctl_table_header *sysctl_head_next(struct ctl_table_header *prev)
spin_unlock(&sysctl_lock);
return head;
next:
+ root = head->root;
tmp = tmp->next;
- if (tmp == &root_table_header.ctl_entry)
- break;
+ header_list = lookup_header_list(root, namespaces);
+ if (tmp != header_list)
+ continue;
+
+ do {
+ root = list_entry(root->root_list.next,
+ struct ctl_table_root, root_list);
+ if (root == &sysctl_table_root)
+ goto out;
+ header_list = lookup_header_list(root, namespaces);
+ } while (list_empty(header_list));
+ tmp = header_list->next;
}
+out:
spin_unlock(&sysctl_lock);
return NULL;
}
+struct ctl_table_header *sysctl_head_next(struct ctl_table_header *prev)
+{
+ return __sysctl_head_next(current->nsproxy, prev);
+}
+
+void register_sysctl_root(struct ctl_table_root *root)
+{
+ spin_lock(&sysctl_lock);
+ list_add_tail(&root->root_list, &sysctl_table_root.root_list);
+ spin_unlock(&sysctl_lock);
+}
+
#ifdef CONFIG_SYSCTL_SYSCALL
int do_sysctl(int __user *name, int nlen, void __user *oldval, size_t __user *oldlenp,
void __user *newval, size_t newlen)
@@ -1483,14 +1530,16 @@ static __init int sysctl_init(void)
{
int err;
sysctl_set_parent(NULL, root_table);
- err = sysctl_check_table(root_table);
+ err = sysctl_check_table(current->nsproxy, root_table);
return 0;
}
core_initcall(sysctl_init);
/**
- * register_sysctl_paths - register a sysctl hierarchy
+ * __register_sysctl_paths - register a sysctl hierarchy
+ * @root: List of sysctl headers to register on
+ * @namespaces: Data to compute which lists of sysctl entries are visible
* @path: The path to the directory the sysctl table is in.
* @table: the top-level table structure
*
@@ -1558,9 +1607,12 @@ core_initcall(sysctl_init);
* This routine returns %NULL on a failure to register, and a pointer
* to the table header on success.
*/
-struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path,
- struct ctl_table *table)
+struct ctl_table_header *__register_sysctl_paths(
+ struct ctl_table_root *root,
+ struct nsproxy *namespaces,
+ const struct ctl_path *path, struct ctl_table *table)
{
+ struct list_head *header_list;
struct ctl_table_header *header;
struct ctl_table *new, **prevp;
unsigned int n, npath;
@@ -1603,19 +1655,38 @@ struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path,
INIT_LIST_HEAD(&header->ctl_entry);
header->used = 0;
header->unregistering = NULL;
+ header->root = root;
sysctl_set_parent(NULL, header->ctl_table);
- if (sysctl_check_table(header->ctl_table)) {
+ if (sysctl_check_table(namespaces, header->ctl_table)) {
kfree(header);
return NULL;
}
spin_lock(&sysctl_lock);
- list_add_tail(&header->ctl_entry, &root_table_header.ctl_entry);
+ header_list = lookup_header_list(root, namespaces);
+ list_add_tail(&header->ctl_entry, header_list);
spin_unlock(&sysctl_lock);
return header;
}
/**
+ * register_sysctl_paths - register a sysctl table hierarchy
+ * @path: The path to the directory the sysctl table is in.
+ * @table: the top-level table structure
+ *
+ * Register a sysctl table hierarchy. @table should be a filled in ctl_table
+ * array. A completely 0 filled entry terminates the table.
+ *
+ * See __register_sysctl_paths for more details.
+ */
+struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path,
+ struct ctl_table *table)
+{
+ return __register_sysctl_paths(&sysctl_table_root, current->nsproxy,
+ path, table);
+}
+
+/**
* register_sysctl_table - register a sysctl table hierarchy
* @table: the top-level table structure
*
diff --git a/kernel/sysctl_check.c b/kernel/sysctl_check.c
index fdfca0d..2544852 100644
--- a/kernel/sysctl_check.c
+++ b/kernel/sysctl_check.c
@@ -1352,7 +1352,8 @@ static void sysctl_repair_table(struct ctl_table *table)
}
}
-static struct ctl_table *sysctl_check_lookup(struct ctl_table *table)
+static struct ctl_table *sysctl_check_lookup(struct nsproxy *namespaces,
+ struct ctl_table *table)
{
struct ctl_table_header *head;
struct ctl_table *ref, *test;
@@ -1360,8 +1361,8 @@ static struct ctl_table *sysctl_check_lookup(struct ctl_table *table)
depth = sysctl_depth(table);
- for (head = sysctl_head_next(NULL); head;
- head = sysctl_head_next(head)) {
+ for (head = __sysctl_head_next(namespaces, NULL); head;
+ head = __sysctl_head_next(namespaces, head)) {
cur_depth = depth;
ref = head->ctl_table;
repeat:
@@ -1406,13 +1407,14 @@ static void set_fail(const char **fail, struct ctl_table *table, const char *str
*fail = str;
}
-static int sysctl_check_dir(struct ctl_table *table)
+static int sysctl_check_dir(struct nsproxy *namespaces,
+ struct ctl_table *table)
{
struct ctl_table *ref;
int error;
error = 0;
- ref = sysctl_check_lookup(table);
+ ref = sysctl_check_lookup(namespaces, table);
if (ref) {
int match = 0;
if ((!table->procname && !ref->procname) ||
@@ -1437,11 +1439,12 @@ static int sysctl_check_dir(struct ctl_table *table)
return error;
}
-static void sysctl_check_leaf(struct ctl_table *table, const char **fail)
+static void sysctl_check_leaf(struct nsproxy *namespaces,
+ struct ctl_table *table, const char **fail)
{
struct ctl_table *ref;
- ref = sysctl_check_lookup(table);
+ ref = sysctl_check_lookup(namespaces, table);
if (ref && (ref != table))
set_fail(fail, table, "Sysctl already exists");
}
@@ -1465,7 +1468,7 @@ static void sysctl_check_bin_path(struct ctl_table *table, const char **fail)
}
}
-int sysctl_check_table(struct ctl_table *table)
+int sysctl_check_table(struct nsproxy *namespaces, struct ctl_table *table)
{
int error = 0;
for (; table->ctl_name || table->procname; table++) {
@@ -1495,7 +1498,7 @@ int sysctl_check_table(struct ctl_table *table)
set_fail(&fail, table, "Directory with extra1");
if (table->extra2)
set_fail(&fail, table, "Directory with extra2");
- if (sysctl_check_dir(table))
+ if (sysctl_check_dir(namespaces, table))
set_fail(&fail, table, "Inconsistent directory names");
} else {
if ((table->strategy == sysctl_data) ||
@@ -1544,7 +1547,7 @@ int sysctl_check_table(struct ctl_table *table)
if (!table->procname && table->proc_handler)
set_fail(&fail, table, "proc_handler without procname");
#endif
- sysctl_check_leaf(table, &fail);
+ sysctl_check_leaf(namespaces, table, &fail);
}
sysctl_check_bin_path(table, &fail);
if (fail) {
@@ -1552,7 +1555,7 @@ int sysctl_check_table(struct ctl_table *table)
error = -EINVAL;
}
if (table->child)
- error |= sysctl_check_table(table->child);
+ error |= sysctl_check_table(namespaces, table->child);
}
return error;
}
--
1.5.3.rc6.17.g1911
On Thu, Nov 29, 2007 at 10:40:24AM -0700, Eric W. Biederman wrote:
>
> Herbert we need this infrastructure most in net-2.6.25 (as not having
> it is a current bottleneck to further development of the network
> namespace) so these patches are against net-2.6.25.
I've applied them all to net-2.6.25 with Andrew's fixes included.
Thanks Eric.
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Herbert Xu <[email protected]> writes:
> On Thu, Nov 29, 2007 at 10:40:24AM -0700, Eric W. Biederman wrote:
>>
>> Herbert we need this infrastructure most in net-2.6.25 (as not having
>> it is a current bottleneck to further development of the network
>> namespace) so these patches are against net-2.6.25.
>
> I've applied them all to net-2.6.25 with Andrew's fixes included.
> Thanks Eric.
Welcome, and thanks.
I will see about taking advantage of this shortly.
Eric
Quoting Eric W. Biederman ([email protected]):
>
> The user interface is: register_net_sysctl_table and
> unregister_net_sysctl_table. Very much like the current
> interface except there is a network namespace parameter.
>
> With this any sysctl registered with register_net_sysctl_table
> will only show up to tasks in the same network namespace.
>
> All other sysctls continue to be globally visible.
>
> Signed-off-by: Eric W. Biederman <[email protected]>
> ---
> include/net/net_namespace.h | 9 +++++++
> net/sysctl_net.c | 57 +++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 66 insertions(+), 0 deletions(-)
>
> diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
> index 4d0d634..235214c 100644
> --- a/include/net/net_namespace.h
> +++ b/include/net/net_namespace.h
> @@ -25,6 +25,8 @@ struct net {
> struct proc_dir_entry *proc_net_stat;
> struct proc_dir_entry *proc_net_root;
>
> + struct list_head sysctl_table_headers;
> +
> struct net_device *loopback_dev; /* The loopback */
>
> struct list_head dev_base_head;
> @@ -144,4 +146,11 @@ extern void unregister_pernet_subsys(struct pernet_operations *);
> extern int register_pernet_device(struct pernet_operations *);
> extern void unregister_pernet_device(struct pernet_operations *);
>
> +struct ctl_path;
> +struct ctl_table;
> +struct ctl_table_header;
> +extern struct ctl_table_header *register_net_sysctl_table(struct net *net,
> + const struct ctl_path *path, struct ctl_table *table);
> +extern void unregister_net_sysctl_table(struct ctl_table_header *header);
> +
> #endif /* __NET_NET_NAMESPACE_H */
> diff --git a/net/sysctl_net.c b/net/sysctl_net.c
> index cd4eafb..c50c793 100644
> --- a/net/sysctl_net.c
> +++ b/net/sysctl_net.c
> @@ -14,6 +14,7 @@
>
> #include <linux/mm.h>
> #include <linux/sysctl.h>
> +#include <linux/nsproxy.h>
>
> #include <net/sock.h>
>
> @@ -54,3 +55,59 @@ struct ctl_table net_table[] = {
> #endif
> { 0 },
> };
> +
> +static struct list_head *
> +net_ctl_header_lookup(struct ctl_table_root *root, struct nsproxy *namespaces)
> +{
> + return &namespaces->net_ns->sysctl_table_headers;
> +}
> +
> +static struct ctl_table_root net_sysctl_root = {
> + .lookup = net_ctl_header_lookup,
> +};
> +
> +static int sysctl_net_init(struct net *net)
> +{
> + INIT_LIST_HEAD(&net->sysctl_table_headers);
> + return 0;
> +}
> +
> +static void sysctl_net_exit(struct net *net)
> +{
> + WARN_ON(!list_empty(&net->sysctl_table_headers));
> + return;
> +}
> +
> +static struct pernet_operations sysctl_pernet_ops = {
> + .init = sysctl_net_init,
> + .exit = sysctl_net_exit,
> +};
> +
> +static __init int sysctl_init(void)
> +{
> + int ret;
> + ret = register_pernet_subsys(&sysctl_pernet_ops);
> + if (ret)
> + goto out;
> + register_sysctl_root(&net_sysctl_root);
> +out:
> + return ret;
> +}
> +subsys_initcall(sysctl_init);
> +
> +struct ctl_table_header *register_net_sysctl_table(struct net *net,
> + const struct ctl_path *path, struct ctl_table *table)
> +{
> + struct nsproxy namespaces;
> + namespaces = *current->nsproxy;
> + namespaces.net_ns = net;
> + return __register_sysctl_paths(&net_sysctl_root,
> + &namespaces, path, table);
Hey Eric,
the patches look nice.
The hand-forcing of the passed-in net_ns into a copy of current->nsproxy
does make it seem like nsproxy may not be the best choice of what to
pass in. Doesn't only net_sysctl_root->lookup() look at the argument?
But I assume you don't want to be more general than sending in a
nsproxy so as to dissuade abuse of this interface for needlessly complex
sysctl interfaces?
(Well, I expect that'll become clear once the patches using this
come out.)
Are you planning to use this infrastructure for the uts and ipc
sysctls as well?
thanks,
-serge
> +}
> +EXPORT_SYMBOL_GPL(register_net_sysctl_table);
> +
> +void unregister_net_sysctl_table(struct ctl_table_header *header)
> +{
> + return unregister_sysctl_table(header);
> +}
> +EXPORT_SYMBOL_GPL(unregister_net_sysctl_table);
> --
> 1.5.3.rc6.17.g1911
[snip]
>> + &namespaces, path, table);
>
> Hey Eric,
>
> the patches look nice.
Agree ;)
> The hand-forcing of the passed-in net_ns into a copy of current->nsproxy
> does make it seem like nsproxy may not be the best choice of what to
> pass in. Doesn't only net_sysctl_root->lookup() look at the argument?
>
> But I assume you don't want to be more general than sending in a
> nsproxy so as to dissuade abuse of this interface for needlessly complex
> sysctl interfaces?
>
> (Well I expect that'll become clear once the patches using this
> come out.)
>
> Are you planning to use this infrastructure for the uts and ipc
> sysctls as well?
I have sent some patches concerning uts and ipc already.
I'd appreciate any feedback on them :)
> thanks,
> -serge
Thanks,
Pavel
"Serge E. Hallyn" <[email protected]> writes:
>
> Hey Eric,
>
> the patches look nice.
>
> The hand-forcing of the passed-in net_ns into a copy of current->nsproxy
> does make it seem like nsproxy may not be the best choice of what to
> pass in. Doesn't only net_sysctl_root->lookup() look at the argument?
Yes. Although I call it from __register_sysctl_paths.
> But I assume you don't want to be more general than sending in a
> nsproxy so as to dissuade abuse of this interface for needlessly complex
> sysctl interfaces?
A bit of that. I would love to pass in a task_struct so you can use
anything from a task. The trouble is I don't have any task_structs or
nsproxys with the proper value at the point where I am first setting
this up. Further I have to have the full sysctl lookup working or I
could not call sysctl_check.
> (Well I expect that'll become clear once the patches using this
> come out.)
>
> Are you planning to use this infrastructure for the uts and ipc
> sysctls as well?
Yes. Where it comes in especially useful is that I can move /proc/sys
to /proc/sys/<tgid>/task/<pid>/sys and get a particular process's
view of sysctl.
We also get a little more reuse of common functions.
Otherwise Pavel does have a point that using this for uts and ipc
is not a savings in terms of lines of code.
After having seen Pavel's changes I am asking myself if there is a sane
way to remove the ctl_name argument from the ctl_path.
Anyway, where I am with the nsproxy question is that I don't
see anything easily better. What I have works and gets the job
done, and doesn't have any module unload races or holes where a sloppy
programmer can mess up the sysctl tree. We needed a solution.
Trying any harder to find something better would take ages. So
I figured this implementation was good enough.
Eric
Quoting Eric W. Biederman ([email protected]):
> "Serge E. Hallyn" <[email protected]> writes:
>
> >
> > Hey Eric,
> >
> > the patches look nice.
> >
> > The hand-forcing of the passed-in net_ns into a copy of current->nsproxy
> > does make it seem like nsproxy may not be the best choice of what to
> > pass in. Doesn't only net_sysctl_root->lookup() look at the argument?
>
> Yes. Although I call it from __register_sysctl_paths.
>
> > But I assume you don't want to be more general than sending in a
> > nsproxy so as to dissuade abuse of this interface for needlessly complex
> > sysctl interfaces?
>
> A bit of that. I would love to pass in a task_struct so you can use
> anything from a task. The trouble is I don't have any task_structs or
> nsproxys with the proper value at the point where I am first setting
> this up. Further I have to have the full sysctl lookup working or I
> could not call sysctl_check.
>
> > (Well I expect that'll become clear once the patches using this
> > come out.)
> >
> > Are you planning to use this infrastructure for the uts and ipc
> > sysctls as well?
>
> Yes. Where it comes in especially useful is that I can move /proc/sys
> to /proc/sys/<tgid>/task/<pid>/sys and get a particular process's
> view of sysctl.
>
> We also get a little more reuse of common functions.
>
> Otherwise Pavel does have a point that using this for uts and ipc
> is not a savings in terms of lines of code.
>
> After having seen Pavel's changes I am asking myself if there is a sane
> way to remove the ctl_name argument from the ctl_path.
>
> Anyway, where I am with the nsproxy question is that I don't
> see anything easily better. What I have works and gets the job
> done, and doesn't have any module unload races or holes where a sloppy
> programmer can mess up the sysctl tree. We needed a solution.
> Trying any harder to find something better would take ages. So
> I figured this implementation was good enough.
I agree. So it's already in -mm but still
Acked-by: Serge Hallyn <[email protected]>
thanks,
-serge