From: Joel Granados <[email protected]>
What?
Remove the loop stopping criteria check for ->procname == NULL, the
array size calculation based on sentinels and the superfluous sentinels
created within the sysctl infrastructure. None are needed as they are
now solely based on ctl_table ARRAY_SIZE. Finally, sentinels that have
been added in recent releases (not present for the original patchsets)
were removed.
Why?
By removing the sysctl sentinel elements we avoid kernel bloat as
ctl_table arrays get moved out of kernel/sysctl.c into their own
respective subsystems. This move was started long ago to avoid merge
conflicts; the sentinel removal bit is to avoid bloating the kernel by
one element as arrays moved out. It includes work in /arch [1], /dirver
[2], fs/ [3], kernel [4], net/ [5], mm/ [6], security/ [6], io_uring [6]
and other misc directories [6]. It will reduce the overall build time
size of the kernel and run time memory bloat by about ~64 bytes per
declared ctl_table array (more info here [0]).
Testing:
* Ran sysctl selftests (./tools/testing/selftests/sysctl/sysctl.sh)
* Ran this through 0-day with no errors or warnings
Savings in vmlinux:
A total of 64 bytes per sentinel is saved after removal. Here is the
aggregated savings for all the removal patchsets ([1,2,3,4,5,6]) for
the x86_64 arch (actual savings will depend on kernel conf):
|------|---------|-----------|-------|-----------|--------|---------|----------------|
|dir | arch[1] | driver[2] | fs[3] | kernel[4] | net[5] | misc[6] | Total(approx.) |
|------|---------|-----------|-------|-----------|--------|---------|----------------|
|Bytes | 192 | 2432 | 1920 | 1984 | 3976 | 963 | 11467 |
|------|---------|-----------|-------|-----------|--------|---------|----------------|
Savings in allocated memory:
The estimated savings during boot for config [3] are 6272 bytes. See
[7] for how to measure it.
Comments/feedback greatly appreciated
Best
Joel
[0] Links Related to the ctl_table sentinel removal:
* Good summaries from Luis:
https://lore.kernel.org/all/ZO5Yx5JFogGi%[email protected]/
https://lore.kernel.org/all/[email protected]/
* Patches adjusting sysctl register calls:
https://lore.kernel.org/all/[email protected]/
https://lore.kernel.org/all/[email protected]/
* Discussions about expectations and approach
https://lore.kernel.org/all/[email protected]
https://lore.kernel.org/all/[email protected]
[1] https://lore.kernel.org/20231002-jag-sysctl_remove_empty_elem_arch-v3-0-606da2840a7a@samsung.com
[2] https://lore.kernel.org/20231002-jag-sysctl_remove_empty_elem_drivers-v2-0-02dd0d46f71e@samsung.com
[3] https://lore.kernel.org/20231121-jag-sysctl_remove_empty_elem_fs-v2-0-39eab723a034@samsung.com
[4] https://lore.kernel.org/20240328-jag-sysctl_remove_empty_elem_kernel-v3-0-285d273912fe@samsung.com
[5] https://lore.kernel.org/[email protected]
[6] https://lore.kernel.org/[email protected]
[7]
To measure the in memory savings apply this on top of this patchset.
"
diff --git i/fs/proc/proc_sysctl.c w/fs/proc/proc_sysctl.c
index a6aeaa928dd2..6ca5341bcddf 100644
--- i/fs/proc/proc_sysctl.c
+++ w/fs/proc/proc_sysctl.c
@@ -963,6 +963,7 @@ static struct ctl_dir *new_dir(struct ctl_table_set *set,
table[0].procname = new_name;
table[0].mode = S_IFDIR|S_IRUGO|S_IXUGO;
init_header(&new->header, set->dir.header.root, set, node, table, 1);
+ printk("%ld sysctl saved mem kzalloc\n", sizeof(struct ctl_table));
return new;
}
@@ -1186,6 +1187,7 @@ static struct ctl_table_header *new_links(struct ctl_dir *dir, struct ctl_table_
link_name += len;
link++;
}
+ printk("%ld sysctl saved mem kzalloc\n", sizeof(struct ctl_table));
init_header(links, dir->header.root, dir->header.set, node, link_table,
head->ctl_table_size);
links->nreg = head->ctl_table_size;
"
and then run the following bash script in the kernel:
accum=0
for n in $(dmesg | grep kzalloc | awk '{print $3}') ; do
accum=$(calc "$accum + $n")
done
echo $accum
Signed-off-by: Joel Granados <[email protected]>
---
Joel Granados (8):
locking: Remove superfluous sentinel element from kern_lockdep_table
mm profiling: Remove superfluous sentinel element from ctl_table
sysctl: Remove check for sentinel element in ctl_table arrays
sysctl: Replace nr_entries with ctl_table_size in new_links
sysctl: Remove superfluous empty allocations from sysctl internals
sysctl: Remove "child" sysctl code comments
sysctl: Remove ctl_table sentinel code comments
sysctl: Warn on an empty procname element
fs/proc/proc_sysctl.c | 50 +++++++++++++++++++++---------------------------
kernel/locking/lockdep.c | 1 -
lib/alloc_tag.c | 1 -
net/sysctl_net.c | 11 ++---------
4 files changed, 24 insertions(+), 39 deletions(-)
---
base-commit: 1613e604df0cd359cf2a7fbd9be7a0bcfacfabd0
change-id: 20240603-jag-sysctl_remset-4afb8c723003
Best regards,
--
Joel Granados <[email protected]>
From: Joel Granados <[email protected]>
Now that the sentinels have been removed from ctl_table arrays, there is
no need to artificially append empty ctl_table elements at ctl_table
registration. Remove superfluous empty allocation from new_dir and
new_links.
Signed-off-by: Joel Granados <[email protected]>
---
fs/proc/proc_sysctl.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 1babb54347a4..29d40f0ff3ff 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -951,14 +951,14 @@ static struct ctl_dir *new_dir(struct ctl_table_set *set,
char *new_name;
new = kzalloc(sizeof(*new) + sizeof(struct ctl_node) +
- sizeof(struct ctl_table)*2 + namelen + 1,
+ sizeof(struct ctl_table) + namelen + 1,
GFP_KERNEL);
if (!new)
return NULL;
node = (struct ctl_node *)(new + 1);
table = (struct ctl_table *)(node + 1);
- new_name = (char *)(table + 2);
+ new_name = (char *)(table + 1);
memcpy(new_name, name, namelen);
table[0].procname = new_name;
table[0].mode = S_IFDIR|S_IRUGO|S_IXUGO;
@@ -1163,7 +1163,7 @@ static struct ctl_table_header *new_links(struct ctl_dir *dir, struct ctl_table_
links = kzalloc(sizeof(struct ctl_table_header) +
sizeof(struct ctl_node)*head->ctl_table_size +
- sizeof(struct ctl_table)*(head->ctl_table_size + 1) +
+ sizeof(struct ctl_table)*head->ctl_table_size +
name_bytes,
GFP_KERNEL);
@@ -1172,7 +1172,7 @@ static struct ctl_table_header *new_links(struct ctl_dir *dir, struct ctl_table_
node = (struct ctl_node *)(links + 1);
link_table = (struct ctl_table *)(node + head->ctl_table_size);
- link_name = (char *)&link_table[head->ctl_table_size + 1];
+ link_name = (char *)(link_table + head->ctl_table_size);
link = link_table;
list_for_each_table_entry(entry, head) {
--
2.43.0
From: Joel Granados <[email protected]>
The number of ctl_table entries (nr_entries) calculation was previously
based on the ctl_table_size and the sentinel element. Since the
sentinels have been removed, we remove the calculation and just use the
ctl_table_size from the ctl_table_header.
Signed-off-by: Joel Granados <[email protected]>
---
fs/proc/proc_sysctl.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index d4ba7ad9dbe0..1babb54347a4 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -1154,18 +1154,16 @@ static struct ctl_table_header *new_links(struct ctl_dir *dir, struct ctl_table_
struct ctl_table_header *links;
struct ctl_node *node;
char *link_name;
- int nr_entries, name_bytes;
+ int name_bytes;
name_bytes = 0;
- nr_entries = 0;
list_for_each_table_entry(entry, head) {
- nr_entries++;
name_bytes += strlen(entry->procname) + 1;
}
links = kzalloc(sizeof(struct ctl_table_header) +
- sizeof(struct ctl_node)*nr_entries +
- sizeof(struct ctl_table)*(nr_entries + 1) +
+ sizeof(struct ctl_node)*head->ctl_table_size +
+ sizeof(struct ctl_table)*(head->ctl_table_size + 1) +
name_bytes,
GFP_KERNEL);
@@ -1173,8 +1171,8 @@ static struct ctl_table_header *new_links(struct ctl_dir *dir, struct ctl_table_
return NULL;
node = (struct ctl_node *)(links + 1);
- link_table = (struct ctl_table *)(node + nr_entries);
- link_name = (char *)&link_table[nr_entries + 1];
+ link_table = (struct ctl_table *)(node + head->ctl_table_size);
+ link_name = (char *)&link_table[head->ctl_table_size + 1];
link = link_table;
list_for_each_table_entry(entry, head) {
@@ -1188,7 +1186,7 @@ static struct ctl_table_header *new_links(struct ctl_dir *dir, struct ctl_table_
}
init_header(links, dir->header.root, dir->header.set, node, link_table,
head->ctl_table_size);
- links->nreg = nr_entries;
+ links->nreg = head->ctl_table_size;
return links;
}
--
2.43.0
From: Joel Granados <[email protected]>
Remove the mention of a "zero terminated entry" from the
__register_sysctl_table function doc.
Signed-off-by: Joel Granados <[email protected]>
---
fs/proc/proc_sysctl.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 2ef23d2c3496..806700b70dea 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -1306,7 +1306,7 @@ static struct ctl_dir *sysctl_mkdir_p(struct ctl_dir *dir, const char *path)
* @table_size : The number of elements in table
*
* Register a sysctl table hierarchy. @table should be a filled in ctl_table
- * array. A completely 0 filled entry terminates the table.
+ * array.
*
* The members of the &struct ctl_table structure are used as follows:
* procname - the name of the sysctl file under /proc/sys. Set to %NULL to not
--
2.43.0
From: Joel Granados <[email protected]>
Use ARRAY_SIZE exclusively by removing the check to ->procname in the
stopping criteria of the loops traversing ctl_table arrays. This commit
finalizes the removal of the sentinel elements at the end of ctl_table
arrays which reduces the build time size and run time memory bloat by
~64 bytes per sentinel (further information Link :
https://lore.kernel.org/all/ZO5Yx5JFogGi%[email protected]/)
Remove the entry->procname evaluation from the for loop stopping
criteria in sysctl and sysctl_net.
Signed-off-by: Joel Granados <[email protected]>
---
fs/proc/proc_sysctl.c | 2 +-
net/sysctl_net.c | 11 ++---------
2 files changed, 3 insertions(+), 10 deletions(-)
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index b1c2c0b82116..d4ba7ad9dbe0 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -21,7 +21,7 @@
#define list_for_each_table_entry(entry, header) \
entry = header->ctl_table; \
- for (size_t i = 0 ; i < header->ctl_table_size && entry->procname; ++i, entry++)
+ for (size_t i = 0 ; i < header->ctl_table_size; ++i, entry++)
static const struct dentry_operations proc_sys_dentry_operations;
static const struct file_operations proc_sys_file_operations;
diff --git a/net/sysctl_net.c b/net/sysctl_net.c
index f5017012a049..19e8048241ba 100644
--- a/net/sysctl_net.c
+++ b/net/sysctl_net.c
@@ -127,7 +127,7 @@ static void ensure_safe_net_sysctl(struct net *net, const char *path,
pr_debug("Registering net sysctl (net %p): %s\n", net, path);
ent = table;
- for (size_t i = 0; i < table_size && ent->procname; ent++, i++) {
+ for (size_t i = 0; i < table_size; ent++, i++) {
unsigned long addr;
const char *where;
@@ -165,17 +165,10 @@ struct ctl_table_header *register_net_sysctl_sz(struct net *net,
struct ctl_table *table,
size_t table_size)
{
- int count;
- struct ctl_table *entry;
-
if (!net_eq(net, &init_net))
ensure_safe_net_sysctl(net, path, table, table_size);
- entry = table;
- for (count = 0 ; count < table_size && entry->procname; entry++, count++)
- ;
-
- return __register_sysctl_table(&net->sysctls, path, table, count);
+ return __register_sysctl_table(&net->sysctls, path, table, table_size);
}
EXPORT_SYMBOL_GPL(register_net_sysctl_sz);
--
2.43.0
From: Joel Granados <[email protected]>
Add a pr_err warning in case a ctl_table is registered with a sentinel
element containing a NULL procname.
Signed-off-by: Joel Granados <[email protected]>
---
fs/proc/proc_sysctl.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 806700b70dea..f65098de5fcb 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -1119,6 +1119,8 @@ static int sysctl_check_table(const char *path, struct ctl_table_header *header)
struct ctl_table *entry;
int err = 0;
list_for_each_table_entry(entry, header) {
+ if (!entry->procname)
+ err |= sysctl_err(path, entry, "procname is null");
if ((entry->proc_handler == proc_dostring) ||
(entry->proc_handler == proc_dobool) ||
(entry->proc_handler == proc_dointvec) ||
--
2.43.0
On Tue, Jun 04, 2024 at 08:29:21AM +0200, Joel Granados via B4 Relay wrote:
> From: Joel Granados <[email protected]>
> --- a/net/sysctl_net.c
> +++ b/net/sysctl_net.c
> @@ -127,7 +127,7 @@ static void ensure_safe_net_sysctl(struct net *net, const char *path,
>
> pr_debug("Registering net sysctl (net %p): %s\n", net, path);
> ent = table;
> - for (size_t i = 0; i < table_size && ent->procname; ent++, i++) {
> + for (size_t i = 0; i < table_size; ent++, i++) {
> unsigned long addr;
> const char *where;
>
> @@ -165,17 +165,10 @@ struct ctl_table_header *register_net_sysctl_sz(struct net *net,
> struct ctl_table *table,
> size_t table_size)
> {
> - int count;
> - struct ctl_table *entry;
> -
> if (!net_eq(net, &init_net))
> ensure_safe_net_sysctl(net, path, table, table_size);
>
> - entry = table;
> - for (count = 0 ; count < table_size && entry->procname; entry++, count++)
> - ;
> -
> - return __register_sysctl_table(&net->sysctls, path, table, count);
> + return __register_sysctl_table(&net->sysctls, path, table, table_size);
> }
> EXPORT_SYMBOL_GPL(register_net_sysctl_sz);
>
Given that this is a very small network related patch, I'm queueing this
in through the sysctl-next branch. Please let me know if you would
rather take it through the networking workflow.
Best
--
Joel Granados
On Tue, Jun 04, 2024 at 08:29:26AM +0200, Joel Granados via B4 Relay wrote:
> From: Joel Granados <[email protected]>
>
> Add a pr_err warning in case a ctl_table is registered with a sentinel
> element containing a NULL procname.
>
> Signed-off-by: Joel Granados <[email protected]>
> ---
> fs/proc/proc_sysctl.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
> index 806700b70dea..f65098de5fcb 100644
> --- a/fs/proc/proc_sysctl.c
> +++ b/fs/proc/proc_sysctl.c
> @@ -1119,6 +1119,8 @@ static int sysctl_check_table(const char *path, struct ctl_table_header *header)
> struct ctl_table *entry;
> int err = 0;
> list_for_each_table_entry(entry, header) {
> + if (!entry->procname)
> + err |= sysctl_err(path, entry, "procname is null");
> if ((entry->proc_handler == proc_dostring) ||
> (entry->proc_handler == proc_dobool) ||
> (entry->proc_handler == proc_dointvec) ||
>
> --
> 2.43.0
>
>
To add to this check, I sent out a static analysis check to smatch in
such a way that a warning will be printed out if there is a ctl_table
element with a procname or prog_handler that are NULL. You can see it
here https://lore.kernel.org/all/[email protected]/
Best
--
Joel Granados