2020-05-16 08:57:21

by Xiaoming Ni

[permalink] [raw]
Subject: [PATCH v2 0/4] cleaning up the sysctls table (hung_task watchdog)

Kernel/sysctl.c contains more than 190 interface files, and there are a
large number of config macro controls. When modifying the sysctl
interface directly in kernel/sysctl.c, conflicts are very easy to occur.
E.g: https://lkml.org/lkml/2020/5/10/413.

Use register_sysctl() to register the sysctl interface to avoid
merge conflicts when different features modify sysctl.c at the same time.

So consider cleaning up the sysctls table, details are in:
https://kernelnewbies.org/KernelProjects/proc
https://lkml.org/lkml/2020/5/13/990

The current patch set extracts register_sysctl_init and some sysctl_vals
variables, and clears the interface of hung_task and watchdog in sysctl.c.

changes in v2:
1. Adjusted the order of patches, first do public function
extraction, then do feature code movement
2. Move hung_task sysctl to hung_task.c instead of adding new file
3. Extract multiple common variables instead of only neg_one, and keep
the order of member values in sysctl_vals
4. Add const modification to the variable sixty in watchdog sysctl

V1: https://lkml.org/lkml/2020/5/15/17

Xiaoming Ni (4):
sysctl: Add register_sysctl_init() interface
sysctl: Move some boundary constants form sysctl.c to sysctl_vals
hung_task: Move hung_task sysctl interface to hung_task.c
watchdog: move watchdog sysctl interface to watchdog.c

fs/proc/proc_sysctl.c | 2 +-
include/linux/sched/sysctl.h | 8 +-
include/linux/sysctl.h | 13 ++-
kernel/hung_task.c | 63 +++++++++++++-
kernel/sysctl.c | 202 ++++++++-----------------------------------
kernel/watchdog.c | 101 ++++++++++++++++++++++
6 files changed, 210 insertions(+), 179 deletions(-)

--
1.8.5.6


2020-05-16 08:57:51

by Xiaoming Ni

[permalink] [raw]
Subject: [PATCH v2 1/4] sysctl: Add register_sysctl_init() interface

In order to eliminate the duplicate code for registering the sysctl
interface during the initialization of each feature, add the
register_sysctl_init() interface

Signed-off-by: Xiaoming Ni <[email protected]>
---
include/linux/sysctl.h | 2 ++
kernel/sysctl.c | 19 +++++++++++++++++++
2 files changed, 21 insertions(+)

diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index 02fa844..43f8ef9 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -206,6 +206,8 @@ struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path,
void unregister_sysctl_table(struct ctl_table_header * table);

extern int sysctl_init(void);
+extern void register_sysctl_init(const char *path, struct ctl_table *table,
+ const char *table_name);

extern struct ctl_table sysctl_mount_point[];

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index b9ff323..20c31f5 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1970,6 +1970,25 @@ int __init sysctl_init(void)
return 0;
}

+/*
+ * The sysctl interface is used to modify the interface value,
+ * but the feature interface has default values. Even if register_sysctl fails,
+ * the feature body function can also run. At the same time, malloc small
+ * fragment of memory during the system initialization phase, almost does
+ * not fail. Therefore, the function return is designed as void
+ */
+void __init register_sysctl_init(const char *path, struct ctl_table *table,
+ const char *table_name)
+{
+ struct ctl_table_header *hdr = register_sysctl(path, table);
+
+ if (unlikely(!hdr)) {
+ pr_err("failed when register_sysctl %s to %s\n", table_name, path);
+ return;
+ }
+ kmemleak_not_leak(hdr);
+}
+
#endif /* CONFIG_SYSCTL */

/*
--
1.8.5.6

2020-05-16 08:58:16

by Xiaoming Ni

[permalink] [raw]
Subject: [PATCH v2 4/4] watchdog: move watchdog sysctl interface to watchdog.c

Move watchdog syscl interface to watchdog.c.
Use register_sysctl() to register the sysctl interface to avoid
merge conflicts when different features modify sysctl.c at the same time.

Signed-off-by: Xiaoming Ni <[email protected]>
---
kernel/sysctl.c | 96 ---------------------------------------------------
kernel/watchdog.c | 101 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 101 insertions(+), 96 deletions(-)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 49e9541..efe6172 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -97,9 +97,6 @@
#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
#include <linux/stackleak.h>
#endif
-#ifdef CONFIG_LOCKUP_DETECTOR
-#include <linux/nmi.h>
-#endif

#if defined(CONFIG_SYSCTL)

@@ -120,9 +117,6 @@
#endif

/* Constants used for minimum and maximum */
-#ifdef CONFIG_LOCKUP_DETECTOR
-static int sixty = 60;
-#endif

static unsigned long zero_ul;
static unsigned long one_ul = 1;
@@ -883,96 +877,6 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
.mode = 0444,
.proc_handler = proc_dointvec,
},
-#if defined(CONFIG_LOCKUP_DETECTOR)
- {
- .procname = "watchdog",
- .data = &watchdog_user_enabled,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_watchdog,
- .extra1 = SYSCTL_ZERO,
- .extra2 = SYSCTL_ONE,
- },
- {
- .procname = "watchdog_thresh",
- .data = &watchdog_thresh,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_watchdog_thresh,
- .extra1 = SYSCTL_ZERO,
- .extra2 = &sixty,
- },
- {
- .procname = "nmi_watchdog",
- .data = &nmi_watchdog_user_enabled,
- .maxlen = sizeof(int),
- .mode = NMI_WATCHDOG_SYSCTL_PERM,
- .proc_handler = proc_nmi_watchdog,
- .extra1 = SYSCTL_ZERO,
- .extra2 = SYSCTL_ONE,
- },
- {
- .procname = "watchdog_cpumask",
- .data = &watchdog_cpumask_bits,
- .maxlen = NR_CPUS,
- .mode = 0644,
- .proc_handler = proc_watchdog_cpumask,
- },
-#ifdef CONFIG_SOFTLOCKUP_DETECTOR
- {
- .procname = "soft_watchdog",
- .data = &soft_watchdog_user_enabled,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_soft_watchdog,
- .extra1 = SYSCTL_ZERO,
- .extra2 = SYSCTL_ONE,
- },
- {
- .procname = "softlockup_panic",
- .data = &softlockup_panic,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_dointvec_minmax,
- .extra1 = SYSCTL_ZERO,
- .extra2 = SYSCTL_ONE,
- },
-#ifdef CONFIG_SMP
- {
- .procname = "softlockup_all_cpu_backtrace",
- .data = &sysctl_softlockup_all_cpu_backtrace,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_dointvec_minmax,
- .extra1 = SYSCTL_ZERO,
- .extra2 = SYSCTL_ONE,
- },
-#endif /* CONFIG_SMP */
-#endif
-#ifdef CONFIG_HARDLOCKUP_DETECTOR
- {
- .procname = "hardlockup_panic",
- .data = &hardlockup_panic,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_dointvec_minmax,
- .extra1 = SYSCTL_ZERO,
- .extra2 = SYSCTL_ONE,
- },
-#ifdef CONFIG_SMP
- {
- .procname = "hardlockup_all_cpu_backtrace",
- .data = &sysctl_hardlockup_all_cpu_backtrace,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_dointvec_minmax,
- .extra1 = SYSCTL_ZERO,
- .extra2 = SYSCTL_ONE,
- },
-#endif /* CONFIG_SMP */
-#endif
-#endif
-
#if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86)
{
.procname = "unknown_nmi_panic",
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index b6b1f54..f5d19be 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -756,6 +756,106 @@ int proc_watchdog_cpumask(struct ctl_table *table, int write,
mutex_unlock(&watchdog_mutex);
return err;
}
+
+static const int sixty = 60;
+
+static struct ctl_table watchdog_sysctls[] = {
+ {
+ .procname = "watchdog",
+ .data = &watchdog_user_enabled,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_watchdog,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ },
+ {
+ .procname = "watchdog_thresh",
+ .data = &watchdog_thresh,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_watchdog_thresh,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = &sixty,
+ },
+ {
+ .procname = "nmi_watchdog",
+ .data = &nmi_watchdog_user_enabled,
+ .maxlen = sizeof(int),
+ .mode = NMI_WATCHDOG_SYSCTL_PERM,
+ .proc_handler = proc_nmi_watchdog,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ },
+ {
+ .procname = "watchdog_cpumask",
+ .data = &watchdog_cpumask_bits,
+ .maxlen = NR_CPUS,
+ .mode = 0644,
+ .proc_handler = proc_watchdog_cpumask,
+ },
+#ifdef CONFIG_SOFTLOCKUP_DETECTOR
+ {
+ .procname = "soft_watchdog",
+ .data = &soft_watchdog_user_enabled,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_soft_watchdog,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ },
+ {
+ .procname = "softlockup_panic",
+ .data = &softlockup_panic,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ },
+#ifdef CONFIG_SMP
+ {
+ .procname = "softlockup_all_cpu_backtrace",
+ .data = &sysctl_softlockup_all_cpu_backtrace,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ },
+#endif /* CONFIG_SMP */
+#endif
+#ifdef CONFIG_HARDLOCKUP_DETECTOR
+ {
+ .procname = "hardlockup_panic",
+ .data = &hardlockup_panic,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ },
+#ifdef CONFIG_SMP
+ {
+ .procname = "hardlockup_all_cpu_backtrace",
+ .data = &sysctl_hardlockup_all_cpu_backtrace,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ },
+#endif /* CONFIG_SMP */
+#endif
+ {}
+};
+
+static void __init watchdog_sysctl_init(void)
+{
+ register_sysctl_init("kernel", watchdog_sysctls, "watchdog_sysctls");
+}
+#else
+#define watchdog_sysctl_init() do { } while (0)
#endif /* CONFIG_SYSCTL */

void __init lockup_detector_init(void)
@@ -769,4 +869,5 @@ void __init lockup_detector_init(void)
if (!watchdog_nmi_probe())
nmi_watchdog_available = true;
lockup_detector_setup();
+ watchdog_sysctl_init();
}
--
1.8.5.6

2020-05-16 09:00:59

by Xiaoming Ni

[permalink] [raw]
Subject: [PATCH v2 2/4] sysctl: Move some boundary constants form sysctl.c to sysctl_vals

Some boundary (.extra1 .extra2) constants (E.g: neg_one two) in
sysctl.c are used in multiple features. Move these variables to
sysctl_vals to avoid adding duplicate variables when cleaning up
sysctls table.

Signed-off-by: Xiaoming Ni <[email protected]>
---
fs/proc/proc_sysctl.c | 2 +-
include/linux/sysctl.h | 11 ++++++++---
kernel/sysctl.c | 37 ++++++++++++++++---------------------
3 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index b6f5d45..3f77e64 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -23,7 +23,7 @@
static const struct inode_operations proc_sys_dir_operations;

/* shared constants to be used in various sysctls */
-const int sysctl_vals[] = { 0, 1, INT_MAX };
+const int sysctl_vals[] = { -1, 0, 1, 2, 4, 100, 1000, INT_MAX };
EXPORT_SYMBOL(sysctl_vals);

/* Support for permanently empty directories */
diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index 43f8ef9..bf97c30 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -38,9 +38,14 @@
struct ctl_dir;

/* Keep the same order as in fs/proc/proc_sysctl.c */
-#define SYSCTL_ZERO ((void *)&sysctl_vals[0])
-#define SYSCTL_ONE ((void *)&sysctl_vals[1])
-#define SYSCTL_INT_MAX ((void *)&sysctl_vals[2])
+#define SYSCTL_NEG_ONE ((void *)&sysctl_vals[0])
+#define SYSCTL_ZERO ((void *)&sysctl_vals[1])
+#define SYSCTL_ONE ((void *)&sysctl_vals[2])
+#define SYSCTL_TWO ((void *)&sysctl_vals[3])
+#define SYSCTL_FOUR ((void *)&sysctl_vals[4])
+#define SYSCTL_ONE_HUNDRED ((void *)&sysctl_vals[5])
+#define SYSCTL_ONE_THOUSAND ((void *)&sysctl_vals[6])
+#define SYSCTL_INT_MAX ((void *)&sysctl_vals[7])

extern const int sysctl_vals[];

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 20c31f5..5f08cc2 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -124,14 +124,9 @@
static int sixty = 60;
#endif

-static int __maybe_unused neg_one = -1;
-static int __maybe_unused two = 2;
-static int __maybe_unused four = 4;
static unsigned long zero_ul;
static unsigned long one_ul = 1;
static unsigned long long_max = LONG_MAX;
-static int one_hundred = 100;
-static int one_thousand = 1000;
#ifdef CONFIG_PRINTK
static int ten_thousand = 10000;
#endif
@@ -547,7 +542,7 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
.maxlen = sizeof(int),
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
- .extra1 = &neg_one,
+ .extra1 = SYSCTL_NEG_ONE,
.extra2 = SYSCTL_ONE,
},
#endif
@@ -878,7 +873,7 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
.mode = 0644,
.proc_handler = proc_dointvec_minmax_sysadmin,
.extra1 = SYSCTL_ZERO,
- .extra2 = &two,
+ .extra2 = SYSCTL_TWO,
},
#endif
{
@@ -1189,7 +1184,7 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
.mode = 0644,
.proc_handler = perf_cpu_time_max_percent_handler,
.extra1 = SYSCTL_ZERO,
- .extra2 = &one_hundred,
+ .extra2 = SYSCTL_ONE_HUNDRED,
},
{
.procname = "perf_event_max_stack",
@@ -1207,7 +1202,7 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
.mode = 0644,
.proc_handler = perf_event_max_stack_handler,
.extra1 = SYSCTL_ZERO,
- .extra2 = &one_thousand,
+ .extra2 = SYSCTL_ONE_THOUSAND,
},
#endif
{
@@ -1282,7 +1277,7 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
.extra1 = SYSCTL_ZERO,
- .extra2 = &two,
+ .extra2 = SYSCTL_TWO,
},
{
.procname = "panic_on_oom",
@@ -1291,7 +1286,7 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
.extra1 = SYSCTL_ZERO,
- .extra2 = &two,
+ .extra2 = SYSCTL_TWO,
},
{
.procname = "oom_kill_allocating_task",
@@ -1336,7 +1331,7 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
.mode = 0644,
.proc_handler = dirty_background_ratio_handler,
.extra1 = SYSCTL_ZERO,
- .extra2 = &one_hundred,
+ .extra2 = SYSCTL_ONE_HUNDRED,
},
{
.procname = "dirty_background_bytes",
@@ -1353,7 +1348,7 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
.mode = 0644,
.proc_handler = dirty_ratio_handler,
.extra1 = SYSCTL_ZERO,
- .extra2 = &one_hundred,
+ .extra2 = SYSCTL_ONE_HUNDRED,
},
{
.procname = "dirty_bytes",
@@ -1393,7 +1388,7 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
.extra1 = SYSCTL_ZERO,
- .extra2 = &one_hundred,
+ .extra2 = SYSCTL_ONE_HUNDRED,
},
#ifdef CONFIG_HUGETLB_PAGE
{
@@ -1450,7 +1445,7 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
.mode = 0200,
.proc_handler = drop_caches_sysctl_handler,
.extra1 = SYSCTL_ONE,
- .extra2 = &four,
+ .extra2 = SYSCTL_FOUR,
},
#ifdef CONFIG_COMPACTION
{
@@ -1503,7 +1498,7 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
.mode = 0644,
.proc_handler = watermark_scale_factor_sysctl_handler,
.extra1 = SYSCTL_ONE,
- .extra2 = &one_thousand,
+ .extra2 = SYSCTL_ONE_THOUSAND,
},
{
.procname = "percpu_pagelist_fraction",
@@ -1582,7 +1577,7 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
.mode = 0644,
.proc_handler = sysctl_min_unmapped_ratio_sysctl_handler,
.extra1 = SYSCTL_ZERO,
- .extra2 = &one_hundred,
+ .extra2 = SYSCTL_ONE_HUNDRED,
},
{
.procname = "min_slab_ratio",
@@ -1591,7 +1586,7 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
.mode = 0644,
.proc_handler = sysctl_min_slab_ratio_sysctl_handler,
.extra1 = SYSCTL_ZERO,
- .extra2 = &one_hundred,
+ .extra2 = SYSCTL_ONE_HUNDRED,
},
#endif
#ifdef CONFIG_SMP
@@ -1874,7 +1869,7 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
.mode = 0600,
.proc_handler = proc_dointvec_minmax,
.extra1 = SYSCTL_ZERO,
- .extra2 = &two,
+ .extra2 = SYSCTL_TWO,
},
{
.procname = "protected_regular",
@@ -1883,7 +1878,7 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
.mode = 0600,
.proc_handler = proc_dointvec_minmax,
.extra1 = SYSCTL_ZERO,
- .extra2 = &two,
+ .extra2 = SYSCTL_TWO,
},
{
.procname = "suid_dumpable",
@@ -1892,7 +1887,7 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
.mode = 0644,
.proc_handler = proc_dointvec_minmax_coredump,
.extra1 = SYSCTL_ZERO,
- .extra2 = &two,
+ .extra2 = SYSCTL_TWO,
},
#if defined(CONFIG_BINFMT_MISC) || defined(CONFIG_BINFMT_MISC_MODULE)
{
--
1.8.5.6

2020-05-17 02:42:46

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH v2 2/4] sysctl: Move some boundary constants form sysctl.c to sysctl_vals

On Sat, May 16, 2020 at 04:55:13PM +0800, Xiaoming Ni wrote:
> Some boundary (.extra1 .extra2) constants (E.g: neg_one two) in
> sysctl.c are used in multiple features. Move these variables to
> sysctl_vals to avoid adding duplicate variables when cleaning up
> sysctls table.
>
> Signed-off-by: Xiaoming Ni <[email protected]>

Reviewed-by: Kees Cook <[email protected]>

--
Kees Cook

2020-05-17 02:47:21

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] sysctl: Add register_sysctl_init() interface

On Sat, May 16, 2020 at 04:55:12PM +0800, Xiaoming Ni wrote:
> In order to eliminate the duplicate code for registering the sysctl
> interface during the initialization of each feature, add the
> register_sysctl_init() interface
>
> Signed-off-by: Xiaoming Ni <[email protected]>

Reviewed-by: Kees Cook <[email protected]>

--
Kees Cook

2020-05-17 02:48:14

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] watchdog: move watchdog sysctl interface to watchdog.c

On Sat, May 16, 2020 at 04:55:15PM +0800, Xiaoming Ni wrote:
> Move watchdog syscl interface to watchdog.c.
> Use register_sysctl() to register the sysctl interface to avoid
> merge conflicts when different features modify sysctl.c at the same time.
>
> Signed-off-by: Xiaoming Ni <[email protected]>

Reviewed-by: Kees Cook <[email protected]>

--
Kees Cook