2020-05-19 03:33:19

by Xiaoming Ni

[permalink] [raw]
Subject: [PATCH v4 0/4] cleaning up the sysctls table (hung_task watchdog)

Kernel/sysctl.c contains more than 190 interface files, and there are a
large number of config macro controls. When modifying the sysctl
interface directly in kernel/sysctl.c, conflicts are very easy to occur.
E.g: https://lore.kernel.org/lkml/[email protected]/

Use register_sysctl() to register the sysctl interface to avoid
merge conflicts when different features modify sysctl.c at the same time.

So consider cleaning up the sysctls table, details are in:
https://kernelnewbies.org/KernelProjects/proc
https://lore.kernel.org/lkml/[email protected]/#t

The current patch set extracts register_sysctl_init and some sysctl_vals
variables, and clears the interface of hung_task and watchdog in sysctl.c.

The current patch set is based on linux-next, commit 72bc15d0018ebfbc9
("Add linux-next specific files for 20200518").

changes in v4:
Handle the conflict with the commit d4ee116819ed714f ("kernel/hung_task.c:
introduce sysctl to print all traces when a hung task is detected"),
move the sysctl interface hung_task_all_cpu_backtrace to hung_task.c.

V3: https://lore.kernel.org/lkml/[email protected]/
base on commit b9bbe6ed63b2b9 ("Linux 5.7-rc6")
changes in v3:
1. make hung_task_timeout_max to be const
2. fix build warning:
kernel/watchdog.c:779:14: warning: initialization discards 'const'
qualifier from pointer target type [-Wdiscarded-qualifiers]
.extra2 = &sixty,
^

V2: https://lore.kernel.org/lkml/[email protected]/
changes in v2:
1. Adjusted the order of patches, first do public function
extraction, then do feature code movement
2. Move hung_task sysctl to hung_task.c instead of adding new file
3. Extract multiple common variables instead of only neg_one, and keep
the order of member values in sysctl_vals
4. Add const modification to the variable sixty in watchdog sysctl

V1: https://lore.kernel.org/lkml/[email protected]/

Xiaoming Ni (4):
sysctl: Add register_sysctl_init() interface
sysctl: Move some boundary constants form sysctl.c to sysctl_vals
hung_task: Move hung_task sysctl interface to hung_task.c
watchdog: move watchdog sysctl interface to watchdog.c

fs/proc/proc_sysctl.c | 2 +-
include/linux/sched/sysctl.h | 14 +--
include/linux/sysctl.h | 13 ++-
kernel/hung_task.c | 77 +++++++++++++++-
kernel/sysctl.c | 214 +++++++------------------------------------
kernel/watchdog.c | 101 ++++++++++++++++++++
6 files changed, 224 insertions(+), 197 deletions(-)

--
1.8.5.6


2020-05-19 03:33:29

by Xiaoming Ni

[permalink] [raw]
Subject: [PATCH v4 3/4] hung_task: Move hung_task sysctl interface to hung_task.c

Move hung_task sysctl interface to hung_task.c.
Use register_sysctl() to register the sysctl interface to avoid
merge conflicts when different features modify sysctl.c at the same time.

Signed-off-by: Xiaoming Ni <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
---
include/linux/sched/sysctl.h | 14 +-------
kernel/hung_task.c | 77 +++++++++++++++++++++++++++++++++++++++++++-
kernel/sysctl.c | 62 -----------------------------------
3 files changed, 77 insertions(+), 76 deletions(-)

diff --git a/include/linux/sched/sysctl.h b/include/linux/sched/sysctl.h
index 660ac49..fcd397a8 100644
--- a/include/linux/sched/sysctl.h
+++ b/include/linux/sched/sysctl.h
@@ -7,20 +7,8 @@
struct ctl_table;

#ifdef CONFIG_DETECT_HUNG_TASK
-
-#ifdef CONFIG_SMP
-extern unsigned int sysctl_hung_task_all_cpu_backtrace;
-#else
-#define sysctl_hung_task_all_cpu_backtrace 0
-#endif /* CONFIG_SMP */
-
-extern int sysctl_hung_task_check_count;
-extern unsigned int sysctl_hung_task_panic;
+/* used for hung_task and block/ */
extern unsigned long sysctl_hung_task_timeout_secs;
-extern unsigned long sysctl_hung_task_check_interval_secs;
-extern int sysctl_hung_task_warnings;
-int proc_dohung_task_timeout_secs(struct ctl_table *table, int write,
- void *buffer, size_t *lenp, loff_t *ppos);
#else
/* Avoid need for ifdefs elsewhere in the code */
enum { sysctl_hung_task_timeout_secs = 0 };
diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index a672db8..d67df599 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -63,6 +63,9 @@
* Defaults to 0, can be changed via sysctl.
*/
unsigned int __read_mostly sysctl_hung_task_all_cpu_backtrace;
+#else
+/* Avoid need for ifdefs elsewhere in the code */
+enum { sysctl_hung_task_timeout_secs = 0 };
#endif /* CONFIG_SMP */

/*
@@ -265,10 +268,11 @@ static long hung_timeout_jiffies(unsigned long last_checked,
MAX_SCHEDULE_TIMEOUT;
}

+#ifdef CONFIG_SYSCTL
/*
* Process updating of timeout sysctl
*/
-int proc_dohung_task_timeout_secs(struct ctl_table *table, int write,
+static int proc_dohung_task_timeout_secs(struct ctl_table *table, int write,
void __user *buffer,
size_t *lenp, loff_t *ppos)
{
@@ -285,6 +289,76 @@ int proc_dohung_task_timeout_secs(struct ctl_table *table, int write,
return ret;
}

+/*
+ * This is needed for proc_doulongvec_minmax of sysctl_hung_task_timeout_secs
+ * and hung_task_check_interval_secs
+ */
+static const unsigned long hung_task_timeout_max = (LONG_MAX / HZ);
+static struct ctl_table hung_task_sysctls[] = {
+#ifdef CONFIG_SMP
+ {
+ .procname = "hung_task_all_cpu_backtrace",
+ .data = &sysctl_hung_task_all_cpu_backtrace,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ },
+#endif /* CONFIG_SMP */
+ {
+ .procname = "hung_task_panic",
+ .data = &sysctl_hung_task_panic,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ },
+ {
+ .procname = "hung_task_check_count",
+ .data = &sysctl_hung_task_check_count,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ },
+ {
+ .procname = "hung_task_timeout_secs",
+ .data = &sysctl_hung_task_timeout_secs,
+ .maxlen = sizeof(unsigned long),
+ .mode = 0644,
+ .proc_handler = proc_dohung_task_timeout_secs,
+ .extra2 = (void *)&hung_task_timeout_max,
+ },
+ {
+ .procname = "hung_task_check_interval_secs",
+ .data = &sysctl_hung_task_check_interval_secs,
+ .maxlen = sizeof(unsigned long),
+ .mode = 0644,
+ .proc_handler = proc_dohung_task_timeout_secs,
+ .extra2 = (void *)&hung_task_timeout_max,
+ },
+ {
+ .procname = "hung_task_warnings",
+ .data = &sysctl_hung_task_warnings,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = SYSCTL_NEG_ONE,
+ },
+ {}
+};
+
+static void __init hung_task_sysctl_init(void)
+{
+ register_sysctl_init("kernel", hung_task_sysctls, "hung_task_sysctls");
+}
+#else
+#define hung_task_sysctl_init() do { } while (0)
+#endif /* CONFIG_SYSCTL */
+
+
static atomic_t reset_hung_task = ATOMIC_INIT(0);

void reset_hung_task_detector(void)
@@ -354,6 +428,7 @@ static int __init hung_task_init(void)
pm_notifier(hungtask_pm_notify, 0);

watchdog_task = kthread_run(watchdog, NULL, "khungtaskd");
+ hung_task_sysctl_init();

return 0;
}
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 38469bf..b7fd4e6 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -131,13 +131,6 @@
static const int ngroups_max = NGROUPS_MAX;
static const int cap_last_cap = CAP_LAST_CAP;

-/*
- * This is needed for proc_doulongvec_minmax of sysctl_hung_task_timeout_secs
- * and hung_task_check_interval_secs
- */
-#ifdef CONFIG_DETECT_HUNG_TASK
-static unsigned long hung_task_timeout_max = (LONG_MAX/HZ);
-#endif

#ifdef CONFIG_INOTIFY_USER
#include <linux/inotify.h>
@@ -229,7 +222,6 @@ static int bpf_stats_handler(struct ctl_table *table, int write,
return ret;
}
#endif
-
/*
* /proc/sys support
*/
@@ -2431,60 +2423,6 @@ int proc_do_static_key(struct ctl_table *table, int write,
.proc_handler = proc_dointvec,
},
#endif
-#ifdef CONFIG_DETECT_HUNG_TASK
-#ifdef CONFIG_SMP
- {
- .procname = "hung_task_all_cpu_backtrace",
- .data = &sysctl_hung_task_all_cpu_backtrace,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_dointvec_minmax,
- .extra1 = SYSCTL_ZERO,
- .extra2 = SYSCTL_ONE,
- },
-#endif /* CONFIG_SMP */
- {
- .procname = "hung_task_panic",
- .data = &sysctl_hung_task_panic,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_dointvec_minmax,
- .extra1 = SYSCTL_ZERO,
- .extra2 = SYSCTL_ONE,
- },
- {
- .procname = "hung_task_check_count",
- .data = &sysctl_hung_task_check_count,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_dointvec_minmax,
- .extra1 = SYSCTL_ZERO,
- },
- {
- .procname = "hung_task_timeout_secs",
- .data = &sysctl_hung_task_timeout_secs,
- .maxlen = sizeof(unsigned long),
- .mode = 0644,
- .proc_handler = proc_dohung_task_timeout_secs,
- .extra2 = &hung_task_timeout_max,
- },
- {
- .procname = "hung_task_check_interval_secs",
- .data = &sysctl_hung_task_check_interval_secs,
- .maxlen = sizeof(unsigned long),
- .mode = 0644,
- .proc_handler = proc_dohung_task_timeout_secs,
- .extra2 = &hung_task_timeout_max,
- },
- {
- .procname = "hung_task_warnings",
- .data = &sysctl_hung_task_warnings,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_dointvec_minmax,
- .extra1 = SYSCTL_NEG_ONE,
- },
-#endif
#ifdef CONFIG_RT_MUTEXES
{
.procname = "max_lock_depth",
--
1.8.5.6

2020-05-19 03:33:33

by Xiaoming Ni

[permalink] [raw]
Subject: [PATCH v4 2/4] sysctl: Move some boundary constants form sysctl.c to sysctl_vals

Some boundary (.extra1 .extra2) constants (E.g: neg_one two) in
sysctl.c are used in multiple features. Move these variables to
sysctl_vals to avoid adding duplicate variables when cleaning up
sysctls table.

Signed-off-by: Xiaoming Ni <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
---
fs/proc/proc_sysctl.c | 2 +-
include/linux/sysctl.h | 11 ++++++++---
kernel/sysctl.c | 39 +++++++++++++++++----------------------
3 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 5b405f3..3d65e7d 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -24,7 +24,7 @@
static const struct inode_operations proc_sys_dir_operations;

/* shared constants to be used in various sysctls */
-const int sysctl_vals[] = { 0, 1, INT_MAX };
+const int sysctl_vals[] = { -1, 0, 1, 2, 4, 100, 1000, INT_MAX };
EXPORT_SYMBOL(sysctl_vals);

/* Support for permanently empty directories */
diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index 857ba93..97586ee 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -38,9 +38,14 @@
struct ctl_dir;

/* Keep the same order as in fs/proc/proc_sysctl.c */
-#define SYSCTL_ZERO ((void *)&sysctl_vals[0])
-#define SYSCTL_ONE ((void *)&sysctl_vals[1])
-#define SYSCTL_INT_MAX ((void *)&sysctl_vals[2])
+#define SYSCTL_NEG_ONE ((void *)&sysctl_vals[0])
+#define SYSCTL_ZERO ((void *)&sysctl_vals[1])
+#define SYSCTL_ONE ((void *)&sysctl_vals[2])
+#define SYSCTL_TWO ((void *)&sysctl_vals[3])
+#define SYSCTL_FOUR ((void *)&sysctl_vals[4])
+#define SYSCTL_ONE_HUNDRED ((void *)&sysctl_vals[5])
+#define SYSCTL_ONE_THOUSAND ((void *)&sysctl_vals[6])
+#define SYSCTL_INT_MAX ((void *)&sysctl_vals[7])

extern const int sysctl_vals[];

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 8afd713..38469bf 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -111,14 +111,9 @@
static int sixty = 60;
#endif

-static int __maybe_unused neg_one = -1;
-static int __maybe_unused two = 2;
-static int __maybe_unused four = 4;
static unsigned long zero_ul;
static unsigned long one_ul = 1;
static unsigned long long_max = LONG_MAX;
-static int one_hundred = 100;
-static int one_thousand = 1000;
#ifdef CONFIG_PRINTK
static int ten_thousand = 10000;
#endif
@@ -1885,7 +1880,7 @@ int proc_do_static_key(struct ctl_table *table, int write,
.maxlen = sizeof(int),
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
- .extra1 = &neg_one,
+ .extra1 = SYSCTL_NEG_ONE,
.extra2 = SYSCTL_ONE,
},
#endif
@@ -2227,7 +2222,7 @@ int proc_do_static_key(struct ctl_table *table, int write,
.mode = 0644,
.proc_handler = proc_dointvec_minmax_sysadmin,
.extra1 = SYSCTL_ZERO,
- .extra2 = &two,
+ .extra2 = SYSCTL_TWO,
},
#endif
{
@@ -2487,7 +2482,7 @@ int proc_do_static_key(struct ctl_table *table, int write,
.maxlen = sizeof(int),
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
- .extra1 = &neg_one,
+ .extra1 = SYSCTL_NEG_ONE,
},
#endif
#ifdef CONFIG_RT_MUTEXES
@@ -2549,7 +2544,7 @@ int proc_do_static_key(struct ctl_table *table, int write,
.mode = 0644,
.proc_handler = perf_cpu_time_max_percent_handler,
.extra1 = SYSCTL_ZERO,
- .extra2 = &one_hundred,
+ .extra2 = SYSCTL_ONE_HUNDRED,
},
{
.procname = "perf_event_max_stack",
@@ -2567,7 +2562,7 @@ int proc_do_static_key(struct ctl_table *table, int write,
.mode = 0644,
.proc_handler = perf_event_max_stack_handler,
.extra1 = SYSCTL_ZERO,
- .extra2 = &one_thousand,
+ .extra2 = SYSCTL_ONE_THOUSAND,
},
#endif
{
@@ -2642,7 +2637,7 @@ int proc_do_static_key(struct ctl_table *table, int write,
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
.extra1 = SYSCTL_ZERO,
- .extra2 = &two,
+ .extra2 = SYSCTL_TWO,
},
{
.procname = "panic_on_oom",
@@ -2651,7 +2646,7 @@ int proc_do_static_key(struct ctl_table *table, int write,
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
.extra1 = SYSCTL_ZERO,
- .extra2 = &two,
+ .extra2 = SYSCTL_TWO,
},
{
.procname = "oom_kill_allocating_task",
@@ -2696,7 +2691,7 @@ int proc_do_static_key(struct ctl_table *table, int write,
.mode = 0644,
.proc_handler = dirty_background_ratio_handler,
.extra1 = SYSCTL_ZERO,
- .extra2 = &one_hundred,
+ .extra2 = SYSCTL_ONE_HUNDRED,
},
{
.procname = "dirty_background_bytes",
@@ -2713,7 +2708,7 @@ int proc_do_static_key(struct ctl_table *table, int write,
.mode = 0644,
.proc_handler = dirty_ratio_handler,
.extra1 = SYSCTL_ZERO,
- .extra2 = &one_hundred,
+ .extra2 = SYSCTL_ONE_HUNDRED,
},
{
.procname = "dirty_bytes",
@@ -2753,7 +2748,7 @@ int proc_do_static_key(struct ctl_table *table, int write,
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
.extra1 = SYSCTL_ZERO,
- .extra2 = &one_hundred,
+ .extra2 = SYSCTL_ONE_HUNDRED,
},
#ifdef CONFIG_HUGETLB_PAGE
{
@@ -2810,7 +2805,7 @@ int proc_do_static_key(struct ctl_table *table, int write,
.mode = 0200,
.proc_handler = drop_caches_sysctl_handler,
.extra1 = SYSCTL_ONE,
- .extra2 = &four,
+ .extra2 = SYSCTL_FOUR,
},
#ifdef CONFIG_COMPACTION
{
@@ -2863,7 +2858,7 @@ int proc_do_static_key(struct ctl_table *table, int write,
.mode = 0644,
.proc_handler = watermark_scale_factor_sysctl_handler,
.extra1 = SYSCTL_ONE,
- .extra2 = &one_thousand,
+ .extra2 = SYSCTL_ONE_THOUSAND,
},
{
.procname = "percpu_pagelist_fraction",
@@ -2942,7 +2937,7 @@ int proc_do_static_key(struct ctl_table *table, int write,
.mode = 0644,
.proc_handler = sysctl_min_unmapped_ratio_sysctl_handler,
.extra1 = SYSCTL_ZERO,
- .extra2 = &one_hundred,
+ .extra2 = SYSCTL_ONE_HUNDRED,
},
{
.procname = "min_slab_ratio",
@@ -2951,7 +2946,7 @@ int proc_do_static_key(struct ctl_table *table, int write,
.mode = 0644,
.proc_handler = sysctl_min_slab_ratio_sysctl_handler,
.extra1 = SYSCTL_ZERO,
- .extra2 = &one_hundred,
+ .extra2 = SYSCTL_ONE_HUNDRED,
},
#endif
#ifdef CONFIG_SMP
@@ -3234,7 +3229,7 @@ int proc_do_static_key(struct ctl_table *table, int write,
.mode = 0600,
.proc_handler = proc_dointvec_minmax,
.extra1 = SYSCTL_ZERO,
- .extra2 = &two,
+ .extra2 = SYSCTL_TWO,
},
{
.procname = "protected_regular",
@@ -3243,7 +3238,7 @@ int proc_do_static_key(struct ctl_table *table, int write,
.mode = 0600,
.proc_handler = proc_dointvec_minmax,
.extra1 = SYSCTL_ZERO,
- .extra2 = &two,
+ .extra2 = SYSCTL_TWO,
},
{
.procname = "suid_dumpable",
@@ -3252,7 +3247,7 @@ int proc_do_static_key(struct ctl_table *table, int write,
.mode = 0644,
.proc_handler = proc_dointvec_minmax_coredump,
.extra1 = SYSCTL_ZERO,
- .extra2 = &two,
+ .extra2 = SYSCTL_TWO,
},
#if defined(CONFIG_BINFMT_MISC) || defined(CONFIG_BINFMT_MISC_MODULE)
{
--
1.8.5.6

2020-05-19 03:34:25

by Xiaoming Ni

[permalink] [raw]
Subject: [PATCH v4 4/4] watchdog: move watchdog sysctl interface to watchdog.c

Move watchdog syscl interface to watchdog.c.
Use register_sysctl() to register the sysctl interface to avoid
merge conflicts when different features modify sysctl.c at the same time.

Signed-off-by: Xiaoming Ni <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
---
kernel/sysctl.c | 96 ---------------------------------------------------
kernel/watchdog.c | 101 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 101 insertions(+), 96 deletions(-)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index b7fd4e6..88235fa 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -100,16 +100,10 @@
#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
#include <linux/stackleak.h>
#endif
-#ifdef CONFIG_LOCKUP_DETECTOR
-#include <linux/nmi.h>
-#endif

#if defined(CONFIG_SYSCTL)

/* Constants used for minimum and maximum */
-#ifdef CONFIG_LOCKUP_DETECTOR
-static int sixty = 60;
-#endif

static unsigned long zero_ul;
static unsigned long one_ul = 1;
@@ -2231,96 +2225,6 @@ int proc_do_static_key(struct ctl_table *table, int write,
.mode = 0444,
.proc_handler = proc_dointvec,
},
-#if defined(CONFIG_LOCKUP_DETECTOR)
- {
- .procname = "watchdog",
- .data = &watchdog_user_enabled,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_watchdog,
- .extra1 = SYSCTL_ZERO,
- .extra2 = SYSCTL_ONE,
- },
- {
- .procname = "watchdog_thresh",
- .data = &watchdog_thresh,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_watchdog_thresh,
- .extra1 = SYSCTL_ZERO,
- .extra2 = &sixty,
- },
- {
- .procname = "nmi_watchdog",
- .data = &nmi_watchdog_user_enabled,
- .maxlen = sizeof(int),
- .mode = NMI_WATCHDOG_SYSCTL_PERM,
- .proc_handler = proc_nmi_watchdog,
- .extra1 = SYSCTL_ZERO,
- .extra2 = SYSCTL_ONE,
- },
- {
- .procname = "watchdog_cpumask",
- .data = &watchdog_cpumask_bits,
- .maxlen = NR_CPUS,
- .mode = 0644,
- .proc_handler = proc_watchdog_cpumask,
- },
-#ifdef CONFIG_SOFTLOCKUP_DETECTOR
- {
- .procname = "soft_watchdog",
- .data = &soft_watchdog_user_enabled,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_soft_watchdog,
- .extra1 = SYSCTL_ZERO,
- .extra2 = SYSCTL_ONE,
- },
- {
- .procname = "softlockup_panic",
- .data = &softlockup_panic,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_dointvec_minmax,
- .extra1 = SYSCTL_ZERO,
- .extra2 = SYSCTL_ONE,
- },
-#ifdef CONFIG_SMP
- {
- .procname = "softlockup_all_cpu_backtrace",
- .data = &sysctl_softlockup_all_cpu_backtrace,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_dointvec_minmax,
- .extra1 = SYSCTL_ZERO,
- .extra2 = SYSCTL_ONE,
- },
-#endif /* CONFIG_SMP */
-#endif
-#ifdef CONFIG_HARDLOCKUP_DETECTOR
- {
- .procname = "hardlockup_panic",
- .data = &hardlockup_panic,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_dointvec_minmax,
- .extra1 = SYSCTL_ZERO,
- .extra2 = SYSCTL_ONE,
- },
-#ifdef CONFIG_SMP
- {
- .procname = "hardlockup_all_cpu_backtrace",
- .data = &sysctl_hardlockup_all_cpu_backtrace,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_dointvec_minmax,
- .extra1 = SYSCTL_ZERO,
- .extra2 = SYSCTL_ONE,
- },
-#endif /* CONFIG_SMP */
-#endif
-#endif
-
#if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86)
{
.procname = "unknown_nmi_panic",
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 1b93953..b000326 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -758,6 +758,106 @@ int proc_watchdog_cpumask(struct ctl_table *table, int write,
mutex_unlock(&watchdog_mutex);
return err;
}
+
+static const int sixty = 60;
+
+static struct ctl_table watchdog_sysctls[] = {
+ {
+ .procname = "watchdog",
+ .data = &watchdog_user_enabled,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_watchdog,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ },
+ {
+ .procname = "watchdog_thresh",
+ .data = &watchdog_thresh,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_watchdog_thresh,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = (void *)&sixty,
+ },
+ {
+ .procname = "nmi_watchdog",
+ .data = &nmi_watchdog_user_enabled,
+ .maxlen = sizeof(int),
+ .mode = NMI_WATCHDOG_SYSCTL_PERM,
+ .proc_handler = proc_nmi_watchdog,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ },
+ {
+ .procname = "watchdog_cpumask",
+ .data = &watchdog_cpumask_bits,
+ .maxlen = NR_CPUS,
+ .mode = 0644,
+ .proc_handler = proc_watchdog_cpumask,
+ },
+#ifdef CONFIG_SOFTLOCKUP_DETECTOR
+ {
+ .procname = "soft_watchdog",
+ .data = &soft_watchdog_user_enabled,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_soft_watchdog,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ },
+ {
+ .procname = "softlockup_panic",
+ .data = &softlockup_panic,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ },
+#ifdef CONFIG_SMP
+ {
+ .procname = "softlockup_all_cpu_backtrace",
+ .data = &sysctl_softlockup_all_cpu_backtrace,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ },
+#endif /* CONFIG_SMP */
+#endif
+#ifdef CONFIG_HARDLOCKUP_DETECTOR
+ {
+ .procname = "hardlockup_panic",
+ .data = &hardlockup_panic,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ },
+#ifdef CONFIG_SMP
+ {
+ .procname = "hardlockup_all_cpu_backtrace",
+ .data = &sysctl_hardlockup_all_cpu_backtrace,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ },
+#endif /* CONFIG_SMP */
+#endif
+ {}
+};
+
+static void __init watchdog_sysctl_init(void)
+{
+ register_sysctl_init("kernel", watchdog_sysctls, "watchdog_sysctls");
+}
+#else
+#define watchdog_sysctl_init() do { } while (0)
#endif /* CONFIG_SYSCTL */

void __init lockup_detector_init(void)
@@ -771,4 +871,5 @@ void __init lockup_detector_init(void)
if (!watchdog_nmi_probe())
nmi_watchdog_available = true;
lockup_detector_setup();
+ watchdog_sysctl_init();
}
--
1.8.5.6

2020-05-19 03:34:47

by Xiaoming Ni

[permalink] [raw]
Subject: [PATCH v4 1/4] sysctl: Add register_sysctl_init() interface

In order to eliminate the duplicate code for registering the sysctl
interface during the initialization of each feature, add the
register_sysctl_init() interface

Signed-off-by: Xiaoming Ni <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
---
include/linux/sysctl.h | 2 ++
kernel/sysctl.c | 19 +++++++++++++++++++
2 files changed, 21 insertions(+)

diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index 50bb7f3..857ba93 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -197,6 +197,8 @@ struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path,
void unregister_sysctl_table(struct ctl_table_header * table);

extern int sysctl_init(void);
+extern void register_sysctl_init(const char *path, struct ctl_table *table,
+ const char *table_name);
void do_sysctl_args(void);

extern int pwrsw_enabled;
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index cc1fcba..8afd713 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -3358,6 +3358,25 @@ int __init sysctl_init(void)
kmemleak_not_leak(hdr);
return 0;
}
+
+/*
+ * The sysctl interface is used to modify the interface value,
+ * but the feature interface has default values. Even if register_sysctl fails,
+ * the feature body function can also run. At the same time, malloc small
+ * fragment of memory during the system initialization phase, almost does
+ * not fail. Therefore, the function return is designed as void
+ */
+void __init register_sysctl_init(const char *path, struct ctl_table *table,
+ const char *table_name)
+{
+ struct ctl_table_header *hdr = register_sysctl(path, table);
+
+ if (unlikely(!hdr)) {
+ pr_err("failed when register_sysctl %s to %s\n", table_name, path);
+ return;
+ }
+ kmemleak_not_leak(hdr);
+}
#endif /* CONFIG_SYSCTL */
/*
* No sense putting this after each symbol definition, twice,
--
1.8.5.6

2020-05-19 04:47:50

by Tetsuo Handa

[permalink] [raw]
Subject: Re: [PATCH v4 2/4] sysctl: Move some boundary constants form sysctl.c to sysctl_vals

On 2020/05/19 12:31, Xiaoming Ni wrote:
> Some boundary (.extra1 .extra2) constants (E.g: neg_one two) in
> sysctl.c are used in multiple features. Move these variables to
> sysctl_vals to avoid adding duplicate variables when cleaning up
> sysctls table.
>
> Signed-off-by: Xiaoming Ni <[email protected]>
> Reviewed-by: Kees Cook <[email protected]>

I feel that it is use of

void *extra1;
void *extra2;

in "struct ctl_table" that requires constant values indirection.
Can't we get rid of sysctl_vals using some "union" like below?

struct ctl_table {
const char *procname; /* Text ID for /proc/sys, or zero */
void *data;
int maxlen;
umode_t mode;
struct ctl_table *child; /* Deprecated */
proc_handler *proc_handler; /* Callback for text formatting */
struct ctl_table_poll *poll;
union {
void *min_max_ptr[2];
int min_max_int[2];
long min_max_long[2];
};
} __randomize_layout;

2020-05-20 01:18:16

by Xiaoming Ni

[permalink] [raw]
Subject: Re: [PATCH v4 2/4] sysctl: Move some boundary constants form sysctl.c to sysctl_vals

On 2020/5/19 12:44, Tetsuo Handa wrote:
> On 2020/05/19 12:31, Xiaoming Ni wrote:
>> Some boundary (.extra1 .extra2) constants (E.g: neg_one two) in
>> sysctl.c are used in multiple features. Move these variables to
>> sysctl_vals to avoid adding duplicate variables when cleaning up
>> sysctls table.
>>
>> Signed-off-by: Xiaoming Ni <[email protected]>
>> Reviewed-by: Kees Cook <[email protected]>
>
> I feel that it is use of
>
> void *extra1;
> void *extra2;
>
> in "struct ctl_table" that requires constant values indirection.
> Can't we get rid of sysctl_vals using some "union" like below?
>
> struct ctl_table {
> const char *procname; /* Text ID for /proc/sys, or zero */
> void *data;
> int maxlen;
> umode_t mode;
> struct ctl_table *child; /* Deprecated */
> proc_handler *proc_handler; /* Callback for text formatting */
> struct ctl_table_poll *poll;
> union {
> void *min_max_ptr[2];
> int min_max_int[2];
> long min_max_long[2];
> };
> } __randomize_layout;
>
> .
>

net/decnet/dn_dev.c:
static void dn_dev_sysctl_register(struct net_device *dev, struct
dn_dev_parms *parms)
{
struct dn_dev_sysctl_table *t;
int i;

char path[sizeof("net/decnet/conf/") + IFNAMSIZ];

t = kmemdup(&dn_dev_sysctl, sizeof(*t), GFP_KERNEL);
if (t == NULL)
return;

for(i = 0; i < ARRAY_SIZE(t->dn_dev_vars) - 1; i++) {
long offset = (long)t->dn_dev_vars[i].data;
t->dn_dev_vars[i].data = ((char *)parms) + offset;
}

snprintf(path, sizeof(path), "net/decnet/conf/%s",
dev? dev->name : parms->name);

t->dn_dev_vars[0].extra1 = (void *)dev;

t->sysctl_header = register_net_sysctl(&init_net, path, t->dn_dev_vars);
if (t->sysctl_header == NULL)
kfree(t);
else
parms->sysctl = t;
}

A small amount of code is not used as a boundary value when using
extra1. This scenario may not be suitable for renaming to min_max_ptr.

Should we add const to extra1 extra2 ?

--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -124,8 +124,8 @@ struct ctl_table {
struct ctl_table *child; /* Deprecated */
proc_handler *proc_handler; /* Callback for text formatting */
struct ctl_table_poll *poll;
- void *extra1;
- void *extra2;
+ const void *extra1;
+ const void *extra2;
} __randomize_layout;


Thanks
Xiaoming Ni







2020-05-20 01:41:59

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v4 2/4] sysctl: Move some boundary constants form sysctl.c to sysctl_vals

On Wed, May 20, 2020 at 09:14:08AM +0800, Xiaoming Ni wrote:
> On 2020/5/19 12:44, Tetsuo Handa wrote:
> > On 2020/05/19 12:31, Xiaoming Ni wrote:
> > > Some boundary (.extra1 .extra2) constants (E.g: neg_one two) in
> > > sysctl.c are used in multiple features. Move these variables to
> > > sysctl_vals to avoid adding duplicate variables when cleaning up
> > > sysctls table.
> > >
> > > Signed-off-by: Xiaoming Ni <[email protected]>
> > > Reviewed-by: Kees Cook <[email protected]>
> >
> > I feel that it is use of
> >
> > void *extra1;
> > void *extra2;
> >
> > in "struct ctl_table" that requires constant values indirection.
> > Can't we get rid of sysctl_vals using some "union" like below?
> >
> > struct ctl_table {
> > const char *procname; /* Text ID for /proc/sys, or zero */
> > void *data;
> > int maxlen;
> > umode_t mode;
> > struct ctl_table *child; /* Deprecated */
> > proc_handler *proc_handler; /* Callback for text formatting */
> > struct ctl_table_poll *poll;
> > union {
> > void *min_max_ptr[2];
> > int min_max_int[2];
> > long min_max_long[2];
> > };
> > } __randomize_layout;
> >
> > .
> >
>
> net/decnet/dn_dev.c:
> static void dn_dev_sysctl_register(struct net_device *dev, struct
> dn_dev_parms *parms)
> {
> struct dn_dev_sysctl_table *t;
> int i;
>
> char path[sizeof("net/decnet/conf/") + IFNAMSIZ];
>
> t = kmemdup(&dn_dev_sysctl, sizeof(*t), GFP_KERNEL);
> if (t == NULL)
> return;
>
> for(i = 0; i < ARRAY_SIZE(t->dn_dev_vars) - 1; i++) {
> long offset = (long)t->dn_dev_vars[i].data;
> t->dn_dev_vars[i].data = ((char *)parms) + offset;
> }
>
> snprintf(path, sizeof(path), "net/decnet/conf/%s",
> dev? dev->name : parms->name);
>
> t->dn_dev_vars[0].extra1 = (void *)dev;
>
> t->sysctl_header = register_net_sysctl(&init_net, path, t->dn_dev_vars);
> if (t->sysctl_header == NULL)
> kfree(t);
> else
> parms->sysctl = t;
> }
>
> A small amount of code is not used as a boundary value when using extra1.
> This scenario may not be suitable for renaming to min_max_ptr.
>
> Should we add const to extra1 extra2 ?
>
> --- a/include/linux/sysctl.h
> +++ b/include/linux/sysctl.h
> @@ -124,8 +124,8 @@ struct ctl_table {
> struct ctl_table *child; /* Deprecated */
> proc_handler *proc_handler; /* Callback for text formatting */
> struct ctl_table_poll *poll;
> - void *extra1;
> - void *extra2;
> + const void *extra1;
> + const void *extra2;
> } __randomize_layout;

Do that, compile an allyesconfig and it'll fail, but if you fix the
callers so that they use a const, then yes. That would cover only your
architecture. It is unclear if we ever used non-const for this on purpose.

Luis

2020-05-20 03:33:39

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH v4 0/4] cleaning up the sysctls table (hung_task watchdog)

On Tue, 19 May 2020 11:31:07 +0800 Xiaoming Ni <[email protected]> wrote:

> Kernel/sysctl.c

eek!

>
> fs/proc/proc_sysctl.c | 2 +-
> include/linux/sched/sysctl.h | 14 +--
> include/linux/sysctl.h | 13 ++-
> kernel/hung_task.c | 77 +++++++++++++++-
> kernel/sysctl.c | 214 +++++++------------------------------------
> kernel/watchdog.c | 101 ++++++++++++++++++++
> 6 files changed, 224 insertions(+), 197 deletions(-)

Here's what we presently have happening in linux-next's kernel/sysctl.c:

sysctl.c | 3109 ++++++++++++++++++++++++++++++---------------------------------
1 file changed, 1521 insertions(+), 1588 deletions(-)


So this is not a good time for your patch!

Can I suggest that you set the idea aside and take a look after 5.8-rc1
is released?

2020-05-20 04:04:24

by Xiaoming Ni

[permalink] [raw]
Subject: Re: [PATCH v4 0/4] cleaning up the sysctls table (hung_task watchdog)

On 2020/5/20 11:31, Andrew Morton wrote:
> On Tue, 19 May 2020 11:31:07 +0800 Xiaoming Ni <[email protected]> wrote:
>
>> Kernel/sysctl.c
>
> eek!
>
>>
>> fs/proc/proc_sysctl.c | 2 +-
>> include/linux/sched/sysctl.h | 14 +--
>> include/linux/sysctl.h | 13 ++-
>> kernel/hung_task.c | 77 +++++++++++++++-
>> kernel/sysctl.c | 214 +++++++------------------------------------
>> kernel/watchdog.c | 101 ++++++++++++++++++++
>> 6 files changed, 224 insertions(+), 197 deletions(-)
>
> Here's what we presently have happening in linux-next's kernel/sysctl.c:
>
> sysctl.c | 3109 ++++++++++++++++++++++++++++++---------------------------------
> 1 file changed, 1521 insertions(+), 1588 deletions(-)
>
>
> So this is not a good time for your patch!
>
> Can I suggest that you set the idea aside and take a look after 5.8-rc1
> is released?
>

ok, I will make v5 patch based on 5.8-rc1 after 5.8-rc1 is released,
And add more sysctl table cleanup.

Thanks
Xiaoming Ni


2020-05-20 13:10:19

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v4 0/4] cleaning up the sysctls table (hung_task watchdog)

On Wed, May 20, 2020 at 12:02:26PM +0800, Xiaoming Ni wrote:
> On 2020/5/20 11:31, Andrew Morton wrote:
> > On Tue, 19 May 2020 11:31:07 +0800 Xiaoming Ni <[email protected]> wrote:
> >
> > > Kernel/sysctl.c
> >
> > eek!
> >
> > >
> > > fs/proc/proc_sysctl.c | 2 +-
> > > include/linux/sched/sysctl.h | 14 +--
> > > include/linux/sysctl.h | 13 ++-
> > > kernel/hung_task.c | 77 +++++++++++++++-
> > > kernel/sysctl.c | 214 +++++++------------------------------------
> > > kernel/watchdog.c | 101 ++++++++++++++++++++
> > > 6 files changed, 224 insertions(+), 197 deletions(-)
> >
> > Here's what we presently have happening in linux-next's kernel/sysctl.c:
> >
> > sysctl.c | 3109 ++++++++++++++++++++++++++++++---------------------------------
> > 1 file changed, 1521 insertions(+), 1588 deletions(-)
> >
> >
> > So this is not a good time for your patch!
> >
> > Can I suggest that you set the idea aside and take a look after 5.8-rc1
> > is released?
> >
>
> ok, I will make v5 patch based on 5.8-rc1 after 5.8-rc1 is released,
> And add more sysctl table cleanup.

Xiaoming, I'll coordinate with you on this offline as I have the fs
kernel/sysctl.c stuff out of the way as well.

Luis

2020-05-29 07:11:16

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v4 1/4] sysctl: Add register_sysctl_init() interface

On Tue, May 19, 2020 at 11:31:08AM +0800, Xiaoming Ni wrote:
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -3358,6 +3358,25 @@ int __init sysctl_init(void)
> kmemleak_not_leak(hdr);
> return 0;
> }
> +
> +/*
> + * The sysctl interface is used to modify the interface value,
> + * but the feature interface has default values. Even if register_sysctl fails,
> + * the feature body function can also run. At the same time, malloc small
> + * fragment of memory during the system initialization phase, almost does
> + * not fail. Therefore, the function return is designed as void
> + */

Let's use kdoc while at it. Can you convert this to proper kdoc?

> +void __init register_sysctl_init(const char *path, struct ctl_table *table,
> + const char *table_name)
> +{
> + struct ctl_table_header *hdr = register_sysctl(path, table);
> +
> + if (unlikely(!hdr)) {
> + pr_err("failed when register_sysctl %s to %s\n", table_name, path);
> + return;

table_name is only used for this, however we can easily just make
another _register_sysctl_init() helper first, and then use a macro
which will concatenate this to something useful if you want to print
a string. I see no point in the description for this, specially since
the way it was used was not to be descriptive, but instead just a name
followed by some underscore and something else.

> + }
> + kmemleak_not_leak(hdr);

Is it *wrong* to run kmemleak_not_leak() when hdr was not allocated?
If so, can you fix the sysctl __init call itself?

PS. Since you have given me your series, feel free to send me a patch
as a follow up to this in privae and I can integrate it into my tree.

Luis

2020-05-29 07:31:58

by Xiaoming Ni

[permalink] [raw]
Subject: Re: [PATCH v4 1/4] sysctl: Add register_sysctl_init() interface

On 2020/5/29 15:09, Luis Chamberlain wrote:
> On Tue, May 19, 2020 at 11:31:08AM +0800, Xiaoming Ni wrote:
>> --- a/kernel/sysctl.c
>> +++ b/kernel/sysctl.c
>> @@ -3358,6 +3358,25 @@ int __init sysctl_init(void)
>> kmemleak_not_leak(hdr);
>> return 0;
>> }
>> +
>> +/*
>> + * The sysctl interface is used to modify the interface value,
>> + * but the feature interface has default values. Even if register_sysctl fails,
>> + * the feature body function can also run. At the same time, malloc small
>> + * fragment of memory during the system initialization phase, almost does
>> + * not fail. Therefore, the function return is designed as void
>> + */
>
> Let's use kdoc while at it. Can you convert this to proper kdoc?
>
Sorry, I do n??t know the format requirements of Kdoc, can you give me
some tips for writing?


>> +void __init register_sysctl_init(const char *path, struct ctl_table *table,
>> + const char *table_name)
>> +{
>> + struct ctl_table_header *hdr = register_sysctl(path, table);
>> +
>> + if (unlikely(!hdr)) {
>> + pr_err("failed when register_sysctl %s to %s\n", table_name, path);
>> + return;
>
> table_name is only used for this, however we can easily just make
> another _register_sysctl_init() helper first, and then use a macro
> which will concatenate this to something useful if you want to print
> a string. I see no point in the description for this, specially since
> the way it was used was not to be descriptive, but instead just a name
> followed by some underscore and something else.
>
Good idea, I will fix and send the patch to you as soon as possible

>> + }
>> + kmemleak_not_leak(hdr);
>
> Is it *wrong* to run kmemleak_not_leak() when hdr was not allocated?
> If so, can you fix the sysctl __init call itself?
I don't understand here, do you mean that register_sysctl_init () does
not need to call kmemleak_not_leak (hdr), or does it mean to add check
hdr before calling kmemleak_not_leak (hdr) in sysctl_init ()?


> PS. Since you have given me your series, feel free to send me a patch
> as a follow up to this in privae and I can integrate it into my tree.
>
> Luis
>

Thanks
Xiaoming Ni


2020-05-29 07:39:31

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v4 1/4] sysctl: Add register_sysctl_init() interface

On Fri, May 29, 2020 at 03:27:22PM +0800, Xiaoming Ni wrote:
> On 2020/5/29 15:09, Luis Chamberlain wrote:
> > On Tue, May 19, 2020 at 11:31:08AM +0800, Xiaoming Ni wrote:
> > > --- a/kernel/sysctl.c
> > > +++ b/kernel/sysctl.c
> > > @@ -3358,6 +3358,25 @@ int __init sysctl_init(void)
> > > kmemleak_not_leak(hdr);
> > > return 0;
> > > }
> > > +
> > > +/*
> > > + * The sysctl interface is used to modify the interface value,
> > > + * but the feature interface has default values. Even if register_sysctl fails,
> > > + * the feature body function can also run. At the same time, malloc small
> > > + * fragment of memory during the system initialization phase, almost does
> > > + * not fail. Therefore, the function return is designed as void
> > > + */
> >
> > Let's use kdoc while at it. Can you convert this to proper kdoc?
> >
> Sorry, I do n’t know the format requirements of Kdoc, can you give me some
> tips for writing?

Sure, include/net/mac80211.h is a good example.

> > > +void __init register_sysctl_init(const char *path, struct ctl_table *table,
> > > + const char *table_name)
> > > +{
> > > + struct ctl_table_header *hdr = register_sysctl(path, table);
> > > +
> > > + if (unlikely(!hdr)) {
> > > + pr_err("failed when register_sysctl %s to %s\n", table_name, path);
> > > + return;
> >
> > table_name is only used for this, however we can easily just make
> > another _register_sysctl_init() helper first, and then use a macro
> > which will concatenate this to something useful if you want to print
> > a string. I see no point in the description for this, specially since
> > the way it was used was not to be descriptive, but instead just a name
> > followed by some underscore and something else.
> >
> Good idea, I will fix and send the patch to you as soon as possible

No rush :)

> > > + }
> > > + kmemleak_not_leak(hdr);
> >
> > Is it *wrong* to run kmemleak_not_leak() when hdr was not allocated?
> > If so, can you fix the sysctl __init call itself?
> I don't understand here, do you mean that register_sysctl_init () does not
> need to call kmemleak_not_leak (hdr), or does it mean to add check hdr
> before calling kmemleak_not_leak (hdr) in sysctl_init ()?

I'm asking that the way you are adding it, you don't run
kmemleak_not_leak(hdr) if the hdr allocation filed. If that is
right then it seems that sysctl_init() might not be doing it
right.

Can that code be shared somehow?

Luis

2020-05-29 08:35:23

by Xiaoming Ni

[permalink] [raw]
Subject: Re: [PATCH v4 1/4] sysctl: Add register_sysctl_init() interface

On 2020/5/29 15:36, Luis Chamberlain wrote:
> On Fri, May 29, 2020 at 03:27:22PM +0800, Xiaoming Ni wrote:
>> On 2020/5/29 15:09, Luis Chamberlain wrote:
>>> On Tue, May 19, 2020 at 11:31:08AM +0800, Xiaoming Ni wrote:
>>>> --- a/kernel/sysctl.c
>>>> +++ b/kernel/sysctl.c
>>>> @@ -3358,6 +3358,25 @@ int __init sysctl_init(void)
>>>> kmemleak_not_leak(hdr);
>>>> return 0;
>>>> }
>>>> +
>>>> +/*
>>>> + * The sysctl interface is used to modify the interface value,
>>>> + * but the feature interface has default values. Even if register_sysctl fails,
>>>> + * the feature body function can also run. At the same time, malloc small
>>>> + * fragment of memory during the system initialization phase, almost does
>>>> + * not fail. Therefore, the function return is designed as void
>>>> + */
>>>
>>> Let's use kdoc while at it. Can you convert this to proper kdoc?
>>>
>> Sorry, I do n’t know the format requirements of Kdoc, can you give me some
>> tips for writing?
>
> Sure, include/net/mac80211.h is a good example.
>
>>>> +void __init register_sysctl_init(const char *path, struct ctl_table *table,
>>>> + const char *table_name)
>>>> +{
>>>> + struct ctl_table_header *hdr = register_sysctl(path, table);
>>>> +
>>>> + if (unlikely(!hdr)) {
>>>> + pr_err("failed when register_sysctl %s to %s\n", table_name, path);
>>>> + return;
>>>
>>> table_name is only used for this, however we can easily just make
>>> another _register_sysctl_init() helper first, and then use a macro
>>> which will concatenate this to something useful if you want to print
>>> a string. I see no point in the description for this, specially since
>>> the way it was used was not to be descriptive, but instead just a name
>>> followed by some underscore and something else.
>>>
>> Good idea, I will fix and send the patch to you as soon as possible
>
> No rush :)
>
>>>> + }
>>>> + kmemleak_not_leak(hdr);
>>>
>>> Is it *wrong* to run kmemleak_not_leak() when hdr was not allocated?
>>> If so, can you fix the sysctl __init call itself?
>> I don't understand here, do you mean that register_sysctl_init () does not
>> need to call kmemleak_not_leak (hdr), or does it mean to add check hdr
>> before calling kmemleak_not_leak (hdr) in sysctl_init ()?
>
> I'm asking that the way you are adding it, you don't run
> kmemleak_not_leak(hdr) if the hdr allocation filed. If that is
> right then it seems that sysctl_init() might not be doing it
> right.
>
> Can that code be shared somehow?
>
> Luis

void __ref kmemleak_not_leak(const void *ptr)
{
pr_debug("%s(0x%p)\n", __func__, ptr);

if (kmemleak_enabled && ptr && !IS_ERR(ptr))
make_gray_object((unsigned long)ptr);
}
EXPORT_SYMBOL(kmemleak_not_leak);

In the code of kmemleak_not_leak(), it is verified that the pointer is
valid, so kmemleak_not_leak (NULL) will not be a problem.
At the same time, there is no need to call kmemleak_not_leak() in the
failed branch of register_sysctl_init().

Thanks
Xiaoming Ni

2020-05-29 11:33:55

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v4 1/4] sysctl: Add register_sysctl_init() interface

On Fri, May 29, 2020 at 04:33:01PM +0800, Xiaoming Ni wrote:
> On 2020/5/29 15:36, Luis Chamberlain wrote:
> > On Fri, May 29, 2020 at 03:27:22PM +0800, Xiaoming Ni wrote:
> > > On 2020/5/29 15:09, Luis Chamberlain wrote:
> > > > On Tue, May 19, 2020 at 11:31:08AM +0800, Xiaoming Ni wrote:
> > > > > --- a/kernel/sysctl.c
> > > > > +++ b/kernel/sysctl.c
> > > > > @@ -3358,6 +3358,25 @@ int __init sysctl_init(void)
> > > > > kmemleak_not_leak(hdr);
> > > > > return 0;
> > > > > }
> > > > > +
> > > > > +/*
> > > > > + * The sysctl interface is used to modify the interface value,
> > > > > + * but the feature interface has default values. Even if register_sysctl fails,
> > > > > + * the feature body function can also run. At the same time, malloc small
> > > > > + * fragment of memory during the system initialization phase, almost does
> > > > > + * not fail. Therefore, the function return is designed as void
> > > > > + */
> > > >
> > > > Let's use kdoc while at it. Can you convert this to proper kdoc?
> > > >
> > > Sorry, I do n’t know the format requirements of Kdoc, can you give me some
> > > tips for writing?
> >
> > Sure, include/net/mac80211.h is a good example.
> >
> > > > > +void __init register_sysctl_init(const char *path, struct ctl_table *table,
> > > > > + const char *table_name)
> > > > > +{
> > > > > + struct ctl_table_header *hdr = register_sysctl(path, table);
> > > > > +
> > > > > + if (unlikely(!hdr)) {
> > > > > + pr_err("failed when register_sysctl %s to %s\n", table_name, path);
> > > > > + return;
> > > >
> > > > table_name is only used for this, however we can easily just make
> > > > another _register_sysctl_init() helper first, and then use a macro
> > > > which will concatenate this to something useful if you want to print
> > > > a string. I see no point in the description for this, specially since
> > > > the way it was used was not to be descriptive, but instead just a name
> > > > followed by some underscore and something else.
> > > >
> > > Good idea, I will fix and send the patch to you as soon as possible
> >
> > No rush :)
> >
> > > > > + }
> > > > > + kmemleak_not_leak(hdr);
> > > >
> > > > Is it *wrong* to run kmemleak_not_leak() when hdr was not allocated?
> > > > If so, can you fix the sysctl __init call itself?
> > > I don't understand here, do you mean that register_sysctl_init () does not
> > > need to call kmemleak_not_leak (hdr), or does it mean to add check hdr
> > > before calling kmemleak_not_leak (hdr) in sysctl_init ()?
> >
> > I'm asking that the way you are adding it, you don't run
> > kmemleak_not_leak(hdr) if the hdr allocation filed. If that is
> > right then it seems that sysctl_init() might not be doing it
> > right.
> >
> > Can that code be shared somehow?
> >
> > Luis
>
> void __ref kmemleak_not_leak(const void *ptr)
> {
> pr_debug("%s(0x%p)\n", __func__, ptr);
>
> if (kmemleak_enabled && ptr && !IS_ERR(ptr))
> make_gray_object((unsigned long)ptr);
> }
> EXPORT_SYMBOL(kmemleak_not_leak);
>
> In the code of kmemleak_not_leak(), it is verified that the pointer is
> valid, so kmemleak_not_leak (NULL) will not be a problem.
> At the same time, there is no need to call kmemleak_not_leak() in the failed
> branch of register_sysctl_init().

Thanks for the confirmation.

Luis