2014-02-24 23:28:11

by Alexander Graf

[permalink] [raw]
Subject: [PATCH] ksm: Expose configuration via sysctl

Configuration of tunables and Linux virtual memory settings has traditionally
happened via sysctl. Thanks to that there are well established ways to make
sysctl configuration bits persistent (sysctl.conf).

KSM introduced a sysfs based configuration path which is not covered by user
space persistent configuration frameworks.

In order to make life easy for sysadmins, this patch adds all access to all
KSM tunables via sysctl as well. That way sysctl.conf works for KSM as well,
giving us a streamlined way to make KSM configuration persistent.

Reported-by: Sasche Peilicke <[email protected]>
Signed-off-by: Alexander Graf <[email protected]>
---
kernel/sysctl.c | 10 +++++++
mm/ksm.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 88 insertions(+), 0 deletions(-)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 332cefc..2169a00 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -217,6 +217,9 @@ extern struct ctl_table random_table[];
#ifdef CONFIG_EPOLL
extern struct ctl_table epoll_table[];
#endif
+#ifdef CONFIG_KSM
+extern struct ctl_table ksm_table[];
+#endif

#ifdef HAVE_ARCH_PICK_MMAP_LAYOUT
int sysctl_legacy_va_layout;
@@ -1279,6 +1282,13 @@ static struct ctl_table vm_table[] = {
},

#endif /* CONFIG_COMPACTION */
+#ifdef CONFIG_KSM
+ {
+ .procname = "ksm",
+ .mode = 0555,
+ .child = ksm_table,
+ },
+#endif
{
.procname = "min_free_kbytes",
.data = &min_free_kbytes,
diff --git a/mm/ksm.c b/mm/ksm.c
index 3df141e..df93989 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -2306,6 +2306,84 @@ static struct attribute_group ksm_attr_group = {
};
#endif /* CONFIG_SYSFS */

+#ifdef CONFIG_SYSCTL
+static int fill_from_user(char *buf, char __user *ubuf, size_t len)
+{
+ if (len > 15)
+ return -EINVAL;
+
+ if (copy_from_user(buf, ubuf, len))
+ return -EFAULT;
+
+ buf[len] = 0;
+ return 0;
+}
+
+static int sysctl_ksm(struct ctl_table *ctl, int write, void __user *buffer,
+ size_t *lenp, loff_t *fpos)
+{
+ int r;
+ char buf[16];
+
+ if (!write)
+ return proc_dointvec(ctl, write, buffer, lenp, fpos);
+
+ r = fill_from_user(buf, buffer, *lenp);
+ if (r)
+ return r;
+
+ if (ctl->data == &ksm_run)
+ r = run_store(NULL, NULL, buf, *lenp);
+ else if (ctl->data == &ksm_thread_sleep_millisecs)
+ r = sleep_millisecs_store(NULL, NULL, buf, *lenp);
+ else if (ctl->data == &ksm_thread_pages_to_scan)
+ r = pages_to_scan_store(NULL, NULL, buf, *lenp);
+#ifdef CONFIG_NUMA
+ else if (ctl->data == &ksm_merge_across_nodes)
+ r = merge_across_nodes_store(NULL, NULL, buf, *lenp);
+#endif
+ else
+ r = -EINVAL;
+
+ return r;
+}
+
+struct ctl_table ksm_table[] =
+{
+ {
+ .procname = "run",
+ .data = &ksm_run,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = sysctl_ksm,
+ },
+ {
+ .procname = "sleep_millisecs",
+ .data = &ksm_thread_sleep_millisecs,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = sysctl_ksm,
+ },
+ {
+ .procname = "pages_to_scan",
+ .data = &ksm_thread_pages_to_scan,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = sysctl_ksm,
+ },
+#ifdef CONFIG_NUMA
+ {
+ .procname = "merge_across_nodes",
+ .data = &ksm_merge_across_nodes,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = sysctl_ksm,
+ },
+#endif /* CONFIG_NUMA */
+ { }
+};
+#endif /* CONFIG_SYSCTL */
+
static int __init ksm_init(void)
{
struct task_struct *ksm_thread;
--
1.6.0.2


2014-02-25 12:33:48

by Rik van Riel

[permalink] [raw]
Subject: Re: [PATCH] ksm: Expose configuration via sysctl

On 02/24/2014 06:28 PM, Alexander Graf wrote:
> Configuration of tunables and Linux virtual memory settings has traditionally
> happened via sysctl. Thanks to that there are well established ways to make
> sysctl configuration bits persistent (sysctl.conf).
>
> KSM introduced a sysfs based configuration path which is not covered by user
> space persistent configuration frameworks.
>
> In order to make life easy for sysadmins, this patch adds all access to all
> KSM tunables via sysctl as well. That way sysctl.conf works for KSM as well,
> giving us a streamlined way to make KSM configuration persistent.
>
> Reported-by: Sasche Peilicke <[email protected]>
> Signed-off-by: Alexander Graf <[email protected]>

Reviewed-by: Rik van Riel <[email protected]>


--
All rights reversed

2014-02-25 17:15:42

by Johannes Weiner

[permalink] [raw]
Subject: Re: [PATCH] ksm: Expose configuration via sysctl

On Tue, Feb 25, 2014 at 12:28:04AM +0100, Alexander Graf wrote:
> Configuration of tunables and Linux virtual memory settings has traditionally
> happened via sysctl. Thanks to that there are well established ways to make
> sysctl configuration bits persistent (sysctl.conf).
>
> KSM introduced a sysfs based configuration path which is not covered by user
> space persistent configuration frameworks.
>
> In order to make life easy for sysadmins, this patch adds all access to all
> KSM tunables via sysctl as well. That way sysctl.conf works for KSM as well,
> giving us a streamlined way to make KSM configuration persistent.
>
> Reported-by: Sasche Peilicke <[email protected]>
> Signed-off-by: Alexander Graf <[email protected]>
> ---
> kernel/sysctl.c | 10 +++++++
> mm/ksm.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 88 insertions(+), 0 deletions(-)
>
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 332cefc..2169a00 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -217,6 +217,9 @@ extern struct ctl_table random_table[];
> #ifdef CONFIG_EPOLL
> extern struct ctl_table epoll_table[];
> #endif
> +#ifdef CONFIG_KSM
> +extern struct ctl_table ksm_table[];
> +#endif
>
> #ifdef HAVE_ARCH_PICK_MMAP_LAYOUT
> int sysctl_legacy_va_layout;
> @@ -1279,6 +1282,13 @@ static struct ctl_table vm_table[] = {
> },
>
> #endif /* CONFIG_COMPACTION */
> +#ifdef CONFIG_KSM
> + {
> + .procname = "ksm",
> + .mode = 0555,
> + .child = ksm_table,
> + },
> +#endif

ksm can be a module, so this won't work.

Can we make those controls proper module parameters instead?

2014-02-25 17:19:52

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] ksm: Expose configuration via sysctl

On Tue, Feb 25, 2014 at 12:15:28PM -0500, Johannes Weiner wrote:
> On Tue, Feb 25, 2014 at 12:28:04AM +0100, Alexander Graf wrote:
> > Configuration of tunables and Linux virtual memory settings has traditionally
> > happened via sysctl. Thanks to that there are well established ways to make
> > sysctl configuration bits persistent (sysctl.conf).
> >
> > KSM introduced a sysfs based configuration path which is not covered by user
> > space persistent configuration frameworks.
> >
> > In order to make life easy for sysadmins, this patch adds all access to all
> > KSM tunables via sysctl as well. That way sysctl.conf works for KSM as well,
> > giving us a streamlined way to make KSM configuration persistent.
> >
> > Reported-by: Sasche Peilicke <[email protected]>
> > Signed-off-by: Alexander Graf <[email protected]>
> > ---
> > kernel/sysctl.c | 10 +++++++
> > mm/ksm.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > 2 files changed, 88 insertions(+), 0 deletions(-)
> >
> > diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> > index 332cefc..2169a00 100644
> > --- a/kernel/sysctl.c
> > +++ b/kernel/sysctl.c
> > @@ -217,6 +217,9 @@ extern struct ctl_table random_table[];
> > #ifdef CONFIG_EPOLL
> > extern struct ctl_table epoll_table[];
> > #endif
> > +#ifdef CONFIG_KSM
> > +extern struct ctl_table ksm_table[];
> > +#endif
> >
> > #ifdef HAVE_ARCH_PICK_MMAP_LAYOUT
> > int sysctl_legacy_va_layout;
> > @@ -1279,6 +1282,13 @@ static struct ctl_table vm_table[] = {
> > },
> >
> > #endif /* CONFIG_COMPACTION */
> > +#ifdef CONFIG_KSM
> > + {
> > + .procname = "ksm",
> > + .mode = 0555,
> > + .child = ksm_table,
> > + },
> > +#endif
>
> ksm can be a module, so this won't work.
>
> Can we make those controls proper module parameters instead?

You can do dynamic sysctl registration and removal. Its its own little
filesystem of sorts.

2014-02-25 17:35:14

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH] ksm: Expose configuration via sysctl

On 02/24/2014 03:28 PM, Alexander Graf wrote:
> Configuration of tunables and Linux virtual memory settings has traditionally
> happened via sysctl. Thanks to that there are well established ways to make
> sysctl configuration bits persistent (sysctl.conf).
>
> KSM introduced a sysfs based configuration path which is not covered by user
> space persistent configuration frameworks.
>
> In order to make life easy for sysadmins, this patch adds all access to all
> KSM tunables via sysctl as well. That way sysctl.conf works for KSM as well,
> giving us a streamlined way to make KSM configuration persistent.

Doesn't this essentially mean "don't use sysfs for configuration"?
Seems like at least /sys/kernel/mm/transparent_hugepage would need the
same treatment.

Couldn't we also (maybe in parallel) just teach the sysctl userspace
about sysfs? This way we don't have to do parallel sysctls and sysfs
for *EVERYTHING* in the kernel:

sysfs.kernel.mm.transparent_hugepage.enabled=enabled

Or do we just say "sysctls are the way to go for anything that might
need to be persistent, don't use sysfs"?

2014-02-25 23:09:31

by Alexander Graf

[permalink] [raw]
Subject: Re: [PATCH] ksm: Expose configuration via sysctl



> Am 26.02.2014 um 01:34 schrieb Dave Hansen <[email protected]>:
>
>> On 02/24/2014 03:28 PM, Alexander Graf wrote:
>> Configuration of tunables and Linux virtual memory settings has traditionally
>> happened via sysctl. Thanks to that there are well established ways to make
>> sysctl configuration bits persistent (sysctl.conf).
>>
>> KSM introduced a sysfs based configuration path which is not covered by user
>> space persistent configuration frameworks.
>>
>> In order to make life easy for sysadmins, this patch adds all access to all
>> KSM tunables via sysctl as well. That way sysctl.conf works for KSM as well,
>> giving us a streamlined way to make KSM configuration persistent.
>
> Doesn't this essentially mean "don't use sysfs for configuration"?
> Seems like at least /sys/kernel/mm/transparent_hugepage would need the
> same treatment.
>
> Couldn't we also (maybe in parallel) just teach the sysctl userspace
> about sysfs? This way we don't have to do parallel sysctls and sysfs
> for *EVERYTHING* in the kernel:
>
> sysfs.kernel.mm.transparent_hugepage.enabled=enabled

It's pretty hard to filter this. We definitely do not want to expose all of sysfs through /proc/sys. But how do we know which files are actual configuration and which ones are dynamic system introspection data?

We could add a filter, but then we can just as well stick with the manual approach I followed here :).

>
> Or do we just say "sysctls are the way to go for anything that might
> need to be persistent, don't use sysfs"?

IMHO there are 2 paths we can take:

1) Admit that using sysfs for configuration is a bad idea, use sysctl instead

2) Invent a streamlined way to set sysfs configuration variables similar to how we can set sysctl values

I'm not really sure which path is nicer. But the sitaution today is not exactly satisfactory. The most common solution to ksm configuration is an init or systemd script that sets the respective config variables.


Alex-

2014-02-25 23:16:34

by Alexander Graf

[permalink] [raw]
Subject: Re: [PATCH] ksm: Expose configuration via sysctl



>> Am 26.02.2014 um 01:19 schrieb Peter Zijlstra <[email protected]>:
>>
>>> On Tue, Feb 25, 2014 at 12:15:28PM -0500, Johannes Weiner wrote:
>>> On Tue, Feb 25, 2014 at 12:28:04AM +0100, Alexander Graf wrote:
>>> Configuration of tunables and Linux virtual memory settings has traditionally
>>> happened via sysctl. Thanks to that there are well established ways to make
>>> sysctl configuration bits persistent (sysctl.conf).
>>>
>>> KSM introduced a sysfs based configuration path which is not covered by user
>>> space persistent configuration frameworks.
>>>
>>> In order to make life easy for sysadmins, this patch adds all access to all
>>> KSM tunables via sysctl as well. That way sysctl.conf works for KSM as well,
>>> giving us a streamlined way to make KSM configuration persistent.
>>>
>>> Reported-by: Sasche Peilicke <[email protected]>
>>> Signed-off-by: Alexander Graf <[email protected]>
>>> ---
>>> kernel/sysctl.c | 10 +++++++
>>> mm/ksm.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> 2 files changed, 88 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
>>> index 332cefc..2169a00 100644
>>> --- a/kernel/sysctl.c
>>> +++ b/kernel/sysctl.c
>>> @@ -217,6 +217,9 @@ extern struct ctl_table random_table[];
>>> #ifdef CONFIG_EPOLL
>>> extern struct ctl_table epoll_table[];
>>> #endif
>>> +#ifdef CONFIG_KSM
>>> +extern struct ctl_table ksm_table[];
>>> +#endif
>>>
>>> #ifdef HAVE_ARCH_PICK_MMAP_LAYOUT
>>> int sysctl_legacy_va_layout;
>>> @@ -1279,6 +1282,13 @@ static struct ctl_table vm_table[] = {
>>> },
>>>
>>> #endif /* CONFIG_COMPACTION */
>>> +#ifdef CONFIG_KSM
>>> + {
>>> + .procname = "ksm",
>>> + .mode = 0555,
>>> + .child = ksm_table,
>>> + },
>>> +#endif
>>
>> ksm can be a module, so this won't work.
>>
>> Can we make those controls proper module parameters instead?
>
> You can do dynamic sysctl registration and removal. Its its own little
> filesystem of sorts.

Hm. Doesn't this open another big can of worms? If we have ksm as a module and our sysctl helpers try to enable ksm on boot, they may not be able to because the module hasn't been loaded yet.

So in that case, we want to always register the sysctl and dynamically load the ksm module when the sysctl gets accessed - similar to how we can do stub devices that load modiles, no?

Alex-

2014-02-25 23:50:50

by Kay Sievers

[permalink] [raw]
Subject: Re: [PATCH] ksm: Expose configuration via sysctl

On Wed, Feb 26, 2014 at 12:16 AM, Alexander Graf <[email protected]> wrote:
>
>
>>> Am 26.02.2014 um 01:19 schrieb Peter Zijlstra <[email protected]>:
>>>
>>>> On Tue, Feb 25, 2014 at 12:15:28PM -0500, Johannes Weiner wrote:
>>>> On Tue, Feb 25, 2014 at 12:28:04AM +0100, Alexander Graf wrote:
>>>> Configuration of tunables and Linux virtual memory settings has traditionally
>>>> happened via sysctl. Thanks to that there are well established ways to make
>>>> sysctl configuration bits persistent (sysctl.conf).
>>>>
>>>> KSM introduced a sysfs based configuration path which is not covered by user
>>>> space persistent configuration frameworks.
>>>>
>>>> In order to make life easy for sysadmins, this patch adds all access to all
>>>> KSM tunables via sysctl as well. That way sysctl.conf works for KSM as well,
>>>> giving us a streamlined way to make KSM configuration persistent.
>>>>
>>>> Reported-by: Sasche Peilicke <[email protected]>
>>>> Signed-off-by: Alexander Graf <[email protected]>
>>>> ---
>>>> kernel/sysctl.c | 10 +++++++
>>>> mm/ksm.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>> 2 files changed, 88 insertions(+), 0 deletions(-)
>>>>
>>>> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
>>>> index 332cefc..2169a00 100644
>>>> --- a/kernel/sysctl.c
>>>> +++ b/kernel/sysctl.c
>>>> @@ -217,6 +217,9 @@ extern struct ctl_table random_table[];
>>>> #ifdef CONFIG_EPOLL
>>>> extern struct ctl_table epoll_table[];
>>>> #endif
>>>> +#ifdef CONFIG_KSM
>>>> +extern struct ctl_table ksm_table[];
>>>> +#endif
>>>>
>>>> #ifdef HAVE_ARCH_PICK_MMAP_LAYOUT
>>>> int sysctl_legacy_va_layout;
>>>> @@ -1279,6 +1282,13 @@ static struct ctl_table vm_table[] = {
>>>> },
>>>>
>>>> #endif /* CONFIG_COMPACTION */
>>>> +#ifdef CONFIG_KSM
>>>> + {
>>>> + .procname = "ksm",
>>>> + .mode = 0555,
>>>> + .child = ksm_table,
>>>> + },
>>>> +#endif
>>>
>>> ksm can be a module, so this won't work.
>>>
>>> Can we make those controls proper module parameters instead?
>>
>> You can do dynamic sysctl registration and removal. Its its own little
>> filesystem of sorts.
>
> Hm. Doesn't this open another big can of worms? If we have ksm as a module and our sysctl helpers try to enable ksm on boot, they may not be able to because the module hasn't been loaded yet.
>
> So in that case, we want to always register the sysctl and dynamically load the ksm module when the sysctl gets accessed - similar to how we can do stub devices that load modiles, no?

The files sysctl tries to write to at bootup need to be all there
right from the start, otherwise there can't be anything to access, and
no way to trigger any module load.

The auto-load stuff in /dev works by userspace creating dead device
nodes with the proper dev_t before the device exists, which is a very
different model.

sysctl is not suitable for any instantiated or conditional data like
loadable module, devices, ... Things need to be there right from the
start, later registered facilities will not be noticed by sysctl in
userspace and are therefore not hooked into system management. If any
values should be applied from userspace, this will not really work
out.

There a network devices configs doing that today, and they don't
really work that well; it's a gigantic stupid hack in userspace to
fiddle with /proc/sys/ when a netdev shows up, no model to copy to any
new facility.

Usually, loadable module parameter need to live in /sys/module/ or
/sys/bus/, where uevents are generated and udev can pick up the event
and apply system configuration.

Kay

2014-02-26 00:03:03

by Kay Sievers

[permalink] [raw]
Subject: Re: [PATCH] ksm: Expose configuration via sysctl

On Tue, Feb 25, 2014 at 6:34 PM, Dave Hansen <[email protected]> wrote:
> On 02/24/2014 03:28 PM, Alexander Graf wrote:
>> Configuration of tunables and Linux virtual memory settings has traditionally
>> happened via sysctl. Thanks to that there are well established ways to make
>> sysctl configuration bits persistent (sysctl.conf).
>>
>> KSM introduced a sysfs based configuration path which is not covered by user
>> space persistent configuration frameworks.
>>
>> In order to make life easy for sysadmins, this patch adds all access to all
>> KSM tunables via sysctl as well. That way sysctl.conf works for KSM as well,
>> giving us a streamlined way to make KSM configuration persistent.
>
> Doesn't this essentially mean "don't use sysfs for configuration"?
> Seems like at least /sys/kernel/mm/transparent_hugepage would need the
> same treatment.
>
> Couldn't we also (maybe in parallel) just teach the sysctl userspace
> about sysfs? This way we don't have to do parallel sysctls and sysfs
> for *EVERYTHING* in the kernel:
>
> sysfs.kernel.mm.transparent_hugepage.enabled=enabled
>
> Or do we just say "sysctls are the way to go for anything that might
> need to be persistent, don't use sysfs"?

Support in sysctl for setting static data in /sys might make sense for
some rare use cases.

It's still not obvious how to handle the dynamic nature of most of the
data that is created by modules, and which data belongs into udev
rules and which in the "sysctl /sys" settings.

Kay

2014-02-26 00:10:44

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH] ksm: Expose configuration via sysctl

On 02/25/2014 03:09 PM, Alexander Graf wrote:
>> Couldn't we also (maybe in parallel) just teach the sysctl userspace
>> about sysfs? This way we don't have to do parallel sysctls and sysfs
>> for *EVERYTHING* in the kernel:
>>
>> sysfs.kernel.mm.transparent_hugepage.enabled=enabled
>
> It's pretty hard to filter this. We definitely do not want to expose all of sysfs through /proc/sys. But how do we know which files are actual configuration and which ones are dynamic system introspection data?
>
> We could add a filter, but then we can just as well stick with the manual approach I followed here :).

Maybe not stick it under /proc/sys, but teach sysctl(8) about them. I
guess at the moment, sysctl says that it's tied to /proc/sys:

> DESCRIPTION
> sysctl is used to modify kernel parameters at runtime. The parameters available are those listed under /proc/sys/. Procfs is required
> for sysctl support in Linux. You can use sysctl to both read and write sysctl data.

But surely that's not set in stone just because the manpage says so. :)

2014-02-26 01:06:27

by Hugh Dickins

[permalink] [raw]
Subject: Re: [PATCH] ksm: Expose configuration via sysctl

On Tue, 25 Feb 2014, Johannes Weiner wrote:
> On Tue, Feb 25, 2014 at 12:28:04AM +0100, Alexander Graf wrote:
> > Configuration of tunables and Linux virtual memory settings has traditionally
> > happened via sysctl. Thanks to that there are well established ways to make
> > sysctl configuration bits persistent (sysctl.conf).
> >
> > KSM introduced a sysfs based configuration path which is not covered by user
> > space persistent configuration frameworks.
> >
> > In order to make life easy for sysadmins, this patch adds all access to all
> > KSM tunables via sysctl as well. That way sysctl.conf works for KSM as well,
> > giving us a streamlined way to make KSM configuration persistent.
>
> ksm can be a module, so this won't work.

That's news to me. Are you writing of some Red Hat patches, or just
misled by the "module_init(ksm_init)" which used the last line of ksm.c?

I don't mind Alex's patch, but I do think the same should be done for
THP as for KSM, and a general solution more attractive than more #ifdefs
one by one. Should a general solution just be in userspace, in sysctl(8)?

Hugh

2014-02-26 08:05:11

by Alexander Graf

[permalink] [raw]
Subject: Re: [PATCH] ksm: Expose configuration via sysctl



> Am 26.02.2014 um 09:05 schrieb Hugh Dickins <[email protected]>:
>
>> On Tue, 25 Feb 2014, Johannes Weiner wrote:
>>> On Tue, Feb 25, 2014 at 12:28:04AM +0100, Alexander Graf wrote:
>>> Configuration of tunables and Linux virtual memory settings has traditionally
>>> happened via sysctl. Thanks to that there are well established ways to make
>>> sysctl configuration bits persistent (sysctl.conf).
>>>
>>> KSM introduced a sysfs based configuration path which is not covered by user
>>> space persistent configuration frameworks.
>>>
>>> In order to make life easy for sysadmins, this patch adds all access to all
>>> KSM tunables via sysctl as well. That way sysctl.conf works for KSM as well,
>>> giving us a streamlined way to make KSM configuration persistent.
>>
>> ksm can be a module, so this won't work.
>
> That's news to me. Are you writing of some Red Hat patches, or just
> misled by the "module_init(ksm_init)" which used the last line of ksm.c?

Ugh, sorry. I should have double-checked this. KSM is bool in Kconfig and so is THP.

>
> I don't mind Alex's patch, but I do think the same should be done for
> THP as for KSM, and a general

I agree.

> solution more attractive than more #ifdefs
> one by one. Should a general

I don't see a good alternative to this.

> solution just be in userspace, in sysctl(8)?

User space needs to have the ability to list available sysctls, so we need to have an enumerable map between sys and sysctl somewhere. Keeping that list close to where the actual files are implemented seems to make sense to me, as it's very hard to miss out on addition and removal of parameters throughout the stack this way. That's why I put it here.


Alex

2014-02-26 15:43:59

by Sven-Haegar Koch

[permalink] [raw]
Subject: Re: [PATCH] ksm: Expose configuration via sysctl

On Tue, 25 Feb 2014, Dave Hansen wrote:

> On 02/25/2014 03:09 PM, Alexander Graf wrote:
> >> Couldn't we also (maybe in parallel) just teach the sysctl userspace
> >> about sysfs? This way we don't have to do parallel sysctls and sysfs
> >> for *EVERYTHING* in the kernel:
> >>
> >> sysfs.kernel.mm.transparent_hugepage.enabled=enabled
> >
> > It's pretty hard to filter this. We definitely do not want to expose all of sysfs through /proc/sys. But how do we know which files are actual configuration and which ones are dynamic system introspection data?
> >
> > We could add a filter, but then we can just as well stick with the manual approach I followed here :).
>
> Maybe not stick it under /proc/sys, but teach sysctl(8) about them. I
> guess at the moment, sysctl says that it's tied to /proc/sys:
>
> > DESCRIPTION
> > sysctl is used to modify kernel parameters at runtime. The parameters available are those listed under /proc/sys/. Procfs is required
> > for sysctl support in Linux. You can use sysctl to both read and write sysctl data.
>
> But surely that's not set in stone just because the manpage says so. :)

What I still don't get is why you need this?

My distribution (Debian) has a sysfsutils package which provides a
/etc/sysfs.conf / /etc/sysfs.d/foo exactly like /etc/sysctl.conf.

Don't other distributions have something like this?

c'ya
sven-haegar

--
Three may keep a secret, if two of them are dead.
- Ben F.

2014-02-26 16:17:24

by Alexander Graf

[permalink] [raw]
Subject: Re: [PATCH] ksm: Expose configuration via sysctl

Sven-Haegar Koch wrote:
> On Tue, 25 Feb 2014, Dave Hansen wrote:
>
>
>> On 02/25/2014 03:09 PM, Alexander Graf wrote:
>>
>>>> Couldn't we also (maybe in parallel) just teach the sysctl userspace
>>>> about sysfs? This way we don't have to do parallel sysctls and sysfs
>>>> for *EVERYTHING* in the kernel:
>>>>
>>>> sysfs.kernel.mm.transparent_hugepage.enabled=enabled
>>>>
>>> It's pretty hard to filter this. We definitely do not want to expose all of sysfs through /proc/sys. But how do we know which files are actual configuration and which ones are dynamic system introspection data?
>>>
>>> We could add a filter, but then we can just as well stick with the manual approach I followed here :).
>>>
>> Maybe not stick it under /proc/sys, but teach sysctl(8) about them. I
>> guess at the moment, sysctl says that it's tied to /proc/sys:
>>
>>
>>> DESCRIPTION
>>> sysctl is used to modify kernel parameters at runtime. The parameters available are those listed under /proc/sys/. Procfs is required
>>> for sysctl support in Linux. You can use sysctl to both read and write sysctl data.
>>>
>> But surely that's not set in stone just because the manpage says so. :)
>>
>
> What I still don't get is why you need this?
>
> My distribution (Debian) has a sysfsutils package which provides a
> /etc/sysfs.conf / /etc/sysfs.d/foo exactly like /etc/sysctl.conf.
>
> Don't other distributions have something like this?
>

Maybe that's the right answer to the problem, but I still don't
understand why these properties were put into sysfs in the first place.
We're not configuring a dynamic device here, are we?

Also if we do want something like a sysfs.conf and sysfs.d, that should
probably be something that gets properly coordinated between
distributions so that users don't get completely confused. Today
openSUSE does not have a sysfs.conf/.d provided by the sysfsutils
package. Maybe it's something homegrown?


Alex