2023-04-11 18:41:16

by Wen Yang

Subject: [PATCH] mm: compaction: optimize compact_memory to comply with the admin-guide

From: Wen Yang <[email protected]>

For the /proc/sys/vm/compact_memory file, the admin-guide states:
When 1 is written to the file, all zones are compacted such that free
memory is available in contiguous blocks where possible. This can be
important for example in the allocation of huge pages although processes
will also directly compact memory as required

But it was not strictly followed, writing any value would cause all
zones to be compacted. In some critical scenarios, some applications
operating it, such as echo 0, have caused serious problems.

It has been slightly optimized to comply with the admin-guide.

Signed-off-by: Wen Yang <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: William Lam <[email protected]>
Cc: Fu Wei <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
 mm/compaction.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/mm/compaction.c b/mm/compaction.c
index c8bcdea15f5f..3c4aa533d61c 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -2780,6 +2780,17 @@ static int compaction_proactiveness_sysctl_handler(struct ctl_table *table, int
 static int sysctl_compaction_handler(struct ctl_table *table, int write,
 			void *buffer, size_t *length, loff_t *ppos)
 {
+	struct ctl_table t;
+	int compact;
+	int ret;
+
+	t = *table;
+	t.data = &compact;
+
+	ret = proc_dointvec_minmax(&t, write, buffer, length, ppos);
+	if (ret)
+		return ret;
+
 	if (write)
 		compact_nodes();

@@ -3099,6 +3110,8 @@ static struct ctl_table vm_compaction[] = {
 		.maxlen		= sizeof(int),
 		.mode		= 0200,
 		.proc_handler	= sysctl_compaction_handler,
+		.extra1		= SYSCTL_ONE,
+		.extra2		= SYSCTL_ONE,
 	},
 	{
 		.procname	= "compaction_proactiveness",
--
2.37.2
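
The effect of the range check in the patch above can be illustrated with a small user-space model. This is only a sketch: `struct compact_model` and both `model_handler_*` functions are hypothetical names that do not exist in the kernel, and the model assumes that an out-of-range value is rejected with -EINVAL before compaction is triggered.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct compact_model {
	int min;		/* plays the role of .extra1 (SYSCTL_ONE) */
	int max;		/* plays the role of .extra2 (SYSCTL_ONE) */
	int compactions;	/* how often compaction was "triggered" */
};

/* Behaviour before the patch: any write triggers compaction. */
int model_handler_old(struct compact_model *m, bool write, int value)
{
	assert(m != NULL);
	(void)value;		/* the written value is never inspected */
	if (write)
		m->compactions++;
	return 0;
}

/* Behaviour after the patch (assumed): range-check first, then compact. */
int model_handler_new(struct compact_model *m, bool write, int value)
{
	assert(m != NULL);
	if (write && (value < m->min || value > m->max))
		return -EINVAL;	/* rejected, compaction is not triggered */
	if (write)
		m->compactions++;
	return 0;
}
```

With `min` and `max` both set to 1, writing 0 bumps the counter in the old model but returns -EINVAL in the new one.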


2023-04-11 20:49:10

by Andrew Morton

Subject: Re: [PATCH] mm: compaction: optimize compact_memory to comply with the admin-guide

On Wed, 12 Apr 2023 02:24:26 +0800 [email protected] wrote:

> For the /proc/sys/vm/compact_memory file, the admin-guide states:
> When 1 is written to the file, all zones are compacted such that free
> memory is available in contiguous blocks where possible. This can be
> important for example in the allocation of huge pages although processes
> will also directly compact memory as required
>
> But it was not strictly followed, writing any value would cause all
> zones to be compacted. In some critical scenarios, some applications
> operating it, such as echo 0, have caused serious problems.

Really? You mean someone actually did this and didn't observe the
effect during their testing?

> It has been slightly optimized to comply with the admin-guide.

2023-04-12 17:10:07

by Wen Yang

Subject: Re: [PATCH] mm: compaction: optimize compact_memory to comply with the admin-guide


On 2023/4/12 04:48, Andrew Morton wrote:
> On Wed, 12 Apr 2023 02:24:26 +0800 [email protected] wrote:
>
>> For the /proc/sys/vm/compact_memory file, the admin-guide states:
>> When 1 is written to the file, all zones are compacted such that free
>> memory is available in contiguous blocks where possible. This can be
>> important for example in the allocation of huge pages although processes
>> will also directly compact memory as required
>>
>> But it was not strictly followed, writing any value would cause all
>> zones to be compacted. In some critical scenarios, some applications
>> operating it, such as echo 0, have caused serious problems.
> Really? You mean someone actually did this and didn't observe the
> effect during their testing?

Thanks for your reply.

Since /proc/sys/vm/compact_memory has been well documented for over a
decade:

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/Documentation/admin-guide/sysctl/vm.rst#n109

it is believed that only writing 1 will trigger all zones to be
compacted.

Especially for those who write applications, they may only focus on
documentation and generally do not read kernel code.  Moreover, such
problems are not easily detected through testing on low pressure machines.

Writing any meaningful or meaningless values will trigger it and affect
the entire server:

# echo 1 > /proc/sys/vm/compact_memory
# echo 0 > /proc/sys/vm/compact_memory
# echo dead > /proc/sys/vm/compact_memory
# echo "hello world" > /proc/sys/vm/compact_memory

The implementation of this high-risk operation may require following the
admin-guides.
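
As an illustration only, the four writes listed above can be run through a strict user-space check that accepts nothing but the integer 1. The helper `accept_compact_request` is a hypothetical name, and the parsing rules (decimal number, optional surrounding whitespace) are assumptions for this sketch rather than a description of how the kernel parses sysctl input.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return 0 if `input` is exactly the decimal
 * integer 1, optionally surrounded by whitespace (such as the newline
 * that echo appends), and -EINVAL for anything else.
 */
int accept_compact_request(const char *input)
{
	char *end;
	long value;

	assert(input != NULL);

	errno = 0;
	value = strtol(input, &end, 10);
	if (end == input || errno == ERANGE)
		return -EINVAL;		/* no digits at all, or overflow */

	while (isspace((unsigned char)*end))
		end++;
	if (*end != '\0')
		return -EINVAL;		/* trailing junk after the number */

	return value == 1 ? 0 : -EINVAL;
}
```

Under these assumptions only the first of the four commands would be accepted; "0", "dead" and "hello world" would all be rejected.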

--

Best wishes,

Wen


>> It has been slightly optimized to comply with the admin-guide.

2023-04-15 17:46:58

by Wen Yang

Subject: Re: [PATCH] mm: compaction: optimize compact_memory to comply with the admin-guide


On 2023/4/13 00:54, Wen Yang wrote:
>
> On 2023/4/12 04:48, Andrew Morton wrote:
>> On Wed, 12 Apr 2023 02:24:26 +0800 [email protected] wrote:
>>
>>> For the /proc/sys/vm/compact_memory file, the admin-guide states:
>>> When 1 is written to the file, all zones are compacted such that free
>>> memory is available in contiguous blocks where possible. This can be
>>> important for example in the allocation of huge pages although
>>> processes
>>> will also directly compact memory as required
>>>
>>> But it was not strictly followed, writing any value would cause all
>>> zones to be compacted. In some critical scenarios, some applications
>>> operating it, such as echo 0, have caused serious problems.
>> Really?  You mean someone actually did this and didn't observe the
>> effect during their testing?
>
> Thanks for your reply.
>
> Since /proc/sys/vm/compact_memory has been well documented for over a
> decade:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/Documentation/admin-guide/sysctl/vm.rst#n109
>
>
> it is believed that only writing 1 will trigger all zones to
> be compacted.
>
> Especially for those who write applications, they may only focus on
> documentation and generally do not read kernel code.  Moreover, such
> problems are not easily detected through testing on low pressure
> machines.
>
> Writing any meaningful or meaningless values will trigger it and
> affect the entire server:
>
> # echo 1 > /proc/sys/vm/compact_memory
> # echo 0 > /proc/sys/vm/compact_memory
> # echo dead > /proc/sys/vm/compact_memory
> # echo "hello world" > /proc/sys/vm/compact_memory
>
> The implementation of this high-risk operation may require following
> the admin-guides.
>
> --
>
> Best wishes,
>
> Wen
>
>
Hello, do you think it's better to optimize the
sysctl_compaction_handler code or update the admin-guide document?

--

Best wishes,

Wen

>>> It has been slightly optimized to comply with the admin-guide.
>

2023-04-17 11:36:03

by Mel Gorman

Subject: Re: [PATCH] mm: compaction: optimize compact_memory to comply with the admin-guide

On Sun, Apr 16, 2023 at 01:42:44AM +0800, Wen Yang wrote:
>
> On 2023/4/13 00:54, Wen Yang wrote:
> >
> > On 2023/4/12 04:48, Andrew Morton wrote:
> > > On Wed, 12 Apr 2023 02:24:26 +0800 [email protected] wrote:
> > >
> > > > For the /proc/sys/vm/compact_memory file, the admin-guide states:
> > > > When 1 is written to the file, all zones are compacted such that free
> > > > memory is available in contiguous blocks where possible. This can be
> > > > important for example in the allocation of huge pages although
> > > > processes
> > > > will also directly compact memory as required
> > > >
> > > > But it was not strictly followed, writing any value would cause all
> > > > zones to be compacted. In some critical scenarios, some applications
> > > > operating it, such as echo 0, have caused serious problems.
> > > Really?  You mean someone actually did this and didn't observe the
> > > effect during their testing?
> >
> > Thanks for your reply.
> >
> > Since /proc/sys/vm/compact_memory has been well documented for over a
> > decade:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/Documentation/admin-guide/sysctl/vm.rst#n109
> >
> >
> > it is believed that only writing 1 will trigger all zones to be
> > compacted.
> >
> > Especially for those who write applications, they may only focus on
> > documentation and generally do not read kernel code.  Moreover, such
> > problems are not easily detected through testing on low pressure
> > machines.
> >
> > Writing any meaningful or meaningless values will trigger it and affect
> > the entire server:
> >
> > # echo 1 > /proc/sys/vm/compact_memory
> > # echo 0 > /proc/sys/vm/compact_memory
> > # echo dead > /proc/sys/vm/compact_memory
> > # echo "hello world" > /proc/sys/vm/compact_memory
> >
> > The implementation of this high-risk operation may require following the
> > admin-guides.
> >
> > --
> >
> > Best wishes,
> >
> > Wen
> >
> >
> Hello, do you think it's better to optimize the sysctl_compaction_handler
> code or update the admin-guide document?
>

Enforce the 1 on the unlikely chance that the sysctl handler is ever
extended to do something different and expects a bitmask. The original
intent of the sysctl was debugging -- demonstrating a contiguous
allocation failure when aggressive compaction should have succeeded. Later
some machines dedicated to batch jobs used the compaction sysctl to compact
memory before a new job started to reduce startup latencies.

Drop the justification "In some critical scenarios, some applications
operating it, such as echo 0, have caused serious problems." from the
changelog. I cannot imagine a sane "critical scenario" where an application
running as root is writing expected garbage to proc or sysfs files and
then surprised when something unexpected happens.
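
The batch-job use mentioned above (compacting memory before a new job starts) might look roughly like the following from a launcher written in C. This is a sketch under assumptions: `write_control_file` and `compact_before_job` are hypothetical helpers, the caller is assumed to run as root, and with the value check enforced any value other than "1" would be expected to fail.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `value` to the file at `path` with a
 * single write(2). Returns 0 on success or a negative errno value.
 */
int write_control_file(const char *path, const char *value)
{
	size_t len;
	ssize_t written;
	int fd, err = 0;

	assert(path != NULL && value != NULL);
	len = strlen(value);

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	written = write(fd, value, len);
	if (written < 0)
		err = -errno;		/* e.g. the handler rejected the value */
	else if ((size_t)written != len)
		err = -EIO;		/* short write, treated as an error here */

	if (close(fd) < 0 && err == 0)
		err = -errno;
	return err;
}

/* Ask for all zones to be compacted before a job starts (needs root). */
int compact_before_job(void)
{
	return write_control_file("/proc/sys/vm/compact_memory", "1");
}
```

Checking the return value lets the launcher notice a rejected write instead of assuming that compaction ran.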

--
Mel Gorman
SUSE Labs

2023-04-18 14:14:54

by Wen Yang

Subject: Re: [PATCH] mm: compaction: optimize compact_memory to comply with the admin-guide


On 2023/4/17 19:13, Mel Gorman wrote:
> On Sun, Apr 16, 2023 at 01:42:44AM +0800, Wen Yang wrote:
>> On 2023/4/13 00:54, Wen Yang wrote:
>>> On 2023/4/12 04:48, Andrew Morton wrote:
>>>> On Wed, 12 Apr 2023 02:24:26 +0800 [email protected] wrote:
>>>>
>>>>> For the /proc/sys/vm/compact_memory file, the admin-guide states:
>>>>> When 1 is written to the file, all zones are compacted such that free
>>>>> memory is available in contiguous blocks where possible. This can be
>>>>> important for example in the allocation of huge pages although
>>>>> processes
>>>>> will also directly compact memory as required
>>>>>
>>>>> But it was not strictly followed, writing any value would cause all
>>>>> zones to be compacted. In some critical scenarios, some applications
>>>>> operating it, such as echo 0, have caused serious problems.
>>>> Really?  You mean someone actually did this and didn't observe the
>>>> effect during their testing?
>>> Thanks for your reply.
>>>
>>> Since /proc/sys/vm/compact_memory has been well documented for over a
>>> decade:
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/Documentation/admin-guide/sysctl/vm.rst#n109
>>>
>>>
>>> it is believed that only writing 1 will trigger all zones to be
>>> compacted.
>>>
>>> Especially for those who write applications, they may only focus on
>>> documentation and generally do not read kernel code.  Moreover, such
>>> problems are not easily detected through testing on low pressure
>>> machines.
>>>
>>> Writing any meaningful or meaningless values will trigger it and affect
>>> the entire server:
>>>
>>> # echo 1 > /proc/sys/vm/compact_memory
>>> # echo 0 > /proc/sys/vm/compact_memory
>>> # echo dead > /proc/sys/vm/compact_memory
>>> # echo "hello world" > /proc/sys/vm/compact_memory
>>>
>>> The implementation of this high-risk operation may require following the
>>> admin-guides.
>>>
>>> --
>>>
>>> Best wishes,
>>>
>>> Wen
>>>
>>>
>> Hello, do you think it's better to optimize the sysctl_compaction_handler
>> code or update the admin-guide document?
>>
> Enforce the 1 on the unlikely chance that the sysctl handler is ever
> extended to do something different and expects a bitmask. The original
> intent of the sysctl was debugging -- demonstrating a contiguous
> allocation failure when aggressive compaction should have succeeded. Later
> some machines dedicated to batch jobs used the compaction sysctl to compact
> memory before a new job started to reduce startup latencies.
>
> Drop the justification "In some critical scenarios, some applications
> operating it, such as echo 0, have caused serious problems." from the
> changelog. I cannot imagine a sane "critical scenario" where an application
> running as root is writing expected garbage to proc or sysfs files and
> then surprised when something unexpected happens.
>
Thanks for your comments.

We will modify it according to your suggestion and then send v2.


--

Best wishes,

Wen