2020-07-02 07:13:42

by Feng Tang

Subject: Re: [mm] 4e2c82a409: ltp.overcommit_memory01.fail

Hi,

On Thu, Jul 02, 2020 at 02:32:01PM +0800, kernel test robot wrote:
> Greeting,
>
> FYI, we noticed the following commit (built with gcc-9):
>
> commit: 4e2c82a40911c19419349918e675aa202b113b4d ("[PATCH v5 3/3] mm: adjust vm_committed_as_batch according to vm overcommit policy")
> url: https://github.com/0day-ci/linux/commits/Feng-Tang/make-vm_committed_as_batch-aware-of-vm-overcommit-policy/20200621-153906
>
>
> in testcase: ltp
> with following parameters:
>
> disk: 1HDD
> test: mm-01
>
> test-description: The LTP testsuite contains a collection of tools for testing the Linux kernel and related features.
> test-url: http://linux-test-project.github.io/
>
>
> on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
>
> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>
>
>
>
> If you fix the issue, kindly add following tag
> Reported-by: kernel test robot <[email protected]>
>
>
>
> <<<test_start>>>
> tag=overcommit_memory01 stime=1593425044
> cmdline="overcommit_memory"
> contacts=""
> analysis=exit
> <<<test_output>>>
> tst_test.c:1247: INFO: Timeout per run is 0h 05m 00s
> overcommit_memory.c:116: INFO: MemTotal is 16394252 kB
> overcommit_memory.c:118: INFO: SwapTotal is 268435452 kB
> overcommit_memory.c:122: INFO: CommitLimit is 276632576 kB
> mem.c:817: INFO: set overcommit_ratio to 50
> mem.c:817: INFO: set overcommit_memory to 2
> overcommit_memory.c:187: INFO: malloc 551061440 kB failed
> overcommit_memory.c:208: PASS: alloc failed as expected
> overcommit_memory.c:183: INFO: malloc 276632576 kB successfully
> overcommit_memory.c:210: FAIL: alloc passed, expected to fail

Thanks for the report!

I took a rough look, and it all happens after changing the
overcommit policy from a looser one to OVERCOMMIT_NEVER. I suspect
it is due to the same cause as the previous warning message reported
by Qian Cai https://lore.kernel.org/lkml/[email protected]/

Will further check it.

Thanks,
Feng

> overcommit_memory.c:183: INFO: malloc 137765294 kB successfully
> overcommit_memory.c:202: PASS: alloc passed as expected
> mem.c:817: INFO: set overcommit_memory to 0
> overcommit_memory.c:183: INFO: malloc 140770308 kB successfully
> overcommit_memory.c:202: PASS: alloc passed as expected
> overcommit_memory.c:187: INFO: malloc 569659408 kB failed
> overcommit_memory.c:208: PASS: alloc failed as expected
> mem.c:817: INFO: set overcommit_memory to 1
> overcommit_memory.c:183: INFO: malloc 142414852 kB successfully
> overcommit_memory.c:202: PASS: alloc passed as expected
> overcommit_memory.c:183: INFO: malloc 284829704 kB successfully
> overcommit_memory.c:202: PASS: alloc passed as expected
> overcommit_memory.c:183: INFO: malloc 569659408 kB successfully
> overcommit_memory.c:202: PASS: alloc passed as expected
> mem.c:817: INFO: set overcommit_memory to 0
> mem.c:817: INFO: set overcommit_ratio to 50
>
> Summary:
> passed 7
> failed 1
> skipped 0
> warnings 0
> <<<execution_status>>>
> initiation_status="ok"
> duration=0 termination_type=exited termination_id=1 corefile=no
> cutime=0 cstime=1
> <<<test_end>>>
> <<<test_start>>>
> tag=overcommit_memory02 stime=1593425044
> cmdline="overcommit_memory -R 0"
> contacts=""
> analysis=exit
> <<<test_output>>>
> tst_test.c:1247: INFO: Timeout per run is 0h 05m 00s
> overcommit_memory.c:116: INFO: MemTotal is 16394252 kB
> overcommit_memory.c:118: INFO: SwapTotal is 268435452 kB
> overcommit_memory.c:122: INFO: CommitLimit is 276632576 kB
> mem.c:817: INFO: set overcommit_ratio to 0
> mem.c:817: INFO: set overcommit_memory to 2
> overcommit_memory.c:187: INFO: malloc 534667184 kB failed
> overcommit_memory.c:208: PASS: alloc failed as expected
> overcommit_memory.c:183: INFO: malloc 268435452 kB successfully
> overcommit_memory.c:210: FAIL: alloc passed, expected to fail
> overcommit_memory.c:183: INFO: malloc 133666730 kB successfully
> overcommit_memory.c:202: PASS: alloc passed as expected
> mem.c:817: INFO: set overcommit_memory to 0
> overcommit_memory.c:183: INFO: malloc 140770304 kB successfully
> overcommit_memory.c:202: PASS: alloc passed as expected
> overcommit_memory.c:187: INFO: malloc 569659408 kB failed
> overcommit_memory.c:208: PASS: alloc failed as expected
> mem.c:817: INFO: set overcommit_memory to 1
> overcommit_memory.c:183: INFO: malloc 142414852 kB successfully
> overcommit_memory.c:202: PASS: alloc passed as expected
> overcommit_memory.c:183: INFO: malloc 284829704 kB successfully
> overcommit_memory.c:202: PASS: alloc passed as expected
> overcommit_memory.c:183: INFO: malloc 569659408 kB successfully
> overcommit_memory.c:202: PASS: alloc passed as expected
> mem.c:817: INFO: set overcommit_memory to 0
> mem.c:817: INFO: set overcommit_ratio to 50
>


2020-07-05 03:23:55

by Qian Cai

Subject: Re: [mm] 4e2c82a409: ltp.overcommit_memory01.fail

On Thu, Jul 02, 2020 at 03:12:30PM +0800, Feng Tang wrote:
> Hi,
>
> On Thu, Jul 02, 2020 at 02:32:01PM +0800, kernel test robot wrote:
> > Greeting,
> >
> > FYI, we noticed the following commit (built with gcc-9):
> >
> > commit: 4e2c82a40911c19419349918e675aa202b113b4d ("[PATCH v5 3/3] mm: adjust vm_committed_as_batch according to vm overcommit policy")
> > url: https://github.com/0day-ci/linux/commits/Feng-Tang/make-vm_committed_as_batch-aware-of-vm-overcommit-policy/20200621-153906
> >
> >
> > in testcase: ltp
> > with following parameters:
> >
> > disk: 1HDD
> > test: mm-01
> >
> > test-description: The LTP testsuite contains a collection of tools for testing the Linux kernel and related features.
> > test-url: http://linux-test-project.github.io/
> >
> >
> > on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> >
> > caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
> >
> >
> >
> >
> > If you fix the issue, kindly add following tag
> > Reported-by: kernel test robot <[email protected]>
> >
> >
> >
> > <<<test_start>>>
> > tag=overcommit_memory01 stime=1593425044
> > cmdline="overcommit_memory"
> > contacts=""
> > analysis=exit
> > <<<test_output>>>
> > tst_test.c:1247: INFO: Timeout per run is 0h 05m 00s
> > overcommit_memory.c:116: INFO: MemTotal is 16394252 kB
> > overcommit_memory.c:118: INFO: SwapTotal is 268435452 kB
> > overcommit_memory.c:122: INFO: CommitLimit is 276632576 kB
> > mem.c:817: INFO: set overcommit_ratio to 50
> > mem.c:817: INFO: set overcommit_memory to 2
> > overcommit_memory.c:187: INFO: malloc 551061440 kB failed
> > overcommit_memory.c:208: PASS: alloc failed as expected
> > overcommit_memory.c:183: INFO: malloc 276632576 kB successfully
> > overcommit_memory.c:210: FAIL: alloc passed, expected to fail
>
> Thanks for the report!
>
> I took a rough look, and it all happens after changing the
> overcommit policy from a looser one to OVERCOMMIT_NEVER. I suspect
> it is due to the same cause as the previous warning message reported
> by Qian Cai https://lore.kernel.org/lkml/[email protected]/

Hmm, the LTP test [1] looks like a faithful implementation of
Documentation/vm/overcommit-accounting.rst, and it is now failing because
of this patchset.

Also, it was a mistake to merge c571686a92ff ("mm/util.c: remove the
VM_WARN_ONCE for vm_committed_as underflow check") separately (I take the
blame for ACKing it separately and forgetting to run those tests to
double-check earlier), which now makes me wonder whether that VM_WARN_ONCE
was actually legitimate to catch the sign of brokenness in the first place.

[1] https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/tunable/overcommit_memory.c
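
For reference, the core check that is failing boils down to something
like the sketch below (illustrative only, not the LTP source; the real
test [1] reads CommitLimit from /proc/meminfo and covers the other
policies too):

#include <stdio.h>
#include <stdlib.h>

/* Try to allocate 'kb' kilobytes and compare against the expectation. */
static int try_alloc(size_t kb, int expect_success)
{
	void *p = malloc(kb * 1024);
	int ok = (!!p == expect_success);

	if (!ok)
		fprintf(stderr, "alloc of %zu kB: expected %s\n",
			kb, expect_success ? "success" : "failure");
	free(p);
	return ok ? 0 : 1;
}

int main(void)
{
	/* Hypothetical value; the real test reads it from /proc/meminfo
	 * after 'sysctl -w vm.overcommit_memory=2'. */
	size_t commit_limit_kb = 276632576;
	int fails = 0;

	fails += try_alloc(commit_limit_kb * 2, 0);	/* far over the limit */
	fails += try_alloc(commit_limit_kb, 0);		/* the case failing above */
	fails += try_alloc(commit_limit_kb / 2, 1);	/* safely under the limit */
	return fails;
}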

>
> Will further check it.
>
> Thanks,
> Feng
>
> > overcommit_memory.c:183: INFO: malloc 137765294 kB successfully
> > overcommit_memory.c:202: PASS: alloc passed as expected
> > mem.c:817: INFO: set overcommit_memory to 0
> > overcommit_memory.c:183: INFO: malloc 140770308 kB successfully
> > overcommit_memory.c:202: PASS: alloc passed as expected
> > overcommit_memory.c:187: INFO: malloc 569659408 kB failed
> > overcommit_memory.c:208: PASS: alloc failed as expected
> > mem.c:817: INFO: set overcommit_memory to 1
> > overcommit_memory.c:183: INFO: malloc 142414852 kB successfully
> > overcommit_memory.c:202: PASS: alloc passed as expected
> > overcommit_memory.c:183: INFO: malloc 284829704 kB successfully
> > overcommit_memory.c:202: PASS: alloc passed as expected
> > overcommit_memory.c:183: INFO: malloc 569659408 kB successfully
> > overcommit_memory.c:202: PASS: alloc passed as expected
> > mem.c:817: INFO: set overcommit_memory to 0
> > mem.c:817: INFO: set overcommit_ratio to 50
> >
> > Summary:
> > passed 7
> > failed 1
> > skipped 0
> > warnings 0
> > <<<execution_status>>>
> > initiation_status="ok"
> > duration=0 termination_type=exited termination_id=1 corefile=no
> > cutime=0 cstime=1
> > <<<test_end>>>
> > <<<test_start>>>
> > tag=overcommit_memory02 stime=1593425044
> > cmdline="overcommit_memory -R 0"
> > contacts=""
> > analysis=exit
> > <<<test_output>>>
> > tst_test.c:1247: INFO: Timeout per run is 0h 05m 00s
> > overcommit_memory.c:116: INFO: MemTotal is 16394252 kB
> > overcommit_memory.c:118: INFO: SwapTotal is 268435452 kB
> > overcommit_memory.c:122: INFO: CommitLimit is 276632576 kB
> > mem.c:817: INFO: set overcommit_ratio to 0
> > mem.c:817: INFO: set overcommit_memory to 2
> > overcommit_memory.c:187: INFO: malloc 534667184 kB failed
> > overcommit_memory.c:208: PASS: alloc failed as expected
> > overcommit_memory.c:183: INFO: malloc 268435452 kB successfully
> > overcommit_memory.c:210: FAIL: alloc passed, expected to fail
> > overcommit_memory.c:183: INFO: malloc 133666730 kB successfully
> > overcommit_memory.c:202: PASS: alloc passed as expected
> > mem.c:817: INFO: set overcommit_memory to 0
> > overcommit_memory.c:183: INFO: malloc 140770304 kB successfully
> > overcommit_memory.c:202: PASS: alloc passed as expected
> > overcommit_memory.c:187: INFO: malloc 569659408 kB failed
> > overcommit_memory.c:208: PASS: alloc failed as expected
> > mem.c:817: INFO: set overcommit_memory to 1
> > overcommit_memory.c:183: INFO: malloc 142414852 kB successfully
> > overcommit_memory.c:202: PASS: alloc passed as expected
> > overcommit_memory.c:183: INFO: malloc 284829704 kB successfully
> > overcommit_memory.c:202: PASS: alloc passed as expected
> > overcommit_memory.c:183: INFO: malloc 569659408 kB successfully
> > overcommit_memory.c:202: PASS: alloc passed as expected
> > mem.c:817: INFO: set overcommit_memory to 0
> > mem.c:817: INFO: set overcommit_ratio to 50
> >
>

2020-07-05 04:47:02

by Feng Tang

Subject: Re: [mm] 4e2c82a409: ltp.overcommit_memory01.fail

On Thu, Jul 02, 2020 at 03:12:30PM +0800, Feng Tang wrote:
> > <<<test_start>>>
> > tag=overcommit_memory01 stime=1593425044
> > cmdline="overcommit_memory"
> > contacts=""
> > analysis=exit
> > <<<test_output>>>
> > tst_test.c:1247: INFO: Timeout per run is 0h 05m 00s
> > overcommit_memory.c:116: INFO: MemTotal is 16394252 kB
> > overcommit_memory.c:118: INFO: SwapTotal is 268435452 kB
> > overcommit_memory.c:122: INFO: CommitLimit is 276632576 kB
> > mem.c:817: INFO: set overcommit_ratio to 50
> > mem.c:817: INFO: set overcommit_memory to 2
> > overcommit_memory.c:187: INFO: malloc 551061440 kB failed
> > overcommit_memory.c:208: PASS: alloc failed as expected
> > overcommit_memory.c:183: INFO: malloc 276632576 kB successfully
> > overcommit_memory.c:210: FAIL: alloc passed, expected to fail
>
> Thanks for the report!
>
> I took a rough look, and it all happens after changing the
> overcommit policy from a looser one to OVERCOMMIT_NEVER. I suspect
> it is due to the same cause as the previous warning message reported
> by Qian Cai https://lore.kernel.org/lkml/[email protected]/
>
> Will further check it.

I did reproduce the problem, and from the debugging, this should have
the same root cause as https://lore.kernel.org/lkml/[email protected]/:
loosening the batch causes some accuracy problems, and the solution of
adding some sync is still needed, as discussed there.

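To get a feel for how big the deviation can be before the policy switch,
here is a back-of-envelope sketch (both numbers are purely illustrative,
not measured values from the report):

#include <stdio.h>

int main(void)
{
	/* percpu_counter_read() can be off by up to batch * nr_cpus pages,
	 * as each CPU may hold up to one batch worth of delta locally. */
	long batch_pages = 16384;	/* hypothetical enlarged batch */
	long nr_cpus = 72;
	long dev_pages = batch_pages * nr_cpus;

	printf("max deviation: %ld pages (~%ld MB with 4 kB pages)\n",
	       dev_pages, dev_pages * 4 / 1024);
	return 0;
}
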
First, I tried a simple patch using percpu_counter_sum(), and the
problem could not be reproduced:

--- a/mm/util.c
+++ b/mm/util.c
@@ -845,7 +845,7 @@ int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin)
allowed -= min_t(long, mm->total_vm / 32, reserve);
}

- if (percpu_counter_read_positive(&vm_committed_as) < allowed)
+ if (percpu_counter_sum(&vm_committed_as) < allowed)
return 0;
error:
vm_unacct_memory(pages);


Then I tried the sync patch we discussed one month ago:
https://lore.kernel.org/lkml/[email protected]/
With it, I ran the case 200 times and the problem was not reproduced.
Can we consider taking this patch?

Thanks,
Feng

diff --git a/include/linux/percpu_counter.h b/include/linux/percpu_counter.h
index 0a4f54d..01861ee 100644
--- a/include/linux/percpu_counter.h
+++ b/include/linux/percpu_counter.h
@@ -44,6 +44,7 @@ void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount,
s32 batch);
s64 __percpu_counter_sum(struct percpu_counter *fbc);
int __percpu_counter_compare(struct percpu_counter *fbc, s64 rhs, s32 batch);
+void percpu_counter_sync(struct percpu_counter *fbc);

static inline int percpu_counter_compare(struct percpu_counter *fbc, s64 rhs)
{
@@ -172,6 +173,9 @@ static inline bool percpu_counter_initialized(struct percpu_counter *fbc)
return true;
}

+static inline void percpu_counter_sync(struct percpu_counter *fbc)
+{
+}
#endif /* CONFIG_SMP */

static inline void percpu_counter_inc(struct percpu_counter *fbc)
diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c
index a66595b..d025137 100644
--- a/lib/percpu_counter.c
+++ b/lib/percpu_counter.c
@@ -98,6 +98,20 @@ void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount, s32 batch)
}
EXPORT_SYMBOL(percpu_counter_add_batch);

+void percpu_counter_sync(struct percpu_counter *fbc)
+{
+ unsigned long flags;
+ s64 count;
+
+ raw_spin_lock_irqsave(&fbc->lock, flags);
+ count = __this_cpu_read(*fbc->counters);
+ fbc->count += count;
+ __this_cpu_sub(*fbc->counters, count);
+ raw_spin_unlock_irqrestore(&fbc->lock, flags);
+}
+EXPORT_SYMBOL(percpu_counter_sync);
+
+
/*
* Add up all the per-cpu counts, return the result. This is a more accurate
* but much slower version of percpu_counter_read_positive()
diff --git a/mm/util.c b/mm/util.c
index 98813da..8b9664e 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -746,14 +746,23 @@ int overcommit_ratio_handler(struct ctl_table *table, int write, void *buffer,
return ret;
}

+static void sync_overcommit_as(struct work_struct *dummy)
+{
+ percpu_counter_sync(&vm_committed_as);
+}
+
int overcommit_policy_handler(struct ctl_table *table, int write, void *buffer,
size_t *lenp, loff_t *ppos)
{
int ret;

ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
- if (ret == 0 && write)
+ if (ret == 0 && write) {
+ if (sysctl_overcommit_memory == OVERCOMMIT_NEVER)
+ schedule_on_each_cpu(sync_overcommit_as);
+
mm_compute_batch();
+ }

pr_info("ocommit=%lld, real=%lld policy[%d] ratio=%d\n\n\n",
percpu_counter_read_positive(&vm_committed_as),


2020-07-05 12:18:28

by Qian Cai

Subject: Re: [mm] 4e2c82a409: ltp.overcommit_memory01.fail



> On Jul 5, 2020, at 12:45 AM, Feng Tang <[email protected]> wrote:
>
> I did reproduce the problem, and from the debugging, this should have
> the same root cause as lore.kernel.org/lkml/[email protected]/:
> loosening the batch causes some accuracy problems, and the solution of
> adding some sync is still needed, as discussed there.

Well, before taking any of those patches to fix the regression, we will need some performance data first. If it turns out the original performance gain is no longer relevant due to this regression fix on top, it is best to drop this patchset and restore that VM_WARN_ONCE, so you can retry later once you find a better way to optimize.

2020-07-05 13:01:52

by Feng Tang

Subject: Re: [mm] 4e2c82a409: ltp.overcommit_memory01.fail

On Sun, Jul 05, 2020 at 08:15:03AM -0400, Qian Cai wrote:
>
>
> > On Jul 5, 2020, at 12:45 AM, Feng Tang <[email protected]> wrote:
> >
> > I did reproduce the problem, and from the debugging, this should have
> > the same root cause as lore.kernel.org/lkml/[email protected]/:
> > loosening the batch causes some accuracy problems, and the solution of
> > adding some sync is still needed, as discussed there.
>
> Well, before taking any of those patches to fix the regression, we will need some performance data first. If it turns out the original performance gain is no longer relevant due to this regression fix on top, it is best to drop this patchset and restore that VM_WARN_ONCE, so you can retry later once you find a better way to optimize.

The fix of adding sync only happens when the memory policy is being
changed to OVERCOMMIT_NEVER, which is not a frequent operation in
normal cases.

For the performance improvement data both in the commit log and the 0day
report https://lore.kernel.org/lkml/20200622132548.GS5535@shao2-debian/
it is for will-it-scale's mmap testcase, which does not change the memory
overcommit policy at runtime, so the data should still be valid with this
fix.

Thanks,
Feng


2020-07-05 15:55:07

by Qian Cai

Subject: Re: [mm] 4e2c82a409: ltp.overcommit_memory01.fail

On Sun, Jul 05, 2020 at 08:58:54PM +0800, Feng Tang wrote:
> On Sun, Jul 05, 2020 at 08:15:03AM -0400, Qian Cai wrote:
> >
> >
> > > On Jul 5, 2020, at 12:45 AM, Feng Tang <[email protected]> wrote:
> > >
> > > I did reproduce the problem, and from the debugging, this should have
> > > the same root cause as lore.kernel.org/lkml/[email protected]/:
> > > loosening the batch causes some accuracy problems, and the solution of
> > > adding some sync is still needed, as discussed there.
> >
> > Well, before taking any of those patches to fix the regression, we
> > will need some performance data first. If it turns out the original
> > performance gain is no longer relevant due to this regression fix on
> > top, it is best to drop this patchset and restore that VM_WARN_ONCE,
> > so you can retry later once you find a better way to optimize.
>
> The fix of adding sync only happens when the memory policy is being
> changed to OVERCOMMIT_NEVER, which is not a frequent operation in
> normal cases.
>
> For the performance improvement data both in the commit log and the 0day
> report https://lore.kernel.org/lkml/20200622132548.GS5535@shao2-debian/
> it is for will-it-scale's mmap testcase, which does not change the memory
> overcommit policy at runtime, so the data should still be valid with this
> fix.

Well, I would expect it is perfectly reasonable for people to use
OVERCOMMIT_NEVER for some workloads, making it a more frequent operation.
The question now is whether any of those regression fixes would regress
performance of OVERCOMMIT_NEVER workloads, or stay on par with the data
from before the patchset?

Given that this patchset has had so much churn recently, I would think
"should be still valid" is not really the answer we are looking for.

>
> Thanks,
> Feng
>
>

2020-07-06 01:43:50

by Feng Tang

Subject: Re: [mm] 4e2c82a409: ltp.overcommit_memory01.fail

On Sun, Jul 05, 2020 at 11:52:32AM -0400, Qian Cai wrote:
> On Sun, Jul 05, 2020 at 08:58:54PM +0800, Feng Tang wrote:
> > On Sun, Jul 05, 2020 at 08:15:03AM -0400, Qian Cai wrote:
> > >
> > >
> > > > On Jul 5, 2020, at 12:45 AM, Feng Tang <[email protected]> wrote:
> > > >
> > > > I did reproduce the problem, and from the debugging, this should have
> > > > the same root cause as lore.kernel.org/lkml/[email protected]/:
> > > > loosening the batch causes some accuracy problems, and the solution of
> > > > adding some sync is still needed, as discussed there.
> > >
> > > Well, before taking any of those patches to fix the regression, we
> > > will need some performance data first. If it turns out the original
> > > performance gain is no longer relevant due to this regression fix on
> > > top, it is best to drop this patchset and restore that VM_WARN_ONCE,
> > > so you can retry later once you find a better way to optimize.
> >
> > The fix of adding sync only happens when the memory policy is being
> > changed to OVERCOMMIT_NEVER, which is not a frequent operation in
> > normal cases.
> >
> > For the performance improvement data both in the commit log and the 0day
> > report https://lore.kernel.org/lkml/20200622132548.GS5535@shao2-debian/
> > it is for will-it-scale's mmap testcase, which does not change the memory
> > overcommit policy at runtime, so the data should still be valid with this
> > fix.
>
> Well, I would expect it is perfectly reasonable for people to use
> OVERCOMMIT_NEVER for some workloads, making it a more frequent operation.

In my last email, I was not saying OVERCOMMIT_NEVER is not a normal case,
but I don't think users will change the overcommit policy at runtime too
frequently. And the fix patch of syncing 'vm_committed_as' is only called
when the user runs 'sysctl -w vm.overcommit_memory=2'.

> The question now is whether any of those regression fixes would regress
> performance of OVERCOMMIT_NEVER workloads, or stay on par with the data
> from before the patchset?

For the original patchset, it keeps vm_committed_as's batch unchanged for
the OVERCOMMIT_NEVER policy and enlarges it for the other 2 loose policies
OVERCOMMIT_ALWAYS and OVERCOMMIT_GUESS, and I don't expect the
"OVERCOMMIT_NEVER workloads" performance will be impacted. If you have
suggestions for this kind of benchmark, I can test them to better verify
the patchset, thanks!

- Feng

>
> Given that this patchset has had so much churn recently, I would think
> "should be still valid" is not really the answer we are looking for.
>
> >
> > Thanks,
> > Feng
> >
> >

2020-07-06 02:40:03

by Qian Cai

Subject: Re: [mm] 4e2c82a409: ltp.overcommit_memory01.fail

On Mon, Jul 06, 2020 at 09:43:13AM +0800, Feng Tang wrote:
> On Sun, Jul 05, 2020 at 11:52:32AM -0400, Qian Cai wrote:
> > On Sun, Jul 05, 2020 at 08:58:54PM +0800, Feng Tang wrote:
> > > On Sun, Jul 05, 2020 at 08:15:03AM -0400, Qian Cai wrote:
> > > >
> > > >
> > > > > On Jul 5, 2020, at 12:45 AM, Feng Tang <[email protected]> wrote:
> > > > >
> > > > > I did reproduce the problem, and from the debugging, this should have
> > > > > the same root cause as lore.kernel.org/lkml/[email protected]/:
> > > > > loosening the batch causes some accuracy problems, and the solution of
> > > > > adding some sync is still needed, as discussed there.
> > > >
> > > > Well, before taking any of those patches to fix the regression, we
> > > > will need some performance data first. If it turns out the original
> > > > performance gain is no longer relevant due to this regression fix on
> > > > top, it is best to drop this patchset and restore that VM_WARN_ONCE,
> > > > so you can retry later once you find a better way to optimize.
> > >
> > > The fix of adding sync only happens when the memory policy is being
> > > changed to OVERCOMMIT_NEVER, which is not a frequent operation in
> > > normal cases.
> > >
> > > For the performance improvement data both in the commit log and the 0day
> > > report https://lore.kernel.org/lkml/20200622132548.GS5535@shao2-debian/
> > > it is for will-it-scale's mmap testcase, which does not change the memory
> > > overcommit policy at runtime, so the data should still be valid with this
> > > fix.
> >
> > Well, I would expect it is perfectly reasonable for people to use
> > OVERCOMMIT_NEVER for some workloads, making it a more frequent operation.
>
> In my last email, I was not saying OVERCOMMIT_NEVER is not a normal case,
> but I don't think users will change the overcommit policy at runtime too
> frequently. And the fix patch of syncing 'vm_committed_as' is only called
> when the user runs 'sysctl -w vm.overcommit_memory=2'.
>
> > The question now is whether any of those regression fixes would regress
> > performance of OVERCOMMIT_NEVER workloads, or stay on par with the data
> > from before the patchset?
>
> For the original patchset, it keeps vm_committed_as's batch unchanged for
> the OVERCOMMIT_NEVER policy and enlarges it for the other 2 loose policies
> OVERCOMMIT_ALWAYS and OVERCOMMIT_GUESS, and I don't expect the
> "OVERCOMMIT_NEVER workloads" performance will be impacted. If you have
> suggestions for this kind of benchmark, I can test them to better verify
> the patchset, thanks!

Then, please capture that information into a proper commit log when you
submit the regression fix on top of the patchset, and CC the PER-CPU MEMORY
ALLOCATOR maintainers, so they might be able to review it properly.

>
> - Feng
>
> >
> > Given that this patchset has had so much churn recently, I would think
> > "should be still valid" is not really the answer we are looking for.
> >
> > >
> > > Thanks,
> > > Feng
> > >
> > >

2020-07-06 13:27:46

by Feng Tang

Subject: Re: [mm] 4e2c82a409: ltp.overcommit_memory01.fail

Hi All,

Please help to review this fix patch, thanks!

It is against today's linux-mm tree. For easy review, I put the fix
into one patch, and I could split it into 2 parts for percpu-counter
and mm/util.c if that's preferred.

From 593f9dc139181a7c3bb1705aacd1f625f400e458 Mon Sep 17 00:00:00 2001
From: Feng Tang <[email protected]>
Date: Mon, 6 Jul 2020 14:48:29 +0800
Subject: [PATCH] mm/util.c: sync vm_committed_as when changing memory policy
to OVERCOMMIT_NEVER

With the patch to improve the scalability of vm_committed_as [1], 0day
reported that the ltp overcommit_memory test case could fail (the fail
rate is about 5/50) [2]. The root cause is that when the system runs
with a loose memory overcommit policy like OVERCOMMIT_GUESS/ALWAYS, the
deviation of vm_committed_as can be big, and once the policy is changed
at runtime to OVERCOMMIT_NEVER, vm_committed_as's batch is decreased to
1/64 of the original one, but the deviation is not compensated
accordingly; the following __vm_enough_memory() check for vm overcommit
can then be wrong due to this deviation, which breaks the ltp
overcommit_memory case.

Fix it by forcing a sync of the percpu counter vm_committed_as when the
overcommit policy is changed to OVERCOMMIT_NEVER (sysctl -w
vm.overcommit_memory=2). The sync itself is not a fast operation, but
is tolerable given that users are not expected to change the policy to
OVERCOMMIT_NEVER frequently.

[1] https://lore.kernel.org/lkml/[email protected]/
[2] https://marc.info/?l=linux-mm&m=159367156428286 (can't find a link in lore.kernel.org)

Reported-by: kernel test robot <[email protected]>
Signed-off-by: Feng Tang <[email protected]>
---
include/linux/percpu_counter.h | 4 ++++
lib/percpu_counter.c | 14 ++++++++++++++
mm/util.c | 11 ++++++++++-
3 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/include/linux/percpu_counter.h b/include/linux/percpu_counter.h
index 0a4f54d..01861ee 100644
--- a/include/linux/percpu_counter.h
+++ b/include/linux/percpu_counter.h
@@ -44,6 +44,7 @@ void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount,
s32 batch);
s64 __percpu_counter_sum(struct percpu_counter *fbc);
int __percpu_counter_compare(struct percpu_counter *fbc, s64 rhs, s32 batch);
+void percpu_counter_sync(struct percpu_counter *fbc);

static inline int percpu_counter_compare(struct percpu_counter *fbc, s64 rhs)
{
@@ -172,6 +173,9 @@ static inline bool percpu_counter_initialized(struct percpu_counter *fbc)
return true;
}

+static inline void percpu_counter_sync(struct percpu_counter *fbc)
+{
+}
#endif /* CONFIG_SMP */

static inline void percpu_counter_inc(struct percpu_counter *fbc)
diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c
index a66595b..02d87fc 100644
--- a/lib/percpu_counter.c
+++ b/lib/percpu_counter.c
@@ -98,6 +98,20 @@ void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount, s32 batch)
}
EXPORT_SYMBOL(percpu_counter_add_batch);

+void percpu_counter_sync(struct percpu_counter *fbc)
+{
+ unsigned long flags;
+ s64 count;
+
+ raw_spin_lock_irqsave(&fbc->lock, flags);
+ count = __this_cpu_read(*fbc->counters);
+ fbc->count += count;
+ __this_cpu_sub(*fbc->counters, count);
+ raw_spin_unlock_irqrestore(&fbc->lock, flags);
+}
+EXPORT_SYMBOL(percpu_counter_sync);
+
+
/*
* Add up all the per-cpu counts, return the result. This is a more accurate
* but much slower version of percpu_counter_read_positive()
diff --git a/mm/util.c b/mm/util.c
index 52ed9c1..5fb62c0 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -746,14 +746,23 @@ int overcommit_ratio_handler(struct ctl_table *table, int write, void *buffer,
return ret;
}

+static void sync_overcommit_as(struct work_struct *dummy)
+{
+ percpu_counter_sync(&vm_committed_as);
+}
+
int overcommit_policy_handler(struct ctl_table *table, int write, void *buffer,
size_t *lenp, loff_t *ppos)
{
int ret;

ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
- if (ret == 0 && write)
+ if (ret == 0 && write) {
+ if (sysctl_overcommit_memory == OVERCOMMIT_NEVER)
+ schedule_on_each_cpu(sync_overcommit_as);
+
mm_compute_batch();
+ }

return ret;
}
--
2.7.4


On Sun, Jul 05, 2020 at 10:36:14PM -0400, Qian Cai wrote:
> > In my last email, I was not saying OVERCOMMIT_NEVER is not a normal case,
> > but I don't think users will change the overcommit policy at runtime too
> > frequently. And the fix patch of syncing 'vm_committed_as' is only called
> > when the user runs 'sysctl -w vm.overcommit_memory=2'.
> >
> > > The question now is whether any of those regression fixes would regress
> > > performance of OVERCOMMIT_NEVER workloads, or stay on par with the data
> > > from before the patchset?
> >
> > For the original patchset, it keeps vm_committed_as's batch unchanged for
> > the OVERCOMMIT_NEVER policy and enlarges it for the other 2 loose policies
> > OVERCOMMIT_ALWAYS and OVERCOMMIT_GUESS, and I don't expect the
> > "OVERCOMMIT_NEVER workloads" performance will be impacted. If you have
> > suggestions for this kind of benchmark, I can test them to better verify
> > the patchset, thanks!
>
> Then, please capture that information into a proper commit log when you
> submit the regression fix on top of the patchset, and CC the PER-CPU MEMORY
> ALLOCATOR maintainers, so they might be able to review it properly.



2020-07-06 13:36:00

by Andi Kleen

Subject: Re: [mm] 4e2c82a409: ltp.overcommit_memory01.fail

> ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
> - if (ret == 0 && write)
> + if (ret == 0 && write) {
> + if (sysctl_overcommit_memory == OVERCOMMIT_NEVER)
> + schedule_on_each_cpu(sync_overcommit_as);

The schedule_on_each_cpu is not atomic, so the problem could still happen
in that window.

I think it may be ok if it eventually resolves, but certainly needs
a comment explaining it. Can you do some stress testing toggling the
policy all the time on different CPUs and running the test on
other CPUs and see if the test fails?
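
A minimal sketch of what such a stress test could look like (hypothetical
harness, not the LTP code; assumes root and a 64-bit box, compile with
-pthread):

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static volatile int stop;

/* Flip vm.overcommit_memory between GUESS (0) and NEVER (2) in a loop. */
static void *toggler(void *arg)
{
	const char *vals[] = { "0", "2" };
	int i = 0;

	while (!stop) {
		FILE *f = fopen("/proc/sys/vm/overcommit_memory", "w");

		if (f) {
			fputs(vals[i], f);
			fclose(f);
		}
		i = !i;
		usleep(1000);
	}
	return NULL;
}

int main(void)
{
	pthread_t t;
	long i;

	pthread_create(&t, NULL, toggler, NULL);
	/* Meanwhile hammer the allocation/accounting path from this
	 * (and ideally more) CPUs while the policy flips underneath. */
	for (i = 0; i < 100000; i++)
		free(malloc(1UL << 30));	/* 1 GB; size is arbitrary */
	stop = 1;
	pthread_join(t, NULL);
	return 0;
}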

The other alternative would be to define some intermediate state
for the sysctl variable and only switch to never once the schedule_on_each_cpu
returned. But that's more complexity.


-Andi

2020-07-06 23:43:17

by Andrew Morton

Subject: Re: [mm] 4e2c82a409: ltp.overcommit_memory01.fail

On Mon, 6 Jul 2020 06:34:34 -0700 Andi Kleen <[email protected]> wrote:

> > ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
> > - if (ret == 0 && write)
> > + if (ret == 0 && write) {
> > + if (sysctl_overcommit_memory == OVERCOMMIT_NEVER)
> > + schedule_on_each_cpu(sync_overcommit_as);
>
> The schedule_on_each_cpu is not atomic, so the problem could still happen
> in that window.
>
> I think it may be ok if it eventually resolves, but certainly needs
> a comment explaining it.

It sure does.

The new exported-to-everything percpu_counter_sync() should have full
formal documentation as well, please.

> Can you do some stress testing toggling the
> policy all the time on different CPUs and running the test on
> other CPUs and see if the test fails?
>
> The other alternative would be to define some intermediate state
> for the sysctl variable and only switch to never once the schedule_on_each_cpu
> returned. But that's more complexity.
>
>
> -Andi

2020-07-07 01:08:29

by Dennis Zhou

Subject: Re: [mm] 4e2c82a409: ltp.overcommit_memory01.fail

On Mon, Jul 06, 2020 at 09:24:43PM +0800, Feng Tang wrote:
> Hi All,
>
> Please help to review this fix patch, thanks!
>
> It is against today's linux-mm tree. For easy review, I put the fix
> into one patch, and I could split it into 2 parts for percpu-counter
> and mm/util.c if that's preferred.
>
> From 593f9dc139181a7c3bb1705aacd1f625f400e458 Mon Sep 17 00:00:00 2001
> From: Feng Tang <[email protected]>
> Date: Mon, 6 Jul 2020 14:48:29 +0800
> Subject: [PATCH] mm/util.c: sync vm_committed_as when changing memory policy
> to OVERCOMMIT_NEVER
>
> With the patch to improve the scalability of vm_committed_as [1], 0day
> reported that the ltp overcommit_memory test case could fail (the fail
> rate is about 5/50) [2]. The root cause is that when the system runs
> with a loose memory overcommit policy like OVERCOMMIT_GUESS/ALWAYS, the
> deviation of vm_committed_as can be big, and once the policy is changed
> at runtime to OVERCOMMIT_NEVER, vm_committed_as's batch is decreased to
> 1/64 of the original one, but the deviation is not compensated
> accordingly; the following __vm_enough_memory() check for vm overcommit
> can then be wrong due to this deviation, which breaks the ltp
> overcommit_memory case.
>
> Fix it by forcing a sync of the percpu counter vm_committed_as when the
> overcommit policy is changed to OVERCOMMIT_NEVER (sysctl -w
> vm.overcommit_memory=2). The sync itself is not a fast operation, but
> is tolerable given that users are not expected to change the policy to
> OVERCOMMIT_NEVER frequently.
>
> [1] https://lore.kernel.org/lkml/[email protected]/
> [2] https://marc.info/?l=linux-mm&m=159367156428286 (can't find a link in lore.kernel.org)
>
> Reported-by: kernel test robot <[email protected]>
> Signed-off-by: Feng Tang <[email protected]>
> ---
> include/linux/percpu_counter.h | 4 ++++
> lib/percpu_counter.c | 14 ++++++++++++++
> mm/util.c | 11 ++++++++++-
> 3 files changed, 28 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/percpu_counter.h b/include/linux/percpu_counter.h
> index 0a4f54d..01861ee 100644
> --- a/include/linux/percpu_counter.h
> +++ b/include/linux/percpu_counter.h
> @@ -44,6 +44,7 @@ void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount,
> s32 batch);
> s64 __percpu_counter_sum(struct percpu_counter *fbc);
> int __percpu_counter_compare(struct percpu_counter *fbc, s64 rhs, s32 batch);
> +void percpu_counter_sync(struct percpu_counter *fbc);
>
> static inline int percpu_counter_compare(struct percpu_counter *fbc, s64 rhs)
> {
> @@ -172,6 +173,9 @@ static inline bool percpu_counter_initialized(struct percpu_counter *fbc)
> return true;
> }
>
> +static inline void percpu_counter_sync(struct percpu_counter *fbc)
> +{
> +}
> #endif /* CONFIG_SMP */
>
> static inline void percpu_counter_inc(struct percpu_counter *fbc)
> diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c
> index a66595b..02d87fc 100644
> --- a/lib/percpu_counter.c
> +++ b/lib/percpu_counter.c
> @@ -98,6 +98,20 @@ void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount, s32 batch)
> }
> EXPORT_SYMBOL(percpu_counter_add_batch);
>
> +void percpu_counter_sync(struct percpu_counter *fbc)
> +{
> + unsigned long flags;
> + s64 count;
> +
> + raw_spin_lock_irqsave(&fbc->lock, flags);
> + count = __this_cpu_read(*fbc->counters);
> + fbc->count += count;
> + __this_cpu_sub(*fbc->counters, count);
> + raw_spin_unlock_irqrestore(&fbc->lock, flags);
> +}
> +EXPORT_SYMBOL(percpu_counter_sync);
> +
> +
> /*
> * Add up all the per-cpu counts, return the result. This is a more accurate
> * but much slower version of percpu_counter_read_positive()
> diff --git a/mm/util.c b/mm/util.c
> index 52ed9c1..5fb62c0 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -746,14 +746,23 @@ int overcommit_ratio_handler(struct ctl_table *table, int write, void *buffer,
> return ret;
> }
>
> +static void sync_overcommit_as(struct work_struct *dummy)
> +{
> + percpu_counter_sync(&vm_committed_as);
> +}
> +

This seems like a rather niche use case as it's currently coupled with a
schedule_on_each_cpu(). I can't imagine a use case where you'd want to
do this without being called by schedule_on_each_cpu().

Would it be better to modify or introduce something akin to
percpu_counter_sum() which sums and folds in the counter state? I'd be
curious to see what the cost of always folding would be as this is
already considered the cold path and would help with the next batch too.

> int overcommit_policy_handler(struct ctl_table *table, int write, void *buffer,
> size_t *lenp, loff_t *ppos)
> {
> int ret;
>
> ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
> - if (ret == 0 && write)
> + if (ret == 0 && write) {
> + if (sysctl_overcommit_memory == OVERCOMMIT_NEVER)
> + schedule_on_each_cpu(sync_overcommit_as);
> +
> mm_compute_batch();
> + }
>
> return ret;
> }
> --
> 2.7.4
>
>
> On Sun, Jul 05, 2020 at 10:36:14PM -0400, Qian Cai wrote:
> > > In my last email, I was not saying OVERCOMMIT_NEVER is not a normal case,
> > > but I don't think users will change the overcommit policy at runtime too
> > > frequently. And the fix patch of syncing 'vm_committed_as' is only called
> > > when the user runs 'sysctl -w vm.overcommit_memory=2'.
> > >
> > > > The question now is whether any of those regression fixes would regress
> > > > performance of OVERCOMMIT_NEVER workloads, or stay on par with the data
> > > > from before the patchset?
> > >
> > > For the original patchset, it keeps vm_committed_as's batch unchanged for
> > > the OVERCOMMIT_NEVER policy and enlarges it for the other 2 loose policies
> > > OVERCOMMIT_ALWAYS and OVERCOMMIT_GUESS, and I don't expect the
> > > "OVERCOMMIT_NEVER workloads" performance will be impacted. If you have
> > > suggestions for this kind of benchmark, I can test them to better verify
> > > the patchset, thanks!
> >
> > Then, please capture that information into a proper commit log when you
> > submit the regression fix on top of the patchset, and CC the PER-CPU MEMORY
> > ALLOCATOR maintainers, so they might be able to review it properly.
>
>
>

Thanks,
Dennis

2020-07-07 02:41:22

by Feng Tang

Subject: Re: [mm] 4e2c82a409: ltp.overcommit_memory01.fail

On Mon, Jul 06, 2020 at 06:34:34AM -0700, Andi Kleen wrote:
> > ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
> > - if (ret == 0 && write)
> > + if (ret == 0 && write) {
> > + if (sysctl_overcommit_memory == OVERCOMMIT_NEVER)
> > + schedule_on_each_cpu(sync_overcommit_as);
>
> The schedule_on_each_cpu is not atomic, so the problem could still happen
> in that window.
>
> I think it may be ok if it eventually resolves, but certainly needs
> a comment explaining it. Can you do some stress testing toggling the
> policy all the time on different CPUs and running the test on
> other CPUs and see if the test fails?

For the raw test case reported by 0day, this patch passed 200 runs.
And I will read the ltp code and try stress testing it as you
suggested.


> The other alternative would be to define some intermediate state
> for the sysctl variable and only switch to never once the schedule_on_each_cpu
> returned. But that's more complexity.

One thought I had is to put this schedule_on_each_cpu() before
the proc_dointvec_minmax() to do the sync before sysctl_overcommit_memory
is really changed. But the window still exists, as the batch is
still the larger one.
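
Spelled out, the two orderings and their windows would be (illustrative
pseudo-code, not a patch):

/*
 * Ordering A (patch as posted):
 *	proc_dointvec_minmax();		// policy becomes NEVER
 *	schedule_on_each_cpu(sync);	// fold per-cpu deltas
 *	mm_compute_batch();		// batch shrinks
 * window: readers already see NEVER while big deltas are outstanding.
 *
 * Ordering B (sync first):
 *	schedule_on_each_cpu(sync);
 *	proc_dointvec_minmax();
 *	mm_compute_batch();
 * window: deltas can grow again before the batch shrinks, since the
 * old (large) batch is still in effect during the sync.
 */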

Thanks,
Feng

>
>
> -Andi

2020-07-07 03:27:52

by Feng Tang

Subject: Re: [mm] 4e2c82a409: ltp.overcommit_memory01.fail

On Tue, Jul 07, 2020 at 01:06:51AM +0000, Dennis Zhou wrote:
> On Mon, Jul 06, 2020 at 09:24:43PM +0800, Feng Tang wrote:
> > Hi All,
> >
> > Please help to review this fix patch, thanks!
> >
> > It is against today's linux-mm tree. For easy review, I put the fix
> > into one patch, and I could split it into 2 parts for percpu-counter
> > and mm/util.c if that's preferred.
> >
> > From 593f9dc139181a7c3bb1705aacd1f625f400e458 Mon Sep 17 00:00:00 2001
> > From: Feng Tang <[email protected]>
> > Date: Mon, 6 Jul 2020 14:48:29 +0800
> > Subject: [PATCH] mm/util.c: sync vm_committed_as when changing memory policy
> > to OVERCOMMIT_NEVER
> >
> > With the patch to improve the scalability of vm_committed_as [1], 0day
> > reported that the ltp overcommit_memory test case could fail (the fail
> > rate is about 5/50) [2]. The root cause is that when the system runs
> > with a loose memory overcommit policy like OVERCOMMIT_GUESS/ALWAYS, the
> > deviation of vm_committed_as can be big, and once the policy is changed
> > at runtime to OVERCOMMIT_NEVER, vm_committed_as's batch is decreased to
> > 1/64 of the original one, but the deviation is not compensated
> > accordingly; the following __vm_enough_memory() check for vm overcommit
> > can then be wrong due to this deviation, which breaks the ltp
> > overcommit_memory case.
> >
> > Fix it by forcing a sync of the percpu counter vm_committed_as when the
> > overcommit policy is changed to OVERCOMMIT_NEVER (sysctl -w
> > vm.overcommit_memory=2). The sync itself is not a fast operation, but
> > is tolerable given that users are not expected to change the policy to
> > OVERCOMMIT_NEVER frequently.
> >
> > [1] https://lore.kernel.org/lkml/[email protected]/
> > [2] https://marc.info/?l=linux-mm&m=159367156428286 (can't find a link in lore.kernel.org)
> >
> > Reported-by: kernel test robot <[email protected]>
> > Signed-off-by: Feng Tang <[email protected]>
> > ---
> > include/linux/percpu_counter.h | 4 ++++
> > lib/percpu_counter.c | 14 ++++++++++++++
> > mm/util.c | 11 ++++++++++-
> > 3 files changed, 28 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/percpu_counter.h b/include/linux/percpu_counter.h
> > index 0a4f54d..01861ee 100644
> > --- a/include/linux/percpu_counter.h
> > +++ b/include/linux/percpu_counter.h
> > @@ -44,6 +44,7 @@ void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount,
> > s32 batch);
> > s64 __percpu_counter_sum(struct percpu_counter *fbc);
> > int __percpu_counter_compare(struct percpu_counter *fbc, s64 rhs, s32 batch);
> > +void percpu_counter_sync(struct percpu_counter *fbc);
> >
> > static inline int percpu_counter_compare(struct percpu_counter *fbc, s64 rhs)
> > {
> > @@ -172,6 +173,9 @@ static inline bool percpu_counter_initialized(struct percpu_counter *fbc)
> > return true;
> > }
> >
> > +static inline void percpu_counter_sync(struct percpu_counter *fbc)
> > +{
> > +}
> > #endif /* CONFIG_SMP */
> >
> > static inline void percpu_counter_inc(struct percpu_counter *fbc)
> > diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c
> > index a66595b..02d87fc 100644
> > --- a/lib/percpu_counter.c
> > +++ b/lib/percpu_counter.c
> > @@ -98,6 +98,20 @@ void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount, s32 batch)
> > }
> > EXPORT_SYMBOL(percpu_counter_add_batch);
> >
> > +void percpu_counter_sync(struct percpu_counter *fbc)
> > +{
> > + unsigned long flags;
> > + s64 count;
> > +
> > + raw_spin_lock_irqsave(&fbc->lock, flags);
> > + count = __this_cpu_read(*fbc->counters);
> > + fbc->count += count;
> > + __this_cpu_sub(*fbc->counters, count);
> > + raw_spin_unlock_irqrestore(&fbc->lock, flags);
> > +}
> > +EXPORT_SYMBOL(percpu_counter_sync);
> > +
> > +
> > /*
> > * Add up all the per-cpu counts, return the result. This is a more accurate
> > * but much slower version of percpu_counter_read_positive()
> > diff --git a/mm/util.c b/mm/util.c
> > index 52ed9c1..5fb62c0 100644
> > --- a/mm/util.c
> > +++ b/mm/util.c
> > @@ -746,14 +746,23 @@ int overcommit_ratio_handler(struct ctl_table *table, int write, void *buffer,
> > return ret;
> > }
> >
> > +static void sync_overcommit_as(struct work_struct *dummy)
> > +{
> > + percpu_counter_sync(&vm_committed_as);
> > +}
> > +
>
> This seems like a rather niche use case as it's currently coupled with a
> schedule_on_each_cpu(). I can't imagine a use case where you'd want to
> do this without being called by schedule_on_each_cpu().

Yes!

>
> Would it be better to modify or introduce something akin to
> percpu_counter_sum() which sums and folds in the counter state? I'd be
> curious to see what the cost of always folding would be as this is
> already considered the cold path and would help with the next batch too.

Initially, I also thought about doing the sync just like percpu_counter_sum():

	raw_spin_lock_irqsave
	for_each_online_cpu(cpu) {
		do-the-sync
	}
	raw_spin_unlock_irqrestore

One problem is that per_cpu_ptr(fbc->counters, cpu) could still be
updated on other CPUs, as the fast path update is not protected by
fbc->lock.

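A sketch of that sum-and-fold variant (not a posted patch) shows exactly
where the race would be:

static s64 percpu_counter_sum_and_fold(struct percpu_counter *fbc)
{
	unsigned long flags;
	s64 ret;
	int cpu;

	raw_spin_lock_irqsave(&fbc->lock, flags);
	ret = fbc->count;
	for_each_online_cpu(cpu) {
		s32 *pcount = per_cpu_ptr(fbc->counters, cpu);

		ret += *pcount;
		/* Racy: the owning CPU can update *pcount concurrently,
		 * as the add fast path never takes fbc->lock. */
		*pcount = 0;
	}
	fbc->count = ret;
	raw_spin_unlock_irqrestore(&fbc->lock, flags);
	return ret;
}
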
As for cost, it is about 800 nanoseconds on a 2C/4T platform and 2~3
microseconds on a 2S/36C/72T Skylake server in the normal case; in the
worst case, where vm_committed_as's spinlock is under severe contention,
it costs 30~40 microseconds on the 2S/36C/72T Skylake server.

Thanks,
Feng


> > int overcommit_policy_handler(struct ctl_table *table, int write, void *buffer,
> > size_t *lenp, loff_t *ppos)
> > {
> > int ret;
> >
> > ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
> > - if (ret == 0 && write)
> > + if (ret == 0 && write) {
> > + if (sysctl_overcommit_memory == OVERCOMMIT_NEVER)
> > + schedule_on_each_cpu(sync_overcommit_as);
> > +
> > mm_compute_batch();
> > + }
> >
> > return ret;
> > }
> > --
> > 2.7.4

2020-07-07 04:00:56

by Huang, Ying

Subject: Re: [mm] 4e2c82a409: ltp.overcommit_memory01.fail

Feng Tang <[email protected]> writes:

> On Mon, Jul 06, 2020 at 06:34:34AM -0700, Andi Kleen wrote:
>> > ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
>> > - if (ret == 0 && write)
>> > + if (ret == 0 && write) {
>> > + if (sysctl_overcommit_memory == OVERCOMMIT_NEVER)
>> > + schedule_on_each_cpu(sync_overcommit_as);
>>
>> The schedule_on_each_cpu is not atomic, so the problem could still happen
>> in that window.
>>
>> I think it may be ok if it eventually resolves, but certainly needs
>> a comment explaining it. Can you do some stress testing toggling the
>> policy all the time on different CPUs and running the test on
>> other CPUs and see if the test fails?
>
> For the raw test case reported by 0day, this patch passed 200 runs.
> And I will read the ltp code and try stress testing it as you
> suggested.
>
>
>> The other alternative would be to define some intermediate state
>> for the sysctl variable and only switch to never once the schedule_on_each_cpu
>> returned. But that's more complexity.
>
> One thought I had is to put this schedule_on_each_cpu() before
> the proc_dointvec_minmax() to do the sync before sysctl_overcommit_memory
> is really changed. But the window still exists, as the batch is
> still the larger one.

Can we change the batch firstly, then sync the global counter, finally
change the overcommit policy?

Best Regards,
Huang, Ying

2020-07-07 05:43:08

by Feng Tang

Subject: Re: [mm] 4e2c82a409: ltp.overcommit_memory01.fail

On Tue, Jul 07, 2020 at 12:00:09PM +0800, Huang, Ying wrote:
> Feng Tang <[email protected]> writes:
>
> > On Mon, Jul 06, 2020 at 06:34:34AM -0700, Andi Kleen wrote:
> >> > ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
> >> > - if (ret == 0 && write)
> >> > + if (ret == 0 && write) {
> >> > + if (sysctl_overcommit_memory == OVERCOMMIT_NEVER)
> >> > + schedule_on_each_cpu(sync_overcommit_as);
> >>
> >> The schedule_on_each_cpu is not atomic, so the problem could still happen
> >> in that window.
> >>
> >> I think it may be ok if it eventually resolves, but certainly needs
> >> a comment explaining it. Can you do some stress testing toggling the
> >> policy all the time on different CPUs and running the test on
> >> other CPUs and see if the test fails?
> >
> > For the raw test case reported by 0day, this patch passed 200 runs.
> > And I will read the ltp code and try stress testing it as you
> > suggested.
> >
> >
> >> The other alternative would be to define some intermediate state
> >> for the sysctl variable and only switch to never once the schedule_on_each_cpu
> >> returned. But that's more complexity.
> >
> > One thought I had is to put this schedule_on_each_cpu() before
> > the proc_dointvec_minmax() to do the sync before sysctl_overcommit_memory
> > is really changed. But the window still exists, as the batch is
> > still the larger one.
>
> Can we change the batch firstly, then sync the global counter, finally
> change the overcommit policy?

These reorderings are really head scratching :)

I've thought about this before when Qian Cai first reported the warning
message, as the kernel had a check:

	VM_WARN_ONCE(percpu_counter_read(&vm_committed_as) <
		     -(s64)vm_committed_as_batch * num_online_cpus(),
		     "memory commitment underflow");

If the batch is decreased first, the warning will be easier/earlier to be
triggered, so I didn't bring this up when handling the warning message.

But it might work now, as the warning has been removed.

Thanks,
Feng



2020-07-07 10:29:56

by Michal Hocko

Subject: Re: [mm] 4e2c82a409: ltp.overcommit_memory01.fail

On Sun 05-07-20 11:52:32, Qian Cai wrote:
> On Sun, Jul 05, 2020 at 08:58:54PM +0800, Feng Tang wrote:
> > On Sun, Jul 05, 2020 at 08:15:03AM -0400, Qian Cai wrote:
> > >
> > >
> > > > On Jul 5, 2020, at 12:45 AM, Feng Tang <[email protected]> wrote:
> > > >
> > > > I did reproduce the problem, and from the debugging, this should have
> > > > the same root cause as lore.kernel.org/lkml/[email protected]/:
> > > > loosening the batch causes some accuracy problems, and the solution of
> > > > adding some sync is still needed, as discussed there.
> > >
> > > Well, before taking any of those patches to fix the regression, we
> > > will need some performance data first. If it turns out the original
> > > performance gain is no longer relevant due to this regression fix on
> > > top, it is best to drop this patchset and restore that VM_WARN_ONCE,
> > > so you can retry later once you find a better way to optimize.
> >
> > The fix of adding sync only happens when the memory policy is being
> > changed to OVERCOMMIT_NEVER, which is not a frequent operation in
> > normal cases.
> >
> > For the performance improvement data both in the commit log and the 0day
> > report https://lore.kernel.org/lkml/20200622132548.GS5535@shao2-debian/
> > it is for will-it-scale's mmap testcase, which does not change the memory
> > overcommit policy at runtime, so the data should still be valid with this
> > fix.
>
> Well, I would expect it is perfectly reasonable for people to use
> OVERCOMMIT_NEVER for some workloads, making it a more frequent operation.

Would you have any examples? Because I find this highly unlikely.
OVERCOMMIT_NEVER only works when virtual memory is not largely
overcommitted with respect to the real memory demand. And that tends to
be more of an exception than a rule. "Modern" userspace (whatever that
means) tends to be really hungry for virtual memory which is only used
very sparsely.

I would argue that either somebody is running an "OVERCOMMIT_NEVER"
friendly SW and this is a permanent setting or this is not used at all.
At least this is my experience.

So I strongly suspect that the LTP test failure is not something we should
really lose sleep over. It would be nice to find a way to flush existing
batches, but I would rather see a real workload that would suffer from
this imprecision.

On the other hand, a perf boost from larger batches with the default
overcommit setting sounds like a nice improvement to have.
--
Michal Hocko
SUSE Labs

2020-07-09 04:57:34

by Feng Tang

Subject: Re: [mm] 4e2c82a409: ltp.overcommit_memory01.fail

On Tue, Jul 07, 2020 at 01:41:20PM +0800, Feng Tang wrote:
> On Tue, Jul 07, 2020 at 12:00:09PM +0800, Huang, Ying wrote:
> > Feng Tang <[email protected]> writes:
> >
> > > On Mon, Jul 06, 2020 at 06:34:34AM -0700, Andi Kleen wrote:
> > >> > ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
> > >> > - if (ret == 0 && write)
> > >> > + if (ret == 0 && write) {
> > >> > + if (sysctl_overcommit_memory == OVERCOMMIT_NEVER)
> > >> > + schedule_on_each_cpu(sync_overcommit_as);
> > >>
> > >> The schedule_on_each_cpu is not atomic, so the problem could still happen
> > >> in that window.
> > >>
> > >> I think it may be ok if it eventually resolves, but certainly needs
> > >> a comment explaining it. Can you do some stress testing toggling the
> > >> policy all the time on different CPUs and running the test on
> > >> other CPUs and see if the test fails?
> > >
> > > For the raw test case reported by 0day, this patch passed 200 runs.
> > > And I will read the ltp code and try stress testing it as you
> > > suggested.
> > >
> > >
> > >> The other alternative would be to define some intermediate state
> > >> for the sysctl variable and only switch to never once the schedule_on_each_cpu
> > >> returned. But that's more complexity.
> > >
> > > One thought I had is to put this schedule_on_each_cpu() before
> > > the proc_dointvec_minmax() to do the sync before sysctl_overcommit_memory
> > > is really changed. But the window still exists, as the batch is
> > > still the larger one.
> >
> > Can we change the batch firstly, then sync the global counter, finally
> > change the overcommit policy?
>
> These reorderings are really head scratching :)
>
> I've thought about this before when Qian Cai first reported the warning
> message, as the kernel had a check:
>
> VM_WARN_ONCE(percpu_counter_read(&vm_committed_as) <
> -(s64)vm_committed_as_batch * num_online_cpus(),
> "memory commitment underflow");
>
> If the batch is decreased first, the warning will be easier/earlier to be
> triggered, so I didn't bring this up when handling the warning message.
>
> But it might work now, as the warning has been removed.

I tested the reordering, and the test passed in 100 runs. The
new order when changing the policy to OVERCOMMIT_NEVER is:
1. re-compute the batch (to the smaller one)
2. do the on_each_cpu sync
3. really change the policy to NEVER.

It solves one of the previous concerns: after the sync is done on cpuX,
but before the whole sync on all CPUs is done, there is a window in which
the percpu counter could be enlarged again.

IIRC Andi had a concern about read-side cost when doing the sync; my
understanding is that most of the readers (malloc/free/map/unmap) use
percpu_counter_read_positive, which is a fast path that does not involve
the lock.

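For reference, the read side is roughly this (paraphrased from
include/linux/percpu_counter.h; the real code may differ in details):

static inline s64 percpu_counter_read_positive(struct percpu_counter *fbc)
{
	/* A single plain load of the shared count: no lock and no
	 * per-cpu summing, hence fast but only approximate. */
	s64 ret = READ_ONCE(fbc->count);

	if (ret >= 0)
		return ret;
	return 0;
}
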
As for the problem itself, I agree with Michal's point that usually there
is no normal case that will change the overcommit_policy too frequently.

The code logic is mainly in overcommit_policy_handler(), based on the
previous sync fix. Please help to review, thanks!

int overcommit_policy_handler(struct ctl_table *table, int write, void *buffer,
		size_t *lenp, loff_t *ppos)
{
	int ret;

	if (write) {
		int new_policy;
		struct ctl_table t;

		t = *table;
		t.data = &new_policy;
		ret = proc_dointvec_minmax(&t, write, buffer, lenp, ppos);
		if (ret)
			return ret;

		mm_compute_batch(new_policy);
		if (new_policy == OVERCOMMIT_NEVER)
			schedule_on_each_cpu(sync_overcommit_as);
		sysctl_overcommit_memory = new_policy;
	} else {
		ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
	}

	return ret;
}

- Feng


2020-07-09 13:43:52

by Qian Cai

Subject: Re: [mm] 4e2c82a409: ltp.overcommit_memory01.fail

On Thu, Jul 09, 2020 at 12:55:54PM +0800, Feng Tang wrote:
> On Tue, Jul 07, 2020 at 01:41:20PM +0800, Feng Tang wrote:
> > On Tue, Jul 07, 2020 at 12:00:09PM +0800, Huang, Ying wrote:
> > > Feng Tang <[email protected]> writes:
> > >
> > > > On Mon, Jul 06, 2020 at 06:34:34AM -0700, Andi Kleen wrote:
> > > >> > ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
> > > >> > - if (ret == 0 && write)
> > > >> > + if (ret == 0 && write) {
> > > >> > + if (sysctl_overcommit_memory == OVERCOMMIT_NEVER)
> > > >> > + schedule_on_each_cpu(sync_overcommit_as);
> > > >>
> > > >> The schedule_on_each_cpu is not atomic, so the problem could still happen
> > > >> in that window.
> > > >>
> > > >> I think it may be ok if it eventually resolves, but certainly needs
> > > >> a comment explaining it. Can you do some stress testing toggling the
> > > >> policy all the time on different CPUs and running the test on
> > > >> other CPUs and see if the test fails?
> > > >
> > > > For the raw test case reported by 0day, this patch passed 200 runs.
> > > > And I will read the ltp code and try stress testing it as you
> > > > suggested.
> > > >
> > > >
> > > >> The other alternative would be to define some intermediate state
> > > >> for the sysctl variable and only switch to NEVER once
> > > >> schedule_on_each_cpu has returned. But that adds more complexity.
> > > >
> > > > One thought I had was to put this schedule_on_each_cpu() before the
> > > > proc_dointvec_minmax(), so the sync happens before
> > > > sysctl_overcommit_memory is actually changed. But the window still
> > > > exists, as the batch is still the larger one.
> > >
> > > Can we change the batch first, then sync the global counter, and
> > > finally change the overcommit policy?
> >
> > These reorderings are really head-scratching :)
> >
> > I've thought about this before when Qian Cai first reported the warning
> > message, as the kernel had a check:
> >
> > VM_WARN_ONCE(percpu_counter_read(&vm_committed_as) <
> >              -(s64)vm_committed_as_batch * num_online_cpus(),
> >              "memory commitment underflow");
> >
> > If the batch is decreased first, the warning would trigger more easily and
> > earlier, so I didn't bring this up when handling the warning message.
> >
> > But it might work now, as the warning has been removed.
>
> I tested the reordered approach, and the test passed in 100 runs. The
> new order when changing the policy to OVERCOMMIT_NEVER is:
> 1. re-compute the batch (to the smaller one)
> 2. do the on_each_cpu sync
> 3. actually change the policy to NEVER
>
> It solves one of the previous concerns: after the sync is done on cpuX,
> but before the sync has completed on all CPUs, there is a window in
> which the percpu counter could be enlarged again.
>
> IIRC Andi had a concern about the read-side cost of doing the sync; my
> understanding is that most of the readers (malloc/free/map/unmap) use
> percpu_counter_read_positive(), which is a fast path that takes no lock.
>
> As for the problem itself, I agree with Michal's point that there is
> usually no normal use case that changes the overcommit policy frequently.
>
> The code logic is mainly in overcommit_policy_handler(), built on top of
> the previous sync fix. Please help review it, thanks!
>
> int overcommit_policy_handler(struct ctl_table *table, int write, void *buffer,
>                 size_t *lenp, loff_t *ppos)
> {
>         int ret;
>
>         if (write) {
>                 int new_policy;
>                 struct ctl_table t;
>
>                 t = *table;
>                 t.data = &new_policy;
>                 ret = proc_dointvec_minmax(&t, write, buffer, lenp, ppos);
>                 if (ret)
>                         return ret;
>
>                 mm_compute_batch(new_policy);
>                 if (new_policy == OVERCOMMIT_NEVER)
>                         schedule_on_each_cpu(sync_overcommit_as);
>                 sysctl_overcommit_memory = new_policy;
>         } else {
>                 ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
>         }
>
>         return ret;
> }

Rather than having to indent that many lines, how about this?

t = *table;
t.data = &new_policy;
ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
if (ret || !write)
        return ret;

mm_compute_batch(new_policy);
if (new_policy == OVERCOMMIT_NEVER)
        schedule_on_each_cpu(sync_overcommit_as);

sysctl_overcommit_memory = new_policy;
return ret;

2020-07-09 14:16:10

by Feng Tang

[permalink] [raw]
Subject: Re: [mm] 4e2c82a409: ltp.overcommit_memory01.fail

Hi Qian Cai,

On Thu, Jul 09, 2020 at 09:40:40AM -0400, Qian Cai wrote:
> > > > Can we change the batch first, then sync the global counter, and
> > > > finally change the overcommit policy?
> > >
> > > These reorderings are really head-scratching :)
> > >
> > > I've thought about this before when Qian Cai first reported the warning
> > > message, as the kernel had a check:
> > >
> > > VM_WARN_ONCE(percpu_counter_read(&vm_committed_as) <
> > >              -(s64)vm_committed_as_batch * num_online_cpus(),
> > >              "memory commitment underflow");
> > >
> > > If the batch is decreased first, the warning would trigger more easily and
> > > earlier, so I didn't bring this up when handling the warning message.
> > >
> > > But it might work now, as the warning has been removed.
> >
> > I tested the reordered approach, and the test passed in 100 runs. The
> > new order when changing the policy to OVERCOMMIT_NEVER is:
> > 1. re-compute the batch (to the smaller one)
> > 2. do the on_each_cpu sync
> > 3. actually change the policy to NEVER
> >
> > It solves one of the previous concerns: after the sync is done on cpuX,
> > but before the sync has completed on all CPUs, there is a window in
> > which the percpu counter could be enlarged again.
> >
> > IIRC Andi had a concern about the read-side cost of doing the sync; my
> > understanding is that most of the readers (malloc/free/map/unmap) use
> > percpu_counter_read_positive(), which is a fast path that takes no lock.
> >
> > As for the problem itself, I agree with Michal's point that there is
> > usually no normal use case that changes the overcommit policy frequently.
> >
> > The code logic is mainly in overcommit_policy_handler(), built on top of
> > the previous sync fix. Please help review it, thanks!
> >
> > int overcommit_policy_handler(struct ctl_table *table, int write, void *buffer,
> >                 size_t *lenp, loff_t *ppos)
> > {
> >         int ret;
> >
> >         if (write) {
> >                 int new_policy;
> >                 struct ctl_table t;
> >
> >                 t = *table;
> >                 t.data = &new_policy;
> >                 ret = proc_dointvec_minmax(&t, write, buffer, lenp, ppos);
> >                 if (ret)
> >                         return ret;
> >
> >                 mm_compute_batch(new_policy);
> >                 if (new_policy == OVERCOMMIT_NEVER)
> >                         schedule_on_each_cpu(sync_overcommit_as);
> >                 sysctl_overcommit_memory = new_policy;
> >         } else {
> >                 ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
> >         }
> >
> >         return ret;
> > }
>
> Rather than having to indent that many lines, how about this?

Thanks for the cleanup suggestion.

> t = *table;
> t.data = &new_policy;

The input table->data is actually &sysctl_overcommit_memory, so there
is a problem in the 'read' case: it would return the (uninitialized)
'new_policy' value instead of the real sysctl_overcommit_memory.

It should work after adding a check:

        if (write)
                t.data = &new_policy;

> ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
--> the 'table' here should be &t
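
Putting the two fixes together, the early-return version would look
roughly like this (an untested sketch, just to make the discussion
concrete):

int overcommit_policy_handler(struct ctl_table *table, int write, void *buffer,
                size_t *lenp, loff_t *ppos)
{
        struct ctl_table t;
        int new_policy;
        int ret;

        t = *table;
        if (write)
                t.data = &new_policy;   /* keep the read path on the real data */
        ret = proc_dointvec_minmax(&t, write, buffer, lenp, ppos);
        if (ret || !write)
                return ret;

        mm_compute_batch(new_policy);
        if (new_policy == OVERCOMMIT_NEVER)
                schedule_on_each_cpu(sync_overcommit_as);
        sysctl_overcommit_memory = new_policy;
        return ret;
}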

Thanks,
Feng

> if (ret || !write)
>         return ret;
>
> mm_compute_batch(new_policy);
> if (new_policy == OVERCOMMIT_NEVER)
>         schedule_on_each_cpu(sync_overcommit_as);
>
> sysctl_overcommit_memory = new_policy;
> return ret;

2020-07-10 01:39:36

by Feng Tang

[permalink] [raw]
Subject: Re: [mm] 4e2c82a409: ltp.overcommit_memory01.fail

On Thu, Jul 09, 2020 at 10:15:19PM +0800, Feng Tang wrote:
> Hi Qian Cai,
>
> On Thu, Jul 09, 2020 at 09:40:40AM -0400, Qian Cai wrote:
> > > > > Can we change the batch first, then sync the global counter, and
> > > > > finally change the overcommit policy?
> > > >
> > > > These reorderings are really head-scratching :)
> > > >
> > > > I've thought about this before when Qian Cai first reported the warning
> > > > message, as the kernel had a check:
> > > >
> > > > VM_WARN_ONCE(percpu_counter_read(&vm_committed_as) <
> > > >              -(s64)vm_committed_as_batch * num_online_cpus(),
> > > >              "memory commitment underflow");
> > > >
> > > > If the batch is decreased first, the warning would trigger more easily and
> > > > earlier, so I didn't bring this up when handling the warning message.
> > > >
> > > > But it might work now, as the warning has been removed.
> > >
> > > I tested the reordered approach, and the test passed in 100 runs. The
> > > new order when changing the policy to OVERCOMMIT_NEVER is:
> > > 1. re-compute the batch (to the smaller one)
> > > 2. do the on_each_cpu sync
> > > 3. actually change the policy to NEVER
> > >
> > > It solves one of the previous concerns: after the sync is done on cpuX,
> > > but before the sync has completed on all CPUs, there is a window in
> > > which the percpu counter could be enlarged again.
> > >
> > > IIRC Andi had a concern about the read-side cost of doing the sync; my
> > > understanding is that most of the readers (malloc/free/map/unmap) use
> > > percpu_counter_read_positive(), which is a fast path that takes no lock.
> > >
> > > As for the problem itself, I agree with Michal's point that there is
> > > usually no normal use case that changes the overcommit policy frequently.
> > >
> > > The code logic is mainly in overcommit_policy_handler(), built on top of
> > > the previous sync fix. Please help review it, thanks!
> > >
> > > int overcommit_policy_handler(struct ctl_table *table, int write, void *buffer,
> > >                 size_t *lenp, loff_t *ppos)
> > > {
> > >         int ret;
> > >
> > >         if (write) {
> > >                 int new_policy;
> > >                 struct ctl_table t;
> > >
> > >                 t = *table;
> > >                 t.data = &new_policy;
> > >                 ret = proc_dointvec_minmax(&t, write, buffer, lenp, ppos);
> > >                 if (ret)
> > >                         return ret;
> > >
> > >                 mm_compute_batch(new_policy);
> > >                 if (new_policy == OVERCOMMIT_NEVER)
> > >                         schedule_on_each_cpu(sync_overcommit_as);
> > >                 sysctl_overcommit_memory = new_policy;
> > >         } else {
> > >                 ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
> > >         }
> > >
> > >         return ret;
> > > }
> >
> > Rather than having to indent that many lines, how about this?
>
> Thanks for the cleanup suggestion.
>
> > t = *table;
> > t.data = &new_policy;
>
> The input table->data is actually &sysctl_overcommit_memory, so there
> is a problem in the 'read' case: it would return the (uninitialized)
> 'new_policy' value instead of the real sysctl_overcommit_memory.
>
> It should work after adding a check:
>
>         if (write)
>                 t.data = &new_policy;
>
> > ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
> --> the 'table' here should be &t

On second thought, my previous version has more indentation and lines,
but it makes it easier to see that there is special handling for the
'write' case. So I would prefer to keep it.

Thoughts?

Thanks,
Feng

> Thanks,
> Feng
>
> > if (ret || !write)
> >         return ret;
> >
> > mm_compute_batch(new_policy);
> > if (new_policy == OVERCOMMIT_NEVER)
> >         schedule_on_each_cpu(sync_overcommit_as);
> >
> > sysctl_overcommit_memory = new_policy;
> > return ret;

2020-07-10 03:26:58

by Qian Cai

[permalink] [raw]
Subject: Re: [mm] 4e2c82a409: ltp.overcommit_memory01.fail



> On Jul 9, 2020, at 9:38 PM, Feng Tang <[email protected]> wrote:
>
> On second thought, my previous version has more indentation and lines,
> but it makes it easier to see that there is special handling for the
> 'write' case. So I would prefer to keep it.
>
> Thoughts?

I don’t feel it is easier to understand. I generally prefer to bail out early where possible, which also makes the code a bit more robust for future extensions (once the indentation reaches 3+ levels, we will need to rework it).

But I realize that I have spent more time debugging than actually writing code these days, so my taste is probably not all that good. Thus, feel free to submit whichever style you prefer, so that other people with more coding experience can review it.