2015-12-08 14:58:55

by Andy Whitcroft

Subject: cgroup pids controller -- WARN_ON_ONCE triggering

The commit below attempts to fix up pid controller charging:

commit afcf6c8b75444382e0f9996157207ebae34a8848
Author: Tejun Heo <[email protected]>
Date: Thu Oct 15 16:41:53 2015 -0400

cgroup: add cgroup_subsys->free() method and use it to fix pids controller

Since this change we are seeing system hangs in early boot on multiple
architectures. We have a console log on ppc64el [1] which fingers
pids_cancel(). Manual debugging on amd64 VMs seems to indicate that we
are now tripping the WARN_ON_ONCE() below:

static void pids_cancel(struct pids_cgroup *pids, int num)
{
	/*
	 * A negative count (or overflow for that matter) is invalid,
	 * and indicates a bug in the `pids` controller proper.
	 */
	WARN_ON_ONCE(atomic64_add_negative(-num, &pids->counter));
}

Converting this to a printk I was able to obtain confirmation that we are
indeed seeing this go negative in some cases.

Reverting the above commit seems to resolve the early boot issues in
my testing.

-apw

[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1523586
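
To illustrate how the check in pids_cancel() can trip, here is a minimal userspace sketch. It is not the kernel code: it assumes C11 atomics as a stand-in for atomic64_t, and the names pids_model, add_negative(), model_charge(), model_cancel() and model_demo() are hypothetical. It only shows that cancelling more than was charged drives the counter below zero, which is the condition WARN_ON_ONCE() reports.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of the pids counter (not the kernel struct). */
struct pids_model {
	_Atomic int64_t counter;
};

/*
 * Stand-in for the kernel's atomic64_add_negative(): atomically add
 * delta and report whether the resulting value is negative.
 */
static bool add_negative(int64_t delta, _Atomic int64_t *v)
{
	/* atomic_fetch_add() returns the old value, so add delta again. */
	return atomic_fetch_add(v, delta) + delta < 0;
}

/* Model of a charge: account for num new tasks. */
static void model_charge(struct pids_model *p, int num)
{
	atomic_fetch_add(&p->counter, (int64_t)num);
}

/*
 * Model of pids_cancel(): returns true in the case where the kernel
 * version would hit WARN_ON_ONCE().
 */
static bool model_cancel(struct pids_model *p, int num)
{
	return add_negative(-(int64_t)num, &p->counter);
}

/* One charge followed by two cancels: the second cancel is unbalanced. */
static int64_t model_demo(void)
{
	struct pids_model p;

	atomic_init(&p.counter, 0);
	model_charge(&p, 1);
	assert(!model_cancel(&p, 1));	/* balanced, counter back to 0 */
	assert(model_cancel(&p, 1));	/* unbalanced, counter is now -1 */
	return atomic_load(&p.counter);
}
```

Calling model_demo() from a test program returns -1. Whether the failures reported above come from a double cancel or from a charge that was never made is not something this sketch can tell you.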


2015-12-08 15:24:37

by Tejun Heo

Subject: Re: cgroup pids controller -- WARN_ON_ONCE triggering

Hello, Andy.

On Tue, Dec 08, 2015 at 02:58:51PM +0000, Andy Whitcroft wrote:
> Converting this to a printk I was able to obtain confirmation that we are
> indeed seeing this go negative in some cases.
>
> Reverting the above commit seems to resolve the early boot issues in
> my testing.

Fix already queued in the following git branch. Pushing it out as we
speak.

git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git for-4.4-fixes

Thanks.

--
tejun

2015-12-08 17:05:25

by Andy Whitcroft

Subject: Re: cgroup pids controller -- WARN_ON_ONCE triggering

On Tue, Dec 08, 2015 at 10:24:28AM -0500, Tejun Heo wrote:
> Hello, Andy.
>
> On Tue, Dec 08, 2015 at 02:58:51PM +0000, Andy Whitcroft wrote:
> > Converting this to a printk I was able to obtain confirmation that we are
> > indeed seeing this go negative in some cases.
> >
> > Reverting the above commit seems to resolve the early boot issues in
> > my testing.
>
> Fix already queued in the following git branch. Pushing it out as we
> speak.
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git for-4.4-fixes
>
> Thanks.

Cool, thanks. That seems to fix me up here.

-apw