2015-12-24 17:12:59

by Jeremiah Mahler

Subject: BUG: Bad rss-counter state mm:ffff8800c5a96000 idx:3 val:3894

all,

I have started seeing a "Bad rss-counter" message in the logs with
the latest linux-next 20151222+.

[ 458.282192] BUG: Bad rss-counter state mm:ffff8800c5a96000 idx:3 val:3894

I can test patches if anyone has any ideas.

--
- Jeremiah Mahler


2015-12-29 16:23:51

by Michal Hocko

Subject: Re: BUG: Bad rss-counter state mm:ffff8800c5a96000 idx:3 val:3894

[CCing Andrew]

On Thu 24-12-15 09:12:53, Jeremiah Mahler wrote:
> all,
>
> I have started seeing a "Bad rss-counter" message in the logs with
> the latest linux-next 20151222+.
>
> [ 458.282192] BUG: Bad rss-counter state mm:ffff8800c5a96000 idx:3 val:3894

This is MM_SHMEMPAGES, so "unmapped" shmem memory. One possible
reason might be an uninitialized zap_details used from unmap_mapping_range
during truncate, introduced by "mm, oom: introduce oom reaper" from the
mmotm tree. There is a fix for this which is still pending:
http://lkml.kernel.org/r/1450487091-7822-1-git-send-email-sasha.levin%40oracle.com
--
Michal Hocko
SUSE Labs
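
For readers following the thread: the "Bad rss-counter" line is printed when an address space is torn down while one of its per-type page counters is still nonzero. The sketch below is a user-space model of that check, not kernel code. The struct name `mm_model` and the function `check_mm_model` are hypothetical; the counter order is assumed from the message above, where idx:3 is identified as MM_SHMEMPAGES.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Counter indices; the order is an assumption taken from the thread,
 * where idx:3 is identified as MM_SHMEMPAGES. */
enum {
	MM_FILEPAGES,	/* 0 */
	MM_ANONPAGES,	/* 1 */
	MM_SWAPENTS,	/* 2 */
	MM_SHMEMPAGES,	/* 3 */
	NR_MM_COUNTERS
};

/* Hypothetical stand-in for the rss counters kept per address space. */
struct mm_model {
	long count[NR_MM_COUNTERS];
};

/* Report every counter that is still nonzero at teardown.
 * Returns the number of bad counters found. */
int check_mm_model(const struct mm_model *mm)
{
	int bad = 0;

	assert(mm != NULL);
	for (int i = 0; i < NR_MM_COUNTERS; i++) {
		long x = mm->count[i];

		if (x != 0) {
			printf("BUG: Bad rss-counter state mm:%p idx:%d val:%ld\n",
			       (const void *)mm, i, x);
			bad++;
		}
	}
	return bad;
}
```

In this model, a `struct mm_model` whose `count[MM_SHMEMPAGES]` is 3894 produces one report with idx:3 and val:3894, the same shape as the line in the original report.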

2015-12-29 18:30:53

by Andrew Morton

Subject: Re: BUG: Bad rss-counter state mm:ffff8800c5a96000 idx:3 val:3894

On Tue, 29 Dec 2015 17:23:47 +0100 Michal Hocko <[email protected]> wrote:

> [CCing Andrew]
>
> On Thu 24-12-15 09:12:53, Jeremiah Mahler wrote:
> > all,
> >
> > I have started seeing a "Bad rss-counter" message in the logs with
> > the latest linux-next 20151222+.
> >
> > [ 458.282192] BUG: Bad rss-counter state mm:ffff8800c5a96000 idx:3 val:3894
>
> This is MM_SHMEMPAGES, so "unmapped" shmem memory. One possible
> reason might be an uninitialized zap_details used from unmap_mapping_range
> during truncate, introduced by "mm, oom: introduce oom reaper" from the
> mmotm tree. There is a fix for this which is still pending:
> http://lkml.kernel.org/r/1450487091-7822-1-git-send-email-sasha.levin%40oracle.com

That won't be getting fixed until linux-next returns from holidays.

This:

--- a/mm/memory.c~mm-oom-introduce-oom-reaper-fix-5-fix
+++ a/mm/memory.c
@@ -2415,7 +2415,7 @@ static inline void unmap_mapping_range_t
 void unmap_mapping_range(struct address_space *mapping,
 		loff_t const holebegin, loff_t const holelen, int even_cows)
 {
-	struct zap_details details;
+	struct zap_details details = { };
 	pgoff_t hba = holebegin >> PAGE_SHIFT;
 	pgoff_t hlen = (holelen + PAGE_SIZE - 1) >> PAGE_SHIFT;

_
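
To illustrate why the one-line change matters, here is a small user-space sketch, not the kernel code. It assumes the function assigns only some members of the structure, so any member added later (as the oom reaper series is said to have done) keeps stale stack contents unless the whole object is zeroed first. The struct `zap_details_model`, its field names, and both helper functions are hypothetical. Stale stack contents are simulated with memset so the example stays deterministic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the structure; field names are assumptions. */
struct zap_details_model {
	void *check_mapping;
	unsigned long first_index;
	unsigned long last_index;
	int ignore_dirty;	/* stands in for a newly added flag */
	int check_swap_entries;	/* stands in for a newly added flag */
};

/* Old pattern: only the long-standing members are assigned. The memset
 * simulates whatever bytes happened to be on the stack; reading truly
 * uninitialized memory would be undefined behaviour. */
struct zap_details_model setup_without_init(unsigned long first,
					    unsigned long last)
{
	struct zap_details_model d;

	memset(&d, 0xff, sizeof(d));	/* simulated stale stack contents */
	d.check_mapping = NULL;
	d.first_index = first;
	d.last_index = last;
	return d;			/* the new flags are still garbage */
}

/* Fixed pattern: zero the whole object first. The patch spells this
 * "= { }" (GNU C); "= { 0 }" is the portable equivalent used here. */
struct zap_details_model setup_with_init(unsigned long first,
					 unsigned long last)
{
	struct zap_details_model d = { 0 };

	d.first_index = first;
	d.last_index = last;
	return d;			/* the new flags are reliably zero */
}
```

With the first helper the two flag members come back nonzero, so code that tests them would take an unintended path; with the second they are always zero.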

2015-12-29 18:57:32

by Jeremiah Mahler

Subject: Re: BUG: Bad rss-counter state mm:ffff8800c5a96000 idx:3 val:3894

Andrew, Michal,

On Tue, Dec 29, 2015 at 10:30:37AM -0800, Andrew Morton wrote:
> On Tue, 29 Dec 2015 17:23:47 +0100 Michal Hocko <[email protected]> wrote:
>
> > [CCing Andrew]
> >
> > On Thu 24-12-15 09:12:53, Jeremiah Mahler wrote:
> > > all,
> > >
> > > I have started seeing a "Bad rss-counter" message in the logs with
> > > the latest linux-next 20151222+.
> > >
> > > [ 458.282192] BUG: Bad rss-counter state mm:ffff8800c5a96000 idx:3 val:3894
> >
> > This is MM_SHMEMPAGES, so "unmapped" shmem memory. One possible
> > reason might be an uninitialized zap_details used from unmap_mapping_range
> > during truncate, introduced by "mm, oom: introduce oom reaper" from the
> > mmotm tree. There is a fix for this which is still pending:
> > http://lkml.kernel.org/r/1450487091-7822-1-git-send-email-sasha.levin%40oracle.com
>
> That won't be getting fixed until linux-next returns from holidays.
>
> This:
>
> --- a/mm/memory.c~mm-oom-introduce-oom-reaper-fix-5-fix
> +++ a/mm/memory.c
> @@ -2415,7 +2415,7 @@ static inline void unmap_mapping_range_t
>  void unmap_mapping_range(struct address_space *mapping,
>  		loff_t const holebegin, loff_t const holelen, int even_cows)
>  {
> -	struct zap_details details;
> +	struct zap_details details = { };
>  	pgoff_t hba = holebegin >> PAGE_SHIFT;
>  	pgoff_t hlen = (holelen + PAGE_SIZE - 1) >> PAGE_SHIFT;
>
> _
>

I tested both the patch that Michal mentioned and the change that
Andrew provided. They both appear to fix the problem on my machine.

Thanks for the help :-)

--
- Jeremiah Mahler

2015-12-30 22:32:55

by Stephen Rothwell

Subject: Re: BUG: Bad rss-counter state mm:ffff8800c5a96000 idx:3 val:3894

Hi Andrew,

On Tue, 29 Dec 2015 10:30:37 -0800 Andrew Morton <[email protected]> wrote:
>
> On Tue, 29 Dec 2015 17:23:47 +0100 Michal Hocko <[email protected]> wrote:
>
> > [CCing Andrew]
> >
> > On Thu 24-12-15 09:12:53, Jeremiah Mahler wrote:
> > > all,
> > >
> > > I have started seeing a "Bad rss-counter" message in the logs with
> > > the latest linux-next 20151222+.
> > >
> > > [ 458.282192] BUG: Bad rss-counter state mm:ffff8800c5a96000 idx:3 val:3894
> >
> > This is MM_SHMEMPAGES, so "unmapped" shmem memory. One possible
> > reason might be an uninitialized zap_details used from unmap_mapping_range
> > during truncate, introduced by "mm, oom: introduce oom reaper" from the
> > mmotm tree. There is a fix for this which is still pending:
> > http://lkml.kernel.org/r/1450487091-7822-1-git-send-email-sasha.levin%40oracle.com
>
> That won't be getting fixed until linux-next returns from holidays.
>
> This:
>
> --- a/mm/memory.c~mm-oom-introduce-oom-reaper-fix-5-fix
> +++ a/mm/memory.c
> @@ -2415,7 +2415,7 @@ static inline void unmap_mapping_range_t
>  void unmap_mapping_range(struct address_space *mapping,
>  		loff_t const holebegin, loff_t const holelen, int even_cows)
>  {
> -	struct zap_details details;
> +	struct zap_details details = { };
>  	pgoff_t hba = holebegin >> PAGE_SHIFT;
>  	pgoff_t hlen = (holelen + PAGE_SIZE - 1) >> PAGE_SHIFT;
>

OK, I have added the following from mmots to linux-next today (though I
may not get a release done):

mm-oom-introduce-oom-reaper-fix-4-fix.patch
mm-oom-introduce-oom-reaper-fix-4.patch
mm-oom-introduce-oom-reaper-fix-5-fix.patch
mm-oom-introduce-oom-reaper-fix-5.patch
mm-oom-introduce-oom-reaper-fix-6.patch
mmoom-exclude-tif_memdie-processes-from-candidates.patch

--
Cheers,
Stephen Rothwell [email protected]