2015-04-25 15:56:36

by Joshua Kinard

Subject: MIPS: BUG() in isolate_lru_pages in mm/vmscan.c?

I keep tripping a BUG() in isolate_lru_pages in mm/vmscan.c:1345:

switch (__isolate_lru_page(page, mode)) {
case 0:
	nr_pages = hpage_nr_pages(page);
	mem_cgroup_update_lru_size(lruvec, lru, -nr_pages);
	list_move(&page->lru, dst);
	nr_taken += nr_pages;
	break;

case -EBUSY:
	/* else it is being freed elsewhere */
	list_move(&page->lru, src);
	continue;

default:
	BUG();
}

This is on an SGI Onyx2 platform (MIPS, IP27), two node boards (4x R14000
CPUs), and 8G of RAM. The problem appears tied to heavy disk I/O, typically
writes. I can reproduce sometimes with a long bonnie++ run, but I haven't
gotten a recent panic() message under 4.0 yet. Most of the time, it silently
hardlocks. I only have serial console access at 9600bps, so it may lock up
before the serial driver can finish dumping the panic.
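
A rough estimate of how long a 9600bps console needs to emit a panic dump can
be sketched as below. It assumes 8N1 framing (10 bits on the wire per byte)
and a dump of about 2 KB; both figures are assumptions, not measurements from
this machine, and serial_dump_seconds() is a hypothetical helper name.

```c
/*
 * Rough transmit-time estimate for a serial console dump.
 * bits_per_char = 10 models 8N1 framing (start + 8 data + stop); this is
 * an assumption about the line settings, not something stated above.
 */
static double serial_dump_seconds(unsigned int bytes, unsigned int baud,
				  unsigned int bits_per_char)
{
	return (double)bytes * bits_per_char / baud;
}
```

Under those assumptions, serial_dump_seconds(2048, 9600, 10) comes to a little
over two seconds, which leaves plenty of room for a hard lock to cut the
output short.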

Is there any information behind the purpose or triggers of this BUG()? I went
back in git all the way to the initial 2006 commit that added this function,
but could not find any comments or explanation of just what it's protecting
against. That makes it hard to know where to start debugging.

I've already tried switching filesystems, first ext4, now XFS. Enabling
CONFIG_NUMA seems to make it harder to trigger, but that's not an objective
observation. An md RAID resync doesn't appear to trigger it either.

Help?


2015-04-25 18:55:52

by Joshua Kinard

Subject: Re: MIPS: BUG() in isolate_lru_pages in mm/vmscan.c?

On 04/25/2015 11:56, Joshua Kinard wrote:
> I keep tripping a BUG() in isolate_lru_pages in mm/vmscan.c:1345:
>
> switch (__isolate_lru_page(page, mode)) {
> case 0:
> 	nr_pages = hpage_nr_pages(page);
> 	mem_cgroup_update_lru_size(lruvec, lru, -nr_pages);
> 	list_move(&page->lru, dst);
> 	nr_taken += nr_pages;
> 	break;
>
> case -EBUSY:
> 	/* else it is being freed elsewhere */
> 	list_move(&page->lru, src);
> 	continue;
>
> default:
> 	BUG();
> }
>
> This is on an SGI Onyx2 platform (MIPS, IP27), two node boards (4x R14000
> CPUs), and 8G of RAM. The problem appears tied to heavy disk I/O, typically
> writes. I can reproduce sometimes with a long bonnie++ run, but I haven't
> gotten a recent panic() message under 4.0 yet. Most of the time, it silently
> hardlocks. I only have serial console access at 9600bps, so it may lock up
> before the serial driver can finish dumping the panic.
>
> Is there any information behind the purpose or triggers of this BUG()? I went
> back in git all the way to the initial 2006 commit that added this function,
> but could not find any comments or explanation of just what it's protecting
> against. That makes it hard to know where to start debugging.
>
> I've already tried switching filesystems, first ext4, now XFS. Enabling
> CONFIG_NUMA seems to make it harder to trigger, but that's not an objective
> observation. An md RAID resync doesn't appear to trigger it either.


This patch seems to explain things a little bit (from 2007-03-16):
http://marc.info/?l=linux-mm-commits&m=117401513810763&w=2

> Subject: lumpy: back out removal of active check in isolate_lru_pages
> From: Andy Whitcroft <[email protected]>
>
> As pointed out by Christoph Lameter it should not be possible for a page to
> change its active/inactive state without taking the lru_lock. Reinstate this
> safety net.
>
> Signed-off-by: Andy Whitcroft <[email protected]>
> Acked-by: Mel Gorman <[email protected]>
> Signed-off-by: Andrew Morton <[email protected]>
> ---
>
> mm/vmscan.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff -puN mm/vmscan.c~lumpy-back-out-removal-of-active-check-in-isolate_lru_pages mm/vmscan.c
> --- a/mm/vmscan.c~lumpy-back-out-removal-of-active-check-in-isolate_lru_pages
> +++ a/mm/vmscan.c
> @@ -686,10 +686,13 @@ static unsigned long isolate_lru_pages(u
>  			nr_taken++;
>  			break;
>
> -		default:
> -			/* page is being freed, or is a missmatch */
> +		case -EBUSY:
> +			/* else it is being freed elsewhere */
>  			list_move(&page->lru, src);
>  			continue;
> +
> +		default:
> +			BUG();
>  		}
>
>  		if (!order)

So if my reading is correct, the BUG() is being triggered because a page might
be changing its active/inactive state without taking the lru_lock. Given that the
SGI IP27 platform is an early NUMA machine and nodes can have a bit of physical
distance between them (thus some latency), could this be a sign of some kind of
SMP race condition specific to this platform?
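
To make the three outcomes of that switch concrete, here is a minimal
userspace sketch. It assumes __isolate_lru_page() in 4.0 returns -EINVAL for a
page that is not flagged as on the LRU (or is unevictable), -EBUSY when the
page is already being freed, and 0 on success; that reading should be checked
against the actual source. struct model_page and the model_* names are
hypothetical, and the isolate mode flags are left out entirely.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the state the check reads. */
struct model_page {
	bool on_lru;		/* models the PG_lru flag */
	bool unevictable;	/* models the PG_unevictable flag */
	int refcount;		/* models the page reference count */
};

enum model_result { MODEL_TAKEN, MODEL_BUSY, MODEL_BUG };

/* Simplified model of __isolate_lru_page(); mode handling is omitted. */
static int model_isolate_lru_page(struct model_page *page)
{
	/* A page found on an LRU list is expected to carry the LRU flag. */
	if (!page->on_lru)
		return -EINVAL;
	/* Assumes the caller did not ask for unevictable pages. */
	if (page->unevictable)
		return -EINVAL;
	/* Refcount already zero: the page is being freed elsewhere. */
	if (page->refcount == 0)
		return -EBUSY;
	page->refcount++;
	page->on_lru = false;
	return 0;
}

/* Mirrors the switch in isolate_lru_pages(), reporting instead of BUG(). */
static enum model_result model_classify(struct model_page *page)
{
	switch (model_isolate_lru_page(page)) {
	case 0:
		return MODEL_TAKEN;
	case -EBUSY:
		return MODEL_BUSY;
	default:
		return MODEL_BUG;
	}
}
```

In this model the default branch is reached only when the page's flags
disagree with the list it was found on, which is the kind of inconsistency a
missed lru_lock or a platform-specific race could produce.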

--J