OOM Killer killing whole system

Subject: Re: OOM Killer killing whole system

Anton Titov <[email protected]> wrote:
>
> Yesterday I accidentally noticed a few OOM killer messages in the system log
> and left a console tailing the log overnight. At 6 in the morning the
> OOM killer went mad, generating 500 lines in the log, and 5 minutes later
> the system closed the ssh connection and became unresponsive. The guy in the
> datacenter told me that when he attached a keyboard even caps lock was not
> working. In spite of this the system still responded to ping (and only ping).
>
> The strange thing is that this machine is relatively lightly loaded - now,
> after 6 hours of uptime, free shows:
>              total       used       free     shared    buffers     cached
> Mem:       2075468    1148564     926904          0     123472     314516
> -/+ buffers/cache:     710576    1364892
> Swap:      1004020          0    1004020
>
> Load average stays under 0.5 most of the time. At 6 in the morning there
> should be almost no load (there are no cron jobs scheduled at that time).
>
> I'm attaching messages from the log and my .config.

What kernel version? <looks in config.gz>. 2.6.15.

> Jan 15 06:05:09 vip Normal free:3700kB min:3756kB low:4692kB high:5632kB active:9964kB inactive:8532kB present:901120kB pages_scanned:19628

Pretty much all of the ZONE_NORMAL memory is AWOL.
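
The numbers in the quoted zone report can be checked mechanically. A minimal sketch in Python (the helper name `parse_zone_line` is made up for illustration, not a kernel or library function) that extracts the kB fields from that line and compares them:

```python
import re

def parse_zone_line(line):
    """Return a dict of the 'key:<n>kB' fields found in an OOM zone report line."""
    return {key: int(val) for key, val in re.findall(r"(\w+):(\d+)kB", line)}

# The ZONE_NORMAL line from the log quoted above.
line = ("Normal free:3700kB min:3756kB low:4692kB high:5632kB "
        "active:9964kB inactive:8532kB present:901120kB pages_scanned:19628")

zone = parse_zone_line(line)
below_min = zone["free"] < zone["min"]                       # under the min watermark?
accounted = zone["free"] + zone["active"] + zone["inactive"]  # kB visible in these fields
print(below_min, accounted, zone["present"])  # prints: True 22196 901120
```

Free memory is below the min watermark, and free + active + inactive together cover only about 22 MB of the roughly 900 MB present in the zone.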

> Jan 15 06:05:09 vip 216477 pages slab

It's all in slab. 800MB.

I'd be suspecting a slab memory leak. If it happens again, please take a
copy of /proc/slabinfo, send it.
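
The 800MB figure follows from the page count in the log line. A rough check, assuming 4 KiB pages (an assumption: the log does not state the page size, though it is the usual value on 32-bit x86):

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages

def pages_to_mib(pages, page_size=PAGE_SIZE):
    """Convert a page count to MiB."""
    return pages * page_size / (1024 * 1024)

slab_mib = pages_to_mib(216477)   # "216477 pages slab" from the log
zone_mib = 901120 / 1024          # "present:901120kB" for ZONE_NORMAL
print(round(slab_mib), round(zone_mib))  # prints: 846 880
```

So slab accounts for about 846 MiB, which is close to the whole 880 MiB of ZONE_NORMAL.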

2006-01-20 20:04:09

Subject: Re: OOM Killer killing whole system

On Fri, 20 Jan 2006, Andrew Morton wrote:
>> Jan 15 06:05:09 vip 216477 pages slab
>
> It's all in slab. 800MB.
>
> I'd be suspecting a slab memory leak. If it happens again, please take a
> copy of /proc/slabinfo, send it.
>

Andrew & Anton,
I've experienced slab leaking in my system lately too. The culprit
was 1.5 million SCSI commands in the scsi command cache. I haven't had an
opportunity to look into it further yet; I'll try to copy you guys when I
do.

Thanks,
Chase

2006-01-20 21:48:11

Subject: Re: OOM Killer killing whole system

On Fri, 2006-01-20 at 14:04 -0600, Chase Venters wrote:
> On Fri, 20 Jan 2006, Andrew Morton wrote:
> >> Jan 15 06:05:09 vip 216477 pages slab
> >
> > It's all in slab. 800MB.
> >
> > I'd be suspecting a slab memory leak. If it happens again, please take a
> > copy of /proc/slabinfo, send it.
> >
>
> Andrew & Anton,
> The culprit was 1.5 million SCSI commands in the scsi command cache.
>
> Thanks,
> Chase

I currently have this:
scsi_cmd_cache 1458778 1458790 384 10 1 : tunables 54 27 8 : slabdata 145879 145879 0

in /proc/slabinfo, which is pretty close to 1.5 million. The system is
working fine, but it is not heavily loaded anyway, so a memory leak
will not show up early. I just checked: scsi_cmd_cache on my other
machines is under 100, so this does look like a problem.

Unfortunately, although I am a programmer, I don't know how to read
/proc/slabinfo, but I'm perfectly willing to provide a shell
(for Andrew or another well-known developer it may even be root) on
this machine.
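
For readers in the same position: in the slabinfo 2.x format, each data line is the cache name followed by active_objs, num_objs, objsize (bytes), objperslab and pagesperslab, then the tunables, then slabdata with active_slabs, num_slabs and sharedavail. A minimal parsing sketch in Python follows; the names `SlabCache` and `parse_slabinfo_line` are invented for illustration, and the 4 KiB page size is an assumption since /proc/slabinfo does not report it.

```python
from collections import namedtuple

PAGE_SIZE = 4096  # assumption: 4 KiB pages; /proc/slabinfo does not report this

SlabCache = namedtuple(
    "SlabCache",
    "name active_objs num_objs objsize objperslab pagesperslab active_slabs num_slabs",
)

def parse_slabinfo_line(line):
    """Parse one data line of /proc/slabinfo (format version 2.x)."""
    head, _tunables, slabdata = (part.split() for part in line.split(":"))
    return SlabCache(
        head[0],
        *map(int, head[1:6]),   # active_objs num_objs objsize objperslab pagesperslab
        int(slabdata[1]),       # active_slabs
        int(slabdata[2]),       # num_slabs
    )

# The line reported above, joined back into a single line.
line = ("scsi_cmd_cache 1458778 1458790 384 10 1 : tunables 54 27 8 "
        ": slabdata 145879 145879 0")
cache = parse_slabinfo_line(line)

object_bytes = cache.num_objs * cache.objsize                   # bytes in objects
slab_bytes = cache.num_slabs * cache.pagesperslab * PAGE_SIZE   # bytes in slab pages
print(cache.name, object_bytes // 2**20, slab_bytes // 2**20)
# prints: scsi_cmd_cache 534 569
```

Under those assumptions this one cache holds well over half a gigabyte, which is why a count of 1.5 million stands out against the "under 100" seen on healthy machines.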

I'm attaching /proc/slabinfo.

Thanks for the help,
Anton

Attachments:

slab.gz (1.82 kB)

2006-01-20 22:48:14

Subject: Re: OOM Killer killing whole system

Anton Titov <[email protected]> wrote:
>
> On Fri, 2006-01-20 at 14:04 -0600, Chase Venters wrote:
> > On Fri, 20 Jan 2006, Andrew Morton wrote:
> > >> Jan 15 06:05:09 vip 216477 pages slab
> > >
> > > It's all in slab. 800MB.
> > >
> > > I'd be suspecting a slab memory leak. If it happens again, please take a
> > > copy of /proc/slabinfo, send it.
> > >
> >
> > Andrew & Anton,
> > The culprit was 1.5 million SCSI commands in the scsi command cache.
> >
> > Thanks,
> > Chase
>
> I currently have this:
> scsi_cmd_cache 1458778 1458790 384 10 1 : tunables 54 27 8 : slabdata 145879 145879 0
>
> in /proc/slabinfo, which is pretty close to 1.5 million. The system is
> working fine but it should be not very loaded anyway, so a mem leakage
> will not show up early. Just checked, that scsi_cmd_cache on other
> machines of mine is under 100, so it seems like a problem.

That's great, thanks.

This is 2.6.15 and we have a deadly bug in scsi.

Next time you reboot 2.6.15 on that machine can you please send the output
of `dmesg -s 1000000'? You might have to set CONFIG_LOG_BUF_SHIFT=17 to
prevent it from being truncated.

2006-01-21 00:09:06

Subject: Re: OOM Killer killing whole system

On Fri, 2006-01-20 at 14:50 -0800, Andrew Morton wrote:
> That's great, thanks.
>
> This is 2.6.15 and we have a deadly bug in scsi.
>
> Next time you reboot 2.6.15 on that machine can you please send the output
> of `dmesg -s 1000000'? You might have to set CONFIG_LOG_BUF_SHIFT=17 to
> prevent it from being truncated.

Sure, here it is. I just rebooted, and it seems complete to me.

15 minutes after rebooting I have:

vip ~ # cat /proc/slabinfo | grep scsi
scsi_cmd_cache 6160 6160 384 10 1 : tunables 54 27 8 : slabdata 616 616 0
vip ~ # uptime
02:04:49 up 15 min, 1 user, load average: 0.16, 0.21, 0.19

Attachments:

dmesg.gz (4.23 kB)

2006-01-21 00:20:07

Subject: Re: OOM Killer killing whole system

On Friday 20 January 2006 16:49, Andrew Morton wrote:
> This is 2.6.15 and we have a deadly bug in scsi.
>
> Next time you reboot 2.6.15 on that machine can you please send the output
> of `dmesg -s 1000000'? You might have to set CONFIG_LOG_BUF_SHIFT=17 to
> prevent it from being truncated.

Here's mine (attached). Curious - the -s... were you expecting the ring buffer
to exceed 16384? I don't think my (boot time) buffer does.

Thanks,
Chase

Attachments:

(No filename) (449.00 B)
dmesg (30.45 kB)

2006-01-21 00:48:37

Subject: Re: OOM Killer killing whole system

Chase Venters <[email protected]> wrote:
>
> > Next time you reboot 2.6.15 on that machine can you please send the output
> > of `dmesg -s 1000000'? You might have to set CONFIG_LOG_BUF_SHIFT=17 to
> > prevent it from being truncated.
>
> Here's mine (attached).

Great, thanks. That tells us all sorts of stuff about your setup.

For linux-scsi reference, Chase's /proc/slabinfo says:

scsi_cmd_cache 1547440 1547440 384 10 1 : tunables 54 27 8 : slabdata 154744 154744 0

> Curious - the -s... were you expecting the ring buffer
> to exceed 16384?

It can sometimes be quite large. I always say -s 1000000 to make sure
everything got there.

> I don't think my (boot time) buffer does.

It's compile-time configurable with CONFIG_LOG_BUF_SHIFT and boot-time
configurable with log_buf_len=n.
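
The relationship between the shift and the buffer size is a power of two: the kernel log buffer is `1 << CONFIG_LOG_BUF_SHIFT` bytes. A small illustration in Python (the function name `log_buf_bytes` is made up for this note):

```python
def log_buf_bytes(shift):
    """Size in bytes of a kernel log buffer built with CONFIG_LOG_BUF_SHIFT=shift."""
    return 1 << shift

DMESG_REQUEST = 1000000  # the value passed to `dmesg -s` above

for shift in (14, 17):
    size = log_buf_bytes(shift)
    print(shift, size, size <= DMESG_REQUEST)
# prints:
# 14 16384 True
# 17 131072 True
```

The 16384 that Chase mentions corresponds to a shift of 14, the suggested shift of 17 gives 131072 bytes, and `-s 1000000` is comfortably larger than either, so dmesg itself does not truncate the output.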

2006-01-21 01:17:45

by James Bottomley

Subject: Re: OOM Killer killing whole system

On Fri, 2006-01-20 at 16:50 -0800, Andrew Morton wrote:
> For linux-scsi reference, Chase's /proc/slabinfo says:
>
> scsi_cmd_cache 1547440 1547440 384 10 1 : tunables 54 27 8 : slabdata 154744 154744 0

There's another curiosity about this: the linux command stack is pretty
well counted per scsi device (it's how we control queue depth), so if a
driver leaks commands we see it not by this type of behaviour, but by
the system hanging (waiting for all the commands the mid-layer thinks
are outstanding to return). So, the only way we could leak commands
like this is in the mid-layer command return logic ... and I can't find
anywhere this might happen.

The sequence is:

driver -> cmd->scsi_done() -> blk softirq -> scsi_softirq_done() ->
scsi_finish_cmd() (where the queue counts are decremented, so anything
after here could leak commands if the rest of the chain is broken) ->
cmd->done() (which is the ULD completion callback) ->
scsi_io_completion() (frees the sg table, so if the sgpool slabs aren't
out of whack we must be past here) -> scsi_end_request() ->
scsi_next_command() -> scsi_put_command() (which is where the command
goes back to the slab).
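
The argument above can be illustrated with a toy model. This is not kernel code: it is a Python sketch that only tracks two counters along the chain named in the message, the count the mid-layer believes is outstanding (decremented at scsi_finish_cmd) and the number of objects still held in scsi_cmd_cache (released at scsi_put_command). The function `complete` and its `stop_before` parameter are invented for illustration.

```python
# Step names follow the sequence described above; the model is an assumption.
COMPLETION_CHAIN = [
    "scsi_done", "scsi_softirq_done", "scsi_finish_cmd", "cmd_done",
    "scsi_io_completion", "scsi_end_request", "scsi_next_command",
    "scsi_put_command",
]

def complete(n_commands, stop_before=None):
    """Run n_commands through the chain, abandoning each one just before
    `stop_before` if given. Returns (outstanding, in_slab)."""
    outstanding = n_commands  # what the mid-layer thinks is in flight
    in_slab = n_commands      # objects allocated from scsi_cmd_cache
    for _ in range(n_commands):
        for step in COMPLETION_CHAIN:
            if step == stop_before:
                break                 # the rest of the chain never runs
            if step == "scsi_finish_cmd":
                outstanding -= 1      # queue counts decremented here
            elif step == "scsi_put_command":
                in_slab -= 1          # command goes back to the slab
    return outstanding, in_slab

print(complete(100))                                   # prints: (0, 0)
print(complete(100, stop_before="scsi_softirq_done"))  # prints: (100, 100)
print(complete(100, stop_before="scsi_end_request"))   # prints: (0, 100)
```

In this model a command lost before scsi_finish_cmd leaves the outstanding count stuck, which matches the hang described for driver leaks, while a break after it leaves nothing outstanding yet never returns the object, which matches a growing scsi_cmd_cache on a system that keeps running.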

James

> > Curious - the -s... were you expecting the ring buffer
> > to exceed 16384?
>
> It can sometimes be quite large. I always say -s 1000000 to make sure
> everything got there.
>
> > I don't think my (boot time) buffer does.
>
> It's compile-time configurable with CONFIG_LOG_BUF_SHIFT and boot-time
> configurable with log_buf_len=n.

2006-01-21 03:29:47

Subject: Re: OOM Killer killing whole system

On Fri, 2006-01-20 at 19:17 -0600, James Bottomley wrote:
> On Fri, 2006-01-20 at 16:50 -0800, Andrew Morton wrote:
> > For linux-scsi reference, Chase's /proc/slabinfo says:
> >
> > scsi_cmd_cache 1547440 1547440 384 10 1 : tunables 54 27 8 : slabdata 154744 154744 0
>
> There's another curiosity about this: the linux command stack is pretty
> well counted per scsi device (it's how we control queue depth), so if a
> driver leaks commands we see it not by this type of behaviour, but by
> the system hanging (waiting for all the commands the mid-layer thinks
> are outstanding to return). So, the only way we could leak commands
> like this is in the mid-layer command return logic ... and I can't find
> anywhere this might happen.
>

Just to mention that 2.6.14.2 does not have this problem:

vip ~ # cat /proc/slabinfo | grep scsi
scsi_cmd_cache 60 60 384 10 1 : tunables 54 27 8 : slabdata 6 6 27

but my guess is that the problem may not be in SCSI, as now (and
previously, actually) I have this:

vip ~ # cat /proc/slabinfo | grep reiser
reiser_inode_cache 556594 556614 408 9 1 : tunables 54 27 8 : slabdata 61846 61846 0

which seems too high as well.

2006-01-21 03:45:05

Subject: Re: OOM Killer killing whole system

Anton Titov <[email protected]> wrote:
>
> On Fri, 2006-01-20 at 19:17 -0600, James Bottomley wrote:
> > On Fri, 2006-01-20 at 16:50 -0800, Andrew Morton wrote:
> > > For linux-scsi reference, Chase's /proc/slabinfo says:
> > >
> > > scsi_cmd_cache 1547440 1547440 384 10 1 : tunables 54 27 8 : slabdata 154744 154744 0
> >
> > There's another curiosity about this: the linux command stack is pretty
> > well counted per scsi device (it's how we control queue depth), so if a
> > driver leaks commands we see it not by this type of behaviour, but by
> > the system hanging (waiting for all the commands the mid-layer thinks
> > are outstanding to return). So, the only way we could leak commands
> > like this is in the mid-layer command return logic ... and I can't find
> > anywhere this might happen.
> >
>
> Just to mention that 2.6.14.2 does not have this problem:
>
> vip ~ # cat /proc/slabinfo | grep scsi
> scsi_cmd_cache 60 60 384 10 1 : tunables 54 27 8 : slabdata 6 6 27
>
> but my guess is that the problem may not be in SCSI, as now (and
> previously, actually) I have this:
>
> vip ~ # cat /proc/slabinfo | grep reiser
> reiser_inode_cache 556594 556614 408 9 1 : tunables 54 27 8 : slabdata 61846 61846 0
>
> which seems too high as well.

Having large numbers of cached inodes is fairly common. Try running
something which uses lots of memory: memset(malloc(gigabytes)), or usemem
from http://www.zip.com.au/~akpm/linux/patches/stuff/ext3-tools.tar.gz or
read a multi-gigabyte file from disk and you should see the inode count wind
down.
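
For anyone without a C compiler or usemem at hand, a rough Python stand-in for the memset(malloc(gigabytes)) idea is sketched below. It is an assumption-laden substitute, not the usemem tool: the function name `touch_memory` is made up, and it writes one byte per 4 KiB page (page size assumed) so that every page of the buffer is actually backed by memory.

```python
def touch_memory(n_bytes, page_size=4096):
    """Allocate n_bytes and write to every page so the memory is really used."""
    buf = bytearray(n_bytes)
    for offset in range(0, n_bytes, page_size):
        buf[offset] = 1   # dirty one byte per page
    return buf            # caller keeps the reference to hold the memory

# Small demonstration size; pass a gigabyte-scale value to create real pressure.
buf = touch_memory(16 * 1024 * 1024)
print(len(buf))  # prints: 16777216
```

Choose the size with care: on a machine that is already short of memory, a large value can itself wake the OOM killer. Compare the inode cache line in /proc/slabinfo before and after the run.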

2006-01-21 03:45:17

Subject: Re: OOM Killer killing whole system

On Fri, 2006-01-20 at 19:17 -0600, James Bottomley wrote:
> There's another curiosity about this: the linux command stack is pretty
> well counted per scsi device (it's how we control queue depth), so if a
> driver leaks commands we see it not by this type of behaviour, but by
> the system hanging (waiting for all the commands the mid-layer thinks
> are outstanding to return). So, the only way we could leak commands
> like this is in the mid-layer command return logic ... and I can't find
> anywhere this might happen.

Additionally, I've looked at Chase's dmesg and we seem to use pretty
much the same motherboard (at least the Marvell NIC and ICH6 controller), so
could it be an ICH6 issue? Or sk98lin (I have another sk98lin-patched server,
which works well)?

Just in case, here is lspci:

00:00.0 Host bridge: Intel Corporation 915G/P/GV/GL/PL/910GL Processor to I/O Controller (rev 0e)
00:02.0 VGA compatible controller: Intel Corporation 82915G/GV/910GL Express Chipset Family Graphics Controller (rev 0e)
00:02.1 Display controller: Intel Corporation 82915G Express Chipset Family Graphics Controller (rev 0e)
00:1c.0 PCI bridge: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) PCI Express Port 1 (rev 03)
00:1c.1 PCI bridge: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) PCI Express Port 2 (rev 03)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev d3)
00:1f.0 ISA bridge: Intel Corporation 82801FB/FR (ICH6/ICH6R) LPC Interface Bridge (rev 03)
00:1f.1 IDE interface: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) IDE Controller (rev 03)
00:1f.2 SATA controller: Intel Corporation 82801FR/FRW (ICH6R/ICH6RW) SATA Controller (rev 03)
00:1f.3 SMBus: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) SMBus Controller (rev 03)
01:04.0 Mass storage controller: <pci_lookup_name: buffer too small> (rev 13)
02:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 PCI-E Gigabit Ethernet Controller (rev 15)

2006-01-21 03:53:33

Subject: Re: OOM Killer killing whole system

On Friday 20 January 2006 21:44, Anton Titov wrote:
> 00:00.0 Host bridge: Intel Corporation 915G/P/GV/GL/PL/910GL Processor
> to I/O Controller (rev 0e)
> 00:02.0 VGA compatible controller: Intel Corporation 82915G/GV/910GL
> Express Chipset Family Graphics Controller (rev 0e)
> 00:02.1 Display controller: Intel Corporation 82915G Express Chipset
> Family Graphics Controller (rev 0e)
> 00:1c.0 PCI bridge: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6
> Family) PCI Express Port 1 (rev 03)
> 00:1c.1 PCI bridge: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6
> Family) PCI Express Port 2 (rev 03)
> 00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev d3)
> 00:1f.0 ISA bridge: Intel Corporation 82801FB/FR (ICH6/ICH6R) LPC
> Interface Bridge (rev 03)
> 00:1f.1 IDE interface: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6
> Family) IDE Controller (rev 03)
> 00:1f.2 SATA controller: Intel Corporation 82801FR/FRW (ICH6R/ICH6RW)
> SATA Controller (rev 03)
> 00:1f.3 SMBus: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family)
> SMBus Controller (rev 03)
> 01:04.0 Mass storage controller: <pci_lookup_name: buffer too small>
> (rev 13)
> 02:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 PCI-E
> Gigabit Ethernet Controller (rev 15)

Random guess... Asus P5GDC-V with Firewire and USB turned off?

00:00.0 Host bridge: Intel Corporation 915G/P/GV/GL/PL/910GL Processor to I/O Controller (rev 04)
00:01.0 PCI bridge: Intel Corporation 915G/P/GV/GL/PL/910GL PCI Express Root Port (rev 04)
00:1b.0 Class 0403: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) High Definition Audio Controller (rev 03)
00:1c.0 PCI bridge: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) PCI Express Port 1 (rev 03)
00:1c.1 PCI bridge: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) PCI Express Port 2 (rev 03)
00:1d.0 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #1 (rev 03)
00:1d.1 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #2 (rev 03)
00:1d.2 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #3 (rev 03)
00:1d.3 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #4 (rev 03)
00:1d.7 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB2 EHCI Controller (rev 03)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev d3)
00:1f.0 ISA bridge: Intel Corporation 82801FB/FR (ICH6/ICH6R) LPC Interface Bridge (rev 03)
00:1f.1 IDE interface: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) IDE Controller (rev 03)
00:1f.2 Class 0106: Intel Corporation 82801FR/FRW (ICH6R/ICH6RW) SATA Controller (rev 03)
00:1f.3 SMBus: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) SMBus Controller (rev 03)
01:03.0 FireWire (IEEE 1394): Texas Instruments TSB43AB22/A IEEE-1394a-2000 Controller (PHY/Link)
01:04.0 Mass storage controller: <pci_lookup_name: buffer too small> (rev 13)
01:09.0 Multimedia audio controller: Creative Labs SB Audigy (rev 04)
01:09.1 Input device controller: Creative Labs SB Audigy MIDI/Game port (rev 04)
01:09.2 FireWire (IEEE 1394): Creative Labs SB Audigy FireWire Port (rev 04)
01:0a.0 SCSI storage controller: Adaptec AHA-7850 (rev 03)
02:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 Gigabit Ethernet Controller (rev 15)
04:00.0 VGA compatible controller: nVidia Corporation Unknown device 0092 (rev a1)

Also using Marvell's sk98lin driver (iirc, sky2 should supersede it soon
enough). This is the only machine I'm using sk98lin on, but I haven't had any
trouble with it on prior kernels.

Thanks,
Chase

2006-01-21 04:21:34

Subject: Re: OOM Killer killing whole system

On Fri, 2006-01-20 at 21:53 -0600, Chase Venters wrote:
> Random guess... Asus P5GDC-V with Firewire and USB turned off?

Exactly (Asus P5GDC-V Deluxe actually, with a few more things off). So
maybe it's ICH6?

2006-01-21 04:35:58

Subject: Re: OOM Killer killing whole system

On Friday 20 January 2006 22:21, Anton Titov wrote:
> On Fri, 2006-01-20 at 21:53 -0600, Chase Venters wrote:
> > Random guess... Asus P5GDC-V with Firewire and USB turned off?
>
> Exactly (Asus P5GDC-V Deluxe actually, with few more things off). So
> maybe it's ICH6?

Just a shot in the dark, but in the last few kernel revisions have you
experienced any SATA problems with DMA timeouts, in some versions leading to
a hang?

Cheers,
Chase

2006-01-23 02:57:36

by Kalin KOZHUHAROV

Subject: Re: OOM Killer killing whole system

Chase Venters wrote:
> On Friday 20 January 2006 22:21, Anton Titov wrote:
>> On Fri, 2006-01-20 at 21:53 -0600, Chase Venters wrote:
>>> Random guess... Asus P5GDC-V with Firewire and USB turned off?
>> Exactly (Asus P5GDC-V Deluxe actually, with few more things off). So
>> maybe it's ICH6?
>
> Just a shot in the dark, but in the last few kernel revisions have you
> experienced any SATA problems with DMA timeouts, in some versions leading to
> a hang?

I have two of these boards and one of them is constantly hanging, just
simply dead. With 2.6.15 it reports failed I/O (SATA here) and mounts
reiserfs root RO. sky2 works for me, but I had another hang, so sk98lin
might not be the culprit.

The other box (the difference is the SATA drive and the CD) is working OK,
almost 4d uptime now.

Will try to revive the black machine and report more.
(the bad machine is called black and the good is called white :-)

Kalin.

--
|[ ~~~~~~~~~~~~~~~~~~~~~~ ]|
+-> http://ThinRope.net/ <-+
|[ ______________________ ]|

2006-01-23 03:29:19

Subject: Re: OOM Killer killing whole system

On Sunday 22 January 2006 20:56, Kalin KOZHUHAROV wrote:
> I have two of these boards and one of them is constantly hanging, just
> simply dead. With 2.6.15 it reports failed I/O (SATA here) and mounts
> reiserfs root RO. sky2 works for me, but I had another hang, so sk98lin
> might not be the culprit.

Really? I had serious problems with mine hanging in earlier kernel revisions.
I haven't seen a hang yet on 2.6.15, but that may be because I've not made it
to a longer uptime because of the scsi leak.

When I hang I get complaints about DMA timeouts / weird ATA port statuses as
the last messages on my serial console. After that, not even SysRQ works.

Cheers,
Chase

2006-01-23 08:37:37

by Jens Axboe

Subject: Re: OOM Killer killing whole system

On Fri, Jan 20 2006, Chase Venters wrote:
> On Friday 20 January 2006 16:49, Andrew Morton wrote:
> > This is 2.6.15 and we have a deadly bug in scsi.
> >
> > Next time you reboot 2.6.15 on that machine can you please send the output
> > of `dmesg -s 1000000'? You might have to set CONFIG_LOG_BUF_SHIFT=17 to
> > prevent it from being truncated.
>
> Here's mine (attached). Curious - the -s... were you expecting the
> ring buffer to exceed 16384? I don't think my (boot time) buffer does.

Just a note - you seem to have the raid1 in common with the rest of the
reporters so far.

--
Jens Axboe

2006-01-23 08:42:09

by Arjan van de Ven

Subject: Re: OOM Killer killing whole system

On Mon, 2006-01-23 at 09:39 +0100, Jens Axboe wrote:
> On Fri, Jan 20 2006, Chase Venters wrote:
> > On Friday 20 January 2006 16:49, Andrew Morton wrote:
> > > This is 2.6.15 and we have a deadly bug in scsi.
> > >
> > > Next time you reboot 2.6.15 on that machine can you please send the output
> > > of `dmesg -s 1000000'? You might have to set CONFIG_LOG_BUF_SHIFT=17 to
> > > prevent it from being truncated.
> >
> > Here's mine (attached). Curious - the -s... were you expecting the
> > ring buffer to exceed 16384? I don't think my (boot time) buffer does.
>
> Just a note - you seem to have the raid1 in common with the rest of the
> reporters so far.

Time to get out some of the obvious heavy hitters: enable slab debugging
and CONFIG_DEBUG_PAGEALLOC, just on the chance of catching a random
scribble of some sort.

2006-01-23 09:11:03