2007-11-25 22:02:53

by Josh Goldsmith

Subject: Small System Paging Problem - OOM-killer goes nuts

Hi,

I have a Linksys NSLU2 running 2.6.21 (I can replicate the problem on
2.6.23, but that kernel isn't fully supported on SlugOS). It is an armv5teb
device with 32MB of RAM and 400+ MB of swap on its 160GB USB2 root disk. The
machine is used as a fileserver and to build packages for other ARM devices.
It may be underpowered by today's standards, but it is a whole lot faster than
my first Linux system (386sx20 with 4MB RAM), and the whole system with disk
uses <8 watts and is silent.

The problem comes when I try to untar a large file (in this case
linux-2.6.23.tar.bz2). Even if I kill off every other process, the
oom-killer eventually appears and kills either tar or the shell.
I've tried every tuning option that I and my buddy Google could find,
including /proc/sys/vm/overcommit*, with no success. I'm not worried about
paging impacting performance.

I'd appreciate any help, pointers, or gentle taps with the cluebat.

-Josh

Error output to console: http://www.pastebin.ca/797155

config -> http://www.pastebin.ca/797206

slug2>$ uname -a
Linux slug2 2.6.21 #1 PREEMPT Fri Nov 9 11:54:06 MST 2007 armv5teb unknown

slug2:~$ free
             total       used       free     shared    buffers     cached
Mem:         30352      29124       1228          0      10196       9468
-/+ buffers/cache:       9460      20892
Swap:       465876          0     465876
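
A note for readers of the free output above: the "-/+ buffers/cache" row is derived from the "Mem:" row by treating buffers and page cache as reclaimable. The short Python sketch below redoes that arithmetic with the numbers from this post. The function name is made up for illustration and is not part of any tool.

```python
def buffers_cache_row(used_kb, free_kb, buffers_kb, cached_kb):
    """Return (used, free) as shown on free's '-/+ buffers/cache' row.

    Buffers and page cache are counted as reclaimable, so they are
    subtracted from 'used' and added to 'free'.
    """
    reclaimable = buffers_kb + cached_kb
    return used_kb - reclaimable, free_kb + reclaimable


# Values in kB, copied from the Mem: row of the post.
used, free = buffers_cache_row(
    used_kb=29124, free_kb=1228, buffers_kb=10196, cached_kb=9468
)
print(used, free)  # 9460 20892
```

In other words, roughly two thirds of the RAM reported as used at that moment was buffers and cache rather than process memory, and no swap was in use.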

cat /proc/swaps
Filename                                Type            Size    Used    Priority
/dev/sda4                               partition       465876  0       -1

slug2:~$ lsmod
Module Size Used by
nfsd 186556 8
exportfs 4320 1 nfsd
lockd 51416 2 nfsd
sunrpc 131952 2 nfsd,lockd
reiserfs 255380 1
ixp4xx_mac 14644 0
ixp4xx_qmgr 5388 5 ixp4xx_mac
mii 3424 1 ixp4xx_mac
ext3 110472 2
jbd 47784 1 ext3
mbcache 5604 1 ext3
ohci_hcd 16804 0
ehci_hcd 30252 0

slug2>$ dmesg
<5>Linux version 2.6.21 (josh@styx) (gcc version 4.1.1) #1 PREEMPT Fri Nov 9
11:54:06 MST 2007
<4>CPU: XScale-IXP42x Family [690541f1] revision 1 (ARMv5TE), cr=000039ff
<4>Machine: Linksys NSLU2
<4>Memory policy: ECC disabled, Data cache writeback
<7>On node 0 totalpages: 8192
<7> DMA zone: 64 pages used for memmap
<7> DMA zone: 0 pages reserved
<7> DMA zone: 8128 pages, LIFO batch:0
<7> Normal zone: 0 pages used for memmap
<4>CPU0: D VIVT undefined 5 cache
<4>CPU0: I cache: 32768 bytes, associativity 32, 32 byte lines, 32 sets
<4>CPU0: D cache: 32768 bytes, associativity 32, 32 byte lines, 32 sets
<4>Built 1 zonelists. Total pages: 8128
<5>Kernel command line: rtc-x1205.probe=0,0x6f console=ttyS0,115200n8
root=/dev/mtdblock4 rootfstype=jffs2 rw init=/linuxrc noirqdebug
<6>IRQ lockup detection disabled
<4>PID hash table entries: 128 (order: 7, 512 bytes)
<4>Dentry cache hash table entries: 4096 (order: 2, 16384 bytes)
<4>Inode-cache hash table entries: 2048 (order: 1, 8192 bytes)
<6>Memory: 32MB = 32MB total
<5>Memory: 30268KB available (1940K code, 154K data, 84K init)
<7>Calibrating delay loop... 266.24 BogoMIPS (lpj=1331200)
<4>Mount-cache hash table entries: 512
<6>CPU: Testing write buffer coherency: ok
<6>NET: Registered protocol family 16
<4>IXP4xx: Using 16MiB expansion bus window size
<4>PCI: IXP4xx is host
<4>PCI: IXP4xx Using direct access for memory space
<6>PCI: bus0: Fast back to back transfers disabled
<6>dmabounce: registered device 0000:00:01.0 on pci bus
<6>dmabounce: registered device 0000:00:01.1 on pci bus
<6>dmabounce: registered device 0000:00:01.2 on pci bus
<5>SCSI subsystem initialized
<6>usbcore: registered new interface driver usbfs
<6>usbcore: registered new interface driver hub
<6>usbcore: registered new device driver usb
<6>Time: OSTS clocksource has been installed.
<6>NET: Registered protocol family 2
<4>IP route cache hash table entries: 1024 (order: 0, 4096 bytes)
<4>TCP established hash table entries: 1024 (order: 1, 8192 bytes)
<4>TCP bind hash table entries: 1024 (order: 0, 4096 bytes)
<6>TCP: Hash tables configured (established 1024 bind 1024)
<6>TCP reno registered
<4>NetWinder Floating Point Emulator V0.97 (double precision)
<6>JFFS2 version 2.2. (NAND) (C) 2001-2006 Red Hat, Inc.
<6>io scheduler noop registered
<6>io scheduler deadline registered (default)
<6>Serial: 8250/16550 driver $Revision: 1.90 $ 2 ports, IRQ sharing disabled
<6>serial8250.0: ttyS0 at MMIO 0xc8000000 (irq = 15) is a XScale
<6>serial8250.0: ttyS1 at MMIO 0xc8001000 (irq = 13) is a XScale
<4>RAMDISK driver initialized: 4 RAM disks of 10240K size 1024 blocksize
<6>IXP4XX NPE driver Version 0.3.0 initialized
<6>NFTL driver: nftlcore.c $Revision: 1.98 $, nftlmount.c $Revision: 1.41 $
<6>IXP4XX-Flash.0: Found 1 x16 devices at 0x0 in 16-bit bank
<7>IXP4XX-Flash.0: Found an alias at 0x800000 for the chip at 0x0
<4> Intel/Sharp Extended Query Table at 0x0031
<6>Using buffer write method
<5>cfi_cmdset_0001: Erase suspend on write enabled
<7>erase region 0: offset=0x0,size=0x20000,blocks=64
<5>Searching for RedBoot partition table in IXP4XX-Flash.0 at offset
0x7e0000
<5>6 RedBoot partitions found on MTD device IXP4XX-Flash.0
<5>Creating 6 MTD partitions on "IXP4XX-Flash.0":
<5>0x00000000-0x00040000 : "RedBoot"
<6>nslu2 mac: 00:0f:66:7c:14:23
<5>0x00040000-0x00060000 : "SysConf"
<5>0x00060000-0x00080000 : "Loader"
<5>0x00080000-0x00180000 : "Kernel"
<5>0x00180000-0x007e0000 : "Flashdisk"
<5>0x007e0000-0x00800000 : "FIS directory"
<4>npe: searching for firmware...
<6>npe: found at 0x1c010, IXP425/NPE-B func: 00, rev: 2.1, size: 11964, id:
01000201
<6>Image loaded to NPE-B Func:0, Rel: 2:1, Status: 80800000
<6>Initializing USB Mass Storage driver...
<6>usbcore: registered new interface driver usb-storage
<6>USB Mass Storage support registered.
<6>input: ixp4xx beeper as /class/input/input0
<6>i2c /dev entries driver
<6>x1205 0-006f: chip found, driver version 1.0.7
<6>x1205 0-006f: rtc core: registered x1205 as rtc0
<6>Registered led device: ready
<6>Registered led device: status
<6>Registered led device: disk-1
<6>Registered led device: disk-2
<6>TCP cubic registered
<6>NET: Registered protocol family 1
<6>NET: Registered protocol family 17
<6>XScale DSP coprocessor detected.
<6>x1205 0-006f: setting the system clock to 2007-11-25 20:35:07
(1196022907)
<4>VFS: Mounted root (jffs2 filesystem).
<6>Freeing init memory: 84K
<4>PCI: enabling device 0000:00:01.2 (0140 -> 0142)
<6>ehci_hcd 0000:00:01.2: EHCI Host Controller
<6>ehci_hcd 0000:00:01.2: new USB bus registered, assigned bus number 1
<6>ehci_hcd 0000:00:01.2: irq 26, io mem 0x48002000
<6>ehci_hcd 0000:00:01.2: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
<6>usb usb1: configuration #1 chosen from 1 choice
<6>hub 1-0:1.0: USB hub found
<6>hub 1-0:1.0: 5 ports detected
<7>ohci_hcd: 2006 August 04 USB 1.1 'Open' Host Controller (OHCI) Driver
<4>PCI: enabling device 0000:00:01.0 (0140 -> 0142)
<6>ohci_hcd 0000:00:01.0: OHCI Host Controller
<6>ohci_hcd 0000:00:01.0: new USB bus registered, assigned bus number 2
<6>ohci_hcd 0000:00:01.0: irq 28, io mem 0x48000000
<6>usb usb2: configuration #1 chosen from 1 choice
<6>hub 2-0:1.0: USB hub found
<6>hub 2-0:1.0: 3 ports detected
<4>PCI: enabling device 0000:00:01.1 (0140 -> 0142)
<6>ohci_hcd 0000:00:01.1: OHCI Host Controller
<6>usb 1-1: new high speed USB device using ehci_hcd and address 2
<6>ohci_hcd 0000:00:01.1: new USB bus registered, assigned bus number 3
<6>ohci_hcd 0000:00:01.1: irq 27, io mem 0x48001000
<6>usb usb3: configuration #1 chosen from 1 choice
<6>hub 3-0:1.0: USB hub found
<6>hub 3-0:1.0: 2 ports detected
<6>usb 1-1: configuration #1 chosen from 1 choice
<6>scsi0 : SCSI emulation for USB Mass Storage devices
<7>usb-storage: device found at 2
<7>usb-storage: waiting for device to settle before scanning
<5>scsi 0:0:0:0: Direct-Access Maxtor 3200 0344 PQ: 0
ANSI: 4
<5>SCSI device sda: 312581808 512-byte hdwr sectors (160042 MB)
<5>sda: Write Protect is off
<7>sda: Mode Sense: 17 00 00 00
<3>sda: assuming drive cache: write through
<5>SCSI device sda: 312581808 512-byte hdwr sectors (160042 MB)
<5>sda: Write Protect is off
<7>sda: Mode Sense: 17 00 00 00
<3>sda: assuming drive cache: write through
<6> sda: sda1 sda2 sda3 sda4
<5>sd 0:0:0:0: Attached scsi disk sda
<7>usb-storage: device scan complete
<6>kjournald starting. Commit interval 60 seconds
<6>EXT3 FS on sda1, internal journal
<6>EXT3-fs: recovery complete.
<6>EXT3-fs: mounted filesystem with journal data mode.
<6>IXP4XX Q Manager 0.2.1 initialized.
<6>ixp4xx_mac driver 0.3.1: eth0 on NPE-B with PHY[1] initialized
<5>ReiserFS: sda3: found reiserfs format "3.6" with standard journal
<5>ReiserFS: sda3: using ordered data mode
<5>ReiserFS: sda3: journal params: device sda3, size 8192, journal first
block 18, max trans len 1024, max batch 900, max commit age 30, max trans
age 30
<5>ReiserFS: sda3: checking transaction log (sda3)
<6>kjournald starting. Commit interval 5 seconds
<4>EXT3-fs warning: maximal mount count reached, running e2fsck is
recommended
<6>EXT3 FS on sda2, internal journal
<6>EXT3-fs: recovery complete.
<6>EXT3-fs: mounted filesystem with journal data mode.
<5>ReiserFS: sda3: replayed 237 transactions in 21 seconds
<5>ReiserFS: sda3: Using r5 hash to sort names
<6>Adding 465876k swap on /dev/sda4. Priority:-1 extents:1 across:465876k
<6>eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
<7>Link of eth0 is full-duplex
<6>Installing knfsd (copyright (C) 1996 [email protected]).
<4>NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
<4>NFSD: starting 90-second grace period


2007-11-25 22:56:05

by Mikael Pettersson

Subject: Re: Small System Paging Problem - OOM-killer goes nuts

On Sun, 25 Nov 2007 15:02:15 -0700, Josh Goldsmith wrote:
> I have a Linksys NSLU2 running 2.6.21 (I can replicate the problem on
> 2.6.23 but it isn't fully supported on SlugOS). It is a armv5teb device
> with 32MB of RAM, 400+ MB swap on its 160GB USB2 root disk. The machine is
> used as a fileserver and to build packages for other ARM devices. It may be
> underpowered by today's standard but is a whole lot faster than my first
> Linux system (386sx20 with 4MB RAM) but the whole system with disk uses <8
> watts and is silent.
>
> The problem comes when I try to untar a large file (in this case
> linux-2.6.23.tar.bz2). Regardless if I kill off every other process,
> eventually the oom-killer will appear and kill either the tar or the shell.
> I've tried every tuning option I and my buddy Google could find including
> (/proc/sys/vm/overcommit*) with no success. I'm not worried about paging
> impacting performance.
>
> I'd appreciate any help, pointers, or gentle taps with the cluebat.

I'm no VM tuning expert, but I have done, and still do, heavy compile
jobs on similarly configured machines with no OOM problems:

I regularly build 2.6 kernels and occasionally also gcc on a
100MHz 486 with 28MB of RAM and perhaps 500MB of swap. It runs
a standard but stripped down Fedora Core 4 user-space, with ext3
file systems and a kernel that doesn't include anything non-essential.
The machine will swap madly, but the OOM killer never triggers.
(All system settings are FC4 defaults. I haven't touched them.)

In the past I did a fair amount of package rebuilds and test suite
runs on an NSLU2 myself, with a 2.4 Linksys/Openslug kernel, ext3,
and a 1GB or perhaps 2GB swap partition on a disk attached via a
USB2-to-PATA enclosure. Even when swapping heavily the OOM killer
wouldn't trigger.

2007-11-26 05:28:31

by Josh Goldsmith

Subject: Re: Small System Paging Problem - OOM-killer goes nuts

Thanks for the response Mikael.

Is your 486 running an IDE disk on a normal interface or via USB? I wonder
if the NSLU2 only having I/O via USB might be significant. Also, this is a
2.6 kernel, and I've seen spurious reports across the internet about similar
oom-killer problems since about 2.6.7.

Thanks!
-Josh


2007-11-26 08:47:21

by Mikael Pettersson

Subject: Re: Small System Paging Problem - OOM-killer goes nuts

On Sun, 25 Nov 2007 22:28:03 -0700, Josh Goldsmith wrote:
> Is your 486 running a IDE disk on a normal interface or via USB? I wonder
> if the NSLU2 only having I/O via USB might be significant.

My 486 has neither PCI nor USB; the disk is attached to a
plain ancient IDE port.

> Also, this is a
> 2.6 kernel and I've seen spurious reports across the internet about similar
> oom-killer problems since about 2.6.7.

If it is a 2.6 problem, I don't think it's ARM-specific. For the last two
years I've done a lot of work with 2.6 kernels on a DS101 ARM box.
It's similar to the NSLU2 except it has 64MB of RAM and a built-in
PCI PATA controller. I've stressed it quite a bit, but never
seen the OOM killer trigger on it.

So your use of USB storage might be relevant.

2007-11-26 11:57:36

by David Newall

Subject: Re: Small System Paging Problem - OOM-killer goes nuts

Josh Goldsmith wrote:
> The problem comes when I try to untar a large file (in this case
> linux-2.6.23.tar.bz2). Regardless if I kill off every other process,
> eventually the oom-killer will appear and kill either the tar or the
> shell.

What's the actual command you are executing?

2007-11-26 16:55:54

by Pavel Machek

Subject: Re: Small System Paging Problem - OOM-killer goes nuts

On Sun 2007-11-25 22:28:03, Josh Goldsmith wrote:
> Thanks for the response Mikael.
>
> Is your 486 running a IDE disk on a normal interface or
> via USB? I wonder if the NSLU2 only having I/O via USB
> might be significant. Also, this is a 2.6 kernel and

I'd suspect USB is significant here. Talk to Olivier Neukum (sp?); IIRC
he was trying to fix swapping over USB.

Pavel


--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2007-11-26 17:48:59

by Josh Goldsmith

Subject: Re: Small System Paging Problem - OOM-killer goes nuts

David: The exact command this time was "tar jxf linux-2.6.23.tar.bz2", run as
part of an emerge (Gentoo). This is GNU tar 1.18, but it has happened with
prior versions too. I replicated it after my post by manually untarring it
on the command line, and I can almost always replicate the problem with any
large (GCC/kernel) tarball. If I shut down all other processes, the untar
runs longer, but eventually the oom-killer is invoked.

Pavel: I'll ping Oliver Neukum about it.

Thanks for the responses!
-Josh


2007-11-26 23:03:36

by David Newall

Subject: Re: Small System Paging Problem - OOM-killer goes nuts

Josh Goldsmith wrote:
> David: The exact command this time was a "tar jxf
> linux-2.6.23.tar.bz2" as part of an emerge (gentoo). Gnu tar version
> 1.18 but has happened with prior versions too. I replicated it after
> my post by manually untarring it on the command line and can almost
> always replicate the problem with any large (GCC/kernel) tarball. If
> I shut down all other processes, the untar will go longer but
> eventually the oom-killer will be invoked.

Into which directory were you extracting the files?

2007-11-27 01:18:04

by Ming Lei

Subject: Re: Small System Paging Problem - OOM-killer goes nuts

It seems the OOM happens when the VM (page frame reclaim) tries to reclaim
more memory by writing back dirty pages, but there is not enough RAM
for the USB disk drivers to finish the writeback operation. (The USB
disk drivers here are scsi_mod, usb mass storage, usbcore, and uhci or
ehci.)

You can try setting a larger value in /proc/sys/vm/min_free_kbytes to
verify this guess.
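
For anyone wanting to try this: as root, the usual way is simply to echo a number into /proc/sys/vm/min_free_kbytes. The Python sketch below does the same thing and returns the old value so it can be restored afterwards. The helper names are hypothetical, and the 4096 in the usage comment is only an illustrative figure; the thread does not recommend a specific value.

```python
from pathlib import Path

# Real sysctl file on Linux; writing to it needs root.
MIN_FREE_KBYTES = Path("/proc/sys/vm/min_free_kbytes")


def read_min_free_kbytes(path=MIN_FREE_KBYTES):
    """Return the current reserve in kB as an integer."""
    return int(Path(path).read_text().split()[0])


def write_min_free_kbytes(kbytes, path=MIN_FREE_KBYTES):
    """Set a new reserve in kB and return the previous value."""
    previous = read_min_free_kbytes(path)
    Path(path).write_text(f"{kbytes}\n")
    return previous


# Example use as root (4096 is an arbitrary test value, not a recommendation):
#   old = write_min_free_kbytes(4096)
#   ... rerun the untar and watch for the oom-killer ...
#   write_min_free_kbytes(old)
```

The path is a parameter so the helpers can be exercised against an ordinary file before touching the real sysctl.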


2007-11-27 01:47:45

by Ian Kumlien

Subject: Re: Small System Paging Problem - OOM-killer goes nuts

Hi,

I have some similar problems on a completely unrelated system, so I'm
curious.

When you untar, which filesystem do you untar to?

--
Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net



2007-11-27 03:51:20

by Josh Goldsmith

Subject: Re: Small System Paging Problem - OOM-killer goes nuts

>When you untar, which filesystem do you untar to?
I've untarred it to Ext3, Ext2, and Reiser filesystems. I've been fighting
with this for a while.

I did manage to get it to happen again doing a recursive chmod after
untarring the kernel (I stopped the untar a few times to let the system
catch up).

Interesting output below.

-J

top - 17:58:03 up  3:08,  1 user,  load average: 3.54, 4.09, 4.08
Tasks:  53 total,   2 running,  51 sleeping,   0 stopped,   0 zombie
Cpu(s):  2.1%us, 11.4%sy,  0.6%ni,  0.0%id, 81.4%wa,  2.7%hi,  1.8%si,  0.0%st
Mem:     30352k total,    28252k used,     2100k free,    19448k buffers
Swap:   465876k total,    15736k used,   450140k free,     1072k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1357 root      30  15  1568  168   88 R  8.1  0.6   0:07.87 chmod
  168 root      10  -5     0    0    0 S  3.1  0.0   6:39.25 usb-storage
 1353 root      15   0  2408  540  400 R  2.2  1.8   0:14.29 top
  989 root      15   0  3600  292  192 S  1.2  1.0   0:37.81 sshd
    2 root      34  19     0    0    0 S  0.6  0.0   2:14.65 ksoftirqd/0
   56 root      15   0     0    0    0 S  0.3  0.0   0:23.85 pdflush
   58 root      10  -5     0    0    0 S  0.3  0.0   0:54.70 kswapd0
  950 root      15   0  3128  108   64 S  0.3  0.4   0:13.88 ntpd
    1 root      16   0  1440    0    0 S  0.0  0.0   0:10.40 init
    3 root      10  -5     0    0    0 S  0.0  0.0   0:00.02 events/0
    4 root      10  -5     0    0    0 S  0.0  0.0   0:00.02 khelper
    5 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kthread
   38 root      10  -5     0    0    0 S  0.0  0.0   0:00.04 kblockd/0
   41 root      10  -5     0    0    0 S  0.0  0.0   0:00.02 khubd
   57 root      15   0     0    0    0 D  0.0  0.0   0:20.29 pdflush


And the first of the oom-killer syslog messages:

ntpd invoked oom-killer: gfp_mask=0x200d2, order=0, oomkilladj=0
Mem-info:
DMA per-cpu:
CPU 0: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
sshd invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0
Active:2816 inactive:2778 dirty:0 writeback:0 unstable:0
free:179 slab:858 mapped:1 pagetables:93 bounce:0
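
A note on reading the counters in the oom-killer message above: they are page counts, not kilobytes. The Python sketch below converts them, assuming 4096-byte pages (the dmesg earlier in the thread reports "order: 0, 4096 bytes" for several hash tables, which is consistent with that). The function name is invented for this example.

```python
import re

# Assumption: 4 kB pages, consistent with the "order: 0, 4096 bytes"
# lines in the dmesg posted at the start of the thread.
PAGE_SIZE_KB = 4


def parse_page_counters(text):
    """Map each 'name:count' pair in a Mem-info dump to a size in kB."""
    return {
        name: int(count) * PAGE_SIZE_KB
        for name, count in re.findall(r"([A-Za-z]+):(\d+)", text)
    }


dump = """Active:2816 inactive:2778 dirty:0 writeback:0 unstable:0
free:179 slab:858 mapped:1 pagetables:93 bounce:0"""

sizes_kb = parse_page_counters(dump)
print(sizes_kb["free"], sizes_kb["Active"], sizes_kb["inactive"])  # 716 11264 11112
```

Under that assumption only about 716 kB was free when the oom-killer fired, while the dump shows no dirty or writeback pages at that instant.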

2007-11-27 09:28:58

by Oliver Neukum

Subject: Re: Small System Paging Problem - OOM-killer goes nuts

On Tuesday, 27 November 2007, ming lei wrote:
> It seems oom happenes when VM(page frame reclaim) try to reclaim much
> more memory by writing back dirty pages, but there is not enough ram
> for usb disk related driver to finish the writeback operation. (usb
> disk related driver: scsi_mod, usb mass storage, usbcore and uhci or
> ehci )

As far as I can tell, the problem arises because usb-storage's kernel
thread does not run with PF_MEMALLOC and cannot dip into the reserves.

Regards
Oliver