2009-06-18 14:47:33

by David Howells

Subject: Re: [PATCH 0/3] make mapped executable pages the first class citizen


Hmmm.... It's possible that this makes my test box implode horribly when
running LTP.

I'm going to bisect it to see if this is actually due to your patches.

Note that I don't have any swap space. This is after a fresh reboot:

[root@andromeda ~]# cat /proc/meminfo
MemTotal: 1000624 kB
MemFree: 797328 kB
Buffers: 13272 kB
Cached: 121744 kB
SwapCached: 0 kB
Active: 36240 kB
Inactive: 115856 kB
Active(anon): 17448 kB
Inactive(anon): 0 kB
Active(file): 18792 kB
Inactive(file): 115856 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 28 kB
Writeback: 0 kB
AnonPages: 17280 kB
Mapped: 5376 kB
Slab: 42984 kB
SReclaimable: 6956 kB
SUnreclaim: 36028 kB
PageTables: 1304 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 500312 kB
Committed_AS: 52596 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 190044 kB
VmallocChunk: 34359546363 kB
DirectMap4k: 13312 kB
DirectMap2M: 1009664 kB
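
The "no swap" point can be checked directly against the figures above. The following is a minimal sketch, not part of the original report: it parses a few of the quoted /proc/meminfo lines and compares CommitLimit with SwapTotal + MemTotal * overcommit_ratio / 100. The helper names are hypothetical, the ratio of 50 is assumed to be the kernel default (the post does not show /proc/sys/vm/overcommit_ratio), and huge pages are ignored.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def expected_commit_limit(info, overcommit_ratio=50):
    """CommitLimit in kB, assuming no huge pages and the given ratio (an assumption)."""
    return info["SwapTotal"] + info["MemTotal"] * overcommit_ratio // 100


# A subset of the values quoted in the message above.
SAMPLE = """\
MemTotal: 1000624 kB
MemFree: 797328 kB
SwapTotal: 0 kB
SwapFree: 0 kB
CommitLimit: 500312 kB
Committed_AS: 52596 kB
"""

info = parse_meminfo(SAMPLE)
print("swap configured:", info["SwapTotal"] > 0)
print("computed CommitLimit:", expected_commit_limit(info), "kB")
print("reported CommitLimit:", info["CommitLimit"], "kB")
```

With SwapTotal at 0, the computed limit is simply half of MemTotal, which agrees with the reported CommitLimit under the assumed ratio.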

David
---
Initializing cgroup subsys cpuset
Linux version 2.6.30-cachefs ([email protected]) (gcc version 4.4.0 20090506 (Red Hat 4.4.0-4) (GCC) ) #106 SMP Wed Jun 17 22:10:31 BST 2009
Command line: initrd=andromeda-initrd console=tty0 console=ttyS0,115200 ro root=/dev/sda2 enforcing=1 debug BOOT_IMAGE=andromeda-vmlinuz
KERNEL supported cpus:
Intel GenuineIntel
AMD AuthenticAMD
Centaur CentaurHauls
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009ec00 (usable)
BIOS-e820: 000000000009ec00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000003e59a000 (usable)
BIOS-e820: 000000003e59a000 - 000000003e5a6000 (reserved)
BIOS-e820: 000000003e5a6000 - 000000003e644000 (usable)
BIOS-e820: 000000003e644000 - 000000003e6a9000 (ACPI NVS)
BIOS-e820: 000000003e6a9000 - 000000003e6ac000 (ACPI data)
BIOS-e820: 000000003e6ac000 - 000000003e6f2000 (ACPI NVS)
BIOS-e820: 000000003e6f2000 - 000000003e6ff000 (ACPI data)
BIOS-e820: 000000003e6ff000 - 000000003e700000 (usable)
BIOS-e820: 000000003e700000 - 000000003f000000 (reserved)
BIOS-e820: 00000000fff00000 - 0000000100000000 (reserved)
DMI 2.4 present.
last_pfn = 0x3e700 max_arch_pfn = 0x400000000
MTRR default type: uncachable
MTRR fixed ranges enabled:
00000-9FFFF write-back
A0000-FFFFF uncachable
MTRR variable ranges enabled:
0 base 000000000 mask FC0000000 write-back
1 base 03F000000 mask FFF000000 uncachable
2 base 03E800000 mask FFF800000 uncachable
3 base 03E700000 mask FFFF00000 uncachable
4 disabled
5 disabled
6 disabled
7 disabled
x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
initial memory mapped : 0 - 20000000
init_memory_mapping: 0000000000000000-000000003e700000
0000000000 - 003e600000 page 2M
003e600000 - 003e700000 page 4k
kernel direct mapping tables up to 3e700000 @ 8000-b000
RAMDISK: 3e2ee000 - 3e57991c
ACPI: RSDP 00000000000fe020 00014 (v00 INTEL )
ACPI: RSDT 000000003e6fd038 0004C (v01 INTEL DG965RY 00000330 01000013)
ACPI: FACP 000000003e6fc000 00074 (v01 INTEL DG965RY 00000330 MSFT 01000013)
ACPI: DSDT 000000003e6f8000 03EDA (v01 INTEL DG965RY 00000330 MSFT 01000013)
ACPI: FACS 000000003e6ac000 00040
ACPI: APIC 000000003e6f7000 00078 (v01 INTEL DG965RY 00000330 MSFT 01000013)
ACPI: WDDT 000000003e6f6000 00040 (v01 INTEL DG965RY 00000330 MSFT 01000013)
ACPI: MCFG 000000003e6f5000 0003C (v01 INTEL DG965RY 00000330 MSFT 01000013)
ACPI: ASF! 000000003e6f4000 000A6 (v32 INTEL DG965RY 00000330 MSFT 01000013)
ACPI: SSDT 000000003e6f3000 001BC (v01 INTEL CpuPm 00000330 MSFT 01000013)
ACPI: SSDT 000000003e6f2000 00175 (v01 INTEL Cpu0Ist 00000330 MSFT 01000013)
ACPI: SSDT 000000003e6ab000 00175 (v01 INTEL Cpu1Ist 00000330 MSFT 01000013)
ACPI: SSDT 000000003e6aa000 00175 (v01 INTEL Cpu2Ist 00000330 MSFT 01000013)
ACPI: SSDT 000000003e6a9000 00175 (v01 INTEL Cpu3Ist 00000330 MSFT 01000013)
ACPI: Local APIC address 0xfee00000
(7 early reservations) ==> bootmem [0000000000 - 003e700000]
#0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000]
#1 [0000006000 - 0000008000] TRAMPOLINE ==> [0000006000 - 0000008000]
#2 [0001000000 - 0001535d90] TEXT DATA BSS ==> [0001000000 - 0001535d90]
#3 [003e2ee000 - 003e57991c] RAMDISK ==> [003e2ee000 - 003e57991c]
#4 [000009e800 - 0000100000] BIOS reserved ==> [000009e800 - 0000100000]
#5 [0001536000 - 0001536199] BRK ==> [0001536000 - 0001536199]
#6 [0000008000 - 0000009000] PGTABLE ==> [0000008000 - 0000009000]
found SMP MP-table at [ffff8800000fe200] fe200
[ffffea0000000000-ffffea0000dfffff] PMD -> [ffff880001a00000-ffff8800027fffff] on node 0
Zone PFN ranges:
DMA 0x00000000 -> 0x00001000
DMA32 0x00001000 -> 0x00100000
Normal 0x00100000 -> 0x00100000
Movable zone start PFN for each node
early_node_map[4] active PFN ranges
0: 0x00000000 -> 0x0000009e
0: 0x00000100 -> 0x0003e59a
0: 0x0003e5a6 -> 0x0003e644
0: 0x0003e6ff -> 0x0003e700
On node 0 totalpages: 255447
DMA zone: 56 pages used for memmap
DMA zone: 101 pages reserved
DMA zone: 3841 pages, LIFO batch:0
DMA32 zone: 3441 pages used for memmap
DMA32 zone: 248008 pages, LIFO batch:31
ACPI: PM-Timer IO Port: 0x408
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x82] disabled)
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x83] disabled)
ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Using ACPI (MADT) for SMP configuration information
4 Processors exceeds NR_CPUS limit of 2
SMP: Allowing 2 CPUs, 0 hotplug CPUs
nr_irqs_gsi: 24
PM: Registered nosave memory: 000000000009e000 - 000000000009f000
PM: Registered nosave memory: 000000000009f000 - 00000000000a0000
PM: Registered nosave memory: 00000000000a0000 - 00000000000e0000
PM: Registered nosave memory: 00000000000e0000 - 0000000000100000
PM: Registered nosave memory: 000000003e59a000 - 000000003e5a6000
PM: Registered nosave memory: 000000003e644000 - 000000003e6a9000
PM: Registered nosave memory: 000000003e6a9000 - 000000003e6ac000
PM: Registered nosave memory: 000000003e6ac000 - 000000003e6f2000
PM: Registered nosave memory: 000000003e6f2000 - 000000003e6ff000
Allocating PCI resources starting at 3f000000 (gap: 3f000000:c0f00000)
NR_CPUS:2 nr_cpumask_bits:2 nr_cpu_ids:2 nr_node_ids:1
PERCPU: Embedded 24 pages at ffff880001541000, static data 67296 bytes
Built 1 zonelists in Zone order, mobility grouping on. Total pages: 251849
Kernel command line: initrd=andromeda-initrd console=tty0 console=ttyS0,115200 ro root=/dev/sda2 enforcing=1 debug BOOT_IMAGE=andromeda-vmlinuz
PID hash table entries: 4096 (order: 12, 32768 bytes)
Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
Initializing CPU#0
Checking aperture...
No AGP bridge found
Memory: 996952k/1022976k available (2953k kernel code, 1188k absent, 24132k reserved, 1678k data, 360k init)
NR_IRQS:320
Fast TSC calibration using PIT
Detected 1864.978 MHz processor.
Console: colour VGA+ 80x25
console [tty0] enabled
console [ttyS0] enabled
Calibrating delay loop (skipped), value calculated using timer frequency.. 3729.95 BogoMIPS (lpj=7459912)
Security Framework initialized
SELinux: Initializing.
SELinux: Starting in enforcing mode
Mount-cache hash table entries: 256
Initializing cgroup subsys debug
Initializing cgroup subsys ns
Initializing cgroup subsys devices
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 2048K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
mce: CPU supports 6 MCE banks
CPU0: Thermal monitoring enabled (TM2)
using mwait in idle threads.
ACPI: Core revision 20090521
Setting APIC routing to flat
..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
CPU0: Intel(R) Core(TM)2 CPU 6300 @ 1.86GHz stepping 06
Booting processor 1 APIC 0x1 ip 0x6000
Initializing CPU#1
Calibrating delay using timer specific routine.. 3525.06 BogoMIPS (lpj=7050122)
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 2048K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 1
mce: CPU supports 6 MCE banks
CPU1: Thermal monitoring enabled (TM2)
x86 PAT enabled: cpu 1, old 0x7040600070406, new 0x7010600070106
CPU1: Intel(R) Core(TM)2 CPU 6300 @ 1.86GHz stepping 06
checking TSC synchronization [CPU#0 -> CPU#1]: passed.
Brought up 2 CPUs
Total of 2 processors activated (7255.01 BogoMIPS).
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: MCFG configuration 0: base f0000000 segment 0 buses 0 - 127
PCI: Not using MMCONFIG.
PCI: Using configuration type 1 for base access
bio: create slab <bio-0> at 0
ACPI: EC: Look up EC in DSDT
ACPI: Interpreter enabled
ACPI: (supports S0 S3 S4 S5)
ACPI: Using IOAPIC for interrupt routing
PCI: MCFG configuration 0: base f0000000 segment 0 buses 0 - 127
PCI: MCFG area at f0000000 reserved in ACPI motherboard resources
PCI: Using MMCONFIG at f0000000 - f7ffffff
ACPI: No dock devices found.
ACPI: PCI Root Bridge [PCI0] (0000:00)
pci 0000:00:02.0: reg 10 32bit mmio: [0x50200000-0x502fffff]
pci 0000:00:02.0: reg 18 64bit mmio: [0x40000000-0x4fffffff]
pci 0000:00:02.0: reg 20 io port: [0x2110-0x2117]
pci 0000:00:03.0: reg 10 64bit mmio: [0x50326100-0x5032610f]
pci 0000:00:03.0: PME# supported from D0 D3hot D3cold
pci 0000:00:03.0: PME# disabled
pci 0000:00:19.0: reg 10 32bit mmio: [0x50300000-0x5031ffff]
pci 0000:00:19.0: reg 14 32bit mmio: [0x50324000-0x50324fff]
pci 0000:00:19.0: reg 18 io port: [0x20e0-0x20ff]
pci 0000:00:19.0: PME# supported from D0 D3hot D3cold
pci 0000:00:19.0: PME# disabled
pci 0000:00:1a.0: reg 20 io port: [0x20c0-0x20df]
pci 0000:00:1a.1: reg 20 io port: [0x20a0-0x20bf]
pci 0000:00:1a.7: reg 10 32bit mmio: [0x50325c00-0x50325fff]
pci 0000:00:1a.7: PME# supported from D0 D3hot D3cold
pci 0000:00:1a.7: PME# disabled
pci 0000:00:1b.0: reg 10 64bit mmio: [0x50320000-0x50323fff]
pci 0000:00:1b.0: PME# supported from D0 D3hot D3cold
pci 0000:00:1b.0: PME# disabled
pci 0000:00:1c.0: PME# supported from D0 D3hot D3cold
pci 0000:00:1c.0: PME# disabled
pci 0000:00:1c.1: PME# supported from D0 D3hot D3cold
pci 0000:00:1c.1: PME# disabled
pci 0000:00:1c.2: PME# supported from D0 D3hot D3cold
pci 0000:00:1c.2: PME# disabled
pci 0000:00:1c.3: PME# supported from D0 D3hot D3cold
pci 0000:00:1c.3: PME# disabled
pci 0000:00:1c.4: PME# supported from D0 D3hot D3cold
pci 0000:00:1c.4: PME# disabled
pci 0000:00:1d.0: reg 20 io port: [0x2080-0x209f]
pci 0000:00:1d.1: reg 20 io port: [0x2060-0x207f]
pci 0000:00:1d.2: reg 20 io port: [0x2040-0x205f]
pci 0000:00:1d.7: reg 10 32bit mmio: [0x50325800-0x50325bff]
pci 0000:00:1d.7: PME# supported from D0 D3hot D3cold
pci 0000:00:1d.7: PME# disabled
pci 0000:00:1f.0: quirk: region 0400-047f claimed by ICH6 ACPI/GPIO/TCO
pci 0000:00:1f.0: quirk: region 0500-053f claimed by ICH6 GPIO
pci 0000:00:1f.0: ICH7 LPC Generic IO decode 1 PIO at 0680 (mask 007f)
pci 0000:00:1f.2: reg 10 io port: [0x2108-0x210f]
pci 0000:00:1f.2: reg 14 io port: [0x211c-0x211f]
pci 0000:00:1f.2: reg 18 io port: [0x2100-0x2107]
pci 0000:00:1f.2: reg 1c io port: [0x2118-0x211b]
pci 0000:00:1f.2: reg 20 io port: [0x2020-0x203f]
pci 0000:00:1f.2: reg 24 32bit mmio: [0x50325000-0x503257ff]
pci 0000:00:1f.2: PME# supported from D3hot
pci 0000:00:1f.2: PME# disabled
pci 0000:00:1f.3: reg 10 32bit mmio: [0x50326000-0x503260ff]
pci 0000:00:1f.3: reg 20 io port: [0x2000-0x201f]
pci 0000:00:1c.0: bridge 32bit mmio: [0x50400000-0x504fffff]
pci 0000:02:00.0: reg 10 io port: [0x1018-0x101f]
pci 0000:02:00.0: reg 14 io port: [0x1024-0x1027]
pci 0000:02:00.0: reg 18 io port: [0x1010-0x1017]
pci 0000:02:00.0: reg 1c io port: [0x1020-0x1023]
pci 0000:02:00.0: reg 20 io port: [0x1000-0x100f]
pci 0000:02:00.0: reg 24 32bit mmio: [0x50100000-0x501001ff]
pci 0000:02:00.0: supports D1
pci 0000:02:00.0: PME# supported from D0 D1 D3hot
pci 0000:02:00.0: PME# disabled
pci 0000:00:1c.1: bridge io port: [0x1000-0x1fff]
pci 0000:00:1c.1: bridge 32bit mmio: [0x50100000-0x501fffff]
pci 0000:00:1c.2: bridge 32bit mmio: [0x50500000-0x505fffff]
pci 0000:00:1c.3: bridge 32bit mmio: [0x50600000-0x506fffff]
pci 0000:00:1c.4: bridge 32bit mmio: [0x50700000-0x507fffff]
pci 0000:06:03.0: reg 10 32bit mmio: [0x50004000-0x500047ff]
pci 0000:06:03.0: reg 14 32bit mmio: [0x50000000-0x50003fff]
pci 0000:06:03.0: supports D1 D2
pci 0000:06:03.0: PME# supported from D0 D1 D2 D3hot
pci 0000:06:03.0: PME# disabled
pci 0000:00:1e.0: transparent bridge
pci 0000:00:1e.0: bridge 32bit mmio: [0x50000000-0x500fffff]
pci_bus 0000:00: on NUMA node 0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P32_._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX1._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX2._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX3._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX4._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 7 9 10 *11 12)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 7 9 *10 11 12)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 7 9 10 *11 12)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 7 9 10 *11 12)
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 7 *9 10 11 12)
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 7 9 *10 11 12)
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 7 *9 10 11 12)
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 7 9 10 *11 12)
SCSI subsystem initialized
libata version 3.00 loaded.
PCI: Using ACPI for IRQ routing
NetLabel: Initializing
NetLabel: domain hash size = 128
NetLabel: protocols = UNLABELED CIPSOv4
NetLabel: unlabeled traffic allowed by default
pnp: PnP ACPI init
ACPI: bus type pnp registered
pnp: PnP ACPI: found 12 devices
ACPI: ACPI bus type pnp unregistered
system 00:01: iomem range 0xf0000000-0xf7ffffff has been reserved
system 00:01: iomem range 0xfed13000-0xfed13fff has been reserved
system 00:01: iomem range 0xfed14000-0xfed17fff has been reserved
system 00:01: iomem range 0xfed18000-0xfed18fff has been reserved
system 00:01: iomem range 0xfed19000-0xfed19fff has been reserved
system 00:01: iomem range 0xfed1c000-0xfed1ffff has been reserved
system 00:01: iomem range 0xfed20000-0xfed3ffff has been reserved
system 00:01: iomem range 0xfed45000-0xfed99fff has been reserved
system 00:01: iomem range 0xc0000-0xdffff has been reserved
system 00:01: iomem range 0xe0000-0xfffff could not be reserved
system 00:06: ioport range 0x500-0x53f has been reserved
system 00:06: ioport range 0x400-0x47f has been reserved
system 00:06: ioport range 0x680-0x6ff has been reserved
pci 0000:00:1c.0: PCI bridge, secondary bus 0000:01
pci 0000:00:1c.0: IO window: disabled
pci 0000:00:1c.0: MEM window: 0x50400000-0x504fffff
pci 0000:00:1c.0: PREFETCH window: disabled
pci 0000:00:1c.1: PCI bridge, secondary bus 0000:02
pci 0000:00:1c.1: IO window: 0x1000-0x1fff
pci 0000:00:1c.1: MEM window: 0x50100000-0x501fffff
pci 0000:00:1c.1: PREFETCH window: disabled
pci 0000:00:1c.2: PCI bridge, secondary bus 0000:03
pci 0000:00:1c.2: IO window: disabled
pci 0000:00:1c.2: MEM window: 0x50500000-0x505fffff
pci 0000:00:1c.2: PREFETCH window: disabled
pci 0000:00:1c.3: PCI bridge, secondary bus 0000:04
pci 0000:00:1c.3: IO window: disabled
pci 0000:00:1c.3: MEM window: 0x50600000-0x506fffff
pci 0000:00:1c.3: PREFETCH window: disabled
pci 0000:00:1c.4: PCI bridge, secondary bus 0000:05
pci 0000:00:1c.4: IO window: disabled
pci 0000:00:1c.4: MEM window: 0x50700000-0x507fffff
pci 0000:00:1c.4: PREFETCH window: disabled
pci 0000:00:1e.0: PCI bridge, secondary bus 0000:06
pci 0000:00:1e.0: IO window: disabled
pci 0000:00:1e.0: MEM window: 0x50000000-0x500fffff
pci 0000:00:1e.0: PREFETCH window: disabled
pci 0000:00:1c.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
pci 0000:00:1c.0: setting latency timer to 64
pci 0000:00:1c.1: PCI INT B -> GSI 16 (level, low) -> IRQ 16
pci 0000:00:1c.1: setting latency timer to 64
pci 0000:00:1c.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18
pci 0000:00:1c.2: setting latency timer to 64
pci 0000:00:1c.3: PCI INT D -> GSI 19 (level, low) -> IRQ 19
pci 0000:00:1c.3: setting latency timer to 64
pci 0000:00:1c.4: PCI INT A -> GSI 17 (level, low) -> IRQ 17
pci 0000:00:1c.4: setting latency timer to 64
pci 0000:00:1e.0: setting latency timer to 64
pci_bus 0000:00: resource 0 io: [0x00-0xffff]
pci_bus 0000:00: resource 1 mem: [0x000000-0xffffffffffffffff]
pci_bus 0000:01: resource 1 mem: [0x50400000-0x504fffff]
pci_bus 0000:02: resource 0 io: [0x1000-0x1fff]
pci_bus 0000:02: resource 1 mem: [0x50100000-0x501fffff]
pci_bus 0000:03: resource 1 mem: [0x50500000-0x505fffff]
pci_bus 0000:04: resource 1 mem: [0x50600000-0x506fffff]
pci_bus 0000:05: resource 1 mem: [0x50700000-0x507fffff]
pci_bus 0000:06: resource 1 mem: [0x50000000-0x500fffff]
pci_bus 0000:06: resource 3 io: [0x00-0xffff]
pci_bus 0000:06: resource 4 mem: [0x000000-0xffffffffffffffff]
NET: Registered protocol family 2
IP route cache hash table entries: 32768 (order: 6, 262144 bytes)
TCP established hash table entries: 131072 (order: 9, 2097152 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 131072 bind 65536)
TCP reno registered
NET: Registered protocol family 1
Unpacking initramfs...
Freeing initrd memory: 2606k freed
audit: initializing netlink socket (disabled)
type=2000 audit(1245320564.157:1): initialized
VFS: Disk quotas dquot_6.5.2
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
SGI XFS with ACLs, security attributes, large block/inode numbers, no debug enabled
msgmni has been set to 1953
SELinux: Registering netfilter hooks
alg: No test for fcrypt (fcrypt-generic)
alg: No test for stdrng (krng)
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253)
io scheduler noop registered
io scheduler anticipatory registered (default)
io scheduler deadline registered
io scheduler cfq registered
pci 0000:00:02.0: Boot video device
pcieport-driver 0000:00:1c.0: irq 24 for MSI/MSI-X
pcieport-driver 0000:00:1c.0: setting latency timer to 64
pcieport-driver 0000:00:1c.1: irq 25 for MSI/MSI-X
pcieport-driver 0000:00:1c.1: setting latency timer to 64
pcieport-driver 0000:00:1c.2: irq 26 for MSI/MSI-X
pcieport-driver 0000:00:1c.2: setting latency timer to 64
pcieport-driver 0000:00:1c.3: irq 27 for MSI/MSI-X
pcieport-driver 0000:00:1c.3: setting latency timer to 64
pcieport-driver 0000:00:1c.4: irq 28 for MSI/MSI-X
pcieport-driver 0000:00:1c.4: setting latency timer to 64
input: Power Button as /class/input/input0
ACPI: Power Button [PWRF]
input: Sleep Button as /class/input/input1
ACPI: Sleep Button [SLPB]
processor ACPI_CPU:00: registered as cooling_device0
ACPI: Processor [CPU0] (supports 8 throttling states)
processor ACPI_CPU:01: registered as cooling_device1
ACPI: Processor [CPU1] (supports 8 throttling states)
Linux agpgart interface v0.103
agpgart-intel 0000:00:00.0: Intel 965G Chipset
agpgart-intel 0000:00:00.0: detected 7676K stolen memory
agpgart-intel 0000:00:00.0: AGP aperture is 256M @ 0x40000000
intelfb: Framebuffer driver for Intel(R) 830M/845G/852GM/855GM/865G/915G/915GM/945G/945GM/945GME/965G/965GM chipsets
intelfb: Version 0.9.6
intelfb 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
intelfb: 00:02.0: Intel(R) 965G, aperture size 256MB, stolen memory 7932kB
intelfb: Initial video mode is 1024x768-32@70.
Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
Platform driver 'serial8250' needs updating - please use dev_pm_ops
00:0a: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
loop: module loaded
Driver 'sd' needs updating - please use bus_type methods
ahci 0000:00:1f.2: version 3.0
ahci 0000:00:1f.2: PCI INT A -> GSI 19 (level, low) -> IRQ 19
ahci 0000:00:1f.2: irq 29 for MSI/MSI-X
ahci 0000:00:1f.2: AHCI 0001.0100 32 slots 4 ports 3 Gbps 0x33 impl SATA mode
ahci 0000:00:1f.2: flags: 64bit ncq sntf led clo pio slum part ems
ahci 0000:00:1f.2: setting latency timer to 64
scsi0 : ahci
scsi1 : ahci
scsi2 : ahci
scsi3 : ahci
scsi4 : ahci
scsi5 : ahci
ata1: SATA max UDMA/133 abar m2048@0x50325000 port 0x50325100 irq 29
ata2: SATA max UDMA/133 abar m2048@0x50325000 port 0x50325180 irq 29
ata3: DUMMY
ata4: DUMMY
ata5: SATA max UDMA/133 abar m2048@0x50325000 port 0x50325300 irq 29
ata6: SATA max UDMA/133 abar m2048@0x50325000 port 0x50325380 irq 29
e1000e: Intel(R) PRO/1000 Network Driver - 1.0.2-k2
e1000e: Copyright (c) 1999-2008 Intel Corporation.
e1000e 0000:00:19.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20
e1000e 0000:00:19.0: setting latency timer to 64
e1000e 0000:00:19.0: irq 30 for MSI/MSI-X
0000:00:19.0: eth0: (PCI Express:2.5GB/s:Width x1) 00:16:76:ce:3a:3c
0000:00:19.0: eth0: Intel(R) PRO/1000 Network Connection
0000:00:19.0: eth0: MAC: 6, PHY: 6, PBA No: ffffff-0ff
PNP: PS/2 Controller [PNP0303:PS2K,PNP0f03:PS2M] at 0x60,0x64 irq 1,12
Platform driver 'i8042' needs updating - please use dev_pm_ops
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
rtc_cmos 00:03: RTC can wake from S4
rtc_cmos 00:03: rtc core: registered rtc_cmos as rtc0
rtc0: alarms up to one month, 114 bytes nvram
i2c /dev entries driver
i801_smbus 0000:00:1f.3: PCI INT B -> GSI 21 (level, low) -> IRQ 21
coretemp coretemp.0: Using relative temperature scale!
coretemp coretemp.1: Using relative temperature scale!
cpuidle: using governor ladder
ip_tables: (C) 2000-2006 Netfilter Core Team
TCP cubic registered
input: AT Translated Set 2 keyboard as /class/input/input2
NET: Registered protocol family 17
ata2: SATA link down (SStatus 0 SControl 300)
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
registered taskstats version 1
ata6: SATA link down (SStatus 0 SControl 300)
ata5: SATA link down (SStatus 0 SControl 300)
rtc_cmos 00:03: setting system clock to 2009-06-18 10:22:46 UTC (1245320566)
ata1.00: ATA-7: ST380211AS, 3.AAE, max UDMA/133
ata1.00: 156301488 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata1.00: configured for UDMA/133
scsi 0:0:0:0: Direct-Access ATA ST380211AS 3.AA PQ: 0 ANSI: 5
sd 0:0:0:0: [sda] 156301488 512-byte hardware sectors: (80.0 GB/74.5 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 >
sd 0:0:0:0: [sda] Attached SCSI disk
Freeing unused kernel memory: 360k freed
Write protecting the kernel read-only data: 4324k
Red Hat nash version 6.0.52 starting
Mounting proc filesystem
Mounting sysfs filesystem
Creating /dev
Creating initial device nodes
Setting up hotplug.
input: ImPS/2 Generic Wheel Mouse as /class/input/input3
Creating block device nodes.
mount: could not find filesystem '/proc/bus/usb'
Waiting for driver initialization.
Waiting for driver initialization.
Creating root device.
Mounting root filesystem.
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting. Commit interval 5 seconds
Setting up other filesystems.
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with writeback data mode.
Setting up new root fs
no fstab.sys, mounting internal defaults
SELinux: 8192 avtab hash slots, 177803 rules.
SELinux: 8192 avtab hash slots, 177803 rules.
SELinux: 6 users, 12 roles, 2431 types, 118 bools, 1 sens, 1024 cats
SELinux: 73 classes, 177803 rules
SELinux: class kernel_service not defined in policy
SELinux: permission open in class sock_file not defined in policy
SELinux: permission nlmsg_tty_audit in class netlink_audit_socket not defined in policy
SELinux: the above unknown classes and permissions will be allowed
SELinux: Completing initialization.
SELinux: Setting up existing superblocks.
SELinux: initialized (dev sda2, type ext3), uses xattr
SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
SELinux: initialized (dev selinuxfs, type selinuxfs), uses genfs_contexts
SELinux: initialized (dev mqueue, type mqueue), uses transition SIDs
SELinux: initialized (dev devpts, type devpts), uses transition SIDs
SELinux: initialized (dev inotifyfs, type inotifyfs), uses genfs_contexts
SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
SELinux: initialized (dev anon_inodefs, type anon_inodefs), uses genfs_contexts
SELinux: initialized (dev pipefs, type pipefs), uses task SIDs
SELinux: initialized (dev debugfs, type debugfs), uses genfs_contexts
SELinux: initialized (dev sockfs, type sockfs), uses task SIDs
SELinux: initialized (dev proc, type proc), uses genfs_contexts
SELinux: initialized (dev bdev, type bdev), uses genfs_contexts
SELinux: initialized (dev rootfs, type rootfs), uses genfs_contexts
SELinux: initialized (dev sysfs, type sysfs), uses genfs_contexts
type=1403 audit(1245320574.561:2): policy loaded auid=4294967295 ses=4294967295
Switching to new root and running init.
unmounting old /dev
unmounting old /proc
unmounting old /sys
Welcome to Fedora
Press 'I' to enter interactive startup.
Starting udev: [ OK ]
Setting hostname andromeda.procyon.org.uk: [ OK ]
Checking filesystems
Checking all file systems.
[/sbin/fsck.ext3 (1) -- /] fsck.ext3 -a /dev/sda2
/1: clean, 330515/2621440 files, 1528849/2620603 blocks
[/sbin/fsck.ext3 (1) -- /boot] fsck.ext3 -a /dev/sda1
/boot1: recovering journal
/boot1: clean, 79/50200 files, 72187/200780 blocks
[ OK ]
Remounting root filesystem in read-write mode: [ OK ]
Mounting local filesystems: [ OK ]
Enabling local filesystem quotas: [ OK ]
Enabling /etc/fstab swaps: [ OK ]
Entering non-interactive startup
Starting background readahead (early, fast mode): [ OK ]
FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
Bringing up loopback interface: [ OK ]
Bringing up interface eth0:
Determining IP information for eth0... done.
[ OK ]
FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
Starting restorecond: [ OK ]
Starting auditd: [ OK ]
Starting irqbalance: [ OK ]
Starting mcstransd: [ OK ]
Starting rpcbind: modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

rpcbind: cannot create socket for udp6
modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

rpcbind: cannot create socket for tcp6
[ OK ]
modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

Starting NFS statd: [ OK ]
Starting system message bus: [ OK ]
Starting lm_sensors: not configured, run sensors-detect[WARNING]
Starting sshd: modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

[ OK ]
modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

Starting ntpd: [ OK ]
modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

SysRq : Changing Loglevel
Loglevel set to 8
Now booted
Starting smartd: modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

[ OK ]

Fedora release 9 (Sulphur)
Kernel 2.6.30-cachefs on an x86_64 (/dev/ttyS0)

andromeda.procyon.org.uk login: modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

warning: `capget01' uses 32-bit capabilities (legacy support in use)
modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

msgctl11 invoked oom-killer: gfp_mask=0xd0, order=1, oom_adj=0
msgctl11 cpuset=/ mems_allowed=0
Pid: 30549, comm: msgctl11 Not tainted 2.6.30-cachefs #106
Call Trace:
[<ffffffff81071dae>] ? oom_kill_process.clone.0+0xa9/0x245
[<ffffffff81072075>] ? __out_of_memory+0x12b/0x142
[<ffffffff810720f6>] ? out_of_memory+0x6a/0x94
[<ffffffff8107479e>] ? __alloc_pages_nodemask+0x422/0x50b
[<ffffffff81031110>] ? copy_process+0x93/0x113f
[<ffffffff810748f1>] ? __get_free_pages+0x12/0x50
[<ffffffff81031130>] ? copy_process+0xb3/0x113f
[<ffffffff81081ae2>] ? handle_mm_fault+0x2d5/0x645
[<ffffffff810322fb>] ? do_fork+0x13f/0x2ba
[<ffffffff81022a0b>] ? do_page_fault+0x1f1/0x206
[<ffffffff8100b0d3>] ? stub_clone+0x13/0x20
[<ffffffff8100ad6b>] ? system_call_fastpath+0x16/0x1b
Mem-Info:
DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
CPU 1: hi: 0, btch: 1 usd: 0
DMA32 per-cpu:
CPU 0: hi: 186, btch: 31 usd: 0
CPU 1: hi: 186, btch: 31 usd: 47
Active_anon:80388 active_file:0 inactive_anon:822
inactive_file:2 unevictable:0 dirty:0 writeback:0 unstable:0
free:2053 slab:38793 mapped:357 pagetables:60476 bounce:0
DMA free:3916kB min:60kB low:72kB high:88kB active_anon:3608kB inactive_anon:128kB active_file:0kB inactive_file:0kB unevictable:0kB present:15364kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 968 968 968
DMA32 free:4296kB min:3948kB low:4932kB high:5920kB active_anon:317944kB inactive_anon:3160kB active_file:0kB inactive_file:8kB unevictable:0kB present:992032kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 1*4kB 1*8kB 0*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3916kB
DMA32: 576*4kB 15*8kB 1*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 4296kB
1854 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap = 0kB
Total swap = 0kB
255744 pages RAM
5588 pages reserved
230698 pages shared
217103 pages non-shared
Out of memory: kill process 25166 (msgctl11) score 133496 or a child
Killed process 28855 (msgctl11)
msgctl11 invoked oom-killer: gfp_mask=0xd0, order=1, oom_adj=0
msgctl11 cpuset=/ mems_allowed=0
Pid: 30312, comm: msgctl11 Not tainted 2.6.30-cachefs #106
Call Trace:
[<ffffffff81071dae>] ? oom_kill_process.clone.0+0xa9/0x245
[<ffffffff81072075>] ? __out_of_memory+0x12b/0x142
[<ffffffff810720f6>] ? out_of_memory+0x6a/0x94
[<ffffffff8107479e>] ? __alloc_pages_nodemask+0x422/0x50b
[<ffffffff81031110>] ? copy_process+0x93/0x113f
[<ffffffff810748f1>] ? __get_free_pages+0x12/0x50
[<ffffffff81031130>] ? copy_process+0xb3/0x113f
[<ffffffff81029a83>] ? update_curr+0x53/0xdf
[<ffffffff81081e00>] ? handle_mm_fault+0x5f3/0x645
[<ffffffff810322fb>] ? do_fork+0x13f/0x2ba
[<ffffffff81022a0b>] ? do_page_fault+0x1f1/0x206
[<ffffffff8100b0d3>] ? stub_clone+0x13/0x20
[<ffffffff8100ad6b>] ? system_call_fastpath+0x16/0x1b
Mem-Info:
DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
CPU 1: hi: 0, btch: 1 usd: 0
DMA32 per-cpu:
CPU 0: hi: 186, btch: 31 usd: 0
CPU 1: hi: 186, btch: 31 usd: 0
Active_anon:79646 active_file:2 inactive_anon:4113
inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
free:1966 slab:38417 mapped:2 pagetables:61720 bounce:0
DMA free:3916kB min:60kB low:72kB high:88kB active_anon:3608kB inactive_anon:256kB active_file:0kB inactive_file:0kB unevictable:0kB present:15364kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 968 968 968
DMA32 free:3948kB min:3948kB low:4932kB high:5920kB active_anon:314976kB inactive_anon:16196kB active_file:8kB inactive_file:0kB unevictable:0kB present:992032kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 1*4kB 1*8kB 0*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3916kB
DMA32: 443*4kB 20*8kB 10*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 3948kB
36 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap = 0kB
Total swap = 0kB
255744 pages RAM
5588 pages reserved
151665 pages shared
220702 pages non-shared
Out of memory: kill process 25166 (msgctl11) score 133404 or a child
Killed process 28860 (msgctl11)
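
For anyone reading the dump above: the per-zone buddy lines (e.g. "DMA32: 443*4kB 20*8kB ... = 3948kB") list the number of free blocks at each size, and the trailing figure should be their sum. A minimal Python sketch for cross-checking those totals follows. It is not part of the original report, and the helper name buddy_total_kb is purely illustrative. It assumes the line layout shown in this log and has not been checked against other kernel versions.

```python
import re

def buddy_total_kb(line):
    """Parse one buddy free-list line from an OOM dump.

    Returns (zone, computed_kb, reported_kb), where computed_kb is the
    sum of count*size over all listed block sizes.
    Hypothetical helper; assumes the "<zone>: N*SkB ... = TkB" layout.
    """
    zone, rest = line.split(":", 1)
    blocks, reported = rest.rsplit("=", 1)
    computed = sum(int(count) * int(size)
                   for count, size in re.findall(r"(\d+)\*(\d+)kB", blocks))
    reported_kb = int(re.search(r"(\d+)kB", reported).group(1))
    return zone.strip(), computed, reported_kb

# The two free-list lines from the second OOM report above.
lines = [
    "DMA: 1*4kB 1*8kB 0*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB "
    "1*1024kB 1*2048kB 0*4096kB = 3916kB",
    "DMA32: 443*4kB 20*8kB 10*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB "
    "1*1024kB 0*2048kB 0*4096kB = 3948kB",
]

for line in lines:
    zone, computed, reported = buddy_total_kb(line)
    status = "ok" if computed == reported else "MISMATCH"
    print(f"{zone}: computed {computed}kB, reported {reported}kB ({status})")
```

Both zones sum to the reported totals. Note that the computed DMA32 figure (3948kB) equals the "min:3948kB" watermark printed for that zone in the same dump.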


2009-06-18 14:51:41

by David Howells

Subject: Re: [PATCH 0/3] make mapped executable pages the first class citizen


Oh, and my .config.

David
---
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.30
# Thu Jun 18 15:31:34 2009
#
CONFIG_64BIT=y
# CONFIG_X86_32 is not set
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_FAST_CMPXCHG_LOCAL=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y
# CONFIG_RWSEM_XCHGADD_ALGORITHM is not set
CONFIG_ARCH_HAS_CPU_IDLE_WAIT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_DEFAULT_IDLE=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_HAVE_DYNAMIC_PER_CPU_AREA=y
CONFIG_HAVE_CPUMASK_OF_CPU_MAP=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ZONE_DMA32=y
CONFIG_ARCH_POPULATES_NODE_MAP=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_HARDIRQS_NO__DO_IRQ=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_USE_GENERIC_SMP_HELPERS=y
CONFIG_X86_64_SMP=y
CONFIG_X86_HT=y
CONFIG_X86_TRAMPOLINE=y
# CONFIG_KTIME_SCALAR is not set
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION="-cachefs"
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y
CONFIG_AUDIT=y
CONFIG_AUDITSYSCALL=y
CONFIG_AUDIT_TREE=y

#
# RCU Subsystem
#
CONFIG_CLASSIC_RCU=y
# CONFIG_TREE_RCU is not set
# CONFIG_PREEMPT_RCU is not set
# CONFIG_TREE_RCU_TRACE is not set
# CONFIG_PREEMPT_RCU_TRACE is not set
# CONFIG_IKCONFIG is not set
CONFIG_LOG_BUF_SHIFT=15
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
CONFIG_GROUP_SCHED=y
CONFIG_FAIR_GROUP_SCHED=y
# CONFIG_RT_GROUP_SCHED is not set
CONFIG_USER_SCHED=y
# CONFIG_CGROUP_SCHED is not set
CONFIG_CGROUPS=y
CONFIG_CGROUP_DEBUG=y
CONFIG_CGROUP_NS=y
# CONFIG_CGROUP_FREEZER is not set
CONFIG_CGROUP_DEVICE=y
CONFIG_CPUSETS=y
CONFIG_PROC_PID_CPUSET=y
# CONFIG_CGROUP_CPUACCT is not set
# CONFIG_RESOURCE_COUNTERS is not set
CONFIG_SYSFS_DEPRECATED=y
CONFIG_SYSFS_DEPRECATED_V2=y
# CONFIG_RELAY is not set
CONFIG_NAMESPACES=y
# CONFIG_UTS_NS is not set
# CONFIG_IPC_NS is not set
CONFIG_USER_NS=y
# CONFIG_PID_NS is not set
# CONFIG_NET_NS is not set
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_RD_GZIP=y
CONFIG_RD_BZIP2=y
CONFIG_RD_LZMA=y
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
CONFIG_ANON_INODES=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_PCSPKR_PLATFORM=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_AIO=y
CONFIG_HAVE_PERF_COUNTERS=y

#
# Performance Counters
#
# CONFIG_PERF_COUNTERS is not set
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_PCI_QUIRKS=y
# CONFIG_STRIP_ASM_SYMS is not set
CONFIG_COMPAT_BRK=y
CONFIG_SLAB=y
# CONFIG_SLUB is not set
# CONFIG_SLOB is not set
# CONFIG_PROFILING is not set
# CONFIG_MARKERS is not set
CONFIG_HAVE_OPROFILE=y
# CONFIG_KPROBES is not set
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
CONFIG_HAVE_IOREMAP_PROT=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KRETPROBES=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
CONFIG_HAVE_DMA_API_DEBUG=y
CONFIG_SLOW_WORK=y
# CONFIG_HAVE_GENERIC_DMA_COHERENT is not set
CONFIG_SLABINFO=y
CONFIG_RT_MUTEXES=y
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
# CONFIG_MODULE_FORCE_LOAD is not set
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_STOP_MACHINE=y
CONFIG_BLOCK=y
CONFIG_BLK_DEV_BSG=y
# CONFIG_BLK_DEV_INTEGRITY is not set
CONFIG_BLOCK_COMPAT=y

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
CONFIG_DEFAULT_AS=y
# CONFIG_DEFAULT_DEADLINE is not set
# CONFIG_DEFAULT_CFQ is not set
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="anticipatory"
CONFIG_FREEZER=y

#
# Processor type and features
#
# CONFIG_NO_HZ is not set
# CONFIG_HIGH_RES_TIMERS is not set
CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
CONFIG_SMP=y
CONFIG_X86_X2APIC=y
# CONFIG_SPARSE_IRQ is not set
CONFIG_X86_MPPARSE=y
# CONFIG_X86_EXTENDED_PLATFORM is not set
# CONFIG_SCHED_OMIT_FRAME_POINTER is not set
# CONFIG_PARAVIRT_GUEST is not set
# CONFIG_MEMTEST is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MGEODE_LX is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_MVIAC7 is not set
# CONFIG_MPSC is not set
CONFIG_MCORE2=y
# CONFIG_GENERIC_CPU is not set
CONFIG_X86_CPU=y
CONFIG_X86_L1_CACHE_BYTES=64
CONFIG_X86_INTERNODE_CACHE_BYTES=64
CONFIG_X86_CMPXCHG=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_P6_NOP=y
CONFIG_X86_TSC=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_CMOV=y
CONFIG_X86_MINIMUM_CPU_FAMILY=64
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_CPU_SUP_INTEL=y
CONFIG_CPU_SUP_AMD=y
CONFIG_CPU_SUP_CENTAUR=y
# CONFIG_X86_DS is not set
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
CONFIG_DMI=y
CONFIG_GART_IOMMU=y
# CONFIG_CALGARY_IOMMU is not set
# CONFIG_AMD_IOMMU is not set
CONFIG_SWIOTLB=y
CONFIG_IOMMU_HELPER=y
CONFIG_IOMMU_API=y
# CONFIG_MAXSMP is not set
CONFIG_NR_CPUS=2
# CONFIG_SCHED_SMT is not set
# CONFIG_SCHED_MC is not set
CONFIG_PREEMPT_NONE=y
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
# CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS is not set
CONFIG_X86_MCE=y
CONFIG_X86_NEW_MCE=y
CONFIG_X86_MCE_INTEL=y
# CONFIG_X86_MCE_AMD is not set
CONFIG_X86_MCE_THRESHOLD=y
# CONFIG_X86_MCE_INJECT is not set
CONFIG_X86_THERMAL_VECTOR=y
# CONFIG_I8K is not set
# CONFIG_MICROCODE is not set
CONFIG_X86_MSR=y
CONFIG_X86_CPUID=y
# CONFIG_X86_CPU_DEBUG is not set
CONFIG_ARCH_PHYS_ADDR_T_64BIT=y
CONFIG_DIRECT_GBPAGES=y
# CONFIG_NUMA is not set
CONFIG_ARCH_SPARSEMEM_DEFAULT=y
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_SELECT_MEMORY_MODEL=y
# CONFIG_FLATMEM_MANUAL is not set
# CONFIG_DISCONTIGMEM_MANUAL is not set
CONFIG_SPARSEMEM_MANUAL=y
CONFIG_SPARSEMEM=y
CONFIG_HAVE_MEMORY_PRESENT=y
CONFIG_SPARSEMEM_EXTREME=y
CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
CONFIG_SPARSEMEM_VMEMMAP=y

#
# Memory hotplug is currently incompatible with Software Suspend
#
CONFIG_PAGEFLAGS_EXTENDED=y
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_PHYS_ADDR_T_64BIT=y
CONFIG_ZONE_DMA_FLAG=1
CONFIG_BOUNCE=y
CONFIG_VIRT_TO_BUS=y
CONFIG_HAVE_MLOCK=y
CONFIG_HAVE_MLOCKED_PAGE_BIT=y
CONFIG_DEFAULT_MMAP_MIN_ADDR=4096
# CONFIG_X86_CHECK_BIOS_CORRUPTION is not set
# CONFIG_X86_RESERVE_LOW_64K is not set
CONFIG_MTRR=y
# CONFIG_MTRR_SANITIZER is not set
CONFIG_X86_PAT=y
# CONFIG_EFI is not set
CONFIG_SECCOMP=y
# CONFIG_CC_STACKPROTECTOR is not set
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
# CONFIG_SCHED_HRTICK is not set
# CONFIG_KEXEC is not set
# CONFIG_CRASH_DUMP is not set
CONFIG_PHYSICAL_START=0x1000000
# CONFIG_RELOCATABLE is not set
CONFIG_PHYSICAL_ALIGN=0x1000000
CONFIG_HOTPLUG_CPU=y
CONFIG_COMPAT_VDSO=y
# CONFIG_CMDLINE_BOOL is not set
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y

#
# Power management and ACPI options
#
CONFIG_ARCH_HIBERNATION_HEADER=y
CONFIG_PM=y
# CONFIG_PM_DEBUG is not set
CONFIG_PM_SLEEP_SMP=y
CONFIG_PM_SLEEP=y
CONFIG_SUSPEND=y
CONFIG_SUSPEND_FREEZER=y
CONFIG_HIBERNATION_NVS=y
CONFIG_HIBERNATION=y
CONFIG_PM_STD_PARTITION=""
CONFIG_ACPI=y
CONFIG_ACPI_SLEEP=y
# CONFIG_ACPI_PROCFS is not set
CONFIG_ACPI_PROCFS_POWER=y
CONFIG_ACPI_SYSFS_POWER=y
# CONFIG_ACPI_PROC_EVENT is not set
CONFIG_ACPI_AC=y
# CONFIG_ACPI_BATTERY is not set
CONFIG_ACPI_BUTTON=y
# CONFIG_ACPI_FAN is not set
CONFIG_ACPI_DOCK=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_HOTPLUG_CPU=y
CONFIG_ACPI_THERMAL=y
# CONFIG_ACPI_CUSTOM_DSDT is not set
CONFIG_ACPI_BLACKLIST_YEAR=0
CONFIG_ACPI_DEBUG=y
# CONFIG_ACPI_DEBUG_FUNC_TRACE is not set
# CONFIG_ACPI_PCI_SLOT is not set
CONFIG_X86_PM_TIMER=y
CONFIG_ACPI_CONTAINER=y
# CONFIG_ACPI_SBS is not set

#
# CPU Frequency scaling
#
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_TABLE=y
# CONFIG_CPU_FREQ_DEBUG is not set
CONFIG_CPU_FREQ_STAT=y
# CONFIG_CPU_FREQ_STAT_DETAILS is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
# CONFIG_CPU_FREQ_GOV_POWERSAVE is not set
# CONFIG_CPU_FREQ_GOV_USERSPACE is not set
# CONFIG_CPU_FREQ_GOV_ONDEMAND is not set
# CONFIG_CPU_FREQ_GOV_CONSERVATIVE is not set

#
# CPUFreq processor drivers
#
# CONFIG_X86_ACPI_CPUFREQ is not set
# CONFIG_X86_POWERNOW_K8 is not set
CONFIG_X86_SPEEDSTEP_CENTRINO=y
# CONFIG_X86_P4_CLOCKMOD is not set

#
# shared options
#
# CONFIG_X86_SPEEDSTEP_LIB is not set
CONFIG_CPU_IDLE=y
CONFIG_CPU_IDLE_GOV_LADDER=y

#
# Memory power savings
#
# CONFIG_I7300_IDLE is not set

#
# Bus options (PCI etc.)
#
CONFIG_PCI=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_PCI_DOMAINS=y
CONFIG_DMAR=y
CONFIG_DMAR_DEFAULT_ON=y
CONFIG_DMAR_GFX_WA=y
CONFIG_DMAR_FLOPPY_WA=y
CONFIG_INTR_REMAP=y
CONFIG_PCIEPORTBUS=y
CONFIG_PCIEAER=y
# CONFIG_PCIEASPM is not set
CONFIG_ARCH_SUPPORTS_MSI=y
CONFIG_PCI_MSI=y
CONFIG_PCI_LEGACY=y
# CONFIG_PCI_DEBUG is not set
# CONFIG_PCI_STUB is not set
# CONFIG_HT_IRQ is not set
# CONFIG_PCI_IOV is not set
CONFIG_ISA_DMA_API=y
CONFIG_K8_NB=y
# CONFIG_PCCARD is not set
# CONFIG_HOTPLUG_PCI is not set

#
# Executable file formats / Emulations
#
CONFIG_BINFMT_ELF=y
CONFIG_COMPAT_BINFMT_ELF=y
# CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
# CONFIG_HAVE_AOUT is not set
CONFIG_BINFMT_MISC=y
CONFIG_IA32_EMULATION=y
# CONFIG_IA32_AOUT is not set
CONFIG_COMPAT=y
CONFIG_COMPAT_FOR_U64_ALIGNMENT=y
CONFIG_SYSVIPC_COMPAT=y
CONFIG_NET=y

#
# Networking options
#
CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
CONFIG_UNIX=y
CONFIG_XFRM=y
CONFIG_XFRM_USER=m
CONFIG_XFRM_SUB_POLICY=y
CONFIG_XFRM_MIGRATE=y
CONFIG_XFRM_STATISTICS=y
CONFIG_NET_KEY=m
CONFIG_NET_KEY_MIGRATE=y
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
# CONFIG_IP_ADVANCED_ROUTER is not set
CONFIG_IP_FIB_HASH=y
# CONFIG_IP_PNP is not set
# CONFIG_NET_IPIP is not set
# CONFIG_NET_IPGRE is not set
# CONFIG_IP_MROUTE is not set
# CONFIG_ARPD is not set
# CONFIG_SYN_COOKIES is not set
# CONFIG_INET_AH is not set
# CONFIG_INET_ESP is not set
# CONFIG_INET_IPCOMP is not set
# CONFIG_INET_XFRM_TUNNEL is not set
# CONFIG_INET_TUNNEL is not set
# CONFIG_INET_XFRM_MODE_TRANSPORT is not set
# CONFIG_INET_XFRM_MODE_TUNNEL is not set
# CONFIG_INET_XFRM_MODE_BEET is not set
# CONFIG_INET_LRO is not set
CONFIG_INET_DIAG=y
CONFIG_INET_TCP_DIAG=y
# CONFIG_TCP_CONG_ADVANCED is not set
CONFIG_TCP_CONG_CUBIC=y
CONFIG_DEFAULT_TCP_CONG="cubic"
# CONFIG_TCP_MD5SIG is not set
# CONFIG_IPV6 is not set
CONFIG_NETLABEL=y
CONFIG_NETWORK_SECMARK=y
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set
CONFIG_NETFILTER_ADVANCED=y

#
# Core Netfilter Configuration
#
# CONFIG_NETFILTER_NETLINK_QUEUE is not set
# CONFIG_NETFILTER_NETLINK_LOG is not set
# CONFIG_NF_CONNTRACK is not set
CONFIG_NETFILTER_XTABLES=y
# CONFIG_NETFILTER_XT_TARGET_CLASSIFY is not set
# CONFIG_NETFILTER_XT_TARGET_MARK is not set
# CONFIG_NETFILTER_XT_TARGET_NFLOG is not set
# CONFIG_NETFILTER_XT_TARGET_NFQUEUE is not set
# CONFIG_NETFILTER_XT_TARGET_RATEEST is not set
# CONFIG_NETFILTER_XT_TARGET_SECMARK is not set
# CONFIG_NETFILTER_XT_TARGET_TCPMSS is not set
# CONFIG_NETFILTER_XT_MATCH_COMMENT is not set
# CONFIG_NETFILTER_XT_MATCH_DCCP is not set
# CONFIG_NETFILTER_XT_MATCH_DSCP is not set
# CONFIG_NETFILTER_XT_MATCH_ESP is not set
# CONFIG_NETFILTER_XT_MATCH_HASHLIMIT is not set
# CONFIG_NETFILTER_XT_MATCH_HL is not set
# CONFIG_NETFILTER_XT_MATCH_IPRANGE is not set
# CONFIG_NETFILTER_XT_MATCH_LENGTH is not set
# CONFIG_NETFILTER_XT_MATCH_LIMIT is not set
# CONFIG_NETFILTER_XT_MATCH_MAC is not set
# CONFIG_NETFILTER_XT_MATCH_MARK is not set
# CONFIG_NETFILTER_XT_MATCH_MULTIPORT is not set
# CONFIG_NETFILTER_XT_MATCH_OWNER is not set
# CONFIG_NETFILTER_XT_MATCH_POLICY is not set
# CONFIG_NETFILTER_XT_MATCH_PKTTYPE is not set
# CONFIG_NETFILTER_XT_MATCH_QUOTA is not set
# CONFIG_NETFILTER_XT_MATCH_RATEEST is not set
# CONFIG_NETFILTER_XT_MATCH_REALM is not set
# CONFIG_NETFILTER_XT_MATCH_RECENT is not set
# CONFIG_NETFILTER_XT_MATCH_SCTP is not set
# CONFIG_NETFILTER_XT_MATCH_STATISTIC is not set
# CONFIG_NETFILTER_XT_MATCH_STRING is not set
# CONFIG_NETFILTER_XT_MATCH_TCPMSS is not set
# CONFIG_NETFILTER_XT_MATCH_TIME is not set
# CONFIG_NETFILTER_XT_MATCH_U32 is not set
# CONFIG_IP_VS is not set

#
# IP: Netfilter Configuration
#
# CONFIG_NF_DEFRAG_IPV4 is not set
CONFIG_IP_NF_QUEUE=y
CONFIG_IP_NF_IPTABLES=y
# CONFIG_IP_NF_MATCH_ADDRTYPE is not set
# CONFIG_IP_NF_MATCH_AH is not set
# CONFIG_IP_NF_MATCH_ECN is not set
# CONFIG_IP_NF_MATCH_TTL is not set
CONFIG_IP_NF_FILTER=y
CONFIG_IP_NF_TARGET_REJECT=y
# CONFIG_IP_NF_TARGET_LOG is not set
# CONFIG_IP_NF_TARGET_ULOG is not set
# CONFIG_IP_NF_MANGLE is not set
# CONFIG_IP_NF_TARGET_TTL is not set
# CONFIG_IP_NF_RAW is not set
# CONFIG_IP_NF_SECURITY is not set
# CONFIG_IP_NF_ARPTABLES is not set
# CONFIG_IP_DCCP is not set
# CONFIG_IP_SCTP is not set
# CONFIG_TIPC is not set
# CONFIG_ATM is not set
# CONFIG_BRIDGE is not set
# CONFIG_NET_DSA is not set
# CONFIG_VLAN_8021Q is not set
# CONFIG_DECNET is not set
# CONFIG_LLC2 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_ECONET is not set
# CONFIG_WAN_ROUTER is not set
# CONFIG_PHONET is not set
# CONFIG_IEEE802154 is not set
# CONFIG_NET_SCHED is not set
# CONFIG_DCB is not set

#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
# CONFIG_HAMRADIO is not set
# CONFIG_CAN is not set
# CONFIG_IRDA is not set
# CONFIG_BT is not set
CONFIG_AF_RXRPC=m
CONFIG_AF_RXRPC_DEBUG=y
CONFIG_RXKAD=m
# CONFIG_WIRELESS is not set
# CONFIG_WIMAX is not set
# CONFIG_RFKILL is not set
# CONFIG_NET_9P is not set

#
# Device Drivers
#

#
# Generic Driver Options
#
CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=y
# CONFIG_FIRMWARE_IN_KERNEL is not set
CONFIG_EXTRA_FIRMWARE=""
# CONFIG_DEBUG_DRIVER is not set
# CONFIG_DEBUG_DEVRES is not set
# CONFIG_SYS_HYPERVISOR is not set
# CONFIG_CONNECTOR is not set
# CONFIG_MTD is not set
# CONFIG_PARPORT is not set
CONFIG_PNP=y
# CONFIG_PNP_DEBUG_MESSAGES is not set

#
# Protocols
#
CONFIG_PNPACPI=y
CONFIG_BLK_DEV=y
# CONFIG_BLK_DEV_FD is not set
# CONFIG_BLK_CPQ_DA is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_UMEM is not set
# CONFIG_BLK_DEV_COW_COMMON is not set
CONFIG_BLK_DEV_LOOP=y
# CONFIG_BLK_DEV_CRYPTOLOOP is not set
# CONFIG_BLK_DEV_NBD is not set
# CONFIG_BLK_DEV_SX8 is not set
# CONFIG_BLK_DEV_UB is not set
# CONFIG_BLK_DEV_RAM is not set
# CONFIG_CDROM_PKTCDVD is not set
# CONFIG_ATA_OVER_ETH is not set
# CONFIG_BLK_DEV_HD is not set
CONFIG_MISC_DEVICES=y
# CONFIG_IBM_ASM is not set
# CONFIG_PHANTOM is not set
# CONFIG_SGI_IOC4 is not set
# CONFIG_TIFM_CORE is not set
# CONFIG_ICS932S401 is not set
# CONFIG_ENCLOSURE_SERVICES is not set
# CONFIG_HP_ILO is not set
# CONFIG_ISL29003 is not set
# CONFIG_C2PORT is not set

#
# EEPROM support
#
# CONFIG_EEPROM_AT24 is not set
# CONFIG_EEPROM_LEGACY is not set
# CONFIG_EEPROM_MAX6875 is not set
# CONFIG_EEPROM_93CX6 is not set
# CONFIG_CB710_CORE is not set
CONFIG_HAVE_IDE=y
# CONFIG_IDE is not set

#
# SCSI device support
#
# CONFIG_RAID_ATTRS is not set
CONFIG_SCSI=y
CONFIG_SCSI_DMA=y
# CONFIG_SCSI_TGT is not set
# CONFIG_SCSI_NETLINK is not set
CONFIG_SCSI_PROC_FS=y

#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=y
# CONFIG_CHR_DEV_ST is not set
# CONFIG_CHR_DEV_OSST is not set
# CONFIG_BLK_DEV_SR is not set
# CONFIG_CHR_DEV_SG is not set
# CONFIG_CHR_DEV_SCH is not set
# CONFIG_SCSI_MULTI_LUN is not set
CONFIG_SCSI_CONSTANTS=y
# CONFIG_SCSI_LOGGING is not set
# CONFIG_SCSI_SCAN_ASYNC is not set
CONFIG_SCSI_WAIT_SCAN=m

#
# SCSI Transports
#
# CONFIG_SCSI_SPI_ATTRS is not set
# CONFIG_SCSI_FC_ATTRS is not set
# CONFIG_SCSI_ISCSI_ATTRS is not set
# CONFIG_SCSI_SAS_ATTRS is not set
# CONFIG_SCSI_SAS_LIBSAS is not set
# CONFIG_SCSI_SRP_ATTRS is not set
CONFIG_SCSI_LOWLEVEL=y
# CONFIG_ISCSI_TCP is not set
# CONFIG_SCSI_BNX2_ISCSI is not set
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
# CONFIG_SCSI_3W_9XXX is not set
# CONFIG_SCSI_ACARD is not set
# CONFIG_SCSI_AACRAID is not set
# CONFIG_SCSI_AIC7XXX is not set
# CONFIG_SCSI_AIC7XXX_OLD is not set
# CONFIG_SCSI_AIC79XX is not set
# CONFIG_SCSI_AIC94XX is not set
# CONFIG_SCSI_MVSAS is not set
# CONFIG_SCSI_DPT_I2O is not set
# CONFIG_SCSI_ADVANSYS is not set
# CONFIG_SCSI_ARCMSR is not set
# CONFIG_MEGARAID_NEWGEN is not set
# CONFIG_MEGARAID_LEGACY is not set
# CONFIG_MEGARAID_SAS is not set
# CONFIG_SCSI_MPT2SAS is not set
# CONFIG_SCSI_HPTIOP is not set
# CONFIG_SCSI_BUSLOGIC is not set
# CONFIG_LIBFC is not set
# CONFIG_LIBFCOE is not set
# CONFIG_FCOE is not set
# CONFIG_FCOE_FNIC is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_GDTH is not set
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_INITIO is not set
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_STEX is not set
# CONFIG_SCSI_SYM53C8XX_2 is not set
# CONFIG_SCSI_IPR is not set
# CONFIG_SCSI_QLOGIC_1280 is not set
# CONFIG_SCSI_QLA_FC is not set
# CONFIG_SCSI_QLA_ISCSI is not set
# CONFIG_SCSI_LPFC is not set
# CONFIG_SCSI_DC395x is not set
# CONFIG_SCSI_DC390T is not set
# CONFIG_SCSI_DEBUG is not set
# CONFIG_SCSI_SRP is not set
# CONFIG_SCSI_DH is not set
# CONFIG_SCSI_OSD_INITIATOR is not set
CONFIG_ATA=y
# CONFIG_ATA_NONSTANDARD is not set
CONFIG_ATA_ACPI=y
# CONFIG_SATA_PMP is not set
CONFIG_SATA_AHCI=y
# CONFIG_SATA_SIL24 is not set
# CONFIG_ATA_SFF is not set
# CONFIG_MD is not set
# CONFIG_FUSION is not set

#
# IEEE 1394 (FireWire) support
#

#
# Enable only one of the two stacks, unless you know what you are doing
#
# CONFIG_FIREWIRE is not set
# CONFIG_IEEE1394 is not set
# CONFIG_I2O is not set
# CONFIG_MACINTOSH_DRIVERS is not set
CONFIG_NETDEVICES=y
# CONFIG_DUMMY is not set
# CONFIG_BONDING is not set
# CONFIG_MACVLAN is not set
# CONFIG_EQUALIZER is not set
# CONFIG_TUN is not set
# CONFIG_VETH is not set
# CONFIG_NET_SB1000 is not set
# CONFIG_ARCNET is not set
# CONFIG_NET_ETHERNET is not set
CONFIG_NETDEV_1000=y
# CONFIG_ACENIC is not set
# CONFIG_DL2K is not set
# CONFIG_E1000 is not set
CONFIG_E1000E=y
# CONFIG_IP1000 is not set
# CONFIG_IGB is not set
# CONFIG_IGBVF is not set
# CONFIG_NS83820 is not set
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
# CONFIG_R8169 is not set
# CONFIG_SIS190 is not set
# CONFIG_SKGE is not set
# CONFIG_SKY2 is not set
# CONFIG_VIA_VELOCITY is not set
# CONFIG_TIGON3 is not set
# CONFIG_BNX2 is not set
# CONFIG_QLA3XXX is not set
# CONFIG_ATL1 is not set
# CONFIG_ATL1E is not set
# CONFIG_ATL1C is not set
# CONFIG_JME is not set
# CONFIG_NETDEV_10000 is not set
# CONFIG_TR is not set

#
# Wireless LAN
#
# CONFIG_WLAN_PRE80211 is not set
# CONFIG_WLAN_80211 is not set

#
# Enable WiMAX (Networking options) to see the WiMAX drivers
#

#
# USB Network Adapters
#
# CONFIG_USB_CATC is not set
# CONFIG_USB_KAWETH is not set
# CONFIG_USB_PEGASUS is not set
# CONFIG_USB_RTL8150 is not set
# CONFIG_USB_USBNET is not set
# CONFIG_WAN is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
# CONFIG_PPP is not set
# CONFIG_SLIP is not set
# CONFIG_NET_FC is not set
# CONFIG_NETCONSOLE is not set
# CONFIG_NETPOLL is not set
# CONFIG_NET_POLL_CONTROLLER is not set
# CONFIG_ISDN is not set
# CONFIG_PHONE is not set

#
# Input device support
#
CONFIG_INPUT=y
# CONFIG_INPUT_FF_MEMLESS is not set
# CONFIG_INPUT_POLLDEV is not set

#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
# CONFIG_INPUT_JOYDEV is not set
# CONFIG_INPUT_EVDEV is not set
# CONFIG_INPUT_EVBUG is not set

#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_XTKBD is not set
# CONFIG_KEYBOARD_NEWTON is not set
# CONFIG_KEYBOARD_STOWAWAY is not set
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
CONFIG_MOUSE_PS2_ALPS=y
CONFIG_MOUSE_PS2_LOGIPS2PP=y
CONFIG_MOUSE_PS2_SYNAPTICS=y
CONFIG_MOUSE_PS2_LIFEBOOK=y
CONFIG_MOUSE_PS2_TRACKPOINT=y
# CONFIG_MOUSE_PS2_ELANTECH is not set
# CONFIG_MOUSE_PS2_TOUCHKIT is not set
# CONFIG_MOUSE_SERIAL is not set
# CONFIG_MOUSE_APPLETOUCH is not set
# CONFIG_MOUSE_BCM5974 is not set
# CONFIG_MOUSE_VSXXXAA is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TABLET is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
# CONFIG_INPUT_MISC is not set

#
# Hardware I/O ports
#
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
# CONFIG_SERIO_SERPORT is not set
# CONFIG_SERIO_CT82C710 is not set
# CONFIG_SERIO_PCIPS2 is not set
CONFIG_SERIO_LIBPS2=y
# CONFIG_SERIO_RAW is not set
# CONFIG_GAMEPORT is not set

#
# Character devices
#
CONFIG_VT=y
CONFIG_CONSOLE_TRANSLATIONS=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
# CONFIG_VT_HW_CONSOLE_BINDING is not set
CONFIG_DEVKMEM=y
# CONFIG_SERIAL_NONSTANDARD is not set
# CONFIG_NOZOMI is not set

#
# Serial drivers
#
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_SERIAL_8250_PCI=y
CONFIG_SERIAL_8250_PNP=y
CONFIG_SERIAL_8250_NR_UARTS=4
CONFIG_SERIAL_8250_RUNTIME_UARTS=4
CONFIG_SERIAL_8250_EXTENDED=y
# CONFIG_SERIAL_8250_MANY_PORTS is not set
CONFIG_SERIAL_8250_SHARE_IRQ=y
# CONFIG_SERIAL_8250_DETECT_IRQ is not set
# CONFIG_SERIAL_8250_RSA is not set

#
# Non-8250 serial port support
#
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
# CONFIG_SERIAL_JSM is not set
CONFIG_UNIX98_PTYS=y
# CONFIG_DEVPTS_MULTIPLE_INSTANCES is not set
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256
# CONFIG_IPMI_HANDLER is not set
# CONFIG_HW_RANDOM is not set
# CONFIG_NVRAM is not set
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set
# CONFIG_MWAVE is not set
# CONFIG_PC8736x_GPIO is not set
# CONFIG_RAW_DRIVER is not set
# CONFIG_HPET is not set
# CONFIG_HANGCHECK_TIMER is not set
# CONFIG_TCG_TPM is not set
# CONFIG_TELCLOCK is not set
CONFIG_DEVPORT=y
CONFIG_I2C=y
CONFIG_I2C_BOARDINFO=y
CONFIG_I2C_CHARDEV=y
CONFIG_I2C_HELPER_AUTO=y
CONFIG_I2C_ALGOBIT=y

#
# I2C Hardware Bus support
#

#
# PC SMBus host controller drivers
#
# CONFIG_I2C_ALI1535 is not set
# CONFIG_I2C_ALI1563 is not set
# CONFIG_I2C_ALI15X3 is not set
# CONFIG_I2C_AMD756 is not set
# CONFIG_I2C_AMD8111 is not set
CONFIG_I2C_I801=y
# CONFIG_I2C_ISCH is not set
# CONFIG_I2C_PIIX4 is not set
# CONFIG_I2C_NFORCE2 is not set
# CONFIG_I2C_SIS5595 is not set
# CONFIG_I2C_SIS630 is not set
# CONFIG_I2C_SIS96X is not set
# CONFIG_I2C_VIA is not set
# CONFIG_I2C_VIAPRO is not set

#
# I2C system bus drivers (mostly embedded / system-on-chip)
#
# CONFIG_I2C_OCORES is not set
# CONFIG_I2C_SIMTEC is not set

#
# External I2C/SMBus adapter drivers
#
# CONFIG_I2C_PARPORT_LIGHT is not set
# CONFIG_I2C_TAOS_EVM is not set
# CONFIG_I2C_TINY_USB is not set

#
# Graphics adapter I2C/DDC channel drivers
#
# CONFIG_I2C_VOODOO3 is not set

#
# Other I2C/SMBus bus drivers
#
# CONFIG_I2C_PCA_PLATFORM is not set
# CONFIG_I2C_STUB is not set

#
# Miscellaneous I2C Chip support
#
# CONFIG_DS1682 is not set
# CONFIG_SENSORS_PCF8574 is not set
# CONFIG_PCF8575 is not set
# CONFIG_SENSORS_PCA9539 is not set
# CONFIG_SENSORS_TSL2550 is not set
# CONFIG_I2C_DEBUG_CORE is not set
# CONFIG_I2C_DEBUG_ALGO is not set
# CONFIG_I2C_DEBUG_BUS is not set
# CONFIG_I2C_DEBUG_CHIP is not set
# CONFIG_SPI is not set
CONFIG_ARCH_WANT_OPTIONAL_GPIOLIB=y
# CONFIG_GPIOLIB is not set
# CONFIG_W1 is not set
CONFIG_POWER_SUPPLY=y
# CONFIG_POWER_SUPPLY_DEBUG is not set
# CONFIG_PDA_POWER is not set
# CONFIG_BATTERY_DS2760 is not set
# CONFIG_BATTERY_BQ27x00 is not set
CONFIG_HWMON=y
# CONFIG_HWMON_VID is not set
# CONFIG_SENSORS_ABITUGURU is not set
# CONFIG_SENSORS_ABITUGURU3 is not set
# CONFIG_SENSORS_AD7414 is not set
# CONFIG_SENSORS_AD7418 is not set
# CONFIG_SENSORS_ADM1021 is not set
# CONFIG_SENSORS_ADM1025 is not set
# CONFIG_SENSORS_ADM1026 is not set
# CONFIG_SENSORS_ADM1029 is not set
# CONFIG_SENSORS_ADM1031 is not set
# CONFIG_SENSORS_ADM9240 is not set
# CONFIG_SENSORS_ADT7462 is not set
# CONFIG_SENSORS_ADT7470 is not set
# CONFIG_SENSORS_ADT7473 is not set
# CONFIG_SENSORS_ADT7475 is not set
# CONFIG_SENSORS_K8TEMP is not set
# CONFIG_SENSORS_ASB100 is not set
# CONFIG_SENSORS_ATK0110 is not set
# CONFIG_SENSORS_ATXP1 is not set
# CONFIG_SENSORS_DS1621 is not set
# CONFIG_SENSORS_I5K_AMB is not set
# CONFIG_SENSORS_F71805F is not set
# CONFIG_SENSORS_F71882FG is not set
# CONFIG_SENSORS_F75375S is not set
# CONFIG_SENSORS_FSCHER is not set
# CONFIG_SENSORS_FSCPOS is not set
# CONFIG_SENSORS_FSCHMD is not set
# CONFIG_SENSORS_G760A is not set
# CONFIG_SENSORS_GL518SM is not set
# CONFIG_SENSORS_GL520SM is not set
CONFIG_SENSORS_CORETEMP=y
# CONFIG_SENSORS_IT87 is not set
# CONFIG_SENSORS_LM63 is not set
# CONFIG_SENSORS_LM75 is not set
# CONFIG_SENSORS_LM77 is not set
# CONFIG_SENSORS_LM78 is not set
# CONFIG_SENSORS_LM80 is not set
# CONFIG_SENSORS_LM83 is not set
# CONFIG_SENSORS_LM85 is not set
# CONFIG_SENSORS_LM87 is not set
# CONFIG_SENSORS_LM90 is not set
# CONFIG_SENSORS_LM92 is not set
# CONFIG_SENSORS_LM93 is not set
# CONFIG_SENSORS_LTC4215 is not set
# CONFIG_SENSORS_LTC4245 is not set
# CONFIG_SENSORS_LM95241 is not set
# CONFIG_SENSORS_MAX1619 is not set
# CONFIG_SENSORS_MAX6650 is not set
# CONFIG_SENSORS_PC87360 is not set
# CONFIG_SENSORS_PC87427 is not set
# CONFIG_SENSORS_PCF8591 is not set
# CONFIG_SENSORS_SIS5595 is not set
# CONFIG_SENSORS_DME1737 is not set
# CONFIG_SENSORS_SMSC47M1 is not set
# CONFIG_SENSORS_SMSC47M192 is not set
# CONFIG_SENSORS_SMSC47B397 is not set
# CONFIG_SENSORS_ADS7828 is not set
# CONFIG_SENSORS_THMC50 is not set
# CONFIG_SENSORS_TMP401 is not set
# CONFIG_SENSORS_VIA686A is not set
# CONFIG_SENSORS_VT1211 is not set
# CONFIG_SENSORS_VT8231 is not set
# CONFIG_SENSORS_W83781D is not set
# CONFIG_SENSORS_W83791D is not set
# CONFIG_SENSORS_W83792D is not set
# CONFIG_SENSORS_W83793 is not set
# CONFIG_SENSORS_W83L785TS is not set
# CONFIG_SENSORS_W83L786NG is not set
# CONFIG_SENSORS_W83627HF is not set
# CONFIG_SENSORS_W83627EHF is not set
# CONFIG_SENSORS_HDAPS is not set
# CONFIG_SENSORS_LIS3LV02D is not set
# CONFIG_SENSORS_APPLESMC is not set
# CONFIG_HWMON_DEBUG_CHIP is not set
CONFIG_THERMAL=y
CONFIG_THERMAL_HWMON=y
# CONFIG_WATCHDOG is not set
CONFIG_SSB_POSSIBLE=y

#
# Sonics Silicon Backplane
#
# CONFIG_SSB is not set

#
# Multifunction device drivers
#
# CONFIG_MFD_CORE is not set
# CONFIG_MFD_SM501 is not set
# CONFIG_HTC_PASIC3 is not set
# CONFIG_TWL4030_CORE is not set
# CONFIG_MFD_TMIO is not set
# CONFIG_PMIC_DA903X is not set
# CONFIG_MFD_WM8400 is not set
# CONFIG_MFD_WM8350_I2C is not set
# CONFIG_MFD_PCF50633 is not set
# CONFIG_REGULATOR is not set
# CONFIG_MEDIA_SUPPORT is not set

#
# Graphics support
#
CONFIG_AGP=y
CONFIG_AGP_AMD64=y
CONFIG_AGP_INTEL=y
# CONFIG_AGP_SIS is not set
# CONFIG_AGP_VIA is not set
# CONFIG_DRM is not set
# CONFIG_VGASTATE is not set
CONFIG_VIDEO_OUTPUT_CONTROL=y
CONFIG_FB=y
CONFIG_FIRMWARE_EDID=y
CONFIG_FB_DDC=y
CONFIG_FB_BOOT_VESA_SUPPORT=y
CONFIG_FB_CFB_FILLRECT=y
CONFIG_FB_CFB_COPYAREA=y
CONFIG_FB_CFB_IMAGEBLIT=y
# CONFIG_FB_CFB_REV_PIXELS_IN_BYTE is not set
# CONFIG_FB_SYS_FILLRECT is not set
# CONFIG_FB_SYS_COPYAREA is not set
# CONFIG_FB_SYS_IMAGEBLIT is not set
# CONFIG_FB_FOREIGN_ENDIAN is not set
# CONFIG_FB_SYS_FOPS is not set
# CONFIG_FB_SVGALIB is not set
# CONFIG_FB_MACMODES is not set
# CONFIG_FB_BACKLIGHT is not set
CONFIG_FB_MODE_HELPERS=y
# CONFIG_FB_TILEBLITTING is not set

#
# Frame buffer hardware drivers
#
# CONFIG_FB_CIRRUS is not set
# CONFIG_FB_PM2 is not set
# CONFIG_FB_CYBER2000 is not set
# CONFIG_FB_ARC is not set
# CONFIG_FB_ASILIANT is not set
# CONFIG_FB_IMSTT is not set
# CONFIG_FB_VGA16 is not set
# CONFIG_FB_VESA is not set
# CONFIG_FB_N411 is not set
# CONFIG_FB_HGA is not set
# CONFIG_FB_S1D13XXX is not set
# CONFIG_FB_NVIDIA is not set
# CONFIG_FB_RIVA is not set
# CONFIG_FB_LE80578 is not set
CONFIG_FB_INTEL=y
# CONFIG_FB_INTEL_DEBUG is not set
CONFIG_FB_INTEL_I2C=y
# CONFIG_FB_MATROX is not set
# CONFIG_FB_RADEON is not set
# CONFIG_FB_ATY128 is not set
# CONFIG_FB_ATY is not set
# CONFIG_FB_S3 is not set
# CONFIG_FB_SAVAGE is not set
# CONFIG_FB_SIS is not set
# CONFIG_FB_VIA is not set
# CONFIG_FB_NEOMAGIC is not set
# CONFIG_FB_KYRO is not set
# CONFIG_FB_3DFX is not set
# CONFIG_FB_VOODOO1 is not set
# CONFIG_FB_VT8623 is not set
# CONFIG_FB_TRIDENT is not set
# CONFIG_FB_ARK is not set
# CONFIG_FB_PM3 is not set
# CONFIG_FB_CARMINE is not set
# CONFIG_FB_GEODE is not set
# CONFIG_FB_VIRTUAL is not set
# CONFIG_FB_METRONOME is not set
# CONFIG_FB_MB862XX is not set
# CONFIG_FB_BROADSHEET is not set
# CONFIG_BACKLIGHT_LCD_SUPPORT is not set

#
# Display device support
#
# CONFIG_DISPLAY_SUPPORT is not set

#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
# CONFIG_VGACON_SOFT_SCROLLBACK is not set
CONFIG_DUMMY_CONSOLE=y
# CONFIG_FRAMEBUFFER_CONSOLE is not set
# CONFIG_LOGO is not set
# CONFIG_SOUND is not set
CONFIG_HID_SUPPORT=y
CONFIG_HID=m
CONFIG_HID_DEBUG=y
# CONFIG_HIDRAW is not set

#
# USB Input Devices
#
CONFIG_USB_HID=m
# CONFIG_HID_PID is not set
# CONFIG_USB_HIDDEV is not set

#
# Special HID drivers
#
CONFIG_HID_A4TECH=m
CONFIG_HID_APPLE=m
CONFIG_HID_BELKIN=m
CONFIG_HID_CHERRY=m
CONFIG_HID_CHICONY=m
CONFIG_HID_CYPRESS=m
CONFIG_HID_DRAGONRISE=m
# CONFIG_DRAGONRISE_FF is not set
CONFIG_HID_EZKEY=m
CONFIG_HID_KYE=m
CONFIG_HID_GYRATION=m
CONFIG_HID_KENSINGTON=m
CONFIG_HID_LOGITECH=m
# CONFIG_LOGITECH_FF is not set
# CONFIG_LOGIRUMBLEPAD2_FF is not set
CONFIG_HID_MICROSOFT=m
CONFIG_HID_MONTEREY=m
CONFIG_HID_NTRIG=m
CONFIG_HID_PANTHERLORD=m
# CONFIG_PANTHERLORD_FF is not set
CONFIG_HID_PETALYNX=m
CONFIG_HID_SAMSUNG=m
CONFIG_HID_SONY=m
CONFIG_HID_SUNPLUS=m
CONFIG_HID_GREENASIA=m
# CONFIG_GREENASIA_FF is not set
CONFIG_HID_SMARTJOYPLUS=m
# CONFIG_SMARTJOYPLUS_FF is not set
CONFIG_HID_TOPSEED=m
CONFIG_HID_THRUSTMASTER=m
# CONFIG_THRUSTMASTER_FF is not set
CONFIG_HID_ZEROPLUS=m
# CONFIG_ZEROPLUS_FF is not set
CONFIG_USB_SUPPORT=y
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB_ARCH_HAS_OHCI=y
CONFIG_USB_ARCH_HAS_EHCI=y
CONFIG_USB=m
# CONFIG_USB_DEBUG is not set
# CONFIG_USB_ANNOUNCE_NEW_DEVICES is not set

#
# Miscellaneous USB options
#
CONFIG_USB_DEVICE_CLASS=y
# CONFIG_USB_DYNAMIC_MINORS is not set
# CONFIG_USB_SUSPEND is not set
# CONFIG_USB_OTG is not set
# CONFIG_USB_MON is not set
# CONFIG_USB_WUSB is not set
# CONFIG_USB_WUSB_CBAF is not set

#
# USB Host Controller Drivers
#
# CONFIG_USB_C67X00_HCD is not set
# CONFIG_USB_XHCI_HCD is not set
# CONFIG_USB_EHCI_HCD is not set
# CONFIG_USB_OXU210HP_HCD is not set
# CONFIG_USB_ISP116X_HCD is not set
# CONFIG_USB_ISP1760_HCD is not set
# CONFIG_USB_OHCI_HCD is not set
# CONFIG_USB_UHCI_HCD is not set
# CONFIG_USB_SL811_HCD is not set
# CONFIG_USB_R8A66597_HCD is not set
# CONFIG_USB_WHCI_HCD is not set
# CONFIG_USB_HWA_HCD is not set

#
# Enable Host or Gadget support to see Inventra options
#

#
# USB Device Class drivers
#
# CONFIG_USB_ACM is not set
# CONFIG_USB_PRINTER is not set
# CONFIG_USB_WDM is not set
# CONFIG_USB_TMC is not set

#
# NOTE: USB_STORAGE depends on SCSI but BLK_DEV_SD may
#

#
# also be needed; see USB_STORAGE Help for more info
#
# CONFIG_USB_STORAGE is not set
# CONFIG_USB_LIBUSUAL is not set

#
# USB Imaging devices
#
# CONFIG_USB_MDC800 is not set
# CONFIG_USB_MICROTEK is not set

#
# USB port drivers
#
# CONFIG_USB_SERIAL is not set

#
# USB Miscellaneous drivers
#
# CONFIG_USB_EMI62 is not set
# CONFIG_USB_EMI26 is not set
# CONFIG_USB_ADUTUX is not set
# CONFIG_USB_SEVSEG is not set
# CONFIG_USB_RIO500 is not set
# CONFIG_USB_LEGOTOWER is not set
# CONFIG_USB_LCD is not set
# CONFIG_USB_BERRY_CHARGE is not set
# CONFIG_USB_LED is not set
# CONFIG_USB_CYPRESS_CY7C63 is not set
# CONFIG_USB_CYTHERM is not set
# CONFIG_USB_IDMOUSE is not set
# CONFIG_USB_FTDI_ELAN is not set
# CONFIG_USB_APPLEDISPLAY is not set
# CONFIG_USB_LD is not set
# CONFIG_USB_TRANCEVIBRATOR is not set
# CONFIG_USB_IOWARRIOR is not set
# CONFIG_USB_ISIGHTFW is not set
# CONFIG_USB_VST is not set
# CONFIG_USB_GADGET is not set

#
# OTG and related infrastructure
#
# CONFIG_NOP_USB_XCEIV is not set
# CONFIG_UWB is not set
# CONFIG_MMC is not set
# CONFIG_MEMSTICK is not set
# CONFIG_NEW_LEDS is not set
# CONFIG_ACCESSIBILITY is not set
# CONFIG_INFINIBAND is not set
# CONFIG_EDAC is not set
CONFIG_RTC_LIB=y
CONFIG_RTC_CLASS=y
CONFIG_RTC_HCTOSYS=y
CONFIG_RTC_HCTOSYS_DEVICE="rtc0"
# CONFIG_RTC_DEBUG is not set

#
# RTC interfaces
#
CONFIG_RTC_INTF_SYSFS=y
CONFIG_RTC_INTF_PROC=y
CONFIG_RTC_INTF_DEV=y
# CONFIG_RTC_INTF_DEV_UIE_EMUL is not set
# CONFIG_RTC_DRV_TEST is not set

#
# I2C RTC drivers
#
# CONFIG_RTC_DRV_DS1307 is not set
# CONFIG_RTC_DRV_DS1374 is not set
# CONFIG_RTC_DRV_DS1672 is not set
# CONFIG_RTC_DRV_MAX6900 is not set
# CONFIG_RTC_DRV_RS5C372 is not set
# CONFIG_RTC_DRV_ISL1208 is not set
# CONFIG_RTC_DRV_X1205 is not set
# CONFIG_RTC_DRV_PCF8563 is not set
# CONFIG_RTC_DRV_PCF8583 is not set
# CONFIG_RTC_DRV_M41T80 is not set
# CONFIG_RTC_DRV_S35390A is not set
# CONFIG_RTC_DRV_FM3130 is not set
# CONFIG_RTC_DRV_RX8581 is not set

#
# SPI RTC drivers
#

#
# Platform RTC drivers
#
CONFIG_RTC_DRV_CMOS=y
# CONFIG_RTC_DRV_DS1286 is not set
# CONFIG_RTC_DRV_DS1511 is not set
# CONFIG_RTC_DRV_DS1553 is not set
# CONFIG_RTC_DRV_DS1742 is not set
# CONFIG_RTC_DRV_STK17TA8 is not set
# CONFIG_RTC_DRV_M48T86 is not set
# CONFIG_RTC_DRV_M48T35 is not set
# CONFIG_RTC_DRV_M48T59 is not set
# CONFIG_RTC_DRV_BQ4802 is not set
# CONFIG_RTC_DRV_V3020 is not set

#
# on-CPU RTC drivers
#
# CONFIG_DMADEVICES is not set
# CONFIG_AUXDISPLAY is not set
# CONFIG_UIO is not set

#
# TI VLYNQ
#
# CONFIG_STAGING is not set
CONFIG_X86_PLATFORM_DEVICES=y
# CONFIG_ASUS_LAPTOP is not set
# CONFIG_THINKPAD_ACPI is not set
# CONFIG_INTEL_MENLOW is not set
# CONFIG_EEEPC_LAPTOP is not set
# CONFIG_ACPI_WMI is not set
# CONFIG_ACPI_ASUS is not set
# CONFIG_ACPI_TOSHIBA is not set

#
# Firmware Drivers
#
# CONFIG_EDD is not set
CONFIG_FIRMWARE_MEMMAP=y
# CONFIG_DELL_RBU is not set
# CONFIG_DCDBAS is not set
CONFIG_DMIID=y
# CONFIG_ISCSI_IBFT_FIND is not set

#
# File systems
#
# CONFIG_EXT2_FS is not set
CONFIG_EXT3_FS=y
# CONFIG_EXT3_DEFAULTS_TO_ORDERED is not set
CONFIG_EXT3_FS_XATTR=y
CONFIG_EXT3_FS_POSIX_ACL=y
CONFIG_EXT3_FS_SECURITY=y
# CONFIG_EXT4_FS is not set
CONFIG_JBD=y
# CONFIG_JBD_DEBUG is not set
CONFIG_FS_MBCACHE=y
# CONFIG_REISERFS_FS is not set
# CONFIG_JFS_FS is not set
CONFIG_FS_POSIX_ACL=y
CONFIG_XFS_FS=y
# CONFIG_XFS_QUOTA is not set
CONFIG_XFS_POSIX_ACL=y
# CONFIG_XFS_RT is not set
# CONFIG_XFS_DEBUG is not set
# CONFIG_GFS2_FS is not set
# CONFIG_OCFS2_FS is not set
# CONFIG_BTRFS_FS is not set
CONFIG_FILE_LOCKING=y
CONFIG_FSNOTIFY=y
CONFIG_DNOTIFY=y
CONFIG_INOTIFY=y
CONFIG_INOTIFY_USER=y
CONFIG_QUOTA=y
# CONFIG_QUOTA_NETLINK_INTERFACE is not set
CONFIG_PRINT_QUOTA_WARNING=y
CONFIG_QUOTA_TREE=y
# CONFIG_QFMT_V1 is not set
CONFIG_QFMT_V2=y
CONFIG_QUOTACTL=y
# CONFIG_AUTOFS_FS is not set
# CONFIG_AUTOFS4_FS is not set
# CONFIG_FUSE_FS is not set
CONFIG_GENERIC_ACL=y

#
# Caches
#
CONFIG_FSCACHE=m
CONFIG_FSCACHE_STATS=y
CONFIG_FSCACHE_HISTOGRAM=y
CONFIG_FSCACHE_DEBUG=y
CONFIG_CACHEFILES=m
CONFIG_CACHEFILES_DEBUG=y
CONFIG_CACHEFILES_HISTOGRAM=y

#
# CD-ROM/DVD Filesystems
#
# CONFIG_ISO9660_FS is not set
# CONFIG_UDF_FS is not set

#
# DOS/FAT/NT Filesystems
#
# CONFIG_MSDOS_FS is not set
# CONFIG_VFAT_FS is not set
# CONFIG_NTFS_FS is not set

#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_PROC_SYSCTL=y
CONFIG_PROC_PAGE_MONITOR=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
CONFIG_TMPFS_POSIX_ACL=y
# CONFIG_HUGETLBFS is not set
# CONFIG_HUGETLB_PAGE is not set
CONFIG_CONFIGFS_FS=m
CONFIG_MISC_FILESYSTEMS=y
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_ECRYPT_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
# CONFIG_CRAMFS is not set
# CONFIG_SQUASHFS is not set
# CONFIG_VXFS_FS is not set
# CONFIG_MINIX_FS is not set
# CONFIG_OMFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_ROMFS_FS is not set
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set
# CONFIG_NILFS2_FS is not set
CONFIG_NETWORK_FILESYSTEMS=y
CONFIG_NFS_FS=m
CONFIG_NFS_V3=y
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_V4=y
CONFIG_NFS_FSCACHE=y
CONFIG_NFSD=m
CONFIG_NFSD_V2_ACL=y
CONFIG_NFSD_V3=y
CONFIG_NFSD_V3_ACL=y
CONFIG_NFSD_V4=y
CONFIG_LOCKD=m
CONFIG_LOCKD_V4=y
CONFIG_EXPORTFS=y
CONFIG_NFS_ACL_SUPPORT=m
CONFIG_NFS_COMMON=y
CONFIG_SUNRPC=m
CONFIG_SUNRPC_GSS=m
CONFIG_RPCSEC_GSS_KRB5=m
CONFIG_RPCSEC_GSS_SPKM3=m
# CONFIG_SMB_FS is not set
# CONFIG_CIFS is not set
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set
CONFIG_AFS_FS=m
CONFIG_AFS_DEBUG=y
CONFIG_AFS_FSCACHE=y

#
# Partition Types
#
# CONFIG_PARTITION_ADVANCED is not set
CONFIG_MSDOS_PARTITION=y
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=m
# CONFIG_NLS_CODEPAGE_737 is not set
# CONFIG_NLS_CODEPAGE_775 is not set
# CONFIG_NLS_CODEPAGE_850 is not set
# CONFIG_NLS_CODEPAGE_852 is not set
# CONFIG_NLS_CODEPAGE_855 is not set
# CONFIG_NLS_CODEPAGE_857 is not set
# CONFIG_NLS_CODEPAGE_860 is not set
# CONFIG_NLS_CODEPAGE_861 is not set
# CONFIG_NLS_CODEPAGE_862 is not set
# CONFIG_NLS_CODEPAGE_863 is not set
# CONFIG_NLS_CODEPAGE_864 is not set
# CONFIG_NLS_CODEPAGE_865 is not set
# CONFIG_NLS_CODEPAGE_866 is not set
# CONFIG_NLS_CODEPAGE_869 is not set
# CONFIG_NLS_CODEPAGE_936 is not set
# CONFIG_NLS_CODEPAGE_950 is not set
# CONFIG_NLS_CODEPAGE_932 is not set
# CONFIG_NLS_CODEPAGE_949 is not set
# CONFIG_NLS_CODEPAGE_874 is not set
# CONFIG_NLS_ISO8859_8 is not set
# CONFIG_NLS_CODEPAGE_1250 is not set
# CONFIG_NLS_CODEPAGE_1251 is not set
# CONFIG_NLS_ASCII is not set
CONFIG_NLS_ISO8859_1=m
# CONFIG_NLS_ISO8859_2 is not set
# CONFIG_NLS_ISO8859_3 is not set
# CONFIG_NLS_ISO8859_4 is not set
# CONFIG_NLS_ISO8859_5 is not set
# CONFIG_NLS_ISO8859_6 is not set
# CONFIG_NLS_ISO8859_7 is not set
# CONFIG_NLS_ISO8859_9 is not set
# CONFIG_NLS_ISO8859_13 is not set
# CONFIG_NLS_ISO8859_14 is not set
# CONFIG_NLS_ISO8859_15 is not set
# CONFIG_NLS_KOI8_R is not set
# CONFIG_NLS_KOI8_U is not set
CONFIG_NLS_UTF8=m
# CONFIG_DLM is not set

#
# Kernel hacking
#
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
# CONFIG_PRINTK_TIME is not set
# CONFIG_ENABLE_WARN_DEPRECATED is not set
# CONFIG_ENABLE_MUST_CHECK is not set
CONFIG_FRAME_WARN=2048
CONFIG_MAGIC_SYSRQ=y
CONFIG_UNUSED_SYMBOLS=y
CONFIG_DEBUG_FS=y
# CONFIG_HEADERS_CHECK is not set
CONFIG_DEBUG_KERNEL=y
# CONFIG_DEBUG_SHIRQ is not set
# CONFIG_DETECT_SOFTLOCKUP is not set
# CONFIG_DETECT_HUNG_TASK is not set
# CONFIG_SCHED_DEBUG is not set
# CONFIG_SCHEDSTATS is not set
# CONFIG_TIMER_STATS is not set
# CONFIG_DEBUG_OBJECTS is not set
CONFIG_DEBUG_SLAB=y
CONFIG_DEBUG_SLAB_LEAK=y
# CONFIG_DEBUG_KMEMLEAK is not set
# CONFIG_DEBUG_RT_MUTEXES is not set
# CONFIG_RT_MUTEX_TESTER is not set
# CONFIG_DEBUG_SPINLOCK is not set
# CONFIG_DEBUG_MUTEXES is not set
# CONFIG_DEBUG_LOCK_ALLOC is not set
# CONFIG_PROVE_LOCKING is not set
# CONFIG_LOCK_STAT is not set
# CONFIG_DEBUG_SPINLOCK_SLEEP is not set
# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
# CONFIG_DEBUG_KOBJECT is not set
CONFIG_DEBUG_BUGVERBOSE=y
# CONFIG_DEBUG_INFO is not set
# CONFIG_DEBUG_VM is not set
# CONFIG_DEBUG_VIRTUAL is not set
# CONFIG_DEBUG_WRITECOUNT is not set
CONFIG_DEBUG_MEMORY_INIT=y
# CONFIG_DEBUG_LIST is not set
# CONFIG_DEBUG_SG is not set
# CONFIG_DEBUG_NOTIFIERS is not set
CONFIG_ARCH_WANT_FRAME_POINTERS=y
# CONFIG_FRAME_POINTER is not set
# CONFIG_BOOT_PRINTK_DELAY is not set
# CONFIG_RCU_TORTURE_TEST is not set
# CONFIG_RCU_CPU_STALL_DETECTOR is not set
# CONFIG_BACKTRACE_SELF_TEST is not set
# CONFIG_DEBUG_BLOCK_EXT_DEVT is not set
# CONFIG_FAULT_INJECTION is not set
# CONFIG_LATENCYTOP is not set
CONFIG_SYSCTL_SYSCALL_CHECK=y
# CONFIG_DEBUG_PAGEALLOC is not set
CONFIG_USER_STACKTRACE_SUPPORT=y
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_FTRACE_SYSCALLS=y
CONFIG_TRACING_SUPPORT=y
# CONFIG_FTRACE is not set
# CONFIG_PROVIDE_OHCI1394_DMA_INIT is not set
# CONFIG_DYNAMIC_DEBUG is not set
# CONFIG_DMA_API_DEBUG is not set
# CONFIG_SAMPLES is not set
CONFIG_HAVE_ARCH_KGDB=y
# CONFIG_KGDB is not set
CONFIG_HAVE_ARCH_KMEMCHECK=y
# CONFIG_STRICT_DEVMEM is not set
CONFIG_X86_VERBOSE_BOOTUP=y
CONFIG_EARLY_PRINTK=y
# CONFIG_EARLY_PRINTK_DBGP is not set
# CONFIG_DEBUG_STACKOVERFLOW is not set
# CONFIG_DEBUG_STACK_USAGE is not set
# CONFIG_DEBUG_PER_CPU_MAPS is not set
# CONFIG_X86_PTDUMP is not set
CONFIG_DEBUG_RODATA=y
# CONFIG_DEBUG_RODATA_TEST is not set
# CONFIG_DEBUG_NX_TEST is not set
# CONFIG_IOMMU_DEBUG is not set
# CONFIG_IOMMU_STRESS is not set
CONFIG_HAVE_MMIOTRACE_SUPPORT=y
CONFIG_IO_DELAY_TYPE_0X80=0
CONFIG_IO_DELAY_TYPE_0XED=1
CONFIG_IO_DELAY_TYPE_UDELAY=2
CONFIG_IO_DELAY_TYPE_NONE=3
CONFIG_IO_DELAY_0X80=y
# CONFIG_IO_DELAY_0XED is not set
# CONFIG_IO_DELAY_UDELAY is not set
# CONFIG_IO_DELAY_NONE is not set
CONFIG_DEFAULT_IO_DELAY_TYPE=0
# CONFIG_DEBUG_BOOT_PARAMS is not set
# CONFIG_CPA_DEBUG is not set
# CONFIG_OPTIMIZE_INLINING is not set

#
# Security options
#
CONFIG_KEYS=y
CONFIG_KEYS_DEBUG_PROC_KEYS=y
CONFIG_SECURITY=y
CONFIG_SECURITYFS=y
CONFIG_SECURITY_NETWORK=y
CONFIG_SECURITY_NETWORK_XFRM=y
# CONFIG_SECURITY_PATH is not set
CONFIG_SECURITY_FILE_CAPABILITIES=y
CONFIG_SECURITY_SELINUX=y
CONFIG_SECURITY_SELINUX_BOOTPARAM=y
CONFIG_SECURITY_SELINUX_BOOTPARAM_VALUE=1
CONFIG_SECURITY_SELINUX_DISABLE=y
CONFIG_SECURITY_SELINUX_DEVELOP=y
CONFIG_SECURITY_SELINUX_AVC_STATS=y
CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE=1
# CONFIG_SECURITY_SELINUX_POLICYDB_VERSION_MAX is not set
CONFIG_SECURITY_SMACK=y
# CONFIG_SECURITY_TOMOYO is not set
# CONFIG_IMA is not set
CONFIG_CRYPTO=y

#
# Crypto core or helper
#
# CONFIG_CRYPTO_FIPS is not set
CONFIG_CRYPTO_ALGAPI=y
CONFIG_CRYPTO_ALGAPI2=y
CONFIG_CRYPTO_AEAD2=y
CONFIG_CRYPTO_BLKCIPHER=y
CONFIG_CRYPTO_BLKCIPHER2=y
CONFIG_CRYPTO_HASH=y
CONFIG_CRYPTO_HASH2=y
CONFIG_CRYPTO_RNG2=y
CONFIG_CRYPTO_PCOMP=y
CONFIG_CRYPTO_MANAGER=y
CONFIG_CRYPTO_MANAGER2=y
# CONFIG_CRYPTO_GF128MUL is not set
# CONFIG_CRYPTO_NULL is not set
CONFIG_CRYPTO_WORKQUEUE=y
# CONFIG_CRYPTO_CRYPTD is not set
# CONFIG_CRYPTO_AUTHENC is not set
# CONFIG_CRYPTO_TEST is not set

#
# Authenticated Encryption with Associated Data
#
# CONFIG_CRYPTO_CCM is not set
# CONFIG_CRYPTO_GCM is not set
# CONFIG_CRYPTO_SEQIV is not set

#
# Block modes
#
CONFIG_CRYPTO_CBC=y
# CONFIG_CRYPTO_CTR is not set
# CONFIG_CRYPTO_CTS is not set
# CONFIG_CRYPTO_ECB is not set
# CONFIG_CRYPTO_LRW is not set
CONFIG_CRYPTO_PCBC=y
# CONFIG_CRYPTO_XTS is not set

#
# Hash modes
#
# CONFIG_CRYPTO_HMAC is not set
# CONFIG_CRYPTO_XCBC is not set

#
# Digest
#
# CONFIG_CRYPTO_CRC32C is not set
# CONFIG_CRYPTO_CRC32C_INTEL is not set
# CONFIG_CRYPTO_MD4 is not set
CONFIG_CRYPTO_MD5=y
# CONFIG_CRYPTO_MICHAEL_MIC is not set
# CONFIG_CRYPTO_RMD128 is not set
# CONFIG_CRYPTO_RMD160 is not set
# CONFIG_CRYPTO_RMD256 is not set
# CONFIG_CRYPTO_RMD320 is not set
# CONFIG_CRYPTO_SHA1 is not set
# CONFIG_CRYPTO_SHA256 is not set
# CONFIG_CRYPTO_SHA512 is not set
# CONFIG_CRYPTO_TGR192 is not set
# CONFIG_CRYPTO_WP512 is not set

#
# Ciphers
#
# CONFIG_CRYPTO_AES is not set
# CONFIG_CRYPTO_AES_X86_64 is not set
# CONFIG_CRYPTO_AES_NI_INTEL is not set
# CONFIG_CRYPTO_ANUBIS is not set
# CONFIG_CRYPTO_ARC4 is not set
# CONFIG_CRYPTO_BLOWFISH is not set
# CONFIG_CRYPTO_CAMELLIA is not set
CONFIG_CRYPTO_CAST5=m
# CONFIG_CRYPTO_CAST6 is not set
CONFIG_CRYPTO_DES=y
CONFIG_CRYPTO_FCRYPT=y
# CONFIG_CRYPTO_KHAZAD is not set
# CONFIG_CRYPTO_SALSA20 is not set
# CONFIG_CRYPTO_SALSA20_X86_64 is not set
# CONFIG_CRYPTO_SEED is not set
# CONFIG_CRYPTO_SERPENT is not set
# CONFIG_CRYPTO_TEA is not set
# CONFIG_CRYPTO_TWOFISH is not set
# CONFIG_CRYPTO_TWOFISH_X86_64 is not set

#
# Compression
#
# CONFIG_CRYPTO_DEFLATE is not set
# CONFIG_CRYPTO_ZLIB is not set
# CONFIG_CRYPTO_LZO is not set

#
# Random Number Generation
#
# CONFIG_CRYPTO_ANSI_CPRNG is not set
# CONFIG_CRYPTO_HW is not set
CONFIG_HAVE_KVM=y
CONFIG_HAVE_KVM_IRQCHIP=y
# CONFIG_VIRTUALIZATION is not set
# CONFIG_BINARY_PRINTF is not set

#
# Library routines
#
CONFIG_BITREVERSE=m
CONFIG_GENERIC_FIND_FIRST_BIT=y
CONFIG_GENERIC_FIND_NEXT_BIT=y
CONFIG_GENERIC_FIND_LAST_BIT=y
# CONFIG_CRC_CCITT is not set
CONFIG_CRC16=m
CONFIG_CRC_T10DIF=y
CONFIG_CRC_ITU_T=m
CONFIG_CRC32=m
# CONFIG_CRC7 is not set
# CONFIG_LIBCRC32C is not set
CONFIG_ZLIB_INFLATE=y
CONFIG_DECOMPRESS_GZIP=y
CONFIG_DECOMPRESS_BZIP2=y
CONFIG_DECOMPRESS_LZMA=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT=y
CONFIG_HAS_DMA=y
CONFIG_NLATTR=y

2009-06-18 16:19:39

by David Howells

Subject: Re: [PATCH 0/3] make mapped executable pages the first class citizen


Okay, after dropping all my devel patches, I got the OOM to happen again;
fresh trace attached. I was running LTP and an NFSD, and I was spamming the
NFSD continuously from another machine (mount;tar;umount;repeat).
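
The client-side load (mount;tar;umount;repeat) can be sketched roughly as below. This is only an illustration of that loop, not the script that was actually run: the export name, mount point, archive path, and the choice to have tar read the mounted tree into a local archive are all assumptions, and the function names are hypothetical.

```python
import subprocess


def cycle_commands(export, mountpoint, archive):
    """Return the argv lists for one mount/tar/umount cycle.

    All three arguments are placeholders chosen for illustration;
    the original message does not say which paths were used.
    """
    return [
        # Mount the NFS export served by the machine under test.
        ["mount", "-t", "nfs", export, mountpoint],
        # Assumption: tar reads the mounted tree into a local archive,
        # which generates read traffic against the NFSD.
        ["tar", "-cf", archive, "-C", mountpoint, "."],
        # Unmount again so the next cycle starts from a fresh mount.
        ["umount", mountpoint],
    ]


def run_cycles(export, mountpoint, archive, count=None):
    """Run cycles until a command fails, or `count` times if given.

    Returns the number of completed cycles. Needs root to mount.
    """
    done = 0
    while count is None or done < count:
        for argv in cycle_commands(export, mountpoint, archive):
            # check=True stops the loop on the first failing command.
            subprocess.run(argv, check=True)
        done += 1
    return done
```

Calling something like run_cycles("andromeda:/export", "/mnt/test", "/tmp/nfs-stress.tar") as root on the second machine would then repeat the cycle until a command fails; those three values are made up for the example.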

David
---
Initializing cgroup subsys cpuset
Linux version 2.6.30-cachefs ([email protected]) (gcc version 4.4.0 20090506 (Red Hat 4.4.0-4) (GCC) ) #107 SMP Thu Jun 18 15:36:16 BST 2009
Command line: initrd=andromeda-initrd console=tty0 console=ttyS0,115200 ro root=/dev/sda2 enforcing=1 debug BOOT_IMAGE=andromeda-vmlinuz
KERNEL supported cpus:
Intel GenuineIntel
AMD AuthenticAMD
Centaur CentaurHauls
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009ec00 (usable)
BIOS-e820: 000000000009ec00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000003e59a000 (usable)
BIOS-e820: 000000003e59a000 - 000000003e5a6000 (reserved)
BIOS-e820: 000000003e5a6000 - 000000003e644000 (usable)
BIOS-e820: 000000003e644000 - 000000003e6a9000 (ACPI NVS)
BIOS-e820: 000000003e6a9000 - 000000003e6ac000 (ACPI data)
BIOS-e820: 000000003e6ac000 - 000000003e6f2000 (ACPI NVS)
BIOS-e820: 000000003e6f2000 - 000000003e6ff000 (ACPI data)
BIOS-e820: 000000003e6ff000 - 000000003e700000 (usable)
BIOS-e820: 000000003e700000 - 000000003f000000 (reserved)
BIOS-e820: 00000000fff00000 - 0000000100000000 (reserved)
DMI 2.4 present.
last_pfn = 0x3e700 max_arch_pfn = 0x400000000
MTRR default type: uncachable
MTRR fixed ranges enabled:
00000-9FFFF write-back
A0000-FFFFF uncachable
MTRR variable ranges enabled:
0 base 000000000 mask FC0000000 write-back
1 base 03F000000 mask FFF000000 uncachable
2 base 03E800000 mask FFF800000 uncachable
3 base 03E700000 mask FFFF00000 uncachable
4 disabled
5 disabled
6 disabled
7 disabled
x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
initial memory mapped : 0 - 20000000
init_memory_mapping: 0000000000000000-000000003e700000
0000000000 - 003e600000 page 2M
003e600000 - 003e700000 page 4k
kernel direct mapping tables up to 3e700000 @ 8000-b000
RAMDISK: 3e2ee000 - 3e57991c
ACPI: RSDP 00000000000fe020 00014 (v00 INTEL )
ACPI: RSDT 000000003e6fd038 0004C (v01 INTEL DG965RY 00000330 01000013)
ACPI: FACP 000000003e6fc000 00074 (v01 INTEL DG965RY 00000330 MSFT 01000013)
ACPI: DSDT 000000003e6f8000 03EDA (v01 INTEL DG965RY 00000330 MSFT 01000013)
ACPI: FACS 000000003e6ac000 00040
ACPI: APIC 000000003e6f7000 00078 (v01 INTEL DG965RY 00000330 MSFT 01000013)
ACPI: WDDT 000000003e6f6000 00040 (v01 INTEL DG965RY 00000330 MSFT 01000013)
ACPI: MCFG 000000003e6f5000 0003C (v01 INTEL DG965RY 00000330 MSFT 01000013)
ACPI: ASF! 000000003e6f4000 000A6 (v32 INTEL DG965RY 00000330 MSFT 01000013)
ACPI: SSDT 000000003e6f3000 001BC (v01 INTEL CpuPm 00000330 MSFT 01000013)
ACPI: SSDT 000000003e6f2000 00175 (v01 INTEL Cpu0Ist 00000330 MSFT 01000013)
ACPI: SSDT 000000003e6ab000 00175 (v01 INTEL Cpu1Ist 00000330 MSFT 01000013)
ACPI: SSDT 000000003e6aa000 00175 (v01 INTEL Cpu2Ist 00000330 MSFT 01000013)
ACPI: SSDT 000000003e6a9000 00175 (v01 INTEL Cpu3Ist 00000330 MSFT 01000013)
ACPI: Local APIC address 0xfee00000
(7 early reservations) ==> bootmem [0000000000 - 003e700000]
#0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000]
#1 [0000006000 - 0000008000] TRAMPOLINE ==> [0000006000 - 0000008000]
#2 [0001000000 - 0001535d90] TEXT DATA BSS ==> [0001000000 - 0001535d90]
#3 [003e2ee000 - 003e57991c] RAMDISK ==> [003e2ee000 - 003e57991c]
#4 [000009e800 - 0000100000] BIOS reserved ==> [000009e800 - 0000100000]
#5 [0001536000 - 0001536199] BRK ==> [0001536000 - 0001536199]
#6 [0000008000 - 0000009000] PGTABLE ==> [0000008000 - 0000009000]
found SMP MP-table at [ffff8800000fe200] fe200
[ffffea0000000000-ffffea0000dfffff] PMD -> [ffff880001a00000-ffff8800027fffff] on node 0
Zone PFN ranges:
DMA 0x00000000 -> 0x00001000
DMA32 0x00001000 -> 0x00100000
Normal 0x00100000 -> 0x00100000
Movable zone start PFN for each node
early_node_map[4] active PFN ranges
0: 0x00000000 -> 0x0000009e
0: 0x00000100 -> 0x0003e59a
0: 0x0003e5a6 -> 0x0003e644
0: 0x0003e6ff -> 0x0003e700
On node 0 totalpages: 255447
DMA zone: 56 pages used for memmap
DMA zone: 101 pages reserved
DMA zone: 3841 pages, LIFO batch:0
DMA32 zone: 3441 pages used for memmap
DMA32 zone: 248008 pages, LIFO batch:31
ACPI: PM-Timer IO Port: 0x408
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x82] disabled)
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x83] disabled)
ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Using ACPI (MADT) for SMP configuration information
4 Processors exceeds NR_CPUS limit of 2
SMP: Allowing 2 CPUs, 0 hotplug CPUs
nr_irqs_gsi: 24
PM: Registered nosave memory: 000000000009e000 - 000000000009f000
PM: Registered nosave memory: 000000000009f000 - 00000000000a0000
PM: Registered nosave memory: 00000000000a0000 - 00000000000e0000
PM: Registered nosave memory: 00000000000e0000 - 0000000000100000
PM: Registered nosave memory: 000000003e59a000 - 000000003e5a6000
PM: Registered nosave memory: 000000003e644000 - 000000003e6a9000
PM: Registered nosave memory: 000000003e6a9000 - 000000003e6ac000
PM: Registered nosave memory: 000000003e6ac000 - 000000003e6f2000
PM: Registered nosave memory: 000000003e6f2000 - 000000003e6ff000
Allocating PCI resources starting at 3f000000 (gap: 3f000000:c0f00000)
NR_CPUS:2 nr_cpumask_bits:2 nr_cpu_ids:2 nr_node_ids:1
PERCPU: Embedded 24 pages at ffff880001541000, static data 67296 bytes
Built 1 zonelists in Zone order, mobility grouping on. Total pages: 251849
Kernel command line: initrd=andromeda-initrd console=tty0 console=ttyS0,115200 ro root=/dev/sda2 enforcing=1 debug BOOT_IMAGE=andromeda-vmlinuz
PID hash table entries: 4096 (order: 12, 32768 bytes)
Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
Initializing CPU#0
Checking aperture...
No AGP bridge found
Memory: 996952k/1022976k available (2949k kernel code, 1188k absent, 24132k reserved, 1679k data, 360k init)
NR_IRQS:320
Fast TSC calibration using PIT
Detected 1865.185 MHz processor.
Console: colour VGA+ 80x25
console [tty0] enabled
console [ttyS0] enabled
Calibrating delay loop (skipped), value calculated using timer frequency.. 3730.37 BogoMIPS (lpj=7460740)
Security Framework initialized
SELinux: Initializing.
SELinux: Starting in enforcing mode
Mount-cache hash table entries: 256
Initializing cgroup subsys debug
Initializing cgroup subsys ns
Initializing cgroup subsys devices
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 2048K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
mce: CPU supports 6 MCE banks
CPU0: Thermal monitoring enabled (TM2)
using mwait in idle threads.
ACPI: Core revision 20090521
Setting APIC routing to flat
..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
CPU0: Intel(R) Core(TM)2 CPU 6300 @ 1.86GHz stepping 06
Booting processor 1 APIC 0x1 ip 0x6000
Initializing CPU#1
Calibrating delay using timer specific routine.. 3729.90 BogoMIPS (lpj=7459814)
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 2048K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 1
mce: CPU supports 6 MCE banks
CPU1: Thermal monitoring enabled (TM2)
x86 PAT enabled: cpu 1, old 0x7040600070406, new 0x7010600070106
CPU1: Intel(R) Core(TM)2 CPU 6300 @ 1.86GHz stepping 06
checking TSC synchronization [CPU#0 -> CPU#1]: passed.
Brought up 2 CPUs
Total of 2 processors activated (7460.27 BogoMIPS).
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: MCFG configuration 0: base f0000000 segment 0 buses 0 - 127
PCI: Not using MMCONFIG.
PCI: Using configuration type 1 for base access
bio: create slab <bio-0> at 0
ACPI: EC: Look up EC in DSDT
ACPI: Interpreter enabled
ACPI: (supports S0 S3 S4 S5)
ACPI: Using IOAPIC for interrupt routing
PCI: MCFG configuration 0: base f0000000 segment 0 buses 0 - 127
PCI: MCFG area at f0000000 reserved in ACPI motherboard resources
PCI: Using MMCONFIG at f0000000 - f7ffffff
ACPI: No dock devices found.
ACPI: PCI Root Bridge [PCI0] (0000:00)
pci 0000:00:02.0: reg 10 32bit mmio: [0x50200000-0x502fffff]
pci 0000:00:02.0: reg 18 64bit mmio: [0x40000000-0x4fffffff]
pci 0000:00:02.0: reg 20 io port: [0x2110-0x2117]
pci 0000:00:03.0: reg 10 64bit mmio: [0x50326100-0x5032610f]
pci 0000:00:03.0: PME# supported from D0 D3hot D3cold
pci 0000:00:03.0: PME# disabled
pci 0000:00:19.0: reg 10 32bit mmio: [0x50300000-0x5031ffff]
pci 0000:00:19.0: reg 14 32bit mmio: [0x50324000-0x50324fff]
pci 0000:00:19.0: reg 18 io port: [0x20e0-0x20ff]
pci 0000:00:19.0: PME# supported from D0 D3hot D3cold
pci 0000:00:19.0: PME# disabled
pci 0000:00:1a.0: reg 20 io port: [0x20c0-0x20df]
pci 0000:00:1a.1: reg 20 io port: [0x20a0-0x20bf]
pci 0000:00:1a.7: reg 10 32bit mmio: [0x50325c00-0x50325fff]
pci 0000:00:1a.7: PME# supported from D0 D3hot D3cold
pci 0000:00:1a.7: PME# disabled
pci 0000:00:1b.0: reg 10 64bit mmio: [0x50320000-0x50323fff]
pci 0000:00:1b.0: PME# supported from D0 D3hot D3cold
pci 0000:00:1b.0: PME# disabled
pci 0000:00:1c.0: PME# supported from D0 D3hot D3cold
pci 0000:00:1c.0: PME# disabled
pci 0000:00:1c.1: PME# supported from D0 D3hot D3cold
pci 0000:00:1c.1: PME# disabled
pci 0000:00:1c.2: PME# supported from D0 D3hot D3cold
pci 0000:00:1c.2: PME# disabled
pci 0000:00:1c.3: PME# supported from D0 D3hot D3cold
pci 0000:00:1c.3: PME# disabled
pci 0000:00:1c.4: PME# supported from D0 D3hot D3cold
pci 0000:00:1c.4: PME# disabled
pci 0000:00:1d.0: reg 20 io port: [0x2080-0x209f]
pci 0000:00:1d.1: reg 20 io port: [0x2060-0x207f]
pci 0000:00:1d.2: reg 20 io port: [0x2040-0x205f]
pci 0000:00:1d.7: reg 10 32bit mmio: [0x50325800-0x50325bff]
pci 0000:00:1d.7: PME# supported from D0 D3hot D3cold
pci 0000:00:1d.7: PME# disabled
pci 0000:00:1f.0: quirk: region 0400-047f claimed by ICH6 ACPI/GPIO/TCO
pci 0000:00:1f.0: quirk: region 0500-053f claimed by ICH6 GPIO
pci 0000:00:1f.0: ICH7 LPC Generic IO decode 1 PIO at 0680 (mask 007f)
pci 0000:00:1f.2: reg 10 io port: [0x2108-0x210f]
pci 0000:00:1f.2: reg 14 io port: [0x211c-0x211f]
pci 0000:00:1f.2: reg 18 io port: [0x2100-0x2107]
pci 0000:00:1f.2: reg 1c io port: [0x2118-0x211b]
pci 0000:00:1f.2: reg 20 io port: [0x2020-0x203f]
pci 0000:00:1f.2: reg 24 32bit mmio: [0x50325000-0x503257ff]
pci 0000:00:1f.2: PME# supported from D3hot
pci 0000:00:1f.2: PME# disabled
pci 0000:00:1f.3: reg 10 32bit mmio: [0x50326000-0x503260ff]
pci 0000:00:1f.3: reg 20 io port: [0x2000-0x201f]
pci 0000:00:1c.0: bridge 32bit mmio: [0x50400000-0x504fffff]
pci 0000:02:00.0: reg 10 io port: [0x1018-0x101f]
pci 0000:02:00.0: reg 14 io port: [0x1024-0x1027]
pci 0000:02:00.0: reg 18 io port: [0x1010-0x1017]
pci 0000:02:00.0: reg 1c io port: [0x1020-0x1023]
pci 0000:02:00.0: reg 20 io port: [0x1000-0x100f]
pci 0000:02:00.0: reg 24 32bit mmio: [0x50100000-0x501001ff]
pci 0000:02:00.0: supports D1
pci 0000:02:00.0: PME# supported from D0 D1 D3hot
pci 0000:02:00.0: PME# disabled
pci 0000:00:1c.1: bridge io port: [0x1000-0x1fff]
pci 0000:00:1c.1: bridge 32bit mmio: [0x50100000-0x501fffff]
pci 0000:00:1c.2: bridge 32bit mmio: [0x50500000-0x505fffff]
pci 0000:00:1c.3: bridge 32bit mmio: [0x50600000-0x506fffff]
pci 0000:00:1c.4: bridge 32bit mmio: [0x50700000-0x507fffff]
pci 0000:06:03.0: reg 10 32bit mmio: [0x50004000-0x500047ff]
pci 0000:06:03.0: reg 14 32bit mmio: [0x50000000-0x50003fff]
pci 0000:06:03.0: supports D1 D2
pci 0000:06:03.0: PME# supported from D0 D1 D2 D3hot
pci 0000:06:03.0: PME# disabled
pci 0000:00:1e.0: transparent bridge
pci 0000:00:1e.0: bridge 32bit mmio: [0x50000000-0x500fffff]
pci_bus 0000:00: on NUMA node 0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P32_._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX1._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX2._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX3._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX4._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 7 9 10 *11 12)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 7 9 *10 11 12)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 7 9 10 *11 12)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 7 9 10 *11 12)
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 7 *9 10 11 12)
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 7 9 *10 11 12)
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 7 *9 10 11 12)
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 7 9 10 *11 12)
SCSI subsystem initialized
libata version 3.00 loaded.
PCI: Using ACPI for IRQ routing
NetLabel: Initializing
NetLabel: domain hash size = 128
NetLabel: protocols = UNLABELED CIPSOv4
NetLabel: unlabeled traffic allowed by default
pnp: PnP ACPI init
ACPI: bus type pnp registered
pnp: PnP ACPI: found 12 devices
ACPI: ACPI bus type pnp unregistered
system 00:01: iomem range 0xf0000000-0xf7ffffff has been reserved
system 00:01: iomem range 0xfed13000-0xfed13fff has been reserved
system 00:01: iomem range 0xfed14000-0xfed17fff has been reserved
system 00:01: iomem range 0xfed18000-0xfed18fff has been reserved
system 00:01: iomem range 0xfed19000-0xfed19fff has been reserved
system 00:01: iomem range 0xfed1c000-0xfed1ffff has been reserved
system 00:01: iomem range 0xfed20000-0xfed3ffff has been reserved
system 00:01: iomem range 0xfed45000-0xfed99fff has been reserved
system 00:01: iomem range 0xc0000-0xdffff has been reserved
system 00:01: iomem range 0xe0000-0xfffff could not be reserved
system 00:06: ioport range 0x500-0x53f has been reserved
system 00:06: ioport range 0x400-0x47f has been reserved
system 00:06: ioport range 0x680-0x6ff has been reserved
pci 0000:00:1c.0: PCI bridge, secondary bus 0000:01
pci 0000:00:1c.0: IO window: disabled
pci 0000:00:1c.0: MEM window: 0x50400000-0x504fffff
pci 0000:00:1c.0: PREFETCH window: disabled
pci 0000:00:1c.1: PCI bridge, secondary bus 0000:02
pci 0000:00:1c.1: IO window: 0x1000-0x1fff
pci 0000:00:1c.1: MEM window: 0x50100000-0x501fffff
pci 0000:00:1c.1: PREFETCH window: disabled
pci 0000:00:1c.2: PCI bridge, secondary bus 0000:03
pci 0000:00:1c.2: IO window: disabled
pci 0000:00:1c.2: MEM window: 0x50500000-0x505fffff
pci 0000:00:1c.2: PREFETCH window: disabled
pci 0000:00:1c.3: PCI bridge, secondary bus 0000:04
pci 0000:00:1c.3: IO window: disabled
pci 0000:00:1c.3: MEM window: 0x50600000-0x506fffff
pci 0000:00:1c.3: PREFETCH window: disabled
pci 0000:00:1c.4: PCI bridge, secondary bus 0000:05
pci 0000:00:1c.4: IO window: disabled
pci 0000:00:1c.4: MEM window: 0x50700000-0x507fffff
pci 0000:00:1c.4: PREFETCH window: disabled
pci 0000:00:1e.0: PCI bridge, secondary bus 0000:06
pci 0000:00:1e.0: IO window: disabled
pci 0000:00:1e.0: MEM window: 0x50000000-0x500fffff
pci 0000:00:1e.0: PREFETCH window: disabled
pci 0000:00:1c.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
pci 0000:00:1c.0: setting latency timer to 64
pci 0000:00:1c.1: PCI INT B -> GSI 16 (level, low) -> IRQ 16
pci 0000:00:1c.1: setting latency timer to 64
pci 0000:00:1c.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18
pci 0000:00:1c.2: setting latency timer to 64
pci 0000:00:1c.3: PCI INT D -> GSI 19 (level, low) -> IRQ 19
pci 0000:00:1c.3: setting latency timer to 64
pci 0000:00:1c.4: PCI INT A -> GSI 17 (level, low) -> IRQ 17
pci 0000:00:1c.4: setting latency timer to 64
pci 0000:00:1e.0: setting latency timer to 64
pci_bus 0000:00: resource 0 io: [0x00-0xffff]
pci_bus 0000:00: resource 1 mem: [0x000000-0xffffffffffffffff]
pci_bus 0000:01: resource 1 mem: [0x50400000-0x504fffff]
pci_bus 0000:02: resource 0 io: [0x1000-0x1fff]
pci_bus 0000:02: resource 1 mem: [0x50100000-0x501fffff]
pci_bus 0000:03: resource 1 mem: [0x50500000-0x505fffff]
pci_bus 0000:04: resource 1 mem: [0x50600000-0x506fffff]
pci_bus 0000:05: resource 1 mem: [0x50700000-0x507fffff]
pci_bus 0000:06: resource 1 mem: [0x50000000-0x500fffff]
pci_bus 0000:06: resource 3 io: [0x00-0xffff]
pci_bus 0000:06: resource 4 mem: [0x000000-0xffffffffffffffff]
NET: Registered protocol family 2
IP route cache hash table entries: 32768 (order: 6, 262144 bytes)
TCP established hash table entries: 131072 (order: 9, 2097152 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 131072 bind 65536)
TCP reno registered
NET: Registered protocol family 1
Unpacking initramfs...
Freeing initrd memory: 2606k freed
audit: initializing netlink socket (disabled)
type=2000 audit(1245336472.149:1): initialized
VFS: Disk quotas dquot_6.5.2
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
SGI XFS with ACLs, security attributes, large block/inode numbers, no debug enabled
msgmni has been set to 1953
SELinux: Registering netfilter hooks
alg: No test for fcrypt (fcrypt-generic)
alg: No test for stdrng (krng)
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253)
io scheduler noop registered
io scheduler anticipatory registered (default)
io scheduler deadline registered
io scheduler cfq registered
pci 0000:00:02.0: Boot video device
pcieport-driver 0000:00:1c.0: irq 24 for MSI/MSI-X
pcieport-driver 0000:00:1c.0: setting latency timer to 64
pcieport-driver 0000:00:1c.1: irq 25 for MSI/MSI-X
pcieport-driver 0000:00:1c.1: setting latency timer to 64
pcieport-driver 0000:00:1c.2: irq 26 for MSI/MSI-X
pcieport-driver 0000:00:1c.2: setting latency timer to 64
pcieport-driver 0000:00:1c.3: irq 27 for MSI/MSI-X
pcieport-driver 0000:00:1c.3: setting latency timer to 64
pcieport-driver 0000:00:1c.4: irq 28 for MSI/MSI-X
pcieport-driver 0000:00:1c.4: setting latency timer to 64
input: Power Button as /class/input/input0
ACPI: Power Button [PWRF]
input: Sleep Button as /class/input/input1
ACPI: Sleep Button [SLPB]
processor ACPI_CPU:00: registered as cooling_device0
ACPI: Processor [CPU0] (supports 8 throttling states)
processor ACPI_CPU:01: registered as cooling_device1
ACPI: Processor [CPU1] (supports 8 throttling states)
Linux agpgart interface v0.103
agpgart-intel 0000:00:00.0: Intel 965G Chipset
agpgart-intel 0000:00:00.0: detected 7676K stolen memory
agpgart-intel 0000:00:00.0: AGP aperture is 256M @ 0x40000000
intelfb: Framebuffer driver for Intel(R) 830M/845G/852GM/855GM/865G/915G/915GM/945G/945GM/945GME/965G/965GM chipsets
intelfb: Version 0.9.6
intelfb 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
intelfb: 00:02.0: Intel(R) 965G, aperture size 256MB, stolen memory 7932kB
intelfb: Initial video mode is 1024x768-32@70.
Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
Platform driver 'serial8250' needs updating - please use dev_pm_ops
00:0a: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
loop: module loaded
Driver 'sd' needs updating - please use bus_type methods
ahci 0000:00:1f.2: version 3.0
ahci 0000:00:1f.2: PCI INT A -> GSI 19 (level, low) -> IRQ 19
ahci 0000:00:1f.2: irq 29 for MSI/MSI-X
ahci 0000:00:1f.2: AHCI 0001.0100 32 slots 4 ports 3 Gbps 0x33 impl SATA mode
ahci 0000:00:1f.2: flags: 64bit ncq sntf led clo pio slum part ems
ahci 0000:00:1f.2: setting latency timer to 64
scsi0 : ahci
scsi1 : ahci
scsi2 : ahci
scsi3 : ahci
scsi4 : ahci
scsi5 : ahci
ata1: SATA max UDMA/133 abar m2048@0x50325000 port 0x50325100 irq 29
ata2: SATA max UDMA/133 abar m2048@0x50325000 port 0x50325180 irq 29
ata3: DUMMY
ata4: DUMMY
ata5: SATA max UDMA/133 abar m2048@0x50325000 port 0x50325300 irq 29
ata6: SATA max UDMA/133 abar m2048@0x50325000 port 0x50325380 irq 29
e1000e: Intel(R) PRO/1000 Network Driver - 1.0.2-k2
e1000e: Copyright (c) 1999-2008 Intel Corporation.
e1000e 0000:00:19.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20
e1000e 0000:00:19.0: setting latency timer to 64
e1000e 0000:00:19.0: irq 30 for MSI/MSI-X
0000:00:19.0: eth0: (PCI Express:2.5GB/s:Width x1) 00:16:76:ce:3a:3c
0000:00:19.0: eth0: Intel(R) PRO/1000 Network Connection
0000:00:19.0: eth0: MAC: 6, PHY: 6, PBA No: ffffff-0ff
PNP: PS/2 Controller [PNP0303:PS2K,PNP0f03:PS2M] at 0x60,0x64 irq 1,12
Platform driver 'i8042' needs updating - please use dev_pm_ops
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
rtc_cmos 00:03: RTC can wake from S4
rtc_cmos 00:03: rtc core: registered rtc_cmos as rtc0
rtc0: alarms up to one month, 114 bytes nvram
i2c /dev entries driver
i801_smbus 0000:00:1f.3: PCI INT B -> GSI 21 (level, low) -> IRQ 21
coretemp coretemp.0: Using relative temperature scale!
coretemp coretemp.1: Using relative temperature scale!
cpuidle: using governor ladder
ip_tables: (C) 2000-2006 Netfilter Core Team
TCP cubic registered
input: AT Translated Set 2 keyboard as /class/input/input2
NET: Registered protocol family 17
registered taskstats version 1
ata6: SATA link down (SStatus 0 SControl 300)
rtc_cmos 00:03: setting system clock to 2009-06-18 14:47:54 UTC (1245336474)
ata5: SATA link down (SStatus 0 SControl 300)
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2: SATA link down (SStatus 0 SControl 300)
ata1.00: ATA-7: ST380211AS, 3.AAE, max UDMA/133
ata1.00: 156301488 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata1.00: configured for UDMA/133
scsi 0:0:0:0: Direct-Access ATA ST380211AS 3.AA PQ: 0 ANSI: 5
sd 0:0:0:0: [sda] 156301488 512-byte hardware sectors: (80.0 GB/74.5 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 >
sd 0:0:0:0: [sda] Attached SCSI disk
Freeing unused kernel memory: 360k freed
Write protecting the kernel read-only data: 4320k
Red Hat nash version 6.0.52 starting
Mounting proc filesystem
Mounting sysfs filesystem
Creating /dev
Creating initial device nodes
Setting up hotplug.
input: ImPS/2 Generic Wheel Mouse as /class/input/input3
Creating block device nodes.
mount: could not find filesystem '/proc/bus/usb'
Waiting for driver initialization.
Waiting for driver initialization.
Creating root device.
Mounting root filesystem.
kjournald starting. Commit interval 5 seconds
Setting up otherEXT3-fs: mounted filesystem with writeback data mode.
filesystems.
Setting up new root fs
no fstab.sys, mounting internal defaults
SELinux: 8192 avtab hash slots, 177803 rules.
SELinux: 8192 avtab hash slots, 177803 rules.
SELinux: 6 users, 12 roles, 2431 types, 118 bools, 1 sens, 1024 cats
SELinux: 73 classes, 177803 rules
SELinux: class kernel_service not defined in policy
SELinux: permission open in class sock_file not defined in policy
SELinux: permission nlmsg_tty_audit in class netlink_audit_socket not defined in policy
SELinux: the above unknown classes and permissions will be allowed
SELinux: Completing initialization.
SELinux: Setting up existing superblocks.
SELinux: initialized (dev sda2, type ext3), uses xattr
SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
SELinux: initialized (dev selinuxfs, type selinuxfs), uses genfs_contexts
SELinux: initialized (dev mqueue, type mqueue), uses transition SIDs
SELinux: initialized (dev devpts, type devpts), uses transition SIDs
SELinux: initialized (dev inotifyfs, type inotifyfs), uses genfs_contexts
SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
SELinux: initialized (dev anon_inodefs, type anon_inodefs), uses genfs_contexts
SELinux: initialized (dev pipefs, type pipefs), uses task SIDs
SELinux: initialized (dev debugfs, type debugfs), uses genfs_contexts
SELinux: initialized (dev sockfs, type sockfs), uses task SIDs
SELinux: initialized (dev proc, type proc), uses genfs_contexts
SELinux: initialized (dev bdev, type bdev), uses genfs_contexts
SELinux: initialized (dev rootfs, type rootfs), uses genfs_contexts
SELinux: initialized (dev sysfs, type sysfs), uses genfs_contexts
type=1403 audit(1245336481.989:2): policy loaded auid=4294967295 ses=4294967295
Switching to new root and running init.
unmounting old /dev
unmounting old /proc
unmounting old /sys
Welcome to Fedora
Press 'I' to enter interactive startup.
Starting udev: [ OK ]
Setting hostname andromeda.procyon.org.uk: [ OK ]
Checking filesystems
Checking all file systems.
[/sbin/fsck.ext3 (1) -- /] fsck.ext3 -a /dev/sda2
/1: clean, 330519/2621440 files, 1528859/2620603 blocks
[/sbin/fsck.ext3 (1) -- /boot] fsck.ext3 -a /dev/sda1
/boot1: clean, 79/50200 files, 72187/200780 blocks
[ OK ]
Remounting root filesystem in read-write mode: [ OK ]
Mounting local filesystems: [ OK ]
Enabling local filesystem quotas: [ OK ]
Enabling /etc/fstab swaps: [ OK ]
Entering non-interactive startup
Starting background readahead (early, fast mode): [ OK ]
FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
Bringing up loopback interface: [ OK ]
Bringing up interface eth0:
Determining IP information for eth0... done.
[ OK ]
FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
Starting restorecond: [ OK ]
Starting auditd: [ OK ]
Starting irqbalance: [ OK ]
Starting mcstransd: [ OK ]
Starting rpcbind: modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

rpcbind: cannot create socket for udp6
modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

rpcbind: cannot create socket for tcp6
[ OK ]
modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

Starting NFS statd: [ OK ]
Starting system message bus: [ OK ]
Starting lm_sensors: not configured, run sensors-detect[WARNING]
Starting sshd: modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

[ OK ]
modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

Starting ntpd: [ OK ]
modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

SysRq : Changing Loglevel
Loglevel set to 8
Now booted
Starting smartd: modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

[ OK ]

Fedora release 9 (Sulphur)
Kernel 2.6.30-cachefs on an x86_64 (/dev/ttyS0)

andromeda.procyon.org.uk login: modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

warning: `capget01' uses 32-bit capabilities (legacy support in use)
modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

Adding 65528k swap on ./swapfile01. Priority:-1 extents:141 across:498688k
Adding 65528k swap on ./swapfile01. Priority:-1 extents:203 across:829292k
Adding 65528k swap on ./swapfile01. Priority:-1 extents:151 across:811620k
Unable to find swap-space signature
Adding 32k swap on alreadyused. Priority:-1 extents:4 across:18988k
Adding 32k swap on swapfile02. Priority:-1 extents:4 across:1064k
Adding 32k swap on swapfile03. Priority:-2 extents:1 across:32k
Adding 32k swap on swapfile04. Priority:-3 extents:4 across:18976k
Adding 32k swap on swapfile05. Priority:-4 extents:2 across:44k
Adding 32k swap on swapfile06. Priority:-5 extents:1 across:32k
Adding 32k swap on swapfile07. Priority:-6 extents:2 across:60k
Adding 32k swap on swapfile08. Priority:-7 extents:2 across:32k
Adding 32k swap on swapfile09. Priority:-8 extents:1 across:32k
Adding 32k swap on swapfile10. Priority:-9 extents:2 across:36k
Adding 32k swap on swapfile11. Priority:-10 extents:1 across:32k
Adding 32k swap on swapfile12. Priority:-11 extents:2 across:32k
Adding 32k swap on swapfile13. Priority:-12 extents:1 across:32k
Adding 32k swap on swapfile14. Priority:-13 extents:1 across:32k
Adding 32k swap on swapfile15. Priority:-14 extents:1 across:32k
Adding 32k swap on swapfile16. Priority:-15 extents:2 across:32k
Adding 32k swap on swapfile17. Priority:-16 extents:1 across:32k
Adding 32k swap on swapfile18. Priority:-17 extents:2 across:44k
Adding 32k swap on swapfile19. Priority:-18 extents:2 across:1316k
Adding 32k swap on swapfile20. Priority:-19 extents:2 across:32k
Adding 32k swap on swapfile21. Priority:-20 extents:2 across:72k
Adding 32k swap on swapfile22. Priority:-21 extents:1 across:32k
Adding 32k swap on swapfile23. Priority:-22 extents:1 across:32k
Adding 32k swap on swapfile24. Priority:-23 extents:3 across:44k
Adding 32k swap on swapfile25. Priority:-24 extents:1 across:32k
Adding 32k swap on swapfile26. Priority:-25 extents:1 across:32k
Adding 32k swap on swapfile27. Priority:-26 extents:1 across:32k
Adding 32k swap on swapfile28. Priority:-27 extents:2 across:32k
Adding 32k swap on swapfile29. Priority:-28 extents:1 across:32k
Adding 32k swap on swapfile30. Priority:-29 extents:1 across:32k
Adding 32k swap on swapfile31. Priority:-30 extents:1 across:32k
Adding 32k swap on firstswapfile. Priority:-31 extents:2 across:32k
Adding 32k swap on secondswapfile. Priority:-32 extents:2 across:44k
warning: process `sysctl01' used the deprecated sysctl system call with 1.1.
warning: process `sysctl01' used the deprecated sysctl system call with 1.2.
warning: process `sysctl03' used the deprecated sysctl system call with 1.1.
warning: process `sysctl03' used the deprecated sysctl system call with 1.1.
warning: process `sysctl04' used the deprecated sysctl system call with
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
Installing knfsd (copyright (C) 1996 [email protected]).
NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
NFSD: starting 90-second grace period
modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory

msgctl11 invoked oom-killer: gfp_mask=0xd0, order=1, oom_adj=0
msgctl11 cpuset=/ mems_allowed=0
Pid: 12411, comm: msgctl11 Not tainted 2.6.30-cachefs #107
Call Trace:
[<ffffffff81071612>] ? oom_kill_process.clone.0+0xa9/0x245
[<ffffffff810736e7>] ? drain_local_pages+0x0/0x13
[<ffffffff810718d9>] ? __out_of_memory+0x12b/0x142
[<ffffffff8107195a>] ? out_of_memory+0x6a/0x94
[<ffffffff81074002>] ? __alloc_pages_nodemask+0x422/0x50b
[<ffffffff81031112>] ? copy_process+0x95/0x1158
[<ffffffff81074155>] ? __get_free_pages+0x12/0x50
[<ffffffff81031135>] ? copy_process+0xb8/0x1158
[<ffffffff81081346>] ? handle_mm_fault+0x2d5/0x645
[<ffffffff81032314>] ? do_fork+0x13f/0x2ba
[<ffffffff81022a0b>] ? do_page_fault+0x1f1/0x206
[<ffffffff8100b0d3>] ? stub_clone+0x13/0x20
[<ffffffff8100ad6b>] ? system_call_fastpath+0x16/0x1b
Mem-Info:
DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
CPU 1: hi: 0, btch: 1 usd: 0
DMA32 per-cpu:
CPU 0: hi: 186, btch: 31 usd: 57
CPU 1: hi: 186, btch: 31 usd: 0
Active_anon:70104 active_file:1 inactive_anon:6557
inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
free:4062 slab:41969 mapped:541 pagetables:59663 bounce:0
DMA free:3920kB min:60kB low:72kB high:88kB active_anon:2268kB inactive_anon:428kB active_file:0kB inactive_file:0kB unevictable:0kB present:15364kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 968 968 968
DMA32 free:12328kB min:3948kB low:4932kB high:5920kB active_anon:278148kB inactive_anon:25800kB active_file:4kB inactive_file:0kB unevictable:0kB present:992032kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 8*4kB 0*8kB 1*16kB 1*32kB 2*64kB 1*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3920kB
DMA32: 2474*4kB 56*8kB 8*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 12328kB
1660 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap = 0kB
Total swap = 0kB
255744 pages RAM
5588 pages reserved
255749 pages shared
215785 pages non-shared
Out of memory: kill process 6838 (msgctl11) score 152029 or a child
Killed process 8850 (msgctl11)
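
The per-zone free-list lines in the Mem-Info dump above can be sanity-checked by summing their terms. The following is a minimal sketch, not part of the original report; the helper names `buddy_free_kb` and `reported_kb` are made up for illustration, and the two strings are copied from the dump.

```python
import re

def buddy_free_kb(line):
    """Sum the count*size terms of a per-zone buddy free-list line, in kB."""
    terms = re.findall(r"(\d+)\*(\d+)kB", line)
    return sum(int(count) * int(size_kb) for count, size_kb in terms)

def reported_kb(line):
    """Return the total the kernel printed after the '=' sign, in kB."""
    return int(re.search(r"=\s*(\d+)kB", line).group(1))

# Copied verbatim from the OOM report above.
DMA = ("DMA: 8*4kB 0*8kB 1*16kB 1*32kB 2*64kB 1*128kB 0*256kB "
       "1*512kB 1*1024kB 1*2048kB 0*4096kB = 3920kB")
DMA32 = ("DMA32: 2474*4kB 56*8kB 8*16kB 0*32kB 1*64kB 0*128kB 1*256kB "
         "1*512kB 1*1024kB 0*2048kB 0*4096kB = 12328kB")

for zone_line in (DMA, DMA32):
    name = zone_line.split(":")[0]
    print(name, buddy_free_kb(zone_line), "kB computed,",
          reported_kb(zone_line), "kB reported")
```

Both computed sums match the totals the kernel printed. Most of the free DMA32 memory sits in single 4kB pages, while the failing request was order=1, i.e. two contiguous pages.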

2009-06-18 16:58:18

by Andrew Morton

Subject: Re: [PATCH 0/3] make mapped executable pages the first class citizen

On Thu, 18 Jun 2009 17:18:58 +0100 David Howells <[email protected]> wrote:

>
> Okay, after dropping all my devel patches, I got the OOM to happen again;
> fresh trace attached. I was running LTP and an NFSD, and I was spamming the
> NFSD continuously from another machine (mount;tar;umount;repeat).
>
>
> ...
>
> Mem-Info:
> DMA per-cpu:
> CPU 0: hi: 0, btch: 1 usd: 0
> CPU 1: hi: 0, btch: 1 usd: 0
> DMA32 per-cpu:
> CPU 0: hi: 186, btch: 31 usd: 57
> CPU 1: hi: 186, btch: 31 usd: 0
> Active_anon:70104 active_file:1 inactive_anon:6557
> inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
> free:4062 slab:41969 mapped:541 pagetables:59663 bounce:0

77000 pages in anonymous memory, no swap online.

42000 pages in slab. Maybe this is a leak?

60000 pagetable pages. Seems rather a lot?

179000 pages accounted for above

> DMA free:3920kB min:60kB low:72kB high:88kB active_anon:2268kB inactive_anon:428kB active_file:0kB inactive_file:0kB unevictable:0kB present:15364kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 968 968 968
> DMA32 free:12328kB min:3948kB low:4932kB high:5920kB active_anon:278148kB inactive_anon:25800kB active_file:4kB inactive_file:0kB unevictable:0kB present:992032kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
> DMA: 8*4kB 0*8kB 1*16kB 1*32kB 2*64kB 1*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3920kB
> DMA32: 2474*4kB 56*8kB 8*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 12328kB

present memory: 15364 + 992032 = 1007396kB, i.e. about 250000 pages. It's a
1GB box, yes?

> 1660 total pagecache pages
> 0 pages in swap cache
> Swap cache stats: add 0, delete 0, find 0/0
> Free swap = 0kB
> Total swap = 0kB
> 255744 pages RAM
> 5588 pages reserved
> 255749 pages shared
> 215785 pages non-shared
> Out of memory: kill process 6838 (msgctl11) score 152029 or a child
> Killed process 8850 (msgctl11)

afaict, 70000 pages are unaccounted for (leaked?)
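
The page accounting in this message can be redone with the exact counters from the dump. This is only a sketch of one way to arrive near the rounded figures: it assumes 4 kB pages and assumes that free pages are subtracted as well, neither of which is stated in the message.

```python
PAGE_KB = 4  # assumption: 4 kB pages, as on x86_64

# Page counts copied from the quoted Mem-Info dump.
active_anon, inactive_anon = 70104, 6557
slab, pagetables, free = 41969, 59663, 4062

# "present" sizes of the DMA and DMA32 zones, in kB.
present_kb = 15364 + 992032
present_pages = present_kb // PAGE_KB

anon = active_anon + inactive_anon      # rounded to 77000 in the message
accounted = anon + slab + pagetables    # rounded to 179000 in the message

# Assumption: free pages are subtracted too when estimating the gap.
unaccounted = present_pages - accounted - free

print("present pages:", present_pages)
print("anon + slab + pagetables:", accounted)
print("unaccounted:", unaccounted)
```

With these inputs the gap comes out just under 70000 pages, in line with the estimate above. Also subtracting the 1660 pagecache pages would lower it slightly.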

2009-06-19 05:24:47

by Fengguang Wu

Subject: Re: [PATCH 0/3] make mapped executable pages the first class citizen

On Thu, Jun 18, 2009 at 10:46:52PM +0800, David Howells wrote:
>
> Hmmm.... It's possible that this makes my test box implode horribly when
> running LTP.
>
> I'm going to bisect it to see if this is actually due to your patches.

It's not likely to be my patches; they can cause a regression only when
there are many VM_EXEC mapped pages.

You could try reverting this patch instead; it is a much bigger change for
normal systems.

commit 56e49d218890f49b0057710a4b6fef31f5ffbfec
Author: Rik van Riel <[email protected]>
Date: Tue Jun 16 15:32:28 2009 -0700

vmscan: evict use-once pages first

When the file LRU lists are dominated by streaming IO pages, evict those
pages first, before considering evicting other pages.

This should be safe from deadlocks or performance problems
because only three things can happen to an inactive file page:

1) referenced twice and promoted to the active list
2) evicted by the pageout code
3) under IO, after which it will get evicted or promoted

The pages freed in this way can either be reused for streaming IO, or
allocated for something else. If the pages are used for streaming IO,
this pageout pattern continues. Otherwise, we will fall back to the
normal pageout pattern.
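
The behaviour described in the commit message can be illustrated with a toy two-list cache. This is a hypothetical sketch, not the code from the commit: the class `TwoListLRU`, its `touch` method and the `capacity` parameter are invented for illustration, and the "under IO" case is not modelled.

```python
from collections import OrderedDict

class TwoListLRU:
    """Toy model of inactive/active file lists; not kernel code."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.inactive = OrderedDict()  # pages seen once, oldest first
        self.active = OrderedDict()    # pages seen twice or more, oldest first

    def touch(self, page):
        if page in self.active:
            self.active.move_to_end(page)
            return
        if page in self.inactive:
            # Second reference: promote to the active list.
            del self.inactive[page]
            self.active[page] = True
            return
        # First reference: the page enters the inactive list.
        self.inactive[page] = True
        self._reclaim()

    def _reclaim(self):
        while len(self.inactive) + len(self.active) > self.capacity:
            if self.inactive:
                # Evict use-once pages before touching the active list.
                self.inactive.popitem(last=False)
            else:
                self.active.popitem(last=False)

cache = TwoListLRU(capacity=4)
cache.touch("lib")
cache.touch("lib")              # referenced twice, so promoted
for i in range(10):             # streaming IO: each page is used once
    cache.touch(f"stream{i}")

print("active:", list(cache.active))
print("inactive:", list(cache.inactive))
```

In this model the streamed pages only ever displace each other, and the twice-referenced page stays on the active list.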


Thanks,
Fengguang

> Note that I don't have any swap space. This after a fresh reboot:
>
> [root@andromeda ~]# cat /proc/meminfo
> MemTotal: 1000624 kB
> MemFree: 797328 kB
> Buffers: 13272 kB
> Cached: 121744 kB
> SwapCached: 0 kB
> Active: 36240 kB
> Inactive: 115856 kB
> Active(anon): 17448 kB
> Inactive(anon): 0 kB
> Active(file): 18792 kB
> Inactive(file): 115856 kB
> Unevictable: 0 kB
> Mlocked: 0 kB
> SwapTotal: 0 kB
> SwapFree: 0 kB
> Dirty: 28 kB
> Writeback: 0 kB
> AnonPages: 17280 kB
> Mapped: 5376 kB
> Slab: 42984 kB
> SReclaimable: 6956 kB
> SUnreclaim: 36028 kB
> PageTables: 1304 kB
> NFS_Unstable: 0 kB
> Bounce: 0 kB
> WritebackTmp: 0 kB
> CommitLimit: 500312 kB
> Committed_AS: 52596 kB
> VmallocTotal: 34359738367 kB
> VmallocUsed: 190044 kB
> VmallocChunk: 34359546363 kB
> DirectMap4k: 13312 kB
> DirectMap2M: 1009664 kB
>
> David
> ---
> Initializing cgroup subsys cpuset
> Linux version 2.6.30-cachefs ([email protected]) (gcc version 4.4.0 20090506 (Red Hat 4.4.0-4) (GCC) ) #106 SMP Wed Jun 17 22:10:31 BST 2009
> Command line: initrd=andromeda-initrd console=tty0 console=ttyS0,115200 ro root=/dev/sda2 enforcing=1 debug BOOT_IMAGE=andromeda-vmlinuz
> KERNEL supported cpus:
> Intel GenuineIntel
> AMD AuthenticAMD
> Centaur CentaurHauls
> BIOS-provided physical RAM map:
> BIOS-e820: 0000000000000000 - 000000000009ec00 (usable)
> BIOS-e820: 000000000009ec00 - 00000000000a0000 (reserved)
> BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
> BIOS-e820: 0000000000100000 - 000000003e59a000 (usable)
> BIOS-e820: 000000003e59a000 - 000000003e5a6000 (reserved)
> BIOS-e820: 000000003e5a6000 - 000000003e644000 (usable)
> BIOS-e820: 000000003e644000 - 000000003e6a9000 (ACPI NVS)
> BIOS-e820: 000000003e6a9000 - 000000003e6ac000 (ACPI data)
> BIOS-e820: 000000003e6ac000 - 000000003e6f2000 (ACPI NVS)
> BIOS-e820: 000000003e6f2000 - 000000003e6ff000 (ACPI data)
> BIOS-e820: 000000003e6ff000 - 000000003e700000 (usable)
> BIOS-e820: 000000003e700000 - 000000003f000000 (reserved)
> BIOS-e820: 00000000fff00000 - 0000000100000000 (reserved)
> DMI 2.4 present.
> last_pfn = 0x3e700 max_arch_pfn = 0x400000000
> MTRR default type: uncachable
> MTRR fixed ranges enabled:
> 00000-9FFFF write-back
> A0000-FFFFF uncachable
> MTRR variable ranges enabled:
> 0 base 000000000 mask FC0000000 write-back
> 1 base 03F000000 mask FFF000000 uncachable
> 2 base 03E800000 mask FFF800000 uncachable
> 3 base 03E700000 mask FFFF00000 uncachable
> 4 disabled
> 5 disabled
> 6 disabled
> 7 disabled
> x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
> initial memory mapped : 0 - 20000000
> init_memory_mapping: 0000000000000000-000000003e700000
> 0000000000 - 003e600000 page 2M
> 003e600000 - 003e700000 page 4k
> kernel direct mapping tables up to 3e700000 @ 8000-b000
> RAMDISK: 3e2ee000 - 3e57991c
> ACPI: RSDP 00000000000fe020 00014 (v00 INTEL )
> ACPI: RSDT 000000003e6fd038 0004C (v01 INTEL DG965RY 00000330 01000013)
> ACPI: FACP 000000003e6fc000 00074 (v01 INTEL DG965RY 00000330 MSFT 01000013)
> ACPI: DSDT 000000003e6f8000 03EDA (v01 INTEL DG965RY 00000330 MSFT 01000013)
> ACPI: FACS 000000003e6ac000 00040
> ACPI: APIC 000000003e6f7000 00078 (v01 INTEL DG965RY 00000330 MSFT 01000013)
> ACPI: WDDT 000000003e6f6000 00040 (v01 INTEL DG965RY 00000330 MSFT 01000013)
> ACPI: MCFG 000000003e6f5000 0003C (v01 INTEL DG965RY 00000330 MSFT 01000013)
> ACPI: ASF! 000000003e6f4000 000A6 (v32 INTEL DG965RY 00000330 MSFT 01000013)
> ACPI: SSDT 000000003e6f3000 001BC (v01 INTEL CpuPm 00000330 MSFT 01000013)
> ACPI: SSDT 000000003e6f2000 00175 (v01 INTEL Cpu0Ist 00000330 MSFT 01000013)
> ACPI: SSDT 000000003e6ab000 00175 (v01 INTEL Cpu1Ist 00000330 MSFT 01000013)
> ACPI: SSDT 000000003e6aa000 00175 (v01 INTEL Cpu2Ist 00000330 MSFT 01000013)
> ACPI: SSDT 000000003e6a9000 00175 (v01 INTEL Cpu3Ist 00000330 MSFT 01000013)
> ACPI: Local APIC address 0xfee00000
> (7 early reservations) ==> bootmem [0000000000 - 003e700000]
> #0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000]
> #1 [0000006000 - 0000008000] TRAMPOLINE ==> [0000006000 - 0000008000]
> #2 [0001000000 - 0001535d90] TEXT DATA BSS ==> [0001000000 - 0001535d90]
> #3 [003e2ee000 - 003e57991c] RAMDISK ==> [003e2ee000 - 003e57991c]
> #4 [000009e800 - 0000100000] BIOS reserved ==> [000009e800 - 0000100000]
> #5 [0001536000 - 0001536199] BRK ==> [0001536000 - 0001536199]
> #6 [0000008000 - 0000009000] PGTABLE ==> [0000008000 - 0000009000]
> found SMP MP-table at [ffff8800000fe200] fe200
> [ffffea0000000000-ffffea0000dfffff] PMD -> [ffff880001a00000-ffff8800027fffff] on node 0
> Zone PFN ranges:
> DMA 0x00000000 -> 0x00001000
> DMA32 0x00001000 -> 0x00100000
> Normal 0x00100000 -> 0x00100000
> Movable zone start PFN for each node
> early_node_map[4] active PFN ranges
> 0: 0x00000000 -> 0x0000009e
> 0: 0x00000100 -> 0x0003e59a
> 0: 0x0003e5a6 -> 0x0003e644
> 0: 0x0003e6ff -> 0x0003e700
> On node 0 totalpages: 255447
> DMA zone: 56 pages used for memmap
> DMA zone: 101 pages reserved
> DMA zone: 3841 pages, LIFO batch:0
> DMA32 zone: 3441 pages used for memmap
> DMA32 zone: 248008 pages, LIFO batch:31
> ACPI: PM-Timer IO Port: 0x408
> ACPI: Local APIC address 0xfee00000
> ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
> ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
> ACPI: LAPIC (acpi_id[0x03] lapic_id[0x82] disabled)
> ACPI: LAPIC (acpi_id[0x04] lapic_id[0x83] disabled)
> ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
> ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
> IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
> ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
> ACPI: IRQ0 used by override.
> ACPI: IRQ2 used by override.
> ACPI: IRQ9 used by override.
> Using ACPI (MADT) for SMP configuration information
> 4 Processors exceeds NR_CPUS limit of 2
> SMP: Allowing 2 CPUs, 0 hotplug CPUs
> nr_irqs_gsi: 24
> PM: Registered nosave memory: 000000000009e000 - 000000000009f000
> PM: Registered nosave memory: 000000000009f000 - 00000000000a0000
> PM: Registered nosave memory: 00000000000a0000 - 00000000000e0000
> PM: Registered nosave memory: 00000000000e0000 - 0000000000100000
> PM: Registered nosave memory: 000000003e59a000 - 000000003e5a6000
> PM: Registered nosave memory: 000000003e644000 - 000000003e6a9000
> PM: Registered nosave memory: 000000003e6a9000 - 000000003e6ac000
> PM: Registered nosave memory: 000000003e6ac000 - 000000003e6f2000
> PM: Registered nosave memory: 000000003e6f2000 - 000000003e6ff000
> Allocating PCI resources starting at 3f000000 (gap: 3f000000:c0f00000)
> NR_CPUS:2 nr_cpumask_bits:2 nr_cpu_ids:2 nr_node_ids:1
> PERCPU: Embedded 24 pages at ffff880001541000, static data 67296 bytes
> Built 1 zonelists in Zone order, mobility grouping on. Total pages: 251849
> Kernel command line: initrd=andromeda-initrd console=tty0 console=ttyS0,115200 ro root=/dev/sda2 enforcing=1 debug BOOT_IMAGE=andromeda-vmlinuz
> PID hash table entries: 4096 (order: 12, 32768 bytes)
> Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
> Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
> Initializing CPU#0
> Checking aperture...
> No AGP bridge found
> Memory: 996952k/1022976k available (2953k kernel code, 1188k absent, 24132k reserved, 1678k data, 360k init)
> NR_IRQS:320
> Fast TSC calibration using PIT
> Detected 1864.978 MHz processor.
> Console: colour VGA+ 80x25
> console [tty0] enabled
> console [ttyS0] enabled
> Calibrating delay loop (skipped), value calculated using timer frequency.. 3729.95 BogoMIPS (lpj=7459912)
> Security Framework initialized
> SELinux: Initializing.
> SELinux: Starting in enforcing mode
> Mount-cache hash table entries: 256
> Initializing cgroup subsys debug
> Initializing cgroup subsys ns
> Initializing cgroup subsys devices
> CPU: L1 I cache: 32K, L1 D cache: 32K
> CPU: L2 cache: 2048K
> CPU: Physical Processor ID: 0
> CPU: Processor Core ID: 0
> mce: CPU supports 6 MCE banks
> CPU0: Thermal monitoring enabled (TM2)
> using mwait in idle threads.
> ACPI: Core revision 20090521
> Setting APIC routing to flat
> ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
> CPU0: Intel(R) Core(TM)2 CPU 6300 @ 1.86GHz stepping 06
> Booting processor 1 APIC 0x1 ip 0x6000
> Initializing CPU#1
> Calibrating delay using timer specific routine.. 3525.06 BogoMIPS (lpj=7050122)
> CPU: L1 I cache: 32K, L1 D cache: 32K
> CPU: L2 cache: 2048K
> CPU: Physical Processor ID: 0
> CPU: Processor Core ID: 1
> mce: CPU supports 6 MCE banks
> CPU1: Thermal monitoring enabled (TM2)
> x86 PAT enabled: cpu 1, old 0x7040600070406, new 0x7010600070106
> CPU1: Intel(R) Core(TM)2 CPU 6300 @ 1.86GHz stepping 06
> checking TSC synchronization [CPU#0 -> CPU#1]: passed.
> Brought up 2 CPUs
> Total of 2 processors activated (7255.01 BogoMIPS).
> NET: Registered protocol family 16
> ACPI: bus type pci registered
> PCI: MCFG configuration 0: base f0000000 segment 0 buses 0 - 127
> PCI: Not using MMCONFIG.
> PCI: Using configuration type 1 for base access
> bio: create slab <bio-0> at 0
> ACPI: EC: Look up EC in DSDT
> ACPI: Interpreter enabled
> ACPI: (supports S0 S3 S4 S5)
> ACPI: Using IOAPIC for interrupt routing
> PCI: MCFG configuration 0: base f0000000 segment 0 buses 0 - 127
> PCI: MCFG area at f0000000 reserved in ACPI motherboard resources
> PCI: Using MMCONFIG at f0000000 - f7ffffff
> ACPI: No dock devices found.
> ACPI: PCI Root Bridge [PCI0] (0000:00)
> pci 0000:00:02.0: reg 10 32bit mmio: [0x50200000-0x502fffff]
> pci 0000:00:02.0: reg 18 64bit mmio: [0x40000000-0x4fffffff]
> pci 0000:00:02.0: reg 20 io port: [0x2110-0x2117]
> pci 0000:00:03.0: reg 10 64bit mmio: [0x50326100-0x5032610f]
> pci 0000:00:03.0: PME# supported from D0 D3hot D3cold
> pci 0000:00:03.0: PME# disabled
> pci 0000:00:19.0: reg 10 32bit mmio: [0x50300000-0x5031ffff]
> pci 0000:00:19.0: reg 14 32bit mmio: [0x50324000-0x50324fff]
> pci 0000:00:19.0: reg 18 io port: [0x20e0-0x20ff]
> pci 0000:00:19.0: PME# supported from D0 D3hot D3cold
> pci 0000:00:19.0: PME# disabled
> pci 0000:00:1a.0: reg 20 io port: [0x20c0-0x20df]
> pci 0000:00:1a.1: reg 20 io port: [0x20a0-0x20bf]
> pci 0000:00:1a.7: reg 10 32bit mmio: [0x50325c00-0x50325fff]
> pci 0000:00:1a.7: PME# supported from D0 D3hot D3cold
> pci 0000:00:1a.7: PME# disabled
> pci 0000:00:1b.0: reg 10 64bit mmio: [0x50320000-0x50323fff]
> pci 0000:00:1b.0: PME# supported from D0 D3hot D3cold
> pci 0000:00:1b.0: PME# disabled
> pci 0000:00:1c.0: PME# supported from D0 D3hot D3cold
> pci 0000:00:1c.0: PME# disabled
> pci 0000:00:1c.1: PME# supported from D0 D3hot D3cold
> pci 0000:00:1c.1: PME# disabled
> pci 0000:00:1c.2: PME# supported from D0 D3hot D3cold
> pci 0000:00:1c.2: PME# disabled
> pci 0000:00:1c.3: PME# supported from D0 D3hot D3cold
> pci 0000:00:1c.3: PME# disabled
> pci 0000:00:1c.4: PME# supported from D0 D3hot D3cold
> pci 0000:00:1c.4: PME# disabled
> pci 0000:00:1d.0: reg 20 io port: [0x2080-0x209f]
> pci 0000:00:1d.1: reg 20 io port: [0x2060-0x207f]
> pci 0000:00:1d.2: reg 20 io port: [0x2040-0x205f]
> pci 0000:00:1d.7: reg 10 32bit mmio: [0x50325800-0x50325bff]
> pci 0000:00:1d.7: PME# supported from D0 D3hot D3cold
> pci 0000:00:1d.7: PME# disabled
> pci 0000:00:1f.0: quirk: region 0400-047f claimed by ICH6 ACPI/GPIO/TCO
> pci 0000:00:1f.0: quirk: region 0500-053f claimed by ICH6 GPIO
> pci 0000:00:1f.0: ICH7 LPC Generic IO decode 1 PIO at 0680 (mask 007f)
> pci 0000:00:1f.2: reg 10 io port: [0x2108-0x210f]
> pci 0000:00:1f.2: reg 14 io port: [0x211c-0x211f]
> pci 0000:00:1f.2: reg 18 io port: [0x2100-0x2107]
> pci 0000:00:1f.2: reg 1c io port: [0x2118-0x211b]
> pci 0000:00:1f.2: reg 20 io port: [0x2020-0x203f]
> pci 0000:00:1f.2: reg 24 32bit mmio: [0x50325000-0x503257ff]
> pci 0000:00:1f.2: PME# supported from D3hot
> pci 0000:00:1f.2: PME# disabled
> pci 0000:00:1f.3: reg 10 32bit mmio: [0x50326000-0x503260ff]
> pci 0000:00:1f.3: reg 20 io port: [0x2000-0x201f]
> pci 0000:00:1c.0: bridge 32bit mmio: [0x50400000-0x504fffff]
> pci 0000:02:00.0: reg 10 io port: [0x1018-0x101f]
> pci 0000:02:00.0: reg 14 io port: [0x1024-0x1027]
> pci 0000:02:00.0: reg 18 io port: [0x1010-0x1017]
> pci 0000:02:00.0: reg 1c io port: [0x1020-0x1023]
> pci 0000:02:00.0: reg 20 io port: [0x1000-0x100f]
> pci 0000:02:00.0: reg 24 32bit mmio: [0x50100000-0x501001ff]
> pci 0000:02:00.0: supports D1
> pci 0000:02:00.0: PME# supported from D0 D1 D3hot
> pci 0000:02:00.0: PME# disabled
> pci 0000:00:1c.1: bridge io port: [0x1000-0x1fff]
> pci 0000:00:1c.1: bridge 32bit mmio: [0x50100000-0x501fffff]
> pci 0000:00:1c.2: bridge 32bit mmio: [0x50500000-0x505fffff]
> pci 0000:00:1c.3: bridge 32bit mmio: [0x50600000-0x506fffff]
> pci 0000:00:1c.4: bridge 32bit mmio: [0x50700000-0x507fffff]
> pci 0000:06:03.0: reg 10 32bit mmio: [0x50004000-0x500047ff]
> pci 0000:06:03.0: reg 14 32bit mmio: [0x50000000-0x50003fff]
> pci 0000:06:03.0: supports D1 D2
> pci 0000:06:03.0: PME# supported from D0 D1 D2 D3hot
> pci 0000:06:03.0: PME# disabled
> pci 0000:00:1e.0: transparent bridge
> pci 0000:00:1e.0: bridge 32bit mmio: [0x50000000-0x500fffff]
> pci_bus 0000:00: on NUMA node 0
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P32_._PRT]
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX0._PRT]
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX1._PRT]
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX2._PRT]
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX3._PRT]
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX4._PRT]
> ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 7 9 10 *11 12)
> ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 7 9 *10 11 12)
> ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 7 9 10 *11 12)
> ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 7 9 10 *11 12)
> ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 7 *9 10 11 12)
> ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 7 9 *10 11 12)
> ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 7 *9 10 11 12)
> ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 7 9 10 *11 12)
> SCSI subsystem initialized
> libata version 3.00 loaded.
> PCI: Using ACPI for IRQ routing
> NetLabel: Initializing
> NetLabel: domain hash size = 128
> NetLabel: protocols = UNLABELED CIPSOv4
> NetLabel: unlabeled traffic allowed by default
> pnp: PnP ACPI init
> ACPI: bus type pnp registered
> pnp: PnP ACPI: found 12 devices
> ACPI: ACPI bus type pnp unregistered
> system 00:01: iomem range 0xf0000000-0xf7ffffff has been reserved
> system 00:01: iomem range 0xfed13000-0xfed13fff has been reserved
> system 00:01: iomem range 0xfed14000-0xfed17fff has been reserved
> system 00:01: iomem range 0xfed18000-0xfed18fff has been reserved
> system 00:01: iomem range 0xfed19000-0xfed19fff has been reserved
> system 00:01: iomem range 0xfed1c000-0xfed1ffff has been reserved
> system 00:01: iomem range 0xfed20000-0xfed3ffff has been reserved
> system 00:01: iomem range 0xfed45000-0xfed99fff has been reserved
> system 00:01: iomem range 0xc0000-0xdffff has been reserved
> system 00:01: iomem range 0xe0000-0xfffff could not be reserved
> system 00:06: ioport range 0x500-0x53f has been reserved
> system 00:06: ioport range 0x400-0x47f has been reserved
> system 00:06: ioport range 0x680-0x6ff has been reserved
> pci 0000:00:1c.0: PCI bridge, secondary bus 0000:01
> pci 0000:00:1c.0: IO window: disabled
> pci 0000:00:1c.0: MEM window: 0x50400000-0x504fffff
> pci 0000:00:1c.0: PREFETCH window: disabled
> pci 0000:00:1c.1: PCI bridge, secondary bus 0000:02
> pci 0000:00:1c.1: IO window: 0x1000-0x1fff
> pci 0000:00:1c.1: MEM window: 0x50100000-0x501fffff
> pci 0000:00:1c.1: PREFETCH window: disabled
> pci 0000:00:1c.2: PCI bridge, secondary bus 0000:03
> pci 0000:00:1c.2: IO window: disabled
> pci 0000:00:1c.2: MEM window: 0x50500000-0x505fffff
> pci 0000:00:1c.2: PREFETCH window: disabled
> pci 0000:00:1c.3: PCI bridge, secondary bus 0000:04
> pci 0000:00:1c.3: IO window: disabled
> pci 0000:00:1c.3: MEM window: 0x50600000-0x506fffff
> pci 0000:00:1c.3: PREFETCH window: disabled
> pci 0000:00:1c.4: PCI bridge, secondary bus 0000:05
> pci 0000:00:1c.4: IO window: disabled
> pci 0000:00:1c.4: MEM window: 0x50700000-0x507fffff
> pci 0000:00:1c.4: PREFETCH window: disabled
> pci 0000:00:1e.0: PCI bridge, secondary bus 0000:06
> pci 0000:00:1e.0: IO window: disabled
> pci 0000:00:1e.0: MEM window: 0x50000000-0x500fffff
> pci 0000:00:1e.0: PREFETCH window: disabled
> pci 0000:00:1c.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
> pci 0000:00:1c.0: setting latency timer to 64
> pci 0000:00:1c.1: PCI INT B -> GSI 16 (level, low) -> IRQ 16
> pci 0000:00:1c.1: setting latency timer to 64
> pci 0000:00:1c.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18
> pci 0000:00:1c.2: setting latency timer to 64
> pci 0000:00:1c.3: PCI INT D -> GSI 19 (level, low) -> IRQ 19
> pci 0000:00:1c.3: setting latency timer to 64
> pci 0000:00:1c.4: PCI INT A -> GSI 17 (level, low) -> IRQ 17
> pci 0000:00:1c.4: setting latency timer to 64
> pci 0000:00:1e.0: setting latency timer to 64
> pci_bus 0000:00: resource 0 io: [0x00-0xffff]
> pci_bus 0000:00: resource 1 mem: [0x000000-0xffffffffffffffff]
> pci_bus 0000:01: resource 1 mem: [0x50400000-0x504fffff]
> pci_bus 0000:02: resource 0 io: [0x1000-0x1fff]
> pci_bus 0000:02: resource 1 mem: [0x50100000-0x501fffff]
> pci_bus 0000:03: resource 1 mem: [0x50500000-0x505fffff]
> pci_bus 0000:04: resource 1 mem: [0x50600000-0x506fffff]
> pci_bus 0000:05: resource 1 mem: [0x50700000-0x507fffff]
> pci_bus 0000:06: resource 1 mem: [0x50000000-0x500fffff]
> pci_bus 0000:06: resource 3 io: [0x00-0xffff]
> pci_bus 0000:06: resource 4 mem: [0x000000-0xffffffffffffffff]
> NET: Registered protocol family 2
> IP route cache hash table entries: 32768 (order: 6, 262144 bytes)
> TCP established hash table entries: 131072 (order: 9, 2097152 bytes)
> TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
> TCP: Hash tables configured (established 131072 bind 65536)
> TCP reno registered
> NET: Registered protocol family 1
> Unpacking initramfs...
> Freeing initrd memory: 2606k freed
> audit: initializing netlink socket (disabled)
> type=2000 audit(1245320564.157:1): initialized
> VFS: Disk quotas dquot_6.5.2
> Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
> SGI XFS with ACLs, security attributes, large block/inode numbers, no debug enabled
> msgmni has been set to 1953
> SELinux: Registering netfilter hooks
> alg: No test for fcrypt (fcrypt-generic)
> alg: No test for stdrng (krng)
> Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253)
> io scheduler noop registered
> io scheduler anticipatory registered (default)
> io scheduler deadline registered
> io scheduler cfq registered
> pci 0000:00:02.0: Boot video device
> pcieport-driver 0000:00:1c.0: irq 24 for MSI/MSI-X
> pcieport-driver 0000:00:1c.0: setting latency timer to 64
> pcieport-driver 0000:00:1c.1: irq 25 for MSI/MSI-X
> pcieport-driver 0000:00:1c.1: setting latency timer to 64
> pcieport-driver 0000:00:1c.2: irq 26 for MSI/MSI-X
> pcieport-driver 0000:00:1c.2: setting latency timer to 64
> pcieport-driver 0000:00:1c.3: irq 27 for MSI/MSI-X
> pcieport-driver 0000:00:1c.3: setting latency timer to 64
> pcieport-driver 0000:00:1c.4: irq 28 for MSI/MSI-X
> pcieport-driver 0000:00:1c.4: setting latency timer to 64
> input: Power Button as /class/input/input0
> ACPI: Power Button [PWRF]
> input: Sleep Button as /class/input/input1
> ACPI: Sleep Button [SLPB]
> processor ACPI_CPU:00: registered as cooling_device0
> ACPI: Processor [CPU0] (supports 8 throttling states)
> processor ACPI_CPU:01: registered as cooling_device1
> ACPI: Processor [CPU1] (supports 8 throttling states)
> Linux agpgart interface v0.103
> agpgart-intel 0000:00:00.0: Intel 965G Chipset
> agpgart-intel 0000:00:00.0: detected 7676K stolen memory
> agpgart-intel 0000:00:00.0: AGP aperture is 256M @ 0x40000000
> intelfb: Framebuffer driver for Intel(R) 830M/845G/852GM/855GM/865G/915G/915GM/945G/945GM/945GME/965G/965GM chipsets
> intelfb: Version 0.9.6
> intelfb 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
> intelfb: 00:02.0: Intel(R) 965G, aperture size 256MB, stolen memory 7932kB
> intelfb: Initial video mode is 1024x768-32@70.
> Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
> Platform driver 'serial8250' needs updating - please use dev_pm_ops
> 00:0a: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
> loop: module loaded
> Driver 'sd' needs updating - please use bus_type methods
> ahci 0000:00:1f.2: version 3.0
> ahci 0000:00:1f.2: PCI INT A -> GSI 19 (level, low) -> IRQ 19
> ahci 0000:00:1f.2: irq 29 for MSI/MSI-X
> ahci 0000:00:1f.2: AHCI 0001.0100 32 slots 4 ports 3 Gbps 0x33 impl SATA mode
> ahci 0000:00:1f.2: flags: 64bit ncq sntf led clo pio slum part ems
> ahci 0000:00:1f.2: setting latency timer to 64
> scsi0 : ahci
> scsi1 : ahci
> scsi2 : ahci
> scsi3 : ahci
> scsi4 : ahci
> scsi5 : ahci
> ata1: SATA max UDMA/133 abar m2048@0x50325000 port 0x50325100 irq 29
> ata2: SATA max UDMA/133 abar m2048@0x50325000 port 0x50325180 irq 29
> ata3: DUMMY
> ata4: DUMMY
> ata5: SATA max UDMA/133 abar m2048@0x50325000 port 0x50325300 irq 29
> ata6: SATA max UDMA/133 abar m2048@0x50325000 port 0x50325380 irq 29
> e1000e: Intel(R) PRO/1000 Network Driver - 1.0.2-k2
> e1000e: Copyright (c) 1999-2008 Intel Corporation.
> e1000e 0000:00:19.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20
> e1000e 0000:00:19.0: setting latency timer to 64
> e1000e 0000:00:19.0: irq 30 for MSI/MSI-X
> 0000:00:19.0: eth0: (PCI Express:2.5GB/s:Width x1) 00:16:76:ce:3a:3c
> 0000:00:19.0: eth0: Intel(R) PRO/1000 Network Connection
> 0000:00:19.0: eth0: MAC: 6, PHY: 6, PBA No: ffffff-0ff
> PNP: PS/2 Controller [PNP0303:PS2K,PNP0f03:PS2M] at 0x60,0x64 irq 1,12
> Platform driver 'i8042' needs updating - please use dev_pm_ops
> serio: i8042 KBD port at 0x60,0x64 irq 1
> serio: i8042 AUX port at 0x60,0x64 irq 12
> mice: PS/2 mouse device common for all mice
> rtc_cmos 00:03: RTC can wake from S4
> rtc_cmos 00:03: rtc core: registered rtc_cmos as rtc0
> rtc0: alarms up to one month, 114 bytes nvram
> i2c /dev entries driver
> i801_smbus 0000:00:1f.3: PCI INT B -> GSI 21 (level, low) -> IRQ 21
> coretemp coretemp.0: Using relative temperature scale!
> coretemp coretemp.1: Using relative temperature scale!
> cpuidle: using governor ladder
> ip_tables: (C) 2000-2006 Netfilter Core Team
> TCP cubic registered
> input: AT Translated Set 2 keyboard as /class/input/input2
> NET: Registered protocol family 17
> ata2: SATA link down (SStatus 0 SControl 300)
> ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> registered taskstats version 1
> ata6: SATA link down (SStatus 0 SControl 300)
> ata5: SATA link down (SStatus 0 SControl 300)
> rtc_cmos 00:03: setting system clock to 2009-06-18 10:22:46 UTC (1245320566)
> ata1.00: ATA-7: ST380211AS, 3.AAE, max UDMA/133
> ata1.00: 156301488 sectors, multi 0: LBA48 NCQ (depth 31/32)
> ata1.00: configured for UDMA/133
> scsi 0:0:0:0: Direct-Access ATA ST380211AS 3.AA PQ: 0 ANSI: 5
> sd 0:0:0:0: [sda] 156301488 512-byte hardware sectors: (80.0 GB/74.5 GiB)
> sd 0:0:0:0: [sda] Write Protect is off
> sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 >
> sd 0:0:0:0: [sda] Attached SCSI disk
> Freeing unused kernel memory: 360k freed
> Write protecting the kernel read-only data: 4324k
> Red Hat nash version 6.0.52 starting
> Mounting proc filesystem
> Mounting sysfs filesystem
> Creating /dev
> Creating initial device nodes
> Setting up hotplug.
> input: ImPS/2 Generic Wheel Mouse as /class/input/input3
> Creating block device nodes.
> mount: could not find filesystem '/proc/bus/usb'
> Waiting for driver initialization.
> Waiting for driver initialization.
> Creating root device.
> Mounting root filesystem.
> EXT3-fs: INFO: recovery required on readonly filesystem.
> EXT3-fs: write access will be enabled during recovery.
> kjournald starting. Commit interval 5 seconds
> Setting up otherEXT3-fs: recovery complete.
> filesystems.
> EXT3-fs: mounted filesystem with writeback data mode.
> Setting up new root fs
> no fstab.sys, mounting internal defaults
> SELinux: 8192 avtab hash slots, 177803 rules.
> SELinux: 8192 avtab hash slots, 177803 rules.
> SELinux: 6 users, 12 roles, 2431 types, 118 bools, 1 sens, 1024 cats
> SELinux: 73 classes, 177803 rules
> SELinux: class kernel_service not defined in policy
> SELinux: permission open in class sock_file not defined in policy
> SELinux: permission nlmsg_tty_audit in class netlink_audit_socket not defined in policy
> SELinux: the above unknown classes and permissions will be allowed
> SELinux: Completing initialization.
> SELinux: Setting up existing superblocks.
> SELinux: initialized (dev sda2, type ext3), uses xattr
> SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
> SELinux: initialized (dev selinuxfs, type selinuxfs), uses genfs_contexts
> SELinux: initialized (dev mqueue, type mqueue), uses transition SIDs
> SELinux: initialized (dev devpts, type devpts), uses transition SIDs
> SELinux: initialized (dev inotifyfs, type inotifyfs), uses genfs_contexts
> SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
> SELinux: initialized (dev anon_inodefs, type anon_inodefs), uses genfs_contexts
> SELinux: initialized (dev pipefs, type pipefs), uses task SIDs
> SELinux: initialized (dev debugfs, type debugfs), uses genfs_contexts
> SELinux: initialized (dev sockfs, type sockfs), uses task SIDs
> SELinux: initialized (dev proc, type proc), uses genfs_contexts
> SELinux: initialized (dev bdev, type bdev), uses genfs_contexts
> SELinux: initialized (dev rootfs, type rootfs), uses genfs_contexts
> SELinux: initialized (dev sysfs, type sysfs), uses genfs_contexts
> type=1403 audit(1245320574.561:2): policy loaded auid=4294967295 ses=4294967295
> Switching to new root and running init.
> unmounting old /dev
> unmounting old /proc
> unmounting old /sys
> Welcome to Fedora
> Press 'I' to enter interactive startup.
> Starting udev: [ OK ]
> Setting hostname andromeda.procyon.org.uk: [ OK ]
> Checking filesystems
> Checking all file systems.
> [/sbin/fsck.ext3 (1) -- /] fsck.ext3 -a /dev/sda2
> /1: clean, 330515/2621440 files, 1528849/2620603 blocks
> [/sbin/fsck.ext3 (1) -- /boot] fsck.ext3 -a /dev/sda1
> /boot1: recovering journal
> /boot1: clean, 79/50200 files, 72187/200780 blocks
> [ OK ]
> Remounting root filesystem in read-write mode: [ OK ]
> Mounting local filesystems: [ OK ]
> Enabling local filesystem quotas: [ OK ]
> Enabling /etc/fstab swaps: [ OK ]
> Entering non-interactive startup
> Starting background readahead (early, fast mode): [ OK ]
> FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
> Bringing up loopback interface: [ OK ]
> Bringing up interface eth0:
> Determining IP information for eth0... done.
> [ OK ]
> FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
> Starting restorecond: [ OK ]
> Starting auditd: [ OK ]
> Starting irqbalance: [ OK ]
> Starting mcstransd: [ OK ]
> Starting rpcbind: modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> rpcbind: cannot create socket for udp6
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> rpcbind: cannot create socket for tcp6
> [ OK ]
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> Starting NFS statd: [ OK ]
> Starting system message bus: [ OK ]
> Starting lm_sensors: not configured, run sensors-detect[WARNING]
> Starting sshd: modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> [ OK ]
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> Starting ntpd: [ OK ]
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> SysRq : Changing Loglevel
> Loglevel set to 8
> Now booted
> Starting smartd: modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> [ OK ]
>
> Fedora release 9 (Sulphur)
> Kernel 2.6.30-cachefs on an x86_64 (/dev/ttyS0)
>
> andromeda.procyon.org.uk login: modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> warning: `capget01' uses 32-bit capabilities (legacy support in use)
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> msgctl11 invoked oom-killer: gfp_mask=0xd0, order=1, oom_adj=0
> msgctl11 cpuset=/ mems_allowed=0
> Pid: 30549, comm: msgctl11 Not tainted 2.6.30-cachefs #106
> Call Trace:
> [<ffffffff81071dae>] ? oom_kill_process.clone.0+0xa9/0x245
> [<ffffffff81072075>] ? __out_of_memory+0x12b/0x142
> [<ffffffff810720f6>] ? out_of_memory+0x6a/0x94
> [<ffffffff8107479e>] ? __alloc_pages_nodemask+0x422/0x50b
> [<ffffffff81031110>] ? copy_process+0x93/0x113f
> [<ffffffff810748f1>] ? __get_free_pages+0x12/0x50
> [<ffffffff81031130>] ? copy_process+0xb3/0x113f
> [<ffffffff81081ae2>] ? handle_mm_fault+0x2d5/0x645
> [<ffffffff810322fb>] ? do_fork+0x13f/0x2ba
> [<ffffffff81022a0b>] ? do_page_fault+0x1f1/0x206
> [<ffffffff8100b0d3>] ? stub_clone+0x13/0x20
> [<ffffffff8100ad6b>] ? system_call_fastpath+0x16/0x1b
> Mem-Info:
> DMA per-cpu:
> CPU 0: hi: 0, btch: 1 usd: 0
> CPU 1: hi: 0, btch: 1 usd: 0
> DMA32 per-cpu:
> CPU 0: hi: 186, btch: 31 usd: 0
> CPU 1: hi: 186, btch: 31 usd: 47
> Active_anon:80388 active_file:0 inactive_anon:822
> inactive_file:2 unevictable:0 dirty:0 writeback:0 unstable:0
> free:2053 slab:38793 mapped:357 pagetables:60476 bounce:0
> DMA free:3916kB min:60kB low:72kB high:88kB active_anon:3608kB inactive_anon:128kB active_file:0kB inactive_file:0kB unevictable:0kB present:15364kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 968 968 968
> DMA32 free:4296kB min:3948kB low:4932kB high:5920kB active_anon:317944kB inactive_anon:3160kB active_file:0kB inactive_file:8kB unevictable:0kB present:992032kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
> DMA: 1*4kB 1*8kB 0*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3916kB
> DMA32: 576*4kB 15*8kB 1*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 4296kB
> 1854 total pagecache pages
> 0 pages in swap cache
> Swap cache stats: add 0, delete 0, find 0/0
> Free swap = 0kB
> Total swap = 0kB
> 255744 pages RAM
> 5588 pages reserved
> 230698 pages shared
> 217103 pages non-shared
> Out of memory: kill process 25166 (msgctl11) score 133496 or a child
> Killed process 28855 (msgctl11)
> msgctl11 invoked oom-killer: gfp_mask=0xd0, order=1, oom_adj=0
> msgctl11 cpuset=/ mems_allowed=0
> Pid: 30312, comm: msgctl11 Not tainted 2.6.30-cachefs #106
> Call Trace:
> [<ffffffff81071dae>] ? oom_kill_process.clone.0+0xa9/0x245
> [<ffffffff81072075>] ? __out_of_memory+0x12b/0x142
> [<ffffffff810720f6>] ? out_of_memory+0x6a/0x94
> [<ffffffff8107479e>] ? __alloc_pages_nodemask+0x422/0x50b
> [<ffffffff81031110>] ? copy_process+0x93/0x113f
> [<ffffffff810748f1>] ? __get_free_pages+0x12/0x50
> [<ffffffff81031130>] ? copy_process+0xb3/0x113f
> [<ffffffff81029a83>] ? update_curr+0x53/0xdf
> [<ffffffff81081e00>] ? handle_mm_fault+0x5f3/0x645
> [<ffffffff810322fb>] ? do_fork+0x13f/0x2ba
> [<ffffffff81022a0b>] ? do_page_fault+0x1f1/0x206
> [<ffffffff8100b0d3>] ? stub_clone+0x13/0x20
> [<ffffffff8100ad6b>] ? system_call_fastpath+0x16/0x1b
> Mem-Info:
> DMA per-cpu:
> CPU 0: hi: 0, btch: 1 usd: 0
> CPU 1: hi: 0, btch: 1 usd: 0
> DMA32 per-cpu:
> CPU 0: hi: 186, btch: 31 usd: 0
> CPU 1: hi: 186, btch: 31 usd: 0
> Active_anon:79646 active_file:2 inactive_anon:4113
> inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
> free:1966 slab:38417 mapped:2 pagetables:61720 bounce:0
> DMA free:3916kB min:60kB low:72kB high:88kB active_anon:3608kB inactive_anon:256kB active_file:0kB inactive_file:0kB unevictable:0kB present:15364kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 968 968 968
> DMA32 free:3948kB min:3948kB low:4932kB high:5920kB active_anon:314976kB inactive_anon:16196kB active_file:8kB inactive_file:0kB unevictable:0kB present:992032kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
> DMA: 1*4kB 1*8kB 0*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3916kB
> DMA32: 443*4kB 20*8kB 10*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 3948kB
> 36 total pagecache pages
> 0 pages in swap cache
> Swap cache stats: add 0, delete 0, find 0/0
> Free swap = 0kB
> Total swap = 0kB
> 255744 pages RAM
> 5588 pages reserved
> 151665 pages shared
> 220702 pages non-shared
> Out of memory: kill process 25166 (msgctl11) score 133404 or a child
> Killed process 28860 (msgctl11)

2009-06-19 05:27:35

by Fengguang Wu

Subject: Re: [PATCH 0/3] make mapped executable pages the first class citizen

On Fri, Jun 19, 2009 at 12:18:58AM +0800, David Howells wrote:
>
> Okay, after dropping all my devel patches, I got the OOM to happen again;
> fresh trace attached. I was running LTP and an NFSD, and I was spamming the
> NFSD continuously from another machine (mount;tar;umount;repeat).

It's unlikely that Rik's patches or mine could create OOM situations.

But the problem is real: Roger also reports OOM on 2.6.30.
He's running a 2GB desktop that suspend/resumes a lot.
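
As a rough sanity check on the first OOM report quoted above, the page counts in its Mem-Info dump can be converted to kilobytes and the DMA32 free-list line can be re-added. This is only an illustrative sketch: it assumes 4 kB pages (which is consistent with the per-zone kB figures in the trace), and the names `mem_info_pages`, `dma32_free_blocks`, `pages_to_kb` and `buddy_total_kb` are made up for the example.

```python
PAGE_KB = 4  # assumption: 4 kB pages on this x86_64 box

# Counters copied from the first Mem-Info dump in the trace, in pages.
mem_info_pages = {
    "active_anon": 80388,
    "inactive_anon": 822,
    "slab": 38793,
    "pagetables": 60476,
    "free": 2053,
}

# DMA32 free list from the same dump: block size in kB -> number of free blocks.
dma32_free_blocks = {
    4: 576, 8: 15, 16: 1, 32: 0, 64: 1, 128: 0,
    256: 1, 512: 1, 1024: 1, 2048: 0, 4096: 0,
}


def pages_to_kb(pages, page_kb=PAGE_KB):
    """Convert a page count to kilobytes."""
    return pages * page_kb


def buddy_total_kb(blocks):
    """Sum a free-list line given as {block_size_kb: count}."""
    return sum(size_kb * count for size_kb, count in blocks.items())


mem_info_kb = {name: pages_to_kb(n) for name, n in mem_info_pages.items()}

if __name__ == "__main__":
    for name, kb in mem_info_kb.items():
        print(f"{name:>14}: {kb:>7} kB")
    print(f"{'DMA32 free':>14}: {buddy_total_kb(dma32_free_blocks):>7} kB")
```

With these inputs, active_anon comes to 321552 kB and free to 8212 kB, which agree with the sums of the per-zone DMA and DMA32 figures in the same dump (3608 + 317944 kB and 3916 + 4296 kB). Page tables come to 241904 kB and slab to 155172 kB.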

Thanks,
Fengguang

>
> David
> ---
> Initializing cgroup subsys cpuset
> Linux version 2.6.30-cachefs ([email protected]) (gcc version 4.4.0 20090506 (Red Hat 4.4.0-4) (GCC) ) #107 SMP Thu Jun 18 15:36:16 BST 2009
> Command line: initrd=andromeda-initrd console=tty0 console=ttyS0,115200 ro root=/dev/sda2 enforcing=1 debug BOOT_IMAGE=andromeda-vmlinuz
> KERNEL supported cpus:
> Intel GenuineIntel
> AMD AuthenticAMD
> Centaur CentaurHauls
> BIOS-provided physical RAM map:
> BIOS-e820: 0000000000000000 - 000000000009ec00 (usable)
> BIOS-e820: 000000000009ec00 - 00000000000a0000 (reserved)
> BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
> BIOS-e820: 0000000000100000 - 000000003e59a000 (usable)
> BIOS-e820: 000000003e59a000 - 000000003e5a6000 (reserved)
> BIOS-e820: 000000003e5a6000 - 000000003e644000 (usable)
> BIOS-e820: 000000003e644000 - 000000003e6a9000 (ACPI NVS)
> BIOS-e820: 000000003e6a9000 - 000000003e6ac000 (ACPI data)
> BIOS-e820: 000000003e6ac000 - 000000003e6f2000 (ACPI NVS)
> BIOS-e820: 000000003e6f2000 - 000000003e6ff000 (ACPI data)
> BIOS-e820: 000000003e6ff000 - 000000003e700000 (usable)
> BIOS-e820: 000000003e700000 - 000000003f000000 (reserved)
> BIOS-e820: 00000000fff00000 - 0000000100000000 (reserved)
> DMI 2.4 present.
> last_pfn = 0x3e700 max_arch_pfn = 0x400000000
> MTRR default type: uncachable
> MTRR fixed ranges enabled:
> 00000-9FFFF write-back
> A0000-FFFFF uncachable
> MTRR variable ranges enabled:
> 0 base 000000000 mask FC0000000 write-back
> 1 base 03F000000 mask FFF000000 uncachable
> 2 base 03E800000 mask FFF800000 uncachable
> 3 base 03E700000 mask FFFF00000 uncachable
> 4 disabled
> 5 disabled
> 6 disabled
> 7 disabled
> x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
> initial memory mapped : 0 - 20000000
> init_memory_mapping: 0000000000000000-000000003e700000
> 0000000000 - 003e600000 page 2M
> 003e600000 - 003e700000 page 4k
> kernel direct mapping tables up to 3e700000 @ 8000-b000
> RAMDISK: 3e2ee000 - 3e57991c
> ACPI: RSDP 00000000000fe020 00014 (v00 INTEL )
> ACPI: RSDT 000000003e6fd038 0004C (v01 INTEL DG965RY 00000330 01000013)
> ACPI: FACP 000000003e6fc000 00074 (v01 INTEL DG965RY 00000330 MSFT 01000013)
> ACPI: DSDT 000000003e6f8000 03EDA (v01 INTEL DG965RY 00000330 MSFT 01000013)
> ACPI: FACS 000000003e6ac000 00040
> ACPI: APIC 000000003e6f7000 00078 (v01 INTEL DG965RY 00000330 MSFT 01000013)
> ACPI: WDDT 000000003e6f6000 00040 (v01 INTEL DG965RY 00000330 MSFT 01000013)
> ACPI: MCFG 000000003e6f5000 0003C (v01 INTEL DG965RY 00000330 MSFT 01000013)
> ACPI: ASF! 000000003e6f4000 000A6 (v32 INTEL DG965RY 00000330 MSFT 01000013)
> ACPI: SSDT 000000003e6f3000 001BC (v01 INTEL CpuPm 00000330 MSFT 01000013)
> ACPI: SSDT 000000003e6f2000 00175 (v01 INTEL Cpu0Ist 00000330 MSFT 01000013)
> ACPI: SSDT 000000003e6ab000 00175 (v01 INTEL Cpu1Ist 00000330 MSFT 01000013)
> ACPI: SSDT 000000003e6aa000 00175 (v01 INTEL Cpu2Ist 00000330 MSFT 01000013)
> ACPI: SSDT 000000003e6a9000 00175 (v01 INTEL Cpu3Ist 00000330 MSFT 01000013)
> ACPI: Local APIC address 0xfee00000
> (7 early reservations) ==> bootmem [0000000000 - 003e700000]
> #0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000]
> #1 [0000006000 - 0000008000] TRAMPOLINE ==> [0000006000 - 0000008000]
> #2 [0001000000 - 0001535d90] TEXT DATA BSS ==> [0001000000 - 0001535d90]
> #3 [003e2ee000 - 003e57991c] RAMDISK ==> [003e2ee000 - 003e57991c]
> #4 [000009e800 - 0000100000] BIOS reserved ==> [000009e800 - 0000100000]
> #5 [0001536000 - 0001536199] BRK ==> [0001536000 - 0001536199]
> #6 [0000008000 - 0000009000] PGTABLE ==> [0000008000 - 0000009000]
> found SMP MP-table at [ffff8800000fe200] fe200
> [ffffea0000000000-ffffea0000dfffff] PMD -> [ffff880001a00000-ffff8800027fffff] on node 0
> Zone PFN ranges:
> DMA 0x00000000 -> 0x00001000
> DMA32 0x00001000 -> 0x00100000
> Normal 0x00100000 -> 0x00100000
> Movable zone start PFN for each node
> early_node_map[4] active PFN ranges
> 0: 0x00000000 -> 0x0000009e
> 0: 0x00000100 -> 0x0003e59a
> 0: 0x0003e5a6 -> 0x0003e644
> 0: 0x0003e6ff -> 0x0003e700
> On node 0 totalpages: 255447
> DMA zone: 56 pages used for memmap
> DMA zone: 101 pages reserved
> DMA zone: 3841 pages, LIFO batch:0
> DMA32 zone: 3441 pages used for memmap
> DMA32 zone: 248008 pages, LIFO batch:31
> ACPI: PM-Timer IO Port: 0x408
> ACPI: Local APIC address 0xfee00000
> ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
> ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
> ACPI: LAPIC (acpi_id[0x03] lapic_id[0x82] disabled)
> ACPI: LAPIC (acpi_id[0x04] lapic_id[0x83] disabled)
> ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
> ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
> IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
> ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
> ACPI: IRQ0 used by override.
> ACPI: IRQ2 used by override.
> ACPI: IRQ9 used by override.
> Using ACPI (MADT) for SMP configuration information
> 4 Processors exceeds NR_CPUS limit of 2
> SMP: Allowing 2 CPUs, 0 hotplug CPUs
> nr_irqs_gsi: 24
> PM: Registered nosave memory: 000000000009e000 - 000000000009f000
> PM: Registered nosave memory: 000000000009f000 - 00000000000a0000
> PM: Registered nosave memory: 00000000000a0000 - 00000000000e0000
> PM: Registered nosave memory: 00000000000e0000 - 0000000000100000
> PM: Registered nosave memory: 000000003e59a000 - 000000003e5a6000
> PM: Registered nosave memory: 000000003e644000 - 000000003e6a9000
> PM: Registered nosave memory: 000000003e6a9000 - 000000003e6ac000
> PM: Registered nosave memory: 000000003e6ac000 - 000000003e6f2000
> PM: Registered nosave memory: 000000003e6f2000 - 000000003e6ff000
> Allocating PCI resources starting at 3f000000 (gap: 3f000000:c0f00000)
> NR_CPUS:2 nr_cpumask_bits:2 nr_cpu_ids:2 nr_node_ids:1
> PERCPU: Embedded 24 pages at ffff880001541000, static data 67296 bytes
> Built 1 zonelists in Zone order, mobility grouping on. Total pages: 251849
> Kernel command line: initrd=andromeda-initrd console=tty0 console=ttyS0,115200 ro root=/dev/sda2 enforcing=1 debug BOOT_IMAGE=andromeda-vmlinuz
> PID hash table entries: 4096 (order: 12, 32768 bytes)
> Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
> Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
> Initializing CPU#0
> Checking aperture...
> No AGP bridge found
> Memory: 996952k/1022976k available (2949k kernel code, 1188k absent, 24132k reserved, 1679k data, 360k init)
> NR_IRQS:320
> Fast TSC calibration using PIT
> Detected 1865.185 MHz processor.
> Console: colour VGA+ 80x25
> console [tty0] enabled
> console [ttyS0] enabled
> Calibrating delay loop (skipped), value calculated using timer frequency.. 3730.37 BogoMIPS (lpj=7460740)
> Security Framework initialized
> SELinux: Initializing.
> SELinux: Starting in enforcing mode
> Mount-cache hash table entries: 256
> Initializing cgroup subsys debug
> Initializing cgroup subsys ns
> Initializing cgroup subsys devices
> CPU: L1 I cache: 32K, L1 D cache: 32K
> CPU: L2 cache: 2048K
> CPU: Physical Processor ID: 0
> CPU: Processor Core ID: 0
> mce: CPU supports 6 MCE banks
> CPU0: Thermal monitoring enabled (TM2)
> using mwait in idle threads.
> ACPI: Core revision 20090521
> Setting APIC routing to flat
> ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
> CPU0: Intel(R) Core(TM)2 CPU 6300 @ 1.86GHz stepping 06
> Booting processor 1 APIC 0x1 ip 0x6000
> Initializing CPU#1
> Calibrating delay using timer specific routine.. 3729.90 BogoMIPS (lpj=7459814)
> CPU: L1 I cache: 32K, L1 D cache: 32K
> CPU: L2 cache: 2048K
> CPU: Physical Processor ID: 0
> CPU: Processor Core ID: 1
> mce: CPU supports 6 MCE banks
> CPU1: Thermal monitoring enabled (TM2)
> x86 PAT enabled: cpu 1, old 0x7040600070406, new 0x7010600070106
> CPU1: Intel(R) Core(TM)2 CPU 6300 @ 1.86GHz stepping 06
> checking TSC synchronization [CPU#0 -> CPU#1]: passed.
> Brought up 2 CPUs
> Total of 2 processors activated (7460.27 BogoMIPS).
> NET: Registered protocol family 16
> ACPI: bus type pci registered
> PCI: MCFG configuration 0: base f0000000 segment 0 buses 0 - 127
> PCI: Not using MMCONFIG.
> PCI: Using configuration type 1 for base access
> bio: create slab <bio-0> at 0
> ACPI: EC: Look up EC in DSDT
> ACPI: Interpreter enabled
> ACPI: (supports S0 S3 S4 S5)
> ACPI: Using IOAPIC for interrupt routing
> PCI: MCFG configuration 0: base f0000000 segment 0 buses 0 - 127
> PCI: MCFG area at f0000000 reserved in ACPI motherboard resources
> PCI: Using MMCONFIG at f0000000 - f7ffffff
> ACPI: No dock devices found.
> ACPI: PCI Root Bridge [PCI0] (0000:00)
> pci 0000:00:02.0: reg 10 32bit mmio: [0x50200000-0x502fffff]
> pci 0000:00:02.0: reg 18 64bit mmio: [0x40000000-0x4fffffff]
> pci 0000:00:02.0: reg 20 io port: [0x2110-0x2117]
> pci 0000:00:03.0: reg 10 64bit mmio: [0x50326100-0x5032610f]
> pci 0000:00:03.0: PME# supported from D0 D3hot D3cold
> pci 0000:00:03.0: PME# disabled
> pci 0000:00:19.0: reg 10 32bit mmio: [0x50300000-0x5031ffff]
> pci 0000:00:19.0: reg 14 32bit mmio: [0x50324000-0x50324fff]
> pci 0000:00:19.0: reg 18 io port: [0x20e0-0x20ff]
> pci 0000:00:19.0: PME# supported from D0 D3hot D3cold
> pci 0000:00:19.0: PME# disabled
> pci 0000:00:1a.0: reg 20 io port: [0x20c0-0x20df]
> pci 0000:00:1a.1: reg 20 io port: [0x20a0-0x20bf]
> pci 0000:00:1a.7: reg 10 32bit mmio: [0x50325c00-0x50325fff]
> pci 0000:00:1a.7: PME# supported from D0 D3hot D3cold
> pci 0000:00:1a.7: PME# disabled
> pci 0000:00:1b.0: reg 10 64bit mmio: [0x50320000-0x50323fff]
> pci 0000:00:1b.0: PME# supported from D0 D3hot D3cold
> pci 0000:00:1b.0: PME# disabled
> pci 0000:00:1c.0: PME# supported from D0 D3hot D3cold
> pci 0000:00:1c.0: PME# disabled
> pci 0000:00:1c.1: PME# supported from D0 D3hot D3cold
> pci 0000:00:1c.1: PME# disabled
> pci 0000:00:1c.2: PME# supported from D0 D3hot D3cold
> pci 0000:00:1c.2: PME# disabled
> pci 0000:00:1c.3: PME# supported from D0 D3hot D3cold
> pci 0000:00:1c.3: PME# disabled
> pci 0000:00:1c.4: PME# supported from D0 D3hot D3cold
> pci 0000:00:1c.4: PME# disabled
> pci 0000:00:1d.0: reg 20 io port: [0x2080-0x209f]
> pci 0000:00:1d.1: reg 20 io port: [0x2060-0x207f]
> pci 0000:00:1d.2: reg 20 io port: [0x2040-0x205f]
> pci 0000:00:1d.7: reg 10 32bit mmio: [0x50325800-0x50325bff]
> pci 0000:00:1d.7: PME# supported from D0 D3hot D3cold
> pci 0000:00:1d.7: PME# disabled
> pci 0000:00:1f.0: quirk: region 0400-047f claimed by ICH6 ACPI/GPIO/TCO
> pci 0000:00:1f.0: quirk: region 0500-053f claimed by ICH6 GPIO
> pci 0000:00:1f.0: ICH7 LPC Generic IO decode 1 PIO at 0680 (mask 007f)
> pci 0000:00:1f.2: reg 10 io port: [0x2108-0x210f]
> pci 0000:00:1f.2: reg 14 io port: [0x211c-0x211f]
> pci 0000:00:1f.2: reg 18 io port: [0x2100-0x2107]
> pci 0000:00:1f.2: reg 1c io port: [0x2118-0x211b]
> pci 0000:00:1f.2: reg 20 io port: [0x2020-0x203f]
> pci 0000:00:1f.2: reg 24 32bit mmio: [0x50325000-0x503257ff]
> pci 0000:00:1f.2: PME# supported from D3hot
> pci 0000:00:1f.2: PME# disabled
> pci 0000:00:1f.3: reg 10 32bit mmio: [0x50326000-0x503260ff]
> pci 0000:00:1f.3: reg 20 io port: [0x2000-0x201f]
> pci 0000:00:1c.0: bridge 32bit mmio: [0x50400000-0x504fffff]
> pci 0000:02:00.0: reg 10 io port: [0x1018-0x101f]
> pci 0000:02:00.0: reg 14 io port: [0x1024-0x1027]
> pci 0000:02:00.0: reg 18 io port: [0x1010-0x1017]
> pci 0000:02:00.0: reg 1c io port: [0x1020-0x1023]
> pci 0000:02:00.0: reg 20 io port: [0x1000-0x100f]
> pci 0000:02:00.0: reg 24 32bit mmio: [0x50100000-0x501001ff]
> pci 0000:02:00.0: supports D1
> pci 0000:02:00.0: PME# supported from D0 D1 D3hot
> pci 0000:02:00.0: PME# disabled
> pci 0000:00:1c.1: bridge io port: [0x1000-0x1fff]
> pci 0000:00:1c.1: bridge 32bit mmio: [0x50100000-0x501fffff]
> pci 0000:00:1c.2: bridge 32bit mmio: [0x50500000-0x505fffff]
> pci 0000:00:1c.3: bridge 32bit mmio: [0x50600000-0x506fffff]
> pci 0000:00:1c.4: bridge 32bit mmio: [0x50700000-0x507fffff]
> pci 0000:06:03.0: reg 10 32bit mmio: [0x50004000-0x500047ff]
> pci 0000:06:03.0: reg 14 32bit mmio: [0x50000000-0x50003fff]
> pci 0000:06:03.0: supports D1 D2
> pci 0000:06:03.0: PME# supported from D0 D1 D2 D3hot
> pci 0000:06:03.0: PME# disabled
> pci 0000:00:1e.0: transparent bridge
> pci 0000:00:1e.0: bridge 32bit mmio: [0x50000000-0x500fffff]
> pci_bus 0000:00: on NUMA node 0
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P32_._PRT]
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX0._PRT]
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX1._PRT]
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX2._PRT]
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX3._PRT]
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX4._PRT]
> ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 7 9 10 *11 12)
> ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 7 9 *10 11 12)
> ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 7 9 10 *11 12)
> ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 7 9 10 *11 12)
> ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 7 *9 10 11 12)
> ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 7 9 *10 11 12)
> ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 7 *9 10 11 12)
> ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 7 9 10 *11 12)
> SCSI subsystem initialized
> libata version 3.00 loaded.
> PCI: Using ACPI for IRQ routing
> NetLabel: Initializing
> NetLabel: domain hash size = 128
> NetLabel: protocols = UNLABELED CIPSOv4
> NetLabel: unlabeled traffic allowed by default
> pnp: PnP ACPI init
> ACPI: bus type pnp registered
> pnp: PnP ACPI: found 12 devices
> ACPI: ACPI bus type pnp unregistered
> system 00:01: iomem range 0xf0000000-0xf7ffffff has been reserved
> system 00:01: iomem range 0xfed13000-0xfed13fff has been reserved
> system 00:01: iomem range 0xfed14000-0xfed17fff has been reserved
> system 00:01: iomem range 0xfed18000-0xfed18fff has been reserved
> system 00:01: iomem range 0xfed19000-0xfed19fff has been reserved
> system 00:01: iomem range 0xfed1c000-0xfed1ffff has been reserved
> system 00:01: iomem range 0xfed20000-0xfed3ffff has been reserved
> system 00:01: iomem range 0xfed45000-0xfed99fff has been reserved
> system 00:01: iomem range 0xc0000-0xdffff has been reserved
> system 00:01: iomem range 0xe0000-0xfffff could not be reserved
> system 00:06: ioport range 0x500-0x53f has been reserved
> system 00:06: ioport range 0x400-0x47f has been reserved
> system 00:06: ioport range 0x680-0x6ff has been reserved
> pci 0000:00:1c.0: PCI bridge, secondary bus 0000:01
> pci 0000:00:1c.0: IO window: disabled
> pci 0000:00:1c.0: MEM window: 0x50400000-0x504fffff
> pci 0000:00:1c.0: PREFETCH window: disabled
> pci 0000:00:1c.1: PCI bridge, secondary bus 0000:02
> pci 0000:00:1c.1: IO window: 0x1000-0x1fff
> pci 0000:00:1c.1: MEM window: 0x50100000-0x501fffff
> pci 0000:00:1c.1: PREFETCH window: disabled
> pci 0000:00:1c.2: PCI bridge, secondary bus 0000:03
> pci 0000:00:1c.2: IO window: disabled
> pci 0000:00:1c.2: MEM window: 0x50500000-0x505fffff
> pci 0000:00:1c.2: PREFETCH window: disabled
> pci 0000:00:1c.3: PCI bridge, secondary bus 0000:04
> pci 0000:00:1c.3: IO window: disabled
> pci 0000:00:1c.3: MEM window: 0x50600000-0x506fffff
> pci 0000:00:1c.3: PREFETCH window: disabled
> pci 0000:00:1c.4: PCI bridge, secondary bus 0000:05
> pci 0000:00:1c.4: IO window: disabled
> pci 0000:00:1c.4: MEM window: 0x50700000-0x507fffff
> pci 0000:00:1c.4: PREFETCH window: disabled
> pci 0000:00:1e.0: PCI bridge, secondary bus 0000:06
> pci 0000:00:1e.0: IO window: disabled
> pci 0000:00:1e.0: MEM window: 0x50000000-0x500fffff
> pci 0000:00:1e.0: PREFETCH window: disabled
> pci 0000:00:1c.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
> pci 0000:00:1c.0: setting latency timer to 64
> pci 0000:00:1c.1: PCI INT B -> GSI 16 (level, low) -> IRQ 16
> pci 0000:00:1c.1: setting latency timer to 64
> pci 0000:00:1c.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18
> pci 0000:00:1c.2: setting latency timer to 64
> pci 0000:00:1c.3: PCI INT D -> GSI 19 (level, low) -> IRQ 19
> pci 0000:00:1c.3: setting latency timer to 64
> pci 0000:00:1c.4: PCI INT A -> GSI 17 (level, low) -> IRQ 17
> pci 0000:00:1c.4: setting latency timer to 64
> pci 0000:00:1e.0: setting latency timer to 64
> pci_bus 0000:00: resource 0 io: [0x00-0xffff]
> pci_bus 0000:00: resource 1 mem: [0x000000-0xffffffffffffffff]
> pci_bus 0000:01: resource 1 mem: [0x50400000-0x504fffff]
> pci_bus 0000:02: resource 0 io: [0x1000-0x1fff]
> pci_bus 0000:02: resource 1 mem: [0x50100000-0x501fffff]
> pci_bus 0000:03: resource 1 mem: [0x50500000-0x505fffff]
> pci_bus 0000:04: resource 1 mem: [0x50600000-0x506fffff]
> pci_bus 0000:05: resource 1 mem: [0x50700000-0x507fffff]
> pci_bus 0000:06: resource 1 mem: [0x50000000-0x500fffff]
> pci_bus 0000:06: resource 3 io: [0x00-0xffff]
> pci_bus 0000:06: resource 4 mem: [0x000000-0xffffffffffffffff]
> NET: Registered protocol family 2
> IP route cache hash table entries: 32768 (order: 6, 262144 bytes)
> TCP established hash table entries: 131072 (order: 9, 2097152 bytes)
> TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
> TCP: Hash tables configured (established 131072 bind 65536)
> TCP reno registered
> NET: Registered protocol family 1
> Unpacking initramfs...
> Freeing initrd memory: 2606k freed
> audit: initializing netlink socket (disabled)
> type=2000 audit(1245336472.149:1): initialized
> VFS: Disk quotas dquot_6.5.2
> Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
> SGI XFS with ACLs, security attributes, large block/inode numbers, no debug enabled
> msgmni has been set to 1953
> SELinux: Registering netfilter hooks
> alg: No test for fcrypt (fcrypt-generic)
> alg: No test for stdrng (krng)
> Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253)
> io scheduler noop registered
> io scheduler anticipatory registered (default)
> io scheduler deadline registered
> io scheduler cfq registered
> pci 0000:00:02.0: Boot video device
> pcieport-driver 0000:00:1c.0: irq 24 for MSI/MSI-X
> pcieport-driver 0000:00:1c.0: setting latency timer to 64
> pcieport-driver 0000:00:1c.1: irq 25 for MSI/MSI-X
> pcieport-driver 0000:00:1c.1: setting latency timer to 64
> pcieport-driver 0000:00:1c.2: irq 26 for MSI/MSI-X
> pcieport-driver 0000:00:1c.2: setting latency timer to 64
> pcieport-driver 0000:00:1c.3: irq 27 for MSI/MSI-X
> pcieport-driver 0000:00:1c.3: setting latency timer to 64
> pcieport-driver 0000:00:1c.4: irq 28 for MSI/MSI-X
> pcieport-driver 0000:00:1c.4: setting latency timer to 64
> input: Power Button as /class/input/input0
> ACPI: Power Button [PWRF]
> input: Sleep Button as /class/input/input1
> ACPI: Sleep Button [SLPB]
> processor ACPI_CPU:00: registered as cooling_device0
> ACPI: Processor [CPU0] (supports 8 throttling states)
> processor ACPI_CPU:01: registered as cooling_device1
> ACPI: Processor [CPU1] (supports 8 throttling states)
> Linux agpgart interface v0.103
> agpgart-intel 0000:00:00.0: Intel 965G Chipset
> agpgart-intel 0000:00:00.0: detected 7676K stolen memory
> agpgart-intel 0000:00:00.0: AGP aperture is 256M @ 0x40000000
> intelfb: Framebuffer driver for Intel(R) 830M/845G/852GM/855GM/865G/915G/915GM/945G/945GM/945GME/965G/965GM chipsets
> intelfb: Version 0.9.6
> intelfb 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
> intelfb: 00:02.0: Intel(R) 965G, aperture size 256MB, stolen memory 7932kB
> intelfb: Initial video mode is 1024x768-32@70.
> Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
> Platform driver 'serial8250' needs updating - please use dev_pm_ops
> 00:0a: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
> loop: module loaded
> Driver 'sd' needs updating - please use bus_type methods
> ahci 0000:00:1f.2: version 3.0
> ahci 0000:00:1f.2: PCI INT A -> GSI 19 (level, low) -> IRQ 19
> ahci 0000:00:1f.2: irq 29 for MSI/MSI-X
> ahci 0000:00:1f.2: AHCI 0001.0100 32 slots 4 ports 3 Gbps 0x33 impl SATA mode
> ahci 0000:00:1f.2: flags: 64bit ncq sntf led clo pio slum part ems
> ahci 0000:00:1f.2: setting latency timer to 64
> scsi0 : ahci
> scsi1 : ahci
> scsi2 : ahci
> scsi3 : ahci
> scsi4 : ahci
> scsi5 : ahci
> ata1: SATA max UDMA/133 abar m2048@0x50325000 port 0x50325100 irq 29
> ata2: SATA max UDMA/133 abar m2048@0x50325000 port 0x50325180 irq 29
> ata3: DUMMY
> ata4: DUMMY
> ata5: SATA max UDMA/133 abar m2048@0x50325000 port 0x50325300 irq 29
> ata6: SATA max UDMA/133 abar m2048@0x50325000 port 0x50325380 irq 29
> e1000e: Intel(R) PRO/1000 Network Driver - 1.0.2-k2
> e1000e: Copyright (c) 1999-2008 Intel Corporation.
> e1000e 0000:00:19.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20
> e1000e 0000:00:19.0: setting latency timer to 64
> e1000e 0000:00:19.0: irq 30 for MSI/MSI-X
> 0000:00:19.0: eth0: (PCI Express:2.5GB/s:Width x1) 00:16:76:ce:3a:3c
> 0000:00:19.0: eth0: Intel(R) PRO/1000 Network Connection
> 0000:00:19.0: eth0: MAC: 6, PHY: 6, PBA No: ffffff-0ff
> PNP: PS/2 Controller [PNP0303:PS2K,PNP0f03:PS2M] at 0x60,0x64 irq 1,12
> Platform driver 'i8042' needs updating - please use dev_pm_ops
> serio: i8042 KBD port at 0x60,0x64 irq 1
> serio: i8042 AUX port at 0x60,0x64 irq 12
> mice: PS/2 mouse device common for all mice
> rtc_cmos 00:03: RTC can wake from S4
> rtc_cmos 00:03: rtc core: registered rtc_cmos as rtc0
> rtc0: alarms up to one month, 114 bytes nvram
> i2c /dev entries driver
> i801_smbus 0000:00:1f.3: PCI INT B -> GSI 21 (level, low) -> IRQ 21
> coretemp coretemp.0: Using relative temperature scale!
> coretemp coretemp.1: Using relative temperature scale!
> cpuidle: using governor ladder
> ip_tables: (C) 2000-2006 Netfilter Core Team
> TCP cubic registered
> input: AT Translated Set 2 keyboard as /class/input/input2
> NET: Registered protocol family 17
> registered taskstats version 1
> ata6: SATA link down (SStatus 0 SControl 300)
> rtc_cmos 00:03: setting system clock to 2009-06-18 14:47:54 UTC (1245336474)
> ata5: SATA link down (SStatus 0 SControl 300)
> ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> ata2: SATA link down (SStatus 0 SControl 300)
> ata1.00: ATA-7: ST380211AS, 3.AAE, max UDMA/133
> ata1.00: 156301488 sectors, multi 0: LBA48 NCQ (depth 31/32)
> ata1.00: configured for UDMA/133
> scsi 0:0:0:0: Direct-Access ATA ST380211AS 3.AA PQ: 0 ANSI: 5
> sd 0:0:0:0: [sda] 156301488 512-byte hardware sectors: (80.0 GB/74.5 GiB)
> sd 0:0:0:0: [sda] Write Protect is off
> sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 >
> sd 0:0:0:0: [sda] Attached SCSI disk
> Freeing unused kernel memory: 360k freed
> Write protecting the kernel read-only data: 4320k
> Red Hat nash version 6.0.52 starting
> Mounting proc filesystem
> Mounting sysfs filesystem
> Creating /dev
> Creating initial device nodes
> Setting up hotplug.
> input: ImPS/2 Generic Wheel Mouse as /class/input/input3
> Creating block device nodes.
> mount: could not find filesystem '/proc/bus/usb'
> Waiting for driver initialization.
> Waiting for driver initialization.
> Creating root device.
> Mounting root filesystem.
> kjournald starting. Commit interval 5 seconds
> EXT3-fs: mounted filesystem with writeback data mode.
> Setting up other filesystems.
> Setting up new root fs
> no fstab.sys, mounting internal defaults
> SELinux: 8192 avtab hash slots, 177803 rules.
> SELinux: 8192 avtab hash slots, 177803 rules.
> SELinux: 6 users, 12 roles, 2431 types, 118 bools, 1 sens, 1024 cats
> SELinux: 73 classes, 177803 rules
> SELinux: class kernel_service not defined in policy
> SELinux: permission open in class sock_file not defined in policy
> SELinux: permission nlmsg_tty_audit in class netlink_audit_socket not defined in policy
> SELinux: the above unknown classes and permissions will be allowed
> SELinux: Completing initialization.
> SELinux: Setting up existing superblocks.
> SELinux: initialized (dev sda2, type ext3), uses xattr
> SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
> SELinux: initialized (dev selinuxfs, type selinuxfs), uses genfs_contexts
> SELinux: initialized (dev mqueue, type mqueue), uses transition SIDs
> SELinux: initialized (dev devpts, type devpts), uses transition SIDs
> SELinux: initialized (dev inotifyfs, type inotifyfs), uses genfs_contexts
> SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
> SELinux: initialized (dev anon_inodefs, type anon_inodefs), uses genfs_contexts
> SELinux: initialized (dev pipefs, type pipefs), uses task SIDs
> SELinux: initialized (dev debugfs, type debugfs), uses genfs_contexts
> SELinux: initialized (dev sockfs, type sockfs), uses task SIDs
> SELinux: initialized (dev proc, type proc), uses genfs_contexts
> SELinux: initialized (dev bdev, type bdev), uses genfs_contexts
> SELinux: initialized (dev rootfs, type rootfs), uses genfs_contexts
> SELinux: initialized (dev sysfs, type sysfs), uses genfs_contexts
> type=1403 audit(1245336481.989:2): policy loaded auid=4294967295 ses=4294967295
> Switching to new root and running init.
> unmounting old /dev
> unmounting old /proc
> unmounting old /sys
> Welcome to Fedora
> Press 'I' to enter interactive startup.
> Starting udev: [ OK ]
> Setting hostname andromeda.procyon.org.uk: [ OK ]
> Checking filesystems
> Checking all file systems.
> [/sbin/fsck.ext3 (1) -- /] fsck.ext3 -a /dev/sda2
> /1: clean, 330519/2621440 files, 1528859/2620603 blocks
> [/sbin/fsck.ext3 (1) -- /boot] fsck.ext3 -a /dev/sda1
> /boot1: clean, 79/50200 files, 72187/200780 blocks
> [ OK ]
> Remounting root filesystem in read-write mode: [ OK ]
> Mounting local filesystems: [ OK ]
> Enabling local filesystem quotas: [ OK ]
> Enabling /etc/fstab swaps: [ OK ]
> Entering non-interactive startup
> Starting background readahead (early, fast mode): [ OK ]
> FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
> Bringing up loopback interface: [ OK ]
> Bringing up interface eth0:
> Determining IP information for eth0... done.
> [ OK ]
> FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
> Starting restorecond: [ OK ]
> Starting auditd: [ OK ]
> Starting irqbalance: [ OK ]
> Starting mcstransd: [ OK ]
> Starting rpcbind: modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> rpcbind: cannot create socket for udp6
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> rpcbind: cannot create socket for tcp6
> [ OK ]
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> Starting NFS statd: [ OK ]
> Starting system message bus: [ OK ]
> Starting lm_sensors: not configured, run sensors-detect[WARNING]
> Starting sshd: modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> [ OK ]
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> Starting ntpd: [ OK ]
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> SysRq : Changing Loglevel
> Loglevel set to 8
> Now booted
> Starting smartd: modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> [ OK ]
>
> Fedora release 9 (Sulphur)
> Kernel 2.6.30-cachefs on an x86_64 (/dev/ttyS0)
>
> andromeda.procyon.org.uk login: modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> warning: `capget01' uses 32-bit capabilities (legacy support in use)
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> Adding 65528k swap on ./swapfile01. Priority:-1 extents:141 across:498688k
> Adding 65528k swap on ./swapfile01. Priority:-1 extents:203 across:829292k
> Adding 65528k swap on ./swapfile01. Priority:-1 extents:151 across:811620k
> Unable to find swap-space signature
> Adding 32k swap on alreadyused. Priority:-1 extents:4 across:18988k
> Adding 32k swap on swapfile02. Priority:-1 extents:4 across:1064k
> Adding 32k swap on swapfile03. Priority:-2 extents:1 across:32k
> Adding 32k swap on swapfile04. Priority:-3 extents:4 across:18976k
> Adding 32k swap on swapfile05. Priority:-4 extents:2 across:44k
> Adding 32k swap on swapfile06. Priority:-5 extents:1 across:32k
> Adding 32k swap on swapfile07. Priority:-6 extents:2 across:60k
> Adding 32k swap on swapfile08. Priority:-7 extents:2 across:32k
> Adding 32k swap on swapfile09. Priority:-8 extents:1 across:32k
> Adding 32k swap on swapfile10. Priority:-9 extents:2 across:36k
> Adding 32k swap on swapfile11. Priority:-10 extents:1 across:32k
> Adding 32k swap on swapfile12. Priority:-11 extents:2 across:32k
> Adding 32k swap on swapfile13. Priority:-12 extents:1 across:32k
> Adding 32k swap on swapfile14. Priority:-13 extents:1 across:32k
> Adding 32k swap on swapfile15. Priority:-14 extents:1 across:32k
> Adding 32k swap on swapfile16. Priority:-15 extents:2 across:32k
> Adding 32k swap on swapfile17. Priority:-16 extents:1 across:32k
> Adding 32k swap on swapfile18. Priority:-17 extents:2 across:44k
> Adding 32k swap on swapfile19. Priority:-18 extents:2 across:1316k
> Adding 32k swap on swapfile20. Priority:-19 extents:2 across:32k
> Adding 32k swap on swapfile21. Priority:-20 extents:2 across:72k
> Adding 32k swap on swapfile22. Priority:-21 extents:1 across:32k
> Adding 32k swap on swapfile23. Priority:-22 extents:1 across:32k
> Adding 32k swap on swapfile24. Priority:-23 extents:3 across:44k
> Adding 32k swap on swapfile25. Priority:-24 extents:1 across:32k
> Adding 32k swap on swapfile26. Priority:-25 extents:1 across:32k
> Adding 32k swap on swapfile27. Priority:-26 extents:1 across:32k
> Adding 32k swap on swapfile28. Priority:-27 extents:2 across:32k
> Adding 32k swap on swapfile29. Priority:-28 extents:1 across:32k
> Adding 32k swap on swapfile30. Priority:-29 extents:1 across:32k
> Adding 32k swap on swapfile31. Priority:-30 extents:1 across:32k
> Adding 32k swap on firstswapfile. Priority:-31 extents:2 across:32k
> Adding 32k swap on secondswapfile. Priority:-32 extents:2 across:44k
> warning: process `sysctl01' used the deprecated sysctl system call with 1.1.
> warning: process `sysctl01' used the deprecated sysctl system call with 1.2.
> warning: process `sysctl03' used the deprecated sysctl system call with 1.1.
> warning: process `sysctl03' used the deprecated sysctl system call with 1.1.
> warning: process `sysctl04' used the deprecated sysctl system call with
> RPC: Registered udp transport module.
> RPC: Registered tcp transport module.
> Installing knfsd (copyright (C) 1996 [email protected]).
> NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
> NFSD: starting 90-second grace period
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> msgctl11 invoked oom-killer: gfp_mask=0xd0, order=1, oom_adj=0
> msgctl11 cpuset=/ mems_allowed=0
> Pid: 12411, comm: msgctl11 Not tainted 2.6.30-cachefs #107
> Call Trace:
> [<ffffffff81071612>] ? oom_kill_process.clone.0+0xa9/0x245
> [<ffffffff810736e7>] ? drain_local_pages+0x0/0x13
> [<ffffffff810718d9>] ? __out_of_memory+0x12b/0x142
> [<ffffffff8107195a>] ? out_of_memory+0x6a/0x94
> [<ffffffff81074002>] ? __alloc_pages_nodemask+0x422/0x50b
> [<ffffffff81031112>] ? copy_process+0x95/0x1158
> [<ffffffff81074155>] ? __get_free_pages+0x12/0x50
> [<ffffffff81031135>] ? copy_process+0xb8/0x1158
> [<ffffffff81081346>] ? handle_mm_fault+0x2d5/0x645
> [<ffffffff81032314>] ? do_fork+0x13f/0x2ba
> [<ffffffff81022a0b>] ? do_page_fault+0x1f1/0x206
> [<ffffffff8100b0d3>] ? stub_clone+0x13/0x20
> [<ffffffff8100ad6b>] ? system_call_fastpath+0x16/0x1b
> Mem-Info:
> DMA per-cpu:
> CPU 0: hi: 0, btch: 1 usd: 0
> CPU 1: hi: 0, btch: 1 usd: 0
> DMA32 per-cpu:
> CPU 0: hi: 186, btch: 31 usd: 57
> CPU 1: hi: 186, btch: 31 usd: 0
> Active_anon:70104 active_file:1 inactive_anon:6557
> inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
> free:4062 slab:41969 mapped:541 pagetables:59663 bounce:0
> DMA free:3920kB min:60kB low:72kB high:88kB active_anon:2268kB inactive_anon:428kB active_file:0kB inactive_file:0kB unevictable:0kB present:15364kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 968 968 968
> DMA32 free:12328kB min:3948kB low:4932kB high:5920kB active_anon:278148kB inactive_anon:25800kB active_file:4kB inactive_file:0kB unevictable:0kB present:992032kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
> DMA: 8*4kB 0*8kB 1*16kB 1*32kB 2*64kB 1*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3920kB
> DMA32: 2474*4kB 56*8kB 8*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 12328kB
> 1660 total pagecache pages
> 0 pages in swap cache
> Swap cache stats: add 0, delete 0, find 0/0
> Free swap = 0kB
> Total swap = 0kB
> 255744 pages RAM
> 5588 pages reserved
> 255749 pages shared
> 215785 pages non-shared
> Out of memory: kill process 6838 (msgctl11) score 152029 or a child
> Killed process 8850 (msgctl11)
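
For anyone who wants to check the arithmetic in the OOM report above, here is a minimal sketch (not part of the original posting) that re-adds the per-order free lists and converts a few page counts to kB. It assumes 4 kB base pages, which is the x86_64 default; the helper names are made up for illustration and are not kernel or LTP interfaces.

```python
import re

PAGE_KB = 4  # assumption: 4 kB base pages, the x86_64 default

# Lines copied from the OOM report quoted above.
DMA = ("DMA: 8*4kB 0*8kB 1*16kB 1*32kB 2*64kB 1*128kB 0*256kB "
       "1*512kB 1*1024kB 1*2048kB 0*4096kB = 3920kB")
DMA32 = ("DMA32: 2474*4kB 56*8kB 8*16kB 0*32kB 1*64kB 0*128kB 1*256kB "
         "1*512kB 1*1024kB 0*2048kB 0*4096kB = 12328kB")


def free_list_terms(line):
    """Return (count, size_kb) pairs for each 'count*sizekB' term."""
    return [(int(c), int(s)) for c, s in re.findall(r"(\d+)\*(\d+)kB", line)]


def buddy_total_kb(line):
    """Total free memory described by one free-list line, in kB."""
    return sum(c * s for c, s in free_list_terms(line))


def free_blocks_at_least(line, min_kb):
    """Number of free blocks whose size is at least min_kb."""
    return sum(c for c, s in free_list_terms(line) if s >= min_kb)


def pages_to_kb(pages):
    """Convert a page count to kB under the PAGE_KB assumption."""
    return pages * PAGE_KB


# The failing request was order=1, i.e. 2**1 contiguous pages.
order1_kb = (2 ** 1) * PAGE_KB

print("DMA total kB:   ", buddy_total_kb(DMA))
print("DMA32 total kB: ", buddy_total_kb(DMA32))
print("DMA32 blocks >= order 1:", free_blocks_at_least(DMA32, order1_kb))
print("pagetables kB:  ", pages_to_kb(59663))
print("active_anon kB: ", pages_to_kb(70104))
print("free kB:        ", pages_to_kb(4062))
```

The totals can be compared against the "= ...kB" figures on the DMA and DMA32 lines and against the per-zone free: and active_anon: values. The sketch only re-derives numbers already present in the report; it makes no claim about why the allocator invoked the OOM killer.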

2009-06-19 05:59:18

by Fengguang Wu

Subject: Re: [PATCH 0/3] make mapped executable pages the first class citizen

On Thu, Jun 18, 2009 at 10:46:52PM +0800, David Howells wrote:
>
> Hmmm.... It's possible that this makes my test box implode horribly when
> running LTP.
>
> I'm going to bisect it to see if this is actually due to your patches.
>
> Note that I don't have any swap space. This after a fresh reboot:
>
> [root@andromeda ~]# cat /proc/meminfo
> MemTotal: 1000624 kB
> MemFree: 797328 kB
> Buffers: 13272 kB
> Cached: 121744 kB
> SwapCached: 0 kB
> Active: 36240 kB
> Inactive: 115856 kB
> Active(anon): 17448 kB
> Inactive(anon): 0 kB
> Active(file): 18792 kB
> Inactive(file): 115856 kB
> Unevictable: 0 kB
> Mlocked: 0 kB
> SwapTotal: 0 kB
> SwapFree: 0 kB
> Dirty: 28 kB
> Writeback: 0 kB
> AnonPages: 17280 kB
> Mapped: 5376 kB
> Slab: 42984 kB
> SReclaimable: 6956 kB
> SUnreclaim: 36028 kB
> PageTables: 1304 kB
> NFS_Unstable: 0 kB
> Bounce: 0 kB
> WritebackTmp: 0 kB
> CommitLimit: 500312 kB
> Committed_AS: 52596 kB
> VmallocTotal: 34359738367 kB
> VmallocUsed: 190044 kB
> VmallocChunk: 34359546363 kB
> DirectMap4k: 13312 kB
> DirectMap2M: 1009664 kB
>
> David
> ---
> Initializing cgroup subsys cpuset
> Linux version 2.6.30-cachefs ([email protected]) (gcc version 4.4.0 20090506 (Red Hat 4.4.0-4) (GCC) ) #106 SMP Wed Jun 17 22:10:31 BST 2009
> Command line: initrd=andromeda-initrd console=tty0 console=ttyS0,115200 ro root=/dev/sda2 enforcing=1 debug BOOT_IMAGE=andromeda-vmlinuz
> KERNEL supported cpus:
> Intel GenuineIntel
> AMD AuthenticAMD
> Centaur CentaurHauls
> BIOS-provided physical RAM map:
> BIOS-e820: 0000000000000000 - 000000000009ec00 (usable)
> BIOS-e820: 000000000009ec00 - 00000000000a0000 (reserved)
> BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
> BIOS-e820: 0000000000100000 - 000000003e59a000 (usable)
> BIOS-e820: 000000003e59a000 - 000000003e5a6000 (reserved)
> BIOS-e820: 000000003e5a6000 - 000000003e644000 (usable)
> BIOS-e820: 000000003e644000 - 000000003e6a9000 (ACPI NVS)
> BIOS-e820: 000000003e6a9000 - 000000003e6ac000 (ACPI data)
> BIOS-e820: 000000003e6ac000 - 000000003e6f2000 (ACPI NVS)
> BIOS-e820: 000000003e6f2000 - 000000003e6ff000 (ACPI data)
> BIOS-e820: 000000003e6ff000 - 000000003e700000 (usable)
> BIOS-e820: 000000003e700000 - 000000003f000000 (reserved)
> BIOS-e820: 00000000fff00000 - 0000000100000000 (reserved)
> DMI 2.4 present.
> last_pfn = 0x3e700 max_arch_pfn = 0x400000000
> MTRR default type: uncachable
> MTRR fixed ranges enabled:
> 00000-9FFFF write-back
> A0000-FFFFF uncachable
> MTRR variable ranges enabled:
> 0 base 000000000 mask FC0000000 write-back
> 1 base 03F000000 mask FFF000000 uncachable
> 2 base 03E800000 mask FFF800000 uncachable
> 3 base 03E700000 mask FFFF00000 uncachable
> 4 disabled
> 5 disabled
> 6 disabled
> 7 disabled
> x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
> initial memory mapped : 0 - 20000000
> init_memory_mapping: 0000000000000000-000000003e700000
> 0000000000 - 003e600000 page 2M
> 003e600000 - 003e700000 page 4k
> kernel direct mapping tables up to 3e700000 @ 8000-b000
> RAMDISK: 3e2ee000 - 3e57991c
> ACPI: RSDP 00000000000fe020 00014 (v00 INTEL )
> ACPI: RSDT 000000003e6fd038 0004C (v01 INTEL DG965RY 00000330 01000013)
> ACPI: FACP 000000003e6fc000 00074 (v01 INTEL DG965RY 00000330 MSFT 01000013)
> ACPI: DSDT 000000003e6f8000 03EDA (v01 INTEL DG965RY 00000330 MSFT 01000013)
> ACPI: FACS 000000003e6ac000 00040
> ACPI: APIC 000000003e6f7000 00078 (v01 INTEL DG965RY 00000330 MSFT 01000013)
> ACPI: WDDT 000000003e6f6000 00040 (v01 INTEL DG965RY 00000330 MSFT 01000013)
> ACPI: MCFG 000000003e6f5000 0003C (v01 INTEL DG965RY 00000330 MSFT 01000013)
> ACPI: ASF! 000000003e6f4000 000A6 (v32 INTEL DG965RY 00000330 MSFT 01000013)
> ACPI: SSDT 000000003e6f3000 001BC (v01 INTEL CpuPm 00000330 MSFT 01000013)
> ACPI: SSDT 000000003e6f2000 00175 (v01 INTEL Cpu0Ist 00000330 MSFT 01000013)
> ACPI: SSDT 000000003e6ab000 00175 (v01 INTEL Cpu1Ist 00000330 MSFT 01000013)
> ACPI: SSDT 000000003e6aa000 00175 (v01 INTEL Cpu2Ist 00000330 MSFT 01000013)
> ACPI: SSDT 000000003e6a9000 00175 (v01 INTEL Cpu3Ist 00000330 MSFT 01000013)
> ACPI: Local APIC address 0xfee00000
> (7 early reservations) ==> bootmem [0000000000 - 003e700000]
> #0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000]
> #1 [0000006000 - 0000008000] TRAMPOLINE ==> [0000006000 - 0000008000]
> #2 [0001000000 - 0001535d90] TEXT DATA BSS ==> [0001000000 - 0001535d90]
> #3 [003e2ee000 - 003e57991c] RAMDISK ==> [003e2ee000 - 003e57991c]
> #4 [000009e800 - 0000100000] BIOS reserved ==> [000009e800 - 0000100000]
> #5 [0001536000 - 0001536199] BRK ==> [0001536000 - 0001536199]
> #6 [0000008000 - 0000009000] PGTABLE ==> [0000008000 - 0000009000]
> found SMP MP-table at [ffff8800000fe200] fe200
> [ffffea0000000000-ffffea0000dfffff] PMD -> [ffff880001a00000-ffff8800027fffff] on node 0
> Zone PFN ranges:
> DMA 0x00000000 -> 0x00001000
> DMA32 0x00001000 -> 0x00100000
> Normal 0x00100000 -> 0x00100000
> Movable zone start PFN for each node
> early_node_map[4] active PFN ranges
> 0: 0x00000000 -> 0x0000009e
> 0: 0x00000100 -> 0x0003e59a
> 0: 0x0003e5a6 -> 0x0003e644
> 0: 0x0003e6ff -> 0x0003e700
> On node 0 totalpages: 255447
> DMA zone: 56 pages used for memmap
> DMA zone: 101 pages reserved
> DMA zone: 3841 pages, LIFO batch:0
> DMA32 zone: 3441 pages used for memmap
> DMA32 zone: 248008 pages, LIFO batch:31
> ACPI: PM-Timer IO Port: 0x408
> ACPI: Local APIC address 0xfee00000
> ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
> ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
> ACPI: LAPIC (acpi_id[0x03] lapic_id[0x82] disabled)
> ACPI: LAPIC (acpi_id[0x04] lapic_id[0x83] disabled)
> ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
> ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
> IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
> ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
> ACPI: IRQ0 used by override.
> ACPI: IRQ2 used by override.
> ACPI: IRQ9 used by override.
> Using ACPI (MADT) for SMP configuration information
> 4 Processors exceeds NR_CPUS limit of 2
> SMP: Allowing 2 CPUs, 0 hotplug CPUs
> nr_irqs_gsi: 24
> PM: Registered nosave memory: 000000000009e000 - 000000000009f000
> PM: Registered nosave memory: 000000000009f000 - 00000000000a0000
> PM: Registered nosave memory: 00000000000a0000 - 00000000000e0000
> PM: Registered nosave memory: 00000000000e0000 - 0000000000100000
> PM: Registered nosave memory: 000000003e59a000 - 000000003e5a6000
> PM: Registered nosave memory: 000000003e644000 - 000000003e6a9000
> PM: Registered nosave memory: 000000003e6a9000 - 000000003e6ac000
> PM: Registered nosave memory: 000000003e6ac000 - 000000003e6f2000
> PM: Registered nosave memory: 000000003e6f2000 - 000000003e6ff000
> Allocating PCI resources starting at 3f000000 (gap: 3f000000:c0f00000)
> NR_CPUS:2 nr_cpumask_bits:2 nr_cpu_ids:2 nr_node_ids:1
> PERCPU: Embedded 24 pages at ffff880001541000, static data 67296 bytes
> Built 1 zonelists in Zone order, mobility grouping on. Total pages: 251849
> Kernel command line: initrd=andromeda-initrd console=tty0 console=ttyS0,115200 ro root=/dev/sda2 enforcing=1 debug BOOT_IMAGE=andromeda-vmlinuz
> PID hash table entries: 4096 (order: 12, 32768 bytes)
> Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
> Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
> Initializing CPU#0
> Checking aperture...
> No AGP bridge found
> Memory: 996952k/1022976k available (2953k kernel code, 1188k absent, 24132k reserved, 1678k data, 360k init)
> NR_IRQS:320
> Fast TSC calibration using PIT
> Detected 1864.978 MHz processor.
> Console: colour VGA+ 80x25
> console [tty0] enabled
> console [ttyS0] enabled
> Calibrating delay loop (skipped), value calculated using timer frequency.. 3729.95 BogoMIPS (lpj=7459912)
> Security Framework initialized
> SELinux: Initializing.
> SELinux: Starting in enforcing mode
> Mount-cache hash table entries: 256
> Initializing cgroup subsys debug
> Initializing cgroup subsys ns
> Initializing cgroup subsys devices
> CPU: L1 I cache: 32K, L1 D cache: 32K
> CPU: L2 cache: 2048K
> CPU: Physical Processor ID: 0
> CPU: Processor Core ID: 0
> mce: CPU supports 6 MCE banks
> CPU0: Thermal monitoring enabled (TM2)
> using mwait in idle threads.
> ACPI: Core revision 20090521
> Setting APIC routing to flat
> ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
> CPU0: Intel(R) Core(TM)2 CPU 6300 @ 1.86GHz stepping 06
> Booting processor 1 APIC 0x1 ip 0x6000
> Initializing CPU#1
> Calibrating delay using timer specific routine.. 3525.06 BogoMIPS (lpj=7050122)
> CPU: L1 I cache: 32K, L1 D cache: 32K
> CPU: L2 cache: 2048K
> CPU: Physical Processor ID: 0
> CPU: Processor Core ID: 1
> mce: CPU supports 6 MCE banks
> CPU1: Thermal monitoring enabled (TM2)
> x86 PAT enabled: cpu 1, old 0x7040600070406, new 0x7010600070106
> CPU1: Intel(R) Core(TM)2 CPU 6300 @ 1.86GHz stepping 06
> checking TSC synchronization [CPU#0 -> CPU#1]: passed.
> Brought up 2 CPUs
> Total of 2 processors activated (7255.01 BogoMIPS).
> NET: Registered protocol family 16
> ACPI: bus type pci registered
> PCI: MCFG configuration 0: base f0000000 segment 0 buses 0 - 127
> PCI: Not using MMCONFIG.
> PCI: Using configuration type 1 for base access
> bio: create slab <bio-0> at 0
> ACPI: EC: Look up EC in DSDT
> ACPI: Interpreter enabled
> ACPI: (supports S0 S3 S4 S5)
> ACPI: Using IOAPIC for interrupt routing
> PCI: MCFG configuration 0: base f0000000 segment 0 buses 0 - 127
> PCI: MCFG area at f0000000 reserved in ACPI motherboard resources
> PCI: Using MMCONFIG at f0000000 - f7ffffff
> ACPI: No dock devices found.
> ACPI: PCI Root Bridge [PCI0] (0000:00)
> pci 0000:00:02.0: reg 10 32bit mmio: [0x50200000-0x502fffff]
> pci 0000:00:02.0: reg 18 64bit mmio: [0x40000000-0x4fffffff]
> pci 0000:00:02.0: reg 20 io port: [0x2110-0x2117]
> pci 0000:00:03.0: reg 10 64bit mmio: [0x50326100-0x5032610f]
> pci 0000:00:03.0: PME# supported from D0 D3hot D3cold
> pci 0000:00:03.0: PME# disabled
> pci 0000:00:19.0: reg 10 32bit mmio: [0x50300000-0x5031ffff]
> pci 0000:00:19.0: reg 14 32bit mmio: [0x50324000-0x50324fff]
> pci 0000:00:19.0: reg 18 io port: [0x20e0-0x20ff]
> pci 0000:00:19.0: PME# supported from D0 D3hot D3cold
> pci 0000:00:19.0: PME# disabled
> pci 0000:00:1a.0: reg 20 io port: [0x20c0-0x20df]
> pci 0000:00:1a.1: reg 20 io port: [0x20a0-0x20bf]
> pci 0000:00:1a.7: reg 10 32bit mmio: [0x50325c00-0x50325fff]
> pci 0000:00:1a.7: PME# supported from D0 D3hot D3cold
> pci 0000:00:1a.7: PME# disabled
> pci 0000:00:1b.0: reg 10 64bit mmio: [0x50320000-0x50323fff]
> pci 0000:00:1b.0: PME# supported from D0 D3hot D3cold
> pci 0000:00:1b.0: PME# disabled
> pci 0000:00:1c.0: PME# supported from D0 D3hot D3cold
> pci 0000:00:1c.0: PME# disabled
> pci 0000:00:1c.1: PME# supported from D0 D3hot D3cold
> pci 0000:00:1c.1: PME# disabled
> pci 0000:00:1c.2: PME# supported from D0 D3hot D3cold
> pci 0000:00:1c.2: PME# disabled
> pci 0000:00:1c.3: PME# supported from D0 D3hot D3cold
> pci 0000:00:1c.3: PME# disabled
> pci 0000:00:1c.4: PME# supported from D0 D3hot D3cold
> pci 0000:00:1c.4: PME# disabled
> pci 0000:00:1d.0: reg 20 io port: [0x2080-0x209f]
> pci 0000:00:1d.1: reg 20 io port: [0x2060-0x207f]
> pci 0000:00:1d.2: reg 20 io port: [0x2040-0x205f]
> pci 0000:00:1d.7: reg 10 32bit mmio: [0x50325800-0x50325bff]
> pci 0000:00:1d.7: PME# supported from D0 D3hot D3cold
> pci 0000:00:1d.7: PME# disabled
> pci 0000:00:1f.0: quirk: region 0400-047f claimed by ICH6 ACPI/GPIO/TCO
> pci 0000:00:1f.0: quirk: region 0500-053f claimed by ICH6 GPIO
> pci 0000:00:1f.0: ICH7 LPC Generic IO decode 1 PIO at 0680 (mask 007f)
> pci 0000:00:1f.2: reg 10 io port: [0x2108-0x210f]
> pci 0000:00:1f.2: reg 14 io port: [0x211c-0x211f]
> pci 0000:00:1f.2: reg 18 io port: [0x2100-0x2107]
> pci 0000:00:1f.2: reg 1c io port: [0x2118-0x211b]
> pci 0000:00:1f.2: reg 20 io port: [0x2020-0x203f]
> pci 0000:00:1f.2: reg 24 32bit mmio: [0x50325000-0x503257ff]
> pci 0000:00:1f.2: PME# supported from D3hot
> pci 0000:00:1f.2: PME# disabled
> pci 0000:00:1f.3: reg 10 32bit mmio: [0x50326000-0x503260ff]
> pci 0000:00:1f.3: reg 20 io port: [0x2000-0x201f]
> pci 0000:00:1c.0: bridge 32bit mmio: [0x50400000-0x504fffff]
> pci 0000:02:00.0: reg 10 io port: [0x1018-0x101f]
> pci 0000:02:00.0: reg 14 io port: [0x1024-0x1027]
> pci 0000:02:00.0: reg 18 io port: [0x1010-0x1017]
> pci 0000:02:00.0: reg 1c io port: [0x1020-0x1023]
> pci 0000:02:00.0: reg 20 io port: [0x1000-0x100f]
> pci 0000:02:00.0: reg 24 32bit mmio: [0x50100000-0x501001ff]
> pci 0000:02:00.0: supports D1
> pci 0000:02:00.0: PME# supported from D0 D1 D3hot
> pci 0000:02:00.0: PME# disabled
> pci 0000:00:1c.1: bridge io port: [0x1000-0x1fff]
> pci 0000:00:1c.1: bridge 32bit mmio: [0x50100000-0x501fffff]
> pci 0000:00:1c.2: bridge 32bit mmio: [0x50500000-0x505fffff]
> pci 0000:00:1c.3: bridge 32bit mmio: [0x50600000-0x506fffff]
> pci 0000:00:1c.4: bridge 32bit mmio: [0x50700000-0x507fffff]
> pci 0000:06:03.0: reg 10 32bit mmio: [0x50004000-0x500047ff]
> pci 0000:06:03.0: reg 14 32bit mmio: [0x50000000-0x50003fff]
> pci 0000:06:03.0: supports D1 D2
> pci 0000:06:03.0: PME# supported from D0 D1 D2 D3hot
> pci 0000:06:03.0: PME# disabled
> pci 0000:00:1e.0: transparent bridge
> pci 0000:00:1e.0: bridge 32bit mmio: [0x50000000-0x500fffff]
> pci_bus 0000:00: on NUMA node 0
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P32_._PRT]
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX0._PRT]
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX1._PRT]
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX2._PRT]
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX3._PRT]
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX4._PRT]
> ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 7 9 10 *11 12)
> ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 7 9 *10 11 12)
> ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 7 9 10 *11 12)
> ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 7 9 10 *11 12)
> ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 7 *9 10 11 12)
> ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 7 9 *10 11 12)
> ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 7 *9 10 11 12)
> ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 7 9 10 *11 12)
> SCSI subsystem initialized
> libata version 3.00 loaded.
> PCI: Using ACPI for IRQ routing
> NetLabel: Initializing
> NetLabel: domain hash size = 128
> NetLabel: protocols = UNLABELED CIPSOv4
> NetLabel: unlabeled traffic allowed by default
> pnp: PnP ACPI init
> ACPI: bus type pnp registered
> pnp: PnP ACPI: found 12 devices
> ACPI: ACPI bus type pnp unregistered
> system 00:01: iomem range 0xf0000000-0xf7ffffff has been reserved
> system 00:01: iomem range 0xfed13000-0xfed13fff has been reserved
> system 00:01: iomem range 0xfed14000-0xfed17fff has been reserved
> system 00:01: iomem range 0xfed18000-0xfed18fff has been reserved
> system 00:01: iomem range 0xfed19000-0xfed19fff has been reserved
> system 00:01: iomem range 0xfed1c000-0xfed1ffff has been reserved
> system 00:01: iomem range 0xfed20000-0xfed3ffff has been reserved
> system 00:01: iomem range 0xfed45000-0xfed99fff has been reserved
> system 00:01: iomem range 0xc0000-0xdffff has been reserved
> system 00:01: iomem range 0xe0000-0xfffff could not be reserved
> system 00:06: ioport range 0x500-0x53f has been reserved
> system 00:06: ioport range 0x400-0x47f has been reserved
> system 00:06: ioport range 0x680-0x6ff has been reserved
> pci 0000:00:1c.0: PCI bridge, secondary bus 0000:01
> pci 0000:00:1c.0: IO window: disabled
> pci 0000:00:1c.0: MEM window: 0x50400000-0x504fffff
> pci 0000:00:1c.0: PREFETCH window: disabled
> pci 0000:00:1c.1: PCI bridge, secondary bus 0000:02
> pci 0000:00:1c.1: IO window: 0x1000-0x1fff
> pci 0000:00:1c.1: MEM window: 0x50100000-0x501fffff
> pci 0000:00:1c.1: PREFETCH window: disabled
> pci 0000:00:1c.2: PCI bridge, secondary bus 0000:03
> pci 0000:00:1c.2: IO window: disabled
> pci 0000:00:1c.2: MEM window: 0x50500000-0x505fffff
> pci 0000:00:1c.2: PREFETCH window: disabled
> pci 0000:00:1c.3: PCI bridge, secondary bus 0000:04
> pci 0000:00:1c.3: IO window: disabled
> pci 0000:00:1c.3: MEM window: 0x50600000-0x506fffff
> pci 0000:00:1c.3: PREFETCH window: disabled
> pci 0000:00:1c.4: PCI bridge, secondary bus 0000:05
> pci 0000:00:1c.4: IO window: disabled
> pci 0000:00:1c.4: MEM window: 0x50700000-0x507fffff
> pci 0000:00:1c.4: PREFETCH window: disabled
> pci 0000:00:1e.0: PCI bridge, secondary bus 0000:06
> pci 0000:00:1e.0: IO window: disabled
> pci 0000:00:1e.0: MEM window: 0x50000000-0x500fffff
> pci 0000:00:1e.0: PREFETCH window: disabled
> pci 0000:00:1c.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
> pci 0000:00:1c.0: setting latency timer to 64
> pci 0000:00:1c.1: PCI INT B -> GSI 16 (level, low) -> IRQ 16
> pci 0000:00:1c.1: setting latency timer to 64
> pci 0000:00:1c.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18
> pci 0000:00:1c.2: setting latency timer to 64
> pci 0000:00:1c.3: PCI INT D -> GSI 19 (level, low) -> IRQ 19
> pci 0000:00:1c.3: setting latency timer to 64
> pci 0000:00:1c.4: PCI INT A -> GSI 17 (level, low) -> IRQ 17
> pci 0000:00:1c.4: setting latency timer to 64
> pci 0000:00:1e.0: setting latency timer to 64
> pci_bus 0000:00: resource 0 io: [0x00-0xffff]
> pci_bus 0000:00: resource 1 mem: [0x000000-0xffffffffffffffff]
> pci_bus 0000:01: resource 1 mem: [0x50400000-0x504fffff]
> pci_bus 0000:02: resource 0 io: [0x1000-0x1fff]
> pci_bus 0000:02: resource 1 mem: [0x50100000-0x501fffff]
> pci_bus 0000:03: resource 1 mem: [0x50500000-0x505fffff]
> pci_bus 0000:04: resource 1 mem: [0x50600000-0x506fffff]
> pci_bus 0000:05: resource 1 mem: [0x50700000-0x507fffff]
> pci_bus 0000:06: resource 1 mem: [0x50000000-0x500fffff]
> pci_bus 0000:06: resource 3 io: [0x00-0xffff]
> pci_bus 0000:06: resource 4 mem: [0x000000-0xffffffffffffffff]
> NET: Registered protocol family 2
> IP route cache hash table entries: 32768 (order: 6, 262144 bytes)
> TCP established hash table entries: 131072 (order: 9, 2097152 bytes)
> TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
> TCP: Hash tables configured (established 131072 bind 65536)
> TCP reno registered
> NET: Registered protocol family 1
> Unpacking initramfs...
> Freeing initrd memory: 2606k freed
> audit: initializing netlink socket (disabled)
> type=2000 audit(1245320564.157:1): initialized
> VFS: Disk quotas dquot_6.5.2
> Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
> SGI XFS with ACLs, security attributes, large block/inode numbers, no debug enabled
> msgmni has been set to 1953
> SELinux: Registering netfilter hooks
> alg: No test for fcrypt (fcrypt-generic)
> alg: No test for stdrng (krng)
> Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253)
> io scheduler noop registered
> io scheduler anticipatory registered (default)
> io scheduler deadline registered
> io scheduler cfq registered
> pci 0000:00:02.0: Boot video device
> pcieport-driver 0000:00:1c.0: irq 24 for MSI/MSI-X
> pcieport-driver 0000:00:1c.0: setting latency timer to 64
> pcieport-driver 0000:00:1c.1: irq 25 for MSI/MSI-X
> pcieport-driver 0000:00:1c.1: setting latency timer to 64
> pcieport-driver 0000:00:1c.2: irq 26 for MSI/MSI-X
> pcieport-driver 0000:00:1c.2: setting latency timer to 64
> pcieport-driver 0000:00:1c.3: irq 27 for MSI/MSI-X
> pcieport-driver 0000:00:1c.3: setting latency timer to 64
> pcieport-driver 0000:00:1c.4: irq 28 for MSI/MSI-X
> pcieport-driver 0000:00:1c.4: setting latency timer to 64
> input: Power Button as /class/input/input0
> ACPI: Power Button [PWRF]
> input: Sleep Button as /class/input/input1
> ACPI: Sleep Button [SLPB]
> processor ACPI_CPU:00: registered as cooling_device0
> ACPI: Processor [CPU0] (supports 8 throttling states)
> processor ACPI_CPU:01: registered as cooling_device1
> ACPI: Processor [CPU1] (supports 8 throttling states)
> Linux agpgart interface v0.103
> agpgart-intel 0000:00:00.0: Intel 965G Chipset
> agpgart-intel 0000:00:00.0: detected 7676K stolen memory
> agpgart-intel 0000:00:00.0: AGP aperture is 256M @ 0x40000000
> intelfb: Framebuffer driver for Intel(R) 830M/845G/852GM/855GM/865G/915G/915GM/945G/945GM/945GME/965G/965GM chipsets
> intelfb: Version 0.9.6
> intelfb 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
> intelfb: 00:02.0: Intel(R) 965G, aperture size 256MB, stolen memory 7932kB
> intelfb: Initial video mode is 1024x768-32@70.
> Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
> Platform driver 'serial8250' needs updating - please use dev_pm_ops
> 00:0a: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
> loop: module loaded
> Driver 'sd' needs updating - please use bus_type methods
> ahci 0000:00:1f.2: version 3.0
> ahci 0000:00:1f.2: PCI INT A -> GSI 19 (level, low) -> IRQ 19
> ahci 0000:00:1f.2: irq 29 for MSI/MSI-X
> ahci 0000:00:1f.2: AHCI 0001.0100 32 slots 4 ports 3 Gbps 0x33 impl SATA mode
> ahci 0000:00:1f.2: flags: 64bit ncq sntf led clo pio slum part ems
> ahci 0000:00:1f.2: setting latency timer to 64
> scsi0 : ahci
> scsi1 : ahci
> scsi2 : ahci
> scsi3 : ahci
> scsi4 : ahci
> scsi5 : ahci
> ata1: SATA max UDMA/133 abar m2048@0x50325000 port 0x50325100 irq 29
> ata2: SATA max UDMA/133 abar m2048@0x50325000 port 0x50325180 irq 29
> ata3: DUMMY
> ata4: DUMMY
> ata5: SATA max UDMA/133 abar m2048@0x50325000 port 0x50325300 irq 29
> ata6: SATA max UDMA/133 abar m2048@0x50325000 port 0x50325380 irq 29
> e1000e: Intel(R) PRO/1000 Network Driver - 1.0.2-k2
> e1000e: Copyright (c) 1999-2008 Intel Corporation.
> e1000e 0000:00:19.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20
> e1000e 0000:00:19.0: setting latency timer to 64
> e1000e 0000:00:19.0: irq 30 for MSI/MSI-X
> 0000:00:19.0: eth0: (PCI Express:2.5GB/s:Width x1) 00:16:76:ce:3a:3c
> 0000:00:19.0: eth0: Intel(R) PRO/1000 Network Connection
> 0000:00:19.0: eth0: MAC: 6, PHY: 6, PBA No: ffffff-0ff
> PNP: PS/2 Controller [PNP0303:PS2K,PNP0f03:PS2M] at 0x60,0x64 irq 1,12
> Platform driver 'i8042' needs updating - please use dev_pm_ops
> serio: i8042 KBD port at 0x60,0x64 irq 1
> serio: i8042 AUX port at 0x60,0x64 irq 12
> mice: PS/2 mouse device common for all mice
> rtc_cmos 00:03: RTC can wake from S4
> rtc_cmos 00:03: rtc core: registered rtc_cmos as rtc0
> rtc0: alarms up to one month, 114 bytes nvram
> i2c /dev entries driver
> i801_smbus 0000:00:1f.3: PCI INT B -> GSI 21 (level, low) -> IRQ 21
> coretemp coretemp.0: Using relative temperature scale!
> coretemp coretemp.1: Using relative temperature scale!
> cpuidle: using governor ladder
> ip_tables: (C) 2000-2006 Netfilter Core Team
> TCP cubic registered
> input: AT Translated Set 2 keyboard as /class/input/input2
> NET: Registered protocol family 17
> ata2: SATA link down (SStatus 0 SControl 300)
> ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> registered taskstats version 1
> ata6: SATA link down (SStatus 0 SControl 300)
> ata5: SATA link down (SStatus 0 SControl 300)
> rtc_cmos 00:03: setting system clock to 2009-06-18 10:22:46 UTC (1245320566)
> ata1.00: ATA-7: ST380211AS, 3.AAE, max UDMA/133
> ata1.00: 156301488 sectors, multi 0: LBA48 NCQ (depth 31/32)
> ata1.00: configured for UDMA/133
> scsi 0:0:0:0: Direct-Access ATA ST380211AS 3.AA PQ: 0 ANSI: 5
> sd 0:0:0:0: [sda] 156301488 512-byte hardware sectors: (80.0 GB/74.5 GiB)
> sd 0:0:0:0: [sda] Write Protect is off
> sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 >
> sd 0:0:0:0: [sda] Attached SCSI disk
> Freeing unused kernel memory: 360k freed
> Write protecting the kernel read-only data: 4324k
> Red Hat nash version 6.0.52 starting
> Mounting proc filesystem
> Mounting sysfs filesystem
> Creating /dev
> Creating initial device nodes
> Setting up hotplug.
> input: ImPS/2 Generic Wheel Mouse as /class/input/input3
> Creating block device nodes.
> mount: could not find filesystem '/proc/bus/usb'
> Waiting for driver initialization.
> Waiting for driver initialization.
> Creating root device.
> Mounting root filesystem.
> EXT3-fs: INFO: recovery required on readonly filesystem.
> EXT3-fs: write access will be enabled during recovery.
> kjournald starting. Commit interval 5 seconds
> Setting up otherEXT3-fs: recovery complete.
> filesystems.
> EXT3-fs: mounted filesystem with writeback data mode.
> Setting up new root fs
> no fstab.sys, mounting internal defaults
> SELinux: 8192 avtab hash slots, 177803 rules.
> SELinux: 8192 avtab hash slots, 177803 rules.
> SELinux: 6 users, 12 roles, 2431 types, 118 bools, 1 sens, 1024 cats
> SELinux: 73 classes, 177803 rules
> SELinux: class kernel_service not defined in policy
> SELinux: permission open in class sock_file not defined in policy
> SELinux: permission nlmsg_tty_audit in class netlink_audit_socket not defined in policy
> SELinux: the above unknown classes and permissions will be allowed
> SELinux: Completing initialization.
> SELinux: Setting up existing superblocks.
> SELinux: initialized (dev sda2, type ext3), uses xattr
> SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
> SELinux: initialized (dev selinuxfs, type selinuxfs), uses genfs_contexts
> SELinux: initialized (dev mqueue, type mqueue), uses transition SIDs
> SELinux: initialized (dev devpts, type devpts), uses transition SIDs
> SELinux: initialized (dev inotifyfs, type inotifyfs), uses genfs_contexts
> SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
> SELinux: initialized (dev anon_inodefs, type anon_inodefs), uses genfs_contexts
> SELinux: initialized (dev pipefs, type pipefs), uses task SIDs
> SELinux: initialized (dev debugfs, type debugfs), uses genfs_contexts
> SELinux: initialized (dev sockfs, type sockfs), uses task SIDs
> SELinux: initialized (dev proc, type proc), uses genfs_contexts
> SELinux: initialized (dev bdev, type bdev), uses genfs_contexts
> SELinux: initialized (dev rootfs, type rootfs), uses genfs_contexts
> SELinux: initialized (dev sysfs, type sysfs), uses genfs_contexts
> type=1403 audit(1245320574.561:2): policy loaded auid=4294967295 ses=4294967295
> Switching to new root and running init.
> unmounting old /dev
> unmounting old /proc
> unmounting old /sys
> Welcome to Fedora
> Press 'I' to enter interactive startup.
> Starting udev: [ OK ]
> Setting hostname andromeda.procyon.org.uk: [ OK ]
> Checking filesystems
> Checking all file systems.
> [/sbin/fsck.ext3 (1) -- /] fsck.ext3 -a /dev/sda2
> /1: clean, 330515/2621440 files, 1528849/2620603 blocks
> [/sbin/fsck.ext3 (1) -- /boot] fsck.ext3 -a /dev/sda1
> /boot1: recovering journal
> /boot1: clean, 79/50200 files, 72187/200780 blocks
> [ OK ]
> Remounting root filesystem in read-write mode: [ OK ]
> Mounting local filesystems: [ OK ]
> Enabling local filesystem quotas: [ OK ]
> Enabling /etc/fstab swaps: [ OK ]
> Entering non-interactive startup
> Starting background readahead (early, fast mode): [ OK ]
> FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
> Bringing up loopback interface: [ OK ]
> Bringing up interface eth0:
> Determining IP information for eth0... done.
> [ OK ]
> FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
> Starting restorecond: [ OK ]
> Starting auditd: [ OK ]
> Starting irqbalance: [ OK ]
> Starting mcstransd: [ OK ]
> Starting rpcbind: modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> rpcbind: cannot create socket for udp6
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> rpcbind: cannot create socket for tcp6
> [ OK ]
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> Starting NFS statd: [ OK ]
> Starting system message bus: [ OK ]
> Starting lm_sensors: not configured, run sensors-detect[WARNING]
> Starting sshd: modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> [ OK ]
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> Starting ntpd: [ OK ]
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> SysRq : Changing Loglevel
> Loglevel set to 8
> Now booted
> Starting smartd: modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> [ OK ]
>
> Fedora release 9 (Sulphur)
> Kernel 2.6.30-cachefs on an x86_64 (/dev/ttyS0)
>
> andromeda.procyon.org.uk login: modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> warning: `capget01' uses 32-bit capabilities (legacy support in use)
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> msgctl11 invoked oom-killer: gfp_mask=0xd0, order=1, oom_adj=0
> msgctl11 cpuset=/ mems_allowed=0
> Pid: 30549, comm: msgctl11 Not tainted 2.6.30-cachefs #106
> Call Trace:
> [<ffffffff81071dae>] ? oom_kill_process.clone.0+0xa9/0x245
> [<ffffffff81072075>] ? __out_of_memory+0x12b/0x142
> [<ffffffff810720f6>] ? out_of_memory+0x6a/0x94
> [<ffffffff8107479e>] ? __alloc_pages_nodemask+0x422/0x50b
> [<ffffffff81031110>] ? copy_process+0x93/0x113f
> [<ffffffff810748f1>] ? __get_free_pages+0x12/0x50
> [<ffffffff81031130>] ? copy_process+0xb3/0x113f
> [<ffffffff81081ae2>] ? handle_mm_fault+0x2d5/0x645
> [<ffffffff810322fb>] ? do_fork+0x13f/0x2ba
> [<ffffffff81022a0b>] ? do_page_fault+0x1f1/0x206
> [<ffffffff8100b0d3>] ? stub_clone+0x13/0x20
> [<ffffffff8100ad6b>] ? system_call_fastpath+0x16/0x1b
> Mem-Info:
> DMA per-cpu:
> CPU 0: hi: 0, btch: 1 usd: 0
> CPU 1: hi: 0, btch: 1 usd: 0
> DMA32 per-cpu:
> CPU 0: hi: 186, btch: 31 usd: 0
> CPU 1: hi: 186, btch: 31 usd: 47
> Active_anon:80388 active_file:0 inactive_anon:822
> inactive_file:2 unevictable:0 dirty:0 writeback:0 unstable:0
> free:2053 slab:38793 mapped:357 pagetables:60476 bounce:0
> DMA free:3916kB min:60kB low:72kB high:88kB active_anon:3608kB inactive_anon:128kB active_file:0kB inactive_file:0kB unevictable:0kB present:15364kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 968 968 968
> DMA32 free:4296kB min:3948kB low:4932kB high:5920kB active_anon:317944kB inactive_anon:3160kB active_file:0kB inactive_file:8kB unevictable:0kB present:992032kB pages_scanned:0 all_unreclaimable? no

There are hardly any active/inactive_file pages, so it's not likely to
be caused by Rik's patches or mine.

> lowmem_reserve[]: 0 0 0 0
> DMA: 1*4kB 1*8kB 0*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3916kB
> DMA32: 576*4kB 15*8kB 1*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 4296kB

There are plenty of free pages. Is it a page allocator bug? Is it
stable v2.6.30 or pre 2.6.31-rc1?
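
One way to sanity-check the free-page numbers is to parse the buddy summary line from the log and see how much of the free memory sits in blocks large enough for the failing order=1 request. This is only a rough sketch, assuming 4kB pages; the helper names are made up for illustration, and it does not model the allocator's watermark checks.

```python
import re

def parse_buddy_line(line):
    """Return {block_size_kb: free_block_count} from a 'N*SkB ...' summary line."""
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)}

def free_kb(blocks, min_order=0, page_kb=4):
    """Total free kB held in blocks of at least 2**min_order pages."""
    smallest = page_kb << min_order
    return sum(size * count for size, count in blocks.items() if size >= smallest)

# DMA32 line copied from the first OOM report above.
dma32 = parse_buddy_line(
    "DMA32: 576*4kB 15*8kB 1*16kB 0*32kB 1*64kB 0*128kB "
    "1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 4296kB")

print("total free kB:", free_kb(dma32))
print("free kB in order>=1 blocks:", free_kb(dma32, min_order=1))
```

The first figure should agree with the "= 4296kB" total printed by the kernel. The second can be compared against the zone's min watermark (3948kB in the log), though how the allocator actually weighs the two is not something this sketch tries to reproduce.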

Thanks,
Fengguang

> 1854 total pagecache pages
> 0 pages in swap cache
> Swap cache stats: add 0, delete 0, find 0/0
> Free swap = 0kB
> Total swap = 0kB
> 255744 pages RAM
> 5588 pages reserved
> 230698 pages shared
> 217103 pages non-shared
> Out of memory: kill process 25166 (msgctl11) score 133496 or a child
> Killed process 28855 (msgctl11)
> msgctl11 invoked oom-killer: gfp_mask=0xd0, order=1, oom_adj=0
> msgctl11 cpuset=/ mems_allowed=0
> Pid: 30312, comm: msgctl11 Not tainted 2.6.30-cachefs #106
> Call Trace:
> [<ffffffff81071dae>] ? oom_kill_process.clone.0+0xa9/0x245
> [<ffffffff81072075>] ? __out_of_memory+0x12b/0x142
> [<ffffffff810720f6>] ? out_of_memory+0x6a/0x94
> [<ffffffff8107479e>] ? __alloc_pages_nodemask+0x422/0x50b
> [<ffffffff81031110>] ? copy_process+0x93/0x113f
> [<ffffffff810748f1>] ? __get_free_pages+0x12/0x50
> [<ffffffff81031130>] ? copy_process+0xb3/0x113f
> [<ffffffff81029a83>] ? update_curr+0x53/0xdf
> [<ffffffff81081e00>] ? handle_mm_fault+0x5f3/0x645
> [<ffffffff810322fb>] ? do_fork+0x13f/0x2ba
> [<ffffffff81022a0b>] ? do_page_fault+0x1f1/0x206
> [<ffffffff8100b0d3>] ? stub_clone+0x13/0x20
> [<ffffffff8100ad6b>] ? system_call_fastpath+0x16/0x1b
> Mem-Info:
> DMA per-cpu:
> CPU 0: hi: 0, btch: 1 usd: 0
> CPU 1: hi: 0, btch: 1 usd: 0
> DMA32 per-cpu:
> CPU 0: hi: 186, btch: 31 usd: 0
> CPU 1: hi: 186, btch: 31 usd: 0
> Active_anon:79646 active_file:2 inactive_anon:4113
> inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
> free:1966 slab:38417 mapped:2 pagetables:61720 bounce:0
> DMA free:3916kB min:60kB low:72kB high:88kB active_anon:3608kB inactive_anon:256kB active_file:0kB inactive_file:0kB unevictable:0kB present:15364kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 968 968 968
> DMA32 free:3948kB min:3948kB low:4932kB high:5920kB active_anon:314976kB inactive_anon:16196kB active_file:8kB inactive_file:0kB unevictable:0kB present:992032kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
> DMA: 1*4kB 1*8kB 0*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3916kB
> DMA32: 443*4kB 20*8kB 10*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 3948kB
> 36 total pagecache pages
> 0 pages in swap cache
> Swap cache stats: add 0, delete 0, find 0/0
> Free swap = 0kB
> Total swap = 0kB
> 255744 pages RAM
> 5588 pages reserved
> 151665 pages shared
> 220702 pages non-shared
> Out of memory: kill process 25166 (msgctl11) score 133404 or a child
> Killed process 28860 (msgctl11)

2009-06-19 08:07:35

by David Howells

Subject: Re: [PATCH 0/3] make mapped executable pages the first class citizen

Wu Fengguang <[email protected]> wrote:

> There are plenty of free pages. Is it a page allocator bug? Is it
> stable v2.6.30 or pre 2.6.31-rc1?

Cutting edge Linus after I pulled his new patches yesterday morning:

commit 65795efbd380a832ae508b04dba8f8e53f0b84d9

David

2009-06-20 04:33:46

by Fengguang Wu

Subject: Re: [PATCH 0/3] make mapped executable pages the first class citizen

On Thu, Jun 18, 2009 at 09:57:29AM -0700, Andrew Morton wrote:
> On Thu, 18 Jun 2009 17:18:58 +0100 David Howells <[email protected]> wrote:
>
> >
> > Okay, after dropping all my devel patches, I got the OOM to happen again;
> > fresh trace attached. I was running LTP and an NFSD, and I was spamming the
> > NFSD continuously from another machine (mount;tar;umount;repeat).
> >
> >
> > ...
> >
> > Mem-Info:
> > DMA per-cpu:
> > CPU 0: hi: 0, btch: 1 usd: 0
> > CPU 1: hi: 0, btch: 1 usd: 0
> > DMA32 per-cpu:
> > CPU 0: hi: 186, btch: 31 usd: 57
> > CPU 1: hi: 186, btch: 31 usd: 0
> > Active_anon:70104 active_file:1 inactive_anon:6557
> > inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
> > free:4062 slab:41969 mapped:541 pagetables:59663 bounce:0
>
> 77000 pages in anonymous memory, no swap online.
>
> 42000 pages in slab. Maybe this is a leak?
>
> 60000 pagetable pages. Seems rather a lot?
>
> 179000 pages accounted for above
>
> > DMA free:3920kB min:60kB low:72kB high:88kB active_anon:2268kB inactive_anon:428kB active_file:0kB inactive_file:0kB unevictable:0kB present:15364kB pages_scanned:0 all_unreclaimable? no
> > lowmem_reserve[]: 0 968 968 968
> > DMA32 free:12328kB min:3948kB low:4932kB high:5920kB active_anon:278148kB inactive_anon:25800kB active_file:4kB inactive_file:0kB unevictable:0kB present:992032kB pages_scanned:0 all_unreclaimable? no
> > lowmem_reserve[]: 0 0 0 0
> > DMA: 8*4kB 0*8kB 1*16kB 1*32kB 2*64kB 1*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3920kB
> > DMA32: 2474*4kB 56*8kB 8*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 12328kB
>
> present memory: 15364 + 992032 = 1007396kB. 250000 pages. It's a 1GB
> box, yes?
>
> > 1660 total pagecache pages
> > 0 pages in swap cache
> > Swap cache stats: add 0, delete 0, find 0/0
> > Free swap = 0kB
> > Total swap = 0kB
> > 255744 pages RAM
> > 5588 pages reserved
> > 255749 pages shared
> > 215785 pages non-shared
> > Out of memory: kill process 6838 (msgctl11) score 152029 or a child
> > Killed process 8850 (msgctl11)
>
> afaict, 70000 pages are unaccounted for (leaked?)

David, could you try running this when it occurred again?

make Documentation/vm/page-types
Documentation/vm/page-types --raw # run as root

Thanks,
Fengguang
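
The page accounting in the quoted report can be reproduced with a few lines of C. This is only a rough sketch: the struct and function names are made up for illustration, the figures are copied from the Mem-Info dump above, and a 4 kB page size is assumed.

```c
#include <stdio.h>

/* Figures copied from the quoted OOM report (hypothetical struct). */
struct oom_report {
	long active_anon;      /* pages */
	long inactive_anon;    /* pages */
	long slab;             /* pages */
	long pagetables;       /* pages */
	long dma_present_kb;   /* "present:" of the DMA zone */
	long dma32_present_kb; /* "present:" of the DMA32 zone */
};

const struct oom_report report = {
	.active_anon = 70104,
	.inactive_anon = 6557,
	.slab = 41969,
	.pagetables = 59663,
	.dma_present_kb = 15364,
	.dma32_present_kb = 992032,
};

/* Present memory of both zones, converted to 4 kB pages (assumed size). */
long present_pages(const struct oom_report *r)
{
	return (r->dma_present_kb + r->dma32_present_kb) / 4;
}

/* Pages the report attributes to anon memory, slab and page tables. */
long accounted_pages(const struct oom_report *r)
{
	return r->active_anon + r->inactive_anon + r->slab + r->pagetables;
}

/* Whatever remains has no named consumer in the report. */
long unaccounted_pages(const struct oom_report *r)
{
	return present_pages(r) - accounted_pages(r);
}

void print_accounting(const struct oom_report *r)
{
	printf("present:     %ld pages\n", present_pages(r));
	printf("accounted:   %ld pages\n", accounted_pages(r));
	printf("unaccounted: %ld pages\n", unaccounted_pages(r));
}
```

Calling print_accounting(&report) gives figures in line with the rounded ones quoted above: roughly 250000 pages present, 179000 accounted for, and about 70000 with no obvious owner. The 4062 free pages are not subtracted here, which mirrors the quoted arithmetic.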

2009-06-20 08:25:10

by David Howells

Subject: Re: [PATCH 0/3] make mapped executable pages the first class citizen

Wu Fengguang <[email protected]> wrote:

> David, could you try running this when it occurred again?
>
> make Documentation/vm/page-types
> Documentation/vm/page-types --raw # run as root

On the faulting box? No. It's pretty much dead.

David

2009-06-23 14:44:39

by David Howells

Subject: Re: [PATCH 0/3] make mapped executable pages the first class citizen

Wu Fengguang <[email protected]> wrote:

> David, could you try running this when it occurred again?
>
> make Documentation/vm/page-types
> Documentation/vm/page-types --raw # run as root

Okay. I managed to catch it between the first and second OOMs, and ran the
command you asked for.

David
---
flags page-count MB symbolic-flags long-symbolic-flags
0x0000000000000000 142261 555 ________________________________
0x0000000100000000 5588 21 ____________________r___________ reserved
0x0000004000000000 17 0 __________________________h_____ arch
0x0000000800000004 4 0 __R____________________P________ referenced,private
0x0000000800000024 2073 8 __R__l_________________P________ referenced,lru,private
0x0000000400000028 69911 273 ___U_l________________d_________ uptodate,lru,mappedtodisk
0x0001000400000028 16 0 ___U_l________________d_____I___ uptodate,lru,mappedtodisk,readahead
0x0000000800000028 25 0 ___U_l_________________P________ uptodate,lru,private
0x0000000000000028 11 0 ___U_l__________________________ uptodate,lru
0x000000040000002c 3045 11 __RU_l________________d_________ referenced,uptodate,lru,mappedtodisk
0x000000080000002c 4 0 __RU_l_________________P________ referenced,uptodate,lru,private
0x0000000800000034 9 0 __R_Dl_________________P________ referenced,dirty,lru,private
0x0000000800000038 1 0 ___UDl_________________P________ uptodate,dirty,lru,private
0x0000000000004038 13 0 ___UDl________b_________________ uptodate,dirty,lru,swapbacked
0x000000080000003c 1 0 __RUDl_________________P________ referenced,uptodate,dirty,lru,private
0x0000000800000060 183 0 _____lA________________P________ lru,active,private
0x0000000800000064 982 3 __R__lA________________P________ referenced,lru,active,private
0x0000000400000068 473 1 ___U_lA_______________d_________ uptodate,lru,active,mappedtodisk
0x0000000c00000068 1 0 ___U_lA_______________dP________ uptodate,lru,active,mappedtodisk,private
0x000000040000006c 392 1 __RU_lA_______________d_________ referenced,uptodate,lru,active,mappedtodisk
0x0000000c0000006c 1 0 __RU_lA_______________dP________ referenced,uptodate,lru,active,mappedtodisk,private
0x0000000800000070 1 0 ____DlA________________P________ dirty,lru,active,private
0x0000000800000074 20 0 __R_DlA________________P________ referenced,dirty,lru,active,private
0x0000000c00000078 2 0 ___UDlA_______________dP________ uptodate,dirty,lru,active,mappedtodisk,private
0x0000000000004078 1 0 ___UDlA_______b_________________ uptodate,dirty,lru,active,swapbacked
0x000000080000007c 1 0 __RUDlA________________P________ referenced,uptodate,dirty,lru,active,private
0x0000000000000080 18684 72 _______S________________________ slab
0x0000000000000400 6797 26 __________B_____________________ buddy
0x0000000000000804 1 0 __R________M____________________ referenced,mmap
0x0000000400000828 195 0 ___U_l_____M__________d_________ uptodate,lru,mmap,mappedtodisk
0x000000040000082c 35 0 __RU_l_____M__________d_________ referenced,uptodate,lru,mmap,mappedtodisk
0x0000000000004838 2 0 ___UDl_____M__b_________________ uptodate,dirty,lru,mmap,swapbacked
0x0000000400000868 11 0 ___U_lA____M__________d_________ uptodate,lru,active,mmap,mappedtodisk
0x000000040000086c 274 1 __RU_lA____M__________d_________ referenced,uptodate,lru,active,mmap,mappedtodisk
0x0000000800000878 1 0 ___UDlA____M___________P________ uptodate,dirty,lru,active,mmap,private
0x000000080000087c 2 0 __RUDlA____M___________P________ referenced,uptodate,dirty,lru,active,mmap,private
0x0000000000005008 8 0 ___U________a_b_________________ uptodate,anonymous,swapbacked
0x0000000000005808 6 0 ___U_______Ma_b_________________ uptodate,mmap,anonymous,swapbacked
0x0000000000005828 4325 16 ___U_l_____Ma_b_________________ uptodate,lru,mmap,anonymous,swapbacked
0x0000000000005868 366 1 ___U_lA____Ma_b_________________ uptodate,lru,active,mmap,anonymous,swapbacked
0x000000000000586c 1 0 __RU_lA____Ma_b_________________ referenced,uptodate,lru,active,mmap,anonymous,swapbacked
total 255744 999
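
For readers decoding the table by hand, the following sketch maps a raw flags value to the long symbolic names. The bit positions are inferred purely from the rows of this table, not taken from the page-types source, so treat the mapping (and the helper names decode_flags and pages_to_mb) as assumptions. The MB column appears to be the page count times 4 kB, truncated.

```c
#include <stdint.h>
#include <string.h>

struct flag_name {
	int bit;
	const char *name;
};

/* Bit positions as inferred from the table above (assumption). */
const struct flag_name flag_names[] = {
	{  2, "referenced" },
	{  3, "uptodate" },
	{  4, "dirty" },
	{  5, "lru" },
	{  6, "active" },
	{  7, "slab" },
	{ 10, "buddy" },
	{ 11, "mmap" },
	{ 12, "anonymous" },
	{ 14, "swapbacked" },
	{ 32, "reserved" },
	{ 34, "mappedtodisk" },
	{ 35, "private" },
	{ 38, "arch" },
	{ 48, "readahead" },
};

/*
 * Write the comma-separated names of all known bits set in 'flags'
 * into 'buf'. Names that do not fit are dropped. Returns the length
 * of the resulting string.
 */
size_t decode_flags(uint64_t flags, char *buf, size_t len)
{
	size_t used = 0;
	size_t i;

	if (len == 0)
		return 0;
	buf[0] = '\0';

	for (i = 0; i < sizeof(flag_names) / sizeof(flag_names[0]); i++) {
		size_t name_len = strlen(flag_names[i].name);
		size_t need = name_len + (used ? 1 : 0);

		if (!(flags & (UINT64_C(1) << flag_names[i].bit)))
			continue;
		if (used + need >= len)
			break; /* no room left for this name plus the NUL */

		if (used)
			buf[used++] = ',';
		memcpy(buf + used, flag_names[i].name, name_len + 1);
		used += name_len;
	}
	return used;
}

/* Page count to megabytes, assuming 4 kB pages and truncation. */
unsigned long pages_to_mb(unsigned long pages)
{
	return pages * 4 / 1024;
}
```

For example, decoding 0x0000000000005828 yields "uptodate,lru,mmap,anonymous,swapbacked", and pages_to_mb(142261) gives 555, matching the first row of the table.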

2009-06-24 01:43:13

by KOSAKI Motohiro

Subject: Re: [PATCH 0/3] make mapped executable pages the first class citizen

> Wu Fengguang <[email protected]> wrote:
>
> > David, could you try running this when it occurred again?
> >
> > make Documentation/vm/page-types
> > Documentation/vm/page-types --raw # run as root
>
> Okay. I managed to catch it between the first and second OOMs, and ran the
> command you asked for.
>
> David
> ---
> flags page-count MB symbolic-flags long-symbolic-flags
> 0x0000000000000000 142261 555 ________________________________
> 0x0000000100000000 5588 21 ____________________r___________ reserved
> 0x0000004000000000 17 0 __________________________h_____ arch
> 0x0000000800000004 4 0 __R____________________P________ referenced,private
> 0x0000000800000024 2073 8 __R__l_________________P________ referenced,lru,private
> 0x0000000400000028 69911 273 ___U_l________________d_________ uptodate,lru,mappedtodisk
> 0x0001000400000028 16 0 ___U_l________________d_____I___ uptodate,lru,mappedtodisk,readahead
> 0x0000000800000028 25 0 ___U_l_________________P________ uptodate,lru,private
> 0x0000000000000028 11 0 ___U_l__________________________ uptodate,lru
> 0x000000040000002c 3045 11 __RU_l________________d_________ referenced,uptodate,lru,mappedtodisk
> 0x000000080000002c 4 0 __RU_l_________________P________ referenced,uptodate,lru,private
> 0x0000000800000034 9 0 __R_Dl_________________P________ referenced,dirty,lru,private
> 0x0000000800000038 1 0 ___UDl_________________P________ uptodate,dirty,lru,private
> 0x0000000000004038 13 0 ___UDl________b_________________ uptodate,dirty,lru,swapbacked
> 0x000000080000003c 1 0 __RUDl_________________P________ referenced,uptodate,dirty,lru,private
> 0x0000000800000060 183 0 _____lA________________P________ lru,active,private
> 0x0000000800000064 982 3 __R__lA________________P________ referenced,lru,active,private
> 0x0000000400000068 473 1 ___U_lA_______________d_________ uptodate,lru,active,mappedtodisk
> 0x0000000c00000068 1 0 ___U_lA_______________dP________ uptodate,lru,active,mappedtodisk,private
> 0x000000040000006c 392 1 __RU_lA_______________d_________ referenced,uptodate,lru,active,mappedtodisk
> 0x0000000c0000006c 1 0 __RU_lA_______________dP________ referenced,uptodate,lru,active,mappedtodisk,private
> 0x0000000800000070 1 0 ____DlA________________P________ dirty,lru,active,private
> 0x0000000800000074 20 0 __R_DlA________________P________ referenced,dirty,lru,active,private
> 0x0000000c00000078 2 0 ___UDlA_______________dP________ uptodate,dirty,lru,active,mappedtodisk,private
> 0x0000000000004078 1 0 ___UDlA_______b_________________ uptodate,dirty,lru,active,swapbacked
> 0x000000080000007c 1 0 __RUDlA________________P________ referenced,uptodate,dirty,lru,active,private
> 0x0000000000000080 18684 72 _______S________________________ slab
> 0x0000000000000400 6797 26 __________B_____________________ buddy
> 0x0000000000000804 1 0 __R________M____________________ referenced,mmap
> 0x0000000400000828 195 0 ___U_l_____M__________d_________ uptodate,lru,mmap,mappedtodisk
> 0x000000040000082c 35 0 __RU_l_____M__________d_________ referenced,uptodate,lru,mmap,mappedtodisk
> 0x0000000000004838 2 0 ___UDl_____M__b_________________ uptodate,dirty,lru,mmap,swapbacked
> 0x0000000400000868 11 0 ___U_lA____M__________d_________ uptodate,lru,active,mmap,mappedtodisk
> 0x000000040000086c 274 1 __RU_lA____M__________d_________ referenced,uptodate,lru,active,mmap,mappedtodisk
> 0x0000000800000878 1 0 ___UDlA____M___________P________ uptodate,dirty,lru,active,mmap,private
> 0x000000080000087c 2 0 __RUDlA____M___________P________ referenced,uptodate,dirty,lru,active,mmap,private
> 0x0000000000005008 8 0 ___U________a_b_________________ uptodate,anonymous,swapbacked
> 0x0000000000005808 6 0 ___U_______Ma_b_________________ uptodate,mmap,anonymous,swapbacked
> 0x0000000000005828 4325 16 ___U_l_____Ma_b_________________ uptodate,lru,mmap,anonymous,swapbacked
> 0x0000000000005868 366 1 ___U_lA____Ma_b_________________ uptodate,lru,active,mmap,anonymous,swapbacked
> 0x000000000000586c 1 0 __RU_lA____Ma_b_________________ referenced,uptodate,lru,active,mmap,anonymous,swapbacked
> total 255744 999

Very few pages are reclaimable. I don't think we are seeing a vmscan issue.
I guess it's a memory leak.



2009-06-24 02:33:20

by Fengguang Wu

Subject: Re: [PATCH 0/3] make mapped executable pages the first class citizen

On Tue, Jun 23, 2009 at 10:43:57PM +0800, David Howells wrote:
> Wu Fengguang <[email protected]> wrote:
>
> > David, could you try running this when it occurred again?
> >
> > make Documentation/vm/page-types
> > Documentation/vm/page-types --raw # run as root
>
> Okay. I managed to catch it between the first and second OOMs, and ran the
> command you asked for.

Thank you!

> 0x0000000000000000 142261 555 ________________________________
> 0x0000000000000400 6797 26 __________B_____________________ buddy

The buddy+free numbers are pretty high. 26MB PG_buddy pages means much
more actual free pages. So I bet the 555MB no-flag pages are mostly free pages.

Thanks,
Fengguang

2009-06-24 02:43:30

by KOSAKI Motohiro

Subject: Re: [PATCH 0/3] make mapped executable pages the first class citizen

> On Tue, Jun 23, 2009 at 10:43:57PM +0800, David Howells wrote:
> > Wu Fengguang <[email protected]> wrote:
> >
> > > David, could you try running this when it occurred again?
> > >
> > > make Documentation/vm/page-types
> > > Documentation/vm/page-types --raw # run as root
> >
> > Okay. I managed to catch it between the first and second OOMs, and ran the
> > command you asked for.
>
> Thank you!
>
> > 0x0000000000000000 142261 555 ________________________________
> > 0x0000000000000400 6797 26 __________B_____________________ buddy
>
> The buddy+free numbers are pretty high. 26MB PG_buddy pages means much
> more actual free pages. So I bet the 555MB no-flag pages are mostly free pages.

You mean our VM can go OOM even though it has 600MB of free pages?


2009-06-24 02:49:44

by Fengguang Wu

Subject: Re: [PATCH 0/3] make mapped executable pages the first class citizen

On Wed, Jun 24, 2009 at 10:43:21AM +0800, KOSAKI Motohiro wrote:
> > On Tue, Jun 23, 2009 at 10:43:57PM +0800, David Howells wrote:
> > > Wu Fengguang <[email protected]> wrote:
> > >
> > > > David, could you try running this when it occurred again?
> > > >
> > > > make Documentation/vm/page-types
> > > > Documentation/vm/page-types --raw # run as root
> > >
> > > Okay. I managed to catch it between the first and second OOMs, and ran the
> > > command you asked for.
> >
> > Thank you!
> >
> > > 0x0000000000000000 142261 555 ________________________________
> > > 0x0000000000000400 6797 26 __________B_____________________ buddy
> >
> > The buddy+free numbers are pretty high. 26MB PG_buddy pages means much
> > more actual free pages. So I bet the 555MB no-flag pages are mostly free pages.
>
> > You mean our VM can go OOM even though it has 600MB of free pages?

Not exactly. From one of the previous OOM messages:

DMA: 1*4kB 1*8kB 0*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3916kB
DMA32: 576*4kB 15*8kB 1*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 4296kB

It looks like something goes wrong with the buddy system?

Thanks,
Fengguang

2009-06-24 13:08:20

by David Howells

Subject: Re: [PATCH 0/3] make mapped executable pages the first class citizen


Okay, I've bisected it down to a narrow range of 60 commits, which include
various mm patches from Fengguang and Rik.

bad: b8d9a86590fb334d28c5905a4c419ece7d08e37d
good: 03347e2592078a90df818670fddf97a33eec70fb

The bad one is definitely bad; the good one is very probably good (the V4L
commit list branched from there, and survived about 40 iterations of LTP
without coughing up an OOM).

I've attached my bisection log to this point, and I'm continuing trying to
narrow it down.

git bisect visualize produces a nice linear list of commits between the bounds
it's currently working with. Is there any way to produce that as a text dump?

David
---
git bisect start
# bad: [c868d550115b9ccc0027c67265b9520790f05601] mm: Move pgtable_cache_init() earlier
git bisect bad c868d550115b9ccc0027c67265b9520790f05601
# good: [300df7dc89cc276377fc020704e34875d5c473b6] Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2
git bisect good 300df7dc89cc276377fc020704e34875d5c473b6
# good: [e1f5b94fd0c93c3e27ede88b7ab652d086dc960f] Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
git bisect good e1f5b94fd0c93c3e27ede88b7ab652d086dc960f
# bad: [b8d9a86590fb334d28c5905a4c419ece7d08e37d] Documentation/accounting/getdelays.c intialize the variable before using it
git bisect bad b8d9a86590fb334d28c5905a4c419ece7d08e37d

2009-06-27 07:13:25

by David Howells

Subject: Found the commit that causes the OOMs


I've managed to bisect things to find the commit that causes the OOMs. It's:

commit 69c854817566db82c362797b4a6521d0b00fe1d8
Author: MinChan Kim <[email protected]>
Date: Tue Jun 16 15:32:44 2009 -0700

vmscan: prevent shrinking of active anon lru list in case of no swap space V3

shrink_zone() can deactivate active anon pages even if we don't have a
swap device. Many embedded products don't have a swap device. So the
deactivation of anon pages is unnecessary.

This patch prevents unnecessary deactivation of anon lru pages. But, it
don't prevent aging of anon pages to swap out.

Signed-off-by: Minchan Kim <[email protected]>
Acked-by: KOSAKI Motohiro <[email protected]>
Cc: Johannes Weiner <[email protected]>
Acked-by: Rik van Riel <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

This exhibits the problem. The previous commit:

commit 35282a2de4e5e4e173ab61aa9d7015886021a821
Author: Brice Goglin <[email protected]>
Date: Tue Jun 16 15:32:43 2009 -0700

migration: only migrate_prep() once per move_pages()

survives 16 iterations of the LTP syscall testsuite without exhibiting the
problem.

David

2009-06-27 11:57:02

by Johannes Weiner

Subject: Re: [PATCH 0/3] make mapped executable pages the first class citizen

On Wed, Jun 24, 2009 at 11:43:21AM +0900, KOSAKI Motohiro wrote:
> > On Tue, Jun 23, 2009 at 10:43:57PM +0800, David Howells wrote:
> > > Wu Fengguang <[email protected]> wrote:
> > >
> > > > David, could you try running this when it occurred again?
> > > >
> > > > make Documentation/vm/page-types
> > > > Documentation/vm/page-types --raw # run as root
> > >
> > > Okay. I managed to catch it between the first and second OOMs, and ran the
> > > command you asked for.
> >
> > Thank you!
> >
> > > 0x0000000000000000 142261 555 ________________________________
> > > 0x0000000000000400 6797 26 __________B_____________________ buddy
> >
> > The buddy+free numbers are pretty high. 26MB PG_buddy pages means much
> > more actual free pages. So I bet the 555MB no-flag pages are mostly free pages.
>
> You mean our VM can go OOM even though it has 600MB of free pages?

No, it has 600MB free pages after an OOM - which only means that the
OOM killer did a good job ;-)

2009-06-27 12:07:58

by Minchan Kim

Subject: Re: Found the commit that causes the OOMs

Hi, David.

First of all, thanks for your effort in tracking down the cause.

Unfortunately, I haven't followed your problem.
I guess you hit the OOM problem with no swap device, right?

My patch shouldn't affect your case.
The patch's motivation is as follows:

"If the system has no swap device, we can't reclaim anon pages.
So moving anon pages within the anon lru list is unnecessary."

If we don't call shrink_active_list at the tail of shrink_zone,
it can affect reclaim_stat->recent_[rotated|scanned].

That in turn can affect the number of pages to scan in the anon lru list.
But look at shrink_zone.

If we don't have a swap device, we never scan the anon lru list anyway.
(the anon lru's percent is always zero)

Nonetheless, the OOM happens.

Hmm..
Could you show me your oops and show_mem information, please?

Rik, Kosaki, what do you think?

On Sat, Jun 27, 2009 at 4:12 PM, David Howells<[email protected]> wrote:
>
> I've managed to bisect things to find the commit that causes the OOMs.  It's:
>
>        commit 69c854817566db82c362797b4a6521d0b00fe1d8
>        Author: MinChan Kim <[email protected]>
>        Date:   Tue Jun 16 15:32:44 2009 -0700
>
>            vmscan: prevent shrinking of active anon lru list in case of no swap space V3
>
>            shrink_zone() can deactivate active anon pages even if we don't have a
>            swap device.  Many embedded products don't have a swap device.  So the
>            deactivation of anon pages is unnecessary.
>
>            This patch prevents unnecessary deactivation of anon lru pages.  But, it
>            don't prevent aging of anon pages to swap out.
>
>            Signed-off-by: Minchan Kim <[email protected]>
>            Acked-by: KOSAKI Motohiro <[email protected]>
>            Cc: Johannes Weiner <[email protected]>
>            Acked-by: Rik van Riel <[email protected]>
>            Signed-off-by: Andrew Morton <[email protected]>
>            Signed-off-by: Linus Torvalds <[email protected]>
>
> This exhibits the problem.  The previous commit:
>
>        commit 35282a2de4e5e4e173ab61aa9d7015886021a821
>        Author: Brice Goglin <[email protected]>
>        Date:   Tue Jun 16 15:32:43 2009 -0700
>
>            migration: only migrate_prep() once per move_pages()
>
> survives 16 iterations of the LTP syscall testsuite without exhibiting the
> problem.
>
> David
>



--
Kind regards,
Minchan Kim

2009-06-27 12:58:11

by Johannes Weiner

Subject: Re: Found the commit that causes the OOMs

On Sat, Jun 27, 2009 at 08:12:49AM +0100, David Howells wrote:
>
> I've managed to bisect things to find the commit that causes the OOMs. It's:
>
> commit 69c854817566db82c362797b4a6521d0b00fe1d8
> Author: MinChan Kim <[email protected]>
> Date: Tue Jun 16 15:32:44 2009 -0700
>
> vmscan: prevent shrinking of active anon lru list in case of no swap space V3
>
> shrink_zone() can deactivate active anon pages even if we don't have a
> swap device. Many embedded products don't have a swap device. So the
> deactivation of anon pages is unnecessary.
>
> This patch prevents unnecessary deactivation of anon lru pages. But, it
> don't prevent aging of anon pages to swap out.
>
> Signed-off-by: Minchan Kim <[email protected]>
> Acked-by: KOSAKI Motohiro <[email protected]>
> Cc: Johannes Weiner <[email protected]>
> Acked-by: Rik van Riel <[email protected]>
> Signed-off-by: Andrew Morton <[email protected]>
> Signed-off-by: Linus Torvalds <[email protected]>
>
> This exhibits the problem. The previous commit:
>
> commit 35282a2de4e5e4e173ab61aa9d7015886021a821
> Author: Brice Goglin <[email protected]>
> Date: Tue Jun 16 15:32:43 2009 -0700
>
> migration: only migrate_prep() once per move_pages()
>
> survives 16 iterations of the LTP syscall testsuite without exhibiting the
> problem.

Here is the patch in question:

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 7592d8e..879d034 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1570,7 +1570,7 @@ static void shrink_zone(int priority, struct zone *zone,
* Even if we did not try to evict anon pages at all, we want to
* rebalance the anon lru active/inactive ratio.
*/
- if (inactive_anon_is_low(zone, sc))
+ if (inactive_anon_is_low(zone, sc) && nr_swap_pages > 0)
shrink_active_list(SWAP_CLUSTER_MAX, zone, sc, priority, 0);

throttle_vm_writeout(sc->gfp_mask);

When this was discussed, I think we missed that nr_swap_pages can
actually get zero on swap systems as well and this should have been
total_swap_pages - otherwise we also stop balancing the two anon lists
when swap is _full_ which was not the intention of this change at all.

[ There is another one hiding in shrink_zone() that does the same - it
was moved from get_scan_ratio() and is pretty old but we still kept
the inactive/active ratio halfway sane without MinChan's patch. ]

This is from your OOM-run dmesg, David:

Adding 32k swap on swapfile22. Priority:-21 extents:1 across:32k
Adding 32k swap on swapfile23. Priority:-22 extents:1 across:32k
Adding 32k swap on swapfile24. Priority:-23 extents:3 across:44k
Adding 32k swap on swapfile25. Priority:-24 extents:1 across:32k

So we actually have swap? Or are those removed again before the OOM?

If not, I think we let the anon lists rot while swap is full and when
some swap space gets freed up and we should be able to evict anon
pages again, we don't find any candidates. The following patch should
improve on that.

If it's not true for your particular situation, I think we still need
it for the scenario described above.

---
From: Johannes Weiner <[email protected]>
Subject: vmscan: keep balancing anon lists on swap-full conditions

Page reclaim doesn't scan and balance the anon LRU lists when
nr_swap_pages is zero to save the scan overhead for swapless systems.

Unfortunately, this variable can reach zero when all present swap
space is occupied as well and we don't want to stop balancing in that
case or we encounter an unreclaimable mess of anon lists when swap
space gets freed up and we are theoretically in the position to page
out again.

Use the total_swap_pages variable to have a better indicator when to
scan the anon LRU lists.

We still might have unbalanced anon lists when swap space is added
during run time but it is a less dynamic change in state and we
still save the scanning overhead for CONFIG_SWAP systems that never
actually set up swap space.

Signed-off-by: Johannes Weiner <[email protected]>
---

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 5415526..5ea7fc3 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1524,7 +1524,7 @@ static void shrink_zone(int priority, struct zone *zone,
int noswap = 0;

/* If we have no swap space, do not bother scanning anon pages. */
- if (!sc->may_swap || (nr_swap_pages <= 0)) {
+ if (!sc->may_swap || (total_swap_pages <= 0)) {
noswap = 1;
percent[0] = 0;
percent[1] = 100;
@@ -1578,7 +1578,7 @@ static void shrink_zone(int priority, struct zone *zone,
* Even if we did not try to evict anon pages at all, we want to
* rebalance the anon lru active/inactive ratio.
*/
- if (inactive_anon_is_low(zone, sc) && nr_swap_pages > 0)
+ if (inactive_anon_is_low(zone, sc) && total_swap_pages > 0)
shrink_active_list(SWAP_CLUSTER_MAX, zone, sc, priority, 0);

throttle_vm_writeout(sc->gfp_mask);
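
The difference between the two checks can be modelled in user space. The sketch below is not kernel code: struct swap_state and both helper functions are hypothetical, and it assumes, as described above, that nr_swap_pages counts free swap slots while total_swap_pages counts all configured swap.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the two kernel counters. */
struct swap_state {
	long nr_swap_pages;    /* free swap slots right now */
	long total_swap_pages; /* all swap space set up on the system */
};

/* Check as introduced by commit 69c85481: needs free swap slots. */
bool rebalance_with_nr(bool inactive_anon_is_low,
		       const struct swap_state *s)
{
	return inactive_anon_is_low && s->nr_swap_pages > 0;
}

/* Check as proposed in the patch above: needs any swap to exist. */
bool rebalance_with_total(bool inactive_anon_is_low,
			  const struct swap_state *s)
{
	return inactive_anon_is_low && s->total_swap_pages > 0;
}

/* Three illustrative states; the page counts are made up. */
const struct swap_state no_swap   = { .nr_swap_pages = 0,   .total_swap_pages = 0 };
const struct swap_state swap_free = { .nr_swap_pages = 100, .total_swap_pages = 256 };
const struct swap_state swap_full = { .nr_swap_pages = 0,   .total_swap_pages = 256 };
```

Both checks agree for no_swap (never rebalance) and swap_free (rebalance when the inactive list is low). They differ only for swap_full: rebalance_with_nr() returns false there, while rebalance_with_total() keeps balancing the anon lists, which is the case the proposed patch is meant to cover.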

2009-06-27 13:50:33

by Minchan Kim

Subject: Re: Found the commit that causes the OOMs

Hi, Hannes.

On Sat, Jun 27, 2009 at 9:54 PM, Johannes Weiner<[email protected]> wrote:
> On Sat, Jun 27, 2009 at 08:12:49AM +0100, David Howells wrote:
>>
>> I've managed to bisect things to find the commit that causes the OOMs.  It's:
>>
>>       commit 69c854817566db82c362797b4a6521d0b00fe1d8
>>       Author: MinChan Kim <[email protected]>
>>       Date:   Tue Jun 16 15:32:44 2009 -0700
>>
>>           vmscan: prevent shrinking of active anon lru list in case of no swap space V3
>>
>>           shrink_zone() can deactivate active anon pages even if we don't have a
>>           swap device.  Many embedded products don't have a swap device.  So the
>>           deactivation of anon pages is unnecessary.
>>
>>           This patch prevents unnecessary deactivation of anon lru pages.  But, it
>>           don't prevent aging of anon pages to swap out.
>>
>>           Signed-off-by: Minchan Kim <[email protected]>
>>           Acked-by: KOSAKI Motohiro <[email protected]>
>>           Cc: Johannes Weiner <[email protected]>
>>           Acked-by: Rik van Riel <[email protected]>
>>           Signed-off-by: Andrew Morton <[email protected]>
>>           Signed-off-by: Linus Torvalds <[email protected]>
>>
>> This exhibits the problem.  The previous commit:
>>
>>       commit 35282a2de4e5e4e173ab61aa9d7015886021a821
>>       Author: Brice Goglin <[email protected]>
>>       Date:   Tue Jun 16 15:32:43 2009 -0700
>>
>>           migration: only migrate_prep() once per move_pages()
>>
>> survives 16 iterations of the LTP syscall testsuite without exhibiting the
>> problem.
>
> Here is the patch in question:
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 7592d8e..879d034 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1570,7 +1570,7 @@ static void shrink_zone(int priority, struct zone *zone,
>         * Even if we did not try to evict anon pages at all, we want to
>         * rebalance the anon lru active/inactive ratio.
>         */
> -       if (inactive_anon_is_low(zone, sc))
> +       if (inactive_anon_is_low(zone, sc) && nr_swap_pages > 0)
>                shrink_active_list(SWAP_CLUSTER_MAX, zone, sc, priority, 0);
>
>        throttle_vm_writeout(sc->gfp_mask);
>
> When this was discussed, I think we missed that nr_swap_pages can
> actually get zero on swap systems as well and this should have been
> total_swap_pages - otherwise we also stop balancing the two anon lists
> when swap is _full_ which was not the intention of this change at all.

At that time we considered this, which is why we didn't prevent anon list
aging for background reclaim.
Do you think that is not enough?



--
Kind regards,
Minchan Kim

2009-06-27 15:40:27

by Johannes Weiner

Subject: Re: Found the commit that causes the OOMs

On Sat, Jun 27, 2009 at 10:50:25PM +0900, Minchan Kim wrote:
> Hi, Hannes.
>
> On Sat, Jun 27, 2009 at 9:54 PM, Johannes Weiner<[email protected]> wrote:
> > On Sat, Jun 27, 2009 at 08:12:49AM +0100, David Howells wrote:
> >>
> >> I've managed to bisect things to find the commit that causes the OOMs.  It's:
> >>
> >>       commit 69c854817566db82c362797b4a6521d0b00fe1d8
> >>       Author: MinChan Kim <[email protected]>
> >>       Date:   Tue Jun 16 15:32:44 2009 -0700
> >>
> >>           vmscan: prevent shrinking of active anon lru list in case of no swap space V3
> >>
> >>           shrink_zone() can deactivate active anon pages even if we don't have a
> >>           swap device.  Many embedded products don't have a swap device.  So the
> >>           deactivation of anon pages is unnecessary.
> >>
> >>           This patch prevents unnecessary deactivation of anon lru pages.  But, it
> >>           don't prevent aging of anon pages to swap out.
> >>
> >>           Signed-off-by: Minchan Kim <[email protected]>
> >>           Acked-by: KOSAKI Motohiro <[email protected]>
> >>           Cc: Johannes Weiner <[email protected]>
> >>           Acked-by: Rik van Riel <[email protected]>
> >>           Signed-off-by: Andrew Morton <[email protected]>
> >>           Signed-off-by: Linus Torvalds <[email protected]>
> >>
> >> This exhibits the problem.  The previous commit:
> >>
> >>       commit 35282a2de4e5e4e173ab61aa9d7015886021a821
> >>       Author: Brice Goglin <[email protected]>
> >>       Date:   Tue Jun 16 15:32:43 2009 -0700
> >>
> >>           migration: only migrate_prep() once per move_pages()
> >>
> >> survives 16 iterations of the LTP syscall testsuite without exhibiting the
> >> problem.
> >
> > Here is the patch in question:
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 7592d8e..879d034 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -1570,7 +1570,7 @@ static void shrink_zone(int priority, struct zone *zone,
> >         * Even if we did not try to evict anon pages at all, we want to
> >         * rebalance the anon lru active/inactive ratio.
> >         */
> > -       if (inactive_anon_is_low(zone, sc))
> > +       if (inactive_anon_is_low(zone, sc) && nr_swap_pages > 0)
> >                shrink_active_list(SWAP_CLUSTER_MAX, zone, sc, priority, 0);
> >
> >        throttle_vm_writeout(sc->gfp_mask);
> >
> > When this was discussed, I think we missed that nr_swap_pages can
> > actually get zero on swap systems as well and this should have been
> > total_swap_pages - otherwise we also stop balancing the two anon lists
> > when swap is _full_ which was not the intention of this change at all.
>
> At that time we considered it so that we didn't prevent anon list
> aging for background reclaim.
> Do you think it is not enough ?

With a heavy multiprocess anon load, direct reclaimers will likely
reuse the reclaimed pages for anon mappings, so you have a handful of
processes shuffling pages on the active list and only one thread that
tries to balance. I can imagine that it cannot keep up for long.
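
As a rough illustration of the nr_swap_pages versus total_swap_pages point in the quoted text, here is a small userspace C sketch. It is not kernel code: struct swap_state, its fields and the two should_age_anon_*() helpers are hypothetical names made up for this example, and only the two conditions mirror the diff and the suggested alternative.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the kernel's two swap counters. */
struct swap_state {
    long nr_swap_pages;    /* swap pages currently free */
    long total_swap_pages; /* swap pages configured in total */
};

/* Condition as in the diff above: rebalance only while free swap remains. */
bool should_age_anon_merged(bool inactive_anon_low, const struct swap_state *s)
{
    /* Sanity check of this model only, not something the kernel asserts. */
    assert(s->nr_swap_pages <= s->total_swap_pages);
    return inactive_anon_low && s->nr_swap_pages > 0;
}

/* Suggested alternative: rebalance whenever any swap is configured. */
bool should_age_anon_suggested(bool inactive_anon_low, const struct swap_state *s)
{
    assert(s->nr_swap_pages <= s->total_swap_pages);
    return inactive_anon_low && s->total_swap_pages > 0;
}
```

In this model the two checks differ only when swap is configured but full (nr_swap_pages == 0, total_swap_pages > 0). On a machine with no swap at all, both return false.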

2009-06-27 15:52:27

by KOSAKI Motohiro

Subject: Re: Found the commit that causes the OOMs

> Here is the patch in question:
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 7592d8e..879d034 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1570,7 +1570,7 @@ static void shrink_zone(int priority, struct zone *zone,
>         * Even if we did not try to evict anon pages at all, we want to
>         * rebalance the anon lru active/inactive ratio.
>         */
> -       if (inactive_anon_is_low(zone, sc))
> +       if (inactive_anon_is_low(zone, sc) && nr_swap_pages > 0)
>                shrink_active_list(SWAP_CLUSTER_MAX, zone, sc, priority, 0);
>
>        throttle_vm_writeout(sc->gfp_mask);
>
> When this was discussed, I think we missed that nr_swap_pages can
> actually get zero on swap systems as well and this should have been
> total_swap_pages - otherwise we also stop balancing the two anon lists
> when swap is _full_ which was not the intention of this change at all.
>
> [ There is another one hiding in shrink_zone() that does the same - it
> was moved from get_scan_ratio() and is pretty old but we still kept
> the inactive/active ratio halfway sane without MinChan's patch. ]
>
> This is from your OOM-run dmesg, David:
>
> Adding 32k swap on swapfile22. Priority:-21 extents:1 across:32k
> Adding 32k swap on swapfile23. Priority:-22 extents:1 across:32k
> Adding 32k swap on swapfile24. Priority:-23 extents:3 across:44k
> Adding 32k swap on swapfile25. Priority:-24 extents:1 across:32k
>
> So we actually have swap? Or are those removed again before the OOM?

[after grepping the LTP source]

ltp/testcases/kernel/syscalls/swapon/swapon03.c creates a lot of swap files,
but they are removed when the test exits.

So when the OOM happened, David's system didn't have any swap. Unfortunately,
I don't think your patch hits the target.

2009-06-27 18:36:01

by David Howells

Subject: Re: Found the commit that causes the OOMs

Johannes Weiner <[email protected]> wrote:

> This is from your OOM-run dmesg, David:
>
> Adding 32k swap on swapfile22. Priority:-21 extents:1 across:32k
> Adding 32k swap on swapfile23. Priority:-22 extents:1 across:32k
> Adding 32k swap on swapfile24. Priority:-23 extents:3 across:44k
> Adding 32k swap on swapfile25. Priority:-24 extents:1 across:32k
>
> So we actually have swap? Or are those removed again before the OOM?

That's merely a transient situation caused by the LTP swapfile tests.
Ordinarily, my test machine does not have swap. At the time the OOMs occur
there is no swapspace and the msgctl9 or msgctl11 tests are usually being run.
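
One way to confirm whether swap is configured at a given moment is to look at the SwapTotal line of /proc/meminfo. The following is a minimal C sketch of that check, assuming the usual "SwapTotal: <n> kB" line format; swap_total_kb() is a hypothetical helper written for this example, and it works on text already read into memory rather than opening the file itself.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return the SwapTotal value in kB found in /proc/meminfo-style text,
 * or -1 if no SwapTotal line is present.
 */
long swap_total_kb(const char *meminfo)
{
    const char *line = meminfo;

    while (line && *line) {
        long kb;

        /* Only a line that starts with "SwapTotal:" yields one conversion. */
        if (sscanf(line, "SwapTotal: %ld kB", &kb) == 1)
            return kb;

        /* Advance to the start of the next line, if there is one. */
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return -1;
}
```

A result of 0 means no swap is configured at the time the text was captured. Because the LTP swapon tests add and remove swap files while they run, the value can differ between samples.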

> The following patch should improve on that.

I can give it a spin when I get home later.

David

2009-06-27 18:40:42

by David Howells

Subject: Re: [PATCH 0/3] make mapped executable pages the first class citizen

Johannes Weiner <[email protected]> wrote:

> No, it has 600MB free pages after an OOM - which only means that the
> OOM killer did a good job ;-)

The system usually gets into a pretty much dead state after a couple of OOMs
or so. There's also the little fact that prior to that commit, the OOMs
don't happen at all as far as I can tell.

I don't know for certain that the OOMs don't happen on the commits that have
come up good. Sadly, all I can say is that after running N commits, I haven't
seen an OOM.

David

2009-06-27 18:59:22

by David Howells

Subject: Re: Found the commit that causes the OOMs

Minchan Kim <[email protected]> wrote:

> Unfortunately, I don't have followed your problem.
> I guess you met OOM problem with no swap device. right ?

That's correct. There seems to be a little bit of confusion stemming from my
report on the OOM. LTP briefly adds swap devices - which is what's appearing
in the log.

> Could I show your oops and show_mem information, please ?

There wasn't an oops per se, only a couple of OOMs, and then the system
mostly hung (it was still accessible over the serial link to do SysRq things),
but the network was dead, and the VT logins were unusable.

I put information on the OOM in my initial report (which I'll attach here).
If you want more information I can get it for you when I get back home.

David

> Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley
> Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United
> Kingdom.
> Registered in England and Wales under Company Registration No. 3798903
> From: David Howells <[email protected]>
> To: Wu Fengguang <[email protected]>
> Cc: [email protected], Andrew Morton <[email protected]>,
> LKML <[email protected]>,
> Christoph Lameter <[email protected]>,
> KOSAKI Motohiro <[email protected]>,
> "[email protected]" <[email protected]>,
> "[email protected]" <[email protected]>,
> "[email protected]" <[email protected]>, "[email protected]" <[email protected]>,
> "[email protected]" <[email protected]>,
> "[email protected]" <[email protected]>,
> "[email protected]" <[email protected]>,
> "[email protected]" <[email protected]>
> Subject: Re: [PATCH 0/3] make mapped executable pages the first class citizen
> Date: Thu, 18 Jun 2009 15:46:52 +0100
> Sender: [email protected]
>
>
> Hmmm.... It's possible that this makes my test box implode horribly when
> running LTP.
>
> I'm going to bisect it to see if this is actually due to your patches.
>
> Note that I don't have any swap space. This after a fresh reboot:
>
> [root@andromeda ~]# cat /proc/meminfo
> MemTotal: 1000624 kB
> MemFree: 797328 kB
> Buffers: 13272 kB
> Cached: 121744 kB
> SwapCached: 0 kB
> Active: 36240 kB
> Inactive: 115856 kB
> Active(anon): 17448 kB
> Inactive(anon): 0 kB
> Active(file): 18792 kB
> Inactive(file): 115856 kB
> Unevictable: 0 kB
> Mlocked: 0 kB
> SwapTotal: 0 kB
> SwapFree: 0 kB
> Dirty: 28 kB
> Writeback: 0 kB
> AnonPages: 17280 kB
> Mapped: 5376 kB
> Slab: 42984 kB
> SReclaimable: 6956 kB
> SUnreclaim: 36028 kB
> PageTables: 1304 kB
> NFS_Unstable: 0 kB
> Bounce: 0 kB
> WritebackTmp: 0 kB
> CommitLimit: 500312 kB
> Committed_AS: 52596 kB
> VmallocTotal: 34359738367 kB
> VmallocUsed: 190044 kB
> VmallocChunk: 34359546363 kB
> DirectMap4k: 13312 kB
> DirectMap2M: 1009664 kB
>
> David
> ---
> Initializing cgroup subsys cpuset
> Linux version 2.6.30-cachefs ([email protected]) (gcc version 4.4.0 20090506 (Red Hat 4.4.0-4) (GCC) ) #106 SMP Wed Jun 17 22:10:31 BST 2009
> Command line: initrd=andromeda-initrd console=tty0 console=ttyS0,115200 ro root=/dev/sda2 enforcing=1 debug BOOT_IMAGE=andromeda-vmlinuz
> KERNEL supported cpus:
> Intel GenuineIntel
> AMD AuthenticAMD
> Centaur CentaurHauls
> BIOS-provided physical RAM map:
> BIOS-e820: 0000000000000000 - 000000000009ec00 (usable)
> BIOS-e820: 000000000009ec00 - 00000000000a0000 (reserved)
> BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
> BIOS-e820: 0000000000100000 - 000000003e59a000 (usable)
> BIOS-e820: 000000003e59a000 - 000000003e5a6000 (reserved)
> BIOS-e820: 000000003e5a6000 - 000000003e644000 (usable)
> BIOS-e820: 000000003e644000 - 000000003e6a9000 (ACPI NVS)
> BIOS-e820: 000000003e6a9000 - 000000003e6ac000 (ACPI data)
> BIOS-e820: 000000003e6ac000 - 000000003e6f2000 (ACPI NVS)
> BIOS-e820: 000000003e6f2000 - 000000003e6ff000 (ACPI data)
> BIOS-e820: 000000003e6ff000 - 000000003e700000 (usable)
> BIOS-e820: 000000003e700000 - 000000003f000000 (reserved)
> BIOS-e820: 00000000fff00000 - 0000000100000000 (reserved)
> DMI 2.4 present.
> last_pfn = 0x3e700 max_arch_pfn = 0x400000000
> MTRR default type: uncachable
> MTRR fixed ranges enabled:
> 00000-9FFFF write-back
> A0000-FFFFF uncachable
> MTRR variable ranges enabled:
> 0 base 000000000 mask FC0000000 write-back
> 1 base 03F000000 mask FFF000000 uncachable
> 2 base 03E800000 mask FFF800000 uncachable
> 3 base 03E700000 mask FFFF00000 uncachable
> 4 disabled
> 5 disabled
> 6 disabled
> 7 disabled
> x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
> initial memory mapped : 0 - 20000000
> init_memory_mapping: 0000000000000000-000000003e700000
> 0000000000 - 003e600000 page 2M
> 003e600000 - 003e700000 page 4k
> kernel direct mapping tables up to 3e700000 @ 8000-b000
> RAMDISK: 3e2ee000 - 3e57991c
> ACPI: RSDP 00000000000fe020 00014 (v00 INTEL )
> ACPI: RSDT 000000003e6fd038 0004C (v01 INTEL DG965RY 00000330 01000013)
> ACPI: FACP 000000003e6fc000 00074 (v01 INTEL DG965RY 00000330 MSFT 01000013)
> ACPI: DSDT 000000003e6f8000 03EDA (v01 INTEL DG965RY 00000330 MSFT 01000013)
> ACPI: FACS 000000003e6ac000 00040
> ACPI: APIC 000000003e6f7000 00078 (v01 INTEL DG965RY 00000330 MSFT 01000013)
> ACPI: WDDT 000000003e6f6000 00040 (v01 INTEL DG965RY 00000330 MSFT 01000013)
> ACPI: MCFG 000000003e6f5000 0003C (v01 INTEL DG965RY 00000330 MSFT 01000013)
> ACPI: ASF! 000000003e6f4000 000A6 (v32 INTEL DG965RY 00000330 MSFT 01000013)
> ACPI: SSDT 000000003e6f3000 001BC (v01 INTEL CpuPm 00000330 MSFT 01000013)
> ACPI: SSDT 000000003e6f2000 00175 (v01 INTEL Cpu0Ist 00000330 MSFT 01000013)
> ACPI: SSDT 000000003e6ab000 00175 (v01 INTEL Cpu1Ist 00000330 MSFT 01000013)
> ACPI: SSDT 000000003e6aa000 00175 (v01 INTEL Cpu2Ist 00000330 MSFT 01000013)
> ACPI: SSDT 000000003e6a9000 00175 (v01 INTEL Cpu3Ist 00000330 MSFT 01000013)
> ACPI: Local APIC address 0xfee00000
> (7 early reservations) ==> bootmem [0000000000 - 003e700000]
> #0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000]
> #1 [0000006000 - 0000008000] TRAMPOLINE ==> [0000006000 - 0000008000]
> #2 [0001000000 - 0001535d90] TEXT DATA BSS ==> [0001000000 - 0001535d90]
> #3 [003e2ee000 - 003e57991c] RAMDISK ==> [003e2ee000 - 003e57991c]
> #4 [000009e800 - 0000100000] BIOS reserved ==> [000009e800 - 0000100000]
> #5 [0001536000 - 0001536199] BRK ==> [0001536000 - 0001536199]
> #6 [0000008000 - 0000009000] PGTABLE ==> [0000008000 - 0000009000]
> found SMP MP-table at [ffff8800000fe200] fe200
> [ffffea0000000000-ffffea0000dfffff] PMD -> [ffff880001a00000-ffff8800027fffff] on node 0
> Zone PFN ranges:
> DMA 0x00000000 -> 0x00001000
> DMA32 0x00001000 -> 0x00100000
> Normal 0x00100000 -> 0x00100000
> Movable zone start PFN for each node
> early_node_map[4] active PFN ranges
> 0: 0x00000000 -> 0x0000009e
> 0: 0x00000100 -> 0x0003e59a
> 0: 0x0003e5a6 -> 0x0003e644
> 0: 0x0003e6ff -> 0x0003e700
> On node 0 totalpages: 255447
> DMA zone: 56 pages used for memmap
> DMA zone: 101 pages reserved
> DMA zone: 3841 pages, LIFO batch:0
> DMA32 zone: 3441 pages used for memmap
> DMA32 zone: 248008 pages, LIFO batch:31
> ACPI: PM-Timer IO Port: 0x408
> ACPI: Local APIC address 0xfee00000
> ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
> ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
> ACPI: LAPIC (acpi_id[0x03] lapic_id[0x82] disabled)
> ACPI: LAPIC (acpi_id[0x04] lapic_id[0x83] disabled)
> ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
> ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
> IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
> ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
> ACPI: IRQ0 used by override.
> ACPI: IRQ2 used by override.
> ACPI: IRQ9 used by override.
> Using ACPI (MADT) for SMP configuration information
> 4 Processors exceeds NR_CPUS limit of 2
> SMP: Allowing 2 CPUs, 0 hotplug CPUs
> nr_irqs_gsi: 24
> PM: Registered nosave memory: 000000000009e000 - 000000000009f000
> PM: Registered nosave memory: 000000000009f000 - 00000000000a0000
> PM: Registered nosave memory: 00000000000a0000 - 00000000000e0000
> PM: Registered nosave memory: 00000000000e0000 - 0000000000100000
> PM: Registered nosave memory: 000000003e59a000 - 000000003e5a6000
> PM: Registered nosave memory: 000000003e644000 - 000000003e6a9000
> PM: Registered nosave memory: 000000003e6a9000 - 000000003e6ac000
> PM: Registered nosave memory: 000000003e6ac000 - 000000003e6f2000
> PM: Registered nosave memory: 000000003e6f2000 - 000000003e6ff000
> Allocating PCI resources starting at 3f000000 (gap: 3f000000:c0f00000)
> NR_CPUS:2 nr_cpumask_bits:2 nr_cpu_ids:2 nr_node_ids:1
> PERCPU: Embedded 24 pages at ffff880001541000, static data 67296 bytes
> Built 1 zonelists in Zone order, mobility grouping on. Total pages: 251849
> Kernel command line: initrd=andromeda-initrd console=tty0 console=ttyS0,115200 ro root=/dev/sda2 enforcing=1 debug BOOT_IMAGE=andromeda-vmlinuz
> PID hash table entries: 4096 (order: 12, 32768 bytes)
> Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
> Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
> Initializing CPU#0
> Checking aperture...
> No AGP bridge found
> Memory: 996952k/1022976k available (2953k kernel code, 1188k absent, 24132k reserved, 1678k data, 360k init)
> NR_IRQS:320
> Fast TSC calibration using PIT
> Detected 1864.978 MHz processor.
> Console: colour VGA+ 80x25
> console [tty0] enabled
> console [ttyS0] enabled
> Calibrating delay loop (skipped), value calculated using timer frequency.. 3729.95 BogoMIPS (lpj=7459912)
> Security Framework initialized
> SELinux: Initializing.
> SELinux: Starting in enforcing mode
> Mount-cache hash table entries: 256
> Initializing cgroup subsys debug
> Initializing cgroup subsys ns
> Initializing cgroup subsys devices
> CPU: L1 I cache: 32K, L1 D cache: 32K
> CPU: L2 cache: 2048K
> CPU: Physical Processor ID: 0
> CPU: Processor Core ID: 0
> mce: CPU supports 6 MCE banks
> CPU0: Thermal monitoring enabled (TM2)
> using mwait in idle threads.
> ACPI: Core revision 20090521
> Setting APIC routing to flat
> ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
> CPU0: Intel(R) Core(TM)2 CPU 6300 @ 1.86GHz stepping 06
> Booting processor 1 APIC 0x1 ip 0x6000
> Initializing CPU#1
> Calibrating delay using timer specific routine.. 3525.06 BogoMIPS (lpj=7050122)
> CPU: L1 I cache: 32K, L1 D cache: 32K
> CPU: L2 cache: 2048K
> CPU: Physical Processor ID: 0
> CPU: Processor Core ID: 1
> mce: CPU supports 6 MCE banks
> CPU1: Thermal monitoring enabled (TM2)
> x86 PAT enabled: cpu 1, old 0x7040600070406, new 0x7010600070106
> CPU1: Intel(R) Core(TM)2 CPU 6300 @ 1.86GHz stepping 06
> checking TSC synchronization [CPU#0 -> CPU#1]: passed.
> Brought up 2 CPUs
> Total of 2 processors activated (7255.01 BogoMIPS).
> NET: Registered protocol family 16
> ACPI: bus type pci registered
> PCI: MCFG configuration 0: base f0000000 segment 0 buses 0 - 127
> PCI: Not using MMCONFIG.
> PCI: Using configuration type 1 for base access
> bio: create slab <bio-0> at 0
> ACPI: EC: Look up EC in DSDT
> ACPI: Interpreter enabled
> ACPI: (supports S0 S3 S4 S5)
> ACPI: Using IOAPIC for interrupt routing
> PCI: MCFG configuration 0: base f0000000 segment 0 buses 0 - 127
> PCI: MCFG area at f0000000 reserved in ACPI motherboard resources
> PCI: Using MMCONFIG at f0000000 - f7ffffff
> ACPI: No dock devices found.
> ACPI: PCI Root Bridge [PCI0] (0000:00)
> pci 0000:00:02.0: reg 10 32bit mmio: [0x50200000-0x502fffff]
> pci 0000:00:02.0: reg 18 64bit mmio: [0x40000000-0x4fffffff]
> pci 0000:00:02.0: reg 20 io port: [0x2110-0x2117]
> pci 0000:00:03.0: reg 10 64bit mmio: [0x50326100-0x5032610f]
> pci 0000:00:03.0: PME# supported from D0 D3hot D3cold
> pci 0000:00:03.0: PME# disabled
> pci 0000:00:19.0: reg 10 32bit mmio: [0x50300000-0x5031ffff]
> pci 0000:00:19.0: reg 14 32bit mmio: [0x50324000-0x50324fff]
> pci 0000:00:19.0: reg 18 io port: [0x20e0-0x20ff]
> pci 0000:00:19.0: PME# supported from D0 D3hot D3cold
> pci 0000:00:19.0: PME# disabled
> pci 0000:00:1a.0: reg 20 io port: [0x20c0-0x20df]
> pci 0000:00:1a.1: reg 20 io port: [0x20a0-0x20bf]
> pci 0000:00:1a.7: reg 10 32bit mmio: [0x50325c00-0x50325fff]
> pci 0000:00:1a.7: PME# supported from D0 D3hot D3cold
> pci 0000:00:1a.7: PME# disabled
> pci 0000:00:1b.0: reg 10 64bit mmio: [0x50320000-0x50323fff]
> pci 0000:00:1b.0: PME# supported from D0 D3hot D3cold
> pci 0000:00:1b.0: PME# disabled
> pci 0000:00:1c.0: PME# supported from D0 D3hot D3cold
> pci 0000:00:1c.0: PME# disabled
> pci 0000:00:1c.1: PME# supported from D0 D3hot D3cold
> pci 0000:00:1c.1: PME# disabled
> pci 0000:00:1c.2: PME# supported from D0 D3hot D3cold
> pci 0000:00:1c.2: PME# disabled
> pci 0000:00:1c.3: PME# supported from D0 D3hot D3cold
> pci 0000:00:1c.3: PME# disabled
> pci 0000:00:1c.4: PME# supported from D0 D3hot D3cold
> pci 0000:00:1c.4: PME# disabled
> pci 0000:00:1d.0: reg 20 io port: [0x2080-0x209f]
> pci 0000:00:1d.1: reg 20 io port: [0x2060-0x207f]
> pci 0000:00:1d.2: reg 20 io port: [0x2040-0x205f]
> pci 0000:00:1d.7: reg 10 32bit mmio: [0x50325800-0x50325bff]
> pci 0000:00:1d.7: PME# supported from D0 D3hot D3cold
> pci 0000:00:1d.7: PME# disabled
> pci 0000:00:1f.0: quirk: region 0400-047f claimed by ICH6 ACPI/GPIO/TCO
> pci 0000:00:1f.0: quirk: region 0500-053f claimed by ICH6 GPIO
> pci 0000:00:1f.0: ICH7 LPC Generic IO decode 1 PIO at 0680 (mask 007f)
> pci 0000:00:1f.2: reg 10 io port: [0x2108-0x210f]
> pci 0000:00:1f.2: reg 14 io port: [0x211c-0x211f]
> pci 0000:00:1f.2: reg 18 io port: [0x2100-0x2107]
> pci 0000:00:1f.2: reg 1c io port: [0x2118-0x211b]
> pci 0000:00:1f.2: reg 20 io port: [0x2020-0x203f]
> pci 0000:00:1f.2: reg 24 32bit mmio: [0x50325000-0x503257ff]
> pci 0000:00:1f.2: PME# supported from D3hot
> pci 0000:00:1f.2: PME# disabled
> pci 0000:00:1f.3: reg 10 32bit mmio: [0x50326000-0x503260ff]
> pci 0000:00:1f.3: reg 20 io port: [0x2000-0x201f]
> pci 0000:00:1c.0: bridge 32bit mmio: [0x50400000-0x504fffff]
> pci 0000:02:00.0: reg 10 io port: [0x1018-0x101f]
> pci 0000:02:00.0: reg 14 io port: [0x1024-0x1027]
> pci 0000:02:00.0: reg 18 io port: [0x1010-0x1017]
> pci 0000:02:00.0: reg 1c io port: [0x1020-0x1023]
> pci 0000:02:00.0: reg 20 io port: [0x1000-0x100f]
> pci 0000:02:00.0: reg 24 32bit mmio: [0x50100000-0x501001ff]
> pci 0000:02:00.0: supports D1
> pci 0000:02:00.0: PME# supported from D0 D1 D3hot
> pci 0000:02:00.0: PME# disabled
> pci 0000:00:1c.1: bridge io port: [0x1000-0x1fff]
> pci 0000:00:1c.1: bridge 32bit mmio: [0x50100000-0x501fffff]
> pci 0000:00:1c.2: bridge 32bit mmio: [0x50500000-0x505fffff]
> pci 0000:00:1c.3: bridge 32bit mmio: [0x50600000-0x506fffff]
> pci 0000:00:1c.4: bridge 32bit mmio: [0x50700000-0x507fffff]
> pci 0000:06:03.0: reg 10 32bit mmio: [0x50004000-0x500047ff]
> pci 0000:06:03.0: reg 14 32bit mmio: [0x50000000-0x50003fff]
> pci 0000:06:03.0: supports D1 D2
> pci 0000:06:03.0: PME# supported from D0 D1 D2 D3hot
> pci 0000:06:03.0: PME# disabled
> pci 0000:00:1e.0: transparent bridge
> pci 0000:00:1e.0: bridge 32bit mmio: [0x50000000-0x500fffff]
> pci_bus 0000:00: on NUMA node 0
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P32_._PRT]
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX0._PRT]
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX1._PRT]
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX2._PRT]
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX3._PRT]
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX4._PRT]
> ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 7 9 10 *11 12)
> ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 7 9 *10 11 12)
> ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 7 9 10 *11 12)
> ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 7 9 10 *11 12)
> ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 7 *9 10 11 12)
> ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 7 9 *10 11 12)
> ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 7 *9 10 11 12)
> ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 7 9 10 *11 12)
> SCSI subsystem initialized
> libata version 3.00 loaded.
> PCI: Using ACPI for IRQ routing
> NetLabel: Initializing
> NetLabel: domain hash size = 128
> NetLabel: protocols = UNLABELED CIPSOv4
> NetLabel: unlabeled traffic allowed by default
> pnp: PnP ACPI init
> ACPI: bus type pnp registered
> pnp: PnP ACPI: found 12 devices
> ACPI: ACPI bus type pnp unregistered
> system 00:01: iomem range 0xf0000000-0xf7ffffff has been reserved
> system 00:01: iomem range 0xfed13000-0xfed13fff has been reserved
> system 00:01: iomem range 0xfed14000-0xfed17fff has been reserved
> system 00:01: iomem range 0xfed18000-0xfed18fff has been reserved
> system 00:01: iomem range 0xfed19000-0xfed19fff has been reserved
> system 00:01: iomem range 0xfed1c000-0xfed1ffff has been reserved
> system 00:01: iomem range 0xfed20000-0xfed3ffff has been reserved
> system 00:01: iomem range 0xfed45000-0xfed99fff has been reserved
> system 00:01: iomem range 0xc0000-0xdffff has been reserved
> system 00:01: iomem range 0xe0000-0xfffff could not be reserved
> system 00:06: ioport range 0x500-0x53f has been reserved
> system 00:06: ioport range 0x400-0x47f has been reserved
> system 00:06: ioport range 0x680-0x6ff has been reserved
> pci 0000:00:1c.0: PCI bridge, secondary bus 0000:01
> pci 0000:00:1c.0: IO window: disabled
> pci 0000:00:1c.0: MEM window: 0x50400000-0x504fffff
> pci 0000:00:1c.0: PREFETCH window: disabled
> pci 0000:00:1c.1: PCI bridge, secondary bus 0000:02
> pci 0000:00:1c.1: IO window: 0x1000-0x1fff
> pci 0000:00:1c.1: MEM window: 0x50100000-0x501fffff
> pci 0000:00:1c.1: PREFETCH window: disabled
> pci 0000:00:1c.2: PCI bridge, secondary bus 0000:03
> pci 0000:00:1c.2: IO window: disabled
> pci 0000:00:1c.2: MEM window: 0x50500000-0x505fffff
> pci 0000:00:1c.2: PREFETCH window: disabled
> pci 0000:00:1c.3: PCI bridge, secondary bus 0000:04
> pci 0000:00:1c.3: IO window: disabled
> pci 0000:00:1c.3: MEM window: 0x50600000-0x506fffff
> pci 0000:00:1c.3: PREFETCH window: disabled
> pci 0000:00:1c.4: PCI bridge, secondary bus 0000:05
> pci 0000:00:1c.4: IO window: disabled
> pci 0000:00:1c.4: MEM window: 0x50700000-0x507fffff
> pci 0000:00:1c.4: PREFETCH window: disabled
> pci 0000:00:1e.0: PCI bridge, secondary bus 0000:06
> pci 0000:00:1e.0: IO window: disabled
> pci 0000:00:1e.0: MEM window: 0x50000000-0x500fffff
> pci 0000:00:1e.0: PREFETCH window: disabled
> pci 0000:00:1c.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
> pci 0000:00:1c.0: setting latency timer to 64
> pci 0000:00:1c.1: PCI INT B -> GSI 16 (level, low) -> IRQ 16
> pci 0000:00:1c.1: setting latency timer to 64
> pci 0000:00:1c.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18
> pci 0000:00:1c.2: setting latency timer to 64
> pci 0000:00:1c.3: PCI INT D -> GSI 19 (level, low) -> IRQ 19
> pci 0000:00:1c.3: setting latency timer to 64
> pci 0000:00:1c.4: PCI INT A -> GSI 17 (level, low) -> IRQ 17
> pci 0000:00:1c.4: setting latency timer to 64
> pci 0000:00:1e.0: setting latency timer to 64
> pci_bus 0000:00: resource 0 io: [0x00-0xffff]
> pci_bus 0000:00: resource 1 mem: [0x000000-0xffffffffffffffff]
> pci_bus 0000:01: resource 1 mem: [0x50400000-0x504fffff]
> pci_bus 0000:02: resource 0 io: [0x1000-0x1fff]
> pci_bus 0000:02: resource 1 mem: [0x50100000-0x501fffff]
> pci_bus 0000:03: resource 1 mem: [0x50500000-0x505fffff]
> pci_bus 0000:04: resource 1 mem: [0x50600000-0x506fffff]
> pci_bus 0000:05: resource 1 mem: [0x50700000-0x507fffff]
> pci_bus 0000:06: resource 1 mem: [0x50000000-0x500fffff]
> pci_bus 0000:06: resource 3 io: [0x00-0xffff]
> pci_bus 0000:06: resource 4 mem: [0x000000-0xffffffffffffffff]
> NET: Registered protocol family 2
> IP route cache hash table entries: 32768 (order: 6, 262144 bytes)
> TCP established hash table entries: 131072 (order: 9, 2097152 bytes)
> TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
> TCP: Hash tables configured (established 131072 bind 65536)
> TCP reno registered
> NET: Registered protocol family 1
> Unpacking initramfs...
> Freeing initrd memory: 2606k freed
> audit: initializing netlink socket (disabled)
> type=2000 audit(1245320564.157:1): initialized
> VFS: Disk quotas dquot_6.5.2
> Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
> SGI XFS with ACLs, security attributes, large block/inode numbers, no debug enabled
> msgmni has been set to 1953
> SELinux: Registering netfilter hooks
> alg: No test for fcrypt (fcrypt-generic)
> alg: No test for stdrng (krng)
> Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253)
> io scheduler noop registered
> io scheduler anticipatory registered (default)
> io scheduler deadline registered
> io scheduler cfq registered
> pci 0000:00:02.0: Boot video device
> pcieport-driver 0000:00:1c.0: irq 24 for MSI/MSI-X
> pcieport-driver 0000:00:1c.0: setting latency timer to 64
> pcieport-driver 0000:00:1c.1: irq 25 for MSI/MSI-X
> pcieport-driver 0000:00:1c.1: setting latency timer to 64
> pcieport-driver 0000:00:1c.2: irq 26 for MSI/MSI-X
> pcieport-driver 0000:00:1c.2: setting latency timer to 64
> pcieport-driver 0000:00:1c.3: irq 27 for MSI/MSI-X
> pcieport-driver 0000:00:1c.3: setting latency timer to 64
> pcieport-driver 0000:00:1c.4: irq 28 for MSI/MSI-X
> pcieport-driver 0000:00:1c.4: setting latency timer to 64
> input: Power Button as /class/input/input0
> ACPI: Power Button [PWRF]
> input: Sleep Button as /class/input/input1
> ACPI: Sleep Button [SLPB]
> processor ACPI_CPU:00: registered as cooling_device0
> ACPI: Processor [CPU0] (supports 8 throttling states)
> processor ACPI_CPU:01: registered as cooling_device1
> ACPI: Processor [CPU1] (supports 8 throttling states)
> Linux agpgart interface v0.103
> agpgart-intel 0000:00:00.0: Intel 965G Chipset
> agpgart-intel 0000:00:00.0: detected 7676K stolen memory
> agpgart-intel 0000:00:00.0: AGP aperture is 256M @ 0x40000000
> intelfb: Framebuffer driver for Intel(R) 830M/845G/852GM/855GM/865G/915G/915GM/945G/945GM/945GME/965G/965GM chipsets
> intelfb: Version 0.9.6
> intelfb 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
> intelfb: 00:02.0: Intel(R) 965G, aperture size 256MB, stolen memory 7932kB
> intelfb: Initial video mode is 1024x768-32@70.
> Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
> Platform driver 'serial8250' needs updating - please use dev_pm_ops
> 00:0a: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
> loop: module loaded
> Driver 'sd' needs updating - please use bus_type methods
> ahci 0000:00:1f.2: version 3.0
> ahci 0000:00:1f.2: PCI INT A -> GSI 19 (level, low) -> IRQ 19
> ahci 0000:00:1f.2: irq 29 for MSI/MSI-X
> ahci 0000:00:1f.2: AHCI 0001.0100 32 slots 4 ports 3 Gbps 0x33 impl SATA mode
> ahci 0000:00:1f.2: flags: 64bit ncq sntf led clo pio slum part ems
> ahci 0000:00:1f.2: setting latency timer to 64
> scsi0 : ahci
> scsi1 : ahci
> scsi2 : ahci
> scsi3 : ahci
> scsi4 : ahci
> scsi5 : ahci
> ata1: SATA max UDMA/133 abar m2048@0x50325000 port 0x50325100 irq 29
> ata2: SATA max UDMA/133 abar m2048@0x50325000 port 0x50325180 irq 29
> ata3: DUMMY
> ata4: DUMMY
> ata5: SATA max UDMA/133 abar m2048@0x50325000 port 0x50325300 irq 29
> ata6: SATA max UDMA/133 abar m2048@0x50325000 port 0x50325380 irq 29
> e1000e: Intel(R) PRO/1000 Network Driver - 1.0.2-k2
> e1000e: Copyright (c) 1999-2008 Intel Corporation.
> e1000e 0000:00:19.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20
> e1000e 0000:00:19.0: setting latency timer to 64
> e1000e 0000:00:19.0: irq 30 for MSI/MSI-X
> 0000:00:19.0: eth0: (PCI Express:2.5GB/s:Width x1) 00:16:76:ce:3a:3c
> 0000:00:19.0: eth0: Intel(R) PRO/1000 Network Connection
> 0000:00:19.0: eth0: MAC: 6, PHY: 6, PBA No: ffffff-0ff
> PNP: PS/2 Controller [PNP0303:PS2K,PNP0f03:PS2M] at 0x60,0x64 irq 1,12
> Platform driver 'i8042' needs updating - please use dev_pm_ops
> serio: i8042 KBD port at 0x60,0x64 irq 1
> serio: i8042 AUX port at 0x60,0x64 irq 12
> mice: PS/2 mouse device common for all mice
> rtc_cmos 00:03: RTC can wake from S4
> rtc_cmos 00:03: rtc core: registered rtc_cmos as rtc0
> rtc0: alarms up to one month, 114 bytes nvram
> i2c /dev entries driver
> i801_smbus 0000:00:1f.3: PCI INT B -> GSI 21 (level, low) -> IRQ 21
> coretemp coretemp.0: Using relative temperature scale!
> coretemp coretemp.1: Using relative temperature scale!
> cpuidle: using governor ladder
> ip_tables: (C) 2000-2006 Netfilter Core Team
> TCP cubic registered
> input: AT Translated Set 2 keyboard as /class/input/input2
> NET: Registered protocol family 17
> ata2: SATA link down (SStatus 0 SControl 300)
> ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> registered taskstats version 1
> ata6: SATA link down (SStatus 0 SControl 300)
> ata5: SATA link down (SStatus 0 SControl 300)
> rtc_cmos 00:03: setting system clock to 2009-06-18 10:22:46 UTC (1245320566)
> ata1.00: ATA-7: ST380211AS, 3.AAE, max UDMA/133
> ata1.00: 156301488 sectors, multi 0: LBA48 NCQ (depth 31/32)
> ata1.00: configured for UDMA/133
> scsi 0:0:0:0: Direct-Access ATA ST380211AS 3.AA PQ: 0 ANSI: 5
> sd 0:0:0:0: [sda] 156301488 512-byte hardware sectors: (80.0 GB/74.5 GiB)
> sd 0:0:0:0: [sda] Write Protect is off
> sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 >
> sd 0:0:0:0: [sda] Attached SCSI disk
> Freeing unused kernel memory: 360k freed
> Write protecting the kernel read-only data: 4324k
> Red Hat nash version 6.0.52 starting
> Mounting proc filesystem
> Mounting sysfs filesystem
> Creating /dev
> Creating initial device nodes
> Setting up hotplug.
> input: ImPS/2 Generic Wheel Mouse as /class/input/input3
> Creating block device nodes.
> mount: could not find filesystem '/proc/bus/usb'
> Waiting for driver initialization.
> Waiting for driver initialization.
> Creating root device.
> Mounting root filesystem.
> EXT3-fs: INFO: recovery required on readonly filesystem.
> EXT3-fs: write access will be enabled during recovery.
> kjournald starting. Commit interval 5 seconds
> Setting up otherEXT3-fs: recovery complete.
> filesystems.
> EXT3-fs: mounted filesystem with writeback data mode.
> Setting up new root fs
> no fstab.sys, mounting internal defaults
> SELinux: 8192 avtab hash slots, 177803 rules.
> SELinux: 8192 avtab hash slots, 177803 rules.
> SELinux: 6 users, 12 roles, 2431 types, 118 bools, 1 sens, 1024 cats
> SELinux: 73 classes, 177803 rules
> SELinux: class kernel_service not defined in policy
> SELinux: permission open in class sock_file not defined in policy
> SELinux: permission nlmsg_tty_audit in class netlink_audit_socket not defined in policy
> SELinux: the above unknown classes and permissions will be allowed
> SELinux: Completing initialization.
> SELinux: Setting up existing superblocks.
> SELinux: initialized (dev sda2, type ext3), uses xattr
> SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
> SELinux: initialized (dev selinuxfs, type selinuxfs), uses genfs_contexts
> SELinux: initialized (dev mqueue, type mqueue), uses transition SIDs
> SELinux: initialized (dev devpts, type devpts), uses transition SIDs
> SELinux: initialized (dev inotifyfs, type inotifyfs), uses genfs_contexts
> SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
> SELinux: initialized (dev anon_inodefs, type anon_inodefs), uses genfs_contexts
> SELinux: initialized (dev pipefs, type pipefs), uses task SIDs
> SELinux: initialized (dev debugfs, type debugfs), uses genfs_contexts
> SELinux: initialized (dev sockfs, type sockfs), uses task SIDs
> SELinux: initialized (dev proc, type proc), uses genfs_contexts
> SELinux: initialized (dev bdev, type bdev), uses genfs_contexts
> SELinux: initialized (dev rootfs, type rootfs), uses genfs_contexts
> SELinux: initialized (dev sysfs, type sysfs), uses genfs_contexts
> type=1403 audit(1245320574.561:2): policy loaded auid=4294967295 ses=4294967295
> Switching to new root and running init.
> unmounting old /dev
> unmounting old /proc
> unmounting old /sys
> Welcome to Fedora
> Press 'I' to enter interactive startup.
> Starting udev: [ OK ]
> Setting hostname andromeda.procyon.org.uk: [ OK ]
> Checking filesystems
> Checking all file systems.
> [/sbin/fsck.ext3 (1) -- /] fsck.ext3 -a /dev/sda2
> /1: clean, 330515/2621440 files, 1528849/2620603 blocks
> [/sbin/fsck.ext3 (1) -- /boot] fsck.ext3 -a /dev/sda1
> /boot1: recovering journal
> /boot1: clean, 79/50200 files, 72187/200780 blocks
> [ OK ]
> Remounting root filesystem in read-write mode: [ OK ]
> Mounting local filesystems: [ OK ]
> Enabling local filesystem quotas: [ OK ]
> Enabling /etc/fstab swaps: [ OK ]
> Entering non-interactive startup
> Starting background readahead (early, fast mode): [ OK ]
> FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
> Bringing up loopback interface: [ OK ]
> Bringing up interface eth0:
> Determining IP information for eth0... done.
> [ OK ]
> FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
> Starting restorecond: [ OK ]
> Starting auditd: [ OK ]
> Starting irqbalance: [ OK ]
> Starting mcstransd: [ OK ]
> Starting rpcbind: modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> rpcbind: cannot create socket for udp6
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> rpcbind: cannot create socket for tcp6
> [ OK ]
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> Starting NFS statd: [ OK ]
> Starting system message bus: [ OK ]
> Starting lm_sensors: not configured, run sensors-detect[WARNING]
> Starting sshd: modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> [ OK ]
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> Starting ntpd: [ OK ]
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> SysRq : Changing Loglevel
> Loglevel set to 8
> Now booted
> Starting smartd: modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> [ OK ]
>
> Fedora release 9 (Sulphur)
> Kernel 2.6.30-cachefs on an x86_64 (/dev/ttyS0)
>
> andromeda.procyon.org.uk login: modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> warning: `capget01' uses 32-bit capabilities (legacy support in use)
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load /lib/modules/2.6.30-cachefs/modules.dep: No such file or directory
>
> msgctl11 invoked oom-killer: gfp_mask=0xd0, order=1, oom_adj=0
> msgctl11 cpuset=/ mems_allowed=0
> Pid: 30549, comm: msgctl11 Not tainted 2.6.30-cachefs #106
> Call Trace:
> [<ffffffff81071dae>] ? oom_kill_process.clone.0+0xa9/0x245
> [<ffffffff81072075>] ? __out_of_memory+0x12b/0x142
> [<ffffffff810720f6>] ? out_of_memory+0x6a/0x94
> [<ffffffff8107479e>] ? __alloc_pages_nodemask+0x422/0x50b
> [<ffffffff81031110>] ? copy_process+0x93/0x113f
> [<ffffffff810748f1>] ? __get_free_pages+0x12/0x50
> [<ffffffff81031130>] ? copy_process+0xb3/0x113f
> [<ffffffff81081ae2>] ? handle_mm_fault+0x2d5/0x645
> [<ffffffff810322fb>] ? do_fork+0x13f/0x2ba
> [<ffffffff81022a0b>] ? do_page_fault+0x1f1/0x206
> [<ffffffff8100b0d3>] ? stub_clone+0x13/0x20
> [<ffffffff8100ad6b>] ? system_call_fastpath+0x16/0x1b
> Mem-Info:
> DMA per-cpu:
> CPU 0: hi: 0, btch: 1 usd: 0
> CPU 1: hi: 0, btch: 1 usd: 0
> DMA32 per-cpu:
> CPU 0: hi: 186, btch: 31 usd: 0
> CPU 1: hi: 186, btch: 31 usd: 47
> Active_anon:80388 active_file:0 inactive_anon:822
> inactive_file:2 unevictable:0 dirty:0 writeback:0 unstable:0
> free:2053 slab:38793 mapped:357 pagetables:60476 bounce:0
> DMA free:3916kB min:60kB low:72kB high:88kB active_anon:3608kB inactive_anon:128kB active_file:0kB inactive_file:0kB unevictable:0kB present:15364kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 968 968 968
> DMA32 free:4296kB min:3948kB low:4932kB high:5920kB active_anon:317944kB inactive_anon:3160kB active_file:0kB inactive_file:8kB unevictable:0kB present:992032kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
> DMA: 1*4kB 1*8kB 0*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3916kB
> DMA32: 576*4kB 15*8kB 1*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 4296kB
> 1854 total pagecache pages
> 0 pages in swap cache
> Swap cache stats: add 0, delete 0, find 0/0
> Free swap = 0kB
> Total swap = 0kB
> 255744 pages RAM
> 5588 pages reserved
> 230698 pages shared
> 217103 pages non-shared
> Out of memory: kill process 25166 (msgctl11) score 133496 or a child
> Killed process 28855 (msgctl11)
> msgctl11 invoked oom-killer: gfp_mask=0xd0, order=1, oom_adj=0
> msgctl11 cpuset=/ mems_allowed=0
> Pid: 30312, comm: msgctl11 Not tainted 2.6.30-cachefs #106
> Call Trace:
> [<ffffffff81071dae>] ? oom_kill_process.clone.0+0xa9/0x245
> [<ffffffff81072075>] ? __out_of_memory+0x12b/0x142
> [<ffffffff810720f6>] ? out_of_memory+0x6a/0x94
> [<ffffffff8107479e>] ? __alloc_pages_nodemask+0x422/0x50b
> [<ffffffff81031110>] ? copy_process+0x93/0x113f
> [<ffffffff810748f1>] ? __get_free_pages+0x12/0x50
> [<ffffffff81031130>] ? copy_process+0xb3/0x113f
> [<ffffffff81029a83>] ? update_curr+0x53/0xdf
> [<ffffffff81081e00>] ? handle_mm_fault+0x5f3/0x645
> [<ffffffff810322fb>] ? do_fork+0x13f/0x2ba
> [<ffffffff81022a0b>] ? do_page_fault+0x1f1/0x206
> [<ffffffff8100b0d3>] ? stub_clone+0x13/0x20
> [<ffffffff8100ad6b>] ? system_call_fastpath+0x16/0x1b
> Mem-Info:
> DMA per-cpu:
> CPU 0: hi: 0, btch: 1 usd: 0
> CPU 1: hi: 0, btch: 1 usd: 0
> DMA32 per-cpu:
> CPU 0: hi: 186, btch: 31 usd: 0
> CPU 1: hi: 186, btch: 31 usd: 0
> Active_anon:79646 active_file:2 inactive_anon:4113
> inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
> free:1966 slab:38417 mapped:2 pagetables:61720 bounce:0
> DMA free:3916kB min:60kB low:72kB high:88kB active_anon:3608kB inactive_anon:256kB active_file:0kB inactive_file:0kB unevictable:0kB present:15364kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 968 968 968
> DMA32 free:3948kB min:3948kB low:4932kB high:5920kB active_anon:314976kB inactive_anon:16196kB active_file:8kB inactive_file:0kB unevictable:0kB present:992032kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
> DMA: 1*4kB 1*8kB 0*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3916kB
> DMA32: 443*4kB 20*8kB 10*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 3948kB
> 36 total pagecache pages
> 0 pages in swap cache
> Swap cache stats: add 0, delete 0, find 0/0
> Free swap = 0kB
> Total swap = 0kB
> 255744 pages RAM
> 5588 pages reserved
> 151665 pages shared
> 220702 pages non-shared
> Out of memory: kill process 25166 (msgctl11) score 133404 or a child
> Killed process 28860 (msgctl11)
>

2009-06-28 07:56:25

by David Howells

Subject: Re: Found the commit that causes the OOMs

Johannes Weiner <[email protected]> wrote:

> From: Johannes Weiner <[email protected]>
> Subject: vmscan: keep balancing anon lists on swap-full conditions
>
> Page reclaim doesn't scan and balance the anon LRU lists when
> nr_swap_pages is zero to save the scan overhead for swapless systems.
>
> Unfortunately, this variable can reach zero when all present swap
> space is occupied as well and we don't want to stop balancing in that
> case or we encounter an unreclaimable mess of anon lists when swap
> space gets freed up and we are theoretically in the position to page
> out again.
>
> Use the total_swap_pages variable to have a better indicator when to
> scan the anon LRU lists.
>
> We still might have unbalanced anon lists when swap space is added
> during run time but it is a less dynamic change in state and we
> still save the scanning overhead for CONFIG_SWAP systems that never
> actually set up swap space.
>
> Signed-off-by: Johannes Weiner <[email protected]>

This doesn't help.

It may change the behaviour though: rather than locking up after a couple of
OOMs, it generated 42MB of OOM messages.

It didn't go wrong until its 5th pass through the LTP syscalls testsuite this
time. Attached is the first part of the log where OOM messages were generated.

David
---
msgctl11 invoked oom-killer: gfp_mask=0xd0, order=1, oom_adj=0
msgctl11 cpuset=/ mems_allowed=0
Pid: 689, comm: msgctl11 Not tainted 2.6.31-rc1-cachefs #143
Call Trace:
[<ffffffff810718a2>] ? oom_kill_process.clone.0+0xa9/0x245
[<ffffffff81071b69>] ? __out_of_memory+0x12b/0x142
[<ffffffff81071bea>] ? out_of_memory+0x6a/0x94
[<ffffffff810742b4>] ? __alloc_pages_nodemask+0x42e/0x51d
[<ffffffff81090d86>] ? cache_alloc_refill+0x353/0x69c
[<ffffffff8106f20f>] ? find_get_page+0x1a/0x72
[<ffffffff810313e6>] ? copy_process+0x95/0x114f
[<ffffffff81091364>] ? kmem_cache_alloc+0x83/0xc5
[<ffffffff810313e6>] ? copy_process+0x95/0x114f
[<ffffffff810815da>] ? handle_mm_fault+0x2b9/0x62f
[<ffffffff810325df>] ? do_fork+0x13f/0x2ba
[<ffffffff81022c02>] ? do_page_fault+0x1f8/0x20d
[<ffffffff8100b0d3>] ? stub_clone+0x13/0x20
[<ffffffff8100ad6b>] ? system_call_fastpath+0x16/0x1b
Mem-Info:
DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
CPU 1: hi: 0, btch: 1 usd: 0
DMA32 per-cpu:
CPU 0: hi: 186, btch: 31 usd: 62
CPU 1: hi: 186, btch: 31 usd: 0
Active_anon:71393 active_file:1 inactive_anon:4670
inactive_file:0 unevictable:0 dirty:11 writeback:0 unstable:0
free:3987 slab:38927 mapped:451 pagetables:58190 bounce:0
DMA free:3928kB min:60kB low:72kB high:88kB active_anon:3176kB inactive_anon:256kB active_file:0kB inactive_file:0kB unevictable:0kB present:15364kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 968 968 968
DMA32 free:12020kB min:3948kB low:4932kB high:5920kB active_anon:282396kB inactive_anon:18424kB active_file:4kB inactive_file:0kB unevictable:0kB present:992000kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 8*4kB 1*8kB 1*16kB 1*32kB 0*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3928kB
DMA32: 2367*4kB 71*8kB 10*16kB 1*32kB 0*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 12020kB
2342 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap = 0kB
Total swap = 0kB
255744 pages RAM
5597 pages reserved
230753 pages shared
216782 pages non-shared
Out of memory: kill process 30280 (msgctl11) score 161571 or a child
Killed process 31149 (msgctl11)
msgctl11 invoked oom-killer: gfp_mask=0xd0, order=1, oom_adj=0
msgctl11 cpuset=/ mems_allowed=0
Pid: 689, comm: msgctl11 Not tainted 2.6.31-rc1-cachefs #143
Call Trace:
[<ffffffff810718a2>] ? oom_kill_process.clone.0+0xa9/0x245
[<ffffffff81071b69>] ? __out_of_memory+0x12b/0x142
[<ffffffff81071bea>] ? out_of_memory+0x6a/0x94
[<ffffffff810742b4>] ? __alloc_pages_nodemask+0x42e/0x51d
[<ffffffff81090d86>] ? cache_alloc_refill+0x353/0x69c
[<ffffffff8106f20f>] ? find_get_page+0x1a/0x72
[<ffffffff810313e6>] ? copy_process+0x95/0x114f
[<ffffffff81091364>] ? kmem_cache_alloc+0x83/0xc5
[<ffffffff810313e6>] ? copy_process+0x95/0x114f
[<ffffffff810815da>] ? handle_mm_fault+0x2b9/0x62f
[<ffffffff810325df>] ? do_fork+0x13f/0x2ba
[<ffffffff81022c02>] ? do_page_fault+0x1f8/0x20d
[<ffffffff8100b0d3>] ? stub_clone+0x13/0x20
[<ffffffff8100ad6b>] ? system_call_fastpath+0x16/0x1b
Mem-Info:
DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
CPU 1: hi: 0, btch: 1 usd: 0
DMA32 per-cpu:
CPU 0: hi: 186, btch: 31 usd: 0
CPU 1: hi: 186, btch: 31 usd: 0
Active_anon:75955 active_file:0 inactive_anon:4990
inactive_file:2 unevictable:0 dirty:0 writeback:0 unstable:0
free:1970 slab:38326 mapped:5 pagetables:59166 bounce:0
DMA free:3932kB min:60kB low:72kB high:88kB active_anon:3172kB inactive_anon:256kB active_file:0kB inactive_file:0kB unevictable:0kB present:15364kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 968 968 968
DMA32 free:3948kB min:3948kB low:4932kB high:5920kB active_anon:300648kB inactive_anon:19704kB active_file:0kB inactive_file:8kB unevictable:0kB present:992000kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 9*4kB 1*8kB 1*16kB 1*32kB 0*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3932kB
DMA32: 457*4kB 39*8kB 1*16kB 0*32kB 0*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 3948kB
36 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap = 0kB
Total swap = 0kB
255744 pages RAM
5597 pages reserved
162238 pages shared
220698 pages non-shared
Out of memory: kill process 30280 (msgctl11) score 160654 or a child
Killed process 31155 (msgctl11)
msgctl11: page allocation failure. order:1, mode:0x20
Pid: 3095, comm: msgctl11 Not tainted 2.6.31-rc1-cachefs #143
Call Trace:
<IRQ> [<ffffffff8107435a>] ? __alloc_pages_nodemask+0x4d4/0x51d
[<ffffffff81090d86>] ? cache_alloc_refill+0x353/0x69c
[<ffffffff810734a4>] ? free_pages_bulk.clone.1+0x4d/0x20d
[<ffffffff81265935>] ? __alloc_skb+0x38/0x148
[<ffffffff81266512>] ? __netdev_alloc_skb+0x15/0x2f
[<ffffffff81091195>] ? __kmalloc_track_caller+0xc6/0x108
[<ffffffff8126595e>] ? __alloc_skb+0x61/0x148
[<ffffffff81266512>] ? __netdev_alloc_skb+0x15/0x2f
[<ffffffff8123f092>] ? e1000_clean_rx_irq+0x1ab/0x2de
[<ffffffff8124072f>] ? e1000_clean+0x71/0x20f
[<ffffffff81269cab>] ? net_rx_action+0x64/0x129
[<ffffffff8103b47d>] ? process_timeout+0x0/0xb
[<ffffffff810375d1>] ? __do_softirq+0x92/0x129
[<ffffffff8100be7c>] ? call_softirq+0x1c/0x28
[<ffffffff8100d824>] ? do_softirq+0x2c/0x68
[<ffffffff8100cf3b>] ? do_IRQ+0x9c/0xb2
[<ffffffff8100b713>] ? ret_from_intr+0x0/0xa
<EOI> [<ffffffff810791e9>] ? shrink_zone+0x1d6/0x30f
[<ffffffff810cec7d>] ? mb_cache_shrink_fn+0x26/0x115
[<ffffffff8118b977>] ? __up_read+0x13/0x90
[<ffffffff81079460>] ? shrink_slab+0x13e/0x150
[<ffffffff8107a004>] ? try_to_free_pages+0x20d/0x362
[<ffffffff8107760f>] ? isolate_pages_global+0x0/0x219
[<ffffffff810741d3>] ? __alloc_pages_nodemask+0x34d/0x51d
[<ffffffff81075f05>] ? __do_page_cache_readahead+0x9e/0x1a1
[<ffffffff81076024>] ? ra_submit+0x1c/0x20
[<ffffffff8106f9f4>] ? filemap_fault+0x18a/0x316
[<ffffffff8107f7cb>] ? __do_fault+0x54/0x3d6
[<ffffffff810815da>] ? handle_mm_fault+0x2b9/0x62f
[<ffffffff81022c02>] ? do_page_fault+0x1f8/0x20d
[<ffffffff812dfb7f>] ? page_fault+0x1f/0x30

2009-06-28 11:33:17

by Fengguang Wu

Subject: Re: Found the commit that causes the OOMs

On Sat, Jun 27, 2009 at 08:54:12PM +0800, Johannes Weiner wrote:
> On Sat, Jun 27, 2009 at 08:12:49AM +0100, David Howells wrote:
> >
> > I've managed to bisect things to find the commit that causes the OOMs. It's:
> >
> > commit 69c854817566db82c362797b4a6521d0b00fe1d8
> > Author: MinChan Kim <[email protected]>
> > Date: Tue Jun 16 15:32:44 2009 -0700
> >
> > vmscan: prevent shrinking of active anon lru list in case of no swap space V3
> >
> > shrink_zone() can deactivate active anon pages even if we don't have a
> > swap device. Many embedded products don't have a swap device. So the
> > deactivation of anon pages is unnecessary.
> >
> > This patch prevents unnecessary deactivation of anon lru pages. But, it
> > don't prevent aging of anon pages to swap out.
> >
> > Signed-off-by: Minchan Kim <[email protected]>
> > Acked-by: KOSAKI Motohiro <[email protected]>
> > Cc: Johannes Weiner <[email protected]>
> > Acked-by: Rik van Riel <[email protected]>
> > Signed-off-by: Andrew Morton <[email protected]>
> > Signed-off-by: Linus Torvalds <[email protected]>
> >
> > This exhibits the problem. The previous commit:
> >
> > commit 35282a2de4e5e4e173ab61aa9d7015886021a821
> > Author: Brice Goglin <[email protected]>
> > Date: Tue Jun 16 15:32:43 2009 -0700
> >
> > migration: only migrate_prep() once per move_pages()
> >
> > survives 16 iterations of the LTP syscall testsuite without exhibiting the
> > problem.
>
> Here is the patch in question:
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 7592d8e..879d034 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1570,7 +1570,7 @@ static void shrink_zone(int priority, struct zone *zone,
> * Even if we did not try to evict anon pages at all, we want to
> * rebalance the anon lru active/inactive ratio.
> */
> - if (inactive_anon_is_low(zone, sc))
> + if (inactive_anon_is_low(zone, sc) && nr_swap_pages > 0)
> shrink_active_list(SWAP_CLUSTER_MAX, zone, sc, priority, 0);
>
> throttle_vm_writeout(sc->gfp_mask);
>
> When this was discussed, I think we missed that nr_swap_pages can
> actually get zero on swap systems as well and this should have been
> total_swap_pages - otherwise we also stop balancing the two anon lists
> when swap is _full_ which was not the intention of this change at all.

Exactly. In Jesse's OOM case, the swap is exhausted.
total_swap_pages is the better choice in this situation.

Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426766] Active_anon:290797 active_file:28 inactive_anon:97034
Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426767] inactive_file:61 unevictable:11322 dirty:0 writeback:0 unstable:0
Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426768] free:3341 slab:13776 mapped:5880 pagetables:6851 bounce:0
Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426772] DMA free:7776kB min:40kB low:48kB high:60kB active_anon:556kB inactive_anon:524kB
+active_file:16kB inactive_file:0kB unevictable:0kB present:15340kB pages_scanned:30 all_unreclaimable? no
Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426775] lowmem_reserve[]: 0 1935 1935 1935
Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426781] DMA32 free:5588kB min:5608kB low:7008kB high:8412kB active_anon:1162632kB
+inactive_anon:387612kB active_file:96kB inactive_file:256kB unevictable:45288kB present:1982128kB pages_scanned:980
+all_unreclaimable? no
Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426784] lowmem_reserve[]: 0 0 0 0
Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426787] DMA: 64*4kB 77*8kB 45*16kB 18*32kB 4*64kB 2*128kB 2*256kB 3*512kB 1*1024kB
+1*2048kB 0*4096kB = 7800kB
Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426796] DMA32: 871*4kB 149*8kB 1*16kB 2*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB
+0*2048kB 0*4096kB = 5588kB
Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426804] 151250 total pagecache pages
Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426806] 18973 pages in swap cache
Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426808] Swap cache stats: add 610640, delete 591667, find 144356/181468
Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426810] Free swap = 0kB
Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426811] Total swap = 979956kB
Jun 18 07:44:53 jbarnes-g45 kernel: [64377.434828] 507136 pages RAM
Jun 18 07:44:53 jbarnes-g45 kernel: [64377.434831] 23325 pages reserved
Jun 18 07:44:53 jbarnes-g45 kernel: [64377.434832] 190892 pages shared
Jun 18 07:44:53 jbarnes-g45 kernel: [64377.434833] 248816 pages non-shared
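
The two counters disagree in exactly this dump: free swap is 0kB while total swap is 979956kB. A minimal userspace sketch of the two candidate checks follows, assuming a 4kB page size. The struct and function names are hypothetical stand-ins for the kernel's nr_swap_pages and total_swap_pages counters, and the inactive_anon_is_low() half of the condition is left out; this is an illustration, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the two kernel swap counters. */
struct swap_state {
    long nr_swap_pages;     /* swap pages currently free */
    long total_swap_pages;  /* swap pages configured in total */
};

/* Build the state from the "Free swap" / "Total swap" kB figures of an
 * OOM dump. The 4kB page size is an assumption of this sketch. */
struct swap_state swap_state_from_kb(long free_kb, long total_kb)
{
    struct swap_state s;

    assert(free_kb >= 0 && total_kb >= free_kb);
    s.nr_swap_pages = free_kb / 4;
    s.total_swap_pages = total_kb / 4;
    return s;
}

/* Free-based check: true only while some swap is still free. */
bool rebalance_anon_free_check(const struct swap_state *s)
{
    return s->nr_swap_pages > 0;
}

/* Total-based check: true whenever any swap is configured at all. */
bool rebalance_anon_total_check(const struct swap_state *s)
{
    return s->total_swap_pages > 0;
}
```

With the figures above, swap_state_from_kb(0, 979956) makes the free-based check false and the total-based check true. For a dump that reports Total swap = 0kB, both checks are false, so switching counters would not be expected to change anything on such a machine.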


In David's OOM case, there are two symptoms:
1) 70000 unaccounted/leaked pages, as found by Andrew
  (plus a rather big number of PG_buddy and pagetable pages)
2) almost zero active_file/inactive_file; small inactive_anon;
  many slab and active_anon pages.
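
One way to arrive at a figure of that size is to subtract the page counts a Mem-Info dump itemizes from the usable RAM. A rough sketch follows. Which counters to sum is an assumption made here (mapped pages are left out on the assumption that they are already counted on the LRU lists), and the thread does not show the exact calculation Andrew used.

```c
#include <assert.h>

/* Page counts as printed in one Mem-Info dump (all values in pages). */
struct meminfo_dump {
    long ram, reserved;
    long active_anon, inactive_anon, active_file, inactive_file;
    long unevictable, free, slab, pagetables;
};

/* Pages not covered by any itemized counter. Summing exactly these
 * counters is an assumption of this sketch, not a kernel rule. */
long unaccounted_pages(const struct meminfo_dump *d)
{
    long usable = d->ram - d->reserved;
    long itemized = d->active_anon + d->inactive_anon
                  + d->active_file + d->inactive_file
                  + d->unevictable + d->free + d->slab + d->pagetables;

    assert(usable >= itemized);
    return usable - itemized;
}
```

Fed with the first Mem-Info dump from David's 2.6.31-rc1-cachefs log (255744 pages RAM, 5597 reserved, Active_anon:71393, inactive_anon:4670, active_file:1, free:3987, slab:38927, pagetables:58190), this gives 72979 pages, which is in the same range as the roughly 70000 mentioned above.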

In situation (2), the slab cache is _under_ scanned, so David hit the
OOM when vmscan should have squeezed some free pages out of the slab
cache. Is this one important side effect of MinChan's patch?

Thanks,
Fengguang

> [ There is another one hiding in shrink_zone() that does the same - it
> was moved from get_scan_ratio() and is pretty old but we still kept
> the inactive/active ratio halfway sane without MinChan's patch. ]
>
> This is from your OOM-run dmesg, David:
>
> Adding 32k swap on swapfile22. Priority:-21 extents:1 across:32k
> Adding 32k swap on swapfile23. Priority:-22 extents:1 across:32k
> Adding 32k swap on swapfile24. Priority:-23 extents:3 across:44k
> Adding 32k swap on swapfile25. Priority:-24 extents:1 across:32k
>
> So we actually have swap? Or are those removed again before the OOM?
>
> If not, I think we let the anon lists rot while swap is full and when
> some swap space gets freed up and we should be able to evict anon
> pages again, we don't find any candidates. The following patch should
> improve on that.
>
> If it's not true for your particular situation, I think we still need
> it for the scenario described above.
>
> ---
> From: Johannes Weiner <[email protected]>
> Subject: vmscan: keep balancing anon lists on swap-full conditions
>
> Page reclaim doesn't scan and balance the anon LRU lists when
> nr_swap_pages is zero to save the scan overhead for swapless systems.
>
> Unfortunately, this variable can reach zero when all present swap
> space is occupied as well and we don't want to stop balancing in that
> case or we encounter an unreclaimable mess of anon lists when swap
> space gets freed up and we are theoretically in the position to page
> out again.
>
> Use the total_swap_pages variable to have a better indicator when to
> scan the anon LRU lists.
>
> We still might have unbalanced anon lists when swap space is added
> during run time but it is a less dynamic change in state and we
> still save the scanning overhead for CONFIG_SWAP systems that never
> actually set up swap space.
>
> Signed-off-by: Johannes Weiner <[email protected]>
> ---
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 5415526..5ea7fc3 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1524,7 +1524,7 @@ static void shrink_zone(int priority, struct zone *zone,
> int noswap = 0;
>
> /* If we have no swap space, do not bother scanning anon pages. */
> - if (!sc->may_swap || (nr_swap_pages <= 0)) {
> + if (!sc->may_swap || (total_swap_pages <= 0)) {
> noswap = 1;
> percent[0] = 0;
> percent[1] = 100;
> @@ -1578,7 +1578,7 @@ static void shrink_zone(int priority, struct zone *zone,
> * Even if we did not try to evict anon pages at all, we want to
> * rebalance the anon lru active/inactive ratio.
> */
> - if (inactive_anon_is_low(zone, sc) && nr_swap_pages > 0)
> + if (inactive_anon_is_low(zone, sc) && total_swap_pages > 0)
> shrink_active_list(SWAP_CLUSTER_MAX, zone, sc, priority, 0);
>
> throttle_vm_writeout(sc->gfp_mask);

2009-06-28 13:31:01

by Minchan Kim

Subject: Re: Found the commit that causes the OOMs

Hi, Wu.

On Sun, Jun 28, 2009 at 8:32 PM, Wu Fengguang<[email protected]> wrote:
> On Sat, Jun 27, 2009 at 08:54:12PM +0800, Johannes Weiner wrote:
>> On Sat, Jun 27, 2009 at 08:12:49AM +0100, David Howells wrote:
>> >
>> > I've managed to bisect things to find the commit that causes the OOMs.  It's:
>> >
>> >     commit 69c854817566db82c362797b4a6521d0b00fe1d8
>> >     Author: MinChan Kim <[email protected]>
>> >     Date:   Tue Jun 16 15:32:44 2009 -0700
>> >
>> >         vmscan: prevent shrinking of active anon lru list in case of no swap space V3
>> >
>> >         shrink_zone() can deactivate active anon pages even if we don't have a
>> >         swap device.  Many embedded products don't have a swap device.  So the
>> >         deactivation of anon pages is unnecessary.
>> >
>> >         This patch prevents unnecessary deactivation of anon lru pages.  But, it
>> >         don't prevent aging of anon pages to swap out.
>> >
>> >         Signed-off-by: Minchan Kim <[email protected]>
>> >         Acked-by: KOSAKI Motohiro <[email protected]>
>> >         Cc: Johannes Weiner <[email protected]>
>> >         Acked-by: Rik van Riel <[email protected]>
>> >         Signed-off-by: Andrew Morton <[email protected]>
>> >         Signed-off-by: Linus Torvalds <[email protected]>
>> >
>> > This exhibits the problem.  The previous commit:
>> >
>> >     commit 35282a2de4e5e4e173ab61aa9d7015886021a821
>> >     Author: Brice Goglin <[email protected]>
>> >     Date:   Tue Jun 16 15:32:43 2009 -0700
>> >
>> >         migration: only migrate_prep() once per move_pages()
>> >
>> > survives 16 iterations of the LTP syscall testsuite without exhibiting the
>> > problem.
>>
>> Here is the patch in question:
>>
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index 7592d8e..879d034 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1570,7 +1570,7 @@ static void shrink_zone(int priority, struct zone *zone,
>>        * Even if we did not try to evict anon pages at all, we want to
>>        * rebalance the anon lru active/inactive ratio.
>>        */
>> -     if (inactive_anon_is_low(zone, sc))
>> +     if (inactive_anon_is_low(zone, sc) && nr_swap_pages > 0)
>>               shrink_active_list(SWAP_CLUSTER_MAX, zone, sc, priority, 0);
>>
>>       throttle_vm_writeout(sc->gfp_mask);
>>
>> When this was discussed, I think we missed that nr_swap_pages can
>> actually get zero on swap systems as well and this should have been
>> total_swap_pages - otherwise we also stop balancing the two anon lists
>> when swap is _full_ which was not the intention of this change at all.
>
> Exactly. In Jesse's OOM case, the swap is exhausted.
> total_swap_pages is the better choice in this situation.
>
> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426766] Active_anon:290797 active_file:28 inactive_anon:97034
> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426767]  inactive_file:61 unevictable:11322 dirty:0 writeback:0 unstable:0
> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426768]  free:3341 slab:13776 mapped:5880 pagetables:6851 bounce:0
> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426772] DMA free:7776kB min:40kB low:48kB high:60kB active_anon:556kB inactive_anon:524kB
> +active_file:16kB inactive_file:0kB unevictable:0kB present:15340kB pages_scanned:30 all_unreclaimable? no
> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426775] lowmem_reserve[]: 0 1935 1935 1935
> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426781] DMA32 free:5588kB min:5608kB low:7008kB high:8412kB active_anon:1162632kB
> +inactive_anon:387612kB active_file:96kB inactive_file:256kB unevictable:45288kB present:1982128kB pages_scanned:980
> +all_unreclaimable? no
> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426784] lowmem_reserve[]: 0 0 0 0
> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426787] DMA: 64*4kB 77*8kB 45*16kB 18*32kB 4*64kB 2*128kB 2*256kB 3*512kB 1*1024kB
> +1*2048kB 0*4096kB = 7800kB
> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426796] DMA32: 871*4kB 149*8kB 1*16kB 2*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB
> +0*2048kB 0*4096kB = 5588kB
> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426804] 151250 total pagecache pages
> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426806] 18973 pages in swap cache
> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426808] Swap cache stats: add 610640, delete 591667, find 144356/181468
> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426810] Free swap  = 0kB
> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426811] Total swap = 979956kB
> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.434828] 507136 pages RAM
> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.434831] 23325 pages reserved
> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.434832] 190892 pages shared
> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.434833] 248816 pages non-shared
>
>
> In David's OOM case, there are two symptoms:
> 1) 70000 unaccounted/leaked pages as found by Andrew
>   (plus rather big number of PG_buddy and pagetable pages)
> 2) almost zero active_file/inactive_file; small inactive_anon;
>   many slab and active_anon pages.
>
> In the situation of (2), the slab cache is _under_ scanned. So David
> got OOM when vmscan should have squeezed some free pages from the slab
> cache. Which is one important side effect of MinChan's patch?

My patch's side effect is (2).

My guess is as follows.

1. The scanned-page count that shrink_slab relies on (sc->nr_scanned) is
incremented in shrink_page_list, and it is counted twice for a mapped or
swapcache page.
2. shrink_page_list is called by shrink_inactive_list.
3. shrink_inactive_list is called by shrink_list.

Look at shrink_list.
If the inactive anon LRU list is low, it always calls shrink_active_list
instead of shrink_inactive_list.
That means sc->nr_scanned is not increased,
so shrink_slab can't shrink enough slab pages.
That is why David's OOM shows a lot of slab pages and active anon pages.
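
The reasoning above can be sketched as a toy model. Everything here is a hypothetical userspace simplification of the description in this mail: the toy_ names are invented, anon pages are simply treated as mapped, and the real shrink_list() has more cases. The only point is that the inactive path bumps the scanned counter while the active path leaves it untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Invented stand-in for the scan_control counter discussed above. */
struct toy_scan_control {
    unsigned long nr_scanned;
};

/* Inactive path: each page counts once, or twice when it is mapped or
 * in the swap cache, as described for shrink_page_list(). */
void toy_shrink_inactive(struct toy_scan_control *sc, unsigned long nr_pages,
                         bool mapped_or_swapcache)
{
    sc->nr_scanned += nr_pages * (mapped_or_swapcache ? 2UL : 1UL);
}

/* Active path: pages would be aged here, but the counter is not touched. */
void toy_shrink_active(struct toy_scan_control *sc, unsigned long nr_pages)
{
    (void)sc;
    (void)nr_pages;
}

/* Anon dispatch as described above: a low inactive list sends the work
 * to the active path, so nothing is added to nr_scanned. */
void toy_shrink_anon(struct toy_scan_control *sc, unsigned long nr_pages,
                     bool inactive_anon_is_low)
{
    assert(sc != NULL);
    if (inactive_anon_is_low)
        toy_shrink_active(sc, nr_pages);
    else
        toy_shrink_inactive(sc, nr_pages, true); /* anon treated as mapped */
}
```

In this model, repeated calls to toy_shrink_anon() with inactive_anon_is_low set leave nr_scanned at zero no matter how many pages are handed in.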

Does that make sense?
If it does, we have to change the way shrink_slab's pressure is calculated.
What do you think?
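
To see why a small scanned count would matter, here is a simplified model of proportional slab pressure. It assumes the number of slab objects to scan grows with scanned / lru_pages, scaled by the cache size and a per-cache seeks cost. The function name and the constants are illustrative assumptions, not a copy of shrink_slab().

```c
#include <assert.h>

/* Toy estimate of how many slab objects one reclaim pass would ask a
 * cache to scan. All parameters are illustrative:
 *   scanned       - LRU pages scanned in this pass
 *   lru_pages     - total pages on the LRU lists
 *   cache_objects - freeable objects the cache reports
 *   seeks         - relative cost of recreating an object (must be > 0)
 */
unsigned long long toy_slab_scan_target(unsigned long scanned,
                                        unsigned long lru_pages,
                                        unsigned long cache_objects,
                                        unsigned int seeks)
{
    unsigned long long delta;

    assert(seeks > 0);
    delta = (4ULL * scanned) / seeks;
    delta *= cache_objects;
    return delta / ((unsigned long long)lru_pages + 1);
}
```

If reclaim passes add nothing to scanned, the target stays at zero however large the cache is, which would fit the picture of plentiful slab that never gets squeezed.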


--
Kind regards,
Minchan Kim

2009-06-28 13:36:55

by Minchan Kim

Subject: Re: Found the commit that causes the OOMs

On Sun, Jun 28, 2009 at 10:30 PM, Minchan Kim<[email protected]> wrote:
> HI, Wu.
>
> On Sun, Jun 28, 2009 at 8:32 PM, Wu Fengguang<[email protected]> wrote:
>> On Sat, Jun 27, 2009 at 08:54:12PM +0800, Johannes Weiner wrote:
>>> On Sat, Jun 27, 2009 at 08:12:49AM +0100, David Howells wrote:
>>> >
>>> > I've managed to bisect things to find the commit that causes the OOMs.  It's:
>>> >
>>> >     commit 69c854817566db82c362797b4a6521d0b00fe1d8
>>> >     Author: MinChan Kim <[email protected]>
>>> >     Date:   Tue Jun 16 15:32:44 2009 -0700
>>> >
>>> >         vmscan: prevent shrinking of active anon lru list in case of no swap space V3
>>> >
>>> >         shrink_zone() can deactivate active anon pages even if we don't have a
>>> >         swap device.  Many embedded products don't have a swap device.  So the
>>> >         deactivation of anon pages is unnecessary.
>>> >
>>> >         This patch prevents unnecessary deactivation of anon lru pages.  But, it
>>> >         don't prevent aging of anon pages to swap out.
>>> >
>>> >         Signed-off-by: Minchan Kim <[email protected]>
>>> >         Acked-by: KOSAKI Motohiro <[email protected]>
>>> >         Cc: Johannes Weiner <[email protected]>
>>> >         Acked-by: Rik van Riel <[email protected]>
>>> >         Signed-off-by: Andrew Morton <[email protected]>
>>> >         Signed-off-by: Linus Torvalds <[email protected]>
>>> >
>>> > This exhibits the problem.  The previous commit:
>>> >
>>> >     commit 35282a2de4e5e4e173ab61aa9d7015886021a821
>>> >     Author: Brice Goglin <[email protected]>
>>> >     Date:   Tue Jun 16 15:32:43 2009 -0700
>>> >
>>> >         migration: only migrate_prep() once per move_pages()
>>> >
>>> > survives 16 iterations of the LTP syscall testsuite without exhibiting the
>>> > problem.
>>>
>>> Here is the patch in question:
>>>
>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>> index 7592d8e..879d034 100644
>>> --- a/mm/vmscan.c
>>> +++ b/mm/vmscan.c
>>> @@ -1570,7 +1570,7 @@ static void shrink_zone(int priority, struct zone *zone,
>>>        * Even if we did not try to evict anon pages at all, we want to
>>>        * rebalance the anon lru active/inactive ratio.
>>>        */
>>> -     if (inactive_anon_is_low(zone, sc))
>>> +     if (inactive_anon_is_low(zone, sc) && nr_swap_pages > 0)
>>>               shrink_active_list(SWAP_CLUSTER_MAX, zone, sc, priority, 0);
>>>
>>>       throttle_vm_writeout(sc->gfp_mask);
>>>
>>> When this was discussed, I think we missed that nr_swap_pages can
>>> actually get zero on swap systems as well and this should have been
>>> total_swap_pages - otherwise we also stop balancing the two anon lists
>>> when swap is _full_ which was not the intention of this change at all.
>>
>> Exactly. In Jesse's OOM case, the swap is exhausted.
>> total_swap_pages is the better choice in this situation.
>>
>> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426766] Active_anon:290797 active_file:28 inactive_anon:97034
>> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426767]  inactive_file:61 unevictable:11322 dirty:0 writeback:0 unstable:0
>> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426768]  free:3341 slab:13776 mapped:5880 pagetables:6851 bounce:0
>> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426772] DMA free:7776kB min:40kB low:48kB high:60kB active_anon:556kB inactive_anon:524kB
>> +active_file:16kB inactive_file:0kB unevictable:0kB present:15340kB pages_scanned:30 all_unreclaimable? no
>> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426775] lowmem_reserve[]: 0 1935 1935 1935
>> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426781] DMA32 free:5588kB min:5608kB low:7008kB high:8412kB active_anon:1162632kB
>> +inactive_anon:387612kB active_file:96kB inactive_file:256kB unevictable:45288kB present:1982128kB pages_scanned:980
>> +all_unreclaimable? no
>> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426784] lowmem_reserve[]: 0 0 0 0
>> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426787] DMA: 64*4kB 77*8kB 45*16kB 18*32kB 4*64kB 2*128kB 2*256kB 3*512kB 1*1024kB
>> +1*2048kB 0*4096kB = 7800kB
>> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426796] DMA32: 871*4kB 149*8kB 1*16kB 2*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB
>> +0*2048kB 0*4096kB = 5588kB
>> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426804] 151250 total pagecache pages
>> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426806] 18973 pages in swap cache
>> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426808] Swap cache stats: add 610640, delete 591667, find 144356/181468
>> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426810] Free swap  = 0kB
>> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426811] Total swap = 979956kB
>> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.434828] 507136 pages RAM
>> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.434831] 23325 pages reserved
>> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.434832] 190892 pages shared
>> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.434833] 248816 pages non-shared
>>
>>
>> In David's OOM case, there are two symptoms:
>> 1) 70000 unaccounted/leaked pages as found by Andrew
>>   (plus rather big number of PG_buddy and pagetable pages)
>> 2) almost zero active_file/inactive_file; small inactive_anon;
>>   many slab and active_anon pages.
>>
>> In the situation of (2), the slab cache is _under_ scanned. So David
>> got OOM when vmscan should have squeezed some free pages from the slab
>> cache. Which is one important side effect of MinChan's patch?
>
> My patch's side effect is (2).
>
> My guessing is following as.
>
> 1. The number of page scanned in shrink_slab is increased in shrink_page_list.
> And it is doubled for mapped page or swapcache.
> 2. shrink_page_list is called by shrink_inactive_list
> 3. shrink_inactive_list is called by shrink_list
>
> Look at the shrink_list.
> If inactive lru list is low, it always call shrink_active_list not
> shrink_inactive_list in case of anon.

I missed the most important point.
The side effect of my patch is that it keeps the inactive anon LRU list low.
So I think the OOM is caused by that side effect.

> It means it doesn't increased sc->nr_scanned.
> Then shrink_slab can't shrink enough slab pages.
> So, David OOM have a lot of slab pages and active anon pages.
>
> Does it make sense ?
> If it make sense, we have to change shrink_slab's pressure method.
> What do you think ?
>
>
> --
> Kinds regards,
> Minchan Kim
>



--
Kind regards,
Minchan Kim

2009-06-28 14:23:01

by Fengguang Wu

[permalink] [raw]
Subject: Re: Found the commit that causes the OOMs

On Sun, Jun 28, 2009 at 09:36:49PM +0800, Minchan Kim wrote:
> On Sun, Jun 28, 2009 at 10:30 PM, Minchan Kim<[email protected]> wrote:
> > HI, Wu.
> >
> > On Sun, Jun 28, 2009 at 8:32 PM, Wu Fengguang<[email protected]> wrote:
> >> On Sat, Jun 27, 2009 at 08:54:12PM +0800, Johannes Weiner wrote:
> >>> On Sat, Jun 27, 2009 at 08:12:49AM +0100, David Howells wrote:
> >>> >
> >>> > I've managed to bisect things to find the commit that causes the OOMs.  It's:
> >>> >
> >>> >     commit 69c854817566db82c362797b4a6521d0b00fe1d8
> >>> >     Author: MinChan Kim <[email protected]>
> >>> >     Date:   Tue Jun 16 15:32:44 2009 -0700
> >>> >
> >>> >         vmscan: prevent shrinking of active anon lru list in case of no swap space V3
> >>> >
> >>> >         shrink_zone() can deactivate active anon pages even if we don't have a
> >>> >         swap device.  Many embedded products don't have a swap device.  So the
> >>> >         deactivation of anon pages is unnecessary.
> >>> >
> >>> >         This patch prevents unnecessary deactivation of anon lru pages.  But, it
> >>> >         don't prevent aging of anon pages to swap out.
> >>> >
> >>> >         Signed-off-by: Minchan Kim <[email protected]>
> >>> >         Acked-by: KOSAKI Motohiro <[email protected]>
> >>> >         Cc: Johannes Weiner <[email protected]>
> >>> >         Acked-by: Rik van Riel <[email protected]>
> >>> >         Signed-off-by: Andrew Morton <[email protected]>
> >>> >         Signed-off-by: Linus Torvalds <[email protected]>
> >>> >
> >>> > This exhibits the problem.  The previous commit:
> >>> >
> >>> >     commit 35282a2de4e5e4e173ab61aa9d7015886021a821
> >>> >     Author: Brice Goglin <[email protected]>
> >>> >     Date:   Tue Jun 16 15:32:43 2009 -0700
> >>> >
> >>> >         migration: only migrate_prep() once per move_pages()
> >>> >
> >>> > survives 16 iterations of the LTP syscall testsuite without exhibiting the
> >>> > problem.
> >>>
> >>> Here is the patch in question:
> >>>
> >>> diff --git a/mm/vmscan.c b/mm/vmscan.c
> >>> index 7592d8e..879d034 100644
> >>> --- a/mm/vmscan.c
> >>> +++ b/mm/vmscan.c
> >>> @@ -1570,7 +1570,7 @@ static void shrink_zone(int priority, struct zone *zone,
> >>>        * Even if we did not try to evict anon pages at all, we want to
> >>>        * rebalance the anon lru active/inactive ratio.
> >>>        */
> >>> -     if (inactive_anon_is_low(zone, sc))
> >>> +     if (inactive_anon_is_low(zone, sc) && nr_swap_pages > 0)
> >>>               shrink_active_list(SWAP_CLUSTER_MAX, zone, sc, priority, 0);
> >>>
> >>>       throttle_vm_writeout(sc->gfp_mask);
> >>>
> >>> When this was discussed, I think we missed that nr_swap_pages can
> >>> actually get zero on swap systems as well and this should have been
> >>> total_swap_pages - otherwise we also stop balancing the two anon lists
> >>> when swap is _full_ which was not the intention of this change at all.
> >>
> >> Exactly. In Jesse's OOM case, the swap is exhausted.
> >> total_swap_pages is the better choice in this situation.
> >>
> >> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426766] Active_anon:290797 active_file:28 inactive_anon:97034
> >> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426767]  inactive_file:61 unevictable:11322 dirty:0 writeback:0 unstable:0
> >> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426768]  free:3341 slab:13776 mapped:5880 pagetables:6851 bounce:0
> >> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426772] DMA free:7776kB min:40kB low:48kB high:60kB active_anon:556kB inactive_anon:524kB
> >> +active_file:16kB inactive_file:0kB unevictable:0kB present:15340kB pages_scanned:30 all_unreclaimable? no
> >> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426775] lowmem_reserve[]: 0 1935 1935 1935
> >> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426781] DMA32 free:5588kB min:5608kB low:7008kB high:8412kB active_anon:1162632kB
> >> +inactive_anon:387612kB active_file:96kB inactive_file:256kB unevictable:45288kB present:1982128kB pages_scanned:980
> >> +all_unreclaimable? no
> >> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426784] lowmem_reserve[]: 0 0 0 0
> >> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426787] DMA: 64*4kB 77*8kB 45*16kB 18*32kB 4*64kB 2*128kB 2*256kB 3*512kB 1*1024kB
> >> +1*2048kB 0*4096kB = 7800kB
> >> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426796] DMA32: 871*4kB 149*8kB 1*16kB 2*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB
> >> +0*2048kB 0*4096kB = 5588kB
> >> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426804] 151250 total pagecache pages
> >> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426806] 18973 pages in swap cache
> >> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426808] Swap cache stats: add 610640, delete 591667, find 144356/181468
> >> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426810] Free swap  = 0kB
> >> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.426811] Total swap = 979956kB
> >> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.434828] 507136 pages RAM
> >> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.434831] 23325 pages reserved
> >> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.434832] 190892 pages shared
> >> Jun 18 07:44:53 jbarnes-g45 kernel: [64377.434833] 248816 pages non-shared
> >>
> >>
> >> In David's OOM case, there are two symptoms:
> >> 1) 70000 unaccounted/leaked pages as found by Andrew
> >>   (plus rather big number of PG_buddy and pagetable pages)
> >> 2) almost zero active_file/inactive_file; small inactive_anon;
> >>   many slab and active_anon pages.
> >>
> >> In the situation of (2), the slab cache is _under_ scanned. So David
> >> got OOM when vmscan should have squeezed some free pages from the slab
> >> cache. Which is one important side effect of MinChan's patch?
> >
> > My patch's side effect is (2).
> >
> > My guessing is following as.
> >
> > 1. The number of page scanned in shrink_slab is increased in shrink_page_list.
> > And it is doubled for mapped page or swapcache.
> > 2. shrink_page_list is called by shrink_inactive_list
> > 3. shrink_inactive_list is called by shrink_list
> >
> > Look at the shrink_list.
> > If inactive lru list is low, it always call shrink_active_list not
> > shrink_inactive_list in case of anon.
>
> I missed most important point.
> My patch's side effect is that it keeps inactive anon's lru low.
> So I think it is caused by my patch's side effect.

Yes, smaller inactive_anon means smaller (pointless) nr_scanned,
and therefore fewer slab scans. Strictly speaking, it's not the fault
of your patch. It indicates that the slab scan ratio algorithm should
be updated too :)
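
To make the dependency concrete, here is a small user-space sketch of how
shrink_slab() turns LRU scan counts into slab pressure. It follows the
arithmetic as I understand it from 2.6.30 (4 * scanned / seeks, scaled by the
shrinker's object count and divided by lru_pages + 1). The helper name
slab_scan_target() and its flat parameter list are made up for illustration
and are not kernel API.

```c
#include <stdint.h>

/*
 * User-space model of the per-shrinker scan target, loosely following
 * the arithmetic in shrink_slab(). Hypothetical helper, not kernel code.
 *
 * scanned:   LRU pages scanned in this reclaim pass (sc->nr_scanned)
 * lru_pages: LRU population that "scanned" is measured against
 * max_pass:  number of freeable objects the shrinker reports
 * seeks:     the shrinker's cost factor (DEFAULT_SEEKS is 2)
 */
static uint64_t slab_scan_target(uint64_t scanned, uint64_t lru_pages,
                                 uint64_t max_pass, unsigned int seeks)
{
    uint64_t delta = (4 * scanned) / seeks;

    delta *= max_pass;
    /* The +1 mirrors the kernel's guard against dividing by zero. */
    return delta / (lru_pages + 1);
}
```

With scanned = 0 the target is 0 however much slab is cached, and for a fixed
scanned a smaller lru_pages gives a proportionally larger target.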

We could refine the estimation of "reclaimable" pages like this:

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 416f748..e9c5b0e 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -167,14 +167,7 @@ static inline unsigned long zone_page_state(struct zone *zone,
 }

 extern unsigned long global_lru_pages(void);
-
-static inline unsigned long zone_lru_pages(struct zone *zone)
-{
-	return (zone_page_state(zone, NR_ACTIVE_ANON)
-		+ zone_page_state(zone, NR_ACTIVE_FILE)
-		+ zone_page_state(zone, NR_INACTIVE_ANON)
-		+ zone_page_state(zone, NR_INACTIVE_FILE));
-}
+extern unsigned long zone_lru_pages(void);

 #ifdef CONFIG_NUMA
 /*
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 026f452..4281c6f 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2123,10 +2123,31 @@ void wakeup_kswapd(struct zone *zone, int order)

 unsigned long global_lru_pages(void)
 {
-	return global_page_state(NR_ACTIVE_ANON)
-		+ global_page_state(NR_ACTIVE_FILE)
-		+ global_page_state(NR_INACTIVE_ANON)
-		+ global_page_state(NR_INACTIVE_FILE);
+	int nr;
+
+	nr = global_page_state(zone, NR_ACTIVE_FILE) +
+	     global_page_state(zone, NR_INACTIVE_FILE);
+
+	if (total_swap_pages)
+		nr += global_page_state(zone, NR_ACTIVE_ANON) +
+		      global_page_state(zone, NR_INACTIVE_ANON);
+
+	return nr;
+}
+
+
+unsigned long zone_lru_pages(struct zone *zone)
+{
+	int nr;
+
+	nr = zone_page_state(zone, NR_ACTIVE_FILE) +
+	     zone_page_state(zone, NR_INACTIVE_FILE);
+
+	if (total_swap_pages)
+		nr += zone_page_state(zone, NR_ACTIVE_ANON) +
+		      zone_page_state(zone, NR_INACTIVE_ANON);
+
+	return nr;
 }

 #ifdef CONFIG_HIBERNATION

2009-06-28 14:57:29

by KOSAKI Motohiro

[permalink] [raw]
Subject: Re: Found the commit that causes the OOMs

>> In David's OOM case, there are two symptoms:
>> 1) 70000 unaccounted/leaked pages as found by Andrew
>>   (plus rather big number of PG_buddy and pagetable pages)
>> 2) almost zero active_file/inactive_file; small inactive_anon;
>>   many slab and active_anon pages.
>>
>> In the situation of (2), the slab cache is _under_ scanned. So David
>> got OOM when vmscan should have squeezed some free pages from the slab
>> cache. Which is one important side effect of MinChan's patch?
>
> My patch's side effect is (2).
>
> My guessing is following as.
>
> 1. The number of page scanned in shrink_slab is increased in shrink_page_list.
> And it is doubled for mapped page or swapcache.
> 2. shrink_page_list is called by shrink_inactive_list
> 3. shrink_inactive_list is called by shrink_list
>
> Look at the shrink_list.
> If inactive lru list is low, it always call shrink_active_list not
> shrink_inactive_list in case of anon.
> It means it doesn't increased sc->nr_scanned.
> Then shrink_slab can't shrink enough slab pages.
> So, David OOM have a lot of slab pages and active anon pages.
>
> Does it make sense ?
> If it make sense, we have to change shrink_slab's pressure method.
> What do you think ?

I'm confused.

If the system has no swap, get_scan_ratio() always returns anon=0%.
So the number of inactive_anon pages has no effect on sc.nr_scanned.
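
For reference, the early exit I mean can be modelled in user space like this.
It is only a sketch: scan_ratio_model() is a hypothetical stand-in, and the
50/50 fallback is a placeholder, since the real get_scan_ratio() derives the
split from swappiness and recent rotation statistics.

```c
/*
 * Simplified model of the early exit in get_scan_ratio().
 * percent[0] is the anon share, percent[1] the file share.
 * Hypothetical helper; the 50/50 branch is a placeholder only.
 */
static void scan_ratio_model(long nr_swap_pages, unsigned long percent[2])
{
    /* No free swap slots: do not scan anon pages at all. */
    if (nr_swap_pages <= 0) {
        percent[0] = 0;
        percent[1] = 100;
        return;
    }

    /* Placeholder for the swappiness/rotation based calculation. */
    percent[0] = 50;
    percent[1] = 50;
}
```

So with nr_swap_pages <= 0 the anon lists contribute nothing to the scan
target, whatever their sizes are.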

2009-06-28 15:01:45

by KOSAKI Motohiro

[permalink] [raw]
Subject: Re: Found the commit that causes the OOMs

> Yes, smaller inactive_anon means smaller (pointless) nr_scanned,
> and therefore less slab scans. Strictly speaking, it's not the fault
> of your patch. It indicates that the slab scan ratio algorithm should
> be updated too :)

I don't think this patch is related to Minchan's patch,
but I think it is good.


> We could refine the estimation of "reclaimable" pages like this:

hmhm, reasonable idea.

>
> diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
> index 416f748..e9c5b0e 100644
> --- a/include/linux/vmstat.h
> +++ b/include/linux/vmstat.h
> @@ -167,14 +167,7 @@ static inline unsigned long zone_page_state(struct zone *zone,
>  }
>
>  extern unsigned long global_lru_pages(void);
> -
> -static inline unsigned long zone_lru_pages(struct zone *zone)
> -{
> -       return (zone_page_state(zone, NR_ACTIVE_ANON)
> -               + zone_page_state(zone, NR_ACTIVE_FILE)
> -               + zone_page_state(zone, NR_INACTIVE_ANON)
> -               + zone_page_state(zone, NR_INACTIVE_FILE));
> -}
> +extern unsigned long zone_lru_pages(void);
>
>  #ifdef CONFIG_NUMA
>  /*
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 026f452..4281c6f 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2123,10 +2123,31 @@ void wakeup_kswapd(struct zone *zone, int order)
>
>  unsigned long global_lru_pages(void)
>  {
> -       return global_page_state(NR_ACTIVE_ANON)
> -               + global_page_state(NR_ACTIVE_FILE)
> -               + global_page_state(NR_INACTIVE_ANON)
> -               + global_page_state(NR_INACTIVE_FILE);
> +       int nr;
> +
> +       nr = global_page_state(zone, NR_ACTIVE_FILE) +
> +            global_page_state(zone, NR_INACTIVE_FILE);
> +
> +       if (total_swap_pages)
> +               nr += global_page_state(zone, NR_ACTIVE_ANON) +
> +                     global_page_state(zone, NR_INACTIVE_ANON);
> +
> +       return nr;
> +}

Please change the function name too.
Now this function only accounts reclaimable pages.

Plus, total_swap_pages is bad. If we care about "reclaimable
pages", we should use nr_swap_pages.
I mean, a full swap also makes anon pages unreclaimable, although the
system has some swap device.
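
A quick way to see the difference between the two predicates is to plug in
the counters from Jesse's OOM report (Free swap = 0kB, Total swap = 979956kB).
This is only an illustrative user-space sketch: struct lru_counts,
reclaimable_pages() and jesse_oom are hypothetical names, and the caller passes
in whichever swap test it wants to compare.

```c
/* Hypothetical container for the four LRU counters, in pages. */
struct lru_counts {
    unsigned long active_anon;
    unsigned long inactive_anon;
    unsigned long active_file;
    unsigned long inactive_file;
};

/*
 * Count pages that reclaim could make progress on. File pages always
 * count; anon pages count only if the caller says swap is usable.
 */
static unsigned long reclaimable_pages(const struct lru_counts *c,
                                       int anon_reclaimable)
{
    unsigned long nr = c->active_file + c->inactive_file;

    if (anon_reclaimable)
        nr += c->active_anon + c->inactive_anon;

    return nr;
}

/* Page counts taken from the jbarnes-g45 report quoted in this thread. */
static const struct lru_counts jesse_oom = { 290797, 97034, 28, 61 };
```

Testing "total swap > 0" counts all 387920 LRU pages as reclaimable, while
testing "free swap > 0" leaves only the 89 file pages, which matches what
reclaim can actually free once swap is full.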



> +
> +
> +unsigned long zone_lru_pages(struct zone *zone)
> +{
> +       int nr;
> +
> +       nr = zone_page_state(zone, NR_ACTIVE_FILE) +
> +            zone_page_state(zone, NR_INACTIVE_FILE);
> +
> +       if (total_swap_pages)
> +               nr += zone_page_state(zone, NR_ACTIVE_ANON) +
> +                     zone_page_state(zone, NR_INACTIVE_ANON);
> +
> +       return nr;
>  }
>
>  #ifdef CONFIG_HIBERNATION

2009-06-28 15:04:32

by Fengguang Wu

[permalink] [raw]
Subject: Re: Found the commit that causes the OOMs

On Sun, Jun 28, 2009 at 10:49:52PM +0800, KOSAKI Motohiro wrote:
> >> In David's OOM case, there are two symptoms:
> >> 1) 70000 unaccounted/leaked pages as found by Andrew
> >>   (plus rather big number of PG_buddy and pagetable pages)
> >> 2) almost zero active_file/inactive_file; small inactive_anon;
> >>   many slab and active_anon pages.
> >>
> >> In the situation of (2), the slab cache is _under_ scanned. So David
> >> got OOM when vmscan should have squeezed some free pages from the slab
> >> cache. Which is one important side effect of MinChan's patch?
> >
> > My patch's side effect is (2).
> >
> > My guessing is following as.
> >
> > 1. The number of page scanned in shrink_slab is increased in shrink_page_list.
> > And it is doubled for mapped page or swapcache.
> > 2. shrink_page_list is called by shrink_inactive_list
> > 3. shrink_inactive_list is called by shrink_list
> >
> > Look at the shrink_list.
> > If inactive lru list is low, it always call shrink_active_list not
> > shrink_inactive_list in case of anon.
> > It means it doesn't increased sc->nr_scanned.
> > Then shrink_slab can't shrink enough slab pages.
> > So, David OOM have a lot of slab pages and active anon pages.
> >
> > Does it make sense ?
> > If it make sense, we have to change shrink_slab's pressure method.
> > What do you think ?
>
> I'm confused.
>
> if system have no swap, get_scan_ratio() always return anon=0%.
> Then, the numver of inactive_anon is not effect to sc.nr_scanned.

You are right. Hehe, so that's not a real side effect.

2009-06-28 15:10:48

by Fengguang Wu

[permalink] [raw]
Subject: Re: Found the commit that causes the OOMs

On Sun, Jun 28, 2009 at 11:01:40PM +0800, KOSAKI Motohiro wrote:
> > Yes, smaller inactive_anon means smaller (pointless) nr_scanned,
> > and therefore less slab scans. Strictly speaking, it's not the fault
> > of your patch. It indicates that the slab scan ratio algorithm should
> > be updated too :)
>
> I don't think this patch is related to minchan's patch.
> but I think this patch is good.

OK.

>
> > We could refine the estimation of "reclaimable" pages like this:
>
> hmhm, reasonable idea.

Thank you.

> >
> > diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
> > index 416f748..e9c5b0e 100644
> > --- a/include/linux/vmstat.h
> > +++ b/include/linux/vmstat.h
> > @@ -167,14 +167,7 @@ static inline unsigned long zone_page_state(struct zone *zone,
> >  }
> >
> >  extern unsigned long global_lru_pages(void);
> > -
> > -static inline unsigned long zone_lru_pages(struct zone *zone)
> > -{
> > -       return (zone_page_state(zone, NR_ACTIVE_ANON)
> > -               + zone_page_state(zone, NR_ACTIVE_FILE)
> > -               + zone_page_state(zone, NR_INACTIVE_ANON)
> > -               + zone_page_state(zone, NR_INACTIVE_FILE));
> > -}
> > +extern unsigned long zone_lru_pages(void);
> >
> >  #ifdef CONFIG_NUMA
> >  /*
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 026f452..4281c6f 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -2123,10 +2123,31 @@ void wakeup_kswapd(struct zone *zone, int order)
> >
> >  unsigned long global_lru_pages(void)
> >  {
> > -       return global_page_state(NR_ACTIVE_ANON)
> > -               + global_page_state(NR_ACTIVE_FILE)
> > -               + global_page_state(NR_INACTIVE_ANON)
> > -               + global_page_state(NR_INACTIVE_FILE);
> > +       int nr;
> > +
> > +       nr = global_page_state(zone, NR_ACTIVE_FILE) +
> > +            global_page_state(zone, NR_INACTIVE_FILE);
> > +
> > +       if (total_swap_pages)
> > +               nr += global_page_state(zone, NR_ACTIVE_ANON) +
> > +                     global_page_state(zone, NR_INACTIVE_ANON);
> > +
> > +       return nr;
> > +}
>
> Please change function name too.
> Now, this function only account reclaimable pages.

Good suggestion - I did consider renaming them to *_reclaimable_pages.

> Plus, total_swap_pages is bad. if we need to concern "reclaimable
> pages", we should use nr_swap_pages.

> I mean, swap-full also makes anon is unreclaimable althouth system
> have sone swap device.

Right, changed to (nr_swap_pages > 0).

Thanks,
Fengguang
---

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 416f748..8d8aa20 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -166,15 +166,8 @@ static inline unsigned long zone_page_state(struct zone *zone,
 	return x;
 }

-extern unsigned long global_lru_pages(void);
-
-static inline unsigned long zone_lru_pages(struct zone *zone)
-{
-	return (zone_page_state(zone, NR_ACTIVE_ANON)
-		+ zone_page_state(zone, NR_ACTIVE_FILE)
-		+ zone_page_state(zone, NR_INACTIVE_ANON)
-		+ zone_page_state(zone, NR_INACTIVE_FILE));
-}
+extern unsigned long global_reclaimable_pages(void);
+extern unsigned long zone_reclaimable_pages(struct zone *zone);

 #ifdef CONFIG_NUMA
 /*
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index a91b870..74c3067 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -394,7 +394,8 @@ static unsigned long highmem_dirtyable_memory(unsigned long total)
 		struct zone *z =
 			&NODE_DATA(node)->node_zones[ZONE_HIGHMEM];

-		x += zone_page_state(z, NR_FREE_PAGES) + zone_lru_pages(z);
+		x += zone_page_state(z, NR_FREE_PAGES) +
+		     zone_reclaimable_pages(z);
 	}
 	/*
 	 * Make sure that the number of highmem pages is never larger
@@ -418,7 +419,7 @@ unsigned long determine_dirtyable_memory(void)
 {
 	unsigned long x;

-	x = global_page_state(NR_FREE_PAGES) + global_lru_pages();
+	x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages();

 	if (!vm_highmem_is_dirtyable)
 		x -= highmem_dirtyable_memory(x);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 026f452..3768332 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1693,7 +1693,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
 			if (!cpuset_zone_allowed_hardwall(zone, GFP_KERNEL))
 				continue;

-			lru_pages += zone_lru_pages(zone);
+			lru_pages += zone_reclaimable_pages(zone);
 		}
 	}

@@ -1910,7 +1910,7 @@ loop_again:
 		for (i = 0; i <= end_zone; i++) {
 			struct zone *zone = pgdat->node_zones + i;

-			lru_pages += zone_lru_pages(zone);
+			lru_pages += zone_reclaimable_pages(zone);
 		}

 		/*
@@ -1954,7 +1954,7 @@ loop_again:
 			if (zone_is_all_unreclaimable(zone))
 				continue;
 			if (nr_slab == 0 && zone->pages_scanned >=
-						(zone_lru_pages(zone) * 6))
+					(zone_reclaimable_pages(zone) * 6))
 					zone_set_flag(zone,
 						      ZONE_ALL_UNRECLAIMABLE);
 			/*
@@ -2121,12 +2121,33 @@ void wakeup_kswapd(struct zone *zone, int order)
 	wake_up_interruptible(&pgdat->kswapd_wait);
 }

-unsigned long global_lru_pages(void)
+unsigned long global_reclaimable_pages(void)
 {
-	return global_page_state(NR_ACTIVE_ANON)
-		+ global_page_state(NR_ACTIVE_FILE)
-		+ global_page_state(NR_INACTIVE_ANON)
-		+ global_page_state(NR_INACTIVE_FILE);
+	int nr;
+
+	nr = global_page_state(NR_ACTIVE_FILE) +
+	     global_page_state(NR_INACTIVE_FILE);
+
+	if (nr_swap_pages > 0)
+		nr += global_page_state(NR_ACTIVE_ANON) +
+		      global_page_state(NR_INACTIVE_ANON);
+
+	return nr;
+}
+
+
+unsigned long zone_reclaimable_pages(struct zone *zone)
+{
+	int nr;
+
+	nr = zone_page_state(zone, NR_ACTIVE_FILE) +
+	     zone_page_state(zone, NR_INACTIVE_FILE);
+
+	if (nr_swap_pages > 0)
+		nr += zone_page_state(zone, NR_ACTIVE_ANON) +
+		      zone_page_state(zone, NR_INACTIVE_ANON);
+
+	return nr;
 }

 #ifdef CONFIG_HIBERNATION
@@ -2198,7 +2219,7 @@ unsigned long shrink_all_memory(unsigned long nr_pages)

 	current->reclaim_state = &reclaim_state;

-	lru_pages = global_lru_pages();
+	lru_pages = global_reclaimable_pages();
 	nr_slab = global_page_state(NR_SLAB_RECLAIMABLE);
 	/* If slab caches are huge, it's better to hit them first */
 	while (nr_slab >= lru_pages) {
@@ -2240,7 +2261,7 @@ unsigned long shrink_all_memory(unsigned long nr_pages)

 			reclaim_state.reclaimed_slab = 0;
 			shrink_slab(sc.nr_scanned, sc.gfp_mask,
-					global_lru_pages());
+					global_reclaimable_pages());
 			sc.nr_reclaimed += reclaim_state.reclaimed_slab;
 			if (sc.nr_reclaimed >= nr_pages)
 				goto out;
@@ -2257,7 +2278,8 @@ unsigned long shrink_all_memory(unsigned long nr_pages)
 	if (!sc.nr_reclaimed) {
 		do {
 			reclaim_state.reclaimed_slab = 0;
-			shrink_slab(nr_pages, sc.gfp_mask, global_lru_pages());
+			shrink_slab(nr_pages, sc.gfp_mask,
+					global_reclaimable_pages());
 			sc.nr_reclaimed += reclaim_state.reclaimed_slab;
 		} while (sc.nr_reclaimed < nr_pages &&
 				reclaim_state.reclaimed_slab > 0);
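
As a side note on the kswapd hunk above (@@ -1954,7 +1954,7 @@): the smaller
count also changes when a zone gets flagged ZONE_ALL_UNRECLAIMABLE. A minimal
user-space sketch of that test, assuming nothing beyond the condition visible
in the patch; zone_looks_unreclaimable() is a hypothetical name, not a kernel
function.

```c
#include <stdbool.h>

/*
 * Model of the check in balance_pgdat(): a zone is treated as
 * unreclaimable once it has been scanned at least six times over
 * without the slab shrinkers freeing anything. Hypothetical helper.
 */
static bool zone_looks_unreclaimable(unsigned long nr_slab,
                                     unsigned long pages_scanned,
                                     unsigned long reclaimable_pages)
{
    return nr_slab == 0 && pages_scanned >= reclaimable_pages * 6;
}
```

With made-up numbers, a swapless zone holding 100 file pages and 387900 anon
pages crosses the threshold after 600 scanned pages under the new count,
whereas the old count (388000 pages) would need 2328000.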

2009-06-28 16:47:38

by Minchan Kim

[permalink] [raw]
Subject: Re: Found the commit that causes the OOMs

On Sun, Jun 28, 2009 at 11:49 PM, KOSAKI
Motohiro<[email protected]> wrote:
>>> In David's OOM case, there are two symptoms:
>>> 1) 70000 unaccounted/leaked pages as found by Andrew
>>>   (plus rather big number of PG_buddy and pagetable pages)
>>> 2) almost zero active_file/inactive_file; small inactive_anon;
>>>   many slab and active_anon pages.
>>>
>>> In the situation of (2), the slab cache is _under_ scanned. So David
>>> got OOM when vmscan should have squeezed some free pages from the slab
>>> cache. Which is one important side effect of MinChan's patch?
>>
>> My patch's side effect is (2).
>>
>> My guessing is following as.
>>
>> 1. The number of page scanned in shrink_slab is increased in shrink_page_list.
>> And it is doubled for mapped page or swapcache.
>> 2. shrink_page_list is called by shrink_inactive_list
>> 3. shrink_inactive_list is called by shrink_list
>>
>> Look at the shrink_list.
>> If inactive lru list is low, it always call shrink_active_list not
>> shrink_inactive_list in case of anon.
>> It means it doesn't increased sc->nr_scanned.
>> Then shrink_slab can't shrink enough slab pages.
>> So, David OOM have a lot of slab pages and active anon pages.
>>
>> Does it make sense ?
>> If it make sense, we have to change shrink_slab's pressure method.
>> What do you think ?
>
> I'm confused.
>
> if system have no swap, get_scan_ratio() always return anon=0%.
> Then, the numver of inactive_anon is not effect to sc.nr_scanned.
>

My patch doesn't matter for that, since the total size of the anon LRU
lists (active + inactive) is always the same. I mean shrink_slab's
lru_pages is the same with or without my patch. Whether the test hits
OOM or passes depends on sc->nr_scanned, I think.

Here is why I think it is a side effect of my patch.

Compared to the old behavior, my patch can change the balancing of the
anon LRU lists when the swap file is full, as Hannes already pointed out
to me.

That can affect which anon pages are reclaimable while David runs the swap
test in LTP. When the swap file test ends, the pages on the swap file are
inserted into the anon LRU list again.

So my patch can change the physical location of anon pages in RAM compared
to the old behavior.

From then on, we have no swap file, so we can reclaim only file pages.
But we have missed one thing: lumpy reclaim! (In fact, we should not
try to reclaim anon pages when there is no swap space. A few days ago, I
sent a patch about this problem: http://patchwork.kernel.org/patch/32651/)

Lumpy reclaim can pick up anon pages even though we have no swap file.
In the end shrink_page_list can't reclaim those anon pages, but it still
increases sc->nr_scanned.

So I think whether shrink_slab can reclaim enough depends on
sc->nr_scanned.

David's problem is very subtle.

1. If lumpy reclaim picks up the anon pages, the test can pass LTP since
sc->nr_scanned is increased.
2. If lumpy reclaim doesn't pick up the anon pages, the test can hit OOM
since sc->nr_scanned is almost zero or very small.

Unfortunately, my patch seems to change the physical location of pages in
RAM compared to the old behavior, so that case 2 happens.

This is only my speculation, though.

Okay. I believe Wu's patch will solve David's problem.
David, could you test with Wu's patch?

--
Kind regards,
Minchan Kim

2009-06-28 16:50:26

by Minchan Kim

[permalink] [raw]
Subject: Re: Found the commit that causes the OOMs

Looks good.

David, can you test with this patch?

On Mon, Jun 29, 2009 at 12:10 AM, Wu Fengguang<[email protected]> wrote:
> On Sun, Jun 28, 2009 at 11:01:40PM +0800, KOSAKI Motohiro wrote:
>> > Yes, smaller inactive_anon means smaller (pointless) nr_scanned,
>> > and therefore less slab scans. Strictly speaking, it's not the fault
>> > of your patch. It indicates that the slab scan ratio algorithm should
>> > be updated too :)
>>
>> I don't think this patch is related to minchan's patch.
>> but I think this patch is good.
>
> OK.
>
>>
>> > We could refine the estimation of "reclaimable" pages like this:
>>
>> hmhm, reasonable idea.
>
> Thank you.
>
>> >
>> > diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
>> > index 416f748..e9c5b0e 100644
>> > --- a/include/linux/vmstat.h
>> > +++ b/include/linux/vmstat.h
>> > @@ -167,14 +167,7 @@ static inline unsigned long zone_page_state(struct zone *zone,
>> >  }
>> >
>> >  extern unsigned long global_lru_pages(void);
>> > -
>> > -static inline unsigned long zone_lru_pages(struct zone *zone)
>> > -{
>> > -       return (zone_page_state(zone, NR_ACTIVE_ANON)
>> > -               + zone_page_state(zone, NR_ACTIVE_FILE)
>> > -               + zone_page_state(zone, NR_INACTIVE_ANON)
>> > -               + zone_page_state(zone, NR_INACTIVE_FILE));
>> > -}
>> > +extern unsigned long zone_lru_pages(void);
>> >
>> >  #ifdef CONFIG_NUMA
>> >  /*
>> > diff --git a/mm/vmscan.c b/mm/vmscan.c
>> > index 026f452..4281c6f 100644
>> > --- a/mm/vmscan.c
>> > +++ b/mm/vmscan.c
>> > @@ -2123,10 +2123,31 @@ void wakeup_kswapd(struct zone *zone, int order)
>> >
>> >  unsigned long global_lru_pages(void)
>> >  {
>> > -       return global_page_state(NR_ACTIVE_ANON)
>> > -               + global_page_state(NR_ACTIVE_FILE)
>> > -               + global_page_state(NR_INACTIVE_ANON)
>> > -               + global_page_state(NR_INACTIVE_FILE);
>> > +       int nr;
>> > +
>> > +       nr = global_page_state(zone, NR_ACTIVE_FILE) +
>> > +            global_page_state(zone, NR_INACTIVE_FILE);
>> > +
>> > +       if (total_swap_pages)
>> > +               nr += global_page_state(zone, NR_ACTIVE_ANON) +
>> > +                     global_page_state(zone, NR_INACTIVE_ANON);
>> > +
>> > +       return nr;
>> > +}
>>
>> Please change function name too.
>> Now, this function only account reclaimable pages.
>
> Good suggestion - I did considered renaming them to *_relaimable_pages.
>
>> Plus, total_swap_pages is bad. if we need to concern "reclaimable
>> pages", we should use nr_swap_pages.
>
>> I mean, swap-full also makes anon is unreclaimable althouth system
>> have sone swap device.
>
> Right, changed to (nr_swap_pages > 0).
>
> Thanks,
> Fengguang
> ---
>
> diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
> index 416f748..8d8aa20 100644
> --- a/include/linux/vmstat.h
> +++ b/include/linux/vmstat.h
> @@ -166,15 +166,8 @@ static inline unsigned long zone_page_state(struct zone *zone,
>        return x;
>  }
>
> -extern unsigned long global_lru_pages(void);
> -
> -static inline unsigned long zone_lru_pages(struct zone *zone)
> -{
> -       return (zone_page_state(zone, NR_ACTIVE_ANON)
> -               + zone_page_state(zone, NR_ACTIVE_FILE)
> -               + zone_page_state(zone, NR_INACTIVE_ANON)
> -               + zone_page_state(zone, NR_INACTIVE_FILE));
> -}
> +extern unsigned long global_reclaimable_pages(void);
> +extern unsigned long zone_reclaimable_pages(void);
>
>  #ifdef CONFIG_NUMA
>  /*
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index a91b870..74c3067 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -394,7 +394,8 @@ static unsigned long highmem_dirtyable_memory(unsigned long total)
>                struct zone *z =
>                        &NODE_DATA(node)->node_zones[ZONE_HIGHMEM];
>
> -               x += zone_page_state(z, NR_FREE_PAGES) + zone_lru_pages(z);
> +               x += zone_page_state(z, NR_FREE_PAGES) +
> +                    zone_reclaimable_pages(z);
>        }
>        /*
>         * Make sure that the number of highmem pages is never larger
> @@ -418,7 +419,7 @@ unsigned long determine_dirtyable_memory(void)
>  {
>        unsigned long x;
>
> -       x = global_page_state(NR_FREE_PAGES) + global_lru_pages();
> +       x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages();
>
>        if (!vm_highmem_is_dirtyable)
>                x -= highmem_dirtyable_memory(x);
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 026f452..3768332 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1693,7 +1693,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
>                        if (!cpuset_zone_allowed_hardwall(zone, GFP_KERNEL))
>                                continue;
>
> -                       lru_pages += zone_lru_pages(zone);
> +                       lru_pages += zone_reclaimable_pages(zone);
>                }
>        }
>
> @@ -1910,7 +1910,7 @@ loop_again:
>                for (i = 0; i <= end_zone; i++) {
>                        struct zone *zone = pgdat->node_zones + i;
>
> -                       lru_pages += zone_lru_pages(zone);
> +                       lru_pages += zone_reclaimable_pages(zone);
>                }
>
>                /*
> @@ -1954,7 +1954,7 @@ loop_again:
>                        if (zone_is_all_unreclaimable(zone))
>                                continue;
>                        if (nr_slab == 0 && zone->pages_scanned >=
> -                                               (zone_lru_pages(zone) * 6))
> +                                       (zone_reclaimable_pages(zone) * 6))
>                                        zone_set_flag(zone,
>                                                      ZONE_ALL_UNRECLAIMABLE);
>                        /*
> @@ -2121,12 +2121,33 @@ void wakeup_kswapd(struct zone *zone, int order)
>        wake_up_interruptible(&pgdat->kswapd_wait);
>  }
>
> -unsigned long global_lru_pages(void)
> +unsigned long global_reclaimable_pages(void)
>  {
> -       return global_page_state(NR_ACTIVE_ANON)
> -               + global_page_state(NR_ACTIVE_FILE)
> -               + global_page_state(NR_INACTIVE_ANON)
> -               + global_page_state(NR_INACTIVE_FILE);
> +       int nr;
> +
> +       nr = global_page_state(zone, NR_ACTIVE_FILE) +
> +            global_page_state(zone, NR_INACTIVE_FILE);
> +
> +       if (total_swap_pages)
> +               nr += global_page_state(zone, NR_ACTIVE_ANON) +
> +                     global_page_state(zone, NR_INACTIVE_ANON);
> +
> +       return nr;
> +}
> +
> +
> +unsigned long zone_reclaimable_pages(struct zone *zone)
> +{
> +       int nr;
> +
> +       nr = zone_page_state(zone, NR_ACTIVE_FILE) +
> +            zone_page_state(zone, NR_INACTIVE_FILE);
> +
> +       if (nr_swap_pages > 0)
> +               nr += zone_page_state(zone, NR_ACTIVE_ANON) +
> +                     zone_page_state(zone, NR_INACTIVE_ANON);
> +
> +       return nr;
>  }
>
>  #ifdef CONFIG_HIBERNATION
> @@ -2198,7 +2219,7 @@ unsigned long shrink_all_memory(unsigned long nr_pages)
>
>        current->reclaim_state = &reclaim_state;
>
> -       lru_pages = global_lru_pages();
> +       lru_pages = global_reclaimable_pages();
>        nr_slab = global_page_state(NR_SLAB_RECLAIMABLE);
>        /* If slab caches are huge, it's better to hit them first */
>        while (nr_slab >= lru_pages) {
> @@ -2240,7 +2261,7 @@ unsigned long shrink_all_memory(unsigned long nr_pages)
>
>                        reclaim_state.reclaimed_slab = 0;
>                        shrink_slab(sc.nr_scanned, sc.gfp_mask,
> -                                       global_lru_pages());
> +                                   global_reclaimable_pages());
>                        sc.nr_reclaimed += reclaim_state.reclaimed_slab;
>                        if (sc.nr_reclaimed >= nr_pages)
>                                goto out;
> @@ -2257,7 +2278,8 @@ unsigned long shrink_all_memory(unsigned long nr_pages)
>        if (!sc.nr_reclaimed) {
>                do {
>                        reclaim_state.reclaimed_slab = 0;
> -                       shrink_slab(nr_pages, sc.gfp_mask, global_lru_pages());
> +                       shrink_slab(nr_pages, sc.gfp_mask,
> +                                   global_reclaimable_pages());
>                        sc.nr_reclaimed += reclaim_state.reclaimed_slab;
>                } while (sc.nr_reclaimed < nr_pages &&
>                                reclaim_state.reclaimed_slab > 0);
>



--
Kind regards,
Minchan Kim

2009-06-28 16:53:38

by Minchan Kim

[permalink] [raw]
Subject: Re: Found the commit that causes the OOMs

On Sun, Jun 28, 2009 at 12:36 AM, Johannes Weiner<[email protected]> wrote:
> On Sat, Jun 27, 2009 at 10:50:25PM +0900, Minchan Kim wrote:
>> Hi, Hannes.
>>
>> On Sat, Jun 27, 2009 at 9:54 PM, Johannes Weiner<[email protected]> wrote:
>> > On Sat, Jun 27, 2009 at 08:12:49AM +0100, David Howells wrote:
>> >>
>> >> I've managed to bisect things to find the commit that causes the OOMs.  It's:
>> >>
>> >>       commit 69c854817566db82c362797b4a6521d0b00fe1d8
>> >>       Author: MinChan Kim <[email protected]>
>> >>       Date:   Tue Jun 16 15:32:44 2009 -0700
>> >>
>> >>           vmscan: prevent shrinking of active anon lru list in case of no swap space V3
>> >>
>> >>           shrink_zone() can deactivate active anon pages even if we don't have a
>> >>           swap device.  Many embedded products don't have a swap device.  So the
>> >>           deactivation of anon pages is unnecessary.
>> >>
>> >>           This patch prevents unnecessary deactivation of anon lru pages.  But, it
>> >>           don't prevent aging of anon pages to swap out.
>> >>
>> >>           Signed-off-by: Minchan Kim <[email protected]>
>> >>           Acked-by: KOSAKI Motohiro <[email protected]>
>> >>           Cc: Johannes Weiner <[email protected]>
>> >>           Acked-by: Rik van Riel <[email protected]>
>> >>           Signed-off-by: Andrew Morton <[email protected]>
>> >>           Signed-off-by: Linus Torvalds <[email protected]>
>> >>
>> >> This exhibits the problem.  The previous commit:
>> >>
>> >>       commit 35282a2de4e5e4e173ab61aa9d7015886021a821
>> >>       Author: Brice Goglin <[email protected]>
>> >>       Date:   Tue Jun 16 15:32:43 2009 -0700
>> >>
>> >>           migration: only migrate_prep() once per move_pages()
>> >>
>> >> survives 16 iterations of the LTP syscall testsuite without exhibiting the
>> >> problem.
>> >
>> > Here is the patch in question:
>> >
>> > diff --git a/mm/vmscan.c b/mm/vmscan.c
>> > index 7592d8e..879d034 100644
>> > --- a/mm/vmscan.c
>> > +++ b/mm/vmscan.c
>> > @@ -1570,7 +1570,7 @@ static void shrink_zone(int priority, struct zone *zone,
>> >         * Even if we did not try to evict anon pages at all, we want to
>> >         * rebalance the anon lru active/inactive ratio.
>> >         */
>> > -       if (inactive_anon_is_low(zone, sc))
>> > +       if (inactive_anon_is_low(zone, sc) && nr_swap_pages > 0)
>> >                shrink_active_list(SWAP_CLUSTER_MAX, zone, sc, priority, 0);
>> >
>> >        throttle_vm_writeout(sc->gfp_mask);
>> >
>> > When this was discussed, I think we missed that nr_swap_pages can
>> > actually get zero on swap systems as well and this should have been
>> > total_swap_pages - otherwise we also stop balancing the two anon lists
>> > when swap is _full_ which was not the intention of this change at all.
>>
>> At that time we considered it so that we didn't prevent anon list
>> aging for background reclaim.
>> Do you think it is not enough ?
>
> With a heavy multiprocess anon load, direct reclaimers will likely
> reuse the reclaimed pages for anon mappings, so you have a handful of
> processes shuffling pages on the active list and only one thread that
> tries to balance.  I can imagine that it can not keep up for long.

I agree. :)
total_swap_pages is better than nr_swap_pages here, although it isn't
related to this problem.
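
To make the distinction between the two counters concrete, here is a minimal user-space sketch. The struct and the two helper names are hypothetical stand-ins invented for illustration; only the names total_swap_pages and nr_swap_pages come from the kernel. It shows why the two tests disagree exactly when swap is configured but full.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-in for the kernel's two swap counters. */
struct swap_state {
	unsigned long total_swap_pages;	/* configured swap capacity, in pages */
	long nr_swap_pages;		/* swap slots that are still free */
};

/* "Is any swap configured at all?"  Stays true even when swap is full. */
bool swap_configured(const struct swap_state *s)
{
	return s->total_swap_pages != 0;
}

/* "Could an anon page be swapped out right now?"  False when swap is full. */
bool swap_has_free_slots(const struct swap_state *s)
{
	return s->nr_swap_pages > 0;
}
```

For anon list balancing the first test is the one argued for above, because a full swap device may still drain later. For estimating how many pages are reclaimable at this moment, the second test is the stricter one.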


>



--
Kinds regards,
Minchan Kim

2009-06-29 00:21:49

by Minchan Kim

[permalink] [raw]
Subject: Re: Found the commit that causes the OOMs

On Sun, 28 Jun 2009 23:10:26 +0800
Wu Fengguang <[email protected]> wrote:

> On Sun, Jun 28, 2009 at 11:01:40PM +0800, KOSAKI Motohiro wrote:
> > > Yes, smaller inactive_anon means smaller (pointless) nr_scanned,
> > > and therefore less slab scans. Strictly speaking, it's not the fault
> > > of your patch. It indicates that the slab scan ratio algorithm should
> > > be updated too :)
> >
> > I don't think this patch is related to minchan's patch.
> > but I think this patch is good.
>
> OK.
>
> >
> > > We could refine the estimation of "reclaimable" pages like this:
> >
> > hmhm, reasonable idea.
>
> Thank you.
>
> > >
> > > diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
> > > index 416f748..e9c5b0e 100644
> > > --- a/include/linux/vmstat.h
> > > +++ b/include/linux/vmstat.h
> > > @@ -167,14 +167,7 @@ static inline unsigned long zone_page_state(struct zone *zone,
> > >  }
> > >
> > >  extern unsigned long global_lru_pages(void);
> > > -
> > > -static inline unsigned long zone_lru_pages(struct zone *zone)
> > > -{
> > > -       return (zone_page_state(zone, NR_ACTIVE_ANON)
> > > -               + zone_page_state(zone, NR_ACTIVE_FILE)
> > > -               + zone_page_state(zone, NR_INACTIVE_ANON)
> > > -               + zone_page_state(zone, NR_INACTIVE_FILE));
> > > -}
> > > +extern unsigned long zone_lru_pages(void);
> > >
> > >  #ifdef CONFIG_NUMA
> > >  /*
> > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > index 026f452..4281c6f 100644
> > > --- a/mm/vmscan.c
> > > +++ b/mm/vmscan.c
> > > @@ -2123,10 +2123,31 @@ void wakeup_kswapd(struct zone *zone, int order)
> > >
> > >  unsigned long global_lru_pages(void)
> > >  {
> > > -       return global_page_state(NR_ACTIVE_ANON)
> > > -               + global_page_state(NR_ACTIVE_FILE)
> > > -               + global_page_state(NR_INACTIVE_ANON)
> > > -               + global_page_state(NR_INACTIVE_FILE);
> > > +       int nr;
> > > +
> > > +       nr = global_page_state(zone, NR_ACTIVE_FILE) +
> > > +            global_page_state(zone, NR_INACTIVE_FILE);
> > > +
> > > +       if (total_swap_pages)
> > > +               nr += global_page_state(zone, NR_ACTIVE_ANON) +
> > > +                     global_page_state(zone, NR_INACTIVE_ANON);
> > > +
> > > +       return nr;
> > > +}
> >
> > Please change function name too.
> > Now, this function only account reclaimable pages.
>
> Good suggestion - I did considered renaming them to *_relaimable_pages.
>
> > Plus, total_swap_pages is bad. if we need to concern "reclaimable
> > pages", we should use nr_swap_pages.
>
> > I mean, swap-full also makes anon is unreclaimable althouth system
> > have sone swap device.
>
> Right, changed to (nr_swap_pages > 0).
>
> Thanks,
> Fengguang
> ---
>
> diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
> index 416f748..8d8aa20 100644
> --- a/include/linux/vmstat.h
> +++ b/include/linux/vmstat.h
> @@ -166,15 +166,8 @@ static inline unsigned long zone_page_state(struct zone *zone,
> return x;
> }
>
> -extern unsigned long global_lru_pages(void);
> -
> -static inline unsigned long zone_lru_pages(struct zone *zone)
> -{
> - return (zone_page_state(zone, NR_ACTIVE_ANON)
> - + zone_page_state(zone, NR_ACTIVE_FILE)
> - + zone_page_state(zone, NR_INACTIVE_ANON)
> - + zone_page_state(zone, NR_INACTIVE_FILE));
> -}
> +extern unsigned long global_reclaimable_pages(void);
> +extern unsigned long zone_reclaimable_pages(void);
>
> #ifdef CONFIG_NUMA
> /*
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index a91b870..74c3067 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -394,7 +394,8 @@ static unsigned long highmem_dirtyable_memory(unsigned long total)
> struct zone *z =
> &NODE_DATA(node)->node_zones[ZONE_HIGHMEM];
>
> - x += zone_page_state(z, NR_FREE_PAGES) + zone_lru_pages(z);
> + x += zone_page_state(z, NR_FREE_PAGES) +
> + zone_reclaimable_pages(z);
> }
> /*
> * Make sure that the number of highmem pages is never larger
> @@ -418,7 +419,7 @@ unsigned long determine_dirtyable_memory(void)
> {
> unsigned long x;
>
> - x = global_page_state(NR_FREE_PAGES) + global_lru_pages();
> + x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages();
>
> if (!vm_highmem_is_dirtyable)
> x -= highmem_dirtyable_memory(x);
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 026f452..3768332 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1693,7 +1693,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
> if (!cpuset_zone_allowed_hardwall(zone, GFP_KERNEL))
> continue;
>
> - lru_pages += zone_lru_pages(zone);
> + lru_pages += zone_reclaimable_pages(zone);
> }
> }
>
> @@ -1910,7 +1910,7 @@ loop_again:
> for (i = 0; i <= end_zone; i++) {
> struct zone *zone = pgdat->node_zones + i;
>
> - lru_pages += zone_lru_pages(zone);
> + lru_pages += zone_reclaimable_pages(zone);
> }
>
> /*
> @@ -1954,7 +1954,7 @@ loop_again:
> if (zone_is_all_unreclaimable(zone))
> continue;
> if (nr_slab == 0 && zone->pages_scanned >=
> - (zone_lru_pages(zone) * 6))
> + (zone_reclaimable_pages(zone) * 6))
> zone_set_flag(zone,
> ZONE_ALL_UNRECLAIMABLE);
> /*
> @@ -2121,12 +2121,33 @@ void wakeup_kswapd(struct zone *zone, int order)
> wake_up_interruptible(&pgdat->kswapd_wait);
> }
>
> -unsigned long global_lru_pages(void)
> +unsigned long global_reclaimable_pages(void)
> {
> - return global_page_state(NR_ACTIVE_ANON)
> - + global_page_state(NR_ACTIVE_FILE)
> - + global_page_state(NR_INACTIVE_ANON)
> - + global_page_state(NR_INACTIVE_FILE);
> + int nr;
> +
> + nr = global_page_state(zone, NR_ACTIVE_FILE) +
> + global_page_state(zone, NR_INACTIVE_FILE);
> +
> + if (total_swap_pages)


Don't we have to change total_swap_pages to nr_swap_pages here, too?

> + nr += global_page_state(zone, NR_ACTIVE_ANON) +
> + global_page_state(zone, NR_INACTIVE_ANON);
> +
> + return nr;
> +}
> +
> +
> +unsigned long zone_reclaimable_pages(struct zone *zone)
> +{
> + int nr;
> +
> + nr = zone_page_state(zone, NR_ACTIVE_FILE) +
> + zone_page_state(zone, NR_INACTIVE_FILE);
> +
> + if (nr_swap_pages > 0)
> + nr += zone_page_state(zone, NR_ACTIVE_ANON) +
> + zone_page_state(zone, NR_INACTIVE_ANON);
> +
> + return nr;
> }
>
> #ifdef CONFIG_HIBERNATION
> @@ -2198,7 +2219,7 @@ unsigned long shrink_all_memory(unsigned long nr_pages)
>
> current->reclaim_state = &reclaim_state;
>
> - lru_pages = global_lru_pages();
> + lru_pages = global_reclaimable_pages();
> nr_slab = global_page_state(NR_SLAB_RECLAIMABLE);
> /* If slab caches are huge, it's better to hit them first */
> while (nr_slab >= lru_pages) {
> @@ -2240,7 +2261,7 @@ unsigned long shrink_all_memory(unsigned long nr_pages)
>
> reclaim_state.reclaimed_slab = 0;
> shrink_slab(sc.nr_scanned, sc.gfp_mask,
> - global_lru_pages());
> + global_reclaimable_pages());
> sc.nr_reclaimed += reclaim_state.reclaimed_slab;
> if (sc.nr_reclaimed >= nr_pages)
> goto out;
> @@ -2257,7 +2278,8 @@ unsigned long shrink_all_memory(unsigned long nr_pages)
> if (!sc.nr_reclaimed) {
> do {
> reclaim_state.reclaimed_slab = 0;
> - shrink_slab(nr_pages, sc.gfp_mask, global_lru_pages());
> + shrink_slab(nr_pages, sc.gfp_mask,
> + global_reclaimable_pages());
> sc.nr_reclaimed += reclaim_state.reclaimed_slab;
> } while (sc.nr_reclaimed < nr_pages &&
> reclaim_state.reclaimed_slab > 0);


--
Kind regards,
Minchan Kim

2009-06-29 07:36:00

by Fengguang Wu

[permalink] [raw]
Subject: Re: Found the commit that causes the OOMs

On Mon, Jun 29, 2009 at 08:17:41AM +0800, Minchan Kim wrote:
> On Sun, 28 Jun 2009 23:10:26 +0800
> Wu Fengguang <[email protected]> wrote:
> > +unsigned long global_reclaimable_pages(void)
> > {
> > - return global_page_state(NR_ACTIVE_ANON)
> > - + global_page_state(NR_ACTIVE_FILE)
> > - + global_page_state(NR_INACTIVE_ANON)
> > - + global_page_state(NR_INACTIVE_FILE);
> > + int nr;
> > +
> > + nr = global_page_state(zone, NR_ACTIVE_FILE) +
> > + global_page_state(zone, NR_INACTIVE_FILE);
> > +
> > + if (total_swap_pages)
>
>
> Don't we have to change total_swap_pages to nr_swap_pages here, too?

Yes, good catch! (sorry I was in a hurry at the time..)

Thanks,
Fengguang

---

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 416f748..8d8aa20 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -166,15 +166,8 @@ static inline unsigned long zone_page_state(struct zone *zone,
        return x;
 }

-extern unsigned long global_lru_pages(void);
-
-static inline unsigned long zone_lru_pages(struct zone *zone)
-{
-       return (zone_page_state(zone, NR_ACTIVE_ANON)
-               + zone_page_state(zone, NR_ACTIVE_FILE)
-               + zone_page_state(zone, NR_INACTIVE_ANON)
-               + zone_page_state(zone, NR_INACTIVE_FILE));
-}
+extern unsigned long global_reclaimable_pages(void);
+extern unsigned long zone_reclaimable_pages(void);

 #ifdef CONFIG_NUMA
 /*
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index a91b870..74c3067 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -394,7 +394,8 @@ static unsigned long highmem_dirtyable_memory(unsigned long total)
                struct zone *z =
                        &NODE_DATA(node)->node_zones[ZONE_HIGHMEM];

-               x += zone_page_state(z, NR_FREE_PAGES) + zone_lru_pages(z);
+               x += zone_page_state(z, NR_FREE_PAGES) +
+                    zone_reclaimable_pages(z);
        }
        /*
         * Make sure that the number of highmem pages is never larger
@@ -418,7 +419,7 @@ unsigned long determine_dirtyable_memory(void)
 {
        unsigned long x;

-       x = global_page_state(NR_FREE_PAGES) + global_lru_pages();
+       x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages();

        if (!vm_highmem_is_dirtyable)
                x -= highmem_dirtyable_memory(x);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 026f452..09976da 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1693,7 +1693,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
                        if (!cpuset_zone_allowed_hardwall(zone, GFP_KERNEL))
                                continue;

-                       lru_pages += zone_lru_pages(zone);
+                       lru_pages += zone_reclaimable_pages(zone);
                }
        }

@@ -1910,7 +1910,7 @@ loop_again:
                for (i = 0; i <= end_zone; i++) {
                        struct zone *zone = pgdat->node_zones + i;

-                       lru_pages += zone_lru_pages(zone);
+                       lru_pages += zone_reclaimable_pages(zone);
                }

                /*
@@ -1954,7 +1954,7 @@ loop_again:
                        if (zone_is_all_unreclaimable(zone))
                                continue;
                        if (nr_slab == 0 && zone->pages_scanned >=
-                                               (zone_lru_pages(zone) * 6))
+                                       (zone_reclaimable_pages(zone) * 6))
                                        zone_set_flag(zone,
                                                      ZONE_ALL_UNRECLAIMABLE);
                        /*
@@ -2121,12 +2121,33 @@ void wakeup_kswapd(struct zone *zone, int order)
        wake_up_interruptible(&pgdat->kswapd_wait);
 }

-unsigned long global_lru_pages(void)
+unsigned long global_reclaimable_pages(void)
 {
-       return global_page_state(NR_ACTIVE_ANON)
-               + global_page_state(NR_ACTIVE_FILE)
-               + global_page_state(NR_INACTIVE_ANON)
-               + global_page_state(NR_INACTIVE_FILE);
+       int nr;
+
+       nr = global_page_state(zone, NR_ACTIVE_FILE) +
+            global_page_state(zone, NR_INACTIVE_FILE);
+
+       if (nr_swap_pages > 0)
+               nr += global_page_state(zone, NR_ACTIVE_ANON) +
+                     global_page_state(zone, NR_INACTIVE_ANON);
+
+       return nr;
+}
+
+
+unsigned long zone_reclaimable_pages(struct zone *zone)
+{
+       int nr;
+
+       nr = zone_page_state(zone, NR_ACTIVE_FILE) +
+            zone_page_state(zone, NR_INACTIVE_FILE);
+
+       if (nr_swap_pages > 0)
+               nr += zone_page_state(zone, NR_ACTIVE_ANON) +
+                     zone_page_state(zone, NR_INACTIVE_ANON);
+
+       return nr;
 }

 #ifdef CONFIG_HIBERNATION
@@ -2198,7 +2219,7 @@ unsigned long shrink_all_memory(unsigned long nr_pages)

        current->reclaim_state = &reclaim_state;

-       lru_pages = global_lru_pages();
+       lru_pages = global_reclaimable_pages();
        nr_slab = global_page_state(NR_SLAB_RECLAIMABLE);
        /* If slab caches are huge, it's better to hit them first */
        while (nr_slab >= lru_pages) {
@@ -2240,7 +2261,7 @@ unsigned long shrink_all_memory(unsigned long nr_pages)

                        reclaim_state.reclaimed_slab = 0;
                        shrink_slab(sc.nr_scanned, sc.gfp_mask,
-                                       global_lru_pages());
+                                   global_reclaimable_pages());
                        sc.nr_reclaimed += reclaim_state.reclaimed_slab;
                        if (sc.nr_reclaimed >= nr_pages)
                                goto out;
@@ -2257,7 +2278,8 @@ unsigned long shrink_all_memory(unsigned long nr_pages)
        if (!sc.nr_reclaimed) {
                do {
                        reclaim_state.reclaimed_slab = 0;
-                       shrink_slab(nr_pages, sc.gfp_mask, global_lru_pages());
+                       shrink_slab(nr_pages, sc.gfp_mask,
+                                   global_reclaimable_pages());
                        sc.nr_reclaimed += reclaim_state.reclaimed_slab;
                } while (sc.nr_reclaimed < nr_pages &&
                                reclaim_state.reclaimed_slab > 0);

2009-06-29 07:48:20

by KOSAKI Motohiro

[permalink] [raw]
Subject: Re: Found the commit that causes the OOMs

2009/6/29 Minchan Kim <[email protected]>:
> On Sun, Jun 28, 2009 at 11:49 PM, KOSAKI
> Motohiro<[email protected]> wrote:
>>>> In David's OOM case, there are two symptoms:
>>>> 1) 70000 unaccounted/leaked pages as found by Andrew
>>>>   (plus rather big number of PG_buddy and pagetable pages)
>>>> 2) almost zero active_file/inactive_file; small inactive_anon;
>>>>   many slab and active_anon pages.
>>>>
>>>> In the situation of (2), the slab cache is _under_ scanned. So David
>>>> got OOM when vmscan should have squeezed some free pages from the slab
>>>> cache. Which is one important side effect of MinChan's patch?
>>>
>>> My patch's side effect is (2).
>>>
>>> My guessing is following as.
>>>
>>> 1. The number of page scanned in shrink_slab is increased in shrink_page_list.
>>> And it is doubled for mapped page or swapcache.
>>> 2. shrink_page_list is called by shrink_inactive_list
>>> 3. shrink_inactive_list is called by shrink_list
>>>
>>> Look at the shrink_list.
>>> If inactive lru list is low, it always call shrink_active_list not
>>> shrink_inactive_list in case of anon.
>>> It means it doesn't increased sc->nr_scanned.
>>> Then shrink_slab can't shrink enough slab pages.
>>> So, David OOM have a lot of slab pages and active anon pages.
>>>
>>> Does it make sense ?
>>> If it make sense, we have to change shrink_slab's pressure method.
>>> What do you think ?
>>
>> I'm confused.
>>
>> if system have no swap, get_scan_ratio() always return anon=0%.
>> Then, the numver of inactive_anon is not effect to sc.nr_scanned.
>>
>
> My patch isn't a concern since the number of anon lru list(active +
> anon) always same.  I mean shrink_slab's lru_pages is same whether my
> patch there is.  OOM or Pass depends on sc->nr_scanned, I think.
>
> Why I think it is my patch's side effect is follow as.
>
> Compared to old behavior, my patch can change balancing of anon lru
> list when "swap file" is full as Hannes already pointed me out.
>
> It can affect reclaimable anon pages while David is going on swap test on LTP.
> When swap file test is end, pages on swap file is inserted anon lru list, again.
>
> My patch can change physical location of anon pages on ram compared to old.

No.
shrink_active_list() doesn't change a page's physical address.


> From now on, we have no swap file so that we can reclaim only file pages.
> But we have missed one thing. lumpy reclaim!. (In fact, we should not
> reclaim anon pages in no swap space. A few days ago, I sended patch
> about this problem. http://patchwork.kernel.org/patch/32651/)
>
> It can reclaim anon pages although we have no swap file.
> But after all, shrink_page_list can't reclaim anon pages.  But it
> increases sc->nr_scanned.
>
> So I think whether Shrink_slab can reclaim enough or not depends on
> sc->nr_scanned.
>
> David's problem is very subtle.
>
> 1. If lumpy picks up the anon pages, it can pass LTP since
> sc->nr_scanned is increased.
> 2. If lumpy don't pick up the anon pages, it can meet OOM since
> sc->nr_scanned is almost zero or very small.
>
> Unfortunately, my patch seems to change physical location of pages on
> ram compared to old so that it selects 2.
>
> It's my imaginary novel.
>
> Okay. I believe Wu's patch will solve David's problem.
> David. Could you test with Wu's patch ?

However, lumpy reclaim is a good viewpoint.
KAMEZAWA-san recently fixed one serious lumpy reclaim problem: since
2.6.28, lumpy reclaim could insert file-mapped pages into the anon LRU list,
after which those pages could no longer be reclaimed.

David, can you please try the following patch? It was posted to LKML
about 1-2 weeks ago.

Subject "[BUGFIX][PATCH] fix lumpy reclaim lru handiling at
isolate_lru_pages v2"
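
Since much of the discussion above hinges on how sc->nr_scanned feeds slab pressure, a simplified user-space model of the per-shrinker scan target may help. It assumes the proportional formula used by shrink_slab() in kernels of this era (delta = 4 * scanned / seeks * max_pass / (lru_pages + 1)). The function name slab_scan_delta and its standalone form are illustrative only, and the real function also batches the work and accumulates it in shrinker->nr.

```c
#include <stdint.h>

#define SWAP_CLUSTER_MAX 32UL

/*
 * Simplified model of how many slab objects one shrinker is asked to scan.
 *
 *   scanned:   LRU pages scanned by this reclaim pass (sc->nr_scanned)
 *   lru_pages: the LRU size estimate passed to shrink_slab()
 *   max_pass:  number of objects currently held in the cache
 *   seeks:     relative cost of recreating an object (DEFAULT_SEEKS is 2)
 *
 * Assumption: this mirrors only the proportional formula, nothing else.
 */
uint64_t slab_scan_delta(unsigned long scanned, unsigned long lru_pages,
			 unsigned long max_pass, unsigned int seeks)
{
	uint64_t delta;

	if (scanned == 0)
		scanned = SWAP_CLUSTER_MAX;

	delta = (4ULL * scanned) / seeks;
	delta *= max_pass;
	delta /= (uint64_t)lru_pages + 1;
	return delta;
}
```

With 10000 cached objects and lru_pages = 9999, scanning 1000 LRU pages asks for 2000 objects while scanning 10 asks for only 20. A tiny nr_scanned set against an lru_pages figure inflated by unreclaimable anon pages therefore leaves the slab almost untouched, which is the motivation for counting only reclaimable pages.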

2009-06-29 09:33:42

by Minchan Kim

[permalink] [raw]
Subject: Re: Found the commit that causes the OOMs

On Mon, 29 Jun 2009 16:48:13 +0900
KOSAKI Motohiro <[email protected]> wrote:

> 2009/6/29 Minchan Kim <[email protected]>:
> > On Sun, Jun 28, 2009 at 11:49 PM, KOSAKI
> > Motohiro<[email protected]> wrote:
> >>>> In David's OOM case, there are two symptoms:
> >>>> 1) 70000 unaccounted/leaked pages as found by Andrew
> >>>>   (plus rather big number of PG_buddy and pagetable pages)
> >>>> 2) almost zero active_file/inactive_file; small inactive_anon;
> >>>>   many slab and active_anon pages.
> >>>>
> >>>> In the situation of (2), the slab cache is _under_ scanned. So David
> >>>> got OOM when vmscan should have squeezed some free pages from the slab
> >>>> cache. Which is one important side effect of MinChan's patch?
> >>>
> >>> My patch's side effect is (2).
> >>>
> >>> My guessing is following as.
> >>>
> >>> 1. The number of page scanned in shrink_slab is increased in shrink_page_list.
> >>> And it is doubled for mapped page or swapcache.
> >>> 2. shrink_page_list is called by shrink_inactive_list
> >>> 3. shrink_inactive_list is called by shrink_list
> >>>
> >>> Look at the shrink_list.
> >>> If inactive lru list is low, it always call shrink_active_list not
> >>> shrink_inactive_list in case of anon.
> >>> It means it doesn't increased sc->nr_scanned.
> >>> Then shrink_slab can't shrink enough slab pages.
> >>> So, David OOM have a lot of slab pages and active anon pages.
> >>>
> >>> Does it make sense ?
> >>> If it make sense, we have to change shrink_slab's pressure method.
> >>> What do you think ?
> >>
> >> I'm confused.
> >>
> >> if system have no swap, get_scan_ratio() always return anon=0%.
> >> Then, the numver of inactive_anon is not effect to sc.nr_scanned.
> >>
> >
> > My patch isn't a concern since the number of anon lru list(active +
> > anon) always same.  I mean shrink_slab's lru_pages is same whether my
> > patch there is.  OOM or Pass depends on sc->nr_scanned, I think.
> >
> > Why I think it is my patch's side effect is follow as.
> >
> > Compared to old behavior, my patch can change balancing of anon lru
> > list when "swap file" is full as Hannes already pointed me out.
> >
> > It can affect reclaimable anon pages while David is going on swap test on LTP.
> > When swap file test is end, pages on swap file is inserted anon lru list, again.
> >
> > My patch can change physical location of anon pages on ram compared to old.
>
> No.
> shrink_active_list() doesn't change a page's physical address.

Sorry for the misunderstanding.
What I mean is the following:

1. David runs the swapfile test in LTP.
2. While it is running, the swap file becomes full.
(My patch didn't consider this case: it stops aging anon pages here,
so it can change the pattern of which pages get swapped out.)
3. The swapfile test finishes successfully.
4. Anon pages in the swap file are loaded back into DRAM from the HDD or whatever the swap device is.

In step 4, when anon pages are swapped in, we have to allocate a new page to copy the swapped data into.
So it could change the pages' physical location.
That, in turn, can affect lumpy reclaim. :)

>
> > From now on, we have no swap file so that we can reclaim only file pages.
> > But we have missed one thing. lumpy reclaim!. (In fact, we should not
> > reclaim anon pages in no swap space. A few days ago, I sended patch
> > about this problem. http://patchwork.kernel.org/patch/32651/)
> >
> > It can reclaim anon pages although we have no swap file.
> > But after all, shrink_page_list can't reclaim anon pages.  But it
> > increases sc->nr_scanned.
> >
> > So I think whether Shrink_slab can reclaim enough or not depends on
> > sc->nr_scanned.
> >
> > David's problem is very subtle.
> >
> > 1. If lumpy picks up the anon pages, it can pass LTP since
> > sc->nr_scanned is increased.
> > 2. If lumpy don't pick up the anon pages, it can meet OOM since
> > sc->nr_scanned is almost zero or very small.
> >
> > Unfortunately, my patch seems to change physical location of pages on
> > ram compared to old so that it selects 2.
> >
> > It's my imaginary novel.
> >
> > Okay. I believe Wu's patch will solve David's problem.
> > David. Could you test with Wu's patch ?
>
> However, lumpy reclaim is a good viewpoint.
> KAMEZAWA-san recently fixed one serious lumpy reclaim problem: since
> 2.6.28, lumpy reclaim could insert file-mapped pages into the anon LRU list,
> after which those pages could no longer be reclaimed.

Yes, that is another possibility.
But then I have a question: why didn't it happen without my patch?
That is, why does my patch trigger the OOM with such high probability?

> David, can you please try the following patch? It was posted to LKML
> about 1-2 weeks ago.
>
> Subject "[BUGFIX][PATCH] fix lumpy reclaim lru handiling at
> isolate_lru_pages v2"


--
Kind regards,
Minchan Kim

2009-06-29 10:11:18

by David Howells

[permalink] [raw]
Subject: Re: Found the commit that causes the OOMs

Wu Fengguang <[email protected]> wrote:

> Yes, good catch! (sorry I was in a hurry at the time..)

That doesn't compile:

mm/vmscan.c: In function 'do_try_to_free_pages':
mm/vmscan.c:1683: error: too many arguments to function 'zone_reclaimable_pages'
mm/vmscan.c: In function 'balance_pgdat':
mm/vmscan.c:1900: error: too many arguments to function 'zone_reclaimable_pages'
mm/vmscan.c:1944: error: too many arguments to function 'zone_reclaimable_pages'
mm/vmscan.c: In function 'global_reclaimable_pages':
mm/vmscan.c:2115: error: 'zone' undeclared (first use in this function)
mm/vmscan.c:2115: error: (Each undeclared identifier is reported only once
mm/vmscan.c:2115: error: for each function it appears in.)
mm/vmscan.c:2115: error: too many arguments to function 'global_page_state'
mm/vmscan.c:2116: error: too many arguments to function 'global_page_state'
mm/vmscan.c:2119: error: too many arguments to function 'global_page_state'
mm/vmscan.c:2120: error: too many arguments to function 'global_page_state'
mm/vmscan.c: At top level:
mm/vmscan.c:2126: error: conflicting types for 'zone_reclaimable_pages'
include/linux/vmstat.h:170: note: previous declaration of 'zone_reclaimable_pages' was here
make[1]: *** [mm/vmscan.o] Error 1

David
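
The errors above point at two slips in the posted patch. The header declares zone_reclaimable_pages(void) although the definition and every caller pass a struct zone *, and global_reclaimable_pages() passes an undeclared zone to global_page_state(), which takes only the counter. Below is a minimal user-space sketch of what the two helpers presumably intended. The struct zone layout, the counter arrays and both page-state accessors are hypothetical mock-ups so that the fragment compiles outside the kernel, and widening nr from int to unsigned long to match the return type is an assumption that is not part of the posted patch.

```c
/* Hypothetical mock-ups of the kernel's per-zone and global LRU counters. */
enum lru_counter {
	NR_INACTIVE_ANON,
	NR_ACTIVE_ANON,
	NR_INACTIVE_FILE,
	NR_ACTIVE_FILE,
	NR_LRU_COUNTERS
};

struct zone {
	unsigned long vm_stat[NR_LRU_COUNTERS];
};

unsigned long global_stat[NR_LRU_COUNTERS];
long nr_swap_pages;			/* free swap slots */

unsigned long zone_page_state(struct zone *zone, enum lru_counter item)
{
	return zone->vm_stat[item];
}

unsigned long global_page_state(enum lru_counter item)
{
	return global_stat[item];
}

/* Prototypes as the header would need them: the zone variant takes a zone. */
unsigned long global_reclaimable_pages(void);
unsigned long zone_reclaimable_pages(struct zone *zone);

unsigned long global_reclaimable_pages(void)
{
	unsigned long nr;

	/* global_page_state() takes only the counter, no zone argument */
	nr = global_page_state(NR_ACTIVE_FILE) +
	     global_page_state(NR_INACTIVE_FILE);

	/* anon pages count as reclaimable only while swap slots remain */
	if (nr_swap_pages > 0)
		nr += global_page_state(NR_ACTIVE_ANON) +
		      global_page_state(NR_INACTIVE_ANON);

	return nr;
}

unsigned long zone_reclaimable_pages(struct zone *zone)
{
	unsigned long nr;

	nr = zone_page_state(zone, NR_ACTIVE_FILE) +
	     zone_page_state(zone, NR_INACTIVE_FILE);

	if (nr_swap_pages > 0)
		nr += zone_page_state(zone, NR_ACTIVE_ANON) +
		      zone_page_state(zone, NR_INACTIVE_ANON);

	return nr;
}
```

On a swapless box with 70477 active and 4514 inactive anon pages but only 8 file pages, the zone variant reports 8 reclaimable pages instead of 74999.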

2009-06-29 12:44:31

by David Howells

[permalink] [raw]
Subject: Re: Found the commit that causes the OOMs

KOSAKI Motohiro <[email protected]> wrote:

> David, can you please try the following patch? It was posted to LKML
> about 1-2 weeks ago.
>
> Subject "[BUGFIX][PATCH] fix lumpy reclaim lru handiling at
> isolate_lru_pages v2"

It is already committed, but I ran a test on the latest Linus kernel anyway:

msgctl11 invoked oom-killer: gfp_mask=0xd0, order=1, oom_adj=0
msgctl11 cpuset=/ mems_allowed=0
Pid: 20366, comm: msgctl11 Not tainted 2.6.31-rc1-cachefs #144
Call Trace:
[<ffffffff810718d2>] ? oom_kill_process.clone.0+0xa9/0x245
[<ffffffff81071b99>] ? __out_of_memory+0x12b/0x142
[<ffffffff81071c1a>] ? out_of_memory+0x6a/0x94
[<ffffffff810742e4>] ? __alloc_pages_nodemask+0x42e/0x51d
[<ffffffff81031416>] ? copy_process+0x95/0x114f
[<ffffffff8107443c>] ? __get_free_pages+0x12/0x4f
[<ffffffff81031439>] ? copy_process+0xb8/0x114f
[<ffffffff8108192e>] ? handle_mm_fault+0x5dd/0x62f
[<ffffffff8103260f>] ? do_fork+0x13f/0x2ba
[<ffffffff81022c22>] ? do_page_fault+0x1f8/0x20d
[<ffffffff8100b0d3>] ? stub_clone+0x13/0x20
[<ffffffff8100ad6b>] ? system_call_fastpath+0x16/0x1b
Mem-Info:
DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
CPU 1: hi: 0, btch: 1 usd: 0
DMA32 per-cpu:
CPU 0: hi: 186, btch: 31 usd: 159
CPU 1: hi: 186, btch: 31 usd: 2
Active_anon:70477 active_file:1 inactive_anon:4514
inactive_file:7 unevictable:0 dirty:0 writeback:0 unstable:0
free:1954 slab:42078 mapped:237 pagetables:57791 bounce:0
DMA free:3932kB min:60kB low:72kB high:88kB active_anon:236kB inactive_anon:0kB active_file:4kB inactive_file:4kB unevictable:0kB present:15364kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 968 968 968
DMA32 free:3884kB min:3948kB low:4932kB high:5920kB active_anon:281672kB inactive_anon:18056kB active_file:0kB inactive_file:24kB unevictable:0kB present:992032kB pages_scanned:6 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 180*4kB 36*8kB 3*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 3936kB
DMA32: 491*4kB 0*8kB 0*16kB 0*32kB 0*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 3884kB
1808 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap = 0kB
Total swap = 0kB
255744 pages RAM
5589 pages reserved
249340 pages shared
219039 pages non-shared
Out of memory: kill process 11471 (msgctl11) score 112393 or a child
Killed process 12318 (msgctl11)


David

2009-06-29 12:56:26

by Fengguang Wu

Subject: Re: Found the commit that causes the OOMs

On Mon, Jun 29, 2009 at 06:10:19PM +0800, David Howells wrote:
> Wu Fengguang <[email protected]> wrote:
>
> > Yes, good catch! (sorry I was in a hurry at the time..)
>
> That doesn't compile:

Sorry! This one compiles OK:

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 416f748..30bb1fe 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -166,15 +166,8 @@ static inline unsigned long zone_page_state(struct zone *zone,
return x;
}

-extern unsigned long global_lru_pages(void);
-
-static inline unsigned long zone_lru_pages(struct zone *zone)
-{
- return (zone_page_state(zone, NR_ACTIVE_ANON)
- + zone_page_state(zone, NR_ACTIVE_FILE)
- + zone_page_state(zone, NR_INACTIVE_ANON)
- + zone_page_state(zone, NR_INACTIVE_FILE));
-}
+extern unsigned long global_reclaimable_pages(void);
+extern unsigned long zone_reclaimable_pages(struct zone *zone);

#ifdef CONFIG_NUMA
/*
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index a91b870..74c3067 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -394,7 +394,8 @@ static unsigned long highmem_dirtyable_memory(unsigned long total)
struct zone *z =
&NODE_DATA(node)->node_zones[ZONE_HIGHMEM];

- x += zone_page_state(z, NR_FREE_PAGES) + zone_lru_pages(z);
+ x += zone_page_state(z, NR_FREE_PAGES) +
+ zone_reclaimable_pages(z);
}
/*
* Make sure that the number of highmem pages is never larger
@@ -418,7 +419,7 @@ unsigned long determine_dirtyable_memory(void)
{
unsigned long x;

- x = global_page_state(NR_FREE_PAGES) + global_lru_pages();
+ x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages();

if (!vm_highmem_is_dirtyable)
x -= highmem_dirtyable_memory(x);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 026f452..1e29c7d 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1693,7 +1693,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
if (!cpuset_zone_allowed_hardwall(zone, GFP_KERNEL))
continue;

- lru_pages += zone_lru_pages(zone);
+ lru_pages += zone_reclaimable_pages(zone);
}
}

@@ -1910,7 +1910,7 @@ loop_again:
for (i = 0; i <= end_zone; i++) {
struct zone *zone = pgdat->node_zones + i;

- lru_pages += zone_lru_pages(zone);
+ lru_pages += zone_reclaimable_pages(zone);
}

/*
@@ -1954,7 +1954,7 @@ loop_again:
if (zone_is_all_unreclaimable(zone))
continue;
if (nr_slab == 0 && zone->pages_scanned >=
- (zone_lru_pages(zone) * 6))
+ (zone_reclaimable_pages(zone) * 6))
zone_set_flag(zone,
ZONE_ALL_UNRECLAIMABLE);
/*
@@ -2121,12 +2121,33 @@ void wakeup_kswapd(struct zone *zone, int order)
wake_up_interruptible(&pgdat->kswapd_wait);
}

-unsigned long global_lru_pages(void)
+unsigned long global_reclaimable_pages(void)
{
- return global_page_state(NR_ACTIVE_ANON)
- + global_page_state(NR_ACTIVE_FILE)
- + global_page_state(NR_INACTIVE_ANON)
- + global_page_state(NR_INACTIVE_FILE);
+ int nr;
+
+ nr = global_page_state(NR_ACTIVE_FILE) +
+ global_page_state(NR_INACTIVE_FILE);
+
+ if (nr_swap_pages > 0)
+ nr += global_page_state(NR_ACTIVE_ANON) +
+ global_page_state(NR_INACTIVE_ANON);
+
+ return nr;
+}
+
+
+unsigned long zone_reclaimable_pages(struct zone *zone)
+{
+ int nr;
+
+ nr = zone_page_state(zone, NR_ACTIVE_FILE) +
+ zone_page_state(zone, NR_INACTIVE_FILE);
+
+ if (nr_swap_pages > 0)
+ nr += zone_page_state(zone, NR_ACTIVE_ANON) +
+ zone_page_state(zone, NR_INACTIVE_ANON);
+
+ return nr;
}

#ifdef CONFIG_HIBERNATION
@@ -2198,7 +2219,7 @@ unsigned long shrink_all_memory(unsigned long nr_pages)

current->reclaim_state = &reclaim_state;

- lru_pages = global_lru_pages();
+ lru_pages = global_reclaimable_pages();
nr_slab = global_page_state(NR_SLAB_RECLAIMABLE);
/* If slab caches are huge, it's better to hit them first */
while (nr_slab >= lru_pages) {
@@ -2240,7 +2261,7 @@ unsigned long shrink_all_memory(unsigned long nr_pages)

reclaim_state.reclaimed_slab = 0;
shrink_slab(sc.nr_scanned, sc.gfp_mask,
- global_lru_pages());
+ global_reclaimable_pages());
sc.nr_reclaimed += reclaim_state.reclaimed_slab;
if (sc.nr_reclaimed >= nr_pages)
goto out;
@@ -2257,7 +2278,8 @@ unsigned long shrink_all_memory(unsigned long nr_pages)
if (!sc.nr_reclaimed) {
do {
reclaim_state.reclaimed_slab = 0;
- shrink_slab(nr_pages, sc.gfp_mask, global_lru_pages());
+ shrink_slab(nr_pages, sc.gfp_mask,
+ global_reclaimable_pages());
sc.nr_reclaimed += reclaim_state.reclaimed_slab;
} while (sc.nr_reclaimed < nr_pages &&
reclaim_state.reclaimed_slab > 0);
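
For readers following the patch, the accounting change can be modelled in user space. The sketch below is not kernel code: `struct lru_counts` and `reclaimable_pages()` are hypothetical names, and the counters are passed in by the caller instead of being read from vmstat. It only illustrates the idea of the patch, namely that file pages always count as reclaimable while anonymous pages count only when swap is available. The running total is kept in an `unsigned long` here to match the return type, which is a choice made for this sketch (the patch itself uses `int nr`).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the per-zone LRU counters. */
struct lru_counts {
	unsigned long active_anon;
	unsigned long inactive_anon;
	unsigned long active_file;
	unsigned long inactive_file;
};

/*
 * Model of the patch's idea: file-backed LRU pages are always counted,
 * anonymous pages are counted only when there is swap to push them to.
 */
static unsigned long reclaimable_pages(const struct lru_counts *c,
				       long nr_swap_pages)
{
	unsigned long nr;

	assert(c != NULL);
	nr = c->active_file + c->inactive_file;
	if (nr_swap_pages > 0)
		nr += c->active_anon + c->inactive_anon;
	return nr;
}
```

Fed with the global counters from the OOM report earlier in the thread (active_anon 70477, inactive_anon 4514, active_file 1, inactive_file 7) and no swap, this model reports only 8 reclaimable pages, where the old sum of all four LRU lists would have reported 74999.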

2009-06-29 12:59:32

by Fengguang Wu

Subject: Re: Found the commit that causes the OOMs

On Mon, Jun 29, 2009 at 08:43:55PM +0800, David Howells wrote:
> KOSAKI Motohiro <[email protected]> wrote:
>
> > David, Can you please try to following patch? it was posted to LKML
> > about 1-2 week ago.
> >
> > Subject "[BUGFIX][PATCH] fix lumpy reclaim lru handiling at
> > isolate_lru_pages v2"
>
> It is already committed, but I ran a test on the latest Linus kernel anyway:

page-types showed that there are only ~1MB of mapped regular (non-tmpfs) file pages,
so not surprisingly it didn't help.

> msgctl11 invoked oom-killer: gfp_mask=0xd0, order=1, oom_adj=0
> msgctl11 cpuset=/ mems_allowed=0
> Pid: 20366, comm: msgctl11 Not tainted 2.6.31-rc1-cachefs #144
> Call Trace:
> [<ffffffff810718d2>] ? oom_kill_process.clone.0+0xa9/0x245
> [<ffffffff81071b99>] ? __out_of_memory+0x12b/0x142
> [<ffffffff81071c1a>] ? out_of_memory+0x6a/0x94
> [<ffffffff810742e4>] ? __alloc_pages_nodemask+0x42e/0x51d
> [<ffffffff81031416>] ? copy_process+0x95/0x114f
> [<ffffffff8107443c>] ? __get_free_pages+0x12/0x4f
> [<ffffffff81031439>] ? copy_process+0xb8/0x114f
> [<ffffffff8108192e>] ? handle_mm_fault+0x5dd/0x62f
> [<ffffffff8103260f>] ? do_fork+0x13f/0x2ba
> [<ffffffff81022c22>] ? do_page_fault+0x1f8/0x20d
> [<ffffffff8100b0d3>] ? stub_clone+0x13/0x20
> [<ffffffff8100ad6b>] ? system_call_fastpath+0x16/0x1b
> Mem-Info:
> DMA per-cpu:
> CPU 0: hi: 0, btch: 1 usd: 0
> CPU 1: hi: 0, btch: 1 usd: 0
> DMA32 per-cpu:
> CPU 0: hi: 186, btch: 31 usd: 159
> CPU 1: hi: 186, btch: 31 usd: 2
> Active_anon:70477 active_file:1 inactive_anon:4514
> inactive_file:7 unevictable:0 dirty:0 writeback:0 unstable:0
> free:1954 slab:42078 mapped:237 pagetables:57791 bounce:0
> DMA free:3932kB min:60kB low:72kB high:88kB active_anon:236kB inactive_anon:0kB active_file:4kB inactive_file:4kB unevictable:0kB present:15364kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 968 968 968
> DMA32 free:3884kB min:3948kB low:4932kB high:5920kB active_anon:281672kB inactive_anon:18056kB active_file:0kB inactive_file:24kB unevictable:0kB present:992032kB pages_scanned:6 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
> DMA: 180*4kB 36*8kB 3*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 3936kB
> DMA32: 491*4kB 0*8kB 0*16kB 0*32kB 0*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 3884kB
> 1808 total pagecache pages
> 0 pages in swap cache
> Swap cache stats: add 0, delete 0, find 0/0
> Free swap = 0kB
> Total swap = 0kB
> 255744 pages RAM
> 5589 pages reserved
> 249340 pages shared
> 219039 pages non-shared
> Out of memory: kill process 11471 (msgctl11) score 112393 or a child
> Killed process 12318 (msgctl11)
>
>
> David

2009-06-29 14:22:20

by David Howells

Subject: Re: Found the commit that causes the OOMs

Wu Fengguang <[email protected]> wrote:

> Sorry! This one compiles OK:

Sadly that doesn't seem to work either:

msgctl11 invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0
msgctl11 cpuset=/ mems_allowed=0
Pid: 30858, comm: msgctl11 Not tainted 2.6.31-rc1-cachefs #146
Call Trace:
[<ffffffff8107207e>] ? oom_kill_process.clone.0+0xa9/0x245
[<ffffffff81072345>] ? __out_of_memory+0x12b/0x142
[<ffffffff810723c6>] ? out_of_memory+0x6a/0x94
[<ffffffff81074a90>] ? __alloc_pages_nodemask+0x42e/0x51d
[<ffffffff81080843>] ? do_wp_page+0x2c6/0x5f5
[<ffffffff810820c1>] ? handle_mm_fault+0x5dd/0x62f
[<ffffffff81022c32>] ? do_page_fault+0x1f8/0x20d
[<ffffffff812e069f>] ? page_fault+0x1f/0x30
Mem-Info:
DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
CPU 1: hi: 0, btch: 1 usd: 0
DMA32 per-cpu:
CPU 0: hi: 186, btch: 31 usd: 38
CPU 1: hi: 186, btch: 31 usd: 106
Active_anon:75040 active_file:0 inactive_anon:2031
inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
free:1951 slab:41499 mapped:301 pagetables:60674 bounce:0
DMA free:3932kB min:60kB low:72kB high:88kB active_anon:2868kB inactive_anon:384kB active_file:0kB inactive_file:0kB unevictable:0kB present:15364kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 968 968 968
DMA32 free:3872kB min:3948kB low:4932kB high:5920kB active_anon:297292kB inactive_anon:7740kB active_file:0kB inactive_file:0kB unevictable:0kB present:992032kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 7*4kB 0*8kB 0*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3932kB
DMA32: 500*4kB 2*8kB 0*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 3872kB
1928 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap = 0kB
Total swap = 0kB
255744 pages RAM
5589 pages reserved
238251 pages shared
216210 pages non-shared
Out of memory: kill process 25221 (msgctl11) score 130560 or a child
Killed process 26379 (msgctl11)


Is there any extra debugging I can put in to get more information out of the
OOM?

David

2009-06-29 15:00:32

by Minchan Kim

Subject: Re: Found the commit that causes the OOMs

On Mon, Jun 29, 2009 at 11:21 PM, David Howells<[email protected]> wrote:
> Wu Fengguang <[email protected]> wrote:
>
>> Sorry! This one compiles OK:
>
> Sadly that doesn't seem to work either:
>
> msgctl11 invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0
> msgctl11 cpuset=/ mems_allowed=0
> Pid: 30858, comm: msgctl11 Not tainted 2.6.31-rc1-cachefs #146
> Call Trace:
>  [<ffffffff8107207e>] ? oom_kill_process.clone.0+0xa9/0x245
>  [<ffffffff81072345>] ? __out_of_memory+0x12b/0x142
>  [<ffffffff810723c6>] ? out_of_memory+0x6a/0x94
>  [<ffffffff81074a90>] ? __alloc_pages_nodemask+0x42e/0x51d
>  [<ffffffff81080843>] ? do_wp_page+0x2c6/0x5f5
>  [<ffffffff810820c1>] ? handle_mm_fault+0x5dd/0x62f
>  [<ffffffff81022c32>] ? do_page_fault+0x1f8/0x20d
>  [<ffffffff812e069f>] ? page_fault+0x1f/0x30
> Mem-Info:
> DMA per-cpu:
> CPU    0: hi:    0, btch:   1 usd:   0
> CPU    1: hi:    0, btch:   1 usd:   0
> DMA32 per-cpu:
> CPU    0: hi:  186, btch:  31 usd:  38
> CPU    1: hi:  186, btch:  31 usd: 106
> Active_anon:75040 active_file:0 inactive_anon:2031
>  inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
>  free:1951 slab:41499 mapped:301 pagetables:60674 bounce:0
> DMA free:3932kB min:60kB low:72kB high:88kB active_anon:2868kB inactive_anon:384kB active_file:0kB inactive_file:0kB unevictable:0kB present:15364kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 968 968 968
> DMA32 free:3872kB min:3948kB low:4932kB high:5920kB active_anon:297292kB inactive_anon:7740kB active_file:0kB inactive_file:0kB unevictable:0kB present:992032kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
> DMA: 7*4kB 0*8kB 0*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3932kB
> DMA32: 500*4kB 2*8kB 0*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 3872kB
> 1928 total pagecache pages
> 0 pages in swap cache
> Swap cache stats: add 0, delete 0, find 0/0
> Free swap  = 0kB
> Total swap = 0kB
> 255744 pages RAM
> 5589 pages reserved
> 238251 pages shared
> 216210 pages non-shared
> Out of memory: kill process 25221 (msgctl11) score 130560 or a child
> Killed process 26379 (msgctl11)

I can't understand this situation at all.
This page allocation is order zero, and the flags look like GFP_HIGHUSER,
so it is unlikely to be in interrupt context.

I think the buddy allocator still has enough free pages in DMA32 to fall back on.
Why can't the kernel allocate an order-0 page?
Is it an allocator bug?

--
Kinds regards,
Minchan Kim

2009-06-29 15:14:40

by Fengguang Wu

Subject: Re: Found the commit that causes the OOMs

On Mon, Jun 29, 2009 at 11:00:26PM +0800, Minchan Kim wrote:
> On Mon, Jun 29, 2009 at 11:21 PM, David Howells<[email protected]> wrote:
> > Wu Fengguang <[email protected]> wrote:
> >
> >> Sorry! This one compiles OK:
> >
> > Sadly that doesn't seem to work either:
> >
> > msgctl11 invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0
> > msgctl11 cpuset=/ mems_allowed=0
> > Pid: 30858, comm: msgctl11 Not tainted 2.6.31-rc1-cachefs #146
> > Call Trace:
> >  [<ffffffff8107207e>] ? oom_kill_process.clone.0+0xa9/0x245
> >  [<ffffffff81072345>] ? __out_of_memory+0x12b/0x142
> >  [<ffffffff810723c6>] ? out_of_memory+0x6a/0x94
> >  [<ffffffff81074a90>] ? __alloc_pages_nodemask+0x42e/0x51d
> >  [<ffffffff81080843>] ? do_wp_page+0x2c6/0x5f5
> >  [<ffffffff810820c1>] ? handle_mm_fault+0x5dd/0x62f
> >  [<ffffffff81022c32>] ? do_page_fault+0x1f8/0x20d
> >  [<ffffffff812e069f>] ? page_fault+0x1f/0x30
> > Mem-Info:
> > DMA per-cpu:
> > CPU    0: hi:    0, btch:   1 usd:   0
> > CPU    1: hi:    0, btch:   1 usd:   0
> > DMA32 per-cpu:
> > CPU    0: hi:  186, btch:  31 usd:  38
> > CPU    1: hi:  186, btch:  31 usd: 106
> > Active_anon:75040 active_file:0 inactive_anon:2031
> >  inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
> >  free:1951 slab:41499 mapped:301 pagetables:60674 bounce:0
> > DMA free:3932kB min:60kB low:72kB high:88kB active_anon:2868kB inactive_anon:384kB active_file:0kB inactive_file:0kB unevictable:0kB present:15364kB pages_scanned:0 all_unreclaimable? no
> > lowmem_reserve[]: 0 968 968 968
> > DMA32 free:3872kB min:3948kB low:4932kB high:5920kB active_anon:297292kB inactive_anon:7740kB active_file:0kB inactive_file:0kB unevictable:0kB present:992032kB pages_scanned:0 all_unreclaimable? no
> > lowmem_reserve[]: 0 0 0 0
> > DMA: 7*4kB 0*8kB 0*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3932kB
> > DMA32: 500*4kB 2*8kB 0*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 3872kB
> > 1928 total pagecache pages
> > 0 pages in swap cache
> > Swap cache stats: add 0, delete 0, find 0/0
> > Free swap  = 0kB
> > Total swap = 0kB
> > 255744 pages RAM
> > 5589 pages reserved
> > 238251 pages shared
> > 216210 pages non-shared
> > Out of memory: kill process 25221 (msgctl11) score 130560 or a child
> > Killed process 26379 (msgctl11)
>
> Totally, I can't understand this situation.
> Now, this page allocation is order zero and It is just likely GFP_HIGHUSER.
> So it's unlikely interrupt context.
>
> Buddy already has enough fallback DMA32, I think.
> Why kernel can't allocate page for order 0 ?
> Is it allocator bug ?

Yes this time the OOM order/flags are much different from all previous OOMs.

btw, I found that msgctl11 is pretty good at making a lot of SUnreclaim and PageTables pages:

(all values in kB, except the HugePages_* page counts)

                        before     during 1     during 2        after

MemTotal:              3931880      3931880      3931880      3931880
MemFree:                985944      1489364      2069184      2853900
Buffers:                 41704        16080        16104        16200
Cached:                1899740       126780       129092       130552
SwapCached:                  0            0            0            0
Active:                 402420       812320       643868       354880
Inactive:              2325644       576732       578792       579640
Active(anon):           333720       781264       612632       323448
Inactive(anon):         470764       482792       482680       482268
Active(file):            68700        31056        31236        31432
Inactive(file):        1854880        93940        96112        97372
Unevictable:                 4            4            4            4
Mlocked:                     4            4            4            4
SwapTotal:                   0            0            0            0
SwapFree:                    0            0            0            0
Dirty:                     996          536         1348          212
Writeback:                   0            0            0            0
AnonPages:              786772      1246280      1077352       787856
Mapped:                  53504        50420        50668        50716
Slab:                   159340       339708       227164        85272
SReclaimable:           125152        49188        48944        48508
SUnreclaim:              34188       290520       178220        36764
PageTables:              17068       363716       204336        16620
NFS_Unstable:                0            0            0            0
Bounce:                      0            0            0            0
WritebackTmp:                0            0            0            0
CommitLimit:           1965940      1965940      1965940      1965940
Committed_AS:          1130516     79437584     43472636      1122240
VmallocTotal:      34359738367  34359738367  34359738367  34359738367
VmallocUsed:             91504        91504        91504        91504
VmallocChunk:      34359582075  34359582075  34359582075  34359582075
HugePages_Total:             0            0            0            0
HugePages_Free:              0            0            0            0
HugePages_Rsvd:              0            0            0            0
HugePages_Surp:              0            0            0            0
Hugepagesize:             2048         2048         2048         2048
DirectMap4k:              6848         6848         6848         6848
DirectMap2M:           4120576      4120576      4120576      4120576


My kernel is 2.6.30.

Thanks,
Fengguang

2009-06-29 15:29:19

by David Howells

Subject: Re: Found the commit that causes the OOMs


Minchan Kim <[email protected]> wrote:

> Totally, I can't understand this situation.
> Now, this page allocation is order zero and It is just likely GFP_HIGHUSER.
> So it's unlikely interrupt context.
>
> Buddy already has enough fallback DMA32, I think.
> Why kernel can't allocate page for order 0 ?
> Is it allocator bug ?

I don't know, but I've got you some more information.

It turns out I can reproduce the problem much, much quicker by just running
msgctl11 from the LTP syscalls testsuite a few times. This time there was no
NFSD traffic, nor any other tests, to confuse the issue.

I also managed to get a list of the most in-use slabs at the time:

002732 shmem_inode_cache 2750 800 5 1
003143 fs_cache 3180 72 53 1
003145 files_cache 3145 728 5 1
003150 mm_struct 3150 840 9 2
003152 task_xstate 3160 512 8 1
003174 sighand_cache 3180 2112 3 2
003185 task_struct 3185 1632 5 2
003192 signal_cache 3192 928 4 1
003192 task_delay_info 3304 136 28 1
003205 pid 3330 104 37 1
003262 cred_jar 3381 168 23 1
003438 size-2048 3438 2072 3 2
003589 inode_cache 3606 608 6 1
004570 size-192 4572 216 18 1
007644 size-64 7656 88 44 1
007687 sysfs_dir_cache 7733 104 37 1
007875 selinux_inode_security 7920 96 40 1
008149 dentry 8190 216 18 1
013692 size-128 13900 152 25 1
047134 vm_area_struct 47440 192 20 1
182903 size-32 183178 56 67 1
312010 avtab_node 312081 48 77 1

This is from /proc/slabinfo, with the first two columns swapped to make
sorting on it easier. I've also attached the OOM report.
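
To turn rows like these into a memory footprint, here is a rough sketch. It assumes the remaining columns follow the usual /proc/slabinfo order (num_objs, objsize, objperslab, pagesperslab after the swapped active_objs and name); `struct slab_row` and `slab_pages()` are hypothetical names used only for illustration.

```c
#include <assert.h>

/* One row of the list above; field names follow /proc/slabinfo. */
struct slab_row {
	unsigned long active_objs;
	const char *name;
	unsigned long num_objs;
	unsigned long objsize;		/* bytes per object */
	unsigned long objperslab;
	unsigned long pagesperslab;
};

/* Pages backing a cache: slabs needed (rounded up) times pages per slab. */
static unsigned long slab_pages(const struct slab_row *r)
{
	unsigned long slabs;

	assert(r->objperslab > 0);
	slabs = (r->num_objs + r->objperslab - 1) / r->objperslab;
	return slabs * r->pagesperslab;
}
```

With the rows above, avtab_node comes to 4053 pages and vm_area_struct to 2372 pages, so with 4 kB pages the largest cache by object count holds about 16 MB.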

David

msgctl11 invoked oom-killer: gfp_mask=0xd0, order=1, oom_adj=0
msgctl11 cpuset=/ mems_allowed=0
Pid: 12170, comm: msgctl11 Not tainted 2.6.31-rc1-cachefs #146
Call Trace:
[<ffffffff8107207e>] ? oom_kill_process.clone.0+0xa9/0x245
[<ffffffff81074168>] ? drain_local_pages+0x0/0x13
[<ffffffff81072345>] ? __out_of_memory+0x12b/0x142
[<ffffffff810723c6>] ? out_of_memory+0x6a/0x94
[<ffffffff81074a90>] ? __alloc_pages_nodemask+0x42e/0x51d
[<ffffffff81091546>] ? cache_alloc_refill+0x353/0x69c
[<ffffffff81031424>] ? copy_process+0x93/0x1136
[<ffffffff81091b24>] ? kmem_cache_alloc+0x83/0xc5
[<ffffffff81031424>] ? copy_process+0x93/0x1136
[<ffffffff81029da3>] ? update_curr+0x53/0xdf
[<ffffffff810820c1>] ? handle_mm_fault+0x5dd/0x62f
[<ffffffff81032606>] ? do_fork+0x13f/0x2ba
[<ffffffff81022c32>] ? do_page_fault+0x1f8/0x20d
[<ffffffff8100b0d3>] ? stub_clone+0x13/0x20
[<ffffffff8100ad6b>] ? system_call_fastpath+0x16/0x1b
Mem-Info:
DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
CPU 1: hi: 0, btch: 1 usd: 0
DMA32 per-cpu:
CPU 0: hi: 186, btch: 31 usd: 0
CPU 1: hi: 186, btch: 31 usd: 25
Active_anon:75004 active_file:0 inactive_anon:2192
inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
free:2200 slab:37795 mapped:618 pagetables:60369 bounce:0
DMA free:3928kB min:60kB low:72kB high:88kB active_anon:3024kB inactive_anon:128kB active_file:0kB inactive_file:0kB unevictable:0kB present:15364kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 968 968 968
DMA32 free:4748kB min:3948kB low:4932kB high:5920kB active_anon:297092kB inactive_anon:8640kB active_file:0kB inactive_file:0kB unevictable:0kB present:992032kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 0*4kB 1*8kB 1*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3928kB
DMA32: 476*4kB 56*8kB 14*16kB 4*32kB 0*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 4624kB
1154 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap = 0kB
Total swap = 0kB
255744 pages RAM
5589 pages reserved
248214 pages shared
219940 pages non-shared
Out of memory: kill process 4164 (msgctl11) score 119366 or a child
Killed process 10211 (msgctl11)

2009-06-29 15:55:52

by David Howells

Subject: Re: Found the commit that causes the OOMs

Wu Fengguang <[email protected]> wrote:

> Yes this time the OOM order/flags are much different from all previous OOMs.
>
> btw, I found that msgctl11 is pretty good at making a lot of SUnreclaim and
> PageTables pages:

I got David Woodhouse to run this on one of his boxes, but he doesn't see the
problem, I think because he's got 4GB of RAM, and never comes close to running
out.

I've asked him to reboot with mem=1G to see if that helps reproduce it.

David

2009-06-29 15:57:22

by David Woodhouse

Subject: Re: Found the commit that causes the OOMs

On Mon, 2009-06-29 at 16:54 +0100, David Howells wrote:
> Wu Fengguang <[email protected]> wrote:
>
> > Yes this time the OOM order/flags are much different from all previous OOMs.
> >
> > btw, I found that msgctl11 is pretty good at making a lot of SUnreclaim and
> > PageTables pages:
>
> I got David Woodhouse to run this on one of this boxes, but he doesn't see the
> problem, I think because he's got 4GB of RAM, and never comes close to running
> out.
>
> I've asked him to reboot with mem=1G to see if that helps reproduce it.

msgctl11 invoked oom-killer: gfp_mask=0xd0, order=1, oom_adj=0
Pid: 5795, comm: msgctl11 Not tainted 2.6.31-rc1 #147
Call Trace:
[<ffffffff81092c77>] oom_kill_process.clone.0+0xac/0x254
[<ffffffff81092b5c>] ? badness+0x24d/0x2bc
[<ffffffff81092f5f>] __out_of_memory+0x140/0x157
[<ffffffff8109308f>] out_of_memory+0x119/0x150
[<ffffffff81095c65>] ? drain_local_pages+0x16/0x18
[<ffffffff810967ab>] __alloc_pages_nodemask+0x45a/0x55b
[<ffffffff810a32b0>] ? __inc_zone_page_state+0x2e/0x30
[<ffffffff810bb6b9>] alloc_pages_current+0xae/0xb6
[<ffffffff810a604a>] ? do_wp_page+0x621/0x6c3
[<ffffffff81094d7e>] __get_free_pages+0xe/0x4b
[<ffffffff810403a7>] copy_process+0xab/0x11a5
[<ffffffff810327c8>] ? check_preempt_wakeup+0x11a/0x142
[<ffffffff810a7a06>] ? handle_mm_fault+0x678/0x6e9
[<ffffffff810415ec>] do_fork+0x14b/0x338
[<ffffffff8105b50a>] ? up_read+0xe/0x10
[<ffffffff814ee655>] ? do_page_fault+0x2da/0x307
[<ffffffff8100a55c>] sys_clone+0x28/0x2a
[<ffffffff8100bfc3>] stub_clone+0x13/0x20
[<ffffffff8100bcdb>] ? system_call_fastpath+0x16/0x1b
Mem-Info:
Node 0 DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
CPU 1: hi: 0, btch: 1 usd: 0
CPU 2: hi: 0, btch: 1 usd: 0
CPU 3: hi: 0, btch: 1 usd: 0
CPU 4: hi: 0, btch: 1 usd: 0
CPU 5: hi: 0, btch: 1 usd: 0
CPU 6: hi: 0, btch: 1 usd: 0
CPU 7: hi: 0, btch: 1 usd: 0
Node 0 DMA32 per-cpu:
CPU 0: hi: 186, btch: 31 usd: 0
CPU 1: hi: 186, btch: 31 usd: 20
CPU 2: hi: 186, btch: 31 usd: 19
CPU 3: hi: 186, btch: 31 usd: 20
CPU 4: hi: 186, btch: 31 usd: 19
CPU 5: hi: 186, btch: 31 usd: 24
CPU 6: hi: 186, btch: 31 usd: 41
CPU 7: hi: 186, btch: 31 usd: 25
Active_anon:72835 active_file:89 inactive_anon:575
inactive_file:103 unevictable:0 dirty:36 writeback:0 unstable:0
free:2467 slab:38211 mapped:229 pagetables:66918 bounce:0
Node 0 DMA free:4036kB min:60kB low:72kB high:88kB active_anon:3228kB inactive_anon:256kB active_file:0kB inactive_file:0kB unevictable:0kB present:15356kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 994 994 994
Node 0 DMA32 free:5832kB min:4000kB low:5000kB high:6000kB active_anon:288112kB inactive_anon:2044kB active_file:356kB inactive_file:412kB unevictable:0kB present:1018080kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Node 0 DMA: 1*4kB 2*8kB 1*16kB 0*32kB 1*64kB 2*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3940kB
Node 0 DMA32: 852*4kB 1*8kB 0*16kB 1*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 5304kB
437 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap = 0kB
Total swap = 0kB
262144 pages RAM
6503 pages reserved
205864 pages shared
226536 pages non-shared
Out of memory: kill process 3855 (msgctl11) score 179248 or a child
Killed process 4222 (msgctl11)


--
David Woodhouse Open Source Technology Centre
[email protected] Intel Corporation

2009-06-29 16:07:32

by Mel Gorman

Subject: Re: Found the commit that causes the OOMs

On Tue, Jun 30, 2009 at 12:00:26AM +0900, Minchan Kim wrote:
> On Mon, Jun 29, 2009 at 11:21 PM, David Howells<[email protected]> wrote:
> > Wu Fengguang <[email protected]> wrote:
> >
> >> Sorry! This one compiles OK:
> >
> > Sadly that doesn't seem to work either:
> >
> > msgctl11 invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0
> > msgctl11 cpuset=/ mems_allowed=0
> > Pid: 30858, comm: msgctl11 Not tainted 2.6.31-rc1-cachefs #146
> > Call Trace:
> >  [<ffffffff8107207e>] ? oom_kill_process.clone.0+0xa9/0x245
> >  [<ffffffff81072345>] ? __out_of_memory+0x12b/0x142
> >  [<ffffffff810723c6>] ? out_of_memory+0x6a/0x94
> >  [<ffffffff81074a90>] ? __alloc_pages_nodemask+0x42e/0x51d
> >  [<ffffffff81080843>] ? do_wp_page+0x2c6/0x5f5
> >  [<ffffffff810820c1>] ? handle_mm_fault+0x5dd/0x62f
> >  [<ffffffff81022c32>] ? do_page_fault+0x1f8/0x20d
> >  [<ffffffff812e069f>] ? page_fault+0x1f/0x30
> > Mem-Info:
> > DMA per-cpu:
> > CPU    0: hi:    0, btch:   1 usd:   0
> > CPU    1: hi:    0, btch:   1 usd:   0
> > DMA32 per-cpu:
> > CPU    0: hi:  186, btch:  31 usd:  38
> > CPU    1: hi:  186, btch:  31 usd: 106
> > Active_anon:75040 active_file:0 inactive_anon:2031
> >  inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
> >  free:1951 slab:41499 mapped:301 pagetables:60674 bounce:0
> > DMA free:3932kB min:60kB low:72kB high:88kB active_anon:2868kB inactive_anon:384kB active_file:0kB inactive_file:0kB unevictable:0kB present:15364kB pages_scanned:0 all_unreclaimable? no
> > lowmem_reserve[]: 0 968 968 968
> > DMA32 free:3872kB min:3948kB low:4932kB high:5920kB active_anon:297292kB inactive_anon:7740kB active_file:0kB inactive_file:0kB unevictable:0kB present:992032kB pages_scanned:0 all_unreclaimable? no
> > lowmem_reserve[]: 0 0 0 0
> > DMA: 7*4kB 0*8kB 0*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3932kB
> > DMA32: 500*4kB 2*8kB 0*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 3872kB
> > 1928 total pagecache pages
> > 0 pages in swap cache
> > Swap cache stats: add 0, delete 0, find 0/0
> > Free swap  = 0kB
> > Total swap = 0kB
> > 255744 pages RAM
> > 5589 pages reserved
> > 238251 pages shared
> > 216210 pages non-shared
> > Out of memory: kill process 25221 (msgctl11) score 130560 or a child
> > Killed process 26379 (msgctl11)
>
> Totally, I can't understand this situation.
> Now, this page allocation is order zero and It is just likely GFP_HIGHUSER.
> So it's unlikely interrupt context.

The GFP flags that are set are

#define __GFP_HIGHMEM (0x02)
#define __GFP_MOVABLE (0x08) /* Page is movable */
#define __GFP_WAIT (0x10) /* Can wait and reschedule? */
#define __GFP_IO (0x40) /* Can start physical IO? */
#define __GFP_FS (0x80) /* Can call down to low-level FS? */
#define __GFP_HARDWALL (0x20000) /* Enforce hardwall cpuset memory allocs */

which are fairly permissive in terms of what action can be taken.
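
As a quick cross-check, the mask can be decoded mechanically. The sketch below uses only the six bit values listed above; `struct gfp_bit`, `gfp_bits[]` and `gfp_decode()` are hypothetical user-space names, not kernel interfaces, and any bit outside the table is simply returned to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct gfp_bit {
	unsigned int bit;
	const char *name;
};

/* Only the flags quoted above; other bits are reported as unknown. */
static const struct gfp_bit gfp_bits[] = {
	{ 0x02,    "__GFP_HIGHMEM" },
	{ 0x08,    "__GFP_MOVABLE" },
	{ 0x10,    "__GFP_WAIT" },
	{ 0x40,    "__GFP_IO" },
	{ 0x80,    "__GFP_FS" },
	{ 0x20000, "__GFP_HARDWALL" },
};

/* Write "A|B|C" into buf and return the bits that were not decoded. */
static unsigned int gfp_decode(unsigned int mask, char *buf, size_t len)
{
	size_t i;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(gfp_bits) / sizeof(gfp_bits[0]); i++) {
		size_t used = strlen(buf);
		int n;

		if (!(mask & gfp_bits[i].bit))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "|" : "", gfp_bits[i].name);
		if (n < 0 || (size_t)n >= len - used) {
			buf[used] = '\0';	/* no room: drop the partial name */
			break;
		}
		mask &= ~gfp_bits[i].bit;
	}
	return mask;
}
```

Decoding 0x200da this way yields all six names with nothing left over. The 0xd0 mask from the earlier order-1 reports decodes to __GFP_WAIT|__GFP_IO|__GFP_FS.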

> Buddy already has enough fallback DMA32, I think.

It doesn't really. We are below the minimum watermark. It wouldn't be
able to grant the allocation until a few pages had been freed.
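
The comparison can be spelled out with the numbers from the report. This is a deliberately simplified order-0 sketch, not the kernel's actual watermark code: it ignores the allocation flags that can lower the mark, `watermark_ok_order0()` is a hypothetical name, and it assumes 4 kB pages and that the lowmem_reserve[] values are expressed in pages.

```c
#include <assert.h>

#define PAGE_KB 4	/* assumption: 4 kB pages, as in the buddy lists */

/*
 * Simplified order-0 check: a zone is usable only if its free pages
 * exceed the min watermark plus the lowmem reserve that applies to
 * the zone the request was originally aimed at.
 */
static int watermark_ok_order0(long free_kb, long min_kb, long reserve_pages)
{
	long free_pages = free_kb / PAGE_KB;
	long min_pages = min_kb / PAGE_KB;

	assert(free_kb >= 0 && min_kb >= 0 && reserve_pages >= 0);
	return free_pages > min_pages + reserve_pages;
}
```

Under these assumptions DMA32 (free 3872kB, min 3948kB, reserve 0) fails with 968 pages free against a mark of 987. DMA (free 3932kB, min 60kB, reserve 968 pages) also fails, because 983 free pages do not exceed 15 + 968.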

> Why kernel can't allocate page for order 0 ?
> Is it allocator bug ?
>

If it is, it is not because the allocation failed as the watermarks were not
being met. For this situation to be occurring, it has to be scanning the LRU
lists and making no forward progress. Odd things to note:

o active_anon is very large in comparison to inactive_anon. Is this
because there is no swap and they are no longer being rotated?
o Slab and pagetables are very large. Is slab genuinely unshrinkable?

I think this system might be genuinely OOM. It can't reclaim memory and
we are below the minimum watermarks.

Is it possible there are pages that are counted as active_anon that in
fact are reclaimable because they are on the wrong LRU list? If that was
the case, the lack of rotation to inactive list would prevent them
getting discovered.

--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

2009-06-29 16:58:42

by Andrew Morton

Subject: Re: Found the commit that causes the OOMs

On Mon, 29 Jun 2009 13:43:55 +0100 David Howells <[email protected]> wrote:

> KOSAKI Motohiro <[email protected]> wrote:
>
> > David, Can you please try to following patch? it was posted to LKML
> > about 1-2 week ago.
> >
> > Subject "[BUGFIX][PATCH] fix lumpy reclaim lru handiling at
> > isolate_lru_pages v2"
>
> It is already committed, but I ran a test on the latest Linus kernel anyway:
>
> msgctl11 invoked oom-killer: gfp_mask=0xd0, order=1, oom_adj=0
> msgctl11 cpuset=/ mems_allowed=0
> Pid: 20366, comm: msgctl11 Not tainted 2.6.31-rc1-cachefs #144
> Call Trace:
> [<ffffffff810718d2>] ? oom_kill_process.clone.0+0xa9/0x245
> [<ffffffff81071b99>] ? __out_of_memory+0x12b/0x142
> [<ffffffff81071c1a>] ? out_of_memory+0x6a/0x94
> [<ffffffff810742e4>] ? __alloc_pages_nodemask+0x42e/0x51d
> [<ffffffff81031416>] ? copy_process+0x95/0x114f
> [<ffffffff8107443c>] ? __get_free_pages+0x12/0x4f
> [<ffffffff81031439>] ? copy_process+0xb8/0x114f
> [<ffffffff8108192e>] ? handle_mm_fault+0x5dd/0x62f
> [<ffffffff8103260f>] ? do_fork+0x13f/0x2ba
> [<ffffffff81022c22>] ? do_page_fault+0x1f8/0x20d
> [<ffffffff8100b0d3>] ? stub_clone+0x13/0x20
> [<ffffffff8100ad6b>] ? system_call_fastpath+0x16/0x1b
> Mem-Info:
> DMA per-cpu:
> CPU 0: hi: 0, btch: 1 usd: 0
> CPU 1: hi: 0, btch: 1 usd: 0
> DMA32 per-cpu:
> CPU 0: hi: 186, btch: 31 usd: 159
> CPU 1: hi: 186, btch: 31 usd: 2
> Active_anon:70477 active_file:1 inactive_anon:4514
> inactive_file:7 unevictable:0 dirty:0 writeback:0 unstable:0
> free:1954 slab:42078 mapped:237 pagetables:57791 bounce:0

~170k pages unreclaimable and ~70k pages unaccounted for.

This does not look like a reclaim problem?
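
One plausible way to arrive at these estimates from the Mem-Info counters quoted above is sketched below. This is an assumption about the arithmetic, which is not spelled out here: with no swap, anon pages are treated as unreclaimable, and all of slab and pagetables is counted as unreclaimable too. The struct and function names are illustrative.

```c
/* Page counts taken from a Mem-Info dump (units: pages). */
struct oom_counters {
	long total_ram;
	long reserved;
	long active_anon;
	long inactive_anon;
	long active_file;
	long inactive_file;
	long free;
	long slab;
	long pagetables;
};

/* Pages that cannot be reclaimed when there is no swap (coarse estimate). */
long unreclaimable_pages(const struct oom_counters *c)
{
	return c->active_anon + c->inactive_anon + c->slab + c->pagetables;
}

/* Usable RAM minus everything the dump accounts for. */
long unaccounted_pages(const struct oom_counters *c)
{
	long usable = c->total_ram - c->reserved;
	long accounted = c->active_anon + c->inactive_anon +
			 c->active_file + c->inactive_file +
			 c->free + c->slab + c->pagetables;

	return usable - accounted;
}
```

Fed the counters from this report (255744 pages RAM, 5589 reserved, and the Active_anon/slab/pagetables line), the two functions give 174860 and 73333 pages, which matches the ~170k and ~70k figures.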

> DMA free:3932kB min:60kB low:72kB high:88kB active_anon:236kB inactive_anon:0kB active_file:4kB inactive_file:4kB unevictable:0kB present:15364kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 968 968 968
> DMA32 free:3884kB min:3948kB low:4932kB high:5920kB active_anon:281672kB inactive_anon:18056kB active_file:0kB inactive_file:24kB unevictable:0kB present:992032kB pages_scanned:6 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
> DMA: 180*4kB 36*8kB 3*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 3936kB
> DMA32: 491*4kB 0*8kB 0*16kB 0*32kB 0*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 3884kB
> 1808 total pagecache pages
> 0 pages in swap cache
> Swap cache stats: add 0, delete 0, find 0/0
> Free swap = 0kB
> Total swap = 0kB
> 255744 pages RAM
> 5589 pages reserved
> 249340 pages shared
> 219039 pages non-shared
> Out of memory: kill process 11471 (msgctl11) score 112393 or a child
> Killed process 12318 (msgctl11)

2009-06-29 18:54:58

by KOSAKI Motohiro

Subject: Re: Found the commit that causes the OOMs

2009/6/30 Andrew Morton <[email protected]>:
> On Mon, 29 Jun 2009 13:43:55 +0100 David Howells <[email protected]> wrote:
>
>> KOSAKI Motohiro <[email protected]> wrote:
>>
>> > David, Can you please try to following patch? it was posted to LKML
>> > about 1-2 week ago.
>> >
>> > Subject "[BUGFIX][PATCH] fix lumpy reclaim lru handiling at
>> > isolate_lru_pages v2"
>>
>> It is already committed, but I ran a test on the latest Linus kernel anyway:
>>
>> msgctl11 invoked oom-killer: gfp_mask=0xd0, order=1, oom_adj=0
>> msgctl11 cpuset=/ mems_allowed=0
>> Pid: 20366, comm: msgctl11 Not tainted 2.6.31-rc1-cachefs #144
>> Call Trace:
>>  [<ffffffff810718d2>] ? oom_kill_process.clone.0+0xa9/0x245
>>  [<ffffffff81071b99>] ? __out_of_memory+0x12b/0x142
>>  [<ffffffff81071c1a>] ? out_of_memory+0x6a/0x94
>>  [<ffffffff810742e4>] ? __alloc_pages_nodemask+0x42e/0x51d
>>  [<ffffffff81031416>] ? copy_process+0x95/0x114f
>>  [<ffffffff8107443c>] ? __get_free_pages+0x12/0x4f
>>  [<ffffffff81031439>] ? copy_process+0xb8/0x114f
>>  [<ffffffff8108192e>] ? handle_mm_fault+0x5dd/0x62f
>>  [<ffffffff8103260f>] ? do_fork+0x13f/0x2ba
>>  [<ffffffff81022c22>] ? do_page_fault+0x1f8/0x20d
>>  [<ffffffff8100b0d3>] ? stub_clone+0x13/0x20
>>  [<ffffffff8100ad6b>] ? system_call_fastpath+0x16/0x1b
>> Mem-Info:
>> DMA per-cpu:
>> CPU    0: hi:    0, btch:   1 usd:   0
>> CPU    1: hi:    0, btch:   1 usd:   0
>> DMA32 per-cpu:
>> CPU    0: hi:  186, btch:  31 usd: 159
>> CPU    1: hi:  186, btch:  31 usd:   2
>> Active_anon:70477 active_file:1 inactive_anon:4514
>>  inactive_file:7 unevictable:0 dirty:0 writeback:0 unstable:0
>>  free:1954 slab:42078 mapped:237 pagetables:57791 bounce:0
>
> ~170k pages unreclaimable and ~70k pages unaccounted for.
>
> This does not look like a reclaim problem?

OK, we need to understand the test case better.

[read test program source code... ]

This program creates `cat /proc/sys/kernel/msgmni` * 10 processes.
Creating one process needs at least one userland stack page (i.e. one anon page)
+ one kernel stack page (i.e. one unaccounted page) + one pagetable page.

In my 1GB box environment, the default msgmni is 11969.
Oh well, the system's physical RAM (255744 pages) is less than the pages needed (11969 * 3).

In addition, each of those processes calls msgsnd(lrand48() % 99) 1000 times.
Each msgsnd does one kmalloc, which means the kernel creates tons of randomly
sized slab objects and the slab heap becomes very fragmented.
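
The state of the free lists can be read off the DMA32 buddy line of the report quoted in this mail. The sketch below is a userspace illustration with made-up names. It only sums that line and counts the blocks big enough for a given order; it does not reproduce the kernel's per-order watermark logic.

```c
#define NR_ORDERS 11	/* orders 0..10, i.e. the 4kB..4096kB columns of the report */

/* Free block counts per order for DMA32, copied from the quoted report. */
const unsigned int dma32_nr_free[NR_ORDERS] = {
	491, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0
};

/* Total free memory in kB described by a per-order count of free blocks. */
unsigned long buddy_free_kb(const unsigned int nr_free[NR_ORDERS])
{
	unsigned long kb = 0;
	int order;

	for (order = 0; order < NR_ORDERS; order++)
		kb += (unsigned long)nr_free[order] * (4UL << order);
	return kb;
}

/* Number of free blocks large enough for a request of at least min_order. */
unsigned long buddy_blocks_at_least(const unsigned int nr_free[NR_ORDERS],
				    int min_order)
{
	unsigned long blocks = 0;
	int order;

	for (order = min_order; order < NR_ORDERS; order++)
		blocks += nr_free[order];
	return blocks;
}
```

For this report the total is 3884kB, matching the printed sum, but only 4 of the 495 free blocks can serve the order=1 request that fork makes via copy_process().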

Ummm, I think this test isn't guaranteed to succeed on a 1GB box.


note: I use a distro kernel (Fedora 11: kernel-2.6.29+).


>> DMA free:3932kB min:60kB low:72kB high:88kB active_anon:236kB inactive_anon:0kB active_file:4kB inactive_file:4kB unevictable:0kB present:15364kB pages_scanned:0 all_unreclaimable? no
>> lowmem_reserve[]: 0 968 968 968
>> DMA32 free:3884kB min:3948kB low:4932kB high:5920kB active_anon:281672kB inactive_anon:18056kB active_file:0kB inactive_file:24kB unevictable:0kB present:992032kB pages_scanned:6 all_unreclaimable? no
>> lowmem_reserve[]: 0 0 0 0
>> DMA: 180*4kB 36*8kB 3*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 3936kB
>> DMA32: 491*4kB 0*8kB 0*16kB 0*32kB 0*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 3884kB
>> 1808 total pagecache pages
>> 0 pages in swap cache
>> Swap cache stats: add 0, delete 0, find 0/0
>> Free swap  = 0kB
>> Total swap = 0kB
>> 255744 pages RAM
>> 5589 pages reserved
>> 249340 pages shared
>> 219039 pages non-shared
>> Out of memory: kill process 11471 (msgctl11) score 112393 or a child
>> Killed process 12318 (msgctl11)

2009-06-29 19:09:05

by KOSAKI Motohiro

Subject: Re: Found the commit that causes the OOMs

typo.

> OK. we need learn testcase more.
>
> [read test program source code... ]
>
> this program makes `cat /proc/sys/kernel/msgmni` * 10 processes.
> At least, one process creation need one userland stack page (i.e. one anon)
> + one kernel stack page (i.e. one unaccount page) + one pagetable page.
>
> In my 1GB box environment, default msgmni is 11969.
> Oh well, the system physical ram (255744) is less than needed pages (11969 * 3).

wrong) 11969 * 3
correct) 119690 * 3
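
Spelled out with the corrected figure, the estimate is msgmni * 10 processes * 3 pages each. A tiny sketch of that arithmetic follows; the function name is illustrative and the 3 pages per process is the rough lower bound given in the previous mail, not a measured value.

```c
/* Lower bound on pages needed, using the per-process estimate from the mail. */
long msgctl11_min_pages(long msgmni, long procs_per_queue, long pages_per_proc)
{
	return msgmni * procs_per_queue * pages_per_proc;
}
```

msgctl11_min_pages(11969, 10, 3) gives 359070 pages, well above the 255744 pages of RAM in the report.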

2009-06-30 04:08:43

by Minchan Kim

Subject: Re: Found the commit that causes the OOMs

On Mon, 29 Jun 2009 17:07:25 +0100
Mel Gorman <[email protected]> wrote:

> On Tue, Jun 30, 2009 at 12:00:26AM +0900, Minchan Kim wrote:
> > On Mon, Jun 29, 2009 at 11:21 PM, David Howells<[email protected]> wrote:
> > > Wu Fengguang <[email protected]> wrote:
> > >
> > >> Sorry! This one compiles OK:
> > >
> > > Sadly that doesn't seem to work either:
> > >
> > > msgctl11 invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0
> > > msgctl11 cpuset=/ mems_allowed=0
> > > Pid: 30858, comm: msgctl11 Not tainted 2.6.31-rc1-cachefs #146
> > > Call Trace:
> > >  [<ffffffff8107207e>] ? oom_kill_process.clone.0+0xa9/0x245
> > >  [<ffffffff81072345>] ? __out_of_memory+0x12b/0x142
> > >  [<ffffffff810723c6>] ? out_of_memory+0x6a/0x94
> > >  [<ffffffff81074a90>] ? __alloc_pages_nodemask+0x42e/0x51d
> > >  [<ffffffff81080843>] ? do_wp_page+0x2c6/0x5f5
> > >  [<ffffffff810820c1>] ? handle_mm_fault+0x5dd/0x62f
> > >  [<ffffffff81022c32>] ? do_page_fault+0x1f8/0x20d
> > >  [<ffffffff812e069f>] ? page_fault+0x1f/0x30
> > > Mem-Info:
> > > DMA per-cpu:
> > > CPU    0: hi:    0, btch:   1 usd:   0
> > > CPU    1: hi:    0, btch:   1 usd:   0
> > > DMA32 per-cpu:
> > > CPU    0: hi:  186, btch:  31 usd:  38
> > > CPU    1: hi:  186, btch:  31 usd: 106
> > > Active_anon:75040 active_file:0 inactive_anon:2031
> > >  inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
> > >  free:1951 slab:41499 mapped:301 pagetables:60674 bounce:0
> > > DMA free:3932kB min:60kB low:72kB high:88kB active_anon:2868kB inactive_anon:384kB active_file:0kB inactive_file:0kB unevictable:0kB present:15364kB pages_scanned:0 all_unreclaimable? no
> > > lowmem_reserve[]: 0 968 968 968
> > > DMA32 free:3872kB min:3948kB low:4932kB high:5920kB active_anon:297292kB inactive_anon:7740kB active_file:0kB inactive_file:0kB unevictable:0kB present:992032kB pages_scanned:0 all_unreclaimable? no
> > > lowmem_reserve[]: 0 0 0 0
> > > DMA: 7*4kB 0*8kB 0*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3932kB
> > > DMA32: 500*4kB 2*8kB 0*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 3872kB
> > > 1928 total pagecache pages
> > > 0 pages in swap cache
> > > Swap cache stats: add 0, delete 0, find 0/0
> > > Free swap  = 0kB
> > > Total swap = 0kB
> > > 255744 pages RAM
> > > 5589 pages reserved
> > > 238251 pages shared
> > > 216210 pages non-shared
> > > Out of memory: kill process 25221 (msgctl11) score 130560 or a child
> > > Killed process 26379 (msgctl11)
> >
> > Totally, I can't understand this situation.
> > Now, this page allocation is order zero and It is just likely GFP_HIGHUSER.
> > So it's unlikely interrupt context.
>
> The GFP flags that are set are
>
> #define __GFP_HIGHMEM (0x02)
> #define __GFP_MOVABLE (0x08) /* Page is movable */
> #define __GFP_WAIT (0x10) /* Can wait and reschedule? */
> #define __GFP_IO (0x40) /* Can start physical IO? */
> #define __GFP_FS (0x80) /* Can call down to low-level FS? */
> #define __GFP_HARDWALL (0x20000) /* Enforce hardwall cpuset memory allocs */
>
> which are fairly permissive in terms of what action can be taken.
>
> > Buddy already has enough fallback DMA32, I think.
>
> It doesn't really. We are below the minimum watermark. It wouldn't be
> able to grant the allocation until a few pages had been freed.

Yes. I missed that.

> > Why kernel can't allocate page for order 0 ?
> > Is it allocator bug ?
> >
>
> If it is, it is not because the allocation failed as the watermarks were not
> being met. For this situation to be occuring, it has to be scanning the LRU
> lists and making no forward progress. Odd things to note;
>
> o active_anon is very large in comparison to inactive_anon. Is this
> because there is no swap and they are no longer being rotated?

Yes. That was the intention of my patch:

        commit 69c854817566db82c362797b4a6521d0b00fe1d8
        Author: MinChan Kim <[email protected]>
        Date:   Tue Jun 16 15:32:44 2009 -0700

> o Slab and pagetables are very large. Is slab genuinely unshrinkable?
>
> I think this system might be genuinely OOM. It can't reclaim memory and
> we are below the minimum watermarks.
>
> Is it possible there are pages that are counted as active_anon that in
> fact are reclaimable because they are on the wrong LRU list? If that was
> the case, the lack of rotation to inactive list would prevent them
> getting discovered.

I agree.
One candidate was "[BUGFIX][PATCH] fix lumpy reclaim lru handiling at
isolate_lru_pages v2", as Kosaki already said.

Unfortunately, David said that is not it.
But I think your guess makes sense.

David, does the OOM still happen if you revert my patch?


>
> --
> Mel Gorman
> Part-time Phd Student Linux Technology Center
> University of Limerick IBM Dublin Software Lab


--
Kind regards,
Minchan Kim

2009-06-30 09:22:44

by Mel Gorman

Subject: Re: Found the commit that causes the OOMs

On Tue, Jun 30, 2009 at 01:07:41PM +0900, Minchan Kim wrote:
> On Mon, 29 Jun 2009 17:07:25 +0100
> Mel Gorman <[email protected]> wrote:
>
> > On Tue, Jun 30, 2009 at 12:00:26AM +0900, Minchan Kim wrote:
> > > On Mon, Jun 29, 2009 at 11:21 PM, David Howells<[email protected]> wrote:
> > > > Wu Fengguang <[email protected]> wrote:
> > > >
> > > >> Sorry! This one compiles OK:
> > > >
> > > > Sadly that doesn't seem to work either:
> > > >
> > > > msgctl11 invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0
> > > > msgctl11 cpuset=/ mems_allowed=0
> > > > Pid: 30858, comm: msgctl11 Not tainted 2.6.31-rc1-cachefs #146
> > > > Call Trace:
> > > >  [<ffffffff8107207e>] ? oom_kill_process.clone.0+0xa9/0x245
> > > >  [<ffffffff81072345>] ? __out_of_memory+0x12b/0x142
> > > >  [<ffffffff810723c6>] ? out_of_memory+0x6a/0x94
> > > >  [<ffffffff81074a90>] ? __alloc_pages_nodemask+0x42e/0x51d
> > > >  [<ffffffff81080843>] ? do_wp_page+0x2c6/0x5f5
> > > >  [<ffffffff810820c1>] ? handle_mm_fault+0x5dd/0x62f
> > > >  [<ffffffff81022c32>] ? do_page_fault+0x1f8/0x20d
> > > >  [<ffffffff812e069f>] ? page_fault+0x1f/0x30
> > > > Mem-Info:
> > > > DMA per-cpu:
> > > > CPU    0: hi:    0, btch:   1 usd:   0
> > > > CPU    1: hi:    0, btch:   1 usd:   0
> > > > DMA32 per-cpu:
> > > > CPU    0: hi:  186, btch:  31 usd:  38
> > > > CPU    1: hi:  186, btch:  31 usd: 106
> > > > Active_anon:75040 active_file:0 inactive_anon:2031
> > > >  inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
> > > >  free:1951 slab:41499 mapped:301 pagetables:60674 bounce:0
> > > > DMA free:3932kB min:60kB low:72kB high:88kB active_anon:2868kB inactive_anon:384kB active_file:0kB inactive_file:0kB unevictable:0kB present:15364kB pages_scanned:0 all_unreclaimable? no
> > > > lowmem_reserve[]: 0 968 968 968
> > > > DMA32 free:3872kB min:3948kB low:4932kB high:5920kB active_anon:297292kB inactive_anon:7740kB active_file:0kB inactive_file:0kB unevictable:0kB present:992032kB pages_scanned:0 all_unreclaimable? no
> > > > lowmem_reserve[]: 0 0 0 0
> > > > DMA: 7*4kB 0*8kB 0*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3932kB
> > > > DMA32: 500*4kB 2*8kB 0*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 3872kB
> > > > 1928 total pagecache pages
> > > > 0 pages in swap cache
> > > > Swap cache stats: add 0, delete 0, find 0/0
> > > > Free swap  = 0kB
> > > > Total swap = 0kB
> > > > 255744 pages RAM
> > > > 5589 pages reserved
> > > > 238251 pages shared
> > > > 216210 pages non-shared
> > > > Out of memory: kill process 25221 (msgctl11) score 130560 or a child
> > > > Killed process 26379 (msgctl11)
> > >
> > > Totally, I can't understand this situation.
> > > Now, this page allocation is order zero and It is just likely GFP_HIGHUSER.
> > > So it's unlikely interrupt context.
> >
> > The GFP flags that are set are
> >
> > #define __GFP_HIGHMEM (0x02)
> > #define __GFP_MOVABLE (0x08) /* Page is movable */
> > #define __GFP_WAIT (0x10) /* Can wait and reschedule? */
> > #define __GFP_IO (0x40) /* Can start physical IO? */
> > #define __GFP_FS (0x80) /* Can call down to low-level FS? */
> > #define __GFP_HARDWALL (0x20000) /* Enforce hardwall cpuset memory allocs */
> >
> > which are fairly permissive in terms of what action can be taken.
> >
> > > Buddy already has enough fallback DMA32, I think.
> >
> > It doesn't really. We are below the minimum watermark. It wouldn't be
> > able to grant the allocation until a few pages had been freed.
>
> Yes. I missed that.
>
> > > Why kernel can't allocate page for order 0 ?
> > > Is it allocator bug ?
> > >
> >
> > If it is, it is not because the allocation failed as the watermarks were not
> > being met. For this situation to be occuring, it has to be scanning the LRU
> > lists and making no forward progress. Odd things to note;
> >
> > o active_anon is very large in comparison to inactive_anon. Is this
> > because there is no swap and they are no longer being rotated?
>
> Yes. My patch's intention was that.
>
> commit 69c854817566db82c362797b4a6521d0b00fe1d8
> Author: MinChan Kim <[email protected]>
> Date: Tue Jun 16 15:32:44 2009 -0700
>
> > o Slab and pagetables are very large. Is slab genuinely unshrinkable?
> >
> > I think this system might be genuinely OOM. It can't reclaim memory and
> > we are below the minimum watermarks.
> >
> > Is it possible there are pages that are counted as active_anon that in
> > fact are reclaimable because they are on the wrong LRU list? If that was
> > the case, the lack of rotation to inactive list would prevent them
> > getting discovered.
>
> I agree.
> One of them is that "[BUGFIX][PATCH] fix lumpy reclaim lru handiling at
> isolate_lru_pages v2" as Kosaki already said.
>
> Unfortunately, David said it's not.
> But I think your guessing make sense.
>
> David. Doesn't it happen OOM if you revert my patch, still?
>

In the event the OOM does not happen with the patch reverted, I suggest
you put together a debugging patch that prints out details of all pages
on the active_anon LRU list in the event of an OOM. The intention is to
figure out what pages are on the active_anon list that shouldn't be.

--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

2009-06-30 09:30:28

by Minchan Kim

Subject: Re: Found the commit that causes the OOMs

On Tue, Jun 30, 2009 at 6:22 PM, Mel Gorman<[email protected]> wrote:
> On Tue, Jun 30, 2009 at 01:07:41PM +0900, Minchan Kim wrote:
>> On Mon, 29 Jun 2009 17:07:25 +0100
>> Mel Gorman <[email protected]> wrote:
>>
>> > On Tue, Jun 30, 2009 at 12:00:26AM +0900, Minchan Kim wrote:
>> > > On Mon, Jun 29, 2009 at 11:21 PM, David Howells<[email protected]> wrote:
>> > > > Wu Fengguang <[email protected]> wrote:
>> > > >
>> > > >> Sorry! This one compiles OK:
>> > > >
>> > > > Sadly that doesn't seem to work either:
>> > > >
>> > > > msgctl11 invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0
>> > > > msgctl11 cpuset=/ mems_allowed=0
>> > > > Pid: 30858, comm: msgctl11 Not tainted 2.6.31-rc1-cachefs #146
>> > > > Call Trace:
>> > > >  [<ffffffff8107207e>] ? oom_kill_process.clone.0+0xa9/0x245
>> > > >  [<ffffffff81072345>] ? __out_of_memory+0x12b/0x142
>> > > >  [<ffffffff810723c6>] ? out_of_memory+0x6a/0x94
>> > > >  [<ffffffff81074a90>] ? __alloc_pages_nodemask+0x42e/0x51d
>> > > >  [<ffffffff81080843>] ? do_wp_page+0x2c6/0x5f5
>> > > >  [<ffffffff810820c1>] ? handle_mm_fault+0x5dd/0x62f
>> > > >  [<ffffffff81022c32>] ? do_page_fault+0x1f8/0x20d
>> > > >  [<ffffffff812e069f>] ? page_fault+0x1f/0x30
>> > > > Mem-Info:
>> > > > DMA per-cpu:
>> > > > CPU    0: hi:    0, btch:   1 usd:   0
>> > > > CPU    1: hi:    0, btch:   1 usd:   0
>> > > > DMA32 per-cpu:
>> > > > CPU    0: hi:  186, btch:  31 usd:  38
>> > > > CPU    1: hi:  186, btch:  31 usd: 106
>> > > > Active_anon:75040 active_file:0 inactive_anon:2031
>> > > >  inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
>> > > >  free:1951 slab:41499 mapped:301 pagetables:60674 bounce:0
>> > > > DMA free:3932kB min:60kB low:72kB high:88kB active_anon:2868kB inactive_anon:384kB active_file:0kB inactive_file:0kB unevictable:0kB present:15364kB pages_scanned:0 all_unreclaimable? no
>> > > > lowmem_reserve[]: 0 968 968 968
>> > > > DMA32 free:3872kB min:3948kB low:4932kB high:5920kB active_anon:297292kB inactive_anon:7740kB active_file:0kB inactive_file:0kB unevictable:0kB present:992032kB pages_scanned:0 all_unreclaimable? no
>> > > > lowmem_reserve[]: 0 0 0 0
>> > > > DMA: 7*4kB 0*8kB 0*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3932kB
>> > > > DMA32: 500*4kB 2*8kB 0*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 3872kB
>> > > > 1928 total pagecache pages
>> > > > 0 pages in swap cache
>> > > > Swap cache stats: add 0, delete 0, find 0/0
>> > > > Free swap  = 0kB
>> > > > Total swap = 0kB
>> > > > 255744 pages RAM
>> > > > 5589 pages reserved
>> > > > 238251 pages shared
>> > > > 216210 pages non-shared
>> > > > Out of memory: kill process 25221 (msgctl11) score 130560 or a child
>> > > > Killed process 26379 (msgctl11)
>> > >
>> > > Totally, I can't understand this situation.
>> > > Now, this page allocation is order zero and It is just likely GFP_HIGHUSER.
>> > > So it's unlikely interrupt context.
>> >
>> > The GFP flags that are set are
>> >
>> > #define __GFP_HIGHMEM       (0x02)
>> > #define __GFP_MOVABLE       (0x08)  /* Page is movable */
>> > #define __GFP_WAIT  (0x10)  /* Can wait and reschedule? */
>> > #define __GFP_IO    (0x40)  /* Can start physical IO? */
>> > #define __GFP_FS    (0x80)  /* Can call down to low-level FS? */
>> > #define __GFP_HARDWALL   (0x20000) /* Enforce hardwall cpuset memory allocs */
>> >
>> > which are fairly permissive in terms of what action can be taken.
>> >
>> > > Buddy already has enough fallback DMA32, I think.
>> >
>> > It doesn't really. We are below the minimum watermark. It wouldn't be
>> > able to grant the allocation until a few pages had been freed.
>>
>> Yes. I missed that.
>>
>> > > Why kernel can't allocate page for order 0 ?
>> > > Is it allocator bug ?
>> > >
>> >
>> > If it is, it is not because the allocation failed as the watermarks were not
>> > being met. For this situation to be occuring, it has to be scanning the LRU
>> > lists and making no forward progress. Odd things to note;
>> >
>> > o active_anon is very large in comparison to inactive_anon. Is this
>> >   because there is no swap and they are no longer being rotated?
>>
>> Yes. My patch's intention was that.
>>
>>        commit 69c854817566db82c362797b4a6521d0b00fe1d8
>>        Author: MinChan Kim <[email protected]>
>>        Date:   Tue Jun 16 15:32:44 2009 -0700
>>
>> > o Slab and pagetables are very large. Is slab genuinely unshrinkable?
>> >
>> > I think this system might be genuinely OOM. It can't reclaim memory and
>> > we are below the minimum watermarks.
>> >
>> > Is it possible there are pages that are counted as active_anon that in
>> > fact are reclaimable because they are on the wrong LRU list? If that was
>> > the case, the lack of rotation to inactive list would prevent them
>> > getting discovered.
>>
>> I agree.
>> One of them is that "[BUGFIX][PATCH] fix lumpy reclaim lru handiling at
>> isolate_lru_pages v2" as Kosaki already said.
>>
>> Unfortunately, David said it's not.
>> But I think your guessing make sense.
>>
>> David. Doesn't it happen OOM if you revert my patch, still?
>>
>
> In the event the OOM does not happen with the patch reverted, I suggest
> you put together a debugging patch that prints out details of all pages
> on the active_anon LRU list in the event of an OOM. The intention is to
> figure out what pages are on the active_anon list that shouldn't be.

Okay. But unfortunately, I won't be able to do it until after the day after tomorrow. ;-(

> --
> Mel Gorman
> Part-time Phd Student                          Linux Technology Center
> University of Limerick                         IBM Dublin Software Lab
>



--
Kind regards,
Minchan Kim

2009-06-30 14:02:09

by Minchan Kim

Subject: Re: Found the commit that causes the OOMs

Hi, David.

On Tue, 30 Jun 2009 10:22:36 +0100
Mel Gorman <[email protected]> wrote:

> > > I think this system might be genuinely OOM. It can't reclaim memory and
> > > we are below the minimum watermarks.
> > >
> > > Is it possible there are pages that are counted as active_anon that in
> > > fact are reclaimable because they are on the wrong LRU list? If that was
> > > the case, the lack of rotation to inactive list would prevent them
> > > getting discovered.
> >
> > I agree.
> > One of them is that "[BUGFIX][PATCH] fix lumpy reclaim lru handiling at
> > isolate_lru_pages v2" as Kosaki already said.
> >
> > Unfortunately, David said it's not.
> > But I think your guessing make sense.
> >
> > David. Doesn't it happen OOM if you revert my patch, still?
> >
>
> In the event the OOM does not happen with the patch reverted, I suggest
> you put together a debugging patch that prints out details of all pages
> on the active_anon LRU list in the event of an OOM. The intention is to
> figure out what pages are on the active_anon list that shouldn't be.

Before I leave on my trip, I made a debugging patch in a hurry.
Mel and I suspect that pages are being put on the wrong LRU list.

The goal of this patch is to print the details of each page on the active anon
LRU list when an OOM happens. You may need to increase your log buffer size.

Could you show me the output from an OOM, please?

---
 include/linux/mm.h |    1 +
 lib/show_mem.c     |    2 +-
 mm/page_alloc.c    |   22 ++++++++++++++++++++++
 mm/vmstat.c        |   14 ++++++++++++++
 4 files changed, 38 insertions(+), 1 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index ba3a7cb..cfd8111 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -713,6 +713,7 @@ extern void pagefault_out_of_memory(void);

 #define offset_in_page(p)	((unsigned long)(p) & ~PAGE_MASK)

+extern void show_active_anonpages(void);
 extern void show_free_areas(void);

 #ifdef CONFIG_SHMEM
diff --git a/lib/show_mem.c b/lib/show_mem.c
index 238e72a..32a3a32 100644
--- a/lib/show_mem.c
+++ b/lib/show_mem.c
@@ -17,7 +17,7 @@ void show_mem(void)

 	printk(KERN_INFO "Mem-Info:\n");
 	show_free_areas();
-
+	show_active_anonpages();
 	for_each_online_pgdat(pgdat) {
 		unsigned long i, flags;

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5d714f8..d666f9e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2090,6 +2090,28 @@ void si_meminfo_node(struct sysinfo *val, int nid)

 #define K(x) ((x) << (PAGE_SHIFT-10))

+void show_active_anonpages(void)
+{
+	struct zone *zone;
+	struct list_head *list;
+	struct page *page;
+
+	for_each_populated_zone(zone) {
+		if (list_empty(&zone->lru[LRU_ACTIVE_ANON].list))
+			continue;
+
+		spin_lock_irq(&zone->lru_lock);
+		list = &zone->lru[LRU_ACTIVE_ANON].list;
+		printk("==== %s ==== \n", zone->name);
+		list_for_each_entry(page, list, lru) {
+			printk(KERN_INFO "pfn:0x%08lx F:0x%08lx anon:%d C:%d M:%d\n",
+				page_to_pfn(page), page->flags, PageAnon(page),
+				atomic_read(&page->_count), atomic_read(&page->_mapcount));
+		}
+		spin_unlock_irq(&zone->lru_lock);
+	}
+
+}
 /*
  * Show free area list (used inside shift_scroll-lock stuff)
  * We also calculate the percentage fragmentation. We do this by counting the
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 138bed5..c23ecaa 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -575,6 +575,12 @@ static int fragmentation_open(struct inode *inode, struct file *file)
 	return seq_open(file, &fragmentation_op);
 }

+static int active_anon_open(struct inode *inode, struct file *file)
+{
+	show_active_anonpages();
+	return -ENOENT;
+}
+
 static const struct file_operations fragmentation_file_operations = {
 	.open		= fragmentation_open,
 	.read		= seq_read,
@@ -582,6 +588,13 @@ static const struct file_operations fragmentation_file_operations = {
 	.release	= seq_release,
 };

+static const struct file_operations active_anon_file_operations = {
+	.open		= active_anon_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= seq_release,
+};
+
 static const struct seq_operations pagetypeinfo_op = {
 	.start	= frag_start,
 	.next	= frag_next,
@@ -938,6 +951,7 @@ static int __init setup_vmstat(void)
 #endif
 #ifdef CONFIG_PROC_FS
 	proc_create("buddyinfo", S_IRUGO, NULL, &fragmentation_file_operations);
+	proc_create("activelruinfo", S_IRUGO, NULL, &active_anon_file_operations);
 	proc_create("pagetypeinfo", S_IRUGO, NULL, &pagetypeinfo_file_ops);
 	proc_create("vmstat", S_IRUGO, NULL, &proc_vmstat_file_operations);
 	proc_create("zoneinfo", S_IRUGO, NULL, &proc_zoneinfo_file_operations);
--
1.5.4.3
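
For anyone collecting the resulting log, a small userspace helper along these lines could parse the per-page lines. It is a sketch only: it assumes the printk format in the patch above ("pfn:0x%08lx F:0x%08lx anon:%d C:%d M:%d"), and the struct and function names are hypothetical, not part of the patch.

```c
#include <stdio.h>
#include <string.h>

/* One page record as printed by show_active_anonpages() in the patch above. */
struct anon_page_info {
	unsigned long pfn;	/* page frame number */
	unsigned long flags;	/* raw page->flags */
	int anon;		/* PageAnon() result */
	int count;		/* page->_count */
	int mapcount;		/* page->_mapcount, may be negative */
};

/*
 * Parse one log line, skipping any prefix such as a dmesg timestamp.
 * Returns 0 on success, -1 if the line is not a page record.
 */
int parse_anon_page_line(const char *line, struct anon_page_info *out)
{
	const char *p = strstr(line, "pfn:0x");

	if (!p)
		return -1;
	if (sscanf(p, "pfn:0x%lx F:0x%lx anon:%d C:%d M:%d",
		   &out->pfn, &out->flags, &out->anon,
		   &out->count, &out->mapcount) != 5)
		return -1;
	return 0;
}
```

Zone header lines such as "==== DMA32 ====" are rejected with -1, so a caller can filter a full dmesg capture and, for example, count the records with anon:0 to look for pages that should not be on the active_anon list.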


--
Kind regards,
Minchan Kim

2009-06-30 14:05:26

by Wu Fengguang

Subject: Re: Found the commit that causes the OOMs

On Mon, Jun 29, 2009 at 11:56:47PM +0800, David Woodhouse wrote:
> On Mon, 2009-06-29 at 16:54 +0100, David Howells wrote:
> > Wu Fengguang <[email protected]> wrote:
> >
> > > Yes this time the OOM order/flags are much different from all previous OOMs.
> > >
> > > btw, I found that msgctl11 is pretty good at making a lot of SUnreclaim and
> > > PageTables pages:
> >
> > I got David Woodhouse to run this on one of this boxes, but he doesn't see the
> > problem, I think because he's got 4GB of RAM, and never comes close to running
> > out.
> >
> > I've asked him to reboot with mem=1G to see if that helps reproduce it.
>
> msgctl11 invoked oom-killer: gfp_mask=0xd0, order=1, oom_adj=0
> Pid: 5795, comm: msgctl11 Not tainted 2.6.31-rc1 #147
> Call Trace:
> [<ffffffff81092c77>] oom_kill_process.clone.0+0xac/0x254
> [<ffffffff81092b5c>] ? badness+0x24d/0x2bc
> [<ffffffff81092f5f>] __out_of_memory+0x140/0x157
> [<ffffffff8109308f>] out_of_memory+0x119/0x150
> [<ffffffff81095c65>] ? drain_local_pages+0x16/0x18
> [<ffffffff810967ab>] __alloc_pages_nodemask+0x45a/0x55b
> [<ffffffff810a32b0>] ? __inc_zone_page_state+0x2e/0x30
> [<ffffffff810bb6b9>] alloc_pages_current+0xae/0xb6
> [<ffffffff810a604a>] ? do_wp_page+0x621/0x6c3
> [<ffffffff81094d7e>] __get_free_pages+0xe/0x4b
> [<ffffffff810403a7>] copy_process+0xab/0x11a5
> [<ffffffff810327c8>] ? check_preempt_wakeup+0x11a/0x142
> [<ffffffff810a7a06>] ? handle_mm_fault+0x678/0x6e9
> [<ffffffff810415ec>] do_fork+0x14b/0x338
> [<ffffffff8105b50a>] ? up_read+0xe/0x10
> [<ffffffff814ee655>] ? do_page_fault+0x2da/0x307
> [<ffffffff8100a55c>] sys_clone+0x28/0x2a
> [<ffffffff8100bfc3>] stub_clone+0x13/0x20
> [<ffffffff8100bcdb>] ? system_call_fastpath+0x16/0x1b
> Mem-Info:
> Node 0 DMA per-cpu:
> CPU 0: hi: 0, btch: 1 usd: 0
> CPU 1: hi: 0, btch: 1 usd: 0
> CPU 2: hi: 0, btch: 1 usd: 0
> CPU 3: hi: 0, btch: 1 usd: 0
> CPU 4: hi: 0, btch: 1 usd: 0
> CPU 5: hi: 0, btch: 1 usd: 0
> CPU 6: hi: 0, btch: 1 usd: 0
> CPU 7: hi: 0, btch: 1 usd: 0
> Node 0 DMA32 per-cpu:
> CPU 0: hi: 186, btch: 31 usd: 0
> CPU 1: hi: 186, btch: 31 usd: 20
> CPU 2: hi: 186, btch: 31 usd: 19
> CPU 3: hi: 186, btch: 31 usd: 20
> CPU 4: hi: 186, btch: 31 usd: 19
> CPU 5: hi: 186, btch: 31 usd: 24
> CPU 6: hi: 186, btch: 31 usd: 41
> CPU 7: hi: 186, btch: 31 usd: 25
> Active_anon:72835 active_file:89 inactive_anon:575
> inactive_file:103 unevictable:0 dirty:36 writeback:0 unstable:0
> free:2467 slab:38211 mapped:229 pagetables:66918 bounce:0
> Node 0 DMA free:4036kB min:60kB low:72kB high:88kB active_anon:3228kB inactive_anon:256kB active_file:0kB inactive_file:0kB unevictable:0kB present:15356kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 994 994 994
> Node 0 DMA32 free:5832kB min:4000kB low:5000kB high:6000kB active_anon:288112kB inactive_anon:2044kB active_file:356kB inactive_file:412kB unevictable:0kB present:1018080kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
> Node 0 DMA: 1*4kB 2*8kB 1*16kB 0*32kB 1*64kB 2*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3940kB
> Node 0 DMA32: 852*4kB 1*8kB 0*16kB 1*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 5304kB
> 437 total pagecache pages
> 0 pages in swap cache
> Swap cache stats: add 0, delete 0, find 0/0
> Free swap = 0kB
> Total swap = 0kB
> 262144 pages RAM
> 6503 pages reserved
> 205864 pages shared
> 226536 pages non-shared
> Out of memory: kill process 3855 (msgctl11) score 179248 or a child
> Killed process 4222 (msgctl11)

More data: I booted 2.6.31-rc1 with mem=1G, enabled 1GB of swap, and ran msgctl11.

It goes OOM on the 2nd run. The numbers are very interesting: is memory leaked?

[ 2259.825958] msgctl11 invoked oom-killer: gfp_mask=0x84d0, order=0, oom_adj=0
[ 2259.828092] Pid: 29657, comm: msgctl11 Not tainted 2.6.31-rc1 #22
[ 2259.830505] Call Trace:
[ 2259.832010] [<ffffffff8156f366>] ? _spin_unlock+0x26/0x30
[ 2259.834219] [<ffffffff810c8b26>] oom_kill_process+0x176/0x270
[ 2259.837603] [<ffffffff810c8def>] ? badness+0x18f/0x300
[ 2259.839906] [<ffffffff810c9095>] __out_of_memory+0x135/0x170
[ 2259.842035] [<ffffffff810c91c5>] out_of_memory+0xf5/0x180
[ 2259.844270] [<ffffffff810cd86c>] __alloc_pages_nodemask+0x6ac/0x6c0
[ 2259.846743] [<ffffffff810f8fa8>] alloc_pages_current+0x78/0x100
[ 2259.849083] [<ffffffff81033515>] pte_alloc_one+0x15/0x50
[ 2259.851282] [<ffffffff810e0eda>] __pte_alloc+0x2a/0xf0
[ 2259.853454] [<ffffffff810e16e2>] handle_mm_fault+0x742/0x830
[ 2259.855793] [<ffffffff815725cb>] do_page_fault+0x1cb/0x330
[ 2259.858033] [<ffffffff8156fdf5>] page_fault+0x25/0x30
[ 2259.860301] Mem-Info:
[ 2259.861706] Node 0 DMA per-cpu:
[ 2259.862523] CPU 0: hi: 0, btch: 1 usd: 0
[ 2259.864454] CPU 1: hi: 0, btch: 1 usd: 0
[ 2259.866608] Node 0 DMA32 per-cpu:
[ 2259.867404] CPU 0: hi: 186, btch: 31 usd: 197
[ 2259.869283] CPU 1: hi: 186, btch: 31 usd: 175
[ 2259.870511] Active_anon:0 active_file:11 inactive_anon:0

zero anon pages!

[ 2259.870512] inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
[ 2259.870513] free:1986 slab:42170 mapped:96 pagetables:59427 bounce:0
[ 2259.877722] Node 0 DMA free:3976kB min:56kB low:68kB high:84kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB present:15164kB pages_scanned:429 all_unreclaimable? no
[ 2259.883804] lowmem_reserve[]: 0 982 982 982
[ 2259.885814] Node 0 DMA32 free:3968kB min:3980kB low:4972kB high:5968kB active_anon:0kB inactive_anon:0kB active_file:44kB inactive_file:0kB unevictable:0kB present:1005984kB pages_scanned:152 all_unreclaimable? no
[ 2259.890958] lowmem_reserve[]: 0 0 0 0
[ 2259.893183] Node 0 DMA: 4*4kB 3*8kB 2*16kB 0*32kB 1*64kB 2*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3976kB
[ 2259.897406] Node 0 DMA32: 334*4kB 77*8kB 24*16kB 27*32kB 10*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3968kB
[ 2259.902753] 625 total pagecache pages
[ 2259.903623] 454 pages in swap cache
[ 2259.905299] Swap cache stats: add 95129, delete 94675, find 55783/67607
[ 2259.908858] Free swap = 1041232kB
[ 2259.909618] Total swap = 1048568kB

swap far from full!

[ 2259.919456] 262144 pages RAM
[ 2259.921071] 12513 pages reserved
[ 2259.922790] 314212 pages shared
[ 2259.923548] 165757 pages non-shared
[ 2259.925234] Out of memory: kill process 20791 (msgctl11) score 2280094 or a child
[ 2259.928982] Killed process 21946 (msgctl11)

2009-06-30 15:50:47

by Minchan Kim

Subject: Re: Found the commit that causes the OOMs

On Tue, Jun 30, 2009 at 11:05 PM, Wu Fengguang wrote:
>
> More data: I boot 2.6.30-rc1 with mem=1G and enabled 1GB swap and run msgctl11.
>
> It goes OOM at the 2nd run. They are very interesting numbers: memory leaked?

Hmm. This is very serious, and it is a different problem, since this system
has a swap device and it is not full.

Can you reproduce it easily?

I want to reproduce it on my system.

Did you run only msgctl11, not the whole LTP test suite?
With just the default parameters? e.g. $ ./testcases/bin/msgctl11

2nd run? Do you mean you executed msgctl11 twice in a row?
That is, the first run finished successfully and the OOM happened
during the second run, before it could finish?


>        [ 2259.825958] msgctl11 invoked oom-killer: gfp_mask=0x84d0, order=0, oom_adj=0
>        [ 2259.828092] Pid: 29657, comm: msgctl11 Not tainted 2.6.31-rc1 #22
>        [ 2259.830505] Call Trace:
>        [ 2259.832010]  [<ffffffff8156f366>] ? _spin_unlock+0x26/0x30
>        [ 2259.834219]  [<ffffffff810c8b26>] oom_kill_process+0x176/0x270
>        [ 2259.837603]  [<ffffffff810c8def>] ? badness+0x18f/0x300
>        [ 2259.839906]  [<ffffffff810c9095>] __out_of_memory+0x135/0x170
>        [ 2259.842035]  [<ffffffff810c91c5>] out_of_memory+0xf5/0x180
>        [ 2259.844270]  [<ffffffff810cd86c>] __alloc_pages_nodemask+0x6ac/0x6c0
>        [ 2259.846743]  [<ffffffff810f8fa8>] alloc_pages_current+0x78/0x100
>        [ 2259.849083]  [<ffffffff81033515>] pte_alloc_one+0x15/0x50
>        [ 2259.851282]  [<ffffffff810e0eda>] __pte_alloc+0x2a/0xf0
>        [ 2259.853454]  [<ffffffff810e16e2>] handle_mm_fault+0x742/0x830
>        [ 2259.855793]  [<ffffffff815725cb>] do_page_fault+0x1cb/0x330
>        [ 2259.858033]  [<ffffffff8156fdf5>] page_fault+0x25/0x30
>        [ 2259.860301] Mem-Info:
>        [ 2259.861706] Node 0 DMA per-cpu:
>        [ 2259.862523] CPU    0: hi:    0, btch:   1 usd:   0
>        [ 2259.864454] CPU    1: hi:    0, btch:   1 usd:   0
>        [ 2259.866608] Node 0 DMA32 per-cpu:
>        [ 2259.867404] CPU    0: hi:  186, btch:  31 usd: 197
>        [ 2259.869283] CPU    1: hi:  186, btch:  31 usd: 175
>        [ 2259.870511] Active_anon:0 active_file:11 inactive_anon:0
>
> zero anon pages!
>
>        [ 2259.870512]  inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
>        [ 2259.870513]  free:1986 slab:42170 mapped:96 pagetables:59427 bounce:0
>        [ 2259.877722] Node 0 DMA free:3976kB min:56kB low:68kB high:84kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB present:15164kB pages_scanned:429 all_unreclaimable? no
>        [ 2259.883804] lowmem_reserve[]: 0 982 982 982
>        [ 2259.885814] Node 0 DMA32 free:3968kB min:3980kB low:4972kB high:5968kB active_anon:0kB inactive_anon:0kB active_file:44kB inactive_file:0kB unevictable:0kB present:1005984kB pages_scanned:152 all_unreclaimable? no
>        [ 2259.890958] lowmem_reserve[]: 0 0 0 0
>        [ 2259.893183] Node 0 DMA: 4*4kB 3*8kB 2*16kB 0*32kB 1*64kB 2*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3976kB
>        [ 2259.897406] Node 0 DMA32: 334*4kB 77*8kB 24*16kB 27*32kB 10*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3968kB
>        [ 2259.902753] 625 total pagecache pages
>        [ 2259.903623] 454 pages in swap cache
>        [ 2259.905299] Swap cache stats: add 95129, delete 94675, find 55783/67607
>        [ 2259.908858] Free swap  = 1041232kB
>        [ 2259.909618] Total swap = 1048568kB
>
> swap far from full!
>
>        [ 2259.919456] 262144 pages RAM
>        [ 2259.921071] 12513 pages reserved
>        [ 2259.922790] 314212 pages shared
>        [ 2259.923548] 165757 pages non-shared
>        [ 2259.925234] Out of memory: kill process 20791 (msgctl11) score 2280094 or a child
>        [ 2259.928982] Killed process 21946 (msgctl11)
>
>



--
Kind regards,
Minchan Kim

2009-07-01 01:27:23

by KOSAKI Motohiro

Subject: Re: Found the commit that causes the OOMs

> On Mon, Jun 29, 2009 at 11:56:47PM +0800, David Woodhouse wrote:
> > On Mon, 2009-06-29 at 16:54 +0100, David Howells wrote:
> > > Wu Fengguang wrote:
> > >
> > > > Yes this time the OOM order/flags are much different from all previous OOMs.
> > > >
> > > > btw, I found that msgctl11 is pretty good at making a lot of SUnreclaim and
> > > > PageTables pages:
> > >
> > > I got David Woodhouse to run this on one of this boxes, but he doesn't see the
> > > problem, I think because he's got 4GB of RAM, and never comes close to running
> > > out.
> > >
> > > I've asked him to reboot with mem=1G to see if that helps reproduce it.
> >
> > msgctl11 invoked oom-killer: gfp_mask=0xd0, order=1, oom_adj=0
> > Pid: 5795, comm: msgctl11 Not tainted 2.6.31-rc1 #147
> > Call Trace:
> > [<ffffffff81092c77>] oom_kill_process.clone.0+0xac/0x254
> > [<ffffffff81092b5c>] ? badness+0x24d/0x2bc
> > [<ffffffff81092f5f>] __out_of_memory+0x140/0x157
> > [<ffffffff8109308f>] out_of_memory+0x119/0x150
> > [<ffffffff81095c65>] ? drain_local_pages+0x16/0x18
> > [<ffffffff810967ab>] __alloc_pages_nodemask+0x45a/0x55b
> > [<ffffffff810a32b0>] ? __inc_zone_page_state+0x2e/0x30
> > [<ffffffff810bb6b9>] alloc_pages_current+0xae/0xb6
> > [<ffffffff810a604a>] ? do_wp_page+0x621/0x6c3
> > [<ffffffff81094d7e>] __get_free_pages+0xe/0x4b
> > [<ffffffff810403a7>] copy_process+0xab/0x11a5
> > [<ffffffff810327c8>] ? check_preempt_wakeup+0x11a/0x142
> > [<ffffffff810a7a06>] ? handle_mm_fault+0x678/0x6e9
> > [<ffffffff810415ec>] do_fork+0x14b/0x338
> > [<ffffffff8105b50a>] ? up_read+0xe/0x10
> > [<ffffffff814ee655>] ? do_page_fault+0x2da/0x307
> > [<ffffffff8100a55c>] sys_clone+0x28/0x2a
> > [<ffffffff8100bfc3>] stub_clone+0x13/0x20
> > [<ffffffff8100bcdb>] ? system_call_fastpath+0x16/0x1b
> > Mem-Info:
> > Node 0 DMA per-cpu:
> > CPU 0: hi: 0, btch: 1 usd: 0
> > CPU 1: hi: 0, btch: 1 usd: 0
> > CPU 2: hi: 0, btch: 1 usd: 0
> > CPU 3: hi: 0, btch: 1 usd: 0
> > CPU 4: hi: 0, btch: 1 usd: 0
> > CPU 5: hi: 0, btch: 1 usd: 0
> > CPU 6: hi: 0, btch: 1 usd: 0
> > CPU 7: hi: 0, btch: 1 usd: 0
> > Node 0 DMA32 per-cpu:
> > CPU 0: hi: 186, btch: 31 usd: 0
> > CPU 1: hi: 186, btch: 31 usd: 20
> > CPU 2: hi: 186, btch: 31 usd: 19
> > CPU 3: hi: 186, btch: 31 usd: 20
> > CPU 4: hi: 186, btch: 31 usd: 19
> > CPU 5: hi: 186, btch: 31 usd: 24
> > CPU 6: hi: 186, btch: 31 usd: 41
> > CPU 7: hi: 186, btch: 31 usd: 25
> > Active_anon:72835 active_file:89 inactive_anon:575
> > inactive_file:103 unevictable:0 dirty:36 writeback:0 unstable:0
> > free:2467 slab:38211 mapped:229 pagetables:66918 bounce:0
> > Node 0 DMA free:4036kB min:60kB low:72kB high:88kB active_anon:3228kB inactive_anon:256kB active_file:0kB inactive_file:0kB unevictable:0kB present:15356kB pages_scanned:0 all_unreclaimable? no
> > lowmem_reserve[]: 0 994 994 994
> > Node 0 DMA32 free:5832kB min:4000kB low:5000kB high:6000kB active_anon:288112kB inactive_anon:2044kB active_file:356kB inactive_file:412kB unevictable:0kB present:1018080kB pages_scanned:0 all_unreclaimable? no
> > lowmem_reserve[]: 0 0 0 0
> > Node 0 DMA: 1*4kB 2*8kB 1*16kB 0*32kB 1*64kB 2*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3940kB
> > Node 0 DMA32: 852*4kB 1*8kB 0*16kB 1*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 5304kB
> > 437 total pagecache pages
> > 0 pages in swap cache
> > Swap cache stats: add 0, delete 0, find 0/0
> > Free swap = 0kB
> > Total swap = 0kB
> > 262144 pages RAM
> > 6503 pages reserved
> > 205864 pages shared
> > 226536 pages non-shared
> > Out of memory: kill process 3855 (msgctl11) score 179248 or a child
> > Killed process 4222 (msgctl11)
>
> More data: I boot 2.6.30-rc1 with mem=1G and enabled 1GB swap and run msgctl11.
>
> It goes OOM at the 2nd run. They are very interesting numbers: memory leaked?
>
> [ 2259.825958] msgctl11 invoked oom-killer: gfp_mask=0x84d0, order=0, oom_adj=0
> [ 2259.828092] Pid: 29657, comm: msgctl11 Not tainted 2.6.31-rc1 #22
> [ 2259.830505] Call Trace:
> [ 2259.832010] [<ffffffff8156f366>] ? _spin_unlock+0x26/0x30
> [ 2259.834219] [<ffffffff810c8b26>] oom_kill_process+0x176/0x270
> [ 2259.837603] [<ffffffff810c8def>] ? badness+0x18f/0x300
> [ 2259.839906] [<ffffffff810c9095>] __out_of_memory+0x135/0x170
> [ 2259.842035] [<ffffffff810c91c5>] out_of_memory+0xf5/0x180
> [ 2259.844270] [<ffffffff810cd86c>] __alloc_pages_nodemask+0x6ac/0x6c0
> [ 2259.846743] [<ffffffff810f8fa8>] alloc_pages_current+0x78/0x100
> [ 2259.849083] [<ffffffff81033515>] pte_alloc_one+0x15/0x50
> [ 2259.851282] [<ffffffff810e0eda>] __pte_alloc+0x2a/0xf0
> [ 2259.853454] [<ffffffff810e16e2>] handle_mm_fault+0x742/0x830
> [ 2259.855793] [<ffffffff815725cb>] do_page_fault+0x1cb/0x330
> [ 2259.858033] [<ffffffff8156fdf5>] page_fault+0x25/0x30
> [ 2259.860301] Mem-Info:
> [ 2259.861706] Node 0 DMA per-cpu:
> [ 2259.862523] CPU 0: hi: 0, btch: 1 usd: 0
> [ 2259.864454] CPU 1: hi: 0, btch: 1 usd: 0
> [ 2259.866608] Node 0 DMA32 per-cpu:
> [ 2259.867404] CPU 0: hi: 186, btch: 31 usd: 197
> [ 2259.869283] CPU 1: hi: 186, btch: 31 usd: 175
> [ 2259.870511] Active_anon:0 active_file:11 inactive_anon:0
>
> zero anon pages!
>
> [ 2259.870512] inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
> [ 2259.870513] free:1986 slab:42170 mapped:96 pagetables:59427 bounce:0

I bet this is NOT zero. It is only hidden.

My guess at this system's memory usage:
pagetables: 60k pages
kernel stack: 60k pages
anon (hidden): 60k pages
slab: 40k pages
other: 30k pages
===================
total: 250k pages = 1GB

What are "hidden" anon pages?
Each shrink_{in}active_list() call isolates 32 pages from the LRU, so the anon
or file LRU accounting temporarily decreases by that amount.

If the system has plenty of threads or processes, heavy memory pressure can
leave up to #-of-threads x 32 pages isolated at once.

msgctl11 creates >10K processes.

I have a debugging patch for this case.
Wu, can you please try this patch?

If my guess is correct, we need to implement a throttling mechanism for the
number of processes reclaiming concurrently.
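
To make the throttling idea concrete, here is a small userspace toy model in C.
It is only a sketch of one possible policy, not a mechanism specified anywhere
in this thread: the struct, its field names, and the rule "stall new reclaimers
once more pages are isolated than remain on the inactive list" are all
assumptions made for illustration.

```c
#include <stdbool.h>

/* Toy per-zone counters; names are illustrative, not kernel fields. */
struct zone_model {
    unsigned long nr_inactive;  /* pages still on the inactive LRU */
    unsigned long nr_isolated;  /* pages taken off the LRU by reclaimers */
};

/* Hypothetical policy: throttle when isolated pages outnumber listed ones. */
static bool should_throttle(const struct zone_model *z)
{
    return z->nr_isolated > z->nr_inactive;
}

/*
 * One reclaimer tries to isolate a batch of pages.
 * Returns the number of pages taken, or 0 if the caller was throttled.
 */
static unsigned long try_isolate(struct zone_model *z, unsigned long batch)
{
    unsigned long taken;

    if (should_throttle(z))
        return 0;

    taken = batch < z->nr_inactive ? batch : z->nr_inactive;
    z->nr_inactive -= taken;
    z->nr_isolated += taken;
    return taken;
}
```

With 100 inactive pages and a batch of 32, the first two callers each take 32
pages (leaving 36 listed, 64 isolated) and the third is refused, so the LRU
counters can no longer drain to zero while the pages are merely in flight.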

============================================
If the system has plenty of threads, concurrent reclaim can isolate a very
large number of pages. Unfortunately, the current /proc/meminfo and OOM log
can't show it.

Machine
IA64 x8 CPU
MEM 8GB

How to reproduce

% ./hackbench 140 process 1000
=> causes OOM

Active_anon:203 active_file:91 inactive_anon:104
inactive_file:76 unevictable:0 dirty:0 writeback:72 unstable:0
free:168 slab:4968 mapped:136 pagetables:28203 bounce:0
isolate:49088
^^^^

---
fs/proc/meminfo.c | 6 ++++--
include/linux/mmzone.h | 1 +
mm/page_alloc.c | 6 ++++--
mm/vmscan.c | 5 +++++
mm/vmstat.c | 1 +
5 files changed, 15 insertions(+), 4 deletions(-)

Index: b/fs/proc/meminfo.c
===================================================================
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -95,7 +95,8 @@ static int meminfo_proc_show(struct seq_
"Committed_AS: %8lu kB\n"
"VmallocTotal: %8lu kB\n"
"VmallocUsed: %8lu kB\n"
- "VmallocChunk: %8lu kB\n",
+ "VmallocChunk: %8lu kB\n"
+ "IsolatePages: %8lu kB\n",
K(i.totalram),
K(i.freeram),
K(i.bufferram),
@@ -139,7 +140,8 @@ static int meminfo_proc_show(struct seq_
K(committed),
(unsigned long)VMALLOC_TOTAL >> 10,
vmi.used >> 10,
- vmi.largest_chunk >> 10
+ vmi.largest_chunk >> 10,
+ K(global_page_state(NR_ISOLATE))
);

hugetlb_report_meminfo(m);
Index: b/include/linux/mmzone.h
===================================================================
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -107,6 +107,7 @@ enum zone_stat_item {
NUMA_LOCAL, /* allocation from local node */
NUMA_OTHER, /* allocation from other node */
#endif
+ NR_ISOLATE,
NR_VM_ZONE_STAT_ITEMS };

/*
Index: b/mm/page_alloc.c
===================================================================
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2119,7 +2119,8 @@ void show_free_areas(void)
" inactive_file:%lu"
" unevictable:%lu"
" dirty:%lu writeback:%lu unstable:%lu\n"
- " free:%lu slab:%lu mapped:%lu pagetables:%lu bounce:%lu\n",
+ " free:%lu slab:%lu mapped:%lu pagetables:%lu bounce:%lu\n"
+ " isolate:%lu\n",
global_page_state(NR_ACTIVE_ANON),
global_page_state(NR_ACTIVE_FILE),
global_page_state(NR_INACTIVE_ANON),
@@ -2133,7 +2134,8 @@ void show_free_areas(void)
global_page_state(NR_SLAB_UNRECLAIMABLE),
global_page_state(NR_FILE_MAPPED),
global_page_state(NR_PAGETABLE),
- global_page_state(NR_BOUNCE));
+ global_page_state(NR_BOUNCE),
+ global_page_state(NR_ISOLATE));

for_each_populated_zone(zone) {
int i;
Index: b/mm/vmscan.c
===================================================================
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1066,6 +1066,7 @@ static unsigned long shrink_inactive_lis
unsigned long nr_freed;
unsigned long nr_active;
unsigned int count[NR_LRU_LISTS] = { 0, };
+ unsigned int total_count;
int mode = lumpy_reclaim ? ISOLATE_BOTH : ISOLATE_INACTIVE;

nr_taken = sc->isolate_pages(sc->swap_cluster_max,
@@ -1082,6 +1083,7 @@ static unsigned long shrink_inactive_lis
-count[LRU_ACTIVE_ANON]);
__mod_zone_page_state(zone, NR_INACTIVE_ANON,
-count[LRU_INACTIVE_ANON]);
+ __mod_zone_page_state(zone, NR_ISOLATE, nr_taken);

if (scanning_global_lru(sc))
zone->pages_scanned += nr_scan;
@@ -1131,6 +1133,7 @@ static unsigned long shrink_inactive_lis
goto done;

spin_lock(&zone->lru_lock);
+ __mod_zone_page_state(zone, NR_ISOLATE, -nr_taken);
/*
* Put back any unfreeable pages.
*/
@@ -1232,6 +1235,7 @@ static void move_active_pages_to_lru(str
}
}
__mod_zone_page_state(zone, NR_LRU_BASE + lru, pgmoved);
+ __mod_zone_page_state(zone, NR_ISOLATE, -pgmoved);
if (!is_active_lru(lru))
__count_vm_events(PGDEACTIVATE, pgmoved);
}
@@ -1267,6 +1271,7 @@ static void shrink_active_list(unsigned
__mod_zone_page_state(zone, NR_ACTIVE_FILE, -pgmoved);
else
__mod_zone_page_state(zone, NR_ACTIVE_ANON, -pgmoved);
+ __mod_zone_page_state(zone, NR_ISOLATE, pgmoved);
spin_unlock_irq(&zone->lru_lock);

pgmoved = 0; /* count referenced (mapping) mapped pages */
Index: b/mm/vmstat.c
===================================================================
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -697,6 +697,7 @@ static const char * const vmstat_text[]
"unevictable_pgs_stranded",
"unevictable_pgs_mlockfreed",
#endif
+ "isolate_pages",
};

static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,


2009-07-01 02:14:30

by Rik van Riel

Subject: Re: Found the commit that causes the OOMs

KOSAKI Motohiro wrote:

> if my guess is correct, we need to implement #-of-reclaim-process throttling
> mechanism.

There are probably some other things that want throttling,
too.

For example, the number of pages currently under IO can
be as large as the entire file and anon inactive lists,
which can cause page reclaim to fail because none of the
pages are reclaimable yet.

This is probably not a big issue for the page cache,
since the readahead window will collapse before we hit
this problem.

However, we may want to take measures to ensure that
the total number of pages in swap readahead does not
take up the entire inactive anon list - maybe we should
limit it to half that amount, to stay on the safe side?

I'll whip up a patch for this tomorrow.

That should get rid of the OOMs that have been observed
with the swap readahead patches by Johannes.

--
All rights reversed.

2009-07-01 02:16:56

by Wu Fengguang

Subject: Re: Found the commit that causes the OOMs

On Wed, Jul 01, 2009 at 10:18:03AM +0900, KOSAKI Motohiro wrote:
> > On Mon, Jun 29, 2009 at 11:56:47PM +0800, David Woodhouse wrote:
> > > On Mon, 2009-06-29 at 16:54 +0100, David Howells wrote:
> > > > Wu Fengguang wrote:
> > > >
> > > > > Yes this time the OOM order/flags are much different from all previous OOMs.
> > > > >
> > > > > btw, I found that msgctl11 is pretty good at making a lot of SUnreclaim and
> > > > > PageTables pages:
> > > >
> > > > I got David Woodhouse to run this on one of this boxes, but he doesn't see the
> > > > problem, I think because he's got 4GB of RAM, and never comes close to running
> > > > out.
> > > >
> > > > I've asked him to reboot with mem=1G to see if that helps reproduce it.
> > >
> > > msgctl11 invoked oom-killer: gfp_mask=0xd0, order=1, oom_adj=0
> > > Pid: 5795, comm: msgctl11 Not tainted 2.6.31-rc1 #147
> > > Call Trace:
> > > [<ffffffff81092c77>] oom_kill_process.clone.0+0xac/0x254
> > > [<ffffffff81092b5c>] ? badness+0x24d/0x2bc
> > > [<ffffffff81092f5f>] __out_of_memory+0x140/0x157
> > > [<ffffffff8109308f>] out_of_memory+0x119/0x150
> > > [<ffffffff81095c65>] ? drain_local_pages+0x16/0x18
> > > [<ffffffff810967ab>] __alloc_pages_nodemask+0x45a/0x55b
> > > [<ffffffff810a32b0>] ? __inc_zone_page_state+0x2e/0x30
> > > [<ffffffff810bb6b9>] alloc_pages_current+0xae/0xb6
> > > [<ffffffff810a604a>] ? do_wp_page+0x621/0x6c3
> > > [<ffffffff81094d7e>] __get_free_pages+0xe/0x4b
> > > [<ffffffff810403a7>] copy_process+0xab/0x11a5
> > > [<ffffffff810327c8>] ? check_preempt_wakeup+0x11a/0x142
> > > [<ffffffff810a7a06>] ? handle_mm_fault+0x678/0x6e9
> > > [<ffffffff810415ec>] do_fork+0x14b/0x338
> > > [<ffffffff8105b50a>] ? up_read+0xe/0x10
> > > [<ffffffff814ee655>] ? do_page_fault+0x2da/0x307
> > > [<ffffffff8100a55c>] sys_clone+0x28/0x2a
> > > [<ffffffff8100bfc3>] stub_clone+0x13/0x20
> > > [<ffffffff8100bcdb>] ? system_call_fastpath+0x16/0x1b
> > > Mem-Info:
> > > Node 0 DMA per-cpu:
> > > CPU 0: hi: 0, btch: 1 usd: 0
> > > CPU 1: hi: 0, btch: 1 usd: 0
> > > CPU 2: hi: 0, btch: 1 usd: 0
> > > CPU 3: hi: 0, btch: 1 usd: 0
> > > CPU 4: hi: 0, btch: 1 usd: 0
> > > CPU 5: hi: 0, btch: 1 usd: 0
> > > CPU 6: hi: 0, btch: 1 usd: 0
> > > CPU 7: hi: 0, btch: 1 usd: 0
> > > Node 0 DMA32 per-cpu:
> > > CPU 0: hi: 186, btch: 31 usd: 0
> > > CPU 1: hi: 186, btch: 31 usd: 20
> > > CPU 2: hi: 186, btch: 31 usd: 19
> > > CPU 3: hi: 186, btch: 31 usd: 20
> > > CPU 4: hi: 186, btch: 31 usd: 19
> > > CPU 5: hi: 186, btch: 31 usd: 24
> > > CPU 6: hi: 186, btch: 31 usd: 41
> > > CPU 7: hi: 186, btch: 31 usd: 25
> > > Active_anon:72835 active_file:89 inactive_anon:575
> > > inactive_file:103 unevictable:0 dirty:36 writeback:0 unstable:0
> > > free:2467 slab:38211 mapped:229 pagetables:66918 bounce:0
> > > Node 0 DMA free:4036kB min:60kB low:72kB high:88kB active_anon:3228kB inactive_anon:256kB active_file:0kB inactive_file:0kB unevictable:0kB present:15356kB pages_scanned:0 all_unreclaimable? no
> > > lowmem_reserve[]: 0 994 994 994
> > > Node 0 DMA32 free:5832kB min:4000kB low:5000kB high:6000kB active_anon:288112kB inactive_anon:2044kB active_file:356kB inactive_file:412kB unevictable:0kB present:1018080kB pages_scanned:0 all_unreclaimable? no
> > > lowmem_reserve[]: 0 0 0 0
> > > Node 0 DMA: 1*4kB 2*8kB 1*16kB 0*32kB 1*64kB 2*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3940kB
> > > Node 0 DMA32: 852*4kB 1*8kB 0*16kB 1*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 5304kB
> > > 437 total pagecache pages
> > > 0 pages in swap cache
> > > Swap cache stats: add 0, delete 0, find 0/0
> > > Free swap = 0kB
> > > Total swap = 0kB
> > > 262144 pages RAM
> > > 6503 pages reserved
> > > 205864 pages shared
> > > 226536 pages non-shared
> > > Out of memory: kill process 3855 (msgctl11) score 179248 or a child
> > > Killed process 4222 (msgctl11)
> >
> > More data: I boot 2.6.30-rc1 with mem=1G and enabled 1GB swap and run msgctl11.
> >
> > It goes OOM at the 2nd run. They are very interesting numbers: memory leaked?
> >
> > [ 2259.825958] msgctl11 invoked oom-killer: gfp_mask=0x84d0, order=0, oom_adj=0
> > [ 2259.828092] Pid: 29657, comm: msgctl11 Not tainted 2.6.31-rc1 #22
> > [ 2259.830505] Call Trace:
> > [ 2259.832010] [<ffffffff8156f366>] ? _spin_unlock+0x26/0x30
> > [ 2259.834219] [<ffffffff810c8b26>] oom_kill_process+0x176/0x270
> > [ 2259.837603] [<ffffffff810c8def>] ? badness+0x18f/0x300
> > [ 2259.839906] [<ffffffff810c9095>] __out_of_memory+0x135/0x170
> > [ 2259.842035] [<ffffffff810c91c5>] out_of_memory+0xf5/0x180
> > [ 2259.844270] [<ffffffff810cd86c>] __alloc_pages_nodemask+0x6ac/0x6c0
> > [ 2259.846743] [<ffffffff810f8fa8>] alloc_pages_current+0x78/0x100
> > [ 2259.849083] [<ffffffff81033515>] pte_alloc_one+0x15/0x50
> > [ 2259.851282] [<ffffffff810e0eda>] __pte_alloc+0x2a/0xf0
> > [ 2259.853454] [<ffffffff810e16e2>] handle_mm_fault+0x742/0x830
> > [ 2259.855793] [<ffffffff815725cb>] do_page_fault+0x1cb/0x330
> > [ 2259.858033] [<ffffffff8156fdf5>] page_fault+0x25/0x30
> > [ 2259.860301] Mem-Info:
> > [ 2259.861706] Node 0 DMA per-cpu:
> > [ 2259.862523] CPU 0: hi: 0, btch: 1 usd: 0
> > [ 2259.864454] CPU 1: hi: 0, btch: 1 usd: 0
> > [ 2259.866608] Node 0 DMA32 per-cpu:
> > [ 2259.867404] CPU 0: hi: 186, btch: 31 usd: 197
> > [ 2259.869283] CPU 1: hi: 186, btch: 31 usd: 175
> > [ 2259.870511] Active_anon:0 active_file:11 inactive_anon:0
> >
> > zero anon pages!
> >
> > [ 2259.870512] inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
> > [ 2259.870513] free:1986 slab:42170 mapped:96 pagetables:59427 bounce:0
>
> I bet this is NOT zero. it only hidden.

Yes, very likely! I noticed that it's all about direct scans:

pgscan_kswapd_dma 0
pgscan_kswapd_dma32 0
pgscan_kswapd_normal 0
pgscan_kswapd_movable 0
pgscan_direct_dma 0
pgscan_direct_dma32 7295
pgscan_direct_normal 143810
pgscan_direct_movable 0
zone_reclaim_failed 0
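
For anyone who wants to total these counters without doing it by eye, a small
C helper along the following lines should work. It is a sketch only: it
assumes the "name value" line format shown above, the function names are made
up for this example, and counter names differ between kernel versions, so the
prefix to match has to be checked against the local /proc/vmstat.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one "name value" line. Returns 1 and stores the value if the
 * counter name starts with `prefix`; returns 0 otherwise.
 */
static int vmstat_match(const char *line, const char *prefix,
                        unsigned long *value)
{
    char name[64];
    unsigned long v;

    if (sscanf(line, "%63s %lu", name, &v) != 2)
        return 0;   /* not a "name value" line */
    if (strncmp(name, prefix, strlen(prefix)) != 0)
        return 0;   /* different counter family */
    *value = v;
    return 1;
}

/* Sum every counter in a vmstat-style stream whose name starts with `prefix`. */
static unsigned long vmstat_sum(FILE *fp, const char *prefix)
{
    char line[128];
    unsigned long value, total = 0;

    while (fgets(line, sizeof(line), fp))
        if (vmstat_match(line, prefix, &value))
            total += value;
    return total;
}
```

Opening /proc/vmstat with fopen() and calling vmstat_sum(fp, "pgscan_direct_")
and then vmstat_sum(fp, "pgscan_kswapd_") after a rewind() gives the
direct-versus-kswapd split; for the numbers above that is 151105 direct scans
against 0 from kswapd.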

> I guess this system's memory usage is,
> pagetables: 60k pages
> kernel stack: 60k pages
> anon (hidden): 60k pages
> slab: 40k pages
> other: 30k pages
> ===================
> total: 250k pages = 1GB
>
> What is "hidden" anon pages?
> each shrink_{in}active_list isolate 32 pages from lru. it mean anon or file lru
> accounting decrease temporary.
>
> if system have plenty thread or process, heavy memory pressure makes
> #-of-thread x 32pages isolation.
>
> msgctl11 makes >10K processes.

More exactly, ~16K processes:

msgctl11 0 INFO : Using upto 16298 pids

So the maximum number of isolated pages is 16K * 32 = 512K, or 2GiB.
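
The bound can be checked with a few lines of C. This is only worked
arithmetic: the 32-page batch is the per-call isolation size quoted above, the
4 KiB page size is an assumption about this x86-64 box, and the helper names
are invented for the example.

```c
#include <stdio.h>

/* Pages one reclaimer isolates per shrink_*_list call (figure quoted above). */
#define ISOLATE_BATCH 32UL
/* Assumed page size for this machine. */
#define PAGE_SIZE_BYTES 4096UL

/* Upper bound on pages off the LRU if every process reclaims at once. */
static unsigned long max_isolated_pages(unsigned long nr_procs)
{
    return nr_procs * ISOLATE_BATCH;
}

/* Convert a page count to KiB. */
static unsigned long pages_to_kib(unsigned long pages)
{
    return pages * (PAGE_SIZE_BYTES / 1024UL);
}

/* Print the bound for a given process count. */
static void print_isolation_bound(unsigned long nr_procs)
{
    unsigned long pages = max_isolated_pages(nr_procs);

    printf("%lu processes -> up to %lu pages (%lu KiB) isolated\n",
           nr_procs, pages, pages_to_kib(pages));
}
```

For the 16298 pids reported by msgctl11 this gives 521536 pages, or 2086144 KiB
(just under 2 GiB), which is about twice the 262144 pages of RAM in the OOM
report. The bound is therefore not reachable in full, but it shows that
isolation alone could account for every anon page on the machine.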

> I have debugging patch for this case.
> Wu, Can you please try this patch?

OK. But the OOM is not quite reproducible. Sometimes it produces these
messages:

[ 480.921813] INFO: task msgctl11:21576 blocked for more than 120 seconds.
[ 480.923604] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 480.926330] msgctl11 D ffffffff8180e650 5992 21576 20749 0x00000000
[ 480.929877] ffff880020c87dd8 0000000000000046 0000000000000000 0000000000000046
[ 480.933694] ffff880020c87d48 00000000001d2d80 000000000000cec8 ffff88000d8f8000
[ 480.936458] ffff880034822280 ffff88000d8f8380 0000000020c87d88 ffffffff8107d5d8
[ 480.941100] Call Trace:
[ 480.941706] [<ffffffff8107d5d8>] ? mark_held_locks+0x68/0x90
[ 480.943798] [<ffffffff8158db60>] ? _spin_unlock_irq+0x30/0x40
[ 480.946098] [<ffffffff8107d915>] ? trace_hardirqs_on_caller+0x155/0x1a0
[ 480.948623] [<ffffffff8158d535>] __down_write_nested+0x85/0xc0
[ 480.950960] [<ffffffff8158d57b>] __down_write+0xb/0x10
[ 480.953102] [<ffffffff8158c76d>] down_write+0x6d/0x90
[ 480.955276] [<ffffffff8126d88d>] ? ipcctl_pre_down+0x3d/0x150
[ 480.957637] [<ffffffff8126d88d>] ipcctl_pre_down+0x3d/0x150
[ 480.959897] [<ffffffff8126f04e>] sys_msgctl+0xbe/0x5a0
[ 480.962024] [<ffffffff8106e74b>] ? up_read+0x2b/0x40
[ 480.964177] [<ffffffff8100cc35>] ? retint_swapgs+0x13/0x1b
[ 480.966438] [<ffffffff8107d915>] ? trace_hardirqs_on_caller+0x155/0x1a0
[ 480.968996] [<ffffffff8158d66e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 480.971421] [<ffffffff8100c0f2>] system_call_fastpath+0x16/0x1b
[ 480.974826] 1 lock held by msgctl11/21576:
[ 480.976828] #0: (&ids->rw_mutex){+++++.}, at: [<ffffffff8126d88d>] ipcctl_pre_down+0x3d/0x150
[ 480.980709] INFO: task msgctl11:21602 blocked for more than 120 seconds.
[ 480.983198] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 480.985973] msgctl11 D ffffffff8180e650 5992 21602 20749 0x00000000
[ 480.988581] ffff88001fea7dd8 0000000000000046 0000000000000000 0000000000000046
[ 480.992378] ffff88001fea7d48 00000000001d2d80 000000000000cec8 ffff88002db02280
[ 480.996046] ffff88000f0b0000 ffff88002db02600 000000011fea7d88 ffffffff8107d5d8
[ 480.998791] Call Trace:
[ 481.000111] [<ffffffff8107d5d8>] ? mark_held_locks+0x68/0x90
[ 481.002636] [<ffffffff8158db60>] ? _spin_unlock_irq+0x30/0x40
[ 481.004775] [<ffffffff8107d915>] ? trace_hardirqs_on_caller+0x155/0x1a0
[ 481.007406] [<ffffffff8158d535>] __down_write_nested+0x85/0xc0
[ 481.009474] [<ffffffff8158d57b>] __down_write+0xb/0x10
[ 481.011810] [<ffffffff8158c76d>] down_write+0x6d/0x90
[ 481.013932] [<ffffffff8126d88d>] ? ipcctl_pre_down+0x3d/0x150
[ 481.016245] [<ffffffff8126d88d>] ipcctl_pre_down+0x3d/0x150
[ 481.018489] [<ffffffff8126f04e>] sys_msgctl+0xbe/0x5a0
[ 481.020638] [<ffffffff8106e74b>] ? up_read+0x2b/0x40
[ 481.022885] [<ffffffff8100cc35>] ? retint_swapgs+0x13/0x1b
[ 481.025086] [<ffffffff8107d915>] ? trace_hardirqs_on_caller+0x155/0x1a0
[ 481.027644] [<ffffffff8158d66e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 481.030087] [<ffffffff8100c0f2>] system_call_fastpath+0x16/0x1b
[ 481.032424] 1 lock held by msgctl11/21602:
[ 481.034358] #0: (&ids->rw_mutex){+++++.}, at: [<ffffffff8126d88d>] ipcctl_pre_down+0x3d/0x150
[ 481.038314] INFO: task msgctl11:21603 blocked for more than 120 seconds.
[ 481.040852] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 481.043573] msgctl11 D ffffffff8180e650 5992 21603 20749 0x00000000
[ 481.048159] ffff88003e051dd8 0000000000000046 0000000000000000 0000000000000046
[ 481.051955] ffff88003e051d48 00000000001d2d80 000000000000cec8 ffff88002db04500
[ 481.054755] ffff88003842a280 ffff88002db04880 000000013e051d88 ffffffff8107d5d8
[ 481.058423] Call Trace:
[ 481.059062] [<ffffffff8107d5d8>] ? mark_held_locks+0x68/0x90
[ 481.061049] [<ffffffff8158db60>] ? _spin_unlock_irq+0x30/0x40
[ 481.063352] [<ffffffff8107d915>] ? trace_hardirqs_on_caller+0x155/0x1a0
[ 481.065890] [<ffffffff8158d535>] __down_write_nested+0x85/0xc0
[ 481.068213] [<ffffffff8158d57b>] __down_write+0xb/0x10
[ 481.070388] [<ffffffff8158c76d>] down_write+0x6d/0x90
[ 481.072531] [<ffffffff8126d88d>] ? ipcctl_pre_down+0x3d/0x150
[ 481.074918] [<ffffffff8126d88d>] ipcctl_pre_down+0x3d/0x150
[ 481.077266] [<ffffffff8126f04e>] sys_msgctl+0xbe/0x5a0
[ 481.079328] [<ffffffff8106e74b>] ? up_read+0x2b/0x40
[ 481.081413] [<ffffffff8100cc35>] ? retint_swapgs+0x13/0x1b
[ 481.084243] [<ffffffff8107d915>] ? trace_hardirqs_on_caller+0x155/0x1a0
[ 481.086253] [<ffffffff8158d66e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 481.088653] [<ffffffff8100c0f2>] system_call_fastpath+0x16/0x1b
[ 481.093086] 1 lock held by msgctl11/21603:
[ 481.095000] #0: (&ids->rw_mutex){+++++.}, at: [<ffffffff8126d88d>] ipcctl_pre_down+0x3d/0x150
[ 481.099994] INFO: task msgctl11:21604 blocked for more than 120 seconds.
[ 481.102728] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 481.105238] msgctl11 D ffffffff8180e650 6024 21604 20749 0x00000000
[ 481.108100] ffff88001d8dddd8 0000000000000046 0000000000000000 0000000000000046
[ 481.111671] ffff88001d8ddd48 00000000001d2d80 000000000000cec8 ffff8800261e8000
[ 481.115274] ffff880011da2280 ffff8800261e8380 000000011d8ddd88 ffffffff8107d5d8
[ 481.118169] Call Trace:
[ 481.119356] [<ffffffff8107d5d8>] ? mark_held_locks+0x68/0x90
[ 481.121621] [<ffffffff8158db60>] ? _spin_unlock_irq+0x30/0x40
[ 481.125037] [<ffffffff8107d915>] ? trace_hardirqs_on_caller+0x155/0x1a0
[ 481.127587] [<ffffffff8158d535>] __down_write_nested+0x85/0xc0
[ 481.129854] [<ffffffff8158d57b>] __down_write+0xb/0x10
[ 481.132100] [<ffffffff8158c76d>] down_write+0x6d/0x90
[ 481.134228] [<ffffffff8126d88d>] ? ipcctl_pre_down+0x3d/0x150
[ 481.136518] [<ffffffff8126d88d>] ipcctl_pre_down+0x3d/0x150
[ 481.138748] [<ffffffff8126f04e>] sys_msgctl+0xbe/0x5a0
[ 481.140988] [<ffffffff8106e74b>] ? up_read+0x2b/0x40
[ 481.143146] [<ffffffff8100cc35>] ? retint_swapgs+0x13/0x1b
[ 481.145382] [<ffffffff8107d915>] ? trace_hardirqs_on_caller+0x155/0x1a0
[ 481.147988] [<ffffffff8158d66e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 481.150339] [<ffffffff8100c0f2>] system_call_fastpath+0x16/0x1b
[ 481.152653] 1 lock held by msgctl11/21604:
[ 481.154622] #0: (&ids->rw_mutex){+++++.}, at: [<ffffffff8126d88d>] ipcctl_pre_down+0x3d/0x150
[ 481.158578] INFO: task msgctl11:21605 blocked for more than 120 seconds.
[ 481.161122] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 481.163820] msgctl11 D ffffffff8180e650 5992 21605 20749 0x00000000
[ 481.167579] ffff88003ac9bdd8 0000000000000046 0000000000000000 0000000000000046
[ 481.171269] ffff88003ac9bd48 00000000001d2d80 000000000000cec8 ffff8800261ea280
[ 481.174033] ffff88001b18c500 ffff8800261ea600 000000003ac9bd88 ffffffff8107d5d8
[ 481.177742] Call Trace:
[ 481.178353] [<ffffffff8107d5d8>] ? mark_held_locks+0x68/0x90
[ 481.180308] [<ffffffff8158db60>] ? _spin_unlock_irq+0x30/0x40
[ 481.182594] [<ffffffff8107d915>] ? trace_hardirqs_on_caller+0x155/0x1a0
[ 481.185166] [<ffffffff8158d535>] __down_write_nested+0x85/0xc0
[ 481.187611] [<ffffffff8158d57b>] __down_write+0xb/0x10
[ 481.189586] [<ffffffff8158c76d>] down_write+0x6d/0x90
[ 481.191787] [<ffffffff8126d88d>] ? ipcctl_pre_down+0x3d/0x150
[ 481.194182] [<ffffffff8126d88d>] ipcctl_pre_down+0x3d/0x150
[ 481.196414] [<ffffffff8126f04e>] sys_msgctl+0xbe/0x5a0
[ 481.198593] [<ffffffff8106e74b>] ? up_read+0x2b/0x40
[ 481.200719] [<ffffffff8100cc35>] ? retint_swapgs+0x13/0x1b
[ 481.203212] [<ffffffff8107d915>] ? trace_hardirqs_on_caller+0x155/0x1a0
[ 481.205518] [<ffffffff8158d66e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 481.208072] [<ffffffff8100c0f2>] system_call_fastpath+0x16/0x1b
[ 481.211357] 1 lock held by msgctl11/21605:
[ 481.213263] #0: (&ids->rw_mutex){+++++.}, at: [<ffffffff8126d88d>] ipcctl_pre_down+0x3d/0x150
[ 481.217340] INFO: task msgctl11:21606 blocked for more than 120 seconds.
[ 481.219787] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 481.222503] msgctl11 D ffffffff8180e650 5992 21606 20749 0x00000000
[ 481.225146] ffff88003c46fdd8 0000000000000046 0000000000000000 0000000000000046
[ 481.228946] ffff88003c46fd48 00000000001d2d80 000000000000cec8 ffff8800261ec500
[ 481.233527] ffff88000d524500 ffff8800261ec880 000000003c46fd88 ffffffff8107d5d8
[ 481.236324] Call Trace:
[ 481.237669] [<ffffffff8107d5d8>] ? mark_held_locks+0x68/0x90
[ 481.239944] [<ffffffff8158db60>] ? _spin_unlock_irq+0x30/0x40
[ 481.242294] [<ffffffff8107d915>] ? trace_hardirqs_on_caller+0x155/0x1a0
[ 481.244740] [<ffffffff8158d535>] __down_write_nested+0x85/0xc0
[ 481.247035] [<ffffffff8158d57b>] __down_write+0xb/0x10
[ 481.249302] [<ffffffff8158c76d>] down_write+0x6d/0x90
[ 481.251494] [<ffffffff8126d88d>] ? ipcctl_pre_down+0x3d/0x150
[ 481.253789] [<ffffffff8126d88d>] ipcctl_pre_down+0x3d/0x150
[ 481.255967] [<ffffffff8126f04e>] sys_msgctl+0xbe/0x5a0
[ 481.259279] [<ffffffff8106e74b>] ? up_read+0x2b/0x40
[ 481.261388] [<ffffffff8100cc35>] ? retint_swapgs+0x13/0x1b
[ 481.263678] [<ffffffff8107d915>] ? trace_hardirqs_on_caller+0x155/0x1a0
[ 481.266087] [<ffffffff8158d66e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 481.269651] [<ffffffff8100c0f2>] system_call_fastpath+0x16/0x1b
[ 481.271956] 1 lock held by msgctl11/21606:
[ 481.273861] #0: (&ids->rw_mutex){+++++.}, at: [<ffffffff8126d88d>] ipcctl_pre_down+0x3d/0x150
[ 481.277914] INFO: task msgctl11:21607 blocked for more than 120 seconds.
[ 481.280416] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 481.283078] msgctl11 D ffffffff8180e650 5992 21607 20749 0x00000000
[ 481.286706] ffff880037541dd8 0000000000000046 0000000000000000 0000000000000046
[ 481.290514] ffff880037541d48 00000000001d2d80 000000000000cec8 ffff880032778000
[ 481.293299] ffff880026138000 ffff880032778380 0000000037541d88 ffffffff8107d5d8
[ 481.296913] Call Trace:
[ 481.297602] [<ffffffff8107d5d8>] ? mark_held_locks+0x68/0x90
[ 481.299598] [<ffffffff8158db60>] ? _spin_unlock_irq+0x30/0x40
[ 481.301883] [<ffffffff8107d915>] ? trace_hardirqs_on_caller+0x155/0x1a0
[ 481.304459] [<ffffffff8158d535>] __down_write_nested+0x85/0xc0
[ 481.307723] [<ffffffff8158d57b>] __down_write+0xb/0x10
[ 481.309897] [<ffffffff8158c76d>] down_write+0x6d/0x90
[ 481.312082] [<ffffffff8126d88d>] ? ipcctl_pre_down+0x3d/0x150
[ 481.314457] [<ffffffff8126d88d>] ipcctl_pre_down+0x3d/0x150
[ 481.316683] [<ffffffff8126f04e>] sys_msgctl+0xbe/0x5a0
[ 481.318874] [<ffffffff8106e74b>] ? up_read+0x2b/0x40
[ 481.320968] [<ffffffff8100cc35>] ? retint_swapgs+0x13/0x1b
[ 481.323255] [<ffffffff8107d915>] ? trace_hardirqs_on_caller+0x155/0x1a0
[ 481.325778] [<ffffffff8158d66e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 481.328244] [<ffffffff8100c0f2>] system_call_fastpath+0x16/0x1b
[ 481.330614] 1 lock held by msgctl11/21607:
[ 481.332534] #0: (&ids->rw_mutex){+++++.}, at: [<ffffffff8126d88d>] ipcctl_pre_down+0x3d/0x150
[ 481.336512] INFO: task msgctl11:21608 blocked for more than 120 seconds.
[ 481.338992] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 481.341831] msgctl11 D ffffffff8180e650 5992 21608 20749 0x00000000
[ 481.344388] ffff880037543dd8 0000000000000046 0000000000000000 0000000000000046
[ 481.349179] ffff880037543d48 00000000001d2d80 000000000000cec8 ffff88003277a280
[ 481.352782] ffff8800238a4500 ffff88003277a600 0000000037543d88 ffffffff8107d5d8
[ 481.355573] Call Trace:
[ 481.356895] [<ffffffff8107d5d8>] ? mark_held_locks+0x68/0x90
[ 481.359168] [<ffffffff8158db60>] ? _spin_unlock_irq+0x30/0x40
[ 481.361546] [<ffffffff8107d915>] ? trace_hardirqs_on_caller+0x155/0x1a0
[ 481.364026] [<ffffffff8158d535>] __down_write_nested+0x85/0xc0
[ 481.366314] [<ffffffff8158d57b>] __down_write+0xb/0x10
[ 481.369593] [<ffffffff8158c76d>] down_write+0x6d/0x90
[ 481.371761] [<ffffffff8126d88d>] ? ipcctl_pre_down+0x3d/0x150
[ 481.374024] [<ffffffff8126d88d>] ipcctl_pre_down+0x3d/0x150
[ 481.376267] [<ffffffff8126f04e>] sys_msgctl+0xbe/0x5a0
[ 481.379570] [<ffffffff8106e74b>] ? up_read+0x2b/0x40
[ 481.381661] [<ffffffff8100cc35>] ? retint_swapgs+0x13/0x1b
[ 481.383910] [<ffffffff8107d915>] ? trace_hardirqs_on_caller+0x155/0x1a0
[ 481.386391] [<ffffffff8158d66e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 481.389858] [<ffffffff8100c0f2>] system_call_fastpath+0x16/0x1b
[ 481.392210] 1 lock held by msgctl11/21608:
[ 481.394137] #0: (&ids->rw_mutex){+++++.}, at: [<ffffffff8126d88d>] ipcctl_pre_down+0x3d/0x150
[ 481.398198] INFO: task msgctl11:21609 blocked for more than 120 seconds.
[ 481.400671] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 481.403631] msgctl11 D ffffffff8180e650 5992 21609 20749 0x00000000
[ 481.406951] ffff88002987bdd8 0000000000000046 0000000000000000 0000000000000046
[ 481.410783] ffff88002987bd48 00000000001d2d80 000000000000cec8 ffff88003277c500
[ 481.413558] ffff880038d40000 ffff88003277c880 000000002987bd88 ffffffff8107d5d8
[ 481.417817] Call Trace:
[ 481.418735] [<ffffffff8107d5d8>] ? mark_held_locks+0x68/0x90
[ 481.421819] [<ffffffff8158db60>] ? _spin_unlock_irq+0x30/0x40
[ 481.424177] [<ffffffff8107d915>] ? trace_hardirqs_on_caller+0x155/0x1a0
[ 481.426707] [<ffffffff8158d535>] __down_write_nested+0x85/0xc0
[ 481.429080] [<ffffffff8158d57b>] __down_write+0xb/0x10
[ 481.431200] [<ffffffff8158c76d>] down_write+0x6d/0x90
[ 481.433302] [<ffffffff8126d88d>] ? ipcctl_pre_down+0x3d/0x150
[ 481.435736] [<ffffffff8126d88d>] ipcctl_pre_down+0x3d/0x150
[ 481.437966] [<ffffffff8126f04e>] sys_msgctl+0xbe/0x5a0
[ 481.440139] [<ffffffff8106e74b>] ? up_read+0x2b/0x40
[ 481.442243] [<ffffffff8100cc35>] ? retint_swapgs+0x13/0x1b
[ 481.444473] [<ffffffff8107d915>] ? trace_hardirqs_on_caller+0x155/0x1a0
[ 481.447078] [<ffffffff8158d66e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 481.449563] [<ffffffff8100c0f2>] system_call_fastpath+0x16/0x1b
[ 481.451897] 1 lock held by msgctl11/21609:
[ 481.453829] #0: (&ids->rw_mutex){+++++.}, at: [<ffffffff8126d88d>] ipcctl_pre_down+0x3d/0x150
[ 481.457796] INFO: task msgctl11:21611 blocked for more than 120 seconds.
[ 481.460287] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 481.463045] msgctl11 D ffffffff8180e650 5992 21611 20749 0x00000000
[ 481.465725] ffff88001a45fdd8 0000000000000046 0000000000000000 0000000000000046
[ 481.469609] ffff88001a45fd48 00000000001d2d80 000000000000cec8 ffff8800238e2280
[ 481.473053] ffff88001edd0000 ffff8800238e2600 000000011a45fd88 ffffffff8107d5d8
[ 481.475887] Call Trace:
[ 481.477197] [<ffffffff8107d5d8>] ? mark_held_locks+0x68/0x90
[ 481.479530] [<ffffffff8158db60>] ? _spin_unlock_irq+0x30/0x40
[ 481.481820] [<ffffffff8107d915>] ? trace_hardirqs_on_caller+0x155/0x1a0
[ 481.484352] [<ffffffff8158d535>] __down_write_nested+0x85/0xc0
[ 481.486579] [<ffffffff8158d57b>] __down_write+0xb/0x10
[ 481.488980] [<ffffffff8158c76d>] down_write+0x6d/0x90
[ 481.491034] [<ffffffff8126d88d>] ? ipcctl_pre_down+0x3d/0x150
[ 481.493963] [<ffffffff8126d88d>] ipcctl_pre_down+0x3d/0x150
[ 481.495567] [<ffffffff8126f04e>] sys_msgctl+0xbe/0x5a0
[ 481.497803] [<ffffffff8106e74b>] ? up_read+0x2b/0x40
[ 481.499943] [<ffffffff8100cc35>] ? retint_swapgs+0x13/0x1b
[ 481.502171] [<ffffffff8107d915>] ? trace_hardirqs_on_caller+0x155/0x1a0
[ 481.504634] [<ffffffff8158d66e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 481.507161] [<ffffffff8100c0f2>] system_call_fastpath+0x16/0x1b
[ 481.509674] 1 lock held by msgctl11/21611:
[ 481.511401] #0: (&ids->rw_mutex){+++++.}, at: [<ffffffff8126d88d>] ipcctl_pre_down+0x3d/0x150

> If my guess is correct, we need to implement a mechanism that throttles
> the number of processes doing reclaim at the same time.
>
> ============================================
> If the system has many threads, concurrent reclaim can isolate a very large number of pages.
> Unfortunately, the current /proc/meminfo and OOM log do not show it.
>
> Machine
> IA64 x8 CPU
> MEM 8GB
>
> How to reproduce
>
> % ./hackbench 140 process 1000
> => causes OOM
>
> Active_anon:203 active_file:91 inactive_anon:104
> inactive_file:76 unevictable:0 dirty:0 writeback:72 unstable:0
> free:168 slab:4968 mapped:136 pagetables:28203 bounce:0
> isolate:49088
> ^^^^
>
> ---
> fs/proc/meminfo.c | 6 ++++--
> include/linux/mmzone.h | 1 +
> mm/page_alloc.c | 6 ++++--
> mm/vmscan.c | 5 +++++
> mm/vmstat.c | 1 +
> 5 files changed, 15 insertions(+), 4 deletions(-)
>
> Index: b/fs/proc/meminfo.c
> ===================================================================
> --- a/fs/proc/meminfo.c
> +++ b/fs/proc/meminfo.c
> @@ -95,7 +95,8 @@ static int meminfo_proc_show(struct seq_
> "Committed_AS: %8lu kB\n"
> "VmallocTotal: %8lu kB\n"
> "VmallocUsed: %8lu kB\n"
> - "VmallocChunk: %8lu kB\n",
> + "VmallocChunk: %8lu kB\n"
> + "IsolatePages: %8lu kB\n",
> K(i.totalram),
> K(i.freeram),
> K(i.bufferram),
> @@ -139,7 +140,8 @@ static int meminfo_proc_show(struct seq_
> K(committed),
> (unsigned long)VMALLOC_TOTAL >> 10,
> vmi.used >> 10,
> - vmi.largest_chunk >> 10
> + vmi.largest_chunk >> 10,
> + K(global_page_state(NR_ISOLATE)),
> );
>
> hugetlb_report_meminfo(m);
> Index: b/include/linux/mmzone.h
> ===================================================================
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -107,6 +107,7 @@ enum zone_stat_item {
> NUMA_LOCAL, /* allocation from local node */
> NUMA_OTHER, /* allocation from other node */
> #endif
> + NR_ISOLATE,
> NR_VM_ZONE_STAT_ITEMS };
>
> /*
> Index: b/mm/page_alloc.c
> ===================================================================
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2119,7 +2119,8 @@ void show_free_areas(void)
> " inactive_file:%lu"
> " unevictable:%lu"
> " dirty:%lu writeback:%lu unstable:%lu\n"
> - " free:%lu slab:%lu mapped:%lu pagetables:%lu bounce:%lu\n",
> + " free:%lu slab:%lu mapped:%lu pagetables:%lu bounce:%lu\n"
> + " isolate:%lu\n",
> global_page_state(NR_ACTIVE_ANON),
> global_page_state(NR_ACTIVE_FILE),
> global_page_state(NR_INACTIVE_ANON),
> @@ -2133,7 +2134,8 @@ void show_free_areas(void)
> global_page_state(NR_SLAB_UNRECLAIMABLE),
> global_page_state(NR_FILE_MAPPED),
> global_page_state(NR_PAGETABLE),
> - global_page_state(NR_BOUNCE));
> + global_page_state(NR_BOUNCE),
> + global_page_state(NR_ISOLATE));
>
> for_each_populated_zone(zone) {
> int i;
> Index: b/mm/vmscan.c
> ===================================================================
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1066,6 +1066,7 @@ static unsigned long shrink_inactive_lis
> unsigned long nr_freed;
> unsigned long nr_active;
> unsigned int count[NR_LRU_LISTS] = { 0, };
> + unsigned int total_count;
> int mode = lumpy_reclaim ? ISOLATE_BOTH : ISOLATE_INACTIVE;
>
> nr_taken = sc->isolate_pages(sc->swap_cluster_max,
> @@ -1082,6 +1083,7 @@ static unsigned long shrink_inactive_lis
> -count[LRU_ACTIVE_ANON]);
> __mod_zone_page_state(zone, NR_INACTIVE_ANON,
> -count[LRU_INACTIVE_ANON]);
> + __mod_zone_page_state(zone, NR_ISOLATE, nr_taken);
>
> if (scanning_global_lru(sc))
> zone->pages_scanned += nr_scan;
> @@ -1131,6 +1133,7 @@ static unsigned long shrink_inactive_lis
> goto done;
>
> spin_lock(&zone->lru_lock);
> + __mod_zone_page_state(zone, NR_ISOLATE, -nr_taken);
> /*
> * Put back any unfreeable pages.
> */
> @@ -1232,6 +1235,7 @@ static void move_active_pages_to_lru(str
> }
> }
> __mod_zone_page_state(zone, NR_LRU_BASE + lru, pgmoved);
> + __mod_zone_page_state(zone, NR_ISOLATE, -pgmoved);
> if (!is_active_lru(lru))
> __count_vm_events(PGDEACTIVATE, pgmoved);
> }
> @@ -1267,6 +1271,7 @@ static void shrink_active_list(unsigned
> __mod_zone_page_state(zone, NR_ACTIVE_FILE, -pgmoved);
> else
> __mod_zone_page_state(zone, NR_ACTIVE_ANON, -pgmoved);
> + __mod_zone_page_state(zone, NR_ISOLATE, pgmoved);
> spin_unlock_irq(&zone->lru_lock);
>
> pgmoved = 0; /* count referenced (mapping) mapped pages */
> Index: b/mm/vmstat.c
> ===================================================================
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -697,6 +697,7 @@ static const char * const vmstat_text[]
> "unevictable_pgs_stranded",
> "unevictable_pgs_mlockfreed",
> #endif
> + "isolate_pages",
> };
>
> static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
>
>

2009-07-01 02:26:54

by Wu Fengguang

Subject: Re: Found the commit that causes the OOMs

On Wed, Jul 01, 2009 at 10:16:45AM +0800, Wu Fengguang wrote:
> On Wed, Jul 01, 2009 at 10:18:03AM +0900, KOSAKI Motohiro wrote:
> > > On Mon, Jun 29, 2009 at 11:56:47PM +0800, David Woodhouse wrote:
> > > > On Mon, 2009-06-29 at 16:54 +0100, David Howells wrote:
> > > > > Wu Fengguang wrote:
> > > > >
> > > > > > Yes this time the OOM order/flags are much different from all previous OOMs.
> > > > > >
> > > > > > btw, I found that msgctl11 is pretty good at making a lot of SUnreclaim and
> > > > > > PageTables pages:
> > > > >
> > > > > I got David Woodhouse to run this on one of his boxes, but he doesn't see the
> > > > > problem, I think because he's got 4GB of RAM, and never comes close to running
> > > > > out.
> > > > >
> > > > > I've asked him to reboot with mem=1G to see if that helps reproduce it.
> > > >
> > > > msgctl11 invoked oom-killer: gfp_mask=0xd0, order=1, oom_adj=0
> > > > Pid: 5795, comm: msgctl11 Not tainted 2.6.31-rc1 #147
> > > > Call Trace:
> > > > [<ffffffff81092c77>] oom_kill_process.clone.0+0xac/0x254
> > > > [<ffffffff81092b5c>] ? badness+0x24d/0x2bc
> > > > [<ffffffff81092f5f>] __out_of_memory+0x140/0x157
> > > > [<ffffffff8109308f>] out_of_memory+0x119/0x150
> > > > [<ffffffff81095c65>] ? drain_local_pages+0x16/0x18
> > > > [<ffffffff810967ab>] __alloc_pages_nodemask+0x45a/0x55b
> > > > [<ffffffff810a32b0>] ? __inc_zone_page_state+0x2e/0x30
> > > > [<ffffffff810bb6b9>] alloc_pages_current+0xae/0xb6
> > > > [<ffffffff810a604a>] ? do_wp_page+0x621/0x6c3
> > > > [<ffffffff81094d7e>] __get_free_pages+0xe/0x4b
> > > > [<ffffffff810403a7>] copy_process+0xab/0x11a5
> > > > [<ffffffff810327c8>] ? check_preempt_wakeup+0x11a/0x142
> > > > [<ffffffff810a7a06>] ? handle_mm_fault+0x678/0x6e9
> > > > [<ffffffff810415ec>] do_fork+0x14b/0x338
> > > > [<ffffffff8105b50a>] ? up_read+0xe/0x10
> > > > [<ffffffff814ee655>] ? do_page_fault+0x2da/0x307
> > > > [<ffffffff8100a55c>] sys_clone+0x28/0x2a
> > > > [<ffffffff8100bfc3>] stub_clone+0x13/0x20
> > > > [<ffffffff8100bcdb>] ? system_call_fastpath+0x16/0x1b
> > > > Mem-Info:
> > > > Node 0 DMA per-cpu:
> > > > CPU 0: hi: 0, btch: 1 usd: 0
> > > > CPU 1: hi: 0, btch: 1 usd: 0
> > > > CPU 2: hi: 0, btch: 1 usd: 0
> > > > CPU 3: hi: 0, btch: 1 usd: 0
> > > > CPU 4: hi: 0, btch: 1 usd: 0
> > > > CPU 5: hi: 0, btch: 1 usd: 0
> > > > CPU 6: hi: 0, btch: 1 usd: 0
> > > > CPU 7: hi: 0, btch: 1 usd: 0
> > > > Node 0 DMA32 per-cpu:
> > > > CPU 0: hi: 186, btch: 31 usd: 0
> > > > CPU 1: hi: 186, btch: 31 usd: 20
> > > > CPU 2: hi: 186, btch: 31 usd: 19
> > > > CPU 3: hi: 186, btch: 31 usd: 20
> > > > CPU 4: hi: 186, btch: 31 usd: 19
> > > > CPU 5: hi: 186, btch: 31 usd: 24
> > > > CPU 6: hi: 186, btch: 31 usd: 41
> > > > CPU 7: hi: 186, btch: 31 usd: 25
> > > > Active_anon:72835 active_file:89 inactive_anon:575
> > > > inactive_file:103 unevictable:0 dirty:36 writeback:0 unstable:0
> > > > free:2467 slab:38211 mapped:229 pagetables:66918 bounce:0
> > > > Node 0 DMA free:4036kB min:60kB low:72kB high:88kB active_anon:3228kB inactive_anon:256kB active_file:0kB inactive_file:0kB unevictable:0kB present:15356kB pages_scanned:0 all_unreclaimable? no
> > > > lowmem_reserve[]: 0 994 994 994
> > > > Node 0 DMA32 free:5832kB min:4000kB low:5000kB high:6000kB active_anon:288112kB inactive_anon:2044kB active_file:356kB inactive_file:412kB unevictable:0kB present:1018080kB pages_scanned:0 all_unreclaimable? no
> > > > lowmem_reserve[]: 0 0 0 0
> > > > Node 0 DMA: 1*4kB 2*8kB 1*16kB 0*32kB 1*64kB 2*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3940kB
> > > > Node 0 DMA32: 852*4kB 1*8kB 0*16kB 1*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 5304kB
> > > > 437 total pagecache pages
> > > > 0 pages in swap cache
> > > > Swap cache stats: add 0, delete 0, find 0/0
> > > > Free swap = 0kB
> > > > Total swap = 0kB
> > > > 262144 pages RAM
> > > > 6503 pages reserved
> > > > 205864 pages shared
> > > > 226536 pages non-shared
> > > > Out of memory: kill process 3855 (msgctl11) score 179248 or a child
> > > > Killed process 4222 (msgctl11)
> > >
> > > More data: I booted 2.6.31-rc1 with mem=1G, enabled 1GB of swap, and ran msgctl11.
> > >
> > > It goes OOM on the second run. The numbers are very interesting: is memory being leaked?
> > >
> > > [ 2259.825958] msgctl11 invoked oom-killer: gfp_mask=0x84d0, order=0, oom_adj=0
> > > [ 2259.828092] Pid: 29657, comm: msgctl11 Not tainted 2.6.31-rc1 #22
> > > [ 2259.830505] Call Trace:
> > > [ 2259.832010] [<ffffffff8156f366>] ? _spin_unlock+0x26/0x30
> > > [ 2259.834219] [<ffffffff810c8b26>] oom_kill_process+0x176/0x270
> > > [ 2259.837603] [<ffffffff810c8def>] ? badness+0x18f/0x300
> > > [ 2259.839906] [<ffffffff810c9095>] __out_of_memory+0x135/0x170
> > > [ 2259.842035] [<ffffffff810c91c5>] out_of_memory+0xf5/0x180
> > > [ 2259.844270] [<ffffffff810cd86c>] __alloc_pages_nodemask+0x6ac/0x6c0
> > > [ 2259.846743] [<ffffffff810f8fa8>] alloc_pages_current+0x78/0x100
> > > [ 2259.849083] [<ffffffff81033515>] pte_alloc_one+0x15/0x50
> > > [ 2259.851282] [<ffffffff810e0eda>] __pte_alloc+0x2a/0xf0
> > > [ 2259.853454] [<ffffffff810e16e2>] handle_mm_fault+0x742/0x830
> > > [ 2259.855793] [<ffffffff815725cb>] do_page_fault+0x1cb/0x330
> > > [ 2259.858033] [<ffffffff8156fdf5>] page_fault+0x25/0x30
> > > [ 2259.860301] Mem-Info:
> > > [ 2259.861706] Node 0 DMA per-cpu:
> > > [ 2259.862523] CPU 0: hi: 0, btch: 1 usd: 0
> > > [ 2259.864454] CPU 1: hi: 0, btch: 1 usd: 0
> > > [ 2259.866608] Node 0 DMA32 per-cpu:
> > > [ 2259.867404] CPU 0: hi: 186, btch: 31 usd: 197
> > > [ 2259.869283] CPU 1: hi: 186, btch: 31 usd: 175
> > > [ 2259.870511] Active_anon:0 active_file:11 inactive_anon:0
> > >
> > > zero anon pages!
> > >
> > > [ 2259.870512] inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
> > > [ 2259.870513] free:1986 slab:42170 mapped:96 pagetables:59427 bounce:0
> >
> > I bet this is NOT zero. It is only hidden.
>
> Yes, very likely! I noticed that it's all about direct scans:
>
> pgscan_kswapd_dma 0
> pgscan_kswapd_dma32 0
> pgscan_kswapd_normal 0
> pgscan_kswapd_movable 0
> pgscan_direct_dma 0
> pgscan_direct_dma32 7295
> pgscan_direct_normal 143810
> pgscan_direct_movable 0
> zone_reclaim_failed 0
>
> > I guess this system's memory usage is,
> > pagetables: 60k pages
> > kernel stack: 60k pages
> > anon (hidden): 60k pages
> > slab: 40k pages
> > other: 30k pages
> > ===================
> > total: 250k pages = 1GB
> >
> > What are "hidden" anon pages?
> > Each shrink_{in}active_list call isolates 32 pages from the LRU, so the anon or file LRU
> > accounting decreases temporarily.
> >
> > If the system has many threads or processes, heavy memory pressure can leave
> > (number of threads) x 32 pages isolated at once.
> >
> > msgctl11 creates >10K processes.
>
> More exactly, ~16K processes:
>
> msgctl11 0 INFO : Using upto 16298 pids
>
> So the maximum number of isolated pages is 16K * 32 = 512K, or 2GiB.
>
> > I have a debugging patch for this case.
> > Wu, can you please try this patch?
>
> OK. But the OOM is not quite reproducible. Sometimes it produces these
> messages:

This time I got the OOM: there are 69817 isolated pages (just as expected)!

[ 1521.979074] msgctl11 invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0
[ 1521.980996] Pid: 16405, comm: msgctl11 Not tainted 2.6.31-rc1 #27
[ 1521.983271] Call Trace:
[ 1521.983936] [<ffffffff8158dc1b>] ? _spin_unlock+0x2b/0x40
[ 1521.985195] [<ffffffff810d3526>] oom_kill_process+0x176/0x270
[ 1521.987384] [<ffffffff810d37f7>] ? badness+0x197/0x310
[ 1521.989019] [<ffffffff810d3ab5>] __out_of_memory+0x145/0x180
[ 1521.990981] [<ffffffff810d3bed>] out_of_memory+0xfd/0x190
[ 1521.993199] [<ffffffff810d83bc>] __alloc_pages_nodemask+0x6bc/0x6d0
[ 1521.995770] [<ffffffff81012e69>] ? sched_clock+0x9/0x10
[ 1521.997880] [<ffffffff8110485e>] alloc_page_vma+0x8e/0x1c0
[ 1522.000091] [<ffffffff810ea5aa>] do_wp_page+0x23a/0x840
[ 1522.002246] [<ffffffff810ec7b6>] handle_mm_fault+0x656/0x840
[ 1522.003476] [<ffffffff81590ecb>] do_page_fault+0x1cb/0x330
[ 1522.004995] [<ffffffff8158e6e5>] page_fault+0x25/0x30
[ 1522.007006] Mem-Info:
[ 1522.007535] Node 0 DMA per-cpu:
[ 1522.009342] CPU 0: hi: 0, btch: 1 usd: 0
[ 1522.011277] CPU 1: hi: 0, btch: 1 usd: 0
[ 1522.013401] Node 0 DMA32 per-cpu:
[ 1522.015291] CPU 0: hi: 186, btch: 31 usd: 176
[ 1522.017232] CPU 1: hi: 186, btch: 31 usd: 155
[ 1522.019259] Active_anon:11 active_file:6 inactive_anon:0
[ 1522.019260] inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
[ 1522.019261] free:1985 slab:44399 mapped:132 pagetables:61830 bounce:0
[ 1522.019262] isolate:69817
[ 1522.025145] Node 0 DMA free:3964kB min:56kB low:68kB high:84kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB present:15164kB pages_scanned:655 all_unreclaimable? yes
[ 1522.030180] lowmem_reserve[]: 0 982 982 982
[ 1522.031506] Node 0 DMA32 free:3976kB min:3980kB low:4972kB high:5968kB active_anon:44kB inactive_anon:0kB active_file:24kB inactive_file:0kB unevictable:0kB present:1005984kB pages_scanned:249 all_unreclaimable? no
[ 1522.037463] lowmem_reserve[]: 0 0 0 0
[ 1522.039637] Node 0 DMA: 3*4kB 0*8kB 1*16kB 1*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3964kB
[ 1522.043998] Node 0 DMA32: 102*4kB 6*8kB 0*16kB 0*32kB 1*64kB 1*128kB 1*256kB 2*512kB 2*1024kB 0*2048kB 0*4096kB = 3976kB
[ 1522.049241] 1312 total pagecache pages
[ 1522.050996] 1112 pages in swap cache
[ 1522.051759] Swap cache stats: add 218714, delete 217602, find 97535/130636
[ 1522.055428] Free swap = 1037356kB
[ 1522.057113] Total swap = 1048568kB

2009-07-01 02:30:37

by Wu Fengguang

Subject: Re: Found the commit that causes the OOMs

On Wed, Jul 01, 2009 at 12:50:42AM +0900, Minchan Kim wrote:
> On Tue, Jun 30, 2009 at 11:05 PM, Wu Fengguang wrote:
> >
> > More data: I booted 2.6.31-rc1 with mem=1G, enabled 1GB of swap, and ran msgctl11.
> >
> > It goes OOM on the second run. The numbers are very interesting: is memory being leaked?
>
> Hmm. That is very serious, and a separate problem, since this system has a swap
> device and it is not full.

Yes.

> Can you reproduce it easily ?

Not always. It runs OK on the first run (after a fresh boot).
On the second run, it may OOM or lock up (dmesg in another email).

> I want to reproduce it in my system.
>
> Did you run only msgctl11, not the whole LTP test suite?
> With just the default parameters, e.g. $ ./testcases/bin/msgctl11 ?

Yes, I run it standalone with no parameters.

> Second run? Do you mean you execute msgctl11 twice in a row?
> That is, the first run finishes successfully and the OOM happens
> during the second run, before it can finish?

Yes, I run it twice after a fresh boot,
because the first run seems to always succeed.

Thanks,
Fengguang

>
> >        [ 2259.825958] msgctl11 invoked oom-killer: gfp_mask=0x84d0, order=0, oom_adj=0
> >        [ 2259.828092] Pid: 29657, comm: msgctl11 Not tainted 2.6.31-rc1 #22
> >        [ 2259.830505] Call Trace:
> >        [ 2259.832010]  [<ffffffff8156f366>] ? _spin_unlock+0x26/0x30
> >        [ 2259.834219]  [<ffffffff810c8b26>] oom_kill_process+0x176/0x270
> >        [ 2259.837603]  [<ffffffff810c8def>] ? badness+0x18f/0x300
> >        [ 2259.839906]  [<ffffffff810c9095>] __out_of_memory+0x135/0x170
> >        [ 2259.842035]  [<ffffffff810c91c5>] out_of_memory+0xf5/0x180
> >        [ 2259.844270]  [<ffffffff810cd86c>] __alloc_pages_nodemask+0x6ac/0x6c0
> >        [ 2259.846743]  [<ffffffff810f8fa8>] alloc_pages_current+0x78/0x100
> >        [ 2259.849083]  [<ffffffff81033515>] pte_alloc_one+0x15/0x50
> >        [ 2259.851282]  [<ffffffff810e0eda>] __pte_alloc+0x2a/0xf0
> >        [ 2259.853454]  [<ffffffff810e16e2>] handle_mm_fault+0x742/0x830
> >        [ 2259.855793]  [<ffffffff815725cb>] do_page_fault+0x1cb/0x330
> >        [ 2259.858033]  [<ffffffff8156fdf5>] page_fault+0x25/0x30
> >        [ 2259.860301] Mem-Info:
> >        [ 2259.861706] Node 0 DMA per-cpu:
> >        [ 2259.862523] CPU    0: hi:    0, btch:   1 usd:   0
> >        [ 2259.864454] CPU    1: hi:    0, btch:   1 usd:   0
> >        [ 2259.866608] Node 0 DMA32 per-cpu:
> >        [ 2259.867404] CPU    0: hi:  186, btch:  31 usd: 197
> >        [ 2259.869283] CPU    1: hi:  186, btch:  31 usd: 175
> >        [ 2259.870511] Active_anon:0 active_file:11 inactive_anon:0
> >
> > zero anon pages!
> >
> >        [ 2259.870512]  inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
> >        [ 2259.870513]  free:1986 slab:42170 mapped:96 pagetables:59427 bounce:0
> >        [ 2259.877722] Node 0 DMA free:3976kB min:56kB low:68kB high:84kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB present:15164kB pages_scanned:429 all_unreclaimable? no
> >        [ 2259.883804] lowmem_reserve[]: 0 982 982 982
> >        [ 2259.885814] Node 0 DMA32 free:3968kB min:3980kB low:4972kB high:5968kB active_anon:0kB inactive_anon:0kB active_file:44kB inactive_file:0kB unevictable:0kB present:1005984kB pages_scanned:152 all_unreclaimable? no
> >        [ 2259.890958] lowmem_reserve[]: 0 0 0 0
> >        [ 2259.893183] Node 0 DMA: 4*4kB 3*8kB 2*16kB 0*32kB 1*64kB 2*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3976kB
> >        [ 2259.897406] Node 0 DMA32: 334*4kB 77*8kB 24*16kB 27*32kB 10*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3968kB
> >        [ 2259.902753] 625 total pagecache pages
> >        [ 2259.903623] 454 pages in swap cache
> >        [ 2259.905299] Swap cache stats: add 95129, delete 94675, find 55783/67607
> >        [ 2259.908858] Free swap  = 1041232kB
> >        [ 2259.909618] Total swap = 1048568kB
> >
> > swap far from full!
> >
> >        [ 2259.919456] 262144 pages RAM
> >        [ 2259.921071] 12513 pages reserved
> >        [ 2259.922790] 314212 pages shared
> >        [ 2259.923548] 165757 pages non-shared
> >        [ 2259.925234] Out of memory: kill process 20791 (msgctl11) score 2280094 or a child
> >        [ 2259.928982] Killed process 21946 (msgctl11)
> >
> >
>
>
>
> --
> Kind regards,
> Minchan Kim

2009-07-01 02:52:05

by KOSAKI Motohiro

Subject: Re: Found the commit that causes the OOMs

> > > What are "hidden" anon pages?
> > > Each shrink_{in}active_list call isolates 32 pages from the LRU, so the anon or file LRU
> > > accounting decreases temporarily.
> > >
> > > If the system has many threads or processes, heavy memory pressure can leave
> > > (number of threads) x 32 pages isolated at once.
> > >
> > > msgctl11 creates >10K processes.
> >
> > More exactly, ~16K processes:
> >
> > msgctl11 0 INFO : Using upto 16298 pids
> >
> > So the maximum number of isolated pages is 16K * 32 = 512K, or 2GiB.
> >
> > > I have a debugging patch for this case.
> > > Wu, can you please try this patch?
> >
> > OK. But the OOM is not quite reproducible. Sometimes it produces these
> > messages:
>
> This time I got the OOM: there are 69817 isolated pages (just as expected)!
>
(snip)

> [ 1522.019259] Active_anon:11 active_file:6 inactive_anon:0
> [ 1522.019260] inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
> [ 1522.019261] free:1985 slab:44399 mapped:132 pagetables:61830 bounce:0
> [ 1522.019262] isolate:69817

OK, thanks.
I plan to submit this patch after a little more testing. It is useful for OOM analysis.



2009-07-01 02:57:54

by Rik van Riel

Subject: Re: Found the commit that causes the OOMs

KOSAKI Motohiro wrote:

>> [ 1522.019259] Active_anon:11 active_file:6 inactive_anon:0
>> [ 1522.019260] inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
>> [ 1522.019261] free:1985 slab:44399 mapped:132 pagetables:61830 bounce:0
>> [ 1522.019262] isolate:69817
>
> OK, thanks.
> I plan to submit this patch after a little more testing. It is useful for OOM analysis.

It is also useful for throttling page reclaim.

If more than half of the inactive pages in a zone are
isolated, we are probably beyond the point where adding
additional reclaim processes will do more harm than good.

--
All rights reversed.

2009-07-01 03:54:25

by Wu Fengguang

Subject: Re: Found the commit that causes the OOMs

On Wed, Jul 01, 2009 at 11:51:54AM +0900, KOSAKI Motohiro wrote:
> > > > What are "hidden" anon pages?
> > > > Each shrink_{in}active_list call isolates 32 pages from the LRU, so the anon or file LRU
> > > > accounting decreases temporarily.
> > > >
> > > > If the system has many threads or processes, heavy memory pressure can leave
> > > > (number of threads) x 32 pages isolated at once.
> > > >
> > > > msgctl11 creates >10K processes.
> > >
> > > More exactly, ~16K processes:
> > >
> > > msgctl11 0 INFO : Using upto 16298 pids
> > >
> > > So the maximum number of isolated pages is 16K * 32 = 512K, or 2GiB.
> > >
> > > > I have a debugging patch for this case.
> > > > Wu, can you please try this patch?
> > >
> > > OK. But the OOM is not quite reproducible. Sometimes it produces these
> > > messages:
> >
> > This time I got the OOM: there are 69817 isolated pages (just as expected)!
> >
> (snip)
>
> > [ 1522.019259] Active_anon:11 active_file:6 inactive_anon:0
> > [ 1522.019260] inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
> > [ 1522.019261] free:1985 slab:44399 mapped:132 pagetables:61830 bounce:0
> > [ 1522.019262] isolate:69817
>
> OK, thanks.
> I plan to submit this patch after a little more testing. It is useful for OOM analysis.

Other counters to consider are NR_ANON_PAGES/NR_FILE_PAGES.

If they were shown in the OOM message, this problem could have been found
much earlier. In this case, we would find that the total file+anon pages
outnumber the active+inactive file/anon pages.

Thanks,
Fengguang

2009-07-01 04:06:59

by Wu Fengguang

Subject: Re: Found the commit that causes the OOMs

On Tue, Jun 30, 2009 at 10:57:02PM -0400, Rik van Riel wrote:
> KOSAKI Motohiro wrote:
>
>>> [ 1522.019259] Active_anon:11 active_file:6 inactive_anon:0
>>> [ 1522.019260] inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
>>> [ 1522.019261] free:1985 slab:44399 mapped:132 pagetables:61830 bounce:0
>>> [ 1522.019262] isolate:69817
>>
>> OK, thanks.
>> I plan to submit this patch after a little more testing. It is useful for OOM analysis.
>
> It is also useful for throttling page reclaim.
>
> If more than half of the inactive pages in a zone are
> isolated, we are probably beyond the point where adding
> additional reclaim processes will do more harm than good.

There are probably more problems in this case. For example,
below is the vmstat output after the first (successful) run of msgctl11.

The question is: why are kswapd reclaims absent here?

Thanks,
Fengguang
---

wfg@hp ~% /cc/ltp/ltp-full-20090531/testcases/kernel/syscalls/ipc/msgctl/msgctl11
msgctl11 0 INFO : Using upto 16298 pids
msgctl11 1 PASS : msgctl11 ran successfully!

wfg@hp ~% cat /proc/vmstat
nr_free_pages 237277
nr_inactive_anon 696
nr_active_anon 152
nr_inactive_file 1378
nr_active_file 44
nr_unevictable 0
nr_mlock 0
nr_anon_pages 385
nr_mapped 362
nr_file_pages 2176
nr_dirty 0
nr_writeback 0
nr_slab_reclaimable 1319
nr_slab_unreclaimable 4457
nr_page_table_pages 334
nr_unstable 0
nr_bounce 0
nr_vmscan_write 42098
nr_writeback_temp 0
numa_hit 774529
numa_miss 0
numa_foreign 0
numa_interleave 3177
numa_local 774529
numa_other 0
pgpgin 0
pgpgout 104695
pswpin 119952
pswpout 24118
pgalloc_dma 29987
pgalloc_dma32 3061
pgalloc_normal 842682
pgalloc_movable 0
pgfree 0
pgactivate 1083151
pgdeactivate 11427
pgfault 96023
pgmajfault 1341351
pgrefill_dma 9092
pgrefill_dma32 894
pgrefill_normal 96974
pgrefill_movable 0
pgsteal_dma 0
pgsteal_dma32 104
pgsteal_normal 47883
pgsteal_movable 0
pgscan_kswapd_dma 0
pgscan_kswapd_dma32 0
pgscan_kswapd_normal 0
pgscan_kswapd_movable 0
pgscan_direct_dma 0
pgscan_direct_dma32 7295
pgscan_direct_normal 143810
pgscan_direct_movable 0
zone_reclaim_failed 0
pginodesteal 0
slabs_scanned 1501
kswapd_steal 9216
kswapd_inodesteal 0
pageoutrun 0
allocstall 1
pgrotated 1965
htlb_buddy_alloc_success 6666
htlb_buddy_alloc_fail 0
unevictable_pgs_culled 0
unevictable_pgs_scanned 0
unevictable_pgs_rescued 0
unevictable_pgs_mlocked 0
unevictable_pgs_munlocked 0
unevictable_pgs_cleared 0
unevictable_pgs_stranded 0
unevictable_pgs_mlockfreed 0
isolate_pages 0
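
To make the question concrete, here is a small sketch that sums the per-zone scan counters from the dump above, taking the numbers at face value. The helper names `parse_vmstat` and `scan_totals` are hypothetical; only the pgscan_* lines are copied into the sample.

```python
def parse_vmstat(text):
    """Parse `name value` lines (the /proc/vmstat format) into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


def scan_totals(counters):
    """Return (pages scanned by kswapd, pages scanned by direct reclaim)."""
    kswapd = sum(v for k, v in counters.items()
                 if k.startswith("pgscan_kswapd_"))
    direct = sum(v for k, v in counters.items()
                 if k.startswith("pgscan_direct_"))
    return kswapd, direct


# The pgscan_* lines from the dump above.
sample = """\
pgscan_kswapd_dma 0
pgscan_kswapd_dma32 0
pgscan_kswapd_normal 0
pgscan_kswapd_movable 0
pgscan_direct_dma 0
pgscan_direct_dma32 7295
pgscan_direct_normal 143810
pgscan_direct_movable 0
"""

kswapd_scanned, direct_scanned = scan_totals(parse_vmstat(sample))
print("kswapd:", kswapd_scanned, "direct:", direct_scanned)
```

By these counters, all scanning was done by direct reclaim and none by kswapd.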

2009-07-01 04:18:56

by KOSAKI Motohiro

[permalink] [raw]
Subject: Re: Found the commit that causes the OOMs

> On Tue, Jun 30, 2009 at 10:57:02PM -0400, Rik van Riel wrote:
> > KOSAKI Motohiro wrote:
> >
> >>> [ 1522.019259] Active_anon:11 active_file:6 inactive_anon:0
> >>> [ 1522.019260] inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
> >>> [ 1522.019261] free:1985 slab:44399 mapped:132 pagetables:61830 bounce:0
> >>> [ 1522.019262] isolate:69817
> >>
> >> OK. thanks.
> >> I plan to submit this patch after small more tests. it is useful for OOM analysis.
> >
> > It is also useful for throttling page reclaim.
> >
> > If more than half of the inactive pages in a zone are
> > isolated, we are probably beyond the point where adding
> > additional reclaim processes will do more harm than good.
>
> There are probably more problems in this case. For example,
> followed is the vmstat after first (successful) run of msgctl11.
>
> The question is: Why kswapd reclaims are absent here?

If direct reclaim isolates all the pages, kswapd can't reclaim any pages.

I believe Rik's idea solves this problem.
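
Rik's heuristic quoted above can be sketched as a simple predicate. This is only an illustration, not kernel code: the name `should_throttle_reclaim` is hypothetical, and it assumes that isolated pages are no longer counted in the inactive totals, so a zone's inactive population is taken as the pages still on the inactive lists plus the isolated ones.

```python
def should_throttle_reclaim(nr_inactive_on_lru, nr_isolated):
    """Return True when more than half of a zone's inactive pages are isolated.

    Assumption: isolated pages have been taken off the LRU, so the zone's
    inactive population is nr_inactive_on_lru + nr_isolated.
    """
    total = nr_inactive_on_lru + nr_isolated
    return total > 0 and 2 * nr_isolated > total


# Numbers from the OOM report quoted in this thread:
# inactive_anon:0 inactive_file:0 isolate:69817
print(should_throttle_reclaim(0 + 0, 69817))
```

With the OOM report's numbers every inactive page is isolated, so a new direct reclaimer would be told to wait rather than isolate more.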

2009-07-01 04:26:08

by Wu Fengguang

[permalink] [raw]
Subject: Re: Found the commit that causes the OOMs

On Wed, Jul 01, 2009 at 01:18:39PM +0900, KOSAKI Motohiro wrote:
> > On Tue, Jun 30, 2009 at 10:57:02PM -0400, Rik van Riel wrote:
> > > KOSAKI Motohiro wrote:
> > >
> > >>> [ 1522.019259] Active_anon:11 active_file:6 inactive_anon:0
> > >>> [ 1522.019260] inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
> > >>> [ 1522.019261] free:1985 slab:44399 mapped:132 pagetables:61830 bounce:0
> > >>> [ 1522.019262] isolate:69817
> > >>
> > >> OK. thanks.
> > >> I plan to submit this patch after small more tests. it is useful for OOM analysis.
> > >
> > > It is also useful for throttling page reclaim.
> > >
> > > If more than half of the inactive pages in a zone are
> > > isolated, we are probably beyond the point where adding
> > > additional reclaim processes will do more harm than good.
> >
> > There are probably more problems in this case. For example,
> > followed is the vmstat after first (successful) run of msgctl11.
> >
> > The question is: Why kswapd reclaims are absent here?
>
> if direct reclaim isolate all pages, kswapd can't reclaim any pages.

OOM will occur in that condition. What happened before that time?

> I believe Rik's idea solve this problem.

Me too :)

Thanks,
Fengguang

2009-07-01 04:31:02

by KOSAKI Motohiro

[permalink] [raw]
Subject: Re: Found the commit that causes the OOMs

> > > The question is: Why kswapd reclaims are absent here?
> >
> > if direct reclaim isolate all pages, kswapd can't reclaim any pages.
>
> OOM will occur in that condition. What happened before that time?

Maybe yes, maybe no.
On the first test, the system still has droppable file cache. If direct reclaim is lucky enough to take it,
the benchmark ends successfully, I think.

Thanks.

>
> > I believe Rik's idea solve this problem.
>
> Me too :)
>
> Thanks,
> Fengguang
>


2009-07-01 11:27:18

by Wu Fengguang

[permalink] [raw]
Subject: Re: Found the commit that causes the OOMs

On Wed, Jul 01, 2009 at 01:30:51PM +0900, KOSAKI Motohiro wrote:
> > > > The question is: Why kswapd reclaims are absent here?

Ah, maybe kswapd simply didn't have the opportunity to be scheduled
for running, because msgctl11 is busy forking thousands of processes?

> > > if direct reclaim isolate all pages, kswapd can't reclaim any pages.
> >
> > OOM will occur in that condition. What happened before that time?
>
> maybe yes, maybe no.
> At first test, the system still have droppable file cache. if direct
> reclaim luckly take it, the benchmark become successful end, I
> think.

Yes that's the main difference between first and second run. Note that
file cache can be dropped quickly, while the pageout of tmpfs pages
populated by msgctl11 itself takes time.

Thanks,
Fengguang

2009-07-02 07:41:23

by Minchan Kim

[permalink] [raw]
Subject: Re: Found the commit that causes the OOMs



On Tue, 30 Jun 2009 20:57:47 +0100
David Howells <[email protected]> wrote:

> Minchan Kim <[email protected]> wrote:
>
> > David. Doesn't it happen OOM if you revert my patch, still?
>
> It does happen, and indeed happens in v2.6.30, but requires two adjacent runs
> of msgctl11 to trigger, rather than usually triggering on the first run. If
> you interpolate the rest of LTP between the iterations, it doesn't seem to
> happen at all on v2.6.30. My guess is that with the rest of LTP interpolated,
> there's either enough time for some cleanup or something triggers a cleanup
> (the swapfile tests perhaps?).
>
> > Befor I go to the trip, I made debugging patch in a hurry. Mel and I
> > suspect to put the wrong page in lru list.
> >
> > This patch's goal is that print page's detail on active anon lru when it
> > happen OOM. Maybe you could expand your log buffer size.
>
> Do you mean to expand the dmesg buffer? That's probably unnecessary: I capture
> the kernel log over a serial port into a file on another machine.
>
> > Could you show me the information with OOM, please ?
>
> Attached. It's compressed as there was rather a lot.
>
> David
> ---

Hi, David.

Sorry for the late response.

I looked over your captured data when I got home, but I didn't find any problem
in the LRU page-moving scheme.
As Wu, Kosaki and Rik discussed, I think this issue is also related to the process fork bomb.

When I tested msgctl11 on my machine with 2.6.31-rc1, I found that:

2.6.31-rc1
real 0m38.628s
user 0m10.589s
sys 1m12.613s

vmstat

allocstall 3196

2.6.31-rc1-revert-mypatch

real 1m17.396s
user 0m11.193s
sys 4m3.803s

vmstat

allocstall 584

Sometimes I got an OOM with 2.6.31-rc1, sometimes not.

Anyway, the current kernel's test took rather less time than with my patch reverted.
In addition, the current kernel has a small allocstall (direct reclaim) count.

As you know, my patch just removed the call to shrink_active_list in the no-swap case.
shrink_active_list is an expensive function.
The old shrink_active_list call could have throttled process forking by chance.
By removing that call, my patch makes a process fork bomb much more likely. Wu, KOSAKI and Rik, does that make sense?

So I think you were just lucky because of an unnecessary routine.
Anyway, AFAIK, Rik is working on throttling page reclaim.
I think it can solve your problem.

--
Kind regards,
Minchan Kim

2009-07-02 07:45:22

by Minchan Kim

[permalink] [raw]
Subject: Re: Found the commit that causes the OOMs

On Thu, 2 Jul 2009 16:41:06 +0900
Minchan Kim <[email protected]> wrote:

>
>
> On Tue, 30 Jun 2009 20:57:47 +0100
> David Howells <[email protected]> wrote:
>
> > Minchan Kim <[email protected]> wrote:
> >
> > > David. Doesn't it happen OOM if you revert my patch, still?
> >
> > It does happen, and indeed happens in v2.6.30, but requires two adjacent runs
> > of msgctl11 to trigger, rather than usually triggering on the first run. If
> > you interpolate the rest of LTP between the iterations, it doesn't seem to
> > happen at all on v2.6.30. My guess is that with the rest of LTP interpolated,
> > there's either enough time for some cleanup or something triggers a cleanup
> > (the swapfile tests perhaps?).
> >
> > > Befor I go to the trip, I made debugging patch in a hurry. Mel and I
> > > suspect to put the wrong page in lru list.
> > >
> > > This patch's goal is that print page's detail on active anon lru when it
> > > happen OOM. Maybe you could expand your log buffer size.
> >
> > Do you mean to expand the dmesg buffer? That's probably unnecessary: I capture
> > the kernel log over a serial port into a file on another machine.
> >
> > > Could you show me the information with OOM, please ?
> >
> > Attached. It's compressed as there was rather a lot.
> >
> > David
> > ---
>
> Hi, David.
>
> Sorry for late response.
>
> I looked over your captured data when I got home but I didn't find any problem
> in lru page moving scheme.
> As Wu, Kosaki and Rik discussed, I think this issue is also related to process fork bomb.
>
> When I tested msgctl11 in my machine with 2.6.31-rc1, I found that:
>
> 2.6.31-rc1
> real 0m38.628s
> user 0m10.589s
> sys 1m12.613s
>
> vmstat
>
> allocstall 3196
>
> 2.6.31-rc1-revert-mypatch
>
> real 1m17.396s
> user 0m11.193s
> sys 4m3.803s
>
> vmstat
>
> allocstall 584
>
> Sometimes I got OOM, sometime not in with 2.6.31-rc1.
>
> Anyway, the current kernel's test took a rather short time than my reverted patch.
> In addition, the current kernel has small allocstall(direct reclaim)
                                      ^^^^^
                                      "many" (typo)

> As you know, my patch was just to remove calling shrink_active_list in case of no swap.
> shrink_active_list function is a big cost function.
> The old shrink_active_list could throttle to fork processes by chance.
> But by removing that function with my patch, we have a high probability to make process fork bomb. Wu, KOSAKI and Rik, does it make sense?
>
> So I think you were just lucky with a unnecessary routine.
> Anyway, AFAIK, Rik is making throttling page reclaim.
> I think it can solve your problem.
>
> --
> Kind regards,
> Minchan Kim


--
Kind regards,
Minchan Kim

2009-07-02 13:41:55

by Fengguang Wu

[permalink] [raw]
Subject: Re: Found the commit that causes the OOMs

On Thu, Jul 02, 2009 at 03:41:06PM +0800, Minchan Kim wrote:
>
>
> On Tue, 30 Jun 2009 20:57:47 +0100
> David Howells <[email protected]> wrote:
>
> > Minchan Kim <[email protected]> wrote:
> >
> > > David. Doesn't it happen OOM if you revert my patch, still?
> >
> > It does happen, and indeed happens in v2.6.30, but requires two adjacent runs
> > of msgctl11 to trigger, rather than usually triggering on the first run. If
> > you interpolate the rest of LTP between the iterations, it doesn't seem to
> > happen at all on v2.6.30. My guess is that with the rest of LTP interpolated,
> > there's either enough time for some cleanup or something triggers a cleanup
> > (the swapfile tests perhaps?).
> >
> > > Befor I go to the trip, I made debugging patch in a hurry. Mel and I
> > > suspect to put the wrong page in lru list.
> > >
> > > This patch's goal is that print page's detail on active anon lru when it
> > > happen OOM. Maybe you could expand your log buffer size.
> >
> > Do you mean to expand the dmesg buffer? That's probably unnecessary: I capture
> > the kernel log over a serial port into a file on another machine.
> >
> > > Could you show me the information with OOM, please ?
> >
> > Attached. It's compressed as there was rather a lot.
> >
> > David
> > ---
>
> Hi, David.
>
> Sorry for late response.
>
> I looked over your captured data when I got home but I didn't find any problem
> in lru page moving scheme.
> As Wu, Kosaki and Rik discussed, I think this issue is also related to process fork bomb.

Yes, I think so too.

> When I tested msgctl11 in my machine with 2.6.31-rc1, I found that:

Were you testing the no-swap case?

> 2.6.31-rc1
> real 0m38.628s
> user 0m10.589s
> sys 1m12.613s
>
> vmstat
>
> allocstall 3196
>
> 2.6.31-rc1-revert-mypatch
>
> real 1m17.396s
> user 0m11.193s
> sys 4m3.803s

It's interesting that (sys > real).

> vmstat
>
> allocstall 584
>
> Sometimes I got OOM, sometime not in with 2.6.31-rc1.
>
> Anyway, the current kernel's test took a rather short time than my reverted patch.
> In addition, the current kernel has small allocstall(direct reclaim)
>
> As you know, my patch was just to remove calling shrink_active_list in case of no swap.
> shrink_active_list function is a big cost function.
> The old shrink_active_list could throttle to fork processes by chance.
> But by removing that function with my patch, we have a high
> probability to make process fork bomb. Wu, KOSAKI and Rik, does it
> make sense?

Maybe, but I'm not sure how to explain the time/vmstat numbers :(

> So I think you were just lucky with a unnecessary routine.
> Anyway, AFAIK, Rik is making throttling page reclaim.
> I think it can solve your problem.

Yes, with good luck :)

Thanks,
Fengguang

2009-07-02 14:08:26

by Minchan Kim

[permalink] [raw]
Subject: Re: Found the commit that causes the OOMs

On Thu, Jul 2, 2009 at 9:43 PM, Wu Fengguang<[email protected]> wrote:
> On Thu, Jul 02, 2009 at 03:41:06PM +0800, Minchan Kim wrote:
>>
>>
>> On Tue, 30 Jun 2009 20:57:47 +0100
>> David Howells <[email protected]> wrote:
>>
>> > Minchan Kim <[email protected]> wrote:
>> >
>> > > David. Doesn't it happen OOM if you revert my patch, still?
>> >
>> > It does happen, and indeed happens in v2.6.30, but requires two adjacent runs
>> > of msgctl11 to trigger, rather than usually triggering on the first run.  If
>> > you interpolate the rest of LTP between the iterations, it doesn't seem to
>> > happen at all on v2.6.30.  My guess is that with the rest of LTP interpolated,
>> > there's either enough time for some cleanup or something triggers a cleanup
>> > (the swapfile tests perhaps?).
>> >
>> > > Befor I go to the trip, I made debugging patch in a hurry.  Mel and I
>> > > suspect to put the wrong page in lru list.
>> > >
>> > > This patch's goal is that print page's detail on active anon lru when it
>> > > happen OOM.  Maybe you could expand your log buffer size.
>> >
>> > Do you mean to expand the dmesg buffer?  That's probably unnecessary: I capture
>> > the kernel log over a serial port into a file on another machine.
>> >
>> > > Could you show me the information with OOM, please ?
>> >
>> > Attached.  It's compressed as there was rather a lot.
>> >
>> > David
>> > ---
>>
>> Hi, David.
>>
>> Sorry for late response.
>>
>> I looked over your captured data when I got home but I didn't find any problem
>> in lru page moving scheme.
>> As Wu, Kosaki and Rik discussed, I think this issue is also related to process fork bomb.
>
> Yes, me think so.
>
>> When I tested msgctl11 in my machine with 2.6.31-rc1, I found that:
>
> Were you testing the no-swap case?

Yes.

>> 2.6.31-rc1
>> real  0m38.628s
>> user  0m10.589s
>> sys   1m12.613s
>>
>> vmstat
>>
>> allocstall 3196
>>
>> 2.6.31-rc1-revert-mypatch
>>
>> real  1m17.396s
>> user  0m11.193s
>> sys   4m3.803s
>
> It's interesting that (sys > real).

My test environment is quad core. :)

>> vmstat
>>
>> allocstall 584
>>
>> Sometimes I got OOM, sometime not in with 2.6.31-rc1.
>>
>> Anyway, the current kernel's test took a rather short time than my reverted patch.
>> In addition, the current kernel has small allocstall(direct reclaim)
>>
>> As you know, my patch was just to remove calling shrink_active_list in case of no swap.
>> shrink_active_list function is a big cost function.
>> The old shrink_active_list could throttle to fork processes by chance.
>> But by removing that function with my patch, we have a high
>> probability to make process fork bomb. Wu, KOSAKI and Rik, does it
>> make sense?
>
> Maybe, but I'm not sure on how to explain the time/vmstat numbers :(

I think we can prove it as follows.
For example, each time msgctl11 has forked another 1000 processes,
we look at the vmstat and check the elapsed time.

I think the current kernel may take a very short time but show many allocstalls,
while the reverted one may take a rather long time but show only a small allocstall increase
after some time (maybe once inactive_anon_is_low() is true).

In addition, we can check the time spent in shrink_active_list when
inactive_anon_is_low() is true.

>
>> So I think you were just lucky with a unnecessary routine.
>> Anyway, AFAIK, Rik is making throttling page reclaim.
>> I think it can solve your problem.
>
> Yes, with good luck :)
>
> Thanks,
> Fengguang
>



--
Kind regards,
Minchan Kim
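
As a worked example, the timings and allocstall counts reported above can be reduced to two ratios per kernel. This is a rough sketch: the names `to_seconds` and `summarize` are hypothetical, and it assumes the allocstall values are attributable to the run itself rather than accumulated since boot, which the message does not say.

```python
def to_seconds(minutes, seconds):
    """Convert a `time`-style XmY.ZZZs value to seconds."""
    return minutes * 60 + seconds


# Figures copied from the message above.
runs = {
    "2.6.31-rc1": {
        "real": to_seconds(0, 38.628),
        "sys": to_seconds(1, 12.613),
        "allocstall": 3196,
    },
    "2.6.31-rc1-revert-mypatch": {
        "real": to_seconds(1, 17.396),
        "sys": to_seconds(4, 3.803),
        "allocstall": 584,
    },
}


def summarize(run):
    """Return sys/real (average CPUs busy in the kernel) and allocstalls per second."""
    return {
        "sys_over_real": run["sys"] / run["real"],
        "allocstall_per_sec": run["allocstall"] / run["real"],
    }


for name, run in runs.items():
    s = summarize(run)
    print(f"{name}: sys/real = {s['sys_over_real']:.2f}, "
          f"allocstall/s = {s['allocstall_per_sec']:.1f}")
```

A sys/real ratio above 1 is possible because sys time is summed over all CPUs; on a quad-core machine it can approach 4. The reverted kernel keeps roughly three CPUs busy in the kernel for twice as long, while the current kernel stalls in direct reclaim about ten times more often per second.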

2009-07-05 09:55:39

by Wu Fengguang

[permalink] [raw]
Subject: Re: Found the commit that causes the OOMs

On Tue, Jun 30, 2009 at 10:57:02PM -0400, Rik van Riel wrote:
> KOSAKI Motohiro wrote:
>
>>> [ 1522.019259] Active_anon:11 active_file:6 inactive_anon:0
>>> [ 1522.019260] inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
>>> [ 1522.019261] free:1985 slab:44399 mapped:132 pagetables:61830 bounce:0
>>> [ 1522.019262] isolate:69817
>>
>> OK. thanks.
>> I plan to submit this patch after small more tests. it is useful for OOM analysis.
>
> It is also useful for throttling page reclaim.
>
> If more than half of the inactive pages in a zone are
> isolated, we are probably beyond the point where adding
> additional reclaim processes will do more harm than good.

Maybe we can try limiting the isolation phase of direct reclaims to
one per CPU?

mutex_lock(per_cpu_lock);
isolate_pages();
shrink_page_list();
put_back_pages();
mutex_unlock(per_cpu_lock);

This way the isolated pages, as well as major parts of direct reclaim,
will be bounded by the number of CPUs. The added overhead should be trivial
compared to the reclaim costs.

Thanks,
Fengguang
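
The bounding effect of the proposal above can be illustrated in user space. This is a sketch under stated assumptions, not the proposed kernel change: it swaps the per-CPU mutex for a single counting semaphore sized by the CPU count, and the class `ReclaimLimiter`, its `per_cpu` knob and the `work` callback are all hypothetical stand-ins for the isolate/shrink/put-back sequence.

```python
import threading


class ReclaimLimiter:
    """Allow at most nr_cpus * per_cpu callers inside reclaim() at once."""

    def __init__(self, nr_cpus, per_cpu=1):
        self._slots = threading.BoundedSemaphore(nr_cpus * per_cpu)
        self._lock = threading.Lock()   # protects the two counters below
        self.active = 0                 # reclaimers currently running
        self.peak = 0                   # highest concurrency observed

    def reclaim(self, work):
        with self._slots:               # blocks while all slots are taken
            with self._lock:
                self.active += 1
                self.peak = max(self.peak, self.active)
            try:
                work()                  # stands in for isolate/shrink/put back
            finally:
                with self._lock:
                    self.active -= 1


# Sixteen would-be direct reclaimers on a pretend two-CPU machine.
limiter = ReclaimLimiter(nr_cpus=2, per_cpu=1)
threads = [
    threading.Thread(target=limiter.reclaim, args=(lambda: None,))
    for _ in range(16)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("peak concurrent reclaimers:", limiter.peak)
```

However many threads ask to reclaim, no more than nr_cpus * per_cpu of them run the reclaim step at once, which is what bounds the number of isolated pages.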

2009-07-05 10:39:08

by KOSAKI Motohiro

[permalink] [raw]
Subject: Re: Found the commit that causes the OOMs

> >> OK. thanks.
> >> I plan to submit this patch after small more tests. it is useful for OOM analysis.
> >
> > It is also useful for throttling page reclaim.
> >
> > If more than half of the inactive pages in a zone are
> > isolated, we are probably beyond the point where adding
> > additional reclaim processes will do more harm than good.
>
> Maybe we can try limiting the isolation phase of direct reclaims to
> one per CPU?
>
> mutex_lock(per_cpu_lock);
> isolate_pages();
> shrink_page_list();
> put_back_pages();
> mutex_unlock(per_cpu_lock);
>
> This way the isolated pages as well as major parts of direct reclaims
> will be bounded by CPU numbers. The added overheads should be trivial
> comparing to the reclaim costs.

Hm, I think this idea causes a performance regression on machines with few CPUs.

e.g.
if the system has only one CPU and does lumpy reclaim, lumpy reclaim
does synchronous pageout, which makes for a very long wait.

I suspect a per-CPU decision is not useful in this area.

Thanks.

2009-07-05 10:52:18

by Wu Fengguang

[permalink] [raw]
Subject: Re: Found the commit that causes the OOMs

On Sun, Jul 05, 2009 at 07:38:54PM +0900, KOSAKI Motohiro wrote:
> > >> OK. thanks.
> > >> I plan to submit this patch after small more tests. it is useful for OOM analysis.
> > >
> > > It is also useful for throttling page reclaim.
> > >
> > > If more than half of the inactive pages in a zone are
> > > isolated, we are probably beyond the point where adding
> > > additional reclaim processes will do more harm than good.
> >
> > Maybe we can try limiting the isolation phase of direct reclaims to
> > one per CPU?
> >
> > mutex_lock(per_cpu_lock);
> > isolate_pages();
> > shrink_page_list();
> > put_back_pages();
> > mutex_unlock(per_cpu_lock);
> >
> > This way the isolated pages as well as major parts of direct reclaims
> > will be bounded by CPU numbers. The added overheads should be trivial
> > comparing to the reclaim costs.
>
> hm, this idea makes performance degression on few CPU machine, I think.

Yes, this is also my big worry. But one possible workaround is to
allow N direct reclaims per CPU.

> e.g.
> if system have only one cpu and sysmtem makes lumpy reclaim, lumpy reclaim
> makes synchronous pageout and it makes very long waiting time.

We can temporarily drop the lock during the writeback waiting.
0-order reclaims shall not be blocked by ongoing high order reclaims.

> I suspect per-cpu decision is not useful in this area.

Maybe. I'm just proposing one more possible way to choose from :)

Thanks,
Fengguang