2007-08-09 11:11:18

by Alexey Dobriyan

[permalink] [raw]
Subject: 2.6.23-rc2-mm1: hang, prop_norm_single involved

LTP run reproducably hangs during rwtest01 test
rwtest -N rwtest01 -c -q -i 60s -f sync 10%25000:rs-sync=$$
Calltrace is always the same:

INFO: trying to register non-static key
__lock_acquire+0x210/0xc9e
lock_acquire+0x87/0xa3
_spin_lock_irqsave+0x2f/0x5f
prop_norm_single+0x32/0x74
set_page_dirty+0xd6/0x151
set_page_dirty_balance+0xc/0x71
do_wp_page+0x4ae/0x52b
handle_mm_fault+0x616/0x6c4
do_page_fault+0x1b0/0x51c

Box is UP 2.66GHz Celeron, 1G of RAM. Pretty much every debugging option
is on, it's -mm after all...


CONFIG_X86_32=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_NONIRQ_WAKEUP=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_QUICKLIST=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_EXPERIMENTAL=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=""
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=17
CONFIG_SYSFS_DEPRECATED=y
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
CONFIG_EMBEDDED=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_SHMEM=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLAB=y
CONFIG_PROC_PAGE_MONITOR=y
CONFIG_PROC_KPAGEMAP=y
CONFIG_RT_MUTEXES=y
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_KMOD=y
CONFIG_BLOCK=y
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_CFQ=y
CONFIG_DEFAULT_CFQ=y
CONFIG_DEFAULT_IOSCHED="cfq"
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ=y
CONFIG_X86_PC=y
CONFIG_MPENTIUM4=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_L1_CACHE_SHIFT=7
CONFIG_X86_XADD=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_TSC=y
CONFIG_X86_CMOV=y
CONFIG_X86_MINIMUM_CPU_FAMILY=4
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
CONFIG_PREEMPT=y
CONFIG_PREEMPT_BKL=y
CONFIG_X86_UP_APIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_NONFATAL=y
CONFIG_VM86=y
CONFIG_X86_MSR=y
CONFIG_HIGHMEM4G=y
CONFIG_VMSPLIT_3G=y
CONFIG_PAGE_OFFSET=0xC0000000
CONFIG_HIGHMEM=y
CONFIG_ARCH_FLATMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_ARCH_POPULATES_NODE_MAP=y
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_FLATMEM_MANUAL=y
CONFIG_FLATMEM=y
CONFIG_FLAT_NODE_MEM_MAP=y
CONFIG_SPARSEMEM_STATIC=y
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_ZONE_DMA_FLAG=1
CONFIG_BOUNCE=y
CONFIG_NR_QUICK=1
CONFIG_VIRT_TO_BUS=y
CONFIG_HIGHPTE=y
CONFIG_MTRR=y
CONFIG_HZ_300=y
CONFIG_HZ=300
CONFIG_PHYSICAL_START=0x100000
CONFIG_PHYSICAL_ALIGN=0x100000
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
CONFIG_PM=y
CONFIG_ACPI=y
CONFIG_ACPI_FAN=m
CONFIG_ACPI_BLACKLIST_YEAR=0
CONFIG_ACPI_EC=y
CONFIG_ACPI_POWER=y
CONFIG_ACPI_SYSTEM=y
CONFIG_X86_PM_TIMER=y
CONFIG_PCI=y
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_ISA_DMA_API=y
CONFIG_BINFMT_ELF=y
CONFIG_NET=y
CONFIG_PACKET=m
CONFIG_PACKET_MMAP=y
CONFIG_UNIX=y
CONFIG_INET=y
CONFIG_IP_FIB_HASH=y
CONFIG_SYN_COOKIES=y
CONFIG_INET_DIAG=y
CONFIG_INET_TCP_DIAG=y
CONFIG_TCP_CONG_CUBIC=y
CONFIG_DEFAULT_TCP_CONG="cubic"
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_PNP=y
CONFIG_PNPACPI=y
CONFIG_BLK_DEV=y
CONFIG_BLK_DEV_LOOP=m
CONFIG_MISC_DEVICES=y
CONFIG_IDE=y
CONFIG_IDE_MAX_HWIFS=4
CONFIG_BLK_DEV_IDE=y
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_BLK_DEV_IDECD=m
CONFIG_IDE_GENERIC=y
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
CONFIG_IDEPCI_PCIBUS_ORDER=y
CONFIG_BLK_DEV_GENERIC=y
CONFIG_BLK_DEV_IDEDMA_PCI=y
CONFIG_BLK_DEV_PIIX=y
CONFIG_BLK_DEV_IDEDMA=y
CONFIG_SCSI=y
CONFIG_SCSI_DMA=y
CONFIG_BLK_DEV_SD=y
CONFIG_SCSI_WAIT_SCAN=m
CONFIG_SCSI_LOWLEVEL=y
CONFIG_ATA=y
CONFIG_ATA_ACPI=y
CONFIG_ATA_PIIX=y
CONFIG_NETDEVICES=y
CONFIG_NET_ETHERNET=y
CONFIG_MII=y
CONFIG_NET_PCI=y
CONFIG_8139TOO=y
CONFIG_8139TOO_PIO=y
CONFIG_NETDEV_1000=y
CONFIG_E1000=y
CONFIG_NETCONSOLE=m
CONFIG_NETPOLL=y
CONFIG_NET_POLL_CONTROLLER=y
CONFIG_INPUT=y
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
CONFIG_INPUT_EVDEV=y
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_SERIO_SERPORT=y
CONFIG_SERIO_LIBPS2=y
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_SERIAL_8250_PCI=y
CONFIG_SERIAL_8250_NR_UARTS=4
CONFIG_SERIAL_8250_RUNTIME_UARTS=4
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
CONFIG_UNIX98_PTYS=y
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256
CONFIG_RTC=y
CONFIG_AGP=y
CONFIG_AGP_INTEL=y
CONFIG_DRM=y
CONFIG_DRM_I915=y
CONFIG_DEVPORT=y
CONFIG_I2C=y
CONFIG_I2C_BOARDINFO=y
CONFIG_I2C_ALGOBIT=y
CONFIG_FB=y
CONFIG_FB_DDC=y
CONFIG_FB_CFB_FILLRECT=y
CONFIG_FB_CFB_COPYAREA=y
CONFIG_FB_CFB_IMAGEBLIT=y
CONFIG_FB_DEFERRED_IO=y
CONFIG_FB_MODE_HELPERS=y
CONFIG_FB_INTEL=y
CONFIG_FB_INTEL_I2C=y
CONFIG_VGA_CONSOLE=y
CONFIG_DUMMY_CONSOLE=y
CONFIG_HID_SUPPORT=y
CONFIG_USB_SUPPORT=y
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB_ARCH_HAS_OHCI=y
CONFIG_USB_ARCH_HAS_EHCI=y
CONFIG_EXT2_FS=y
CONFIG_EXT3_FS=y
CONFIG_JBD=y
CONFIG_QUOTA=y
CONFIG_QFMT_V2=y
CONFIG_QUOTACTL=y
CONFIG_ISO9660_FS=m
CONFIG_FAT_FS=m
CONFIG_VFAT_FS=m
CONFIG_FAT_DEFAULT_CODEPAGE=866
CONFIG_FAT_DEFAULT_IOCHARSET="koi8-r"
CONFIG_PROC_FS=y
CONFIG_PROC_SYSCTL=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
CONFIG_RAMFS=y
CONFIG_MSDOS_PARTITION=y
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=m
CONFIG_NLS_CODEPAGE_855=m
CONFIG_NLS_CODEPAGE_866=m
CONFIG_NLS_CODEPAGE_1251=m
CONFIG_NLS_ASCII=m
CONFIG_NLS_ISO8859_1=m
CONFIG_NLS_ISO8859_5=m
CONFIG_NLS_KOI8_R=m
CONFIG_NLS_UTF8=m
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_DEBUG_KERNEL=y
CONFIG_SCHED_DEBUG=y
CONFIG_SCHEDSTATS=y
CONFIG_TIMER_STATS=y
CONFIG_DEBUG_SLAB=y
CONFIG_DEBUG_SLAB_LEAK=y
CONFIG_DEBUG_PREEMPT=y
CONFIG_DEBUG_RT_MUTEXES=y
CONFIG_DEBUG_PI_LIST=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_PROVE_LOCKING=y
CONFIG_LOCKDEP=y
CONFIG_LOCK_STAT=y
CONFIG_TRACE_IRQFLAGS=y
CONFIG_DEBUG_SPINLOCK_SLEEP=y
CONFIG_STACKTRACE=y
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_VM=y
CONFIG_DEBUG_LIST=y
CONFIG_FRAME_POINTER=y
CONFIG_PROFILE_LIKELY=y
CONFIG_DEBUG_STACKOVERFLOW=y
CONFIG_DEBUG_STACK_USAGE=y
CONFIG_DEBUG_PAGEALLOC=y
CONFIG_DEBUG_RODATA=y
CONFIG_4KSTACKS=y
CONFIG_X86_FIND_SMP_CONFIG=y
CONFIG_X86_MPPARSE=y
CONFIG_DOUBLEFAULT=y
CONFIG_BITREVERSE=y
CONFIG_CRC32=y
CONFIG_PLIST=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT=y
CONFIG_HAS_DMA=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_KTIME_SCALAR=y


2007-08-09 12:48:46

by Peter Zijlstra

[permalink] [raw]
Subject: Re: 2.6.23-rc2-mm1: hang, prop_norm_single involved

On Thu, 2007-08-09 at 15:10 +0400, Alexey Dobriyan wrote:
> LTP run reproducably hangs during rwtest01 test
> rwtest -N rwtest01 -c -q -i 60s -f sync 10%25000:rs-sync=$$
> Calltrace is always the same:
>
> INFO: trying to register non-static key
> __lock_acquire+0x210/0xc9e
> lock_acquire+0x87/0xa3
> _spin_lock_irqsave+0x2f/0x5f
> prop_norm_single+0x32/0x74
> set_page_dirty+0xd6/0x151
> set_page_dirty_balance+0xc/0x71
> do_wp_page+0x4ae/0x52b
> handle_mm_fault+0x616/0x6c4
> do_page_fault+0x1b0/0x51c


[root@opteron ~]# PATH=/testcases/bin/:$PATH /testcases/bin/rwtest -N rwtest01 -c -q -i 60s -f sync 10%25000:rs-sync=$$
rwtest01 1 PASS : Test passed
[root@opteron ~]# PATH=/testcases/bin/:$PATH /testcases/bin/rwtest -N rwtest01 -c -q -i 60s -f sync 10%25000:rs-sync=$$

I can reproduce, but not always.

Also, since the task->dirties member is initialized in fork.c this
should either _always_ happen or never. So this does point to some
memory corruption, ->dirties is the very last member of the task struct.

/me goes try with slab_debug,...


Attachments:
signature.asc (189.00 B)
This is a digitally signed message part

2007-08-09 12:51:17

by Cornelia Huck

[permalink] [raw]
Subject: Re: 2.6.23-rc2-mm1: hang, prop_norm_single involved

On Thu, 9 Aug 2007 15:10:44 +0400,
Alexey Dobriyan <[email protected]> wrote:

> LTP run reproducably hangs during rwtest01 test
> rwtest -N rwtest01 -c -q -i 60s -f sync 10%25000:rs-sync=$$
> Calltrace is always the same:
>
> INFO: trying to register non-static key
> __lock_acquire+0x210/0xc9e
> lock_acquire+0x87/0xa3
> _spin_lock_irqsave+0x2f/0x5f
> prop_norm_single+0x32/0x74
> set_page_dirty+0xd6/0x151
> set_page_dirty_balance+0xc/0x71
> do_wp_page+0x4ae/0x52b
> handle_mm_fault+0x616/0x6c4
> do_page_fault+0x1b0/0x51c
>

On s390, I get a bad spinlock magic which also has prop_norm_single in
the call chain. Don't know if that is related:

BUG: spinlock bad magic on CPU#1, 05-pam_console./2560
lock: 000000000602de00, .magic: 00000000, .owner: 05-pam_console./2560, .owner_cpu: 1
000000000038d9f0 0000000000000002 0000000000000000 000000000604ba40
000000000604b9b8 0000000000434800 0000000000434800 00000000000153d4
0000000000000000 0000000000000000 0000000000000000 0000000000517020
0000000000000000 000000000604ba00 000000000604b9a0 000000000000000e
000000000038d9e0 000000000001541e 000000000604b9a0 000000000604b9f0
Call Trace:
([<000000000001535c>] show_trace+0xc8/0xe4)
[<000000000001544e>] show_stack+0xd6/0x100
[<00000000000154a6>] dump_stack+0x2e/0x3c
[<00000000001a6fee>] spin_bug+0x106/0x120
[<00000000001a7268>] _raw_spin_unlock+0xb0/0xd4
[<00000000003724f2>] _spin_unlock_irqrestore+0x2a/0x70
[<000000000019e962>] prop_norm_single+0xce/0xdc
[<000000000007fcc2>] set_page_dirty+0x9e/0x144
[<0000000000096862>] page_remove_rmap+0xd2/0x1e8
[<000000000008c74c>] unmap_vmas+0x4ac/0xc1c
[<0000000000092248>] exit_mmap+0xac/0x288
[<000000000003810e>] mmput+0x7a/0x110
[<000000000003d79e>] exit_mm+0xaa/0x114
[<000000000003e89a>] do_exit+0x152/0x908
[<000000000003f0a0>] do_group_exit+0x50/0xa8
[<000000000003f12e>] sys_exit_group+0x36/0x44
[<0000000000022c30>] sysc_noemu+0x10/0x16
[<00000200000d8e1c>] 0x200000d8e1c

The system seems to be completely usable afterwards. I can reliably
produce this callchain (or a similar one, but always involving mmput)
with the attached .config; however, turning on more lock debugging
makes it go away :(


Attachments:
(No filename) (2.16 kB)
config-t29lp20-mm-nodbg (16.57 kB)
Download all attachments

2007-08-09 15:32:44

by Peter Zijlstra

[permalink] [raw]
Subject: Re: 2.6.23-rc2-mm1: hang, prop_norm_single involved

On Thu, 2007-08-09 at 14:45 +0200, Peter Zijlstra wrote:
> On Thu, 2007-08-09 at 15:10 +0400, Alexey Dobriyan wrote:
> > LTP run reproducably hangs during rwtest01 test
> > rwtest -N rwtest01 -c -q -i 60s -f sync 10%25000:rs-sync=$$
> > Calltrace is always the same:
> >

> [root@opteron ~]# PATH=/testcases/bin/:$PATH /testcases/bin/rwtest -N rwtest01 -c -q -i 60s -f sync 10%25000:rs-sync=$$
> rwtest01 1 PASS : Test passed
> [root@opteron ~]# PATH=/testcases/bin/:$PATH /testcases/bin/rwtest -N rwtest01 -c -q -i 60s -f sync 10%25000:rs-sync=$$
>
> I can reproduce, but not always.
>
> Also, since the task->dirties member is initialized in fork.c this
> should either _always_ happen or never. So this does point to some
> memory corruption, ->dirties is the very last member of the task struct.
>
> /me goes try with slab_debug,...

to no avail, banging head against the wall what is happening here.

Andrew, could you:

# mm-dirty-balancing-for-tasks.patch

while I try to figure this one out?

That seems to make the unhappies I could reproduce here go away.