LinuxLists.cc - linux-next: Tree for Aug 22

Subject: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

Hello,

======================================================
WARNING: possible circular locking dependency detected
4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746 Not tainted
------------------------------------------------------
fsck.ext4/148 is trying to acquire lock:
(&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190

but now in release context of a crosslock acquired at the following:
((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 ((complete)&wait#2){+.+.}:
lock_acquire+0x176/0x19e
__wait_for_common+0x50/0x1e3
blk_execute_rq+0xbb/0xda
scsi_execute+0xc3/0x17d [scsi_mod]
sd_revalidate_disk+0x112/0x1549 [sd_mod]
rescan_partitions+0x48/0x2c4
__blkdev_get+0x14b/0x37c
blkdev_get+0x191/0x2c0
device_add_disk+0x2b4/0x3e5
sd_probe_async+0xf8/0x17e [sd_mod]
async_run_entry_fn+0x34/0xe0
process_one_work+0x2af/0x4d1
worker_thread+0x19a/0x24f
kthread+0x133/0x13b
ret_from_fork+0x27/0x40

-> #0 (&bdev->bd_mutex){+.+.}:
__blkdev_put+0x33/0x190
blkdev_close+0x24/0x27
__fput+0xee/0x18a
task_work_run+0x79/0xa0
prepare_exit_to_usermode+0x9b/0xb5

other info that might help us debug this:
Possible unsafe locking scenario by crosslock:
       CPU0                    CPU1
       ----                    ----
  lock(&bdev->bd_mutex);
  lock((complete)&wait#2);
                               lock(&bdev->bd_mutex);
                               unlock((complete)&wait#2);

*** DEADLOCK ***
4 locks held by fsck.ext4/148:
#0: (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
#1: (rcu_read_lock){....}, at: [<ffffffff81217f16>] rcu_lock_acquire+0x0/0x20
#2: (&(&host->lock)->rlock){-.-.}, at: [<ffffffffa00e7550>] ata_scsi_queuecmd+0x23/0x74 [libata]
#3: (&x->wait#14){-...}, at: [<ffffffff8106b593>] complete+0x18/0x50

stack backtrace:
CPU: 1 PID: 148 Comm: fsck.ext4 Not tainted 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746
Call Trace:
dump_stack+0x67/0x8e
print_circular_bug+0x2a1/0x2af
? zap_class+0xc5/0xc5
check_prev_add+0x76/0x20d
? __lock_acquire+0xc27/0xcc8
lock_commit_crosslock+0x327/0x35e
complete+0x24/0x50
scsi_end_request+0x8d/0x176 [scsi_mod]
scsi_io_completion+0x1be/0x423 [scsi_mod]
__blk_mq_complete_request+0x112/0x131
ata_scsi_simulate+0x212/0x218 [libata]
__ata_scsi_queuecmd+0x1be/0x1de [libata]
ata_scsi_queuecmd+0x41/0x74 [libata]
scsi_dispatch_cmd+0x194/0x2af [scsi_mod]
scsi_queue_rq+0x1e0/0x26f [scsi_mod]
blk_mq_dispatch_rq_list+0x193/0x2a7
? _raw_spin_unlock+0x2e/0x40
blk_mq_sched_dispatch_requests+0x132/0x176
__blk_mq_run_hw_queue+0x59/0xc5
__blk_mq_delay_run_hw_queue+0x5f/0xc1
blk_mq_flush_plug_list+0xfc/0x10b
blk_flush_plug_list+0xc6/0x1eb
blk_finish_plug+0x25/0x32
generic_writepages+0x56/0x63
do_writepages+0x36/0x70
__filemap_fdatawrite_range+0x59/0x5f
filemap_write_and_wait+0x19/0x4f
__blkdev_put+0x5f/0x190
blkdev_close+0x24/0x27
__fput+0xee/0x18a
task_work_run+0x79/0xa0
prepare_exit_to_usermode+0x9b/0xb5
entry_SYSCALL_64_fastpath+0xab/0xad
RIP: 0033:0x7ff5755a2f74
RSP: 002b:00007ffe46fce038 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
RAX: 0000000000000000 RBX: 0000555ddeddded0 RCX: 00007ff5755a2f74
RDX: 0000000000001000 RSI: 0000555ddede2580 RDI: 0000000000000004
RBP: 0000000000000000 R08: 0000555ddede2580 R09: 0000555ddedde080
R10: 0000000108000000 R11: 0000000000000246 R12: 0000555ddedddfa0
R13: 00007ff576523680 R14: 0000000000001000 R15: 0000555ddeddc2b0
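
Read as a graph, the report says this: chain entry #1 records that the completion was waited on while bd_mutex was held (bd_mutex -> completion), and the new observation is that bd_mutex is taken in the context that releases the completion (completion -> bd_mutex). A minimal sketch of that cycle check follows. The graph representation and the function name are illustrative assumptions and do not reflect lockdep's actual data structures.

```python
from collections import defaultdict

BD_MUTEX = "&bdev->bd_mutex"
COMPLETION = "(complete)&wait#2"


def would_close_cycle(edges, held, wanted):
    """Return True if adding the edge held -> wanted would create a cycle.

    `edges` maps a lock class to the set of classes acquired while it was held.
    A cycle appears when `held` is already reachable from `wanted`.
    """
    stack, seen = [wanted], set()
    while stack:
        node = stack.pop()
        if node == held:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(edges[node])
    return False


edges = defaultdict(set)

# Existing dependency (#1 in the report): the sd_probe_async path waits for
# the completion in blk_execute_rq while holding bd_mutex from __blkdev_get.
edges[BD_MUTEX].add(COMPLETION)

# New dependency: complete() runs under __blkdev_put, which holds bd_mutex,
# so the crosslock's release context depends on bd_mutex.
print(would_close_cycle(edges, held=COMPLETION, wanted=BD_MUTEX))
```

With only the one existing edge recorded, the check reports that the new edge closes a cycle, which is the condition the warning describes.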

-ss

2017-08-22 18:11:22

Subject: Re: linux-next: Tree for Aug 22

Hi all,

This tree fails to boot on my qemu test. 2 boot logs attached.

Paul, Nick, is this the same as (or similar to) the other RCU/lockup bug you
are chasing? This is the first time I have seen this failure.

This qemu boot runs in full emulation mode. If I add --enable-kvm to the
qemu command, it does not fail to boot. (The test just boots and then shuts down.)
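
The two configurations differ by a single flag. A minimal sketch of building both command lines, using the options that appear in the boot logs in this thread; the function name and the file paths passed to it are placeholders, not the actual test harness.

```python
import shlex


def qemu_command(kernel, initrd, use_kvm=False):
    """Build the qemu-system-ppc64 argument list for the boot test."""
    argv = [
        "qemu-system-ppc64",
        "-M", "pseries",
        "-m", "2G",
        "-vga", "none",
        "-nographic",
        "-kernel", kernel,
        "-initrd", initrd,
    ]
    if use_kvm:
        # Flag as quoted in the message; without it qemu runs in full emulation.
        argv.append("--enable-kvm")
    return argv


# Placeholder paths, for illustration only.
print(shlex.join(qemu_command("vmlinux", "rootfs.cpio.gz")))
print(shlex.join(qemu_command("vmlinux", "rootfs.cpio.gz", use_kvm=True)))
```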
--
Cheers,
Stephen Rothwell

2017-08-22 18:14:30

Subject: Re: linux-next: Tree for Aug 22

Hi all,

On Wed, 23 Aug 2017 04:11:17 +1000 Stephen Rothwell <[email protected]> wrote:
>
> This tree fails to boot on my qemu test. 2 boot logs attached.
>
> Paul, Nick, is this the same as (or similar to) the other RCU/lockup bug you
> are chasing? This is the first time I have seen this failure.
>
> This qemu boot runs in full emulation mode. If I add --enable-kvm to the
> qemu command, it does not fail to boot. (The test just boots and then shuts down.)

Boot logs attached this time.
--
Cheers,
Stephen Rothwell

Attachments:

(No filename) (518.00 B)
bad-log-1 (15.30 kB)
bad-log-2 (13.49 kB)

2017-08-22 18:59:27

by Paul E. McKenney

Subject: Re: linux-next: Tree for Aug 22

On Wed, Aug 23, 2017 at 04:14:24AM +1000, Stephen Rothwell wrote:
> Hi all,
>
> On Wed, 23 Aug 2017 04:11:17 +1000 Stephen Rothwell <[email protected]> wrote:
> >
> > This tree fails to boot on my qemu test. 2 boot logs attached.
> >
> > Paul, Nick, is this the same as (or similar to) the other RCU/lockup bug you
> > are chasing? This is the first time I have seen this failure.
> >
> > This qemu boot runs in full emulation mode. If I add --enable-kvm to the
> > qemu command, it does not fail to boot. (The test just boots and then shuts down.)
>
> Boot logs attached this time.

That does not look good!

Given that the hard lockup happened during timer lock acquisition, I
have to ask if you built with lockdep...
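
One way to answer that is to look at the .config the kernel was built from. A minimal sketch follows. The option list is an assumption about what "built with lockdep" usually means (CONFIG_LOCKDEP, CONFIG_PROVE_LOCKING, CONFIG_DEBUG_LOCK_ALLOC), and the sample text is made up for illustration, not taken from the failing build.

```python
import re

# Assumed set of options of interest; adjust as needed.
LOCKDEP_OPTIONS = ("CONFIG_LOCKDEP", "CONFIG_PROVE_LOCKING", "CONFIG_DEBUG_LOCK_ALLOC")


def enabled_options(config_text, names=LOCKDEP_OPTIONS):
    """Return the subset of `names` that are set to y in kernel .config text."""
    enabled = set()
    for line in config_text.splitlines():
        match = re.fullmatch(r"(CONFIG_\w+)=y", line.strip())
        if match and match.group(1) in names:
            enabled.add(match.group(1))
    return enabled


# Hypothetical .config fragment.
sample = """\
CONFIG_PROVE_LOCKING=y
CONFIG_LOCKDEP=y
# CONFIG_DEBUG_LOCK_ALLOC is not set
"""
print(sorted(enabled_options(sample)))
```

For a real build tree, the same function can be fed the contents of the .config file in the build directory.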

Thanx, Paul

> --
> Cheers,
> Stephen Rothwell

> spawn qemu-system-ppc64 -M pseries -m 2G -vga none -nographic -kernel /home/sfr/next/powerpc_pseries_le_defconfig/vmlinux -initrd ./ppc64le-rootfs.cpio.gz
>
>
> SLOF **********************************************************************
> QEMU Starting
> Build Date = Jan 3 2017 22:22:01
> FW Version = buildd@ release 20161019
> Press "s" to enter Open Firmware.
>
> C0000C0100C0120C0140C0200C0201C0220C0240C0260C02E0C0300C0320C0340C0360C0370C0380C0371C0372C0373C0374C03F0C0400C0480C04C0C04D0C0500Populating /vdevice methods
> Populating /vdevice/vty@71000000
> Populating /vdevice/nvram@71000001
> Populating /vdevice/l-lan@71000002
> Populating /vdevice/v-scsi@71000003
> SCSI: Looking for devices
> 8200000000000000 CD-ROM : "QEMU QEMU CD-ROM 2.5+"
> C0580C05A0Populating /pci@800000020000000
> C0600C0640C0690C06A0C06A8C06B0C06B8C06C0C06E0C0700C0800C0880No NVRAM common partition, re-initializing...
> C0890C08A0C08A8C08B0Scanning USB
> C08C0C08D0Using default console: /vdevice/vty@71000000
> C08E0C08E8Detected RAM kernel at 400000 (1071618 bytes) C08FF
> Welcome to Open Firmware
>
> Copyright (c) 2004, 2011 IBM Corporation All rights reserved.
> This program and the accompanying materials are made available
> under the terms of the BSD License available at
> http://www.opensource.org/licenses/bsd-license.php
>
> Booting from memory...
> OF stdout device is: /vdevice/vty@71000000
> Preparing to boot Linux version 4.13.0-rc6 (sfr@colugo-sfr) (gcc version 5.2.1 20151008 (GCC)) #2 SMP Tue Aug 22 18:19:21 AEST 2017
> Detected machine type: 0000000000000101
> command line:
> Max number of cores passed to firmware: 2048 (NR_CPUS = 2048)
> Calling ibm,client-architecture-support... done
> memory layout at init:
> memory_limit : 0000000000000000 (16 MB aligned)
> alloc_bottom : 0000000001490000
> alloc_top : 0000000030000000
> alloc_top_hi : 0000000080000000
> rmo_top : 0000000030000000
> ram_top : 0000000080000000
> instantiating rtas at 0x000000002fff0000... done
> prom_hold_cpus: skipped
> copying OF device tree...
> Building dt strings...
> Building dt structure...
> Device tree strings 0x00000000016a0000 -> 0x00000000016a09df
> Device tree struct 0x00000000016b0000 -> 0x00000000016c0000
> Quiescing Open Firmware ...
> Booting Linux via __start() @ 0x0000000000400000 ...
> Page sizes from device-tree:
> base_shift=12: shift=12, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=0
> base_shift=12: shift=16, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=7
> base_shift=12: shift=24, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=56
> base_shift=16: shift=16, sllp=0x0110, avpnm=0x00000000, tlbiel=1, penc=1
> base_shift=16: shift=24, sllp=0x0110, avpnm=0x00000000, tlbiel=1, penc=8
> base_shift=24: shift=24, sllp=0x0100, avpnm=0x00000001, tlbiel=0, penc=0
> base_shift=34: shift=34, sllp=0x0120, avpnm=0x000007ff, tlbiel=0, penc=3
> Using 1TB segments
> Initializing hash mmu with SLB
> Linux version 4.13.0-rc6 (sfr@colugo-sfr) (gcc version 5.2.1 20151008 (GCC)) #2 SMP Tue Aug 22 18:19:21 AEST 2017
> Found initrd at 0xc000000001490000:0xc00000000165d70b
> Using pSeries machine description
> bootconsole [udbg0] enabled
> Partition configured for 1 cpus.
> CPU maps initialized for 1 thread per core
> -> smp_release_cpus()
> spinning_secondaries = 0
> <- smp_release_cpus()
> -----------------------------------------------------
> ppc64_pft_size = 0x18
> phys_mem_size = 0x80000000
> dcache_bsize = 0x80
> icache_bsize = 0x80
> cpu_features = 0x077c7a6c18500249
> possible = 0x5fffffff18500649
> always = 0x0000000018100040
> cpu_user_features = 0xdc0065c2 0xae000000
> mmu_features = 0x7c006001
> firmware_features = 0x00000001405a445f
> htab_hash_mask = 0x1ffff
> -----------------------------------------------------
> numa: NODE_DATA [mem 0x7ffe2300-0x7ffebfff]
> PCI host bridge /pci@800000020000000 ranges:
> IO 0x0000200000000000..0x000020000000ffff -> 0x0000000000000000
> MEM 0x0000200080000000..0x00002000ffffffff -> 0x0000000080000000
> MEM 0x0000210000000000..0x000021ffffffffff -> 0x0000210000000000
> PPC64 nvram contains 65536 bytes
> Zone ranges:
> DMA [mem 0x0000000000000000-0x000000007fffffff]
> DMA32 empty
> Normal empty
> Movable zone start for each node
> Early memory node ranges
> node 0: [mem 0x0000000000000000-0x000000007fffffff]
> Initmem setup node 0 [mem 0x0000000000000000-0x000000007fffffff]
> percpu: Embedded 3 pages/cpu @c00000007fe00000 s158616 r0 d37992 u1048576
> Built 1 zonelists, mobility grouping on. Total pages: 32736
> Policy zone: DMA
> Kernel command line:
> PID hash table entries: 4096 (order: -1, 32768 bytes)
> Memory: 2060800K/2097152K available (10112K kernel code, 1600K rwdata, 2752K rodata, 896K init, 1413K bss, 36352K reserved, 0K cma-reserved)
> SLUB: HWalign=128, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
> Hierarchical RCU implementation.
> RCU event tracing is enabled.
> RCU restricting CPUs from NR_CPUS=2048 to nr_cpu_ids=1.
> RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
> NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16
> clocksource: timebase: mask: 0xffffffffffffffff max_cycles: 0x761537d007, max_idle_ns: 440795202126 ns
> clocksource: timebase mult[1f40000] shift[24] registered
> Console: colour dummy device 80x25
> console [hvc0] enabled
> console [hvc0] enabled
> bootconsole [udbg0] disabled
> bootconsole [udbg0] disabled
> pid_max: default: 32768 minimum: 301
> Dentry cache hash table entries: 262144 (order: 5, 2097152 bytes)
> Inode-cache hash table entries: 131072 (order: 4, 1048576 bytes)
> Mount-cache hash table entries: 8192 (order: 0, 65536 bytes)
> Mountpoint-cache hash table entries: 8192 (order: 0, 65536 bytes)
> EEH: pSeries platform initialized
> POWER8 performance monitor hardware support registered
> Hierarchical SRCU implementation.
> smp: Bringing up secondary CPUs ...
> smp: Brought up 1 node, 1 CPU
> numa: Node 0 CPUs: 0
> devtmpfs: initialized
> random: get_random_u32 called from bucket_table_alloc+0x144/0x380 with crng_init=0
> clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
> futex hash table entries: 256 (order: -1, 32768 bytes)
> NET: Registered protocol family 16
> EEH: No capable adapters found
> cpuidle: using governor menu
> kworker/u2:1 (27) used greatest stack depth: 13872 bytes left
> kworker/u2:1 (28) used greatest stack depth: 13584 bytes left
> random: fast init done
> kworker/u2:0 (17) used greatest stack depth: 12352 bytes left
> pstore: using zlib compression
> pstore: Registered nvram as persistent store backend
> Linux ppc64le
> #2 SMP Tue Aug 2PCI: Probing PCI hardware
> PCI host bridge to bus 0000:00
> pci_bus 0000:00: root bus resource [io 0x10000-0x1ffff] (bus address [0x0000-0xffff])
> pci_bus 0000:00: root bus resource [mem 0x200080000000-0x2000ffffffff] (bus address [0x80000000-0xffffffff])
> pci_bus 0000:00: root bus resource [mem 0x210000000000-0x21ffffffffff]
> pci_bus 0000:00: root bus resource [bus 00-ff]
> IOMMU table initialized, virtual merging enabled
> HugeTLB registered 16.0 MiB page size, pre-allocated 0 pages
> HugeTLB registered 16.0 GiB page size, pre-allocated 0 pages
> vgaarb: loaded
> SCSI subsystem initialized
> usbcore: registered new interface driver usbfs
> usbcore: registered new interface driver hub
> usbcore: registered new device driver usb
> pps_core: LinuxPPS API ver. 1 registered
> pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <[email protected]>
> PTP clock support registered
> clocksource: Switched to clocksource timebase
> NET: Registered protocol family 2
> TCP established hash table entries: 16384 (order: 1, 131072 bytes)
> TCP bind hash table entries: 16384 (order: 2, 262144 bytes)
> TCP: Hash tables configured (established 16384 bind 16384)
> UDP hash table entries: 2048 (order: 0, 65536 bytes)
> UDP-Lite hash table entries: 2048 (order: 0, 65536 bytes)
> NET: Registered protocol family 1
> RPC: Registered named UNIX socket transport module.
> RPC: Registered udp transport module.
> RPC: Registered tcp transport module.
> RPC: Registered tcp NFSv4.1 backchannel transport module.
> Trying to unpack rootfs image as initramfs...
> Freeing initrd memory: 1792K
> audit: initializing netlink subsys (disabled)
> audit: type=2000 audit(1503390342.630:1): state=initialized audit_enabled=0 res=1
> workingset: timestamp_bits=38 max_order=15 bucket_order=0
> NFS: Registering the id_resolver key type
> Key type id_resolver registered
> Key type id_legacy registered
> Block layer SCSI generic (bsg) driver version 0.4 loaded (major 250)
> io scheduler noop registered
> io scheduler deadline registered
> io scheduler cfq registered (default)
> io scheduler mq-deadline registered
> io scheduler kyber registered
> Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
> brd: module loaded
> loop: module loaded
> ipr: IBM Power RAID SCSI Device Driver version: 2.6.4 (March 14, 2017)
> ibmvscsi 71000003: SRP_VERSION: 16.a
> ibmvscsi 71000003: Maximum ID: 64 Maximum LUN: 32 Maximum Channel: 3
> scsi host0: IBM POWER Virtual SCSI Adapter 1.5.9
> ibmvscsi 71000003: partner initialization complete
> ibmvscsi 71000003: host srp version: 16.a, host partition qemu (0), OS 2, max io 2097152
> ibmvscsi 71000003: sent SRP login
> ibmvscsi 71000003: SRP_LOGIN succeeded
> scsi 0:0:2:0: CD-ROM QEMU QEMU CD-ROM 2.5+ PQ: 0 ANSI: 5
> sr 0:0:2:0: [sr0] scsi3-mmc drive: 16x/50x cd/rw xa/form2 cdda tray
> cdrom: Uniform CD-ROM driver Revision: 3.20
> sr 0:0:2:0: Attached scsi generic sg0 type 5
> libphy: Fixed MDIO Bus: probed
> e100: Intel(R) PRO/100 Network Driver, 3.5.24-k2-NAPI
> e100: Copyright(c) 1999-2006 Intel Corporation
> e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
> e1000: Copyright (c) 1999-2006 Intel Corporation.
> e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
> e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
> ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
> ehci-pci: EHCI PCI platform driver
> ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
> ohci-pci: OHCI PCI platform driver
> rtc-generic rtc-generic: rtc core: registered rtc-generic as rtc0
> IR NEC protocol handler initialized
> IR RC5(x/sz) protocol handler initialized
> IR RC6 protocol handler initialized
> IR JVC protocol handler initialized
> IR Sony protocol handler initialized
> IR SANYO protocol handler initialized
> IR Sharp protocol handler initialized
> IR MCE Keyboard/mouse protocol handler initialized
> IR XMP protocol handler initialized
> device-mapper: uevent: version 1.0.3
> device-mapper: ioctl: 4.36.0-ioctl (2017-06-09) initialised: [email protected]
> usbcore: registered new interface driver usbhid
> usbhid: USB HID core driver
> ipip: IPv4 and MPLS over IPv4 tunneling driver
> NET: Registered protocol family 17
> Key type dns_resolver registered
> registered taskstats version 1
> console [netcon0] enabled
> netconsole: network logging started
> rtc-generic rtc-generic: setting system clock to 2017-08-22 08:25:43 UTC (1503390343)
> Freeing unused kernel memory: 896K
> This architecture does not have kernel memory protection.
> INFO: rcu_sched self-detected stall on CPU
> 0-...: (2100 ticks this GP) idle=026/140000000000001/0 softirq=1069/1069 fqs=0
> (t=2100 jiffies g=-66 c=-67 q=17)
> rcu_sched kthread starved for 2100 jiffies! g18446744073709551550 c18446744073709551549 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=0
> rcu_sched R running task 14192 8 2 0x00000800
> Call Trace:
> [c00000007e65f8d0] [c00000007e65f900] 0xc00000007e65f900 (unreliable)
> [c00000007e65faa0] [c00000000001b678] __switch_to+0x298/0x460
> [c00000007e65fb00] [c0000000009d5524] __schedule+0x3e4/0xab0
> [c00000007e65fbe0] [c0000000009d5c30] schedule+0x40/0xb0
> [c00000007e65fc10] [c0000000009da4bc] schedule_timeout+0x1fc/0x440
> [c00000007e65fcf0] [c0000000001754ac] rcu_gp_kthread+0x60c/0x1090
> [c00000007e65fdc0] [c000000000112a10] kthread+0x160/0x1a0
> [c00000007e65fe30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
> NMI backtrace for cpu 0
> CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.13.0-rc6 #2
> Call Trace:
> [c00000007e65b0e0] [c0000000009bbfa0] dump_stack+0xb0/0xf0 (unreliable)
> [c00000007e65b120] [c0000000009c4cb8] nmi_cpu_backtrace+0x208/0x210
> [c00000007e65b1b0] [c0000000009c4ea0] nmi_trigger_cpumask_backtrace+0x1e0/0x220
> [c00000007e65b240] [c00000000002d100] arch_trigger_cpumask_backtrace+0x20/0x40
> [c00000007e65b260] [c000000000177d00] rcu_dump_cpu_stacks+0xf4/0x164
> [c00000007e65b2b0] [c000000000177174] rcu_check_callbacks+0x994/0xaf0
> [c00000007e65b3e0] [c00000000017f34c] update_process_times+0x3c/0x90
> [c00000007e65b410] [c000000000195c0c] tick_sched_handle.isra.5+0x2c/0xc0
> [c00000007e65b440] [c000000000195cf8] tick_sched_timer+0x58/0xd0
> [c00000007e65b480] [c00000000017fdc8] __hrtimer_run_queues+0xf8/0x360
> [c00000007e65b500] [c000000000180d24] hrtimer_interrupt+0xf4/0x340
> [c00000007e65b5d0] [c0000000000231dc] __timer_interrupt+0x8c/0x270
> [c00000007e65b620] [c0000000000238c0] timer_interrupt+0xa0/0xe0
> [c00000007e65b650] [c0000000000091c0] decrementer_common+0x150/0x160
> --- interrupt: 901 at .L142+0x0/0x4
> LR = arch_local_irq_restore+0x74/0x90
> [c00000007e65b940] [fed0895fbd054278] 0xfed0895fbd054278 (unreliable)
> [c00000007e65b960] [c00000000002c688] wd_smp_clear_cpu_pending+0x168/0x380
> [c00000007e65b9f0] [c00000000002d188] watchdog_timer_interrupt+0x68/0x370
> [c00000007e65ba90] [c00000000002d528] wd_timer_fn+0x38/0x60
> [c00000007e65bac0] [c00000000017de28] call_timer_fn+0x58/0x1c0
> [c00000007e65bb50] [c00000000017e100] expire_timers+0x140/0x1e0
> [c00000007e65bbc0] [c00000000017e268] run_timer_softirq+0xc8/0x230
> [c00000007e65bc50] [c0000000009dc7f0] __do_softirq+0x170/0x3e4
> [c00000007e65bd40] [c0000000000eef9c] run_ksoftirqd+0x3c/0xb0
> [c00000007e65bd60] [c000000000118500] smpboot_thread_fn+0x290/0x2a0
> [c00000007e65bdc0] [c000000000112a10] kthread+0x160/0x1a0
> [c00000007e65be30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
> watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [ksoftirqd/0:7]
> Modules linked in:
> CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.13.0-rc6 #2
> task: c00000007e62e100 task.stack: c00000007e658000
> NIP: c00000000000ad4c LR: c000000000015ae4 CTR: c00000000002d4f0
> REGS: c00000007e65b6c0 TRAP: 0901 Not tainted (4.13.0-rc6)
> MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE>
> CR: 24000244 XER: 20000000
> CFAR: c000000000334910 SOFTE: 1
> GPR00: c00000000002c688 c00000007e65b940 c000000000ea5a00 0000000000000900
> GPR04: 0000000000000001 000000007f0b0000 000000055f52d5ae 0000000000000000
> GPR08: c00000000fd40000 0000000000400000 0000000000400000 0000000000000000
> GPR12: 0000000028000222 c00000000fd40000
> NIP [c00000000000ad4c] .L142+0x0/0x4
> LR [c000000000015ae4] arch_local_irq_restore+0x74/0x90
> Call Trace:
> [c00000007e65b940] [98a11310c2925282] 0x98a11310c2925282 (unreliable)
> [c00000007e65b960] [c00000000002c688] wd_smp_clear_cpu_pending+0x168/0x380
> [c00000007e65b9f0] [c00000000002d188] watchdog_timer_interrupt+0x68/0x370
> [c00000007e65ba90] [c00000000002d528] wd_timer_fn+0x38/0x60
> [c00000007e65bac0] [c00000000017de28] call_timer_fn+0x58/0x1c0
> [c00000007e65bb50] [c00000000017e100] expire_timers+0x140/0x1e0
> [c00000007e65bbc0] [c00000000017e268] run_timer_softirq+0xc8/0x230
> [c00000007e65bc50] [c0000000009dc7f0] __do_softirq+0x170/0x3e4
> [c00000007e65bd40] [c0000000000eef9c] run_ksoftirqd+0x3c/0xb0
> [c00000007e65bd60] [c000000000118500] smpboot_thread_fn+0x290/0x2a0
> [c00000007e65bdc0] [c000000000112a10] kthread+0x160/0x1a0
> [c00000007e65be30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
> Instruction dump:
> 7d200026 618c8000 2c030900 4182e348 2c030500 4182dda0 2c030a00 4182ffc0
> 60000000 60000000 60000000 60000000 <4e800020> 7c781b78 48000331 48000349
> timeout waiting for login
>
> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
> BAD BAD BAD BAD BAD BAD BAD BAD BAD BAD
> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

> spawn qemu-system-ppc64 -M pseries -m 2G -vga none -nographic -kernel /home/sfr/next/powerpc_pseries_le_defconfig/vmlinux -initrd ./ppc64le-rootfs.cpio.gz
>
>
> SLOF **********************************************************************
> QEMU Starting
> Build Date = Jan 3 2017 22:22:01
> FW Version = buildd@ release 20161019
> Press "s" to enter Open Firmware.
>
> C0000C0100C0120C0140C0200C0201C0220C0240C0260C02E0C0300C0320C0340C0360C0370C0380C0371C0372C0373C0374C03F0C0400C0480C04C0C04D0C0500Populating /vdevice methods
> Populating /vdevice/vty@71000000
> Populating /vdevice/nvram@71000001
> Populating /vdevice/l-lan@71000002
> Populating /vdevice/v-scsi@71000003
> SCSI: Looking for devices
> 8200000000000000 CD-ROM : "QEMU QEMU CD-ROM 2.5+"
> C0580C05A0Populating /pci@800000020000000
> C0600C0640C0690C06A0C06A8C06B0C06B8C06C0C06E0C0700C0800C0880No NVRAM common partition, re-initializing...
> C0890C08A0C08A8C08B0Scanning USB
> C08C0C08D0Using default console: /vdevice/vty@71000000
> C08E0C08E8Detected RAM kernel at 400000 (1071618 bytes) C08FF
> Welcome to Open Firmware
>
> Copyright (c) 2004, 2011 IBM Corporation All rights reserved.
> This program and the accompanying materials are made available
> under the terms of the BSD License available at
> http://www.opensource.org/licenses/bsd-license.php
>
> Booting from memory...
> OF stdout device is: /vdevice/vty@71000000
> Preparing to boot Linux version 4.13.0-rc6 (sfr@colugo-sfr) (gcc version 5.2.1 20151008 (GCC)) #2 SMP Tue Aug 22 18:19:21 AEST 2017
> Detected machine type: 0000000000000101
> command line:
> Max number of cores passed to firmware: 2048 (NR_CPUS = 2048)
> Calling ibm,client-architecture-support... done
> memory layout at init:
> memory_limit : 0000000000000000 (16 MB aligned)
> alloc_bottom : 0000000001490000
> alloc_top : 0000000030000000
> alloc_top_hi : 0000000080000000
> rmo_top : 0000000030000000
> ram_top : 0000000080000000
> instantiating rtas at 0x000000002fff0000... done
> prom_hold_cpus: skipped
> copying OF device tree...
> Building dt strings...
> Building dt structure...
> Device tree strings 0x00000000016a0000 -> 0x00000000016a09df
> Device tree struct 0x00000000016b0000 -> 0x00000000016c0000
> Quiescing Open Firmware ...
> Booting Linux via __start() @ 0x0000000000400000 ...
> Page sizes from device-tree:
> base_shift=12: shift=12, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=0
> base_shift=12: shift=16, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=7
> base_shift=12: shift=24, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=56
> base_shift=16: shift=16, sllp=0x0110, avpnm=0x00000000, tlbiel=1, penc=1
> base_shift=16: shift=24, sllp=0x0110, avpnm=0x00000000, tlbiel=1, penc=8
> base_shift=24: shift=24, sllp=0x0100, avpnm=0x00000001, tlbiel=0, penc=0
> base_shift=34: shift=34, sllp=0x0120, avpnm=0x000007ff, tlbiel=0, penc=3
> Using 1TB segments
> Initializing hash mmu with SLB
> Linux version 4.13.0-rc6 (sfr@colugo-sfr) (gcc version 5.2.1 20151008 (GCC)) #2 SMP Tue Aug 22 18:19:21 AEST 2017
> Found initrd at 0xc000000001490000:0xc00000000165d70b
> Using pSeries machine description
> bootconsole [udbg0] enabled
> Partition configured for 1 cpus.
> CPU maps initialized for 1 thread per core
> -> smp_release_cpus()
> spinning_secondaries = 0
> <- smp_release_cpus()
> -----------------------------------------------------
> ppc64_pft_size = 0x18
> phys_mem_size = 0x80000000
> dcache_bsize = 0x80
> icache_bsize = 0x80
> cpu_features = 0x077c7a6c18500249
> possible = 0x5fffffff18500649
> always = 0x0000000018100040
> cpu_user_features = 0xdc0065c2 0xae000000
> mmu_features = 0x7c006001
> firmware_features = 0x00000001405a445f
> htab_hash_mask = 0x1ffff
> -----------------------------------------------------
> numa: NODE_DATA [mem 0x7ffe2300-0x7ffebfff]
> PCI host bridge /pci@800000020000000 ranges:
> IO 0x0000200000000000..0x000020000000ffff -> 0x0000000000000000
> MEM 0x0000200080000000..0x00002000ffffffff -> 0x0000000080000000
> MEM 0x0000210000000000..0x000021ffffffffff -> 0x0000210000000000
> PPC64 nvram contains 65536 bytes
> Zone ranges:
> DMA [mem 0x0000000000000000-0x000000007fffffff]
> DMA32 empty
> Normal empty
> Movable zone start for each node
> Early memory node ranges
> node 0: [mem 0x0000000000000000-0x000000007fffffff]
> Initmem setup node 0 [mem 0x0000000000000000-0x000000007fffffff]
> percpu: Embedded 3 pages/cpu @c00000007fe00000 s158616 r0 d37992 u1048576
> Built 1 zonelists, mobility grouping on. Total pages: 32736
> Policy zone: DMA
> Kernel command line:
> PID hash table entries: 4096 (order: -1, 32768 bytes)
> Memory: 2060800K/2097152K available (10112K kernel code, 1600K rwdata, 2752K rodata, 896K init, 1413K bss, 36352K reserved, 0K cma-reserved)
> SLUB: HWalign=128, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
> Hierarchical RCU implementation.
> RCU event tracing is enabled.
> RCU restricting CPUs from NR_CPUS=2048 to nr_cpu_ids=1.
> RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
> NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16
> clocksource: timebase: mask: 0xffffffffffffffff max_cycles: 0x761537d007, max_idle_ns: 440795202126 ns
> clocksource: timebase mult[1f40000] shift[24] registered
> Console: colour dummy device 80x25
> console [hvc0] enabled
> console [hvc0] enabled
> bootconsole [udbg0] disabled
> bootconsole [udbg0] disabled
> pid_max: default: 32768 minimum: 301
> Dentry cache hash table entries: 262144 (order: 5, 2097152 bytes)
> Inode-cache hash table entries: 131072 (order: 4, 1048576 bytes)
> Mount-cache hash table entries: 8192 (order: 0, 65536 bytes)
> Mountpoint-cache hash table entries: 8192 (order: 0, 65536 bytes)
> EEH: pSeries platform initialized
> POWER8 performance monitor hardware support registered
> Hierarchical SRCU implementation.
> smp: Bringing up secondary CPUs ...
> smp: Brought up 1 node, 1 CPU
> numa: Node 0 CPUs: 0
> devtmpfs: initialized
> random: get_random_u32 called from bucket_table_alloc+0x144/0x380 with crng_init=0
> clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
> futex hash table entries: 256 (order: -1, 32768 bytes)
> NET: Registered protocol family 16
> EEH: No capable adapters found
> cpuidle: using governor menu
> kworker/u2:1 (27) used greatest stack depth: 13872 bytes left
> kworker/u2:1 (28) used greatest stack depth: 13584 bytes left
> random: fast init done
> kworker/u2:0 (17) used greatest stack depth: 12192 bytes left
> pstore: using zlib compression
> pstore: Registered nvram as persistent store backend
> Linux ppc64le
> #2 SMP Tue Aug 2PCI: Probing PCI hardware
> PCI host bridge to bus 0000:00
> pci_bus 0000:00: root bus resource [io 0x10000-0x1ffff] (bus address [0x0000-0xffff])
> pci_bus 0000:00: root bus resource [mem 0x200080000000-0x2000ffffffff] (bus address [0x80000000-0xffffffff])
> pci_bus 0000:00: root bus resource [mem 0x210000000000-0x21ffffffffff]
> pci_bus 0000:00: root bus resource [bus 00-ff]
> IOMMU table initialized, virtual merging enabled
> HugeTLB registered 16.0 MiB page size, pre-allocated 0 pages
> HugeTLB registered 16.0 GiB page size, pre-allocated 0 pages
> vgaarb: loaded
> SCSI subsystem initialized
> usbcore: registered new interface driver usbfs
> usbcore: registered new interface driver hub
> usbcore: registered new device driver usb
> pps_core: LinuxPPS API ver. 1 registered
> pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <[email protected]>
> PTP clock support registered
> clocksource: Switched to clocksource timebase
> Watchdog CPU:0 Hard LOCKUP
> Modules linked in:
> CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.13.0-rc6 #2
> task: c00000007e62e100 task.stack: c00000007e658000
> NIP: c00000000017bb68 LR: c00000000017bb68 CTR: c000000000106330
> REGS: c00000003ffefd80 TRAP: 0900 Not tainted (4.13.0-rc6)
> MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE>
> CR: 24000802 XER: 00000000
> CFAR: c0000000009dbc34 SOFTE: 0
> GPR00: c00000000017bb68 c00000007e65b9b0 c000000000ea5a00 0000000000000000
> GPR04: c00000007e65ba30 c00000007e160000 8000000000000000 c00000007e65bbe8
> GPR08: 0000000000000004 0000000000000000 0000000080000000 c00000007fe0fea8
> GPR12: c000000000106330 c00000000fd40000 c0000000001128b8 c00000007e150180
> GPR16: 0000000000000100 0000000004208040 c00000007e658000 0000000000000000
> GPR20: c000000000d74f00 c000000000ed3b00 00000000ffff8af5 000000000000000a
> GPR24: c000000000d74f00 c000000000d5fe80 c000000000eddbf8 c00000007e65ba30
> GPR28: c000000000d5ee00 c00000007e160048 c00000007fe0fe80 000000007fc80000
> NIP [c00000000017bb68] lock_timer_base+0x98/0xf0
> LR [c00000000017bb68] lock_timer_base+0x98/0xf0
> Call Trace:
> [c00000007e65b9b0] [c00000000017bb68] lock_timer_base+0x98/0xf0 (unreliable)
> [c00000007e65ba10] [c00000000017ed9c] mod_timer+0x2fc/0x350
> [c00000007e65ba80] [c000000000106468] idle_worker_timeout+0x138/0x190
> [c00000007e65bac0] [c00000000017de28] call_timer_fn+0x58/0x1c0
> [c00000007e65bb50] [c00000000017e100] expire_timers+0x140/0x1e0
> [c00000007e65bbc0] [c00000000017e348] run_timer_softirq+0x1a8/0x230
> [c00000007e65bc50] [c0000000009dc7f0] __do_softirq+0x170/0x3e4
> [c00000007e65bd40] [c0000000000eef9c] run_ksoftirqd+0x3c/0xb0
> [c00000007e65bd60] [c000000000118500] smpboot_thread_fn+0x290/0x2a0
> [c00000007e65bdc0] [c000000000112a10] kthread+0x160/0x1a0
> [c00000007e65be30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
> Instruction dump:
> 7be91ae8 4082ffec 7d5a482a 7be96fe3 7fdc5214 893e0025 2f890000 419e000c
> 41820008 7fcaca14 7fc3f378 48860065 <60000000> f87b0000 7c641b78 7fc3f378
> INFO: rcu_sched self-detected stall on CPU
> 0-...: (2100 ticks this GP) idle=002/140000000000001/0 softirq=183/183 fqs=0
> (t=2100 jiffies g=-278 c=-279 q=136)
> rcu_sched kthread starved for 2100 jiffies! g18446744073709551338 c18446744073709551337 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=0
> rcu_sched R running task 13216 8 2 0x00000800
> Call Trace:
> [c00000007e65f8d0] [c00000007e65fbd0] 0xc00000007e65fbd0 (unreliable)
> [c00000007e65faa0] [c00000000001b678] __switch_to+0x298/0x460
> [c00000007e65fb00] [c0000000009d5524] __schedule+0x3e4/0xab0
> [c00000007e65fbe0] [c0000000009d5c30] schedule+0x40/0xb0
> [c00000007e65fc10] [c0000000009da4bc] schedule_timeout+0x1fc/0x440
> [c00000007e65fcf0] [c0000000001754ac] rcu_gp_kthread+0x60c/0x1090
> [c00000007e65fdc0] [c000000000112a10] kthread+0x160/0x1a0
> [c00000007e65fe30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
> NMI backtrace for cpu 0
> CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.13.0-rc6 #2
> Call Trace:
> [c00000007e65b200] [c0000000009bbfa0] dump_stack+0xb0/0xf0 (unreliable)
> [c00000007e65b240] [c0000000009c4cb8] nmi_cpu_backtrace+0x208/0x210
> [c00000007e65b2d0] [c0000000009c4ea0] nmi_trigger_cpumask_backtrace+0x1e0/0x220
> [c00000007e65b360] [c00000000002d100] arch_trigger_cpumask_backtrace+0x20/0x40
> [c00000007e65b380] [c000000000177d00] rcu_dump_cpu_stacks+0xf4/0x164
> [c00000007e65b3d0] [c000000000177174] rcu_check_callbacks+0x994/0xaf0
> [c00000007e65b500] [c00000000017f34c] update_process_times+0x3c/0x90
> [c00000007e65b530] [c000000000195c0c] tick_sched_handle.isra.5+0x2c/0xc0
> [c00000007e65b560] [c000000000195cf8] tick_sched_timer+0x58/0xd0
> [c00000007e65b5a0] [c00000000017fdc8] __hrtimer_run_queues+0xf8/0x360
> [c00000007e65b620] [c000000000180d24] hrtimer_interrupt+0xf4/0x340
> [c00000007e65b6f0] [c0000000000231dc] __timer_interrupt+0x8c/0x270
> [c00000007e65b740] [c0000000000238c0] timer_interrupt+0xa0/0xe0
> [c00000007e65b770] [c0000000000091c0] decrementer_common+0x150/0x160
> --- interrupt: 901 at .L142+0x0/0x4
> LR = arch_local_irq_restore+0x74/0x90
> [c00000007e65ba60] [c000000000ed3b00] jiffies+0x0/0x80 (unreliable)
> [c00000007e65ba80] [c000000000106434] idle_worker_timeout+0x104/0x190
> [c00000007e65bac0] [c00000000017de28] call_timer_fn+0x58/0x1c0
> [c00000007e65bb50] [c00000000017e100] expire_timers+0x140/0x1e0
> [c00000007e65bbc0] [c00000000017e348] run_timer_softirq+0x1a8/0x230
> [c00000007e65bc50] [c0000000009dc7f0] __do_softirq+0x170/0x3e4
> [c00000007e65bd40] [c0000000000eef9c] run_ksoftirqd+0x3c/0xb0
> [c00000007e65bd60] [c000000000118500] smpboot_thread_fn+0x290/0x2a0
> [c00000007e65bdc0] [c000000000112a10] kthread+0x160/0x1a0
> [c00000007e65be30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
> watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [ksoftirqd/0:7]
> Modules linked in:
> CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.13.0-rc6 #2
> task: c00000007e62e100 task.stack: c00000007e658000
> NIP: c00000000000ad4c LR: c000000000015ae4 CTR: 0000000000000001
> REGS: c00000007e65b8b0 TRAP: 0901 Not tainted (4.13.0-rc6)
> MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE>
> CR: 24000804 XER: 00000000
> CFAR: c00000000017dbdc SOFTE: 1
> GPR00: c00000000017e0ec c00000007e65bb30 c000000000ea5a00 0000000000000900
> GPR04: 0000000000000001 c00000007e160000 8000000000000000 0000000000000000
> GPR08: c00000000fd40000 0000000000000012 0000000000000000 c00000007fe0fea8
> GPR12: c000000000106330 c00000000fd40000
> NIP [c00000000000ad4c] .L142+0x0/0x4
> LR [c000000000015ae4] arch_local_irq_restore+0x74/0x90
> Call Trace:
> [c00000007e65bb30] [c000000000d74f00] irq_stat+0x0/0x80 (unreliable)
> [c00000007e65bb50] [c00000000017e0ec] expire_timers+0x12c/0x1e0
> [c00000007e65bbc0] [c00000000017e348] run_timer_softirq+0x1a8/0x230
> [c00000007e65bc50] [c0000000009dc7f0] __do_softirq+0x170/0x3e4
> [c00000007e65bd40] [c0000000000eef9c] run_ksoftirqd+0x3c/0xb0
> [c00000007e65bd60] [c000000000118500] smpboot_thread_fn+0x290/0x2a0
> [c00000007e65bdc0] [c000000000112a10] kthread+0x160/0x1a0
> [c00000007e65be30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
> Instruction dump:
> 7d200026 618c8000 2c030900 4182e348 2c030500 4182dda0 2c030a00 4182ffc0
> 60000000 60000000 60000000 60000000 <4e800020> 7c781b78 48000331 48000349
> timeout waiting for login
>
> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
> BAD BAD BAD BAD BAD BAD BAD BAD BAD BAD
> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

2017-08-22 19:12:25

Subject: Re: linux-next: Tree for Aug 22

Hi Paul,

On Tue, 22 Aug 2017 11:59:23 -0700 "Paul E. McKenney" <[email protected]> wrote:
>
> On Wed, Aug 23, 2017 at 04:14:24AM +1000, Stephen Rothwell wrote:
> > Hi all,
> >
> > On Wed, 23 Aug 2017 04:11:17 +1000 Stephen Rothwell <[email protected]> wrote:
> > >
> > > This tree fails to boot on my qemu test. 2 boot logs attached.
> > >
> > > Paul, Nick, is this the same/similar to the other RCU/lockup bug you
> > > are chasing. This is the first time I have seen this failure.
> > >
> > > This qemu boot is in full emulation mode if I add --enable-kvm to the
> > > qemu command, it does not fail to boot. (the test just boots and then shuts down)
> >
> > Boot logs attached this time.
>
> That does not look good!
>
> Given that the hard lockup happened during timer lock acquisition, I
> have to ask if you built with lockdep...

$ grep LOCKDEP .config
CONFIG_LOCKDEP_SUPPORT=y

so, no. This is just a powerpc pseries_le_defconfig build.

--
Cheers,
Stephen Rothwell

2017-08-22 19:32:34

by Paul E. McKenney

Subject: Re: linux-next: Tree for Aug 22

On Wed, Aug 23, 2017 at 05:12:16AM +1000, Stephen Rothwell wrote:
> Hi Paul,
>
> On Tue, 22 Aug 2017 11:59:23 -0700 "Paul E. McKenney" <[email protected]> wrote:
> >
> > On Wed, Aug 23, 2017 at 04:14:24AM +1000, Stephen Rothwell wrote:
> > > Hi all,
> > >
> > > On Wed, 23 Aug 2017 04:11:17 +1000 Stephen Rothwell <[email protected]> wrote:
> > > >
> > > > This tree fails to boot on my qemu test. 2 boot logs attached.
> > > >
> > > > Paul, Nick, is this the same/similar to the other RCU/lockup bug you
> > > > are chasing. This is the first time I have seen this failure.
> > > >
> > > > This qemu boot is in full emulation mode if I add --enable-kvm to the
> > > > qemu command, it does not fail to boot. (the test just boots and then shuts down)
> > >
> > > Boot logs attached this time.
> >
> > That does not look good!
> >
> > Given that the hard lockup happened during timer lock acquisition, I
> > have to ask if you built with lockdep...
>
> $ grep LOCKDEP .config
> CONFIG_LOCKDEP_SUPPORT=y
>
> so, no. This is just a powerpc pseries_le_defconfig build.

This is without Nick's recent patch, I am guessing?

Hmmm... My testing of that patch omitted lockdep as well. Rerunning
on the full set of rcutorture scenarios...

Thanx, Paul

2017-08-22 19:36:23

by Paul E. McKenney

Subject: Re: linux-next: Tree for Aug 22

On Tue, Aug 22, 2017 at 12:32:31PM -0700, Paul E. McKenney wrote:
> On Wed, Aug 23, 2017 at 05:12:16AM +1000, Stephen Rothwell wrote:
> > Hi Paul,
> >
> > On Tue, 22 Aug 2017 11:59:23 -0700 "Paul E. McKenney" <[email protected]> wrote:
> > >
> > > On Wed, Aug 23, 2017 at 04:14:24AM +1000, Stephen Rothwell wrote:
> > > > Hi all,
> > > >
> > > > On Wed, 23 Aug 2017 04:11:17 +1000 Stephen Rothwell <[email protected]> wrote:
> > > > >
> > > > > This tree fails to boot on my qemu test. 2 boot logs attached.
> > > > >
> > > > > Paul, Nick, is this the same/similar to the other RCU/lockup bug you
> > > > > are chasing. This is the first time I have seen this failure.
> > > > >
> > > > > This qemu boot is in full emulation mode if I add --enable-kvm to the
> > > > > qemu command, it does not fail to boot. (the test just boots and then shuts down)
> > > >
> > > > Boot logs attached this time.
> > >
> > > That does not look good!
> > >
> > > Given that the hard lockup happened during timer lock acquisition, I
> > > have to ask if you built with lockdep...
> >
> > $ grep LOCKDEP .config
> > CONFIG_LOCKDEP_SUPPORT=y
> >
> > so, no. This is just a powerpc pseries_le_defconfig build.
>
> This is without Nick's recent patch, I am guessing?
>
> Hmmm... My testing of that patch omitted lockdep as well. Rerunning
> on the full set of rcutorture scenarios...

To complete the thought, if you aren't already using it, I suggest
applying Nick's patch:

http://lkml.kernel.org/r/[email protected]

Thanx, Paul

2017-08-22 21:44:49

by Bart Van Assche

Subject: Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

On Tue, 2017-08-22 at 19:47 +0900, Sergey Senozhatsky wrote:
> ======================================================
> WARNING: possible circular locking dependency detected
> 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746 Not tainted
> ------------------------------------------------------
> fsck.ext4/148 is trying to acquire lock:
> (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
>
> but now in release context of a crosslock acquired at the following:
> ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
>
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
>
> -> #1 ((complete)&wait#2){+.+.}:
> lock_acquire+0x176/0x19e
> __wait_for_common+0x50/0x1e3
> blk_execute_rq+0xbb/0xda
> scsi_execute+0xc3/0x17d [scsi_mod]
> sd_revalidate_disk+0x112/0x1549 [sd_mod]
> rescan_partitions+0x48/0x2c4
> __blkdev_get+0x14b/0x37c
> blkdev_get+0x191/0x2c0
> device_add_disk+0x2b4/0x3e5
> sd_probe_async+0xf8/0x17e [sd_mod]
> async_run_entry_fn+0x34/0xe0
> process_one_work+0x2af/0x4d1
> worker_thread+0x19a/0x24f
> kthread+0x133/0x13b
> ret_from_fork+0x27/0x40
>
> -> #0 (&bdev->bd_mutex){+.+.}:
> __blkdev_put+0x33/0x190
> blkdev_close+0x24/0x27
> __fput+0xee/0x18a
> task_work_run+0x79/0xa0
> prepare_exit_to_usermode+0x9b/0xb5
>
> other info that might help us debug this:
> Possible unsafe locking scenario by crosslock:
> CPU0 CPU1
> ---- ----
> lock(&bdev->bd_mutex);
> lock((complete)&wait#2);
> lock(&bdev->bd_mutex);
> unlock((complete)&wait#2);
>
> *** DEADLOCK ***
> 4 locks held by fsck.ext4/148:
> #0: (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> #1: (rcu_read_lock){....}, at: [<ffffffff81217f16>] rcu_lock_acquire+0x0/0x20
> #2: (&(&host->lock)->rlock){-.-.}, at: [<ffffffffa00e7550>] ata_scsi_queuecmd+0x23/0x74 [libata]
> #3: (&x->wait#14){-...}, at: [<ffffffff8106b593>] complete+0x18/0x50
>
> stack backtrace:
> CPU: 1 PID: 148 Comm: fsck.ext4 Not tainted 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746
> Call Trace:
> dump_stack+0x67/0x8e
> print_circular_bug+0x2a1/0x2af
> ? zap_class+0xc5/0xc5
> check_prev_add+0x76/0x20d
> ? __lock_acquire+0xc27/0xcc8
> lock_commit_crosslock+0x327/0x35e
> complete+0x24/0x50
> scsi_end_request+0x8d/0x176 [scsi_mod]
> scsi_io_completion+0x1be/0x423 [scsi_mod]
> __blk_mq_complete_request+0x112/0x131
> ata_scsi_simulate+0x212/0x218 [libata]
> __ata_scsi_queuecmd+0x1be/0x1de [libata]
> ata_scsi_queuecmd+0x41/0x74 [libata]
> scsi_dispatch_cmd+0x194/0x2af [scsi_mod]
> scsi_queue_rq+0x1e0/0x26f [scsi_mod]
> blk_mq_dispatch_rq_list+0x193/0x2a7
> ? _raw_spin_unlock+0x2e/0x40
> blk_mq_sched_dispatch_requests+0x132/0x176
> __blk_mq_run_hw_queue+0x59/0xc5
> __blk_mq_delay_run_hw_queue+0x5f/0xc1
> blk_mq_flush_plug_list+0xfc/0x10b
> blk_flush_plug_list+0xc6/0x1eb
> blk_finish_plug+0x25/0x32
> generic_writepages+0x56/0x63
> do_writepages+0x36/0x70
> __filemap_fdatawrite_range+0x59/0x5f
> filemap_write_and_wait+0x19/0x4f
> __blkdev_put+0x5f/0x190
> blkdev_close+0x24/0x27
> __fput+0xee/0x18a
> task_work_run+0x79/0xa0
> prepare_exit_to_usermode+0x9b/0xb5
> entry_SYSCALL_64_fastpath+0xab/0xad

Byungchul, did you add the crosslock checks to lockdep? Can you have a look at
the above report? That report doesn't make sense to me.

Bart.

2017-08-22 21:57:09

Subject: Re: linux-next: Tree for Aug 22

Hi Paul,

On Tue, 22 Aug 2017 12:36:20 -0700 "Paul E. McKenney" <[email protected]> wrote:
>
> To complete the thought, if you aren't already using it, I suggest
> applying Nick's patch:
>
> http://lkml.kernel.org/r/[email protected]

OK, I applied that - with a little shoehorning due to commit

71acb768f5b3 ("timers: Fix excessive granularity of new timers after a nohz idle")

from your cru tree.

my qemu test now boots and shuts down fine.

--
Cheers,
Stephen Rothwell

2017-08-22 22:27:16

Subject: Re: linux-next: Tree for Aug 22

Hi Paul,

On Wed, 23 Aug 2017 07:57:05 +1000 Stephen Rothwell <[email protected]> wrote:
>
> On Tue, 22 Aug 2017 12:36:20 -0700 "Paul E. McKenney" <[email protected]> wrote:
> >
> > To complete the thought, if you aren't already using it, I suggest
> > applying Nick's patch:
> >
> > http://lkml.kernel.org/r/[email protected]
>
> OK, I applied that - with a little shoehorning due to commit
>
> 71acb768f5b3 ("timers: Fix excessive granularity of new timers after a nohz idle")
>
> from your cru tree.
^^^
(I meant rcu, of course)

I will apply the resulting patch to linux-next (as part of the rcu tree
merge) today - unless you get around to updating your tree before then.

--
Cheers,
Stephen Rothwell

2017-08-23 00:03:17

Subject: Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

On Tue, Aug 22, 2017 at 09:43:56PM +0000, Bart Van Assche wrote:
> On Tue, 2017-08-22 at 19:47 +0900, Sergey Senozhatsky wrote:
> > ======================================================
> > WARNING: possible circular locking dependency detected
> > 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746 Not tainted
> > ------------------------------------------------------
> > fsck.ext4/148 is trying to acquire lock:
> > (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> >
> > but now in release context of a crosslock acquired at the following:
> > ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
> >
> > which lock already depends on the new lock.
> >
> > the existing dependency chain (in reverse order) is:
> >
> > -> #1 ((complete)&wait#2){+.+.}:
> > lock_acquire+0x176/0x19e
> > __wait_for_common+0x50/0x1e3
> > blk_execute_rq+0xbb/0xda
> > scsi_execute+0xc3/0x17d [scsi_mod]
> > sd_revalidate_disk+0x112/0x1549 [sd_mod]
> > rescan_partitions+0x48/0x2c4
> > __blkdev_get+0x14b/0x37c
> > blkdev_get+0x191/0x2c0
> > device_add_disk+0x2b4/0x3e5
> > sd_probe_async+0xf8/0x17e [sd_mod]
> > async_run_entry_fn+0x34/0xe0
> > process_one_work+0x2af/0x4d1
> > worker_thread+0x19a/0x24f
> > kthread+0x133/0x13b
> > ret_from_fork+0x27/0x40
> >
> > -> #0 (&bdev->bd_mutex){+.+.}:
> > __blkdev_put+0x33/0x190
> > blkdev_close+0x24/0x27
> > __fput+0xee/0x18a
> > task_work_run+0x79/0xa0
> > prepare_exit_to_usermode+0x9b/0xb5
> >
> > other info that might help us debug this:
> > Possible unsafe locking scenario by crosslock:
> > CPU0 CPU1
> > ---- ----
> > lock(&bdev->bd_mutex);
> > lock((complete)&wait#2);
> > lock(&bdev->bd_mutex);
> > unlock((complete)&wait#2);
> >
> > *** DEADLOCK ***
> > 4 locks held by fsck.ext4/148:
> > #0: (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> > #1: (rcu_read_lock){....}, at: [<ffffffff81217f16>] rcu_lock_acquire+0x0/0x20
> > #2: (&(&host->lock)->rlock){-.-.}, at: [<ffffffffa00e7550>] ata_scsi_queuecmd+0x23/0x74 [libata]
> > #3: (&x->wait#14){-...}, at: [<ffffffff8106b593>] complete+0x18/0x50
> >
> > stack backtrace:
> > CPU: 1 PID: 148 Comm: fsck.ext4 Not tainted 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746
> > Call Trace:
> > dump_stack+0x67/0x8e
> > print_circular_bug+0x2a1/0x2af
> > ? zap_class+0xc5/0xc5
> > check_prev_add+0x76/0x20d
> > ? __lock_acquire+0xc27/0xcc8
> > lock_commit_crosslock+0x327/0x35e
> > complete+0x24/0x50
> > scsi_end_request+0x8d/0x176 [scsi_mod]
> > scsi_io_completion+0x1be/0x423 [scsi_mod]
> > __blk_mq_complete_request+0x112/0x131
> > ata_scsi_simulate+0x212/0x218 [libata]
> > __ata_scsi_queuecmd+0x1be/0x1de [libata]
> > ata_scsi_queuecmd+0x41/0x74 [libata]
> > scsi_dispatch_cmd+0x194/0x2af [scsi_mod]
> > scsi_queue_rq+0x1e0/0x26f [scsi_mod]
> > blk_mq_dispatch_rq_list+0x193/0x2a7
> > ? _raw_spin_unlock+0x2e/0x40
> > blk_mq_sched_dispatch_requests+0x132/0x176
> > __blk_mq_run_hw_queue+0x59/0xc5
> > __blk_mq_delay_run_hw_queue+0x5f/0xc1
> > blk_mq_flush_plug_list+0xfc/0x10b
> > blk_flush_plug_list+0xc6/0x1eb
> > blk_finish_plug+0x25/0x32
> > generic_writepages+0x56/0x63
> > do_writepages+0x36/0x70
> > __filemap_fdatawrite_range+0x59/0x5f
> > filemap_write_and_wait+0x19/0x4f
> > __blkdev_put+0x5f/0x190
> > blkdev_close+0x24/0x27
> > __fput+0xee/0x18a
> > task_work_run+0x79/0xa0
> > prepare_exit_to_usermode+0x9b/0xb5
> > entry_SYSCALL_64_fastpath+0xab/0xad
>
> Byungchul, did you add the crosslock checks to lockdep? Can you have a look at
> the above report? That report namely doesn't make sense to me.

The report is talking about the following lockup:

A work in a worker                     A task work on exit to user
------------------                     ---------------------------
mutex_lock(&bdev->bd_mutex)
                                       mutex_lock(&bdev->bd_mutex)
blk_execute_rq()
wait_for_completion_io_timeout(&A)
                                       complete(&A)

Is this impossible?

To Peterz,

Anyway, I wanted to avoid lockdep reports for waits that use a timeout
interface. Do you think this kind of lockup is still worth reporting?
I'm OK with it if you do.
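
[Editor's sketch] The scenario above comes down to two ordering facts: the worker waits on the completion while holding bd_mutex, and the exiting task takes bd_mutex before it can call complete(). A minimal userspace model of how a dependency graph flags that pair is shown below. It is not lockdep's implementation; every name in it (dep_graph, add_dependency, replay_report, BD_MUTEX, WAIT_COMPLETION) is hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CLASSES 8

/* dep[a][b] is true when class b was acquired while class a was held (a -> b). */
struct dep_graph {
	bool dep[MAX_CLASSES][MAX_CLASSES];
};

static void graph_init(struct dep_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool reachable(const struct dep_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_CLASSES; next++)
		if (g->dep[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	return false;
}

/*
 * Record the edge "held -> next". Returns false, and records nothing, when
 * 'held' is already reachable from 'next', i.e. the new edge would close a
 * cycle.
 */
static bool add_dependency(struct dep_graph *g, int held, int next)
{
	bool seen[MAX_CLASSES] = { false };

	assert(held >= 0 && held < MAX_CLASSES);
	assert(next >= 0 && next < MAX_CLASSES);
	if (reachable(g, next, held, seen))
		return false;
	g->dep[held][next] = true;
	return true;
}

/* Hypothetical class ids standing in for the two locks in the report. */
enum { BD_MUTEX, WAIT_COMPLETION };

/* Replays both contexts; returns true when a circular dependency is flagged. */
static bool replay_report(void)
{
	struct dep_graph g;

	graph_init(&g);
	/* Worker: holds bd_mutex, then waits on the completion. */
	if (!add_dependency(&g, BD_MUTEX, WAIT_COMPLETION))
		return true;
	/*
	 * Exit path: bd_mutex is taken before complete() is reached, so the
	 * completion is treated as preceding bd_mutex.
	 */
	return !add_dependency(&g, WAIT_COMPLETION, BD_MUTEX);
}
```

In this model replay_report() returns true because the second edge closes a cycle. Like the real report, the model has no notion of the wait being bounded by a timeout.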

2017-08-23 02:36:35

Subject: Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

On (08/23/17 09:03), Byungchul Park wrote:
[..]

aha, ok

> The report is talking about the following lockup:
>
> A work in a worker                     A task work on exit to user
> ------------------                     ---------------------------
> mutex_lock(&bdev->bd_mutex)
>                                        mutex_lock(&bdev->bd_mutex)
> blk_execute_rq()
> wait_for_completion_io_timeout(&A)
>                                        complete(&A)
>
> Is this impossible?

I was really confused about how this "unlock" could lead to a deadlock

> > > other info that might help us debug this:
> > > Possible unsafe locking scenario by crosslock:
> > > CPU0 CPU1
> > > ---- ----
> > > lock(&bdev->bd_mutex);
> > > lock((complete)&wait#2);
> > > lock(&bdev->bd_mutex);
> > > unlock((complete)&wait#2);

Any chance the report can be improved, e.g. by mentioning the timeout?
// well, if this functionality stays.

p.s.
Bart Van Assche, thanks for Cc-ing Park Byungchul. I was quite
sure I hadn't enabled cross-release, but apparently I was wrong:
CONFIG_LOCKDEP_CROSSRELEASE=y
CONFIG_LOCKDEP_COMPLETIONS=y

-ss

2017-08-23 02:59:13

Subject: Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

On Wed, Aug 23, 2017 at 11:36:49AM +0900, Sergey Senozhatsky wrote:
> On (08/23/17 09:03), Byungchul Park wrote:
> [..]
>
> aha, ok
>
> > The report is talking about the following lockup:
> >
> > A work in a worker                     A task work on exit to user
> > ------------------                     ---------------------------
> > mutex_lock(&bdev->bd_mutex)
> >                                        mutex_lock(&bdev->bd_mutex)
> > blk_execute_rq()
> > wait_for_completion_io_timeout(&A)
> >                                        complete(&A)
> >
> > Is this impossible?
>
> I was really confused how this "unlock" may lead to a deadlock

Hi Sergey,

Right. It should be enhanced.

>
> > > > other info that might help us debug this:
> > > > Possible unsafe locking scenario by crosslock:
> > > > CPU0 CPU1
> > > > ---- ----
> > > > lock(&bdev->bd_mutex);
> > > > lock((complete)&wait#2);
> > > > lock(&bdev->bd_mutex);
> > > > unlock((complete)&wait#2);
>
>
> any chance the report can be improved? mention timeout, etc?
> // well, if this functionality will stay.
>
>
> p.s.
> Bart Van Assche, thanks for Cc-ing Park Byungchul, I was really
> sure I didn't enabled the cross-release, but apparently I was wrong:
> CONFIG_LOCKDEP_CROSSRELEASE=y
> CONFIG_LOCKDEP_COMPLETIONS=y
>
> -ss

2017-08-23 03:49:34

Subject: Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

Hi Byungchul,

On Wed, Aug 23, 2017 at 09:03:04AM +0900, Byungchul Park wrote:
> On Tue, Aug 22, 2017 at 09:43:56PM +0000, Bart Van Assche wrote:
> > On Tue, 2017-08-22 at 19:47 +0900, Sergey Senozhatsky wrote:
> > > ======================================================
> > > WARNING: possible circular locking dependency detected
> > > 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746 Not tainted
> > > ------------------------------------------------------
> > > fsck.ext4/148 is trying to acquire lock:
> > > (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> > >
> > > but now in release context of a crosslock acquired at the following:
> > > ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
> > >
> > > which lock already depends on the new lock.
> > >

I found this message really misleading, because the deadlock is detected
at the commit time of "((complete)&wait#2)" rather than at the acquisition
time of "(&bdev->bd_mutex)", so I made the following improvement.

Thoughts?

Regards,
Boqun

----------------------->8
From: Boqun Feng <[email protected]>
Date: Wed, 23 Aug 2017 10:18:30 +0800
Subject: [PATCH] lockdep: Improve the readability of crossrelease related
splats

When a crossrelease related deadlock is detected in a commit, the
current implementation makes splats like:

> fsck.ext4/148 is trying to acquire lock:
> (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
>
> but now in release context of a crosslock acquired at the following:
> ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
>
> which lock already depends on the new lock.
> ...

However, it could be misleading because the current task has got the
lock already, and in fact the deadlock is detected when it is doing the
commit of the crossrelease lock. So make the splats more accurate to
describe the deadlock case.

Signed-off-by: Boqun Feng <[email protected]>
---
kernel/locking/lockdep.c | 22 ++++++++++++++--------
1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 66011c9f5df3..642fb5362507 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1195,17 +1195,23 @@ print_circular_bug_header(struct lock_list *entry, unsigned int depth,
 	pr_warn("WARNING: possible circular locking dependency detected\n");
 	print_kernel_ident();
 	pr_warn("------------------------------------------------------\n");
-	pr_warn("%s/%d is trying to acquire lock:\n",
-		curr->comm, task_pid_nr(curr));
-	print_lock(check_src);
 
-	if (cross_lock(check_tgt->instance))
-		pr_warn("\nbut now in release context of a crosslock acquired at the following:\n");
-	else
+	if (cross_lock(check_tgt->instance)) {
+		pr_warn("%s/%d is committing a crossrelease lock:\n",
+			curr->comm, task_pid_nr(curr));
+		print_lock(check_tgt);
+		pr_warn("\n, with the following lock held:\n");
+		print_lock(check_src);
+		pr_warn("\non which lock the crossrelease lock already depends.\n\n");
+	} else {
+		pr_warn("%s/%d is trying to acquire lock:\n",
+			curr->comm, task_pid_nr(curr));
+		print_lock(check_src);
 		pr_warn("\nbut task is already holding lock:\n");
+		print_lock(check_tgt);
+		pr_warn("\nwhich lock already depends on the new lock.\n\n");
+	}
 
-	print_lock(check_tgt);
-	pr_warn("\nwhich lock already depends on the new lock.\n\n");
 	pr_warn("\nthe existing dependency chain (in reverse order) is:\n");
 
 	print_circular_bug_entry(entry, depth);
--
2.14.1

2017-08-23 04:37:53

Subject: Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

On Wed, Aug 23, 2017 at 11:49:51AM +0800, Boqun Feng wrote:
> Hi Byungchul,
>
> On Wed, Aug 23, 2017 at 09:03:04AM +0900, Byungchul Park wrote:
> > On Tue, Aug 22, 2017 at 09:43:56PM +0000, Bart Van Assche wrote:
> > > On Tue, 2017-08-22 at 19:47 +0900, Sergey Senozhatsky wrote:
> > > > ======================================================
> > > > WARNING: possible circular locking dependency detected
> > > > 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746 Not tainted
> > > > ------------------------------------------------------
> > > > fsck.ext4/148 is trying to acquire lock:
> > > > (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> > > >
> > > > but now in release context of a crosslock acquired at the following:
> > > > ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
> > > >
> > > > which lock already depends on the new lock.
> > > >
>
> I felt this message really misleading, because the deadlock is detected
> at the commit time of "((complete)&wait#2)" rather than the acquisition
> time of "(&bdev->bd_mutex)", so I made the following improvement.
>
> Thoughts?
>
> Regards,
> Boqun
>

While I'm on this one, I think we should also add a case where @check_src
is a cross lock, i.e. where we detect a cross deadlock at the acquisition
time of the cross lock. How about the following?

Regards,
Boqun

--------------------------------------->8
From: Boqun Feng <[email protected]>
Date: Wed, 23 Aug 2017 12:12:16 +0800
Subject: [PATCH] lockdep: Print proper scenario if cross deadlock detected at
acquisition time

For a potential CROSSRELEASE deadlock such as the following:

        P1              P2
        ===========     =============
        lock(A)
        lock(X)
                        lock(A)
                        commit(X)

        A: normal lock, X: cross lock

we could detect it at two places:

1. commit time:

We have run P1 first, and have dependency A --> X in graph, and
then we run P2, and find the deadlock.

2. acquisition time:

We have run P2 first, and have dependency A --> X in the graph
(because another P3 may have run previously and be acquiring
lock X), and then we run P1 and find the deadlock.

In current print_circular_lock_scenario(), for 1) we could print the
right scenario and note that's a deadlock related to CROSSRELEASE,
however for 2) we print the scenario as a normal lockdep deadlock.

It's better to print a proper scenario related to CROSSRELEASE to help
users find their bugs more easily, so improve this.

Signed-off-by: Boqun Feng <[email protected]>
---
kernel/locking/lockdep.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 642fb5362507..a3709e15f609 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1156,6 +1156,23 @@ print_circular_lock_scenario(struct held_lock *src,
 		__print_lock_name(target);
 		printk(KERN_CONT ");\n");
 		printk("\n *** DEADLOCK ***\n\n");
+	} else if (cross_lock(src->instance)) {
+		printk(" Possible unsafe locking scenario by crosslock:\n\n");
+		printk(" CPU0 CPU1\n");
+		printk(" ---- ----\n");
+		printk(" lock(");
+		__print_lock_name(target);
+		printk(KERN_CONT ");\n");
+		printk(" lock(");
+		__print_lock_name(source);
+		printk(KERN_CONT ");\n");
+		printk(" lock(");
+		__print_lock_name(parent == source ? target : parent);
+		printk(KERN_CONT ");\n");
+		printk(" unlock(");
+		__print_lock_name(source);
+		printk(KERN_CONT ");\n");
+		printk("\n *** DEADLOCK ***\n\n");
 	} else {
 		printk(" Possible unsafe locking scenario:\n\n");
 		printk(" CPU0 CPU1\n");
--
2.14.1
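
[Editor's sketch] The branch order this patch ends up with in print_circular_lock_scenario() is: target is a crosslock (commit time), else source is a crosslock (acquisition time), else the normal scenario. A small standalone C function can summarize it. This is only an illustration under assumptions: struct lock_info, its is_cross flag and enum scenario_kind are hypothetical stand-ins, not kernel types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct held_lock: only the facts needed here. */
struct lock_info {
	const char *name;
	bool is_cross;	/* true for a crosslock such as a completion */
};

enum scenario_kind {
	SCENARIO_CROSS_AT_COMMIT,	/* target is a crosslock */
	SCENARIO_CROSS_AT_ACQUIRE,	/* source is a crosslock */
	SCENARIO_NORMAL,		/* neither lock is a crosslock */
};

/* Mirrors the if / else if / else order of the patched function. */
static enum scenario_kind classify_scenario(const struct lock_info *src,
					    const struct lock_info *tgt)
{
	assert(src != NULL && tgt != NULL);
	if (tgt->is_cross)
		return SCENARIO_CROSS_AT_COMMIT;
	if (src->is_cross)
		return SCENARIO_CROSS_AT_ACQUIRE;
	return SCENARIO_NORMAL;
}
```

With a normal lock A and a crosslock X, classify_scenario(&A, &X) gives the commit-time case and classify_scenario(&X, &A) gives the acquisition-time case that this patch adds.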

2017-08-23 04:46:24

Subject: Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

On Wed, Aug 23, 2017 at 11:49:51AM +0800, Boqun Feng wrote:
> Hi Byungchul,
>
> On Wed, Aug 23, 2017 at 09:03:04AM +0900, Byungchul Park wrote:
> > On Tue, Aug 22, 2017 at 09:43:56PM +0000, Bart Van Assche wrote:
> > > On Tue, 2017-08-22 at 19:47 +0900, Sergey Senozhatsky wrote:
> > > > ======================================================
> > > > WARNING: possible circular locking dependency detected
> > > > 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746 Not tainted
> > > > ------------------------------------------------------
> > > > fsck.ext4/148 is trying to acquire lock:
> > > > (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> > > >
> > > > but now in release context of a crosslock acquired at the following:
> > > > ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
> > > >
> > > > which lock already depends on the new lock.
> > > >
>
> I felt this message really misleading, because the deadlock is detected
> at the commit time of "((complete)&wait#2)" rather than the acquisition
> time of "(&bdev->bd_mutex)", so I made the following improvement.
>
> Thoughts?
>
> Regards,
> Boqun
>
> ----------------------->8
> From: Boqun Feng <[email protected]>
> Date: Wed, 23 Aug 2017 10:18:30 +0800
> Subject: [PATCH] lockdep: Improve the readibility of crossrelease related
> splats
>
> When a crossrelease related deadlock is detected in a commit, the
> current implemention makes splats like:
>
> > fsck.ext4/148 is trying to acquire lock:
> > (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> >
> > but now in release context of a crosslock acquired at the following:
> > ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
> >
> > which lock already depends on the new lock.
> > ...
>
> However, it could be misleading because the current task has got the
> lock already, and in fact the deadlock is detected when it is doing the
> commit of the crossrelease lock. So make the splats more accurate to
> describe the deadlock case.
>
> Signed-off-by: Boqun Feng <[email protected]>
> ---
> kernel/locking/lockdep.c | 22 ++++++++++++++--------
> 1 file changed, 14 insertions(+), 8 deletions(-)
>
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 66011c9f5df3..642fb5362507 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -1195,17 +1195,23 @@ print_circular_bug_header(struct lock_list *entry, unsigned int depth,
> pr_warn("WARNING: possible circular locking dependency detected\n");
> print_kernel_ident();
> pr_warn("------------------------------------------------------\n");
> - pr_warn("%s/%d is trying to acquire lock:\n",
> - curr->comm, task_pid_nr(curr));
> - print_lock(check_src);
>
> - if (cross_lock(check_tgt->instance))
> - pr_warn("\nbut now in release context of a crosslock acquired at the following:\n");
> - else
> + if (cross_lock(check_tgt->instance)) {
> + pr_warn("%s/%d is committing a crossrelease lock:\n",
> + curr->comm, task_pid_nr(curr));

I think it would be better to print something in terms of acquisition,
since the following print_lock() will print information about the acquisition.

> + print_lock(check_tgt);
> + pr_warn("\n, with the following lock held:\n");

The lock does not have to be held at the commit.

> + print_lock(check_src);
> + pr_warn("\non which lock the crossrelease lock already depends.\n\n");
> + } else {
> + pr_warn("%s/%d is trying to acquire lock:\n",
> + curr->comm, task_pid_nr(curr));
> + print_lock(check_src);
> pr_warn("\nbut task is already holding lock:\n");
> + print_lock(check_tgt);
> + pr_warn("\nwhich lock already depends on the new lock.\n\n");
> + }
>
> - print_lock(check_tgt);
> - pr_warn("\nwhich lock already depends on the new lock.\n\n");
> pr_warn("\nthe existing dependency chain (in reverse order) is:\n");
>
> print_circular_bug_entry(entry, depth);
> --
> 2.14.1

2017-08-23 04:46:34

Subject: Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

On (08/23/17 12:38), Boqun Feng wrote:
[..]
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 642fb5362507..a3709e15f609 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -1156,6 +1156,23 @@ print_circular_lock_scenario(struct held_lock *src,
> __print_lock_name(target);
> printk(KERN_CONT ");\n");

KERN_CONT and "\n" should not be together. "\n" flushes the cont
buffer immediately.

-ss

> printk("\n *** DEADLOCK ***\n\n");
> + } else if (cross_lock(src->instance)) {
> + printk(" Possible unsafe locking scenario by crosslock:\n\n");
> + printk(" CPU0 CPU1\n");
> + printk(" ---- ----\n");
> + printk(" lock(");
> + __print_lock_name(target);
> + printk(KERN_CONT ");\n");
> + printk(" lock(");
> + __print_lock_name(source);
> + printk(KERN_CONT ");\n");
> + printk(" lock(");
> + __print_lock_name(parent == source ? target : parent);
> + printk(KERN_CONT ");\n");
> + printk(" unlock(");
> + __print_lock_name(source);
> + printk(KERN_CONT ");\n");
> + printk("\n *** DEADLOCK ***\n\n");
> } else {
> printk(" Possible unsafe locking scenario:\n\n");
> printk(" CPU0 CPU1\n");
> --
> 2.14.1
>
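
[Editor's sketch] The remark above says that a fragment ending in "\n" flushes the continuation buffer. A toy userspace model of that behavior is given below: fragments without a trailing newline accumulate, and a newline-terminated fragment completes the line and empties the buffer. This is not the kernel's printk code; struct cont_buf, cont_init and cont_print are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LINE_MAX_LEN 128

/* Toy continuation buffer: pending text plus the last completed line. */
struct cont_buf {
	char pending[LINE_MAX_LEN];	/* fragments waiting for a newline */
	char last_line[LINE_MAX_LEN];	/* most recently completed line */
	int lines_emitted;
};

static void cont_init(struct cont_buf *cb)
{
	memset(cb, 0, sizeof(*cb));
}

/* Append a fragment; a trailing '\n' completes the line and empties the buffer. */
static void cont_print(struct cont_buf *cb, const char *frag)
{
	size_t used, len;

	assert(cb != NULL && frag != NULL);
	used = strlen(cb->pending);
	len = strlen(frag);

	if (len > 0 && frag[len - 1] == '\n') {
		/* Flush: pending text plus this fragment without its newline. */
		snprintf(cb->last_line, sizeof(cb->last_line), "%s%.*s",
			 cb->pending, (int)(len - 1), frag);
		cb->pending[0] = '\0';
		cb->lines_emitted++;
	} else {
		/* No newline yet: keep accumulating (truncates if too long). */
		snprintf(cb->pending + used, sizeof(cb->pending) - used,
			 "%s", frag);
	}
}
```

Feeding it the three fragments " lock(", "&bdev->bd_mutex" and ");\n" leaves one completed line and an empty pending buffer, which is the pattern the quoted printk() calls rely on.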

2017-08-23 05:02:00

Subject: Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

On Wed, Aug 23, 2017 at 01:46:17PM +0900, Byungchul Park wrote:
> On Wed, Aug 23, 2017 at 11:49:51AM +0800, Boqun Feng wrote:
> > Hi Byungchul,
> >
> > On Wed, Aug 23, 2017 at 09:03:04AM +0900, Byungchul Park wrote:
> > > On Tue, Aug 22, 2017 at 09:43:56PM +0000, Bart Van Assche wrote:
> > > > On Tue, 2017-08-22 at 19:47 +0900, Sergey Senozhatsky wrote:
> > > > > ======================================================
> > > > > WARNING: possible circular locking dependency detected
> > > > > 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746 Not tainted
> > > > > ------------------------------------------------------
> > > > > fsck.ext4/148 is trying to acquire lock:
> > > > > (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> > > > >
> > > > > but now in release context of a crosslock acquired at the following:
> > > > > ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
> > > > >
> > > > > which lock already depends on the new lock.
> > > > >
> >
> > I felt this message really misleading, because the deadlock is detected
> > at the commit time of "((complete)&wait#2)" rather than the acquisition
> > time of "(&bdev->bd_mutex)", so I made the following improvement.
> >
> > Thoughts?
> >
> > Regards,
> > Boqun
> >
> > ----------------------->8
> > From: Boqun Feng <[email protected]>
> > Date: Wed, 23 Aug 2017 10:18:30 +0800
> > Subject: [PATCH] lockdep: Improve the readability of crossrelease related
> > splats
> >
> > When a crossrelease related deadlock is detected in a commit, the
> > current implementation makes splats like:
> >
> > > fsck.ext4/148 is trying to acquire lock:
> > > (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> > >
> > > but now in release context of a crosslock acquired at the following:
> > > ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
> > >
> > > which lock already depends on the new lock.
> > > ...
> >
> > However, it could be misleading because the current task has got the
> > lock already, and in fact the deadlock is detected when it is doing the
> > commit of the crossrelease lock. So make the splats more accurate to
> > describe the deadlock case.
> >
> > Signed-off-by: Boqun Feng <[email protected]>
> > ---
> > kernel/locking/lockdep.c | 22 ++++++++++++++--------
> > 1 file changed, 14 insertions(+), 8 deletions(-)
> >
> > diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> > index 66011c9f5df3..642fb5362507 100644
> > --- a/kernel/locking/lockdep.c
> > +++ b/kernel/locking/lockdep.c
> > @@ -1195,17 +1195,23 @@ print_circular_bug_header(struct lock_list *entry, unsigned int depth,
> > pr_warn("WARNING: possible circular locking dependency detected\n");
> > print_kernel_ident();
> > pr_warn("------------------------------------------------------\n");
> > - pr_warn("%s/%d is trying to acquire lock:\n",
> > - curr->comm, task_pid_nr(curr));
> > - print_lock(check_src);
> >
> > - if (cross_lock(check_tgt->instance))
> > - pr_warn("\nbut now in release context of a crosslock acquired at the following:\n");
> > - else
> > + if (cross_lock(check_tgt->instance)) {
> > + pr_warn("%s/%d is committing a crossrelease lock:\n",
> > + curr->comm, task_pid_nr(curr));
>
> I think it would be better to print something in terms of acquisition,
> since the following print_lock() will print information about the acquisition.
>

Well, that print_lock() will print the cross lock's acquisition
information, which comes from another context, while the current thread
is doing the commit. So I think the information would be a little
misleading. I will add "acquired at" to indicate that the lock
information is for the acquisition.

> > + print_lock(check_tgt);
> > + pr_warn("\n, with the following lock held:\n");
>
> The lock does not have to be held at the commit.
>

Ah.. right.

How about this:

pr_warn("%s/%d is committing a crossrelease lock acquired at:\n",
curr->comm, task_pid_nr(curr));
print_lock(check_tgt);
pr_warn("\n, after having the following lock held at least once:\n");

Regards,
Boqun

> > + print_lock(check_src);
> > + pr_warn("\non which lock the crossrelease lock already depends.\n\n");
> > + } else {
> > + pr_warn("%s/%d is trying to acquire lock:\n",
> > + curr->comm, task_pid_nr(curr));
> > + print_lock(check_src);
> > pr_warn("\nbut task is already holding lock:\n");
> > + print_lock(check_tgt);
> > + pr_warn("\nwhich lock already depends on the new lock.\n\n");
> > + }
> >
> > - print_lock(check_tgt);
> > - pr_warn("\nwhich lock already depends on the new lock.\n\n");
> > pr_warn("\nthe existing dependency chain (in reverse order) is:\n");
> >
> > print_circular_bug_entry(entry, depth);
> > --
> > 2.14.1

2017-08-23 05:35:10

Subject: Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

On Wed, Aug 23, 2017 at 01:46:48PM +0900, Sergey Senozhatsky wrote:
> On (08/23/17 12:38), Boqun Feng wrote:
> [..]
> > diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> > index 642fb5362507..a3709e15f609 100644
> > --- a/kernel/locking/lockdep.c
> > +++ b/kernel/locking/lockdep.c
> > @@ -1156,6 +1156,23 @@ print_circular_lock_scenario(struct held_lock *src,
> > __print_lock_name(target);
> > printk(KERN_CONT ");\n");
>
> KERN_CONT and "\n" should not be together. "\n" flushes the cont
> buffer immediately.
>

Hmm.. I'm not very familiar with printk(), but I can see several
uses of printk(KERN_CONT "...\n") in the kernel.

I did a bit of research myself, and I now think the inappropriate use is
a KERN_CONT printk *after* another printk ending with a "\n". Am I
missing some recent changes or rules of KERN_CONT?

Regards,
Boqun

> -ss
>
> > printk("\n *** DEADLOCK ***\n\n");
> > + } else if (cross_lock(src->instance)) {
> > + printk(" Possible unsafe locking scenario by crosslock:\n\n");
> > + printk(" CPU0 CPU1\n");
> > + printk(" ---- ----\n");
> > + printk(" lock(");
> > + __print_lock_name(target);
> > + printk(KERN_CONT ");\n");
> > + printk(" lock(");
> > + __print_lock_name(source);
> > + printk(KERN_CONT ");\n");
> > + printk(" lock(");
> > + __print_lock_name(parent == source ? target : parent);
> > + printk(KERN_CONT ");\n");
> > + printk(" unlock(");
> > + __print_lock_name(source);
> > + printk(KERN_CONT ");\n");
> > + printk("\n *** DEADLOCK ***\n\n");
> > } else {
> > printk(" Possible unsafe locking scenario:\n\n");
> > printk(" CPU0 CPU1\n");
> > --
> > 2.14.1
> >

2017-08-23 05:44:22

Subject: Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

On Wed, Aug 23, 2017 at 12:38:13PM +0800, Boqun Feng wrote:
> From: Boqun Feng <[email protected]>
> Date: Wed, 23 Aug 2017 12:12:16 +0800
> Subject: [PATCH] lockdep: Print proper scenario if cross deadlock detected at
> acquisition time
>
> For a potential deadlock about CROSSRELEASE as follow:
>
>   P1                  P2
>   ===========         =============
>   lock(A)
>   lock(X)
>                       lock(A)
>                       commit(X)
>
> A: normal lock, X: cross lock
>
> , we could detect it at two places:
>
> 1. commit time:
>
> We have run P1 first, and have dependency A --> X in graph, and
> then we run P2, and find the deadlock.
>
> 2. acquisition time:
>
> We have run P2 first, and have dependency A --> X, in

X -> A

> graph(because another P3 may run previously and is acquiring for

".. another P3 may have run previously and was holding .."
^
Additionally, it need not be P3; P2 itself can do it, like:

   lock(A)
   lock(X)
                       lock(X) // I mean it's at _P2_
                       lock(A)
                       commit(X)

> lock X), and then we run P1 and find the deadlock.
>
> In current print_circular_lock_scenario(), for 1) we could print the
> right scenario and note that's a deadlock related to CROSSRELEASE,
> however for 2) we print the scenario as a normal lockdep deadlock.
>
> It's better to print a proper scenario related to CROSSRELEASE to help
> users find their bugs more easily, so improve this.
>
> Signed-off-by: Boqun Feng <[email protected]>
> ---
> kernel/locking/lockdep.c | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 642fb5362507..a3709e15f609 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -1156,6 +1156,23 @@ print_circular_lock_scenario(struct held_lock *src,
> __print_lock_name(target);
> printk(KERN_CONT ");\n");
> printk("\n *** DEADLOCK ***\n\n");
> + } else if (cross_lock(src->instance)) {
> + printk(" Possible unsafe locking scenario by crosslock:\n\n");
> + printk(" CPU0 CPU1\n");
> + printk(" ---- ----\n");
> + printk(" lock(");
> + __print_lock_name(target);
> + printk(KERN_CONT ");\n");
> + printk(" lock(");
> + __print_lock_name(source);
> + printk(KERN_CONT ");\n");
> + printk(" lock(");
> + __print_lock_name(parent == source ? target : parent);
> + printk(KERN_CONT ");\n");
> + printk(" unlock(");
> + __print_lock_name(source);
> + printk(KERN_CONT ");\n");
> + printk("\n *** DEADLOCK ***\n\n");
> } else {
> printk(" Possible unsafe locking scenario:\n\n");
> printk(" CPU0 CPU1\n");

I need time to be sure if it's correct.
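
The two detection points described in the quoted changelog can be
illustrated with a small userspace model. This is only a sketch under
stated assumptions, not lockdep's implementation: the adjacency matrix
dep[][], the helpers reachable(), add_dependency() and reset_graph(), and
the LOCK_A/LOCK_X ids are hypothetical names made up for illustration. It
assumes, following the discussion above, that P1 contributes the edge
A -> X and that P2's commit contributes X -> A.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CLASSES 8

/* Hypothetical lock class ids: A is a normal lock, X is a cross lock. */
enum { LOCK_A, LOCK_X };

/* dep[a][b] is true when the model graph holds the edge a -> b. */
static bool dep[MAX_CLASSES][MAX_CLASSES];

/* Forget all recorded dependencies. */
static void reset_graph(void)
{
	memset(dep, 0, sizeof(dep));
}

/* Depth-first search: is 'to' reachable from 'from'? */
static bool reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_CLASSES; next++)
		if (dep[from][next] && !seen[next] &&
		    reachable(next, to, seen))
			return true;
	return false;
}

/*
 * Try to record prev -> next. Returns false and leaves the graph
 * unchanged if next already reaches prev, i.e. if the new edge would
 * close a cycle.
 */
static bool add_dependency(int prev, int next)
{
	bool seen[MAX_CLASSES] = { false };

	assert(prev >= 0 && prev < MAX_CLASSES);
	assert(next >= 0 && next < MAX_CLASSES);
	if (reachable(next, prev, seen))
		return false;
	dep[prev][next] = true;
	return true;
}
```

In this model, running P1 first records A -> X, so the later attempt to
add X -> A (the commit in P2) is refused. Running P2 first records
X -> A, so the later attempt to add A -> X (the acquisition in P1) is
refused. The cycle is the same in both orders; only the place where it is
noticed differs.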

2017-08-23 05:44:27

Subject: Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

On (08/23/17 13:35), Boqun Feng wrote:
[..]
> > > printk(KERN_CONT ");\n");
> >
> > KERN_CONT and "\n" should not be together. "\n" flushes the cont
> > buffer immediately.
> >
>
> Hmm.. Not quite familiar with printk() stuffs, but I could see several
> usages of printk(KERN_CONT "...\n") in kernel.
>
> Did a bit research myself, and I now think the inappropriate use is to
> use a KERN_CONT printk *after* another printk ending with a "\n". Am I
> missing some recent changes or rules of KERN_CONT?

it has been this way for quite some time (if not always).

LOG_NEWLINE results in cont_flush(), which log_store()-s the content
of the KERN_CONT buffer.

if we see that the supplied message has no \n then we store it in a
dedicated buffer (the cont buffer)

	if (!(lflags & LOG_NEWLINE))
		return cont_add();

	return log_store();

we flush that buffer (move its content to the kernel log buffer) when
we receive a message with a \n or when a printk() from another task/context
interrupts the current cont line and, thus, forces us to flush.
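
The behaviour described above can be mimicked in userspace. The following
is a deliberately simplified sketch, not the kernel's printk code:
model_printk(), model_cont_flush(), cont_buf and log_lines are
hypothetical names, and the is_cont argument stands in for the KERN_CONT
prefix. It assumes only what the explanation states: text without a
trailing "\n" is parked in a cont buffer, a "\n" flushes it, and a
message that is not a continuation forces a flush of whatever is pending.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CONT_MAX 128
#define LOG_MAX  8

/* Simplified userspace model; not the kernel's data structures. */
static char cont_buf[CONT_MAX];           /* pending, unterminated text */
static char log_lines[LOG_MAX][CONT_MAX]; /* flushed records */
static int log_count;

/* Move the pending fragment, if any, into the log as one record. */
static void model_cont_flush(void)
{
	if (cont_buf[0] == '\0')
		return;
	assert(log_count < LOG_MAX);
	strcpy(log_lines[log_count++], cont_buf);
	cont_buf[0] = '\0';
}

/*
 * is_cont models a message tagged with KERN_CONT.
 * A message that is not a continuation starts a new record, so any
 * pending fragment is flushed first. A trailing '\n' ends the record
 * and flushes immediately.
 */
static void model_printk(int is_cont, const char *msg)
{
	size_t len = strlen(msg);
	int newline = len > 0 && msg[len - 1] == '\n';

	if (!is_cont)
		model_cont_flush();

	assert(strlen(cont_buf) + len < CONT_MAX);
	/* Append the text without its trailing newline. */
	strncat(cont_buf, msg, newline ? len - 1 : len);

	if (newline)
		model_cont_flush();
}

/* Print every flushed record, one per line. */
static void model_dump(void)
{
	for (int i = 0; i < log_count; i++)
		printf("%s\n", log_lines[i]);
}
```

Feeding the model " lock(", then a continuation with the lock name, then
a continuation ");\n" yields a single record. If an unrelated
non-continuation message arrives in between, the pending fragment is
flushed early and the line ends up split across records.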

-ss

2017-08-23 05:55:03

Subject: Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

On (08/23/17 13:35), Boqun Feng wrote:
> > KERN_CONT and "\n" should not be together. "\n" flushes the cont
> > buffer immediately.
> >
>
> Hmm.. Not quite familiar with printk() stuffs, but I could see several
> usages of printk(KERN_CONT "...\n") in kernel.
>
> Did a bit research myself, and I now think the inappropriate use is to
> use a KERN_CONT printk *after* another printk ending with a "\n".

ah... I didn't check __print_lock_name(): it leaves an unflushed cont
buffer upon return. sorry, your code is correct.

-ss

> > > printk("\n *** DEADLOCK ***\n\n");
> > > + } else if (cross_lock(src->instance)) {
> > > + printk(" Possible unsafe locking scenario by crosslock:\n\n");
> > > + printk(" CPU0 CPU1\n");
> > > + printk(" ---- ----\n");
> > > + printk(" lock(");
> > > + __print_lock_name(target);
> > > + printk(KERN_CONT ");\n");
> > > + printk(" lock(");
> > > + __print_lock_name(source);
> > > + printk(KERN_CONT ");\n");
> > > + printk(" lock(");
> > > + __print_lock_name(parent == source ? target : parent);
> > > + printk(KERN_CONT ");\n");
> > > + printk(" unlock(");
> > > + __print_lock_name(source);
> > > + printk(KERN_CONT ");\n");
> > > + printk("\n *** DEADLOCK ***\n\n");
> > > } else {
> > > printk(" Possible unsafe locking scenario:\n\n");
> > > printk(" CPU0 CPU1\n");
> > > --
> > > 2.14.1
> > >

2017-08-23 07:53:18

by Peter Zijlstra

Subject: Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

On Wed, Aug 23, 2017 at 09:03:04AM +0900, Byungchul Park wrote:
> On Tue, Aug 22, 2017 at 09:43:56PM +0000, Bart Van Assche wrote:

> The report is talking about the following lockup:
>
> A work in a worker                     A task work on exit to user
> ------------------                     ---------------------------
> mutex_lock(&bdev->bd_mutex)
>                                        mutex_lock(&bdev->bd_mutex)
> blk_execute_rq()
>    wait_for_completion_io_timeout(&A)
>                                        complete(&A)
>
> Is this impossible?
>
> To Peterz,
>
> Anyway I wanted to avoid lockdep reports in the case using a timeout
> interface. Do you think it's still worth reporting the kind of lockup?

Yes, people might not have expected to hit the timeout on this. They
might think timeout means a dead device or something like that.

I'd like to hear from the block folks whether this was constructed this
way on purpose, though.

2017-08-24 04:38:50

Subject: Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

On Wed, Aug 23, 2017 at 02:55:17PM +0900, Sergey Senozhatsky wrote:
> On (08/23/17 13:35), Boqun Feng wrote:
> > > KERN_CONT and "\n" should not be together. "\n" flushes the cont
> > > buffer immediately.
> > >
> >
> > Hmm.. Not quite familiar with printk() stuffs, but I could see several
> > usages of printk(KERN_CONT "...\n") in kernel.
> >
> > Did a bit research myself, and I now think the inappropriate use is to
> > use a KERN_CONT printk *after* another printk ending with a "\n".
>
> ah... I didn't check __print_lock_name(): it leaves unflushed cont buffer
> upon the return. sorry, your code is correct.
>

So that means printk(KERN_CONT "..."); + printk(KERN_CONT "...\n") is a
correct usage, right? Thanks. Again, I'm not familiar with printk, so
I'm glad you can help me go through this ;-)

Regards,
Boqun

> -ss
>
> > > > printk("\n *** DEADLOCK ***\n\n");
> > > > + } else if (cross_lock(src->instance)) {
> > > > + printk(" Possible unsafe locking scenario by crosslock:\n\n");
> > > > + printk(" CPU0 CPU1\n");
> > > > + printk(" ---- ----\n");
> > > > + printk(" lock(");
> > > > + __print_lock_name(target);
> > > > + printk(KERN_CONT ");\n");
> > > > + printk(" lock(");
> > > > + __print_lock_name(source);
> > > > + printk(KERN_CONT ");\n");
> > > > + printk(" lock(");
> > > > + __print_lock_name(parent == source ? target : parent);
> > > > + printk(KERN_CONT ");\n");
> > > > + printk(" unlock(");
> > > > + __print_lock_name(source);
> > > > + printk(KERN_CONT ");\n");
> > > > + printk("\n *** DEADLOCK ***\n\n");
> > > > } else {
> > > > printk(" Possible unsafe locking scenario:\n\n");
> > > > printk(" CPU0 CPU1\n");
> > > > --
> > > > 2.14.1
> > > >
>
>

2017-08-24 04:48:58

Subject: Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

Hi,

On (08/24/17 12:39), Boqun Feng wrote:
> On Wed, Aug 23, 2017 at 02:55:17PM +0900, Sergey Senozhatsky wrote:
> > On (08/23/17 13:35), Boqun Feng wrote:
> > > > KERN_CONT and "\n" should not be together. "\n" flushes the cont
> > > > buffer immediately.
> > > >
> > >
> > > Hmm.. Not quite familiar with printk() stuffs, but I could see several
> > > usages of printk(KERN_CONT "...\n") in kernel.
> > >
> > > Did a bit research myself, and I now think the inappropriate use is to
> > > use a KERN_CONT printk *after* another printk ending with a "\n".
> >
> > ah... I didn't check __print_lock_name(): it leaves unflushed cont buffer
> > upon the return. sorry, your code is correct.
> >
>
> So means printk(KERN_CON "..."); + printk(KERN_CONT "...\n") is a
> correct usage, right?

well, yes. with one precondition - there should be no printk-s from other
CPUs/tasks in between

	printk(KERN_CONT "..."); + printk(KERN_CONT "...\n")
	                         ^
	here we can have a premature flush and a broken
	cont line. but it's been this way forever.

-ss

2017-08-30 05:18:17

Subject: Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

On (08/23/17 09:03), Byungchul Park wrote:
[..]
> > Byungchul, did you add the crosslock checks to lockdep? Can you have a look at
> > the above report? That report namely doesn't make sense to me.
>
> The report is talking about the following lockup:
>
> A work in a worker                     A task work on exit to user
> ------------------                     ---------------------------
> mutex_lock(&bdev->bd_mutex)
>                                        mutex_lock(&bdev->bd_mutex)
> blk_execute_rq()
>    wait_for_completion_io_timeout(&A)
>                                        complete(&A)
>
[..]
> To Peterz,
>
> Anyway I wanted to avoid lockdep reports in the case using a timeout
> interface. Do you think it's still worth reporting the kind of lockup?
> I'm ok if you do.

Byungchul, a quick question.
have you measured the performance impact? somehow my linux-next is
notably slower than earlier 4.13 linux-next. (e.g. scrolling in vim
is irritatingly slow)

`time dmesg' shows some difference, but probably that's not a good
test.

        !LOCKDEP        LOCKDEP         LOCKDEP -CROSSRELEASE -COMPLETIONS
real    0m0.661s        0m2.290s        0m1.920s
user    0m0.010s        0m0.105s        0m0.000s
sys     0m0.636s        0m2.224s        0m1.888s

anyone else "sees"/"can confirm" the slow down?

it gets back to "usual normal" when I disable CROSSRELEASE and COMPLETIONS.

---

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index b19c491cbc4e..cdc30ef81c5e 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1091,8 +1091,6 @@ config PROVE_LOCKING
select DEBUG_MUTEXES
select DEBUG_RT_MUTEXES if RT_MUTEXES
select DEBUG_LOCK_ALLOC
- select LOCKDEP_CROSSRELEASE
- select LOCKDEP_COMPLETIONS
select TRACE_IRQFLAGS
default n
help

---

-ss

2017-08-30 05:43:41

Subject: Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

On Wed, Aug 30, 2017 at 02:20:37PM +0900, Sergey Senozhatsky wrote:
> Byungchul, a quick question.

Hello Sergey,

> have you measured the performance impact? somehow my linux-next is

Yeah, it inevitably has some performance impact.

> notably slower than earlier 4.13 linux-next. (e.g. scrolling in vim
> is irritatingly slow)

To Ingo,

I cannot decide whether we have to roll back the selection of
CONFIG_LOCKDEP_CROSSRELEASE by CONFIG_PROVE_LOCKING in Kconfig. With them
enabled, lockdep detection becomes stronger but has a performance impact.
But it's a debug option anyway, so IMHO we don't have to take care of the
performance impact. Please let me know your decision.

> `time dmesg' shows some difference, but probably that's not a good
> test.
>
>         !LOCKDEP        LOCKDEP         LOCKDEP -CROSSRELEASE -COMPLETIONS
> real    0m0.661s        0m2.290s        0m1.920s
> user    0m0.010s        0m0.105s        0m0.000s
> sys     0m0.636s        0m2.224s        0m1.888s
>
> anyone else "sees"/"can confirm" the slow down?
>
>
> it gets back to "usual normal" when I disable CROSSRELEASE and COMPLETIONS.
>
> ---
>
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index b19c491cbc4e..cdc30ef81c5e 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1091,8 +1091,6 @@ config PROVE_LOCKING
> select DEBUG_MUTEXES
> select DEBUG_RT_MUTEXES if RT_MUTEXES
> select DEBUG_LOCK_ALLOC
> - select LOCKDEP_CROSSRELEASE
> - select LOCKDEP_COMPLETIONS
> select TRACE_IRQFLAGS
> default n
> help
>
> ---
>
> -ss

2017-08-30 06:12:50

Subject: Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

Hi,

On (08/30/17 14:43), Byungchul Park wrote:
[..]
> > notably slower than earlier 4.13 linux-next. (e.g. scrolling in vim
> > is irritatingly slow)
>
> To Ingo,
>
> I cannot decide if we have to roll back CONFIG_LOCKDEP_CROSSRELEASE
> dependency on CONFIG_PROVE_LOCKING in Kconfig. With them enabled,
> lockdep detection becomes strong but has performance impact. But,
> it's anyway a debug option so IMHO we don't have to take case of the
> performance impact. Please let me know your decision.

well, I expected it :)

I've been running lockdep enabled kernels for years, and was OK with
the performance. but now it's just too much and I'm looking at disabling
lockdep.

a more relevant test -- compilation of a relatively small project

 LOCKDEP -CROSSRELEASE -COMPLETIONS     LOCKDEP +CROSSRELEASE +COMPLETIONS

 real    1m23.722s                      real    2m9.969s
 user    4m11.300s                      user    4m15.458s
 sys     0m49.386s                      sys     2m3.594s

you don't want to know how long it now takes to recompile the
kernel ;)
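
For a rough sense of scale, the figures above can be turned into slowdown
factors. The helpers below, time_to_seconds() and slowdown(), are
hypothetical and exist only for this illustration; they assume the
"XmY.ZZZs" format printed by the shell's time builtin. With the
compilation numbers above they give roughly 1.55x for real and roughly
2.5x for sys, while user time is almost unchanged.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Convert a time(1)-style "XmY.ZZZs" string to seconds.
 * Returns -1.0 if the string does not match that format.
 */
static double time_to_seconds(const char *s)
{
	int minutes;
	double seconds;

	if (sscanf(s, "%dm%lfs", &minutes, &seconds) != 2)
		return -1.0;
	return minutes * 60.0 + seconds;
}

/* Slowdown factor of 'with' relative to 'without'. */
static double slowdown(const char *without, const char *with)
{
	double base = time_to_seconds(without);
	double cur = time_to_seconds(with);

	assert(base > 0.0 && cur >= 0.0);
	return cur / base;
}

/* Print the three factors for the compilation test quoted above. */
static void print_compile_slowdown(void)
{
	printf("real x%.2f\n", slowdown("1m23.722s", "2m9.969s"));
	printf("user x%.2f\n", slowdown("4m11.300s", "4m15.458s"));
	printf("sys  x%.2f\n", slowdown("0m49.386s", "2m3.594s"));
}
```

That nearly all of the extra time shows up as sys time is consistent
with the overhead coming from in-kernel lockdep work rather than from the
compiler itself.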

-ss

2017-08-30 08:42:21

by Peter Zijlstra

Subject: Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

On Wed, Aug 30, 2017 at 03:15:11PM +0900, Sergey Senozhatsky wrote:
> Hi,
>
> On (08/30/17 14:43), Byungchul Park wrote:
> [..]
> > > notably slower than earlier 4.13 linux-next. (e.g. scrolling in vim
> > > is irritatingly slow)
> >
> > To Ingo,
> >
> > I cannot decide if we have to roll back CONFIG_LOCKDEP_CROSSRELEASE
> > dependency on CONFIG_PROVE_LOCKING in Kconfig. With them enabled,
> > lockdep detection becomes strong but has performance impact. But,
> > it's anyway a debug option so IMHO we don't have to take case of the
> > performance impact. Please let me know your decision.
>
> well, I expected it :)
>
> I've been running lockdep enabled kernels for years, and was OK with
> the performance. but now it's just too much and I'm looking at disabling
> lockdep.
>
> a more relevant test -- compilation of a relatively small project
>
>  LOCKDEP -CROSSRELEASE -COMPLETIONS     LOCKDEP +CROSSRELEASE +COMPLETIONS
>
>  real    1m23.722s                      real    2m9.969s
>  user    4m11.300s                      user    4m15.458s
>  sys     0m49.386s                      sys     2m3.594s
>
>
> you don't want to know how much time now it takes to recompile the
> kernel ;)

Right,.. so when I look at perf annotate for __lock_acquire and
lock_release (the two most expensive lockdep functions in a kernel
profile) I don't actually see much cross-release stuff.

So the overhead looks to be spread out over all sorts, which makes it
harder to find and fix.

Stack unwinding is done a lot and is fairly expensive; I've not yet
checked whether crossrelease does too much of that.

The below saved about 50% of my __lock_acquire() time, though I'm not
sure it made a significant difference overall.

---
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 44c8d0d17170..f8db1ead1c48 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -3386,7 +3386,7 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
if (!class)
return 0;
}
- atomic_inc((atomic_t *)&class->ops);
+ /* atomic_inc((atomic_t *)&class->ops); */
if (very_verbose(class)) {
printk("\nacquire class [%p] %s", class->key, class->name);
if (class->name_version > 1)

2017-08-30 08:47:55

by Peter Zijlstra

Subject: Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

On Wed, Aug 30, 2017 at 10:42:07AM +0200, Peter Zijlstra wrote:
>
> So the overhead looks to be spread out over all sorts, which makes it
> harder to find and fix.
>
> stack unwinding is done lots and is fairly expensive, I've not yet
> checked if crossrelease does too much of that.

Aah, we do an unconditional stack unwind for every __lock_acquire() now.
It keeps a trace in the xhlocks[].

Does the below cure most of that overhead?

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 44c8d0d17170..7b872036b72e 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -4872,7 +4872,7 @@ static void add_xhlock(struct held_lock *hlock)
xhlock->trace.max_entries = MAX_XHLOCK_TRACE_ENTRIES;
xhlock->trace.entries = xhlock->trace_entries;
xhlock->trace.skip = 3;
- save_stack_trace(&xhlock->trace);
+ /* save_stack_trace(&xhlock->trace); */
}

static inline int same_context_xhlock(struct hist_lock *xhlock)

2017-08-30 08:53:30

Subject: RE: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

> -----Original Message-----
> From: Peter Zijlstra [mailto:[email protected]]
> Sent: Wednesday, August 30, 2017 5:48 PM
> To: Sergey Senozhatsky
> Cc: Byungchul Park; Bart Van Assche; [email protected]; linux-
> [email protected]; [email protected]; [email protected]; linux-
> [email protected]; [email protected]; [email protected];
> [email protected]
> Subject: Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
>
> On Wed, Aug 30, 2017 at 10:42:07AM +0200, Peter Zijlstra wrote:
> >
> > So the overhead looks to be spread out over all sorts, which makes it
> > harder to find and fix.
> >
> > stack unwinding is done lots and is fairly expensive, I've not yet
> > checked if crossrelease does too much of that.
>
> Aah, we do an unconditional stack unwind for every __lock_acquire() now.
> It keeps a trace in the xhlocks[].

Yeah.. I also think this is the most significant part..

>
> Does the below cure most of that overhead?
>
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 44c8d0d17170..7b872036b72e 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -4872,7 +4872,7 @@ static void add_xhlock(struct held_lock *hlock)
> xhlock->trace.max_entries = MAX_XHLOCK_TRACE_ENTRIES;
> xhlock->trace.entries = xhlock->trace_entries;
> xhlock->trace.skip = 3;
> - save_stack_trace(&xhlock->trace);
> + /* save_stack_trace(&xhlock->trace); */
> }
>
> static inline int same_context_xhlock(struct hist_lock *xhlock)

2017-08-30 12:27:47