2014-07-21 04:22:56

by Linus Torvalds

[permalink] [raw]
Subject: Linux 3.16-rc6

Week by week, we're getting to what is supposed to be the last rc's,
but quite frankly, things aren't calming down the way they are
supposed to.

That was already true for rc5 - it was bigger than rc4. That didn't
worry me all that much, because rc4 was really pretty small. But now
rc6 is out, and it's bigger than rc5 was, and it's not even all
trivial stuff.

That's not how this is all supposed to work.

Anyway, rc6 still isn't all *that* big, so I'm not exactly worried,
but I am getting to the point where I'm going to start calling people
names and shouting at you if you send me stuff that isn't appropriate
for the late rc releases. Which is not to say that people did: while
rc6 is bigger than I wished for, I don't think there's too much
obviously frivolous in there. But I'll be keepign an eye out, and I'll
be starting to get grumpy (or grumpiER) if I notice that peopel aren't
being serious about trying to calm things down.

Regardless, rc6 itself ends up having changes pretty much all over:
drivers (much of it networking, but there's gpu, there's infiniband,
you name it), filesystems (late nfs fixes, xfs, fuse, gfs2, btrfs),
core networking code, etc etc. The shortlog is appended for those
interested in (an overview of) the details.

So go get the latest rc and kick the tires, to see that nothing has
fallen through the cracks, ok?

Linus

---

Aaron Plattner (1):
ALSA: hda - Add new GPU codec ID 0x10de0070 to snd-hda

Abbas Raza (1):
usb: chipidea: udc: Disable auto ZLP generation on ep0

Alex Deucher (2):
drm/radeon: avoid leaking edid data
drm/radeon: set default bl level to something reasonable

Alex Wang (1):
openvswitch: Use exact lookup for flow_get and flow_del.

Alexander Aring (2):
ieee802154: reassembly: fix possible buffer overflow
MAINTAINERS: change IEEE 802.15.4 maintainer

Alexey Khoroshilov (1):
farsync: fix invalid memory accesses in fst_add_one() and fst_init_card()

Amir Vadai (4):
net/mlx4_en: Don't use irq_affinity_notifier to track changes in
IRQ affinity map
lib/cpumask: cpumask_set_cpu_local_first to use all cores when
numa node is not defined
net/mlx4_en: IRQ affinity hint is not cleared on port down
net/mlx4_en: Ignore budget on TX napi polling

Amit Shah (2):
hwrng: fetch randomness only after device init
hwrng: virtio - ensure reads happen after successful probe

Amitkumar Karwar (2):
mwifiex: fix Tx timeout issue
mwifiex: initialize Tx/Rx info of a packet correctly

Amritha Nambiar (1):
GRE: enable offloads for GRE

Anand Avati (1):
fuse: ignore entry-timeout on LOOKUP_REVAL

Andrea Adami (1):
mtd: cfi_cmdset_0001.c: add support for Sharp LH28F640BF NOR

Andrea Merello (1):
rt2800usb: Don't perform DMA from stack

Andrey Utkin (1):
appletalk: Fix socket referencing in skb

Andy Zhou (1):
openvswitch: Fix a double free bug for the sample action

Ard Biesheuvel (1):
efi/arm64: efistub: remove local copy of linux_banner

Arend van Spriel (1):
brcmfmac: assign chip id and rev in bus interface after brcmf_usb_dlneeded

Arik Nemtsov (1):
Revert "iwlwifi: remove IWL_UCODE_TLV_FLAGS_UAPSD_SUPPORT flag"

Axel Lin (2):
hwmon: (da9052) Don't use dash in the name attribute
hwmon: (da9055) Don't use dash in the name attribute

Bartosz Markowski (1):
ath10k: fix 8th virtual AP interface with DFS

Ben Pfaff (2):
openvswitch: Fix tracking of flags seen in TCP flows.
netlink: Fix handling of error from netlink_dump().

Benjamin LaHaise (1):
aio: protect reqs_available updates from changes in interrupt handlers

Bernd Wachter (1):
net: qmi_wwan: Add ID for Telewell TW-LTE 4G v2

Bo Shen (1):
ARM: at91: at91sam9x5: correct typo error for ohci clock

Bob Peterson (3):
GFS2: Only wait for demote when last holder is dequeued
GFS2: Allow flocks to use normal glock dq rather than dq_wait
GFS2: Allow caching of glocks for flock

Boris BREZILLON (2):
ARM: at91/dt: fix usb0 clocks definition in sam9n12 dtsi
ARM: at91/dt: add missing clocks property to pwm node in sam9x5.dtsi

Boris Ostrovsky (1):
x86/espfix/xen: Fix allocation of pages for paravirt page tables

Brian Norris (1):
UBI: fastmap: do not miss bit-flips

Brian W Hart (1):
cpufreq: make table sentinel macros unsigned to match use

Catalin Marinas (1):
efi: fdt: Do not report an error during boot if UEFI is not available

Christoph Hellwig (1):
nfs: only show Posix ACLs in listxattr if actually present

Christoph Paasch (1):
tcp: Fix divide by zero when pushing during tcp-repair

Christoph Schulz (2):
net: pppoe: use correct channel MTU when using Multilink PPP
net: ppp: don't call sk_chk_filter twice

Daniel Borkmann (1):
net: sctp: fix information leaks in ulpevent layer

Daniel Mack (1):
net: fix circular dependency in of_mdio code

Daniel Vetter (2):
Revert "drm/i915: Don't set the 8to6 dither flag when not scaling"
drm/i915: Track the primary plane correctly when reassigning planes

Darren Hart (1):
ACPI / documentation: Remove reference to
acpi_platform_device_ids from enumeration.txt

Dave Airlie (1):
Revert "drm/i915: reverse dp link param selection, prefer fast
over wide again"

Dave Chinner (3):
Revert "xfs: block allocation work needs to be kswapd aware"
xfs: refine the allocation stack switch
xfs: null unused quota inodes when quota is on

David S. Miller (1):
Revert "net: stmmac: add platform init/exit for Altera's ARM socfpga"

David Vrabel (4):
xen/manage: fix potential deadlock when resuming the console
xen/balloon: set ballooned out pages as invalid in p2m
xen-netfront: don't nest queue locks in xennet_connect()
xen-netfront: call netif_carrier_off() only once when disconnecting

Davidlohr Bueso (1):
locking/rwsem: Add CONFIG_RWSEM_SPIN_ON_OWNER

Denis Kirjanov (2):
powerpc: bpf: Use correct mask while accessing the VLAN tag
powerpc: bpf: Fix the broken LD_VLAN_TAG_PRESENT test

Dexuan Cui (1):
Drivers: hv: hv_fcopy: fix a race condition for SMP guest

Dmitry Popov (1):
ip_tunnel: fix ip_tunnel_lookup

Edward Allcutt (1):
ipv4: icmp: Fix pMTU handling for rare case

Eliad Peller (2):
cfg80211: fix elapsed_jiffies calculation
iwlwifi: mvm: rework sched scan channel configuration

Emmanuel Grumbach (2):
iwlwifi: dvm: don't enable CTS to self
iwlwifi: mvm: disable CTS to Self

Eric Dumazet (4):
ipv4: irq safe sk_dst_[re]set() and ipv4_sk_update_pmtu() fix
bnx2x: fix possible panic under memory stress
vlan: free percpu stats in device destructor
net: fix sparse warning in sk_dst_set()

Eric Sandeen (1):
btrfs: test for valid bdev before kobj removal in btrfs_rm_device

Ezequiel Garcia (1):
ARM: mvebu: Fix coherency bus notifiers by using separate notifiers

Fabian Frederick (3):
fuse: replace count*size kzalloc by kcalloc
GFS2: replace count*size kzalloc by kcalloc
GFS2: fs/gfs2/rgrp.c: kernel-doc warning fixes

Florian Fainelli (7):
net: systemport: do not clear IFF_MULTICAST flag
net: systemport: fix UniMAC reset logic
net: systemport: fix TX NAPI work done return value
net: bcmgenet: disable clock before register_netdev
net: bcmgenet: start with carrier off
net: bcmgenet: do not set packet length for RX buffers
net: bcmgenet: fix RGMII_MODE_EN bit

Gavin Guo (1):
usb: Check if port status is equal to RxDetect

Geert Uytterhoeven (2):
firewire: IEEE 1394 (FireWire) support should depend on HAS_DMA
GFS2: memcontrol: Spelling s/invlidate/invalidate/

Gregory CLEMENT (1):
ARM: mvebu: Fix the operand list in the inline asm of
armada_370_xp_pmsu_idle_enter

Guenter Roeck (3):
sched: Fix compiler warnings
platform_get_irq: Revert to platform_get_resource if of_irq_get fails
hwmon: (adt7470) Fix writes to temperature limit registers

HATAYAMA Daisuke (1):
perf/x86/intel: ignore CondChgd bit to avoid false NMI handling

Hangbin Liu (1):
ipv6: Fix MLD Query message check

Hannes Frederic Sowa (1):
random: check for increase of entropy_count because of signed conversion

Hans de Goede (1):
ACPI / video: Add use_native_backlight quirk for HP ProBook 4540s

Hariprasad S (2):
RDMA/cxgb4: Fix skb_leak in reject_cr()
RDMA/cxgb4: Clean up connection on ARP error

Heiko Schocher (1):
UBI: fix the volumes tree sorting criteria

Himangi Saraogi (1):
fuse: inode: drop cast

Ilan Peer (1):
iwlwifi: mvm: Fix broadcast filtering

James M Leddy (1):
udp: Add MIB counters for rcvbuferrors

Jason Low (6):
locking/rwsem: Allow conservative optimistic spinning when
readers have lock
locking/spinlocks/mcs: Rename optimistic_spin_queue() to
optimistic_spin_node()
locking/spinlocks/mcs: Convert osq lock to atomic_t to reduce overhead
locking/spinlocks/mcs: Introduce and use init macro and function
for osq locks
locking/spinlocks/mcs: Micro-optimize osq_unlock()
locking/rwsem: Reduce the size of struct rw_semaphore

Jason Wang (2):
mlx4: mark napi id for gro_skb
drm/qxl: return IRQ_NONE if it was not our irq

Jiri Olsa (2):
perf tools: Fix segfault in cumulative.callchain report
perf: Do not allow optimized switch for non-cloned events

Johan Hedberg (3):
Bluetooth: Fix overriding higher security level in SMP
Bluetooth: Refactor authentication method lookup into its own function
Bluetooth: Fix rejecting pairing in case of insufficient capabilities

Johannes Berg (2):
Revert "cfg80211: Use 5MHz bandwidth by default when checking
usable channels"
nl80211: move set_qos_map command into split state

John Stultz (1):
alarmtimer: Fix bug where relative alarm timers were treated as absolute

Jon Paul Maloy (2):
tipc: fix bug in multicast/broadcast message reassembly
tipc: clear 'next'-pointer of message fragments before reassembly

Joonyoung Shim (1):
usbnet: smsc95xx: add reset_resume function with reset operation

Linus Torvalds (1):
Linux 3.16-rc6

Linus Walleij (1):
cpufreq: sa1110: set memory type for h3600

Liu Bo (1):
Btrfs: fix abnormal long waiting in fsync

Loic Poulain (1):
Bluetooth: Ignore H5 non-link packets in non-active state

Loic Prylli (1):
net: Fix NETDEV_CHANGE notifier usage causing spurious arp flush

Lucas Stach (1):
ARM: clk-imx6q: parent lvds_sel input from upstream clock gates

Lukasz Rymanowski (1):
Bluetooth: Fix for ACL disconnect when pairing fails

Maciej W. Rozycki (2):
defxx: Remove an incorrectly inverted preprocessor conditional
defxx: Fix !DYNAMIC_BUFFERS compilation warnings

Marcel Holtmann (1):
Revert "Bluetooth: Add a new PID/VID 0cf3/e005 for AR3012."

Mario Kleiner (4):
drm/radeon: Prevent too early kms-pageflips triggered by vblank.
drm/radeon: Remove redundant fence unref in pageflip path.
drm/radeon: Add missing vblank_put in pageflip ioctl error path.
drm/radeon: Make classic pageflip completion path less racy.

Martin Fuzzey (1):
iio: mma8452: Use correct acceleration units.

Martin Lau (1):
ring-buffer: Fix polling on trace_pipe

Martin Peres (1):
drm/nouveau/therm: fix a potential deadlock in the therm monitoring code

Mateusz Guzik (1):
sched: Fix possible divide by zero in avg_atom() calculation

Mathias Krause (1):
neigh: sysctl - simplify address calculation of gc_* variables

Matthias Brugger (1):
irqchip: gic: Add support for cortex a7 compatible string

Max Stepanov (1):
mac80211: WEP extra head/tail room in ieee80211_send_auth

Maxim Patlasov (1):
fuse: release temporary page if fuse_writepage_locked() failed

Michael Brown (1):
x86/efi: Include a .bss section within the PE/COFF headers

Michael Welling (1):
gpio: mcp23s08: Eliminates redundant checking.

Michal Kazior (1):
ath10k: remove unnecessary htt rx corruption check

Michel Dänzer (2):
drm/radeon: Move pinning the BO back to radeon_crtc_page_flip()
drm/radeon: Complete page flip even if waiting on the BO fence fails

Mike Snitzer (2):
dm thin metadata: do not allow the data block size to change
dm cache metadata: do not allow the data block size to change

Miklos Szeredi (4):
fuse: timeout comparison fix
fuse: handle large user and group ID
fuse: avoid scheduling while atomic
fuse: restructure ->rename2()

Nicolas Del Piano (1):
cpufreq: imx6q: Select PM_OPP

Nikolay Aleksandrov (1):
bonding: fix ad_select module param check

Niu Yawei (1):
quota: missing lock in dqcache_shrink_scan()

Oleg Nesterov (1):
tracing: instance_rmdir() leaks ftrace_event_file->filter

Olivier Sobrie (2):
hso: remove unused workqueue
hso: fix deadlock when receiving bursts of data

Or Gerlitz (2):
net/mlx4_en: Don't configure the HW vxlan parser when vxlan
offloading isn't set
IB/mlx5: Enable "block multicast loopback" for kernel consumers

Oren Givon (1):
iwlwifi: update the 7265 series HW IDs

Paul Bolle (1):
x86: Remove unused variable "polling"

Paul E. McKenney (2):
rcu: Export debug_init_rcu_head() and and debug_init_rcu_head()
rcu: Reduce overhead of cond_resched() checks for RCU

Peter Zijlstra (3):
x86, tsc: Fix cpufreq lockup
locking/rwsem: Rename 'activity' to 'count'
locking/mutex: Disable optimistic spinning on some architectures

Quentin Armitage (1):
cpufreq: kirkwood: Reinstate cpufreq driver for ARCH_KIRKWOOD

Rafael J. Wysocki (1):
Revert "ACPI / video: change acpi-video
brightness_switch_enabled default to 0"

Richard Weinberger (4):
Revert "um: Fix wait_stub_done() error handling"
um: Ensure that a stub page cannot get unmapped
um: Fix hung task in fix_range_common()
um: segv: Save regs only in case of a kernel mode fault

Rickard Strandqvist (1):
isdn: hisax: l3ni1.c: Fix for possible null pointer dereference

S. Lockwood-Childs (1):
tools/liblockdep: Account for bitfield changes in lockdeps lock_acquire

Sagi Grimberg (1):
mlx5_core: Fix possible race between mr tree insert/delete

Sasha Levin (2):
tools/liblockdep: Remove debug print left over from development
net/l2tp: don't fall back on UDP [get|set]sockopt

Scot Doyle (1):
drm/i915: Ignore VBT backlight presence check on HP Chromebook 14

Simon Que (1):
perf symbols: Get kernel start address by symbol name

Srinivas Pandruvada (1):
iio:core: Handle error when mask type is not separate

Stefan Assmann (1):
igb: do a reset on SR-IOV re-init if device is down

Stefan Sørensen (1):
dp83640: Always decode received status frames

Steve Wise (2):
RDMA/cxgb4: Initialize the device status page
RDMA/cxgb4: Call iwpm_init() only once

Steven Rostedt (Red Hat) (1):
tracing: Fix graph tracer with stack tracer on other archs

Steven Whitehouse (2):
GFS2: Fix race in glock lru glock disposal
GFS2: Use GFP_NOFS when allocating glocks

Suravee Suthikulpanit (1):
irqchip: gic: Add binding probe for ARM GIC400

Suresh Reddy (1):
be2net: set EQ DB clear-intr bit in be_open()

Takashi Iwai (3):
ALSA: hda - Revert stream assignment order for Intel controllers
PM / sleep: Fix request_firmware() error at resume
ALSA: hda - Fix broken PM due to incomplete i915 initialization

Ted Juan (1):
mtd: devices: elm: fix elm_context_save() and
elm_context_restore() functions

Thierry Reding (1):
ALSA: hda: Fix build warning

Thomas Fitzsimmons (1):
net: mvneta: Fix big endian issue in mvneta_txq_desc_csum()

Thomas Petazzoni (3):
ARM: mvebu: fix SMP boot for Armada 38x and Armada 375 Z1 in big endian
net: mvneta: fix operation in 10 Mbit/s mode
mtd: nand: reduce the warning noise when the ECC is too weak

Todd Fujinaka (1):
igb: Workaround for i210 Errata 25: Slow System Clock

Tom Herbert (1):
net: Performance fix for process_backlog

Tomasz Figa (2):
irqchip: gic: Fix core ID calculation when topology is read from DT
ARM: EXYNOS: Fix core ID used by platsmp and hotplug code

Trond Myklebust (2):
NFS: Remove 2 unused variables
NFS: Don't reset pg_moreio in __nfs_pageio_add_request

Varun Sethi (3):
iommu/fsl: Fix PAMU window size check.
iommu/fsl: Fix the device domain attach condition.
iommu/fsl: Fix the error condition during iommu group

Vince Bridgers (3):
net: stmmac: add platform init/exit for Altera's ARM socfpga
net: stmmac: Correct duplicate if/then/else case found by cppcheck
net: stmmac: Remove unneeded I/O read caught by cppcheck

Viresh Kumar (2):
cpufreq: cpu0: OPPs can be populated at runtime
cpufreq: move policy kobj to policy->cpu at resume

Wei Zhang (1):
openvswitch: supply a dummy err_handler of gre_cisco_protocol to
prevent kernel crash

Weston Andros Adamson (5):
nfs: mark nfs_page reqs with flag for extra ref
nfs: nfs_page should take a ref on the head req
nfs: change find_request to find_head_request
nfs: handle multiple reqs in nfs_page_async_flush
nfs: handle multiple reqs in nfs_wb_page_cancel

Yijing Wang (1):
bnx2x: Fix the MSI flags

Yuchung Cheng (1):
tcp: fix false undo corner cases

Zhang Rui (1):
PM / sleep: fix freeze_ops NULL pointer dereferences

Zhao Qiang (1):
powerpc/ucc_geth: deal with a compile warning

dingtianhong (1):
igmp: fix the problem when mc leave group

françois romieu (1):
MAINTAINERS: update r8169 maintainer

hayeswang (3):
r8152: wake up the device before dumping the hw counter
r8169: disable L23
r8152: fix r8152_csum_workaround function

zhangdianfang (1):
tools/liblockdep: Fix comparison of a boolean value with a value of 2

zhangwei(Jovi) (2):
tracing: Add ftrace_trace_stack into __trace_puts/__trace_bputs
tracing: Add TRACE_ITER_PRINTK flag check in __trace_puts/__trace_bputs


2014-07-23 09:53:35

by Borislav Petkov

[permalink] [raw]
Subject: Re: Linux 3.16-rc6

On Sun, Jul 20, 2014 at 09:22:52PM -0700, Linus Torvalds wrote:
> So go get the latest rc and kick the tires, to see that nothing has
> fallen through the cracks, ok?

Well, it looks like we f*cked up something after -rc5 since I'm starting
to see lockdep splats all over the place which I didn't see before. I'm
running rc6 + tip/master.

There was one in r8169 yesterday:

https://lkml.kernel.org/r/[email protected]

and now I'm seeing the following in a kvm guest. I'm adding some more
lists to CC which look like might be related, judging from the stack
traces.

---

...

[ 9.456211] EXT3-fs (sda1): using internal journal
[ 24.623666] 8139cp 0000:00:03.0 eth0: link up, 100Mbps, full-duplex, lpa 0x05E1
[ 28.346202] mtrr: no MTRR for fc000000,100000 found
[ 31.704053]
[ 31.704282] =========================================================
[ 31.704282] [ INFO: possible irq lock inversion dependency detected ]
[ 31.704282] 3.16.0-rc6+ #1 Not tainted
[ 31.704282] ---------------------------------------------------------
[ 31.704282] Xorg/3484 just changed the state of lock:
[ 31.704282] (tasklist_lock){.?.+..}, at: [<ffffffff81184b19>] send_sigio+0x59/0x1b0
[ 31.704282] but this lock took another, HARDIRQ-unsafe lock in the past:
[ 31.704282] (&(&p->alloc_lock)->rlock){+.+...}

and interrupts could create inverse lock ordering between them.

[ 31.704282]
[ 31.704282] other info that might help us debug this:
[ 31.704282] Possible interrupt unsafe locking scenario:
[ 31.704282]
[ 31.704282] CPU0 CPU1
[ 31.704282] ---- ----
[ 31.704282] lock(&(&p->alloc_lock)->rlock);
[ 31.704282] local_irq_disable();
[ 31.704282] lock(tasklist_lock);
[ 31.704282] lock(&(&p->alloc_lock)->rlock);
[ 31.704282] <Interrupt>
[ 31.704282] lock(tasklist_lock);
[ 31.704282]
[ 31.704282] *** DEADLOCK ***
[ 31.704282]
[ 31.704282] 7 locks held by Xorg/3484:
[ 31.704282] #0: (&(&dev->event_lock)->rlock){-.....}, at: [<ffffffff8148c20d>] input_event+0x4d/0x90
[ 31.704282] #1: (rcu_read_lock){......}, at: [<ffffffff8148b365>] input_pass_values.part.3+0x5/0x360
[ 31.704282] #2: (rcu_read_lock){......}, at: [<ffffffff814915a5>] evdev_events+0x5/0x2e0
[ 31.704282] #3: (&(&client->buffer_lock)->rlock){-.....}, at: [<ffffffff81490743>] evdev_pass_values+0x63/0x1d0
[ 31.704282] #4: (rcu_read_lock){......}, at: [<ffffffff81184c7f>] kill_fasync+0xf/0x290
[ 31.704282] #5: (&(&new->fa_lock)->rlock){-.....}, at: [<ffffffff81184d06>] kill_fasync+0x96/0x290
[ 31.704282] #6: (&f->f_owner.lock){.-....}, at: [<ffffffff81184ae4>] send_sigio+0x24/0x1b0
[ 31.704282]
[ 31.704282] the shortest dependencies between 2nd lock and 1st lock:
[ 31.704282] -> (&(&p->alloc_lock)->rlock){+.+...} ops: 35104 {
[ 31.704282] HARDIRQ-ON-W at:
[ 31.704282] [<ffffffff8109a362>] __lock_acquire+0x952/0x2230
[ 31.704282] [<ffffffff8109c449>] lock_acquire+0xb9/0x200
[ 31.704282] [<ffffffff8161f8d1>] _raw_spin_lock+0x41/0x80
[ 31.704282] [<ffffffff811793c9>] __set_task_comm+0x39/0x180
[ 31.704282] [<ffffffff81073c35>] kthreadd+0x45/0x150
[ 31.704282] [<ffffffff81620b6c>] ret_from_fork+0x7c/0xb0
[ 31.704282] SOFTIRQ-ON-W at:
[ 31.704282] [<ffffffff8109a395>] __lock_acquire+0x985/0x2230
[ 31.704282] [<ffffffff8109c449>] lock_acquire+0xb9/0x200
[ 31.704282] [<ffffffff8161f8d1>] _raw_spin_lock+0x41/0x80
[ 31.704282] [<ffffffff811793c9>] __set_task_comm+0x39/0x180
[ 31.704282] [<ffffffff81073c35>] kthreadd+0x45/0x150
[ 31.704282] [<ffffffff81620b6c>] ret_from_fork+0x7c/0xb0
[ 31.704282] INITIAL USE at:
[ 31.704282] [<ffffffff81099e53>] __lock_acquire+0x443/0x2230
[ 31.704282] [<ffffffff8109c449>] lock_acquire+0xb9/0x200
[ 31.704282] [<ffffffff8161f8d1>] _raw_spin_lock+0x41/0x80
[ 31.704282] [<ffffffff811793c9>] __set_task_comm+0x39/0x180
[ 31.704282] [<ffffffff81073c35>] kthreadd+0x45/0x150
[ 31.704282] [<ffffffff81620b6c>] ret_from_fork+0x7c/0xb0
[ 31.704282] }
[ 31.704282] ... key at: [<ffffffff81cc4ae8>] __key.47760+0x0/0x8
[ 31.704282] ... acquired at:
[ 31.704282] [<ffffffff8109c449>] lock_acquire+0xb9/0x200
[ 31.704282] [<ffffffff8161f8d1>] _raw_spin_lock+0x41/0x80
[ 31.704282] [<ffffffff81066b45>] do_prlimit+0x205/0x250
[ 31.704282] [<ffffffff81066bba>] SyS_getrlimit+0x2a/0x70
[ 31.704282] [<ffffffff81620c16>] system_call_fastpath+0x1a/0x1f
[ 31.704282]
[ 31.704282] -> (tasklist_lock){.?.+..} ops: 22947 {
[ 31.704282] IN-HARDIRQ-R at:
[ 31.704282] [<ffffffff8109a5a3>] __lock_acquire+0xb93/0x2230
[ 31.704282] [<ffffffff8109c449>] lock_acquire+0xb9/0x200
[ 31.704282] [<ffffffff8161fe14>] _raw_read_lock+0x44/0x80
[ 31.704282] [<ffffffff81184b19>] send_sigio+0x59/0x1b0
[ 31.704282] [<ffffffff81184d34>] kill_fasync+0xc4/0x290
[ 31.704282] [<ffffffff81490776>] evdev_pass_values+0x96/0x1d0
[ 31.704282] [<ffffffff814917bc>] evdev_events+0x21c/0x2e0
[ 31.704282] [<ffffffff81489691>] input_to_handler+0x91/0x100
[ 31.704282] [<ffffffff8148b624>] input_pass_values.part.3+0x2c4/0x360
[ 31.704282] [<ffffffff8148bd1a>] input_handle_event+0xda/0x580
[ 31.704282] [<ffffffff8148c220>] input_event+0x60/0x90
[ 31.704282] [<ffffffff814c546f>] hidinput_report_event+0x3f/0x50
[ 31.704282] [<ffffffff814c3635>] hid_report_raw_event+0x285/0x420
[ 31.704282] [<ffffffff814c38f1>] hid_input_report+0x121/0x1a0
[ 31.704282] [<ffffffff814d1300>] hid_irq_in+0x80/0x1f0
[ 31.704282] [<ffffffff81448c98>] __usb_hcd_giveback_urb+0x68/0x100
[ 31.704282] [<ffffffff81448d7a>] usb_hcd_giveback_urb+0x4a/0x140
[ 31.704282] [<ffffffff8146ba38>] uhci_giveback_urb+0xb8/0x210
[ 31.704282] [<ffffffff8146c4a2>] uhci_scan_schedule.part.32+0x542/0xb60
[ 31.704282] [<ffffffff8146d351>] uhci_irq+0xf1/0x190
[ 31.704282] [<ffffffff81448285>] usb_hcd_irq+0x25/0x40
[ 31.704282] [<ffffffff810af329>] handle_irq_event_percpu+0x39/0x350
[ 31.704282] [<ffffffff810af688>] handle_irq_event+0x48/0x70
[ 31.704282] [<ffffffff810b2a60>] handle_fasteoi_irq+0xa0/0x180
[ 31.704282] [<ffffffff810055ce>] handle_irq+0x1e/0x30
[ 31.704282] [<ffffffff81623618>] do_IRQ+0x68/0x110
[ 31.704282] [<ffffffff8162186f>] ret_from_intr+0x0/0x13
[ 31.704282] HARDIRQ-ON-R at:
[ 31.704282] [<ffffffff81099dac>] __lock_acquire+0x39c/0x2230
[ 31.704282] [<ffffffff8109c449>] lock_acquire+0xb9/0x200
[ 31.704282] [<ffffffff8161fe14>] _raw_read_lock+0x44/0x80
[ 31.704282] [<ffffffff81053109>] do_wait+0xe9/0x370
[ 31.704282] [<ffffffff810537a5>] SyS_wait4+0x75/0xf0
[ 31.704282] [<ffffffff81067f0b>] wait_for_helper+0x4b/0x70
[ 31.704282] [<ffffffff81620b6c>] ret_from_fork+0x7c/0xb0
[ 31.704282] SOFTIRQ-ON-R at:
[ 31.704282] [<ffffffff8109a395>] __lock_acquire+0x985/0x2230
[ 31.704282] [<ffffffff8109c449>] lock_acquire+0xb9/0x200
[ 31.704282] [<ffffffff8161fe14>] _raw_read_lock+0x44/0x80
[ 31.704282] [<ffffffff81053109>] do_wait+0xe9/0x370
[ 31.704282] [<ffffffff810537a5>] SyS_wait4+0x75/0xf0
[ 31.704282] [<ffffffff81067f0b>] wait_for_helper+0x4b/0x70
[ 31.704282] [<ffffffff81620b6c>] ret_from_fork+0x7c/0xb0
[ 31.704282] INITIAL USE at:
[ 31.704282] [<ffffffff81099e53>] __lock_acquire+0x443/0x2230
[ 31.704282] [<ffffffff8109c449>] lock_acquire+0xb9/0x200
[ 31.704282] [<ffffffff816202c7>] _raw_write_lock_irq+0x47/0x80
[ 31.704282] [<ffffffff8104dd5a>] copy_process.part.51+0xe5a/0x19d0
[ 31.704282] [<ffffffff8104ea97>] do_fork+0xe7/0x770
[ 31.704282] [<ffffffff8104f146>] kernel_thread+0x26/0x30
[ 31.704282] [<ffffffff81614922>] rest_init+0x22/0x140
[ 31.704282] [<ffffffff81b90e3e>] start_kernel+0x408/0x415
[ 31.704282] [<ffffffff81b90463>] x86_64_start_reservations+0x2a/0x2c
[ 31.704282] [<ffffffff81b9055b>] x86_64_start_kernel+0xf6/0xf9
[ 31.704282] }
[ 31.704282] ... key at: [<ffffffff818ff098>] tasklist_lock+0x18/0x80
[ 31.704282] ... acquired at:
[ 31.704282] [<ffffffff8109636b>] check_usage_forwards+0x15b/0x160
[ 31.704282] [<ffffffff81097188>] mark_lock+0x3d8/0x760
[ 32.044737] [<ffffffff8109a5a3>] __lock_acquire+0xb93/0x2230
[ 32.044737] [<ffffffff8109c449>] lock_acquire+0xb9/0x200
[ 32.044737] [<ffffffff8161fe14>] _raw_read_lock+0x44/0x80
[ 32.044737] [<ffffffff81184b19>] send_sigio+0x59/0x1b0
[ 32.044737] [<ffffffff81184d34>] kill_fasync+0xc4/0x290
[ 32.044737] [<ffffffff81490776>] evdev_pass_values+0x96/0x1d0
[ 32.044737] [<ffffffff814917bc>] evdev_events+0x21c/0x2e0
[ 32.044737] [<ffffffff81489691>] input_to_handler+0x91/0x100
[ 32.044737] [<ffffffff8148b624>] input_pass_values.part.3+0x2c4/0x360
[ 32.044737] [<ffffffff8148bd1a>] input_handle_event+0xda/0x580
[ 32.044737] [<ffffffff8148c220>] input_event+0x60/0x90
[ 32.044737] [<ffffffff814c546f>] hidinput_report_event+0x3f/0x50
[ 32.044737] [<ffffffff814c3635>] hid_report_raw_event+0x285/0x420
[ 32.044737] [<ffffffff814c38f1>] hid_input_report+0x121/0x1a0
[ 32.044737] [<ffffffff814d1300>] hid_irq_in+0x80/0x1f0
[ 32.044737] [<ffffffff81448c98>] __usb_hcd_giveback_urb+0x68/0x100
[ 32.044737] [<ffffffff81448d7a>] usb_hcd_giveback_urb+0x4a/0x140
[ 32.044737] [<ffffffff8146ba38>] uhci_giveback_urb+0xb8/0x210
[ 32.044737] [<ffffffff8146c4a2>] uhci_scan_schedule.part.32+0x542/0xb60
[ 32.044737] [<ffffffff8146d351>] uhci_irq+0xf1/0x190
[ 32.044737] [<ffffffff81448285>] usb_hcd_irq+0x25/0x40
[ 32.044737] [<ffffffff810af329>] handle_irq_event_percpu+0x39/0x350
[ 32.044737] [<ffffffff810af688>] handle_irq_event+0x48/0x70
[ 32.044737] [<ffffffff810b2a60>] handle_fasteoi_irq+0xa0/0x180
[ 32.044737] [<ffffffff810055ce>] handle_irq+0x1e/0x30
[ 32.044737] [<ffffffff81623618>] do_IRQ+0x68/0x110
[ 32.044737] [<ffffffff8162186f>] ret_from_intr+0x0/0x13
[ 32.044737]
[ 32.044737]
[ 32.044737] stack backtrace:
[ 32.044737] CPU: 0 PID: 3484 Comm: Xorg Not tainted 3.16.0-rc6+ #1
[ 32.044737] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[ 32.044737] ffffffff8280ddc0 ffff88007be03770 ffffffff8161874e ffffffff8280ddc0
[ 32.044737] ffff88007be037b0 ffffffff81617b6f ffffffff8183b4f6 ffff880079da4300
[ 32.044737] ffff880079da3a50 ffffffff8183b4f6 0000000000000000 ffffffff8280ddc0
[ 32.044737] Call Trace:
[ 32.044737] <IRQ> [<ffffffff8161874e>] dump_stack+0x4e/0x7a
[ 32.044737] [<ffffffff81617b6f>] print_irq_inversion_bug.part.31+0x1b8/0x1c4
[ 32.044737] [<ffffffff8109636b>] check_usage_forwards+0x15b/0x160
[ 32.044737] [<ffffffff81097188>] mark_lock+0x3d8/0x760
[ 32.044737] [<ffffffff81096210>] ? print_shortest_lock_dependencies+0x1d0/0x1d0
[ 32.044737] [<ffffffff8109a5a3>] __lock_acquire+0xb93/0x2230
[ 32.044737] [<ffffffff8109c449>] lock_acquire+0xb9/0x200
[ 32.044737] [<ffffffff81184b19>] ? send_sigio+0x59/0x1b0
[ 32.044737] [<ffffffff8161fe14>] _raw_read_lock+0x44/0x80
[ 32.044737] [<ffffffff81184b19>] ? send_sigio+0x59/0x1b0
[ 32.044737] [<ffffffff81184b19>] send_sigio+0x59/0x1b0
[ 32.044737] [<ffffffff81184d34>] kill_fasync+0xc4/0x290
[ 32.044737] [<ffffffff81184c7f>] ? kill_fasync+0xf/0x290
[ 32.044737] [<ffffffff81490776>] evdev_pass_values+0x96/0x1d0
[ 32.044737] [<ffffffff814917bc>] evdev_events+0x21c/0x2e0
[ 32.044737] [<ffffffff814915a5>] ? evdev_events+0x5/0x2e0
[ 32.044737] [<ffffffff81489691>] input_to_handler+0x91/0x100
[ 32.044737] [<ffffffff8148b624>] input_pass_values.part.3+0x2c4/0x360
[ 32.044737] [<ffffffff8148b365>] ? input_pass_values.part.3+0x5/0x360
[ 32.044737] [<ffffffff8148bd1a>] input_handle_event+0xda/0x580
[ 32.044737] [<ffffffff8148c220>] input_event+0x60/0x90
[ 32.044737] [<ffffffff814c546f>] hidinput_report_event+0x3f/0x50
[ 32.044737] [<ffffffff814c3635>] hid_report_raw_event+0x285/0x420
[ 32.044737] [<ffffffff814c38f1>] hid_input_report+0x121/0x1a0
[ 32.044737] [<ffffffff814d1300>] hid_irq_in+0x80/0x1f0
[ 32.044737] [<ffffffff81448c98>] __usb_hcd_giveback_urb+0x68/0x100
[ 32.044737] [<ffffffff81448d7a>] usb_hcd_giveback_urb+0x4a/0x140
[ 32.044737] [<ffffffff8146ba38>] uhci_giveback_urb+0xb8/0x210
[ 32.044737] [<ffffffff8146c4a2>] uhci_scan_schedule.part.32+0x542/0xb60
[ 32.044737] [<ffffffff8146d2d2>] ? uhci_irq+0x72/0x190
[ 32.044737] [<ffffffff8146d351>] uhci_irq+0xf1/0x190
[ 32.044737] [<ffffffff81448285>] usb_hcd_irq+0x25/0x40
[ 32.044737] [<ffffffff810af329>] handle_irq_event_percpu+0x39/0x350
[ 32.044737] [<ffffffff810af688>] handle_irq_event+0x48/0x70
[ 32.044737] [<ffffffff810b2a60>] handle_fasteoi_irq+0xa0/0x180
[ 32.044737] [<ffffffff810055ce>] handle_irq+0x1e/0x30
[ 32.044737] [<ffffffff81623618>] do_IRQ+0x68/0x110
[ 32.044737] [<ffffffff8162186f>] common_interrupt+0x6f/0x6f
[ 32.044737] <EOI> [<ffffffff81621911>] ? retint_swapgs+0xe/0x13

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2014-07-24 00:37:46

by Linus Torvalds

[permalink] [raw]
Subject: Re: Linux 3.16-rc6

On Wed, Jul 23, 2014 at 2:53 AM, Borislav Petkov <[email protected]> wrote:
>
> Well, it looks like we f*cked up something after -rc5 since I'm starting
> to see lockdep splats all over the place which I didn't see before. I'm
> running rc6 + tip/master.
>
> There was one in r8169 yesterday:
>
> https://lkml.kernel.org/r/[email protected]
>
> and now I'm seeing the following in a kvm guest. I'm adding some more
> lists to CC which look like might be related, judging from the stack
> traces.

Hmm. I'm not seeing the reason for this.

> [ 31.704282] [ INFO: possible irq lock inversion dependency detected ]
> [ 31.704282] 3.16.0-rc6+ #1 Not tainted
> [ 31.704282] ---------------------------------------------------------
> [ 31.704282] Xorg/3484 just changed the state of lock:
> [ 31.704282] (tasklist_lock){.?.+..}, at: [<ffffffff81184b19>] send_sigio+0x59/0x1b0
> [ 31.704282] but this lock took another, HARDIRQ-unsafe lock in the past:
> [ 31.704282] (&(&p->alloc_lock)->rlock){+.+...}

Ok, so the claim is that there's a 'p->alloc_lock' (ie "task_lock()")
that is inside the tasklist_lock, which would indeed be wrong. But I'm
not seeing it. The "shortest dependencies" thing seems to imply
__set_task_comm(), but that only takes task_lock.

Unless there is something in tip/master. Can you check that this is
actually in plain -rc6?

Or maybe I'm just blind. Those lockdep splats are easy to get wrong.
Adding PeterZ and Ingo to the list just because they are my lockdep
go-to people.

Linus

2014-07-24 01:53:23

by David Rientjes

[permalink] [raw]
Subject: Re: Linux 3.16-rc6

On Wed, 23 Jul 2014, Linus Torvalds wrote:

> > Well, it looks like we f*cked up something after -rc5 since I'm starting
> > to see lockdep splats all over the place which I didn't see before. I'm
> > running rc6 + tip/master.
> >
> > There was one in r8169 yesterday:
> >
> > https://lkml.kernel.org/r/[email protected]
> >
> > and now I'm seeing the following in a kvm guest. I'm adding some more
> > lists to CC which look like might be related, judging from the stack
> > traces.
>
> Hmm. I'm not seeing the reason for this.
>
> > [ 31.704282] [ INFO: possible irq lock inversion dependency detected ]
> > [ 31.704282] 3.16.0-rc6+ #1 Not tainted
> > [ 31.704282] ---------------------------------------------------------
> > [ 31.704282] Xorg/3484 just changed the state of lock:
> > [ 31.704282] (tasklist_lock){.?.+..}, at: [<ffffffff81184b19>] send_sigio+0x59/0x1b0
> > [ 31.704282] but this lock took another, HARDIRQ-unsafe lock in the past:
> > [ 31.704282] (&(&p->alloc_lock)->rlock){+.+...}
>
> Ok, so the claim is that there's a 'p->alloc_lock' (ie "task_lock()")
> that is inside the tasklist_lock, which would indeed be wrong. But I'm
> not seeing it. The "shortest dependencies" thing seems to imply
> __set_task_comm(), but that only takes task_lock.
>

It's the reverse, task_lock() inside tasklist_lock is fine but it's
complaining about taking tasklist_lock inside task_lock().

I don't think it's anything that's sitting in tip/master nor is it
something that was introduced during this merge window. I think this has
been the behavior dating back to commit 94dfd7edfd5c ("USB: HCD: support
giveback of URB in tasklet context").

2014-07-24 06:44:07

by Peter Zijlstra

[permalink] [raw]
Subject: Re: Linux 3.16-rc6

On Wed, Jul 23, 2014 at 05:37:43PM -0700, Linus Torvalds wrote:
> On Wed, Jul 23, 2014 at 2:53 AM, Borislav Petkov <[email protected]> wrote:
> >
> > Well, it looks like we f*cked up something after -rc5 since I'm starting
> > to see lockdep splats all over the place which I didn't see before. I'm
> > running rc6 + tip/master.
> >
> > There was one in r8169 yesterday:
> >
> > https://lkml.kernel.org/r/[email protected]
> >
> > and now I'm seeing the following in a kvm guest. I'm adding some more
> > lists to CC which look like might be related, judging from the stack
> > traces.
>
> Hmm. I'm not seeing the reason for this.
>
> > [ 31.704282] [ INFO: possible irq lock inversion dependency detected ]
> > [ 31.704282] 3.16.0-rc6+ #1 Not tainted
> > [ 31.704282] ---------------------------------------------------------
> > [ 31.704282] Xorg/3484 just changed the state of lock:
> > [ 31.704282] (tasklist_lock){.?.+..}, at: [<ffffffff81184b19>] send_sigio+0x59/0x1b0
> > [ 31.704282] but this lock took another, HARDIRQ-unsafe lock in the past:
> > [ 31.704282] (&(&p->alloc_lock)->rlock){+.+...}
>
> Ok, so the claim is that there's a 'p->alloc_lock' (ie "task_lock()")
> that is inside the tasklist_lock, which would indeed be wrong. But I'm
> not seeing it. The "shortest dependencies" thing seems to imply
> __set_task_comm(), but that only takes task_lock.
>
> Unless there is something in tip/master.

lkml.kernel.org/r/[email protected]

Its supposed to change lockdep to the stricter semantics provided by the
qrwlock.

Where the rwlock used to be unfair and reader biased, qrwlock is 'fair'
and only allows interrupt recursion.

> Can you check that this is
> actually in plain -rc6?
>
> Or maybe I'm just blind. Those lockdep splats are easy to get wrong.
> Adding PeterZ and Ingo to the list just because they are my lockdep
> go-to people.

I've been staring at this splat from borislav since yesterday morning
and confusing myself properly.. I'll continue doing so until I'm
decided.

2014-07-24 08:41:44

by Borislav Petkov

[permalink] [raw]
Subject: Re: Linux 3.16-rc6

On Thu, Jul 24, 2014 at 08:43:53AM +0200, Peter Zijlstra wrote:
> On Wed, Jul 23, 2014 at 05:37:43PM -0700, Linus Torvalds wrote:
> > On Wed, Jul 23, 2014 at 2:53 AM, Borislav Petkov <[email protected]> wrote:
> > >
> > > Well, it looks like we f*cked up something after -rc5 since I'm starting
> > > to see lockdep splats all over the place which I didn't see before. I'm
> > > running rc6 + tip/master.
> > >
> > > There was one in r8169 yesterday:
> > >
> > > https://lkml.kernel.org/r/[email protected]
> > >
> > > and now I'm seeing the following in a kvm guest. I'm adding some more
> > > lists to CC which look like might be related, judging from the stack
> > > traces.
> >
> > Hmm. I'm not seeing the reason for this.
> >
> > > [ 31.704282] [ INFO: possible irq lock inversion dependency detected ]
> > > [ 31.704282] 3.16.0-rc6+ #1 Not tainted
> > > [ 31.704282] ---------------------------------------------------------
> > > [ 31.704282] Xorg/3484 just changed the state of lock:
> > > [ 31.704282] (tasklist_lock){.?.+..}, at: [<ffffffff81184b19>] send_sigio+0x59/0x1b0
> > > [ 31.704282] but this lock took another, HARDIRQ-unsafe lock in the past:
> > > [ 31.704282] (&(&p->alloc_lock)->rlock){+.+...}
> >
> > Ok, so the claim is that there's a 'p->alloc_lock' (ie "task_lock()")
> > that is inside the tasklist_lock, which would indeed be wrong. But I'm
> > not seeing it. The "shortest dependencies" thing seems to imply
> > __set_task_comm(), but that only takes task_lock.
> >
> > Unless there is something in tip/master.
>
> lkml.kernel.org/r/[email protected]
>
> Its supposed to change lockdep to the stricter semantics provided by the
> qrwlock.
>
> Where the rwlock used to be unfair and reader biased, qrwlock is 'fair'
> and only allows interrupt recursion.
>
> > Can you check that this is
> > actually in plain -rc6?
> >
> > Or maybe I'm just blind. Those lockdep splats are easy to get wrong.
> > Adding PeterZ and Ingo to the list just because they are my lockdep
> > go-to people.
>
> I've been staring at this splat from borislav since yesterday morning
> and confusing myself properly.. I'll continue doing so until I'm
> decided.

CCing original author:

@Waiman, you can easily reproduce by booting a kvm guest with rc6 +
tip/master. It does not trigger everytime so you need to try a couple of
iterations.

I'm attaching my .config.

Also, here the splats I'm seeing on my machines:

https://lkml.kernel.org/r/[email protected]
https://lkml.kernel.org/r/[email protected]

Thanks.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


Attachments:
(No filename) (2.65 kB)
.config (85.29 kB)
Download all attachments

2014-07-24 12:25:20

by Borislav Petkov

[permalink] [raw]
Subject: Re: Linux 3.16-rc6

On Thu, Jul 24, 2014 at 10:41:27AM +0200, Borislav Petkov wrote:
> you can easily reproduce by booting a kvm guest with rc6 + tip/master.

Right, so reverting

586fefe5bbdc ("locking/selftest: Support queued rwlock")
e0645a111cb4 ("locking/lockdep: Restrict the use of recursive read_lock() with qrwlock")

from the top of tip/locking/core seems to fix the issue, with the kvm
guests at least.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2014-07-24 12:58:34

by Peter Zijlstra

[permalink] [raw]
Subject: Re: Linux 3.16-rc6

On Thu, Jul 24, 2014 at 02:25:13PM +0200, Borislav Petkov wrote:
> On Thu, Jul 24, 2014 at 10:41:27AM +0200, Borislav Petkov wrote:
> > you can easily reproduce by booting a kvm guest with rc6 + tip/master.
>
> Right, so reverting
>
> 586fefe5bbdc ("locking/selftest: Support queued rwlock")
> e0645a111cb4 ("locking/lockdep: Restrict the use of recursive read_lock() with qrwlock")
>
> from the top of tip/locking/core seems to fix the issue, with the kvm
> guests at least.

Well, it makes the report go away.. I'm currently leaning towards that
the report is valid. We did after all change rwlock semantics, and that
lockdep patch is making lockdep match those new semantics.

Of course, its also possible the lockdep patch is wrong. But I'm leaning
towards that the report is valid.

So going by the nifty picture rostedt made:

[ 61.454336] CPU0 CPU1
[ 61.454336] ---- ----
[ 61.454336] lock(&(&p->alloc_lock)->rlock);
[ 61.454336] local_irq_disable();
[ 61.454336] lock(tasklist_lock);
[ 61.454336] lock(&(&p->alloc_lock)->rlock);
[ 61.454336] <Interrupt>
[ 61.454336] lock(tasklist_lock);

the fact that CPU1 holds tasklist_lock for reading, does not
automagically allow CPU0 to acquire tasklist_lock for reading too, for
example if CPU2 (not in the picture) is waiting to acquire tasklist_lock
for writing, CPU0's read acquire is made to wait.

The only kind of recursion that's safe is same CPU interrupt.

Of course we should have made the lockdep change before merging qrwlock,
and that's entirely my fail, but with qrwlock in these new semantics are
already a reality.

We could of course disable qrwlock until all such issues are cleared up
(its the safe option)...


Attachments:
(No filename) (1.81 kB)
(No filename) (836.00 B)
Download all attachments

2014-07-24 16:34:12

by Borislav Petkov

[permalink] [raw]
Subject: Re: Linux 3.16-rc6

On Thu, Jul 24, 2014 at 02:58:14PM +0200, Peter Zijlstra wrote:
> Of course we should have made the lockdep change before merging qrwlock,
> and that's entirely my fail, but with qrwlock in these new semantics are
> already a reality.
>
> We could of course disable qrwlock until all such issues are cleared up
> (its the safe option)...

So we leave in the lockdep change and disable qrwlock ontop of
tip/locking/core. Nothing's hurt yet as this whole pile is queued for
3.17 anyway.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2014-07-24 18:18:20

by Linus Torvalds

[permalink] [raw]
Subject: Re: Linux 3.16-rc6

On Thu, Jul 24, 2014 at 5:58 AM, Peter Zijlstra <[email protected]> wrote:
>
> So going by the nifty picture rostedt made:
>
> [ 61.454336] CPU0 CPU1
> [ 61.454336] ---- ----
> [ 61.454336] lock(&(&p->alloc_lock)->rlock);
> [ 61.454336] local_irq_disable();
> [ 61.454336] lock(tasklist_lock);
> [ 61.454336] lock(&(&p->alloc_lock)->rlock);
> [ 61.454336] <Interrupt>
> [ 61.454336] lock(tasklist_lock);

So this *should* be fine. It always has been in the past, and it was
certainly the *intention* that it should continue to work with
qrwlock, even in the presense of pending writers on other cpu's.

The qrwlock rules are that a read-lock in an interrupt is still going
to be unfair and succeed if there are other readers.

> the fact that CPU1 holds tasklist_lock for reading, does not
> automagically allow CPU0 to acquire tasklist_lock for reading too, for
> example if CPU2 (not in the picture) is waiting to acquire tasklist_lock
> for writing, CPU0's read acquire is made to wait.

No.

That is true for qrwlock in general. But *not* in interrupt context.
In interrupt context, it's unfair. At least that was the _intent_ of
the code, maybe that got screwed up some way.

> The only kind of recursion that's safe is same CPU interrupt.

Any read-lock from an irq should still be unfair, no "same CPU" rules.
See queue_read_lock_slowpath(), where it will just wait for any actual
write lockers (not *waiting* writers) to go away. So by definition, of
somebody else (not just the current CPU) holds the lock for reading,
taking it for reading is safe on all cpu's in irq context, because we
obviously won't be waiting for any actual write lock holders.

So it sounds to me like the new lockdep rules in tip/master are too
strict and are throwing a false positive.

Linus

2014-07-24 18:36:57

by Peter Zijlstra

[permalink] [raw]
Subject: Re: Linux 3.16-rc6

On Thu, Jul 24, 2014 at 11:18:16AM -0700, Linus Torvalds wrote:
> On Thu, Jul 24, 2014 at 5:58 AM, Peter Zijlstra <[email protected]> wrote:
> >
> > So going by the nifty picture rostedt made:
> >
> > [ 61.454336] CPU0 CPU1
> > [ 61.454336] ---- ----
> > [ 61.454336] lock(&(&p->alloc_lock)->rlock);
> > [ 61.454336] local_irq_disable();
> > [ 61.454336] lock(tasklist_lock);
> > [ 61.454336] lock(&(&p->alloc_lock)->rlock);
> > [ 61.454336] <Interrupt>
> > [ 61.454336] lock(tasklist_lock);
>
> So this *should* be fine. It always has been in the past, and it was
> certainly the *intention* that it should continue to work with
> qrwlock, even in the presense of pending writers on other cpu's.
>
> The qrwlock rules are that a read-lock in an interrupt is still going
> to be unfair and succeed if there are other readers.

Ah, indeed. Should have checked :/

> So it sounds to me like the new lockdep rules in tip/master are too
> strict and are throwing a false positive.

Right. Waiman can you have a look?

2014-07-24 20:38:34

by Waiman Long

[permalink] [raw]
Subject: Re: Linux 3.16-rc6

On 07/24/2014 02:36 PM, Peter Zijlstra wrote:
> On Thu, Jul 24, 2014 at 11:18:16AM -0700, Linus Torvalds wrote:
>> On Thu, Jul 24, 2014 at 5:58 AM, Peter Zijlstra<[email protected]> wrote:
>>> So going by the nifty picture rostedt made:
>>>
>>> [ 61.454336] CPU0 CPU1
>>> [ 61.454336] ---- ----
>>> [ 61.454336] lock(&(&p->alloc_lock)->rlock);
>>> [ 61.454336] local_irq_disable();
>>> [ 61.454336] lock(tasklist_lock);
>>> [ 61.454336] lock(&(&p->alloc_lock)->rlock);
>>> [ 61.454336]<Interrupt>
>>> [ 61.454336] lock(tasklist_lock);
>> So this *should* be fine. It always has been in the past, and it was
>> certainly the *intention* that it should continue to work with
>> qrwlock, even in the presense of pending writers on other cpu's.
>>
>> The qrwlock rules are that a read-lock in an interrupt is still going
>> to be unfair and succeed if there are other readers.
> Ah, indeed. Should have checked :/
>
>> So it sounds to me like the new lockdep rules in tip/master are too
>> strict and are throwing a false positive.
> Right. Waiman can you have a look?

Yes, I think I may have a solution for that.

Borislav, can you apply the following patch on top of the lockdep patch
to see if it can fix the problem?

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index d24e433..507a8ce 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -3595,6 +3595,12 @@ void lock_acquire(struct lockdep_map *lock,
unsigned int
raw_local_irq_save(flags);
check_flags(flags);

+ /*
+ * An interrupt recursive read in interrupt context can be
considered
+ * to be the same as a recursive read from checking perspective.
+ */
+ if ((read == 3) && in_interrupt())
+ read = 2;
current->lockdep_recursion = 1;
trace_lock_acquire(lock, subclass, trylock, read, check,
nest_lock, ip);
__lock_acquire(lock, subclass, trylock, read, check,

2014-07-24 21:45:57

by Borislav Petkov

[permalink] [raw]
Subject: Re: Linux 3.16-rc6

On Thu, Jul 24, 2014 at 04:38:28PM -0400, Waiman Long wrote:
> Borislav, can you apply the following patch on top of the lockdep
> patch to see if it can fix the problem?

It is too late here for me to test anything but the ingridients to
reproduce are nothing special. Just grab a kvm guest and pick out the
locking or so options out of the .config I sent previously. Then boot it
a couple of times, it triggers pretty easy after a couple of tries.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2014-07-24 22:15:23

by John Stoffel

[permalink] [raw]
Subject: Re: Linux 3.16-rc6

>>>>> "Waiman" == Waiman Long <[email protected]> writes:

Waiman> On 07/24/2014 02:36 PM, Peter Zijlstra wrote:
>> On Thu, Jul 24, 2014 at 11:18:16AM -0700, Linus Torvalds wrote:
>>> On Thu, Jul 24, 2014 at 5:58 AM, Peter Zijlstra<[email protected]> wrote:
>>>> So going by the nifty picture rostedt made:
>>>>
>>>> [ 61.454336] CPU0 CPU1
>>>> [ 61.454336] ---- ----
>>>> [ 61.454336] lock(&(&p->alloc_lock)->rlock);
>>>> [ 61.454336] local_irq_disable();
>>>> [ 61.454336] lock(tasklist_lock);
>>>> [ 61.454336] lock(&(&p->alloc_lock)->rlock);
>>>> [ 61.454336]<Interrupt>
>>>> [ 61.454336] lock(tasklist_lock);
>>> So this *should* be fine. It always has been in the past, and it was
>>> certainly the *intention* that it should continue to work with
>>> qrwlock, even in the presense of pending writers on other cpu's.
>>>
>>> The qrwlock rules are that a read-lock in an interrupt is still going
>>> to be unfair and succeed if there are other readers.
>> Ah, indeed. Should have checked :/
>>
>>> So it sounds to me like the new lockdep rules in tip/master are too
>>> strict and are throwing a false positive.
>> Right. Waiman can you have a look?

Waiman> Yes, I think I may have a solution for that.

Waiman> Borislav, can you apply the following patch on top of the lockdep patch
Waiman> to see if it can fix the problem?

Waiman> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
Waiman> index d24e433..507a8ce 100644
Waiman> --- a/kernel/locking/lockdep.c
Waiman> +++ b/kernel/locking/lockdep.c
Waiman> @@ -3595,6 +3595,12 @@ void lock_acquire(struct lockdep_map *lock,
Waiman> unsigned int
Waiman> raw_local_irq_save(flags);
Waiman> check_flags(flags);

Waiman> + /*
Waiman> + * An interrupt recursive read in interrupt context can be
Waiman> considered
Waiman> + * to be the same as a recursive read from checking perspective.
Waiman> + */
Waiman> + if ((read == 3) && in_interrupt())
Waiman> + read = 2;
current-> lockdep_recursion = 1;
Waiman> trace_lock_acquire(lock, subclass, trylock, read, check,
Waiman> nest_lock, ip);
Waiman> __lock_acquire(lock, subclass, trylock, read, check,

Instead of the magic numbers 1,2,3, could you use some nicely named
constants here instead?

John

2014-07-25 16:10:24

by Peter Zijlstra

[permalink] [raw]
Subject: Re: Linux 3.16-rc6

On Thu, Jul 24, 2014 at 04:38:28PM -0400, Waiman Long wrote:
> Yes, I think I may have a solution for that.
>
> Borislav, can you apply the following patch on top of the lockdep patch to
> see if it can fix the problem?
>
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index d24e433..507a8ce 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -3595,6 +3595,12 @@ void lock_acquire(struct lockdep_map *lock, unsigned int
> raw_local_irq_save(flags);
> check_flags(flags);
>
> + /*
> + * An interrupt recursive read in interrupt context can be considered
> + * to be the same as a recursive read from checking perspective.
> + */
> + if ((read == 3) && in_interrupt())
> + read = 2;
> current->lockdep_recursion = 1;
> trace_lock_acquire(lock, subclass, trylock, read, check, nest_lock, ip);
> __lock_acquire(lock, subclass, trylock, read, check,

Just had another look at the initial patch and it cannot be right, even
with the above.

The problem is you cannot use in_interrupt() in check_deadlock().
Check_deadlock() must be context invariant, it should only test the
chain state and not rely on where or when its called.



Attachments:
(No filename) (1.23 kB)
(No filename) (836.00 B)
Download all attachments

2014-07-25 17:23:26

by Waiman Long

[permalink] [raw]
Subject: Re: Linux 3.16-rc6

On 07/24/2014 05:45 PM, Borislav Petkov wrote:
> On Thu, Jul 24, 2014 at 04:38:28PM -0400, Waiman Long wrote:
>> Borislav, can you apply the following patch on top of the lockdep
>> patch to see if it can fix the problem?
> It is too late here for me to test anything but the ingridients to
> reproduce are nothing special. Just grab a kvm guest and pick out the
> locking or so options out of the .config I sent previously. Then boot it
> a couple of times, it triggers pretty easy after a couple of tries.
>

Thank for the reply.

I was not able to reproduce the read_lock lockdep problem that you see
in your test machine. However, I saw the following lockdep warning on a
mutex when I enabled CONFIG_PROVE_LOCKING=y and CONFIG_LOCKDEP=y. My
virtual machine is based on RHEL 7 beta with xfs filesystem, and the
error happens for both plain 3.16-rc6 and tip/master+3.16-rc6.


[ 7.821723] ======================================================
[ 7.821725] [ INFO: possible circular locking dependency detected ]
[ 7.821727] 3.16.0-rc6 #4 Not tainted
[ 7.821727] -------------------------------------------------------
[ 7.821728] kworker/3:1/276 is trying to acquire lock:
[ 7.821747] (&qdev->release_mutex){+.+.+.}, at: [<ffffffffa0151a2c>]
qxl_alloc_release_reserved+0x6c/0x200 [qxl]
[ 7.821749]
[ 7.821749] but task is already holding lock:
[ 7.821763] (&fbdefio->lock){+.+.+.}, at: [<ffffffff8135ea65>]
fb_deferred_io_work+0x35/0xd0
[ 7.821765]
[ 7.821765] which lock already depends on the new lock.
[ 7.821765]
[ 7.821766]
[ 7.821766] the existing dependency chain (in reverse order) is:
[ 7.821770]
[ 7.821770] -> #4 (&fbdefio->lock){+.+.+.}:
[ 7.821781] [<ffffffff810b2078>] __lock_acquire+0x3a8/0xc40
[ 7.821785] [<ffffffff810b29c0>] lock_acquire+0xb0/0x140
[ 7.821793] [<ffffffff81636add>] mutex_lock_nested+0x5d/0x4d0
[ 7.821796] [<ffffffff8135e866>] fb_deferred_io_mkwrite+0x46/0x120
[ 7.821803] [<ffffffff8118534d>] do_page_mkwrite+0x3d/0x70
[ 7.821807] [<ffffffff81188376>]
do_shared_fault.isra.52+0x66/0x1d0
[ 7.821811] [<ffffffff81189524>] handle_mm_fault+0x474/0x1080
[ 7.821819] [<ffffffff81048ea1>] __do_page_fault+0x191/0x530
[ 7.821822] [<ffffffff810492f1>] trace_do_page_fault+0x41/0x100
[ 7.821828] [<ffffffff81043569>] do_async_page_fault+0x29/0xe0
[ 7.821834] [<ffffffff8163d608>] async_page_fault+0x28/0x30
[ 7.821841]
[ 7.821841] -> #3 (&mm->mmap_sem){++++++}:
[ 7.821844] [<ffffffff810b2078>] __lock_acquire+0x3a8/0xc40
[ 7.821846] [<ffffffff810b29c0>] lock_acquire+0xb0/0x140
[ 7.821849] [<ffffffff81185220>] might_fault+0x70/0xa0
[ 7.821858] [<ffffffff811e4ad1>] filldir+0x91/0x120
[ 7.821904] [<ffffffffa018253e>]
xfs_dir2_block_getdents.isra.12+0x1ae/0x200 [xfs]
[ 7.821916] [<ffffffffa01826f9>] xfs_readdir+0x109/0x150 [xfs]
[ 7.821928] [<ffffffffa018436b>] xfs_file_readdir+0x2b/0x40 [xfs]
[ 7.821931] [<ffffffff811e48be>] iterate_dir+0xae/0x140
[ 7.821933] [<ffffffff811e4dba>] SyS_getdents+0x8a/0x120
[ 7.821935] [<ffffffff8163b6e9>] system_call_fastpath+0x16/0x1b
[ 7.821937]
[ 7.821937] -> #2 (&xfs_dir_ilock_class){++++.+}:
[ 7.821940] [<ffffffff810b2078>] __lock_acquire+0x3a8/0xc40
[ 7.821942] [<ffffffff810b29c0>] lock_acquire+0xb0/0x140
[ 7.821945] [<ffffffff810abd74>] down_read_nested+0x44/0x90
[ 7.821965] [<ffffffffa01c8d82>] xfs_ilock+0xd2/0x100 [xfs]
[ 7.821982] [<ffffffffa01c8e24>]
xfs_ilock_attr_map_shared+0x34/0x40 [xfs]
[ 7.821997] [<ffffffffa019e5a7>] xfs_attr_get+0xb7/0x160 [xfs]
[ 7.822013] [<ffffffffa01986a7>] xfs_xattr_get+0x37/0x50 [xfs]
[ 7.822013] [<ffffffff811f6a7f>] generic_getxattr+0x4f/0x70
[ 7.822013] [<ffffffff8127ce60>]
inode_doinit_with_dentry+0x150/0x640
[ 7.822013] [<ffffffff8127d428>] sb_finish_set_opts+0xd8/0x270
[ 7.822013] [<ffffffff8127d84f>] selinux_set_mnt_opts+0x28f/0x5e0
[ 7.822013] [<ffffffff8127dc08>] superblock_doinit+0x68/0xd0
[ 7.822013] [<ffffffff8127dc80>] delayed_superblock_init+0x10/0x20
[ 7.822013] [<ffffffff811d45c2>] iterate_supers+0xb2/0x110
[ 7.822013] [<ffffffff8127e713>] selinux_complete_init+0x33/0x40
[ 7.822013] [<ffffffff8128cea4>] security_load_policy+0xf4/0x600
[ 7.822013] [<ffffffff8128008c>] sel_write_load+0xac/0x750
[ 7.822013] [<ffffffff811d0b9a>] vfs_write+0xba/0x1f0
[ 7.822013] [<ffffffff811d1749>] SyS_write+0x49/0xb0
[ 7.822013] [<ffffffff8163b6e9>] system_call_fastpath+0x16/0x1b
[ 7.822013]
[ 7.822013] -> #1 (&isec->lock){+.+.+.}:
[ 7.822013] [<ffffffff810b2078>] __lock_acquire+0x3a8/0xc40
[ 7.822013] [<ffffffff810b29c0>] lock_acquire+0xb0/0x140
[ 7.822013] [<ffffffff81636add>] mutex_lock_nested+0x5d/0x4d0
[ 7.822013] [<ffffffff8127cdb5>]
inode_doinit_with_dentry+0xa5/0x640
[ 7.822013] [<ffffffff8127defc>] selinux_d_instantiate+0x1c/0x20
[ 7.822013] [<ffffffff812732fb>] security_d_instantiate+0x1b/0x30
[ 7.822013] [<ffffffff811e8810>] d_instantiate+0x50/0x70
[ 7.822013] [<ffffffff811747f0>] __shmem_file_setup+0xe0/0x1d0
[ 7.822013] [<ffffffff811748f0>] shmem_file_setup+0x10/0x20
[ 7.822013] [<ffffffffa00d718b>] drm_gem_object_init+0x2b/0x40
[drm]
[ 7.822013] [<ffffffffa014cfce>] qxl_bo_create+0x7e/0x1a0 [qxl]
[ 7.822013] [<ffffffffa0151b50>]
qxl_alloc_release_reserved+0x190/0x200 [qxl]
[ 7.822013] [<ffffffffa014f6ec>] qxl_draw_opaque_fb+0x7c/0x390
[qxl]
[ 7.822013] [<ffffffffa014bc8e>]
qxl_fb_imageblit_internal+0x3e/0x40 [qxl]
[ 7.822013] [<ffffffffa014c17e>] qxl_fb_imageblit+0x6e/0x1a0 [qxl]
[ 7.822013] [<ffffffff813522a4>] soft_cursor+0x1b4/0x250
[ 7.822013] [<ffffffff81351b63>] bit_cursor+0x623/0x660
[ 7.822013] [<ffffffff8134df5b>] fbcon_cursor+0x13b/0x1c0
[ 7.822013] [<ffffffff813b5c98>] hide_cursor+0x28/0xa0
[ 7.822013] [<ffffffff813b77a8>] redraw_screen+0x168/0x240
[ 7.822013] [<ffffffff813b8181>] vc_do_resize+0x481/0x4b0
[ 7.822013] [<ffffffff813b81cf>] vc_resize+0x1f/0x30
[ 7.822013] [<ffffffff8134fb1c>] fbcon_init+0x35c/0x590
[ 7.822013] [<ffffffff813b5ef8>] visual_init+0xb8/0x120
[ 7.822013] [<ffffffff813b85a3>] do_bind_con_driver+0x163/0x330
[ 7.822013] [<ffffffff813b8d44>] do_take_over_console+0x114/0x1c0
[ 7.822013] [<ffffffff8134b3c3>] do_fbcon_takeover+0x63/0xd0
[ 7.822013] [<ffffffff813507a5>] fbcon_event_notify+0x6b5/0x800
[ 7.822013] [<ffffffff8108628c>] notifier_call_chain+0x4c/0x70
[ 7.822013] [<ffffffff81086553>]
__blocking_notifier_call_chain+0x53/0x70
[ 7.822013] [<ffffffff81086586>]
blocking_notifier_call_chain+0x16/0x20
[ 7.822013] [<ffffffff81356e0b>] fb_notifier_call_chain+0x1b/0x20
[ 7.822013] [<ffffffff8135905c>] register_framebuffer+0x1ec/0x330
[ 7.822013] [<ffffffffa013cb9e>]
drm_fb_helper_initial_config+0x2fe/0x4b0 [drm_kms_helper]
[ 7.822013] [<ffffffffa014ccfb>] qxl_fbdev_init+0xab/0xd0 [qxl]
[ 7.822013] [<ffffffffa014aded>] qxl_modeset_init+0x1fd/0x240
[qxl]
[ 7.822013] [<ffffffffa0148ca8>] qxl_driver_load+0x88/0xc0 [qxl]
[ 7.822013] [<ffffffffa00db90d>] drm_dev_register+0xad/0x100 [drm]
[ 7.822013] [<ffffffffa00de46f>] drm_get_pci_dev+0x8f/0x1f0 [drm]
[ 7.822013] [<ffffffffa01482ab>] qxl_pci_probe+0x1b/0x40 [qxl]
[ 7.822013] [<ffffffff8132ad65>] local_pci_probe+0x45/0xa0
[ 7.822013] [<ffffffff8132c031>] pci_device_probe+0xd1/0x130
[ 7.822013] [<ffffffff813e6f80>] driver_probe_device+0x90/0x3c0
[ 7.822013] [<ffffffff813e7383>] __driver_attach+0x93/0xa0
[ 7.822013] [<ffffffff813e4f2b>] bus_for_each_dev+0x6b/0xb0
[ 7.822013] [<ffffffff813e69ee>] driver_attach+0x1e/0x20
[ 7.822013] [<ffffffff813e65f8>] bus_add_driver+0x188/0x260
[ 7.822013] [<ffffffff813e7b54>] driver_register+0x64/0xf0
[ 7.822013] [<ffffffff8132a6f0>] __pci_register_driver+0x60/0x70
[ 7.822013] [<ffffffffa00de6da>] drm_pci_init+0x10a/0x140 [drm]
[ 7.822013] [<ffffffffa015b03e>]
cdrom_dummy_generic_packet+0x3e/0x40 [cdrom]
[ 7.822013] [<ffffffff810002fc>] do_one_initcall+0xbc/0x200
[ 7.822013] [<ffffffff810e714d>] load_module+0x162d/0x1a90
[ 7.822013] [<ffffffff810e7746>] SyS_finit_module+0x86/0xb0
[ 7.822013] [<ffffffff8163b6e9>] system_call_fastpath+0x16/0x1b
[ 7.822013]
[ 7.822013] -> #0 (&qdev->release_mutex){+.+.+.}:
[ 7.822013] [<ffffffff810b030c>]
validate_chain.isra.36+0x110c/0x11b0
[ 7.822013] [<ffffffff810b2078>] __lock_acquire+0x3a8/0xc40
[ 7.822013] [<ffffffff810b29c0>] lock_acquire+0xb0/0x140
[ 7.822013] [<ffffffff81636add>] mutex_lock_nested+0x5d/0x4d0
[ 7.822013] [<ffffffffa0151a2c>]
qxl_alloc_release_reserved+0x6c/0x200 [qxl]
[ 7.822013] [<ffffffffa014f6ec>] qxl_draw_opaque_fb+0x7c/0x390
[qxl]
[ 7.822013] [<ffffffffa014bdb1>]
qxl_fb_dirty_flush+0x121/0x160 [qxl]
[ 7.822013] [<ffffffffa014be90>] qxl_deferred_io+0xa0/0xb0 [qxl]
[ 7.822013] [<ffffffff8135eaae>] fb_deferred_io_work+0x7e/0xd0
[ 7.822013] [<ffffffff8107a0b5>] process_one_work+0x1f5/0x510
[ 7.822013] [<ffffffff8107ab4d>] worker_thread+0x11d/0x520
[ 7.822013] [<ffffffff81081760>] kthread+0xf0/0x110
[ 7.822013] [<ffffffff8163b63c>] ret_from_fork+0x7c/0xb0
[ 7.822013]
[ 7.822013] other info that might help us debug this:
[ 7.822013]
[ 7.822013] Chain exists of:
[ 7.822013] &qdev->release_mutex --> &mm->mmap_sem --> &fbdefio->lock
[ 7.822013]
[ 7.822013] Possible unsafe locking scenario:
[ 7.822013]
[ 7.822013] CPU0 CPU1
[ 7.822013] ---- ----
[ 7.822013] lock(&fbdefio->lock);
[ 7.822013] lock(&mm->mmap_sem);
[ 7.822013] lock(&fbdefio->lock);
[ 7.822013] lock(&qdev->release_mutex);
[ 7.822013]
[ 7.822013] *** DEADLOCK ***
[ 7.822013]
[ 7.822013] 3 locks held by kworker/3:1/276:
[ 7.822013] #0: ("events"){.+.+.+}, at: [<ffffffff8107a053>]
process_one_work+0x193/0x510
[ 7.822013] #1: ((&(&info->deferred_work)->work)){+.+.+.}, at:
[<ffffffff8107a053>] process_one_work+0x193/0x510
[ 7.822013] #2: (&fbdefio->lock){+.+.+.}, at: [<ffffffff8135ea65>]
fb_deferred_io_work+0x35/0xd0
[ 7.822013]
[ 7.822013] stack backtrace:
[ 7.822013] CPU: 3 PID: 276 Comm: kworker/3:1 Not tainted 3.16.0-rc6 #4
[ 7.822013] Hardware name: Red Hat KVM, BIOS Bochs 01/01/2011
[ 7.822013] Workqueue: events fb_deferred_io_work
[ 7.822013] ffffffff8268aa90 ffff8807df2ff8e8 ffffffff8163182a
ffffffff826ba5f0
[ 7.822013] ffff8807df2ff928 ffffffff8162b0d5 ffff8807df2ff960
ffff8807df2ddb50
[ 7.822013] ffff8807df2ddb50 0000000000000002 ffff8807df2dcec0
0000000000000003
[ 7.822013] Call Trace:
[ 7.822013] [<ffffffff8163182a>] dump_stack+0x45/0x56
[ 7.822013] [<ffffffff8162b0d5>] print_circular_bug+0x1f9/0x207
[ 7.822013] [<ffffffff810b030c>] validate_chain.isra.36+0x110c/0x11b0
[ 7.822013] [<ffffffff810ad89b>] ?
add_lock_to_list.isra.22.constprop.43+0x7b/0xf0
[ 7.822013] [<ffffffff8100c1b9>] ? sched_clock+0x9/0x10
[ 7.822013] [<ffffffff810b2078>] __lock_acquire+0x3a8/0xc40
[ 7.822013] [<ffffffff810b29c0>] lock_acquire+0xb0/0x140
[ 7.822013] [<ffffffffa0151a2c>] ?
qxl_alloc_release_reserved+0x6c/0x200 [qxl]
[ 7.822013] [<ffffffff81636add>] mutex_lock_nested+0x5d/0x4d0
[ 7.822013] [<ffffffffa0151a2c>] ?
qxl_alloc_release_reserved+0x6c/0x200 [qxl]
[ 7.822013] [<ffffffffa0151a2c>] ?
qxl_alloc_release_reserved+0x6c/0x200 [qxl]
[ 7.822013] [<ffffffff8163a957>] ? _raw_spin_unlock+0x27/0x30
[ 7.822013] [<ffffffffa0151598>] ? qxl_release_alloc+0x98/0x100 [qxl]
[ 7.822013] [<ffffffffa0151a2c>]
qxl_alloc_release_reserved+0x6c/0x200 [qxl]
[ 7.822013] [<ffffffffa014f6ec>] qxl_draw_opaque_fb+0x7c/0x390 [qxl]
[ 7.822013] [<ffffffffa014e3e3>] ? qxl_io_log+0x63/0x70 [qxl]
[ 7.822013] [<ffffffffa014bdb1>] qxl_fb_dirty_flush+0x121/0x160 [qxl]
[ 7.822013] [<ffffffffa014be90>] qxl_deferred_io+0xa0/0xb0 [qxl]
[ 7.822013] [<ffffffff8135eaae>] fb_deferred_io_work+0x7e/0xd0
[ 7.822013] [<ffffffff8107a0b5>] process_one_work+0x1f5/0x510
[ 7.822013] [<ffffffff8107a053>] ? process_one_work+0x193/0x510
[ 7.822013] [<ffffffff8107ab4d>] worker_thread+0x11d/0x520
[ 7.822013] [<ffffffff8107aa30>] ? create_and_start_worker+0x60/0x60
[ 7.822013] [<ffffffff81081760>] kthread+0xf0/0x110
[ 7.822013] [<ffffffff81081670>] ? kthread_create_on_node+0x220/0x220
[ 7.822013] [<ffffffff8163b63c>] ret_from_fork+0x7c/0xb0
[ 7.822013] [<ffffffff81081670>] ? kthread_create_on_node+0x220/0x220
[ 7.988039] hardirqs last enabled at (361): [<ffffffff81048f85>]
__do_page_fault+0x275/0x530
[ 7.988039] hardirqs last disabled at (360): [<ffffffff8163d826>]
error_sti+0x5/0x6
[ 7.988039] softirqs last enabled at (0): [<ffffffff8105ad8b>]
copy_process.part.22+0x66b/0x1d40
[ 7.988039] softirqs last disabled at (0): [<
(null)>] (null)

2014-07-28 16:37:22

by Waiman Long

[permalink] [raw]
Subject: Re: Linux 3.16-rc6

On 07/25/2014 12:10 PM, Peter Zijlstra wrote:
> On Thu, Jul 24, 2014 at 04:38:28PM -0400, Waiman Long wrote:
>> Yes, I think I may have a solution for that.
>>
>> Borislav, can you apply the following patch on top of the lockdep patch to
>> see if it can fix the problem?
>>
>> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
>> index d24e433..507a8ce 100644
>> --- a/kernel/locking/lockdep.c
>> +++ b/kernel/locking/lockdep.c
>> @@ -3595,6 +3595,12 @@ void lock_acquire(struct lockdep_map *lock, unsigned int
>> raw_local_irq_save(flags);
>> check_flags(flags);
>>
>> + /*
>> + * An interrupt recursive read in interrupt context can be considered
>> + * to be the same as a recursive read from checking perspective.
>> + */
>> + if ((read == 3)&& in_interrupt())
>> + read = 2;
>> current->lockdep_recursion = 1;
>> trace_lock_acquire(lock, subclass, trylock, read, check, nest_lock, ip);
>> __lock_acquire(lock, subclass, trylock, read, check,
> Just had another look at the initial patch and it cannot be right, even
> with the above.
>
> The problem is you cannot use in_interrupt() in check_deadlock().
> Check_deadlock() must be context invariant, it should only test the
> chain state and not rely on where or when its called.
>
>

I am planning to take out the check in check_deadlock and only have the
test in lock_acquire which change a 3 to 2 when in interrupt context.
Now my question is whether to do it as a new patch on top of the
existing one in tip or a total replacement. I also intend to use
symbolic names for the read states for better readability as suggested
by John.

-Longman

2014-07-28 16:42:35

by Peter Zijlstra

[permalink] [raw]
Subject: Re: Linux 3.16-rc6

On Mon, Jul 28, 2014 at 12:37:14PM -0400, Waiman Long wrote:

> I am planning to take out the check in check_deadlock and only have the test
> in lock_acquire which change a 3 to 2 when in interrupt context. Now my
> question is whether to do it as a new patch on top of the existing one in
> tip or a total replacement. I also intend to use symbolic names for the read
> states for better readability as suggested by John.

Send new patches, the patches magically went away from tip.

I don't care too much about the symbolic thing, partly because the
actual value is not irrelevant seeing how we're peddling with bitfields.

Also, its an unrelated cleanup at best.

When you do re-submit extend the locking self test scenarios to cover
the new semantics as well.


Attachments:
(No filename) (765.00 B)
(No filename) (836.00 B)
Download all attachments