2020-07-08 13:40:06

by Lukas Wunner

[permalink] [raw]
Subject: [PATCH 3/3] driver core: Avoid adding children below a dead parent

If CONFIG_OF_DYNAMIC or CONFIG_ACPI is enabled, SPI devices may be added
below a controller at runtime by a DeviceTree overlay or DSDT patch.
But there are no precautions to prevent adding a device below a
controller that's being removed.

This seems like something that should be guarded against in the driver
core because it's not specific to SPI: Adding a child below a parent
that's going away seems like a bad idea regardless of the bus type.

Take advantage of kill_device() which was added by commit 00289cd87676
("drivers/base: Introduce kill_device()"), call it upon removal of an
SPI controller and teach the driver core to refuse device addition below
a killed parent. To make this race-free, device_add() needs to take the
parent's dead_sem before checking its "dead" flag and until the child
device has been added to the parent's klist_children.

Signed-off-by: Lukas Wunner <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Pantelis Antoniou <[email protected]>
Cc: Alexander Duyck <[email protected]>
---
drivers/base/core.c | 18 ++++++++++++++++--
drivers/spi/spi.c | 3 +++
2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 057da42b1a660..1d4e39696f996 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -2597,6 +2597,14 @@ int device_add(struct device *dev)
pr_debug("device: '%s': %s\n", dev_name(dev), __func__);

parent = get_device(dev->parent);
+ if (parent) {
+ down_read(&parent->p->dead_sem);
+ if (parent->p->dead) {
+ error = -ENODEV;
+ goto parent_error;
+ }
+ }
+
kobj = get_device_parent(dev, parent);
if (IS_ERR(kobj)) {
error = PTR_ERR(kobj);
@@ -2679,9 +2687,11 @@ int device_add(struct device *dev)
}

bus_probe_device(dev);
- if (parent)
+ if (parent) {
klist_add_tail(&dev->p->knode_parent,
&parent->p->klist_children);
+ up_read(&parent->p->dead_sem);
+ }

if (dev->class) {
mutex_lock(&dev->class->p->mutex);
@@ -2722,6 +2732,8 @@ int device_add(struct device *dev)
Error:
cleanup_glue_dir(dev, glue_dir);
parent_error:
+ if (parent)
+ up_read(&parent->p->dead_sem);
put_device(parent);
name_error:
kfree(dev->p);
@@ -2785,7 +2797,9 @@ EXPORT_SYMBOL_GPL(put_device);
* kill_device - declare device dead
* @dev: device in question
*
- * Declare @dev dead to prevent it from binding to a driver.
+ * Declare @dev dead to prevent it from binding to a driver and
+ * to prevent addition of children.
+ *
* Return true if it was killed or false if it was already dead.
*/
bool kill_device(struct device *dev)
diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 8158e281f3540..005eca4bae089 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -2764,6 +2764,9 @@ void spi_unregister_controller(struct spi_controller *ctlr)
struct spi_controller *found;
int id = ctlr->bus_num;

+ /* Prevent addition of new children, then remove existing ones */
+ if (IS_ENABLED(CONFIG_OF_DYNAMIC) || IS_ENABLED(CONFIG_ACPI))
+ kill_device(&ctlr->dev);
device_for_each_child(&ctlr->dev, NULL, __unregister);

/* First make sure that this controller was ever added */
--
2.27.0


2020-07-24 14:35:17

by Chen, Rong A

[permalink] [raw]
Subject: [driver core] e3b1cb5c89: WARNING:possible_recursive_locking_detected

Greeting,

FYI, we noticed the following commit (built with gcc-9):

commit: e3b1cb5c896ba748d8f848238c8ea1f89520bde3 ("[PATCH 3/3] driver core: Avoid adding children below a dead parent")
url: https://github.com/0day-ci/linux/commits/Lukas-Wunner/Fix-races-on-device-removal/20200708-214117
base: https://git.kernel.org/cgit/linux/kernel/git/gregkh/driver-core.git e5711945c6415ab847b39ef89003ef68f9435bc6

in testcase: trinity
with following parameters:

runtime: 300s

test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/


on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


+-------------------------------------------------------+------------+------------+
| | 1cf28805e6 | e3b1cb5c89 |
+-------------------------------------------------------+------------+------------+
| boot_successes | 0 | 0 |
| boot_failures | 10 | 4 |
| WARNING:possible_circular_locking_dependency_detected | 10 | |
| BUG:workqueue_lockup-pool | 1 | 1 |
| INFO:rcu_sched_detected_stalls_on_CPUs/tasks | 1 | |
| RIP:do_syscall_64 | 1 | |
| BUG:soft_lockup-CPU##stuck_for#s![kworker:#:#] | 1 | |
| RIP:smp_call_function_single | 1 | |
| Kernel_panic-not_syncing:softlockup:hung_tasks | 1 | |
| WARNING:possible_recursive_locking_detected | 0 | 4 |
+-------------------------------------------------------+------------+------------+


If you fix the issue, kindly add following tag
Reported-by: kernel test robot <[email protected]>


[ 1.392584] WARNING: possible recursive locking detected
[ 1.393350] 5.8.0-rc3-00011-ge3b1cb5c896ba7 #1 Not tainted
[ 1.393350] --------------------------------------------
[ 1.393350] swapper/0/1 is trying to acquire lock:
[ 1.393350] ffff88841fc6ff70 (&dev->p->dead_sem){.+.+}-{3:3}, at: __device_attach+0x51/0x1a0
[ 1.393350]
[ 1.393350] but task is already holding lock:
[ 1.393350] ffff888107f42770 (&dev->p->dead_sem){.+.+}-{3:3}, at: device_add+0xf8/0x890
[ 1.393350]
[ 1.393350] other info that might help us debug this:
[ 1.393350] Possible unsafe locking scenario:
[ 1.393350]
[ 1.393350] CPU0
[ 1.393350] ----
[ 1.393350] lock(&dev->p->dead_sem);
[ 1.393350]
[ 1.393350] *** DEADLOCK ***
[ 1.393350]
[ 1.393350] May be due to missing lock nesting notation
[ 1.393350]
[ 1.393350] 2 locks held by swapper/0/1:
[ 1.393350] #0: ffff888107f42770 (&dev->p->dead_sem){.+.+}-{3:3}, at: device_add+0xf8/0x890
[ 1.393350] #1: ffff88841fc6b188 (&dev->mutex){....}-{3:3}, at: __device_attach+0x3e/0x1a0
[ 1.393350]
[ 1.393350] stack backtrace:
[ 1.393350] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc3-00011-ge3b1cb5c896ba7 #1
[ 1.393350] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 1.393350] Call Trace:
[ 1.393350] dump_stack+0x81/0xba
[ 1.393350] validate_chain.cold+0x132/0x167
[ 1.393350] __lock_acquire+0x570/0xb60
[ 1.393350] lock_acquire+0xba/0x390
[ 1.393350] ? __device_attach+0x51/0x1a0
[ 1.393350] down_read+0x47/0x140
[ 1.393350] ? __device_attach+0x51/0x1a0
[ 1.393350] __device_attach+0x51/0x1a0
[ 1.393350] device_initial_probe+0xe/0x10
[ 1.393350] bus_probe_device+0x98/0xb0
[ 1.393350] device_add+0x421/0x890
[ 1.393350] platform_device_add+0xee/0x240
[ 1.393350] regulator_dummy_init+0x35/0x83
[ 1.393350] regulator_init+0x89/0x9c
[ 1.393350] ? regulator_init_complete+0x25/0x25
[ 1.393350] do_one_initcall+0x63/0x330
[ 1.393350] ? rcu_read_lock_sched_held+0x4a/0x80
[ 1.393350] kernel_init_freeable+0x1fb/0x25f
[ 1.393350] ? rest_init+0x24d/0x24d
[ 1.393350] kernel_init+0x9/0x103
[ 1.393350] ret_from_fork+0x1f/0x30
[ 1.393605] thermal_sys: Registered thermal governor 'step_wise'
[ 1.393919] NET: Registered protocol family 16
[ 1.396950] EISA bus registered
[ 1.397389] cpuidle: using governor ladder
[ 1.398250] cpuidle: using governor teo
[ 1.399807] ACPI: bus type PCI registered
[ 1.400804] PCI: Using configuration type 1 for base access
[ 1.417603] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages
[ 1.433475] cryptd: max_cpu_qlen set to 1000
[ 1.641498] ACPI: Added _OSI(Module Device)
[ 1.642377] ACPI: Added _OSI(Processor Device)
[ 1.643143] ACPI: Added _OSI(3.0 _SCP Extensions)
[ 1.643964] ACPI: Added _OSI(Processor Aggregator Device)
[ 1.644961] ACPI: Added _OSI(Linux-Dell-Video)
[ 1.645365] ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
[ 1.646421] ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics)
[ 1.652593] ACPI: 1 ACPI AML tables successfully acquired and loaded
[ 1.658595] ACPI: Interpreter enabled
[ 1.659418] ACPI: (supports S0 S5)
[ 1.660097] ACPI: Using IOAPIC for interrupt routing
[ 1.661137] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[ 1.661810] ACPI: Enabled 2 GPEs in block 00 to 0F
[ 1.679128] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
[ 1.680444] acpi PNP0A03:00: _OSC: OS supports [ASPM ClockPM Segments HPX-Type3]
[ 1.681408] acpi PNP0A03:00: fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge.
[ 1.684010] PCI host bridge to bus 0000:00
[ 1.684843] pci_bus 0000:00: root bus resource [io 0x0000-0x0cf7 window]
[ 1.685366] pci_bus 0000:00: root bus resource [io 0x0d00-0xffff window]
[ 1.686820] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window]
[ 1.688259] pci_bus 0000:00: root bus resource [mem 0xc0000000-0xfebfffff window]
[ 1.689367] pci_bus 0000:00: root bus resource [mem 0x440000000-0x4bfffffff window]
[ 1.691002] pci_bus 0000:00: root bus resource [bus 00-ff]
[ 1.692175] pci 0000:00:00.0: [8086:1237] type 00 class 0x060000
[ 1.694191] pci 0000:00:01.0: [8086:7000] type 00 class 0x060100
[ 1.696487] pci 0000:00:01.1: [8086:7010] type 00 class 0x010180
[ 1.705090] pci 0000:00:01.1: reg 0x20: [io 0xc200-0xc20f]
[ 1.708092] pci 0000:00:01.1: legacy IDE quirk: reg 0x10: [io 0x01f0-0x01f7]
[ 1.709365] pci 0000:00:01.1: legacy IDE quirk: reg 0x14: [io 0x03f6]
[ 1.710633] pci 0000:00:01.1: legacy IDE quirk: reg 0x18: [io 0x0170-0x0177]
[ 1.712030] pci 0000:00:01.1: legacy IDE quirk: reg 0x1c: [io 0x0376]
[ 1.713916] pci 0000:00:01.3: [8086:7113] type 00 class 0x068000
[ 1.715635] pci 0000:00:01.3: quirk: [io 0x0600-0x063f] claimed by PIIX4 ACPI
[ 1.717241] pci 0000:00:01.3: quirk: [io 0x0700-0x070f] claimed by PIIX4 SMB
[ 1.717974] pci 0000:00:02.0: [1234:1111] type 00 class 0x030000
[ 1.722816] pci 0000:00:02.0: reg 0x10: [mem 0xfd000000-0xfdffffff pref]
[ 1.730758] pci 0000:00:02.0: reg 0x18: [mem 0xfebf0000-0xfebf0fff]
[ 1.744295] pci 0000:00:02.0: reg 0x30: [mem 0xfebe0000-0xfebeffff pref]
[ 1.746003] pci 0000:00:03.0: [8086:100e] type 00 class 0x020000
[ 1.750471] pci 0000:00:03.0: reg 0x10: [mem 0xfebc0000-0xfebdffff]
[ 1.753367] pci 0000:00:03.0: reg 0x14: [io 0xc000-0xc03f]
[ 1.769371] pci 0000:00:03.0: reg 0x30: [mem 0xfeb80000-0xfebbffff pref]
[ 1.771401] pci 0000:00:04.0: [1af4:1001] type 00 class 0x010000


To reproduce:

# build kernel
cd linux
cp config-5.8.0-rc3-00011-ge3b1cb5c896ba7 .config
make HOSTCC=gcc-9 CC=gcc-9 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email



Thanks,
Rong Chen


Attachments:
(No filename) (8.12 kB)
config-5.8.0-rc3-00011-ge3b1cb5c896ba7 (142.14 kB)
job-script (4.54 kB)
dmesg.xz (15.55 kB)
trinity (59.20 kB)
Download all attachments

2020-07-25 11:10:54

by Lukas Wunner

[permalink] [raw]
Subject: Re: [driver core] e3b1cb5c89: WARNING:possible_recursive_locking_detected

On Fri, Jul 24, 2020 at 10:29:50PM +0800, kernel test robot wrote:
> commit: e3b1cb5c896ba748d8f848238c8ea1f89520bde3 ("[PATCH 3/3] driver core: Avoid adding children below a dead parent")
[...]
> [ 1.392584] WARNING: possible recursive locking detected
> [ 1.393350] 5.8.0-rc3-00011-ge3b1cb5c896ba7 #1 Not tainted
> [ 1.393350] --------------------------------------------
> [ 1.393350] swapper/0/1 is trying to acquire lock:
> [ 1.393350] ffff88841fc6ff70 (&dev->p->dead_sem){.+.+}-{3:3}, at: __device_attach+0x51/0x1a0
> [ 1.393350]
> [ 1.393350] but task is already holding lock:
> [ 1.393350] ffff888107f42770 (&dev->p->dead_sem){.+.+}-{3:3}, at: device_add+0xf8/0x890

False positive:

__device_attach() takes a device's dead_sem whereas device_add() takes
the *parent's* dead_sem. But lockdep thinks they're the same because
they're in the same lock class.

We would normally see the same lockdep splat for device_lock() but
commit 1704f47b50b5 silenced it by assigning device_lock() to the
novalidate class.

I could silence this lockdep splat by assigning dead_sem to the
novalidate class as well. But I also have an idea how we could
fix it properly by introducing a per-device class for bus_types
that need it and by putting the device_lock, dead_sem etc in
separate subclasses within that per-device class.

Any preference as to which solution I should pursue?
Any thoughts on this series in general?
Does the newly introduced dead_sem evoke approval or rejection?
Anyone?

Thanks,

Lukas