Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752686AbcJJNRj (ORCPT ); Mon, 10 Oct 2016 09:17:39 -0400 Received: from mx1.redhat.com ([209.132.183.28]:46774 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752409AbcJJNRg (ORCPT ); Mon, 10 Oct 2016 09:17:36 -0400 From: Laszlo Ersek Subject: CONFIG_DEBUG_TEST_DRIVER_REMOVE causes unremovable drivers to bind devices twice To: Rob Herring Cc: Arnd Bergmann , Greg Kroah-Hartman , main kernel list Message-ID: <175e70d2-1800-3740-c172-793af27b0285@redhat.com> Date: Mon, 10 Oct 2016 15:17:28 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.3.0 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Mon, 10 Oct 2016 13:17:30 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 14708 Lines: 335 Hi, Greg asked me to stick to email with this bug report, so I'm reposting the original kernel bugzilla report to personal addresses, and lkml. Thanks, Laszlo https://bugzilla.kernel.org/show_bug.cgi?id=177021 Bug ID: 177021 Summary: [driver core] CONFIG_DEBUG_TEST_DRIVER_REMOVE causes unremovable drivers to bind devices twice Product: Drivers Version: 2.5 Kernel Version: v4.8-2283-ga3443cd (4.9.0-0.rc0.git2.1.fc26.aarch64) Hardware: All OS: Linux Tree: Mainline Status: NEW Severity: normal Priority: P1 Component: Other Assignee: drivers_other@kernel-bugs.osdl.org Reporter: lersek@redhat.com CC: arnd@arndb.de, greg@kroah.com Regression: No CONFIG_DEBUG_TEST_DRIVER_REMOVE was added in the following commit: > commit bea5b158ff0da9c7246ff391f754f5f38e34577a > Author: Rob Herring > Date: Thu Aug 11 10:20:58 2016 -0500 > > driver core: add test of driver remove calls during probe > > In recent discussions on ksummit-discuss[1], it was suggested to do a > sequence of probe, remove, probe for testing driver remove paths. This > adds a kconfig option for said test. > > [1] https://lists.linuxfoundation.org/pipermail/ksummit-discuss/2016-August/003459.html > > Suggested-by: Arnd Bergmann > Cc: Greg Kroah-Hartman > Signed-off-by: Rob Herring > Signed-off-by: Greg Kroah-Hartman > > diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig > index 98504ec99c7d..fdf44cac08e6 100644 > --- a/drivers/base/Kconfig > +++ b/drivers/base/Kconfig > @@ -212,6 +212,16 @@ config DEBUG_DEVRES > > If you are unsure about this, Say N here. > > +config DEBUG_TEST_DRIVER_REMOVE > + bool "Test driver remove calls during probe" > + depends on DEBUG_KERNEL > + help > + Say Y here if you want the Driver core to test driver remove functions > + by calling probe, remove, probe. This tests the remove path without > + having to unbind the driver or unload the driver module. > + > + If you are unsure about this, say N here. > + > config SYS_HYPERVISOR > bool > default n > diff --git a/drivers/base/dd.c b/drivers/base/dd.c > index 16688f50729c..4910e6db2a34 100644 > --- a/drivers/base/dd.c > +++ b/drivers/base/dd.c > @@ -328,68 +328,89 @@ static DECLARE_WAIT_QUEUE_HEAD(probe_waitqueue); > static int really_probe(struct device *dev, struct device_driver *drv) > { > int ret = -EPROBE_DEFER; > int local_trigger_count = atomic_read(&deferred_trigger_count); > + bool test_remove = IS_ENABLED(CONFIG_DEBUG_TEST_DRIVER_REMOVE); > > if (defer_all_probes) { > /* > * Value of defer_all_probes can be set only by > * device_defer_all_probes_enable() which, in turn, will call > * wait_for_device_probe() right after that to avoid any races. > */ > dev_dbg(dev, "Driver %s force probe deferral\n", drv->name); > driver_deferred_probe_add(dev); > return ret; > } > > atomic_inc(&probe_count); > pr_debug("bus: '%s': %s: probing driver %s with device %s\n", > drv->bus->name, __func__, drv->name, dev_name(dev)); > WARN_ON(!list_empty(&dev->devres_head)); > > +re_probe: > dev->driver = drv; > > /* If using pinctrl, bind pins now before probing */ > ret = pinctrl_bind_pins(dev); > if (ret) > goto pinctrl_bind_failed; > > if (driver_sysfs_add(dev)) { > printk(KERN_ERR "%s: driver_sysfs_add(%s) failed\n", > __func__, dev_name(dev)); > goto probe_failed; > } > > if (dev->pm_domain && dev->pm_domain->activate) { > ret = dev->pm_domain->activate(dev); > if (ret) > goto probe_failed; > } > > /* > * Ensure devices are listed in devices_kset in correct order > * It's important to move Dev to the end of devices_kset before > * calling .probe, because it could be recursive and parent Dev > * should always go first > */ > devices_kset_move_last(dev); > > if (dev->bus->probe) { > ret = dev->bus->probe(dev); > if (ret) > goto probe_failed; > } else if (drv->probe) { > ret = drv->probe(dev); > if (ret) > goto probe_failed; > } > > + if (test_remove) { > + test_remove = false; > + > + if (dev->bus && dev->bus->remove) > + dev->bus->remove(dev); > + else if (drv->remove) > + drv->remove(dev); > + > + devres_release_all(dev); > + driver_sysfs_remove(dev); > + dev->driver = NULL; > + dev_set_drvdata(dev, NULL); > + if (dev->pm_domain && dev->pm_domain->dismiss) > + dev->pm_domain->dismiss(dev); > + pm_runtime_reinit(dev); > + > + goto re_probe; > + } > + > pinctrl_init_done(dev); > > if (dev->pm_domain && dev->pm_domain->sync) > dev->pm_domain->sync(dev); > > driver_bound(dev); > ret = 1; > pr_debug("bus: '%s': %s: bound device %s to driver %s\n", > drv->bus->name, __func__, dev_name(dev), drv->name); > goto done; If none of the remove() callbacks are non-NULL -- that is, when the device cannot be un-bound --, then none of the remove() callbacks are called. That's good. However: - other resources are nonetheless released (from under the live device), and - the original device -- which is still bound -- is re-probed. This leads to crashes such as the following: > OF: PCI: host bridge /pcie@10000000 ranges: <------------ first probing > OF: PCI: IO 0x3eff0000..0x3effffff -> 0x00000000 > OF: PCI: MEM 0x10000000..0x3efeffff -> 0x10000000 > OF: PCI: MEM 0x8000000000..0xffffffffff -> 0x8000000000 > pci-host-generic 3f000000.pcie: ECAM at [mem 0x3f000000-0x3fffffff] for [bus 00-0f] > pci-host-generic 3f000000.pcie: PCI host bridge to bus 0000:00 > pci_bus 0000:00: root bus resource [bus 00-0f] > pci_bus 0000:00: root bus resource [io 0x0000-0xffff] > pci_bus 0000:00: root bus resource [mem 0x10000000-0x3efeffff] > pci_bus 0000:00: root bus resource [mem 0x8000000000-0xffffffffff] > pci 0000:00:02.0: BAR 0: assigned [mem 0x8000000000-0x8000003fff 64bit] > pci 0000:00:03.0: BAR 4: assigned [mem 0x8000004000-0x8000007fff 64bit pref] > pci 0000:00:01.0: BAR 1: assigned [mem 0x10000000-0x10000fff] > pci 0000:00:03.0: BAR 1: assigned [mem 0x10001000-0x10001fff] > pci 0000:00:01.0: BAR 0: assigned [io 0x1000-0x103f] > pci 0000:00:02.0: enabling device (0000 -> 0002) > OF: PCI: host bridge /pcie@10000000 ranges: <------------ second probing > OF: PCI: IO 0x3eff0000..0x3effffff -> 0x00000000 > OF: PCI: MEM 0x10000000..0x3efeffff -> 0x10000000 > OF: PCI: MEM 0x8000000000..0xffffffffff -> 0x8000000000 > ------------[ cut here ]------------ > kernel BUG at lib/ioremap.c:64! > Internal error: Oops - BUG: 0 [#1] SMP > Modules linked in: > CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.0-0.rc0.git2.1.fc26.aarch64 #1 > Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 > task: fffffe00e402e400 task.stack: fffffe00e8028000 > PC is at ioremap_page_range+0x168/0x2f8 > LR is at pci_remap_iospace+0x68/0x88 > pc : [] lr : [] pstate: 80000145 > sp : fffffe00e802ba80 > x29: fffffe00e802ba80 x28: 00000200c01f0000 > x27: 0140000000000800 x26: fffffdff7ee10000 > x25: 0400000000000001 x24: 00e8000000000707 > x23: fffffc0009ff7fd8 x22: fffffdff7ee10000 > x21: fffffdff7ee00000 x20: fffffdff7ee0ffff > x19: 000000003eff0000 x18: 0000000000000010 > x17: 0000000000000000 x16: 0000000000000000 > x15: 0000000000000006 x14: fffffc0089cc42f7 > x13: fffffc0009cc4305 x12: 00000000000000ed > x11: 0000000000000006 x10: 00000000000000ee > x9 : 0000000000000001 x8 : fffffe00e8028000 > x7 : fffffc0009002e68 x6 : 00000200c01f0000 > x5 : fffffe000104f700 x4 : 000000000000ffff > x3 : fffffc0008ee5eb0 x2 : 0000000040000000 > x1 : 0000000041040000 x0 : 00e800003eff0707 > > Process swapper/0 (pid: 1, stack limit = 0xfffffe00e8028020) > Stack: (0xfffffe00e802ba80 to 0xfffffe00e802c000) > ba80: fffffe00e802bb00 fffffc00084fc6e0 fffffe00f205c980 000000003eff0000 > baa0: 0000000000000000 fffffc0008f8d908 fffffe00f2050d00 fffffc0008c251c8 > bac0: fffffe00e802bb90 0000000000000000 fffffe00f205c980 0000000000000000 > bae0: fffffe00e802bb20 fffffc000851f60c fffffdff7e801384 fffffe00e0105810 > bb00: fffffe00e802bb20 fffffc000851f6e4 fffffe00f205c000 fffffe00e0105810 > bb20: fffffe00e802bbe0 fffffc000851f7e4 fffffe00e0105800 fffffe00e0105810 > bb40: fffffc0008f8dab8 fffffc0009f88000 0000000000000000 0000000000000000 > bb60: fffffc0008d063c0 fffffc0008d86398 fffffc0008e520e0 fffffc000894d8cc > bb80: fffffe00e802bbb0 fffffc00087b5bc0 fffffe00f2056b80 fffffe00f2053d00 > bba0: 000000003eff0000 fffffe00ffffd068 fffffe00e802bbe0 fffffc000851f7d8 > bbc0: fffffe00e0105800 fffffe00e0105810 fffffc0008f8dab8 fffffc0009f88000 > bbe0: fffffe00e802bc00 fffffc000862a760 00000000fffffffe fffffe00e0105810 > bc00: fffffe00e802bc30 fffffc0008627ac8 fffffe00e0105810 0000000000000000 > bc20: fffffc0008f8dae0 0000000000000000 fffffe00e802bc70 fffffc0008627e54 > bc40: fffffe00e0105810 fffffe00e0105870 fffffc0008f8dae0 0000000000000000 > bc60: fffffc0008fc4000 fffffc0008d20464 fffffe00e802bca0 fffffc00086252a8 > bc80: 0000000000000000 fffffc0008f8dae0 fffffc0008627d28 fffffc0008d86398 > bca0: fffffe00e802bce0 fffffc0008627130 fffffc0008f8dae0 fffffe00f3038900 > bcc0: fffffc0008fc5768 0000000000000000 fffffe00fa703930 fffffe00e81ce198 > bce0: fffffe00e802bd00 fffffc0008626b50 fffffc0008f8dae0 fffffe00f3038900 > bd00: fffffe00e802bd40 fffffc0008628ee0 fffffc0008f8dae0 0000000000000000 > bd20: 0000000000000000 0000000000000006 fffffc0009060000 0000000000000000 > bd40: fffffe00e802bd60 fffffc000862a684 fffffc0008f8dab8 0000000000000000 > bd60: fffffe00e802bd80 fffffc0008d5c5ac fffffc0008d5c594 fffffe00e8028000 > bd80: fffffe00e802bd90 fffffc0008083594 fffffe00e802be00 fffffc0008d20dec > bda0: 0000000000000102 fffffc0009060000 fffffc0008d863a8 0000000000000006 > bdc0: fffffc0008e51c00 0000000000000000 fffffc0008d20dec fffffc0008bbe670 > bde0: 0000000000000000 0000000600000006 fffffc0008d20464 fffffc0008d063c0 > be00: fffffe00e802bea0 fffffc0008945568 fffffc0008945550 0000000000000000 > be20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > be40: 0000000000000000 0000000000000000 0000000000000000 0000000000000001 > be60: 0000000000000001 0000000000000000 0000000000000000 0000000000000000 > be80: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > bea0: 0000000000000000 fffffc0008083330 fffffc0008945550 0000000000000000 > bec0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > bee0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > bf00: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > bf20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > bf40: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > bf60: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > bf80: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > bfa0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > bfc0: 0000000000000000 0000000000000005 0000000000000000 0000000000000000 > bfe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > Call trace: > Exception stack(0xfffffe00e802b8a0 to 0xfffffe00e802b9d0) > b8a0: 000000003eff0000 0000040000000000 fffffe00e802ba80 fffffc0008487c48 > b8c0: 0000000080000145 000000000000003d fffffe00e802b960 fffffc0008137f44 > b8e0: 8b6ac059b5e8abc6 fffffe00e402ecd8 0000000000000000 0000000000000000 > b900: fffffe00e402e400 0000000000000000 0000000000000001 fffffc0009cc4000 > b920: 0000000000000002 fffffe00e402ecb0 fffffe00e0105bd0 0000000000000000 > b940: fffffe00e402e400 0000000000000000 00e800003eff0707 0000000041040000 > b960: 0000000040000000 fffffc0008ee5eb0 000000000000ffff fffffe000104f700 > b980: 00000200c01f0000 fffffc0009002e68 fffffe00e8028000 0000000000000001 > b9a0: 00000000000000ee 0000000000000006 00000000000000ed fffffc0009cc4305 > b9c0: fffffc0089cc42f7 0000000000000006 > [] ioremap_page_range+0x168/0x2f8 > [] pci_remap_iospace+0x68/0x88 > [] pci_host_common_probe+0x24c/0x318 > [] gen_pci_probe+0x34/0x40 > [] platform_drv_probe+0x60/0xc8 > [] driver_probe_device+0x240/0x4a0 > [] __driver_attach+0x12c/0x130 > [] bus_for_each_dev+0x70/0xb0 > [] driver_attach+0x30/0x40 > [] bus_add_driver+0x200/0x2b8 > [] driver_register+0x68/0x100 > [] __platform_driver_register+0x54/0x60 > [] gen_pci_driver_init+0x18/0x20 > [] do_one_initcall+0x44/0x138 > [] kernel_init_freeable+0x23c/0x2dc > [] kernel_init+0x18/0x110 > [] ret_from_fork+0x10/0x20 > Code: 54fff801 52800000 1400005e d503201f (d4210000) > ---[ end trace bf4514e6b7f2008d ]--- > Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b > > Kernel Offset: disabled > Memory Limit: none > ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b The "drivers/pci/host/pci-host-generic.c" driver has no remove method, and its probe method works like this: gen_pci_probe() [drivers/pci/host/pci-host-generic.c] pci_host_common_probe() [drivers/pci/host/pci-host-common.c] gen_pci_init() [drivers/pci/host/pci-host-common.c] gen_pci_parse_request_of_pci_ranges() [drivers/pci/host/pci-host-common.c] of_pci_get_host_bridge_resources() [drivers/of/of_pci.c] pr_info( "host bridge %s ranges:\n", dev->full_name); pci_remap_iospace() [drivers/pci/pci.c] At the second probing attempt, the area is already mapped (the device is still bound from the first probing), which triggers BUG_ON(!pte_none(*pte)); in ioremap_pte_range() [lib/ioremap.c]. This is almost a regression because the kernel crashes with valid drivers. It is not an error for a driver to not provide a remove() callback, so in this instance CONFIG_DEBUG_TEST_DRIVER_REMOVE does not expose a driver bug, it breaks with a valid driver. Not a regression for the upstream kernel after all, because the Kconfig documentation suggests N as default. Proposed solution: if none of the remove() methods exist, or the remove() method that exists fails, then don't release any resources, and don't re-probe the device. Thanks!