2023-04-27 14:34:43

by Alexander Stein

Subject: [PATCH 3/3] PCI/sysfs: Fix sysfs init race condition

sysfs attribute files for PCIe devices are created by
pci_create_sysfs_dev_files(), which can be reached through two paths:
1. pci_sysfs_init()
2. pci_bus_add_device() (drivers/pci/bus.c)

There is a race during startup when an asynchronous PCIe host probe runs
concurrently with the pci_sysfs_init() late_initcall. In this case the PCIe
devices have already been added to the bus, so for_each_pci_dev() sees them,
but pci_bus_add_device() has not yet finished. Both code paths then try to
add the sysfs attributes.

Fix this by waiting on a wait queue until sysfs has been initialized.
pci_sysfs_init() needs the internal functions that skip the
sysfs_initialized check, because it now sets sysfs_initialized to 1 only
after it has created the files.
__pci_create_sysfs_dev_files() still needs to remove resource files first,
since they might already have been created during the pci_sysfs_init()
initcall.

Signed-off-by: Alexander Stein <[email protected]>
---
drivers/pci/pci-sysfs.c | 25 ++++++++++++++++---------
1 file changed, 16 insertions(+), 9 deletions(-)
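
For readers less familiar with the wait_event()/wake_up_all() pairing, here
is a minimal userspace sketch of the same ordering. It is an analogue only:
a pthread condition variable stands in for sysfs_wq, and the names
add_device_path(), sysfs_init_path() and run_demo() are hypothetical and do
not exist in the kernel. It assumes a POSIX system and may need -pthread when
building.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t sysfs_cond = PTHREAD_COND_INITIALIZER; /* plays the role of sysfs_wq */
static bool sysfs_initialized;
static int seq, init_order, add_order; /* record which path ran first */

/* Hypothetical analogue of pci_bus_add_device() -> pci_create_sysfs_dev_files() */
static void *add_device_path(void *unused)
{
	(void)unused;
	pthread_mutex_lock(&lock);
	while (!sysfs_initialized)              /* like wait_event(sysfs_wq, sysfs_initialized) */
		pthread_cond_wait(&sysfs_cond, &lock);
	add_order = ++seq;                      /* attribute creation would happen here */
	pthread_mutex_unlock(&lock);
	return NULL;
}

/* Hypothetical analogue of pci_sysfs_init(): do the work, then set the flag and wake waiters */
static void sysfs_init_path(void)
{
	pthread_mutex_lock(&lock);
	init_order = ++seq;                     /* the unguarded __ helpers would run here */
	sysfs_initialized = true;
	pthread_cond_broadcast(&sysfs_cond);    /* like wake_up_all(&sysfs_wq) */
	pthread_mutex_unlock(&lock);
}

/* Returns 1 if the add-device path ran strictly after the init path */
int run_demo(void)
{
	pthread_t t;
	int rc = pthread_create(&t, NULL, add_device_path, NULL);

	assert(rc == 0);
	sysfs_init_path();
	pthread_join(t, NULL);
	return init_order == 1 && add_order == 2;
}
```

Whichever thread takes the lock first, the add-device path proceeds only once
the flag is set, so the two paths can no longer interleave. The sketch relies
on the init path running on a different thread from the waiter.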

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 7d4733773633..3067d55f981c 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -29,9 +29,11 @@
#include <linux/stat.h>
#include <linux/topology.h>
#include <linux/vgaarb.h>
+#include <linux/wait.h>
#include "pci.h"

static int sysfs_initialized; /* = 0 */
+static DECLARE_WAIT_QUEUE_HEAD(sysfs_wq);

/* show configuration fields */
#define pci_config_attr(field, format_string) \
@@ -997,8 +999,7 @@ static void __pci_create_legacy_files(struct pci_bus *b)
*/
void pci_create_legacy_files(struct pci_bus *b)
{
- if (!sysfs_initialized)
- return;
+ wait_event(sysfs_wq, sysfs_initialized);

__pci_create_legacy_files(b);
}
@@ -1501,13 +1502,18 @@ static const struct attribute_group pci_dev_resource_resize_group = {

int __must_check __pci_create_sysfs_dev_files(struct pci_dev *pdev)
{
+ /*
+ * sysfs attributes might already be created by pci_sysfs_init(),
+ * delete them here just in case
+ */
+ pci_remove_resource_files(pdev);
return pci_create_resource_files(pdev);
}

int __must_check pci_create_sysfs_dev_files(struct pci_dev *pdev)
{
- if (!sysfs_initialized)
- return -EACCES;
+ /* Wait until sysfs has been initialized */
+ wait_event(sysfs_wq, sysfs_initialized);

return __pci_create_sysfs_dev_files(pdev);
}
@@ -1520,8 +1526,8 @@ int __must_check pci_create_sysfs_dev_files(struct pci_dev *pdev)
*/
void pci_remove_sysfs_dev_files(struct pci_dev *pdev)
{
- if (!sysfs_initialized)
- return;
+ /* Wait until sysfs has been initialized */
+ wait_event(sysfs_wq, sysfs_initialized);

pci_remove_resource_files(pdev);
}
@@ -1532,9 +1538,8 @@ static int __init pci_sysfs_init(void)
struct pci_bus *pbus = NULL;
int retval;

- sysfs_initialized = 1;
for_each_pci_dev(pdev) {
- retval = pci_create_sysfs_dev_files(pdev);
+ retval = __pci_create_sysfs_dev_files(pdev);
if (retval) {
pci_dev_put(pdev);
return retval;
@@ -1542,7 +1547,9 @@ static int __init pci_sysfs_init(void)
}

while ((pbus = pci_find_next_bus(pbus)))
- pci_create_legacy_files(pbus);
+ __pci_create_legacy_files(pbus);
+ sysfs_initialized = 1;
+ wake_up_all(&sysfs_wq);

return 0;
}
--
2.34.1


2023-04-29 03:13:42

by kernel test robot

Subject: Re: [PATCH 3/3] PCI/sysfs: Fix sysfs init race condition


Hello,

kernel test robot noticed "BUG:kernel_hang_in_boot_stage" on:

commit: 931bc7f7debcbd7470fa92361f58e5f9dbe57cb0 ("[PATCH 3/3] PCI/sysfs: Fix sysfs init race condition")
url: https://github.com/intel-lab-lkp/linux/commits/Alexander-Stein/PCI-sysfs-sort-headers-alphabetically/20230427-223059
base: https://git.kernel.org/cgit/linux/kernel/git/pci/pci.git next
patch link: https://lore.kernel.org/all/[email protected]/
patch subject: [PATCH 3/3] PCI/sysfs: Fix sysfs init race condition

in testcase: boot

compiler: gcc-11
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)


+-------------------------------+------------+------------+
|                               | cbc7787301 | 931bc7f7de |
+-------------------------------+------------+------------+
| boot_successes                | 20         | 0          |
| boot_failures                 | 0          | 18         |
| BUG:kernel_hang_in_boot_stage | 0          | 18         |
+-------------------------------+------------+------------+


If you fix the issue, kindly add the following tags
| Reported-by: kernel test robot <[email protected]>
| Link: https://lore.kernel.org/oe-lkp/[email protected]


[ 1.377768][ T1] pci 0000:00:03.0: quirk_igfx_skip_te_disable+0x0/0xb0 took 0 usecs
[ 1.381714][ T1] pci 0000:00:04.0: [8086:25ab] type 00 class 0x088000
[ 1.384180][ T1] pci 0000:00:04.0: reg 0x10: [mem 0xfebf1000-0xfebf100f]
[ 1.390202][ T1] pci 0000:00:04.0: calling quirk_igfx_skip_te_disable+0x0/0xb0 @ 1
[ 1.391825][ T1] pci 0000:00:04.0: quirk_igfx_skip_te_disable+0x0/0xb0 took 0 usecs
BUG: kernel hang in boot stage
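
The report does not state why boot hangs. One plausible reading, offered as
an assumption and not as a finding from this thread: the last log lines come
from task T1, and if that same task both enumerates PCI devices in an earlier
initcall and would later run the pci_sysfs_init() late_initcall, then a
wait_event() in the earlier step can never be satisfied. The sketch below
models only that ordering problem on a single thread. The function names are
hypothetical and the blocking wait is replaced by a return value so the
example terminates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool sysfs_initialized;

/* Each hypothetical initcall returns false if it would sleep forever. */
typedef bool (*initcall_fn)(void);

/* Stands in for an earlier initcall that enumerates PCI and adds devices. */
static bool early_pci_enumeration(void)
{
	/* wait_event(sysfs_wq, sysfs_initialized) would sleep here */
	return sysfs_initialized;
}

/* Stands in for the pci_sysfs_init() late_initcall. */
static bool late_pci_sysfs_init(void)
{
	sysfs_initialized = true;
	return true;
}

/*
 * Run the initcalls in order on one thread. Returns the index of the
 * first call that would block, or -1 if all of them complete.
 */
int first_blocked_initcall(void)
{
	static const initcall_fn calls[] = {
		early_pci_enumeration,
		late_pci_sysfs_init,
	};
	const size_t n = sizeof(calls) / sizeof(calls[0]);

	assert(n == 2);
	sysfs_initialized = false;
	for (size_t i = 0; i < n; i++)
		if (!calls[i]())
			return (int)i;
	return -1;
}
```

In this model the first call blocks before the call that sets the flag is
ever reached, which would match a hang with no further log output.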

Kboot worker: lkp-worker07
Elapsed time: 1020

kvm=(


To reproduce:

# build kernel
cd linux
cp config-6.3.0-rc1-00075-g931bc7f7debc .config
make HOSTCC=gcc-11 CC=gcc-11 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules
make HOSTCC=gcc-11 CC=gcc-11 ARCH=x86_64 INSTALL_MOD_PATH=<mod-install-dir> modules_install
cd <mod-install-dir>
find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz


git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email

# if you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp directories to run from a clean state.



--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests



Attachments:
(No filename) (2.58 kB)
config-6.3.0-rc1-00075-g931bc7f7debc (159.65 kB)
job-script (4.85 kB)
dmesg.xz (10.90 kB)