2021-08-25 10:30:35

by Barry Song

Subject: [PATCH v3 3/3] PCI/MSI: remove msi_attrib.default_irq in msi_desc

From: Marc Zyngier <[email protected]>

default_irq is hideous: it should be per-device, not per-descriptor.
Moreover, the MSI-X case doesn't use it at all. Now that the sysfs
IRQ attribute uses the msi_entry instead of pci_dev.irq, it seems
safe to remove msi_attrib.default_irq.

Cc: Jesse Brandeburg <[email protected]>
Cc: Tony Nguyen <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
[Barry: Updated pci_irq_vector and __pci_restore_msi_state]
Signed-off-by: Barry Song <[email protected]>
---
drivers/pci/msi.c | 12 +++++-------
include/linux/msi.h | 2 --
2 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index e5e7533..9434afa 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -422,7 +422,7 @@ static void __pci_restore_msi_state(struct pci_dev *dev)
 	if (!dev->msi_enabled)
 		return;

-	entry = irq_get_msi_desc(dev->irq);
+	entry = first_pci_msi_entry(dev);

 	pci_intx_for_msi(dev, 0);
 	pci_msi_set_enable(dev, 0);
@@ -591,7 +591,6 @@ static int populate_msi_sysfs(struct pci_dev *pdev)
 	entry->msi_attrib.is_virtual = 0;
 	entry->msi_attrib.entry_nr = 0;
 	entry->msi_attrib.maskbit = !!(control & PCI_MSI_FLAGS_MASKBIT);
-	entry->msi_attrib.default_irq = dev->irq; /* Save IOAPIC IRQ */
 	entry->msi_attrib.multi_cap = (control & PCI_MSI_FLAGS_QMASK) >> 1;
 	entry->msi_attrib.multiple = ilog2(__roundup_pow_of_two(nvec));

@@ -682,7 +681,6 @@ static int msi_capability_init(struct pci_dev *dev, int nvec,
 	dev->msi_enabled = 1;

 	pcibios_free_irq(dev);
-	dev->irq = entry->irq;
 	return 0;
 }

@@ -742,7 +740,6 @@ static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
 		entry->msi_attrib.is_virtual =
 			entry->msi_attrib.entry_nr >= vec_count;

-		entry->msi_attrib.default_irq = dev->irq;
 		entry->mask_base = base;

 		addr = pci_msix_desc_addr(entry);
@@ -964,8 +961,6 @@ static void pci_msi_shutdown(struct pci_dev *dev)
 	mask = msi_mask(desc->msi_attrib.multi_cap);
 	msi_mask_irq(desc, mask, 0);

-	/* Restore dev->irq to its default pin-assertion IRQ */
-	dev->irq = desc->msi_attrib.default_irq;
 	pcibios_alloc_irq(dev);
 }

@@ -1301,12 +1296,15 @@ int pci_irq_vector(struct pci_dev *dev, unsigned int nr)

 		if (WARN_ON_ONCE(nr >= entry->nvec_used))
 			return -EINVAL;
+
+		return entry->irq + nr;
 	} else {
 		if (WARN_ON_ONCE(nr > 0))
 			return -EINVAL;
 	}

-	return dev->irq + nr;
+	/* legacy INTx */
+	return dev->irq;
 }
 EXPORT_SYMBOL(pci_irq_vector);

diff --git a/include/linux/msi.h b/include/linux/msi.h
index e8bdcb8..a631664 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -114,7 +114,6 @@ struct ti_sci_inta_msi_desc {
  * @maskbit: [PCI MSI/X] Mask-Pending bit supported?
  * @is_64: [PCI MSI/X] Address size: 0=32bit 1=64bit
  * @entry_nr: [PCI MSI/X] Entry which is described by this descriptor
- * @default_irq:[PCI MSI/X] The default pre-assigned non-MSI irq
  * @mask_pos: [PCI MSI] Mask register position
  * @mask_base: [PCI MSI-X] Mask register base address
  * @platform: [platform] Platform device specific msi descriptor data
@@ -148,7 +147,6 @@ struct msi_desc {
 				u8 is_64 : 1;
 				u8 is_virtual : 1;
 				u16 entry_nr;
-				unsigned default_irq;
 			} msi_attrib;
 			union {
 				u8 mask_pos;
--
1.8.3.1


2021-08-25 13:55:48

by Marc Zyngier

Subject: Re: [PATCH v3 3/3] PCI/MSI: remove msi_attrib.default_irq in msi_desc

On Wed, 25 Aug 2021 11:26:36 +0100,
Barry Song <[email protected]> wrote:
>
> From: Marc Zyngier <[email protected]>
>
> default_irq is hideous: it should be per-device, not per-descriptor.
> Moreover, the MSI-X case doesn't use it at all. Now that the sysfs
> IRQ attribute uses the msi_entry instead of pci_dev.irq, it seems
> safe to remove msi_attrib.default_irq.

Thanks for doing the write-up. Maybe worth adding that drivers that
use dev->irq while having enabled MSI will break (INTx will be
disabled while MSI is enabled). That should give people a clue about
what to fix when they bisect the problem to this patch.
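
To make that failure mode concrete, here is a minimal user-space model of the behaviour change. It is only a sketch, not kernel code: `mock_pci_dev`, `mock_enable_msi_old()`, `mock_enable_msi_new()` and `mock_driver_reads_dev_irq()` are made-up names, and the IRQ numbers used with them are arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few struct pci_dev fields that matter here. */
struct mock_pci_dev {
	int irq;          /* what a driver reads as dev->irq */
	int msi_irq;      /* Linux IRQ number of the first MSI descriptor */
	bool msi_enabled;
};

/* Before the patch: enabling MSI overwrote dev->irq with the MSI vector. */
static void mock_enable_msi_old(struct mock_pci_dev *dev, int vector)
{
	assert(dev != NULL);
	dev->msi_irq = vector;
	dev->msi_enabled = true;
	dev->irq = vector;
}

/* After the patch: dev->irq keeps the INTx number while MSI is enabled. */
static void mock_enable_msi_new(struct mock_pci_dev *dev, int vector)
{
	assert(dev != NULL);
	dev->msi_irq = vector;
	dev->msi_enabled = true;
}

/* What a driver that hardcodes dev->irq would hand to request_irq(). */
static int mock_driver_reads_dev_irq(const struct mock_pci_dev *dev)
{
	assert(dev != NULL);
	return dev->irq;
}
```

With a device whose INTx line is 16 and whose MSI vector is 125, the old model hands the driver 125 and the new model hands it 16, which is the line INTx would use if it were not disabled.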

Also, a link to the discussion that led to this patch would be useful
to give some context.

No need to respin this for now, let's give it a shake after 5.14.

Thanks,

M.

--
Without deviation from the norm, progress is not possible.

2021-08-29 14:39:35

by kernel test robot

Subject: [PCI/MSI] a4fc4cf388: dmesg.genirq:Flags_mismatch_irq##(mei_me)vs.#(xhci_hcd)



Greeting,

FYI, we noticed the following commit (built with gcc-9):

commit: a4fc4cf388319ea957ffbdab5073bdd267de9082 ("[PATCH v3 3/3] PCI/MSI: remove msi_attrib.default_irq in msi_desc")
url: https://github.com/0day-ci/linux/commits/Barry-Song/PCI-MSI-Clarify-the-IRQ-sysfs-ABI-for-PCI-devices/20210825-183018
base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git 6e764bcd1cf72a2846c0e53d3975a09b242c04c9

in testcase: kernel-selftests
version: kernel-selftests-x86_64-ebaa603b-1_20210825
with following parameters:

group: pidfd
ucode: 0xe2

test-description: The kernel contains a set of "self tests" under the tools/testing/selftests/ directory. These are intended to be small unit tests to exercise individual code paths in the kernel.
test-url: https://www.kernel.org/doc/Documentation/kselftest.txt


on test machine: 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz with 32G memory

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):



If you fix the issue, kindly add following tag
Reported-by: kernel test robot <[email protected]>



[ 179.602028][ T34] genirq: Flags mismatch irq 16. 00002000 (mei_me) vs. 00000000 (xhci_hcd)
[ 179.614073][ T34] CPU: 2 PID: 34 Comm: kworker/u8:2 Not tainted 5.14.0-rc7-00014-ga4fc4cf38831 #1
[ 179.623225][ T34] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.8.1 12/05/2017
[ 179.631432][ T34] Workqueue: events_unbound async_run_entry_fn
[ 179.637543][ T34] Call Trace:
[ 179.640789][ T34] dump_stack_lvl+0x45/0x59
[ 179.645253][ T34] __setup_irq.cold+0x50/0xd4
[ 179.649893][ T34] ? mei_me_pg_exit_sync+0x480/0x480 [mei_me]
[ 179.655923][ T34] request_threaded_irq+0x10c/0x180
[ 179.661073][ T34] ? mei_me_irq_quick_handler+0x240/0x240 [mei_me]
[ 179.667528][ T34] mei_me_probe+0x131/0x300 [mei_me]
[ 179.672767][ T34] local_pci_probe+0x42/0x80
[ 179.677313][ T34] pci_device_probe+0x107/0x1c0
[ 179.682118][ T34] really_probe+0xb6/0x380
[ 179.687094][ T34] __driver_probe_device+0xfe/0x180
[ 179.692242][ T34] driver_probe_device+0x1e/0xc0
[ 179.697133][ T34] __driver_attach_async_helper+0x2b/0x80
[ 179.702802][ T34] async_run_entry_fn+0x30/0x140
[ 179.707693][ T34] process_one_work+0x274/0x5c0
[ 179.712503][ T34] worker_thread+0x50/0x3c0
[ 179.716959][ T34] ? process_one_work+0x5c0/0x5c0
[ 179.721936][ T34] kthread+0x14f/0x180
[ 179.725958][ T34] ? set_kthread_struct+0x40/0x40
[ 179.730935][ T34] ret_from_fork+0x22/0x30
[ 179.735699][ T34] mei_me 0000:00:16.0: request_threaded_irq failure. irq = 16
[ 179.743125][ T34] mei_me 0000:00:16.0: initialization failed.
[ 179.749399][ T34] mei_me: probe of 0000:00:16.0 failed with error -16
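
For readers decoding the log above: in include/linux/interrupt.h, 0x00002000 is IRQF_ONESHOT and 0x00000080 is IRQF_SHARED, and error -16 is -EBUSY. The sketch below is a simplified user-space model of the sharing rule that produces a "Flags mismatch" message. `mock_can_share()` is a made-up name, and the real check in `__setup_irq()` also compares trigger types and per-CPU flags, which are left out here.

```c
#include <assert.h>   /* only needed for the checks in the usage note */
#include <errno.h>

/* Values as defined in include/linux/interrupt.h. */
#define IRQF_SHARED   0x00000080u
#define IRQF_ONESHOT  0x00002000u

/*
 * Simplified model: a second handler may join an already-requested line
 * only if both sides set IRQF_SHARED and both agree on IRQF_ONESHOT.
 * Returns 0 when sharing is allowed, -EBUSY otherwise.
 */
static int mock_can_share(unsigned int old_flags, unsigned int new_flags)
{
	if (!((old_flags & new_flags) & IRQF_SHARED))
		return -EBUSY;
	if ((old_flags ^ new_flags) & IRQF_ONESHOT)
		return -EBUSY;
	return 0;
}
```

Feeding it the two flag words from the log, `mock_can_share(0x00000000u, 0x00002000u)`, yields -EBUSY: neither handler asked for a shared line, yet both ended up on IRQ 16.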



To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
bin/lkp run generated-yaml-file



---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/[email protected] Intel Corporation

Thanks,
Oliver Sang


Attachments:
(No filename) (3.27 kB)
config-5.14.0-rc7-00014-ga4fc4cf38831 (179.30 kB)
job-script (6.08 kB)
dmesg.xz (23.05 kB)
job.yaml (4.98 kB)

2021-08-31 01:22:24

by Barry Song

Subject: Re: [PCI/MSI] a4fc4cf388: dmesg.genirq:Flags_mismatch_irq##(mei_me)vs.#(xhci_hcd)

On Mon, Aug 30, 2021 at 2:38 AM kernel test robot <[email protected]> wrote:
>
>
>
> Greeting,
>
> FYI, we noticed the following commit (built with gcc-9):
>
> commit: a4fc4cf388319ea957ffbdab5073bdd267de9082 ("[PATCH v3 3/3] PCI/MSI: remove msi_attrib.default_irq in msi_desc")
> url: https://github.com/0day-ci/linux/commits/Barry-Song/PCI-MSI-Clarify-the-IRQ-sysfs-ABI-for-PCI-devices/20210825-183018
> base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git 6e764bcd1cf72a2846c0e53d3975a09b242c04c9
>
> in testcase: kernel-selftests
> version: kernel-selftests-x86_64-ebaa603b-1_20210825
> with following parameters:
>
> group: pidfd
> ucode: 0xe2
>
> test-description: The kernel contains a set of "self tests" under the tools/testing/selftests/ directory. These are intended to be small unit tests to exercise individual code paths in the kernel.
> test-url: https://www.kernel.org/doc/Documentation/kselftest.txt
>
>
> on test machine: 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz with 32G memory
>
> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>
>
>
> If you fix the issue, kindly add following tag
> Reported-by: kernel test robot <[email protected]>
>
>
>
> [ 179.602028][ T34] genirq: Flags mismatch irq 16. 00002000 (mei_me) vs. 00000000 (xhci_hcd)
> [ 179.614073][ T34] CPU: 2 PID: 34 Comm: kworker/u8:2 Not tainted 5.14.0-rc7-00014-ga4fc4cf38831 #1
> [ 179.623225][ T34] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.8.1 12/05/2017
> [ 179.631432][ T34] Workqueue: events_unbound async_run_entry_fn
> [ 179.637543][ T34] Call Trace:
> [ 179.640789][ T34] dump_stack_lvl+0x45/0x59
> [ 179.645253][ T34] __setup_irq.cold+0x50/0xd4
> [ 179.649893][ T34] ? mei_me_pg_exit_sync+0x480/0x480 [mei_me]
> [ 179.655923][ T34] request_threaded_irq+0x10c/0x180
> [ 179.661073][ T34] ? mei_me_irq_quick_handler+0x240/0x240 [mei_me]
> [ 179.667528][ T34] mei_me_probe+0x131/0x300 [mei_me]
> [ 179.672767][ T34] local_pci_probe+0x42/0x80
> [ 179.677313][ T34] pci_device_probe+0x107/0x1c0
> [ 179.682118][ T34] really_probe+0xb6/0x380
> [ 179.687094][ T34] __driver_probe_device+0xfe/0x180
> [ 179.692242][ T34] driver_probe_device+0x1e/0xc0
> [ 179.697133][ T34] __driver_attach_async_helper+0x2b/0x80
> [ 179.702802][ T34] async_run_entry_fn+0x30/0x140
> [ 179.707693][ T34] process_one_work+0x274/0x5c0
> [ 179.712503][ T34] worker_thread+0x50/0x3c0
> [ 179.716959][ T34] ? process_one_work+0x5c0/0x5c0
> [ 179.721936][ T34] kthread+0x14f/0x180
> [ 179.725958][ T34] ? set_kthread_struct+0x40/0x40
> [ 179.730935][ T34] ret_from_fork+0x22/0x30
> [ 179.735699][ T34] mei_me 0000:00:16.0: request_threaded_irq failure. irq = 16
> [ 179.743125][ T34] mei_me 0000:00:16.0: initialization failed.
> [ 179.749399][ T34] mei_me: probe of 0000:00:16.0 failed with error -16
>
>

It seems mei_me references pdev->irq directly.
Hi Oliver, could you check whether the patch below fixes the problem:

diff --git a/drivers/misc/mei/pci-me.c b/drivers/misc/mei/pci-me.c
index c3393b383e59..a45a2d4257a6 100644
--- a/drivers/misc/mei/pci-me.c
+++ b/drivers/misc/mei/pci-me.c
@@ -216,7 +216,7 @@ static int mei_me_probe(struct pci_dev *pdev, const struct pci_device_id *ent)

 	pci_enable_msi(pdev);

-	hw->irq = pdev->irq;
+	hw->irq = pci_irq_vector(pdev, 0);

 	/* request and enable interrupt */
 	irqflags = pci_dev_msi_enabled(pdev) ? IRQF_ONESHOT : IRQF_SHARED;


I don't have any hardware to test.

>
> To reproduce:
>
> git clone https://github.com/intel/lkp-tests.git
> cd lkp-tests
> bin/lkp install job.yaml # job file is attached in this email
> bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
> bin/lkp run generated-yaml-file
>
>
>
> ---
> 0DAY/LKP+ Test Infrastructure Open Source Technology Center
> https://lists.01.org/hyperkitty/list/[email protected] Intel Corporation
>
> Thanks,
> Oliver Sang
>

Thanks
barry

2021-08-31 01:39:34

by Barry Song

Subject: Re: [PCI/MSI] a4fc4cf388: dmesg.genirq:Flags_mismatch_irq##(mei_me)vs.#(xhci_hcd)

On Tue, Aug 31, 2021 at 1:21 PM Barry Song <[email protected]> wrote:
>
> On Mon, Aug 30, 2021 at 2:38 AM kernel test robot <[email protected]> wrote:
> >
> >
> >
> > Greeting,
> >
> > FYI, we noticed the following commit (built with gcc-9):
> >
> > commit: a4fc4cf388319ea957ffbdab5073bdd267de9082 ("[PATCH v3 3/3] PCI/MSI: remove msi_attrib.default_irq in msi_desc")
> > url: https://github.com/0day-ci/linux/commits/Barry-Song/PCI-MSI-Clarify-the-IRQ-sysfs-ABI-for-PCI-devices/20210825-183018
> > base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git 6e764bcd1cf72a2846c0e53d3975a09b242c04c9
> >
> > in testcase: kernel-selftests
> > version: kernel-selftests-x86_64-ebaa603b-1_20210825
> > with following parameters:
> >
> > group: pidfd
> > ucode: 0xe2
> >
> > test-description: The kernel contains a set of "self tests" under the tools/testing/selftests/ directory. These are intended to be small unit tests to exercise individual code paths in the kernel.
> > test-url: https://www.kernel.org/doc/Documentation/kselftest.txt
> >
> >
> > on test machine: 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz with 32G memory
> >
> > caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
> >
> >
> >
> > If you fix the issue, kindly add following tag
> > Reported-by: kernel test robot <[email protected]>
> >
> >
> >
> > [ 179.602028][ T34] genirq: Flags mismatch irq 16. 00002000 (mei_me) vs. 00000000 (xhci_hcd)
> > [ 179.614073][ T34] CPU: 2 PID: 34 Comm: kworker/u8:2 Not tainted 5.14.0-rc7-00014-ga4fc4cf38831 #1
> > [ 179.623225][ T34] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.8.1 12/05/2017
> > [ 179.631432][ T34] Workqueue: events_unbound async_run_entry_fn
> > [ 179.637543][ T34] Call Trace:
> > [ 179.640789][ T34] dump_stack_lvl+0x45/0x59
> > [ 179.645253][ T34] __setup_irq.cold+0x50/0xd4
> > [ 179.649893][ T34] ? mei_me_pg_exit_sync+0x480/0x480 [mei_me]
> > [ 179.655923][ T34] request_threaded_irq+0x10c/0x180
> > [ 179.661073][ T34] ? mei_me_irq_quick_handler+0x240/0x240 [mei_me]
> > [ 179.667528][ T34] mei_me_probe+0x131/0x300 [mei_me]
> > [ 179.672767][ T34] local_pci_probe+0x42/0x80
> > [ 179.677313][ T34] pci_device_probe+0x107/0x1c0
> > [ 179.682118][ T34] really_probe+0xb6/0x380
> > [ 179.687094][ T34] __driver_probe_device+0xfe/0x180
> > [ 179.692242][ T34] driver_probe_device+0x1e/0xc0
> > [ 179.697133][ T34] __driver_attach_async_helper+0x2b/0x80
> > [ 179.702802][ T34] async_run_entry_fn+0x30/0x140
> > [ 179.707693][ T34] process_one_work+0x274/0x5c0
> > [ 179.712503][ T34] worker_thread+0x50/0x3c0
> > [ 179.716959][ T34] ? process_one_work+0x5c0/0x5c0
> > [ 179.721936][ T34] kthread+0x14f/0x180
> > [ 179.725958][ T34] ? set_kthread_struct+0x40/0x40
> > [ 179.730935][ T34] ret_from_fork+0x22/0x30
> > [ 179.735699][ T34] mei_me 0000:00:16.0: request_threaded_irq failure. irq = 16
> > [ 179.743125][ T34] mei_me 0000:00:16.0: initialization failed.
> > [ 179.749399][ T34] mei_me: probe of 0000:00:16.0 failed with error -16
> >
> >
>
> it seems there is a direct reference to pdev->irq.
> Hi Oliver, would you try if the below patch can fix the problem:

+ Tomas

Sorry, on a second look, drivers/misc/mei/pci-me.c uses pdev->irq
directly in many places. We really need this driver's maintainers to
address the problem.

On the other hand, the irq field of "struct mei_me_hw *hw" seems to be
unused in this driver except here:

static int mei_me_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
{
	const struct mei_cfg *cfg;
	struct mei_device *dev;
	struct mei_me_hw *hw;
	unsigned int irqflags;
	int err;
	.....
	hw->irq = pdev->irq;
	...

This looks wrong. Maybe we can use hw->irq in other places such as
shutdown, suspend and resume.
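
The idea of reusing the cached number can be sketched in user space as follows. This is only an illustration under stated assumptions: `mock_hw`, `mock_probe()` and `mock_teardown()` are hypothetical names loosely modelled on struct mei_me_hw, and no real request_irq()/free_irq() call is made.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-device driver state, loosely modelled on struct mei_me_hw. */
struct mock_hw {
	int irq;        /* vector resolved once at probe time */
	int requested;  /* non-zero while a handler is installed */
};

/* Probe: remember the vector that was actually requested. */
static void mock_probe(struct mock_hw *hw, int resolved_vector)
{
	assert(hw != NULL);
	hw->irq = resolved_vector;
	hw->requested = 1;
}

/*
 * Teardown (shutdown or suspend): release the cached vector instead of
 * re-reading pdev->irq. Returns the IRQ number that was released.
 */
static int mock_teardown(struct mock_hw *hw)
{
	assert(hw != NULL && hw->requested);
	hw->requested = 0;
	return hw->irq;
}
```

Because teardown never consults pdev->irq, the number released is always the one that was requested, whatever pdev->irq holds at that point.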

Thanks
barry


>
> diff --git a/drivers/misc/mei/pci-me.c b/drivers/misc/mei/pci-me.c
> index c3393b383e59..a45a2d4257a6 100644
> --- a/drivers/misc/mei/pci-me.c
> +++ b/drivers/misc/mei/pci-me.c
> @@ -216,7 +216,7 @@ static int mei_me_probe(struct pci_dev *pdev,
> const struct pci_device_id *ent)
>
> pci_enable_msi(pdev);
>
> - hw->irq = pdev->irq;
> + hw->irq = pci_irq_vector(pdev, 0);
>
> /* request and enable interrupt */
> irqflags = pci_dev_msi_enabled(pdev) ? IRQF_ONESHOT : IRQF_SHARED;
>
>
> I don't have any hardware to test.
>
> >
> > To reproduce:
> >
> > git clone https://github.com/intel/lkp-tests.git
> > cd lkp-tests
> > bin/lkp install job.yaml # job file is attached in this email
> > bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
> > bin/lkp run generated-yaml-file
> >
> >
> >
> > ---
> > 0DAY/LKP+ Test Infrastructure Open Source Technology Center
> > https://lists.01.org/hyperkitty/list/[email protected] Intel Corporation
> >
> > Thanks,
> > Oliver Sang
> >
>
> Thanks
> barry

2021-08-31 08:10:50

by Marc Zyngier

Subject: Re: [PCI/MSI] a4fc4cf388: dmesg.genirq:Flags_mismatch_irq##(mei_me)vs.#(xhci_hcd)

On 2021-08-31 02:21, Barry Song wrote:
> On Mon, Aug 30, 2021 at 2:38 AM kernel test robot
> <[email protected]> wrote:
>>
>>
>>
>> Greeting,
>>
>> FYI, we noticed the following commit (built with gcc-9):
>>
>> commit: a4fc4cf388319ea957ffbdab5073bdd267de9082 ("[PATCH v3 3/3]
>> PCI/MSI: remove msi_attrib.default_irq in msi_desc")
>> url:
>> https://github.com/0day-ci/linux/commits/Barry-Song/PCI-MSI-Clarify-the-IRQ-sysfs-ABI-for-PCI-devices/20210825-183018
>> base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git
>> 6e764bcd1cf72a2846c0e53d3975a09b242c04c9
>>
>> in testcase: kernel-selftests
>> version: kernel-selftests-x86_64-ebaa603b-1_20210825
>> with following parameters:
>>
>> group: pidfd
>> ucode: 0xe2
>>
>> test-description: The kernel contains a set of "self tests" under the
>> tools/testing/selftests/ directory. These are intended to be small
>> unit tests to exercise individual code paths in the kernel.
>> test-url: https://www.kernel.org/doc/Documentation/kselftest.txt
>>
>>
>> on test machine: 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
>> with 32G memory
>>
>> caused below changes (please refer to attached dmesg/kmsg for entire
>> log/backtrace):
>>
>>
>>
>> If you fix the issue, kindly add following tag
>> Reported-by: kernel test robot <[email protected]>
>>
>>
>>
>> [ 179.602028][ T34] genirq: Flags mismatch irq 16. 00002000
>> (mei_me) vs. 00000000 (xhci_hcd)
>> [ 179.614073][ T34] CPU: 2 PID: 34 Comm: kworker/u8:2 Not tainted
>> 5.14.0-rc7-00014-ga4fc4cf38831 #1
>> [ 179.623225][ T34] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT,
>> BIOS 1.8.1 12/05/2017
>> [ 179.631432][ T34] Workqueue: events_unbound async_run_entry_fn
>> [ 179.637543][ T34] Call Trace:
>> [ 179.640789][ T34] dump_stack_lvl+0x45/0x59
>> [ 179.645253][ T34] __setup_irq.cold+0x50/0xd4
>> [ 179.649893][ T34] ? mei_me_pg_exit_sync+0x480/0x480 [mei_me]
>> [ 179.655923][ T34] request_threaded_irq+0x10c/0x180
>> [ 179.661073][ T34] ? mei_me_irq_quick_handler+0x240/0x240
>> [mei_me]
>> [ 179.667528][ T34] mei_me_probe+0x131/0x300 [mei_me]
>> [ 179.672767][ T34] local_pci_probe+0x42/0x80
>> [ 179.677313][ T34] pci_device_probe+0x107/0x1c0
>> [ 179.682118][ T34] really_probe+0xb6/0x380
>> [ 179.687094][ T34] __driver_probe_device+0xfe/0x180
>> [ 179.692242][ T34] driver_probe_device+0x1e/0xc0
>> [ 179.697133][ T34] __driver_attach_async_helper+0x2b/0x80
>> [ 179.702802][ T34] async_run_entry_fn+0x30/0x140
>> [ 179.707693][ T34] process_one_work+0x274/0x5c0
>> [ 179.712503][ T34] worker_thread+0x50/0x3c0
>> [ 179.716959][ T34] ? process_one_work+0x5c0/0x5c0
>> [ 179.721936][ T34] kthread+0x14f/0x180
>> [ 179.725958][ T34] ? set_kthread_struct+0x40/0x40
>> [ 179.730935][ T34] ret_from_fork+0x22/0x30
>> [ 179.735699][ T34] mei_me 0000:00:16.0: request_threaded_irq
>> failure. irq = 16
>> [ 179.743125][ T34] mei_me 0000:00:16.0: initialization failed.
>> [ 179.749399][ T34] mei_me: probe of 0000:00:16.0 failed with error
>> -16
>>
>>
>
> it seems there is a direct reference to pdev->irq.
> Hi Oliver, would you try if the below patch can fix the problem:
>
> diff --git a/drivers/misc/mei/pci-me.c b/drivers/misc/mei/pci-me.c
> index c3393b383e59..a45a2d4257a6 100644
> --- a/drivers/misc/mei/pci-me.c
> +++ b/drivers/misc/mei/pci-me.c
> @@ -216,7 +216,7 @@ static int mei_me_probe(struct pci_dev *pdev,
> const struct pci_device_id *ent)
>
> pci_enable_msi(pdev);
>
> - hw->irq = pdev->irq;
> + hw->irq = pci_irq_vector(pdev, 0);
>
> /* request and enable interrupt */
> irqflags = pci_dev_msi_enabled(pdev) ? IRQF_ONESHOT :
> IRQF_SHARED;
>

Ah! one victim, 3000 to go! :D

That's exactly the kind of stuff I was mentioning when we
discussed this patch. Exposing the MSI vector as the INTx
IRQ has led to all sorts of broken drivers.

M.
--
Jazz is not dead. It just smells funny...

2021-08-31 22:46:38

by Barry Song

Subject: Re: [PCI/MSI] a4fc4cf388: dmesg.genirq:Flags_mismatch_irq##(mei_me)vs.#(xhci_hcd)

On Tue, Aug 31, 2021 at 8:08 PM Marc Zyngier <[email protected]> wrote:
>
> On 2021-08-31 02:21, Barry Song wrote:
> > On Mon, Aug 30, 2021 at 2:38 AM kernel test robot
> > <[email protected]> wrote:
> >>
> >>
> >>
> >> Greeting,
> >>
> >> FYI, we noticed the following commit (built with gcc-9):
> >>
> >> commit: a4fc4cf388319ea957ffbdab5073bdd267de9082 ("[PATCH v3 3/3]
> >> PCI/MSI: remove msi_attrib.default_irq in msi_desc")
> >> url:
> >> https://github.com/0day-ci/linux/commits/Barry-Song/PCI-MSI-Clarify-the-IRQ-sysfs-ABI-for-PCI-devices/20210825-183018
> >> base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git
> >> 6e764bcd1cf72a2846c0e53d3975a09b242c04c9
> >>
> >> in testcase: kernel-selftests
> >> version: kernel-selftests-x86_64-ebaa603b-1_20210825
> >> with following parameters:
> >>
> >> group: pidfd
> >> ucode: 0xe2
> >>
> >> test-description: The kernel contains a set of "self tests" under the
> >> tools/testing/selftests/ directory. These are intended to be small
> >> unit tests to exercise individual code paths in the kernel.
> >> test-url: https://www.kernel.org/doc/Documentation/kselftest.txt
> >>
> >>
> >> on test machine: 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
> >> with 32G memory
> >>
> >> caused below changes (please refer to attached dmesg/kmsg for entire
> >> log/backtrace):
> >>
> >>
> >>
> >> If you fix the issue, kindly add following tag
> >> Reported-by: kernel test robot <[email protected]>
> >>
> >>
> >>
> >> [ 179.602028][ T34] genirq: Flags mismatch irq 16. 00002000
> >> (mei_me) vs. 00000000 (xhci_hcd)
> >> [ 179.614073][ T34] CPU: 2 PID: 34 Comm: kworker/u8:2 Not tainted
> >> 5.14.0-rc7-00014-ga4fc4cf38831 #1
> >> [ 179.623225][ T34] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT,
> >> BIOS 1.8.1 12/05/2017
> >> [ 179.631432][ T34] Workqueue: events_unbound async_run_entry_fn
> >> [ 179.637543][ T34] Call Trace:
> >> [ 179.640789][ T34] dump_stack_lvl+0x45/0x59
> >> [ 179.645253][ T34] __setup_irq.cold+0x50/0xd4
> >> [ 179.649893][ T34] ? mei_me_pg_exit_sync+0x480/0x480 [mei_me]
> >> [ 179.655923][ T34] request_threaded_irq+0x10c/0x180
> >> [ 179.661073][ T34] ? mei_me_irq_quick_handler+0x240/0x240
> >> [mei_me]
> >> [ 179.667528][ T34] mei_me_probe+0x131/0x300 [mei_me]
> >> [ 179.672767][ T34] local_pci_probe+0x42/0x80
> >> [ 179.677313][ T34] pci_device_probe+0x107/0x1c0
> >> [ 179.682118][ T34] really_probe+0xb6/0x380
> >> [ 179.687094][ T34] __driver_probe_device+0xfe/0x180
> >> [ 179.692242][ T34] driver_probe_device+0x1e/0xc0
> >> [ 179.697133][ T34] __driver_attach_async_helper+0x2b/0x80
> >> [ 179.702802][ T34] async_run_entry_fn+0x30/0x140
> >> [ 179.707693][ T34] process_one_work+0x274/0x5c0
> >> [ 179.712503][ T34] worker_thread+0x50/0x3c0
> >> [ 179.716959][ T34] ? process_one_work+0x5c0/0x5c0
> >> [ 179.721936][ T34] kthread+0x14f/0x180
> >> [ 179.725958][ T34] ? set_kthread_struct+0x40/0x40
> >> [ 179.730935][ T34] ret_from_fork+0x22/0x30
> >> [ 179.735699][ T34] mei_me 0000:00:16.0: request_threaded_irq
> >> failure. irq = 16
> >> [ 179.743125][ T34] mei_me 0000:00:16.0: initialization failed.
> >> [ 179.749399][ T34] mei_me: probe of 0000:00:16.0 failed with error
> >> -16
> >>
> >>
> >
> > it seems there is a direct reference to pdev->irq.
> > Hi Oliver, would you try if the below patch can fix the problem:
> >
> > diff --git a/drivers/misc/mei/pci-me.c b/drivers/misc/mei/pci-me.c
> > index c3393b383e59..a45a2d4257a6 100644
> > --- a/drivers/misc/mei/pci-me.c
> > +++ b/drivers/misc/mei/pci-me.c
> > @@ -216,7 +216,7 @@ static int mei_me_probe(struct pci_dev *pdev,
> > const struct pci_device_id *ent)
> >
> > pci_enable_msi(pdev);
> >
> > - hw->irq = pdev->irq;
> > + hw->irq = pci_irq_vector(pdev, 0);
> >
> > /* request and enable interrupt */
> > irqflags = pci_dev_msi_enabled(pdev) ? IRQF_ONESHOT :
> > IRQF_SHARED;
> >
>
> Ah! one victim, 3000 to go! :D
>

yep.

> That's exactly the kind of stuff I was mentioning when we
> discussed this patch. Exposing the MSI vector as the INTx
> IRQ has led to all sorts of broken drivers.

I guess drivers should rely on int pci_irq_vector(struct pci_dev
*dev, unsigned int nr) rather than hardcoding pdev->irq.

pci_irq_vector() supports all cases (INTx, MSI, MSI-X).
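
A rough user-space model of that lookup, following the patched pci_irq_vector() at the top of this thread, is sketched below. The types and the name `model_pci_irq_vector()` are invented for illustration; the MSI-X branch (match on the entry number) is an assumption based on how the kernel function is commonly described, since the patch hunk shows only the MSI and INTx paths.

```c
#include <assert.h>   /* only needed for the checks in the usage note */
#include <errno.h>
#include <stddef.h>

enum model_mode { MODEL_INTX, MODEL_MSI, MODEL_MSIX };

/* Hypothetical, flattened view of one MSI/MSI-X descriptor. */
struct model_msi_entry {
	int irq;                 /* first Linux IRQ number of the entry */
	unsigned int nvec_used;  /* MSI: vectors covered by this entry */
	unsigned int entry_nr;   /* MSI-X: table index of this entry */
};

struct model_pci_dev {
	enum model_mode mode;
	int irq;                               /* legacy INTx number */
	const struct model_msi_entry *entries;
	size_t nr_entries;
};

/* Return the Linux IRQ number for vector nr, or -EINVAL if out of range. */
static int model_pci_irq_vector(const struct model_pci_dev *dev, unsigned int nr)
{
	size_t i;

	switch (dev->mode) {
	case MODEL_MSIX:
		/* MSI-X: one descriptor per vector, matched by table index. */
		for (i = 0; i < dev->nr_entries; i++)
			if (dev->entries[i].entry_nr == nr)
				return dev->entries[i].irq;
		return -EINVAL;
	case MODEL_MSI:
		/* MSI: a single descriptor covering consecutive IRQ numbers. */
		if (dev->nr_entries == 0 || nr >= dev->entries[0].nvec_used)
			return -EINVAL;
		return dev->entries[0].irq + (int)nr;
	case MODEL_INTX:
	default:
		/* Legacy INTx: only vector 0 exists. */
		return nr == 0 ? dev->irq : -EINVAL;
	}
}
```

A driver written against this interface asks for vector 0 in every mode and never needs to know whether pdev->irq currently holds an INTx or an MSI number.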

>
> M.
> --
> Jazz is not dead. It just smells funny...

Thanks
barry

2021-09-01 22:12:43

by Marc Zyngier

Subject: Re: [PCI/MSI] a4fc4cf388: dmesg.genirq:Flags_mismatch_irq##(mei_me)vs.#(xhci_hcd)

On 2021-08-31 22:36, Barry Song wrote:
> On Tue, Aug 31, 2021 at 8:08 PM Marc Zyngier <[email protected]> wrote:
>>
>> On 2021-08-31 02:21, Barry Song wrote:
>> > On Mon, Aug 30, 2021 at 2:38 AM kernel test robot
>> > <[email protected]> wrote:
>> >>
>> >>
>> >>
>> >> Greeting,
>> >>
>> >> FYI, we noticed the following commit (built with gcc-9):
>> >>
>> >> commit: a4fc4cf388319ea957ffbdab5073bdd267de9082 ("[PATCH v3 3/3]
>> >> PCI/MSI: remove msi_attrib.default_irq in msi_desc")
>> >> url:
>> >> https://github.com/0day-ci/linux/commits/Barry-Song/PCI-MSI-Clarify-the-IRQ-sysfs-ABI-for-PCI-devices/20210825-183018
>> >> base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git
>> >> 6e764bcd1cf72a2846c0e53d3975a09b242c04c9
>> >>
>> >> in testcase: kernel-selftests
>> >> version: kernel-selftests-x86_64-ebaa603b-1_20210825
>> >> with following parameters:
>> >>
>> >> group: pidfd
>> >> ucode: 0xe2
>> >>
>> >> test-description: The kernel contains a set of "self tests" under the
>> >> tools/testing/selftests/ directory. These are intended to be small
>> >> unit tests to exercise individual code paths in the kernel.
>> >> test-url: https://www.kernel.org/doc/Documentation/kselftest.txt
>> >>
>> >>
>> >> on test machine: 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
>> >> with 32G memory
>> >>
>> >> caused below changes (please refer to attached dmesg/kmsg for entire
>> >> log/backtrace):
>> >>
>> >>
>> >>
>> >> If you fix the issue, kindly add following tag
>> >> Reported-by: kernel test robot <[email protected]>
>> >>
>> >>
>> >>
>> >> [ 179.602028][ T34] genirq: Flags mismatch irq 16. 00002000
>> >> (mei_me) vs. 00000000 (xhci_hcd)
>> >> [ 179.614073][ T34] CPU: 2 PID: 34 Comm: kworker/u8:2 Not tainted
>> >> 5.14.0-rc7-00014-ga4fc4cf38831 #1
>> >> [ 179.623225][ T34] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT,
>> >> BIOS 1.8.1 12/05/2017
>> >> [ 179.631432][ T34] Workqueue: events_unbound async_run_entry_fn
>> >> [ 179.637543][ T34] Call Trace:
>> >> [ 179.640789][ T34] dump_stack_lvl+0x45/0x59
>> >> [ 179.645253][ T34] __setup_irq.cold+0x50/0xd4
>> >> [ 179.649893][ T34] ? mei_me_pg_exit_sync+0x480/0x480 [mei_me]
>> >> [ 179.655923][ T34] request_threaded_irq+0x10c/0x180
>> >> [ 179.661073][ T34] ? mei_me_irq_quick_handler+0x240/0x240
>> >> [mei_me]
>> >> [ 179.667528][ T34] mei_me_probe+0x131/0x300 [mei_me]
>> >> [ 179.672767][ T34] local_pci_probe+0x42/0x80
>> >> [ 179.677313][ T34] pci_device_probe+0x107/0x1c0
>> >> [ 179.682118][ T34] really_probe+0xb6/0x380
>> >> [ 179.687094][ T34] __driver_probe_device+0xfe/0x180
>> >> [ 179.692242][ T34] driver_probe_device+0x1e/0xc0
>> >> [ 179.697133][ T34] __driver_attach_async_helper+0x2b/0x80
>> >> [ 179.702802][ T34] async_run_entry_fn+0x30/0x140
>> >> [ 179.707693][ T34] process_one_work+0x274/0x5c0
>> >> [ 179.712503][ T34] worker_thread+0x50/0x3c0
>> >> [ 179.716959][ T34] ? process_one_work+0x5c0/0x5c0
>> >> [ 179.721936][ T34] kthread+0x14f/0x180
>> >> [ 179.725958][ T34] ? set_kthread_struct+0x40/0x40
>> >> [ 179.730935][ T34] ret_from_fork+0x22/0x30
>> >> [ 179.735699][ T34] mei_me 0000:00:16.0: request_threaded_irq
>> >> failure. irq = 16
>> >> [ 179.743125][ T34] mei_me 0000:00:16.0: initialization failed.
>> >> [ 179.749399][ T34] mei_me: probe of 0000:00:16.0 failed with error
>> >> -16
>> >>
>> >>
>> >
>> > it seems there is a direct reference to pdev->irq.
>> > Hi Oliver, would you try if the below patch can fix the problem:
>> >
>> > diff --git a/drivers/misc/mei/pci-me.c b/drivers/misc/mei/pci-me.c
>> > index c3393b383e59..a45a2d4257a6 100644
>> > --- a/drivers/misc/mei/pci-me.c
>> > +++ b/drivers/misc/mei/pci-me.c
>> > @@ -216,7 +216,7 @@ static int mei_me_probe(struct pci_dev *pdev,
>> > const struct pci_device_id *ent)
>> >
>> > pci_enable_msi(pdev);
>> >
>> > - hw->irq = pdev->irq;
>> > + hw->irq = pci_irq_vector(pdev, 0);
>> >
>> > /* request and enable interrupt */
>> > irqflags = pci_dev_msi_enabled(pdev) ? IRQF_ONESHOT :
>> > IRQF_SHARED;
>> >
>>
>> Ah! one victim, 3000 to go! :D
>>
>
> yep.
>
>> That's exactly the kind of stuff I was mentioning when we
>> discussed this patch. Exposing the MSI vector as the INTx
>> IRQ has led to all sorts of broken drivers.
>
> I guess drivers should depend on int pci_irq_vector(struct pci_dev
> *dev, unsigned int nr)
> rather than hardcodely use pdev->irq.
>
> pci_irq_vector() supports all cases(intx, msi, msi-x)

Yes, that'd be a sensible approach. Feels like a job for
a coccinelle script!

M.
--
Jazz is not dead. It just smells funny...

2021-09-03 00:44:52

by Tomas Winkler

Subject: RE: [PCI/MSI] a4fc4cf388: dmesg.genirq:Flags_mismatch_irq##(mei_me)vs.#(xhci_hcd)

> dmesg.genirq:Flags_mismatch_irq##(mei_me)vs.#(xhci_hcd)
>
> On Tue, Aug 31, 2021 at 1:21 PM Barry Song <[email protected]> wrote:
> >
> > On Mon, Aug 30, 2021 at 2:38 AM kernel test robot
> <[email protected]> wrote:
> > >
> > >
> > >
> > > Greeting,
> > >
> > > FYI, we noticed the following commit (built with gcc-9):
> > >
> > > commit: a4fc4cf388319ea957ffbdab5073bdd267de9082 ("[PATCH v3 3/3]
> > > PCI/MSI: remove msi_attrib.default_irq in msi_desc")
> > > url:
> > > https://github.com/0day-ci/linux/commits/Barry-Song/PCI-MSI-Clarify-
> > > the-IRQ-sysfs-ABI-for-PCI-devices/20210825-183018
> > > base:
> > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git
> > > 6e764bcd1cf72a2846c0e53d3975a09b242c04c9
> > >
> > > in testcase: kernel-selftests
> > > version: kernel-selftests-x86_64-ebaa603b-1_20210825
> > > with following parameters:
> > >
> > > group: pidfd
> > > ucode: 0xe2
> > >
> > > test-description: The kernel contains a set of "self tests" under the
> tools/testing/selftests/ directory. These are intended to be small unit tests
> to exercise individual code paths in the kernel.
> > > test-url: https://www.kernel.org/doc/Documentation/kselftest.txt
> > >
> > >
> > > on test machine: 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
> > > with 32G memory
> > >
> > > caused below changes (please refer to attached dmesg/kmsg for entire
> log/backtrace):
> > >
> > >
> > >
> > > If you fix the issue, kindly add following tag
> > > Reported-by: kernel test robot <[email protected]>
> > >
> > >
> > >
> > > [ 179.602028][ T34] genirq: Flags mismatch irq 16. 00002000 (mei_me) vs.
> 00000000 (xhci_hcd)
> > > [ 179.614073][ T34] CPU: 2 PID: 34 Comm: kworker/u8:2 Not tainted
> 5.14.0-rc7-00014-ga4fc4cf38831 #1
> > > [ 179.623225][ T34] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT,
> BIOS 1.8.1 12/05/2017
> > > [ 179.631432][ T34] Workqueue: events_unbound async_run_entry_fn
> > > [ 179.637543][ T34] Call Trace:
> > > [ 179.640789][ T34] dump_stack_lvl+0x45/0x59
> > > [ 179.645253][ T34] __setup_irq.cold+0x50/0xd4
> > > [ 179.649893][ T34] ? mei_me_pg_exit_sync+0x480/0x480 [mei_me]
> > > [ 179.655923][ T34] request_threaded_irq+0x10c/0x180
> > > [ 179.661073][ T34] ? mei_me_irq_quick_handler+0x240/0x240
> [mei_me]
> > > [ 179.667528][ T34] mei_me_probe+0x131/0x300 [mei_me]
> > > [ 179.672767][ T34] local_pci_probe+0x42/0x80
> > > [ 179.677313][ T34] pci_device_probe+0x107/0x1c0
> > > [ 179.682118][ T34] really_probe+0xb6/0x380
> > > [ 179.687094][ T34] __driver_probe_device+0xfe/0x180
> > > [ 179.692242][ T34] driver_probe_device+0x1e/0xc0
> > > [ 179.697133][ T34] __driver_attach_async_helper+0x2b/0x80
> > > [ 179.702802][ T34] async_run_entry_fn+0x30/0x140
> > > [ 179.707693][ T34] process_one_work+0x274/0x5c0
> > > [ 179.712503][ T34] worker_thread+0x50/0x3c0
> > > [ 179.716959][ T34] ? process_one_work+0x5c0/0x5c0
> > > [ 179.721936][ T34] kthread+0x14f/0x180
> > > [ 179.725958][ T34] ? set_kthread_struct+0x40/0x40
> > > [ 179.730935][ T34] ret_from_fork+0x22/0x30
> > > [ 179.735699][ T34] mei_me 0000:00:16.0: request_threaded_irq failure.
> irq = 16
> > > [ 179.743125][ T34] mei_me 0000:00:16.0: initialization failed.
> > > [ 179.749399][ T34] mei_me: probe of 0000:00:16.0 failed with error -16
> > >
> > >
> >
> > it seems there is a direct reference to pdev->irq.
> > Hi Oliver, would you try if the below patch can fix the problem:
>
> + Tomas
>
> sorry. after second looking, drivers/misc/mei/pci-me.c has many places using
> pdev->irq directly. We really need this driver's maintainers to address the
> problem.

Will look at that.
>
> On the other hand, "struct mei_me_hw *hw" seems to be totally not used in
> this driver except here:
> 164 static int mei_me_probe(struct pci_dev *pdev, const struct
> pci_device_id *ent)
> 165 {
> 166 const struct mei_cfg *cfg;
> 167 struct mei_device *dev;
> 168 struct mei_me_hw *hw;
> 169 unsigned int irqflags;
> 170 int err;
> .....
> 219 hw->irq = pdev->irq;
> ...
>
> this looks wrong. maybe we can leverage hw->irq in other places such as
> shutdown, suspend, resume.

We need this, usage will follow.
>
> Thanks
> barry
>
>
> >
> > diff --git a/drivers/misc/mei/pci-me.c b/drivers/misc/mei/pci-me.c
> > index c3393b383e59..a45a2d4257a6 100644
> > --- a/drivers/misc/mei/pci-me.c
> > +++ b/drivers/misc/mei/pci-me.c
> > @@ -216,7 +216,7 @@ static int mei_me_probe(struct pci_dev *pdev,
> > const struct pci_device_id *ent)
> >
> > pci_enable_msi(pdev);
> >
> > - hw->irq = pdev->irq;
> > + hw->irq = pci_irq_vector(pdev, 0);
> >
> > /* request and enable interrupt */
> > irqflags = pci_dev_msi_enabled(pdev) ? IRQF_ONESHOT :
> > IRQF_SHARED;
> >
> >
> > I don't have any hardware to test.


Hard to believe, MEI is on every Intel client platform :)

> >
> > >
> > > To reproduce:
> > >
> > > git clone https://github.com/intel/lkp-tests.git
> > > cd lkp-tests
> > > bin/lkp install job.yaml # job file is attached in this email
> > > bin/lkp split-job --compatible job.yaml # generate the yaml file for
> lkp run
> > > bin/lkp run generated-yaml-file
> > >
> > >
> > >
> > > ---
> > > 0DAY/LKP+ Test Infrastructure Open Source Technology Center
> > > https://lists.01.org/hyperkitty/list/[email protected] Intel Corporation
> > >
> > > Thanks,
> > > Oliver Sang
> > >
> >
> > Thanks
> > barry

2021-10-03 08:35:30

by Barry Song

Subject: Re: [PCI/MSI] a4fc4cf388: dmesg.genirq:Flags_mismatch_irq##(mei_me)vs.#(xhci_hcd)

On Fri, Sep 3, 2021 at 7:34 AM Winkler, Tomas <[email protected]> wrote:
>
> > dmesg.genirq:Flags_mismatch_irq##(mei_me)vs.#(xhci_hcd)
> >
> > On Tue, Aug 31, 2021 at 1:21 PM Barry Song <[email protected]> wrote:
> > >
> > > On Mon, Aug 30, 2021 at 2:38 AM kernel test robot
> > <[email protected]> wrote:
> > >
> > > it seems there is a direct reference to pdev->irq.
> > > Hi Oliver, would you try if the below patch can fix the problem:
> >
> > + Tomas
> >
> > sorry. after second looking, drivers/misc/mei/pci-me.c has many places using
> > pdev->irq directly. We really need this driver's maintainers to address the
> > problem.
>
> Will look at that.

Hi Tomas,

I assume whether or not to use hw->irq is a separate topic. Does replacing
every pdev->irq with pci_irq_vector(pdev, 0), i.e. the vim command
%s/pdev->irq/pci_irq_vector(pdev, 0)/g
as below, fix the current probe failure caused by reading pdev->irq directly?

diff --git a/drivers/misc/mei/pci-me.c b/drivers/misc/mei/pci-me.c
index c3393b383e59..97495931fadd 100644
--- a/drivers/misc/mei/pci-me.c
+++ b/drivers/misc/mei/pci-me.c
@@ -216,18 +216,18 @@ static int mei_me_probe(struct pci_dev *pdev, const struct pci_device_id *ent)

pci_enable_msi(pdev);

- hw->irq = pdev->irq;
+ hw->irq = pci_irq_vector(pdev, 0);

/* request and enable interrupt */
irqflags = pci_dev_msi_enabled(pdev) ? IRQF_ONESHOT : IRQF_SHARED;

- err = request_threaded_irq(pdev->irq,
+ err = request_threaded_irq(pci_irq_vector(pdev, 0),
mei_me_irq_quick_handler,
mei_me_irq_thread_handler,
irqflags, KBUILD_MODNAME, dev);
if (err) {
dev_err(&pdev->dev, "request_threaded_irq failure. irq = %d\n",
- pdev->irq);
+ pci_irq_vector(pdev, 0));
goto end;
}

@@ -278,7 +278,7 @@ static int mei_me_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
release_irq:
mei_cancel_work(dev);
mei_disable_interrupts(dev);
- free_irq(pdev->irq, dev);
+ free_irq(pci_irq_vector(pdev, 0), dev);
end:
dev_err(&pdev->dev, "initialization failed.\n");
return err;
@@ -307,7 +307,7 @@ static void mei_me_shutdown(struct pci_dev *pdev)
mei_me_unset_pm_domain(dev);

mei_disable_interrupts(dev);
- free_irq(pdev->irq, dev);
+ free_irq(pci_irq_vector(pdev, 0), dev);
}

/**
@@ -336,7 +336,7 @@ static void mei_me_remove(struct pci_dev *pdev)

mei_disable_interrupts(dev);

- free_irq(pdev->irq, dev);
+ free_irq(pci_irq_vector(pdev, 0), dev);

mei_deregister(dev);
}
@@ -356,7 +356,7 @@ static int mei_me_pci_suspend(struct device *device)

mei_disable_interrupts(dev);

- free_irq(pdev->irq, dev);
+ free_irq(pci_irq_vector(pdev, 0), dev);
pci_disable_msi(pdev);

return 0;
@@ -378,14 +378,14 @@ static int mei_me_pci_resume(struct device *device)
irqflags = pci_dev_msi_enabled(pdev) ? IRQF_ONESHOT : IRQF_SHARED;

/* request and enable interrupt */
- err = request_threaded_irq(pdev->irq,
+ err = request_threaded_irq(pci_irq_vector(pdev, 0),
mei_me_irq_quick_handler,
mei_me_irq_thread_handler,
irqflags, KBUILD_MODNAME, dev);

if (err) {
dev_err(&pdev->dev, "request_threaded_irq failed: irq = %d.\n",
- pdev->irq);
+ pci_irq_vector(pdev, 0));
return err;
}


Thanks
barry

2021-10-16 17:50:05

by Carel Si

Subject: Re: [LKP] Re: [PCI/MSI] a4fc4cf388: dmesg.genirq:Flags_mismatch_irq##(mei_me)vs.#(xhci_hcd)

Hi, Barry

On Sun, Oct 03, 2021 at 04:32:28PM +0800, Barry Song wrote:
> On Fri, Sep 3, 2021 at 7:34 AM Winkler, Tomas <[email protected]> wrote:
> >
> > > dmesg.genirq:Flags_mismatch_irq##(mei_me)vs.#(xhci_hcd)
> > >
> > > On Tue, Aug 31, 2021 at 1:21 PM Barry Song <[email protected]> wrote:
> > > >
> > > > On Mon, Aug 30, 2021 at 2:38 AM kernel test robot
> > > <[email protected]> wrote:
> > > > >
> > > > >
> > > > >
> > > > > Greeting,
> > > > >
> > > > > FYI, we noticed the following commit (built with gcc-9):
> > > > >
> > > > > commit: a4fc4cf388319ea957ffbdab5073bdd267de9082 ("[PATCH v3 3/3]
> > > > > PCI/MSI: remove msi_attrib.default_irq in msi_desc")
> > > > > url:
> > > > > https://github.com/0day-ci/linux/commits/Barry-Song/PCI-MSI-Clarify-
> > > > > the-IRQ-sysfs-ABI-for-PCI-devices/20210825-183018
> > > > > base:
> > > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git
> > > > > 6e764bcd1cf72a2846c0e53d3975a09b242c04c9
> > > > >
> > > > > in testcase: kernel-selftests
> > > > > version: kernel-selftests-x86_64-ebaa603b-1_20210825
> > > > > with following parameters:
> > > > >
> > > > > group: pidfd
> > > > > ucode: 0xe2
> > > > >
> > > > > test-description: The kernel contains a set of "self tests" under the
> > > tools/testing/selftests/ directory. These are intended to be small unit tests
> > > to exercise individual code paths in the kernel.
> > > > > test-url: https://www.kernel.org/doc/Documentation/kselftest.txt
> > > > >
> > > > >
> > > > > on test machine: 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
> > > > > with 32G memory
> > > > >
> > > > > caused below changes (please refer to attached dmesg/kmsg for entire
> > > log/backtrace):
> > > > >
> > > > >
> > > > >
> > > > > If you fix the issue, kindly add following tag
> > > > > Reported-by: kernel test robot <[email protected]>
> > > > >
> > > > >
> > > > >
> > > > > [ 179.602028][ T34] genirq: Flags mismatch irq 16. 00002000 (mei_me) vs.
> > > 00000000 (xhci_hcd)
> > > > > [ 179.614073][ T34] CPU: 2 PID: 34 Comm: kworker/u8:2 Not tainted
> > > 5.14.0-rc7-00014-ga4fc4cf38831 #1
> > > > > [ 179.623225][ T34] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT,
> > > BIOS 1.8.1 12/05/2017
> > > > > [ 179.631432][ T34] Workqueue: events_unbound async_run_entry_fn
> > > > > [ 179.637543][ T34] Call Trace:
> > > > > [ 179.640789][ T34] dump_stack_lvl+0x45/0x59
> > > > > [ 179.645253][ T34] __setup_irq.cold+0x50/0xd4
> > > > > [ 179.649893][ T34] ? mei_me_pg_exit_sync+0x480/0x480 [mei_me]
> > > > > [ 179.655923][ T34] request_threaded_irq+0x10c/0x180
> > > > > [ 179.661073][ T34] ? mei_me_irq_quick_handler+0x240/0x240
> > > [mei_me]
> > > > > [ 179.667528][ T34] mei_me_probe+0x131/0x300 [mei_me]
> > > > > [ 179.672767][ T34] local_pci_probe+0x42/0x80
> > > > > [ 179.677313][ T34] pci_device_probe+0x107/0x1c0
> > > > > [ 179.682118][ T34] really_probe+0xb6/0x380
> > > > > [ 179.687094][ T34] __driver_probe_device+0xfe/0x180
> > > > > [ 179.692242][ T34] driver_probe_device+0x1e/0xc0
> > > > > [ 179.697133][ T34] __driver_attach_async_helper+0x2b/0x80
> > > > > [ 179.702802][ T34] async_run_entry_fn+0x30/0x140
> > > > > [ 179.707693][ T34] process_one_work+0x274/0x5c0
> > > > > [ 179.712503][ T34] worker_thread+0x50/0x3c0
> > > > > [ 179.716959][ T34] ? process_one_work+0x5c0/0x5c0
> > > > > [ 179.721936][ T34] kthread+0x14f/0x180
> > > > > [ 179.725958][ T34] ? set_kthread_struct+0x40/0x40
> > > > > [ 179.730935][ T34] ret_from_fork+0x22/0x30
> > > > > [ 179.735699][ T34] mei_me 0000:00:16.0: request_threaded_irq failure.
> > > irq = 16
> > > > > [ 179.743125][ T34] mei_me 0000:00:16.0: initialization failed.
> > > > > [ 179.749399][ T34] mei_me: probe of 0000:00:16.0 failed with error -16
> > > > >
> > > > >
> > > >
> > > > it seems there is a direct reference to pdev->irq.
> > > > Hi Oliver, would you try if the below patch can fix the problem:
> > >
> > > + Tomas
> > >
> > > sorry. after second looking, drivers/misc/mei/pci-me.c has many places using
> > > pdev->irq directly. We really need this driver's maintainers to address the
> > > problem.
> >
> > Will look at that.
>
> Hi Tomas,
>
> I assume using hw->irq or not is a separate topic, does vim command
> %s/pdev->irq/pci_irq_vector(pdev, 0)/g
> as below fix the current crash problem because of directly dereferencing
> pdev->irq?

We tested your fix. It resolves the "Flags_mismatch_irq##(mei_me)vs.#(xhci_hcd)"
issue, but the "Flags_mismatch_irq##(i915)vs.#(xhci_hcd)" and
"Flags_mismatch_irq##(i801_smbus)vs.#(xhci_hcd)" issues remain. Could you help
with them? Thanks!


=========================================================================================
compiler/group/kconfig/rootfs/tbox_group/testcase/ucode:
gcc-9/pidfd/x86_64-rhel-8.3-kselftests/debian-10.4-x86_64-20200603.cgz/lkp-skl-d05/kernel-selftests/0xe2

commit:
86c19983f1 ("PCI/sysfs: Don't depend on pci_dev.irq for IRQ entry") <<< parent
a4fc4cf388 ("PCI/MSI: remove msi_attrib.default_irq in msi_desc") <<< fbc
29368adf4c ("fixup-for-a4fc4cf388")

86c19983f1808cea a4fc4cf388319ea957ffbdab507 29368adf4c2b598c3e13dbd9603
---------------- --------------------------- ---------------------------
       fail:runs %reproduction     fail:runs %reproduction     fail:runs
               |             |             |             |             |
             :31           68%         21:31           71%         22:31 dmesg.genirq:Flags_mismatch_irq##(i801_smbus)vs.#(xhci_hcd)
             :31           84%         26:31           94%         29:31 dmesg.genirq:Flags_mismatch_irq##(i915)vs.#(xhci_hcd)
             :31           77%         24:31            0%           :31 dmesg.genirq:Flags_mismatch_irq##(mei_me)vs.#(xhci_hcd)



Attachments:
(No filename) (8.47 kB)
config-5.14.0-rc7-00015-g29368adf4c2b (178.15 kB)
dmesg.xz (22.82 kB)

2021-10-18 03:35:49

by Barry Song

Subject: Re: [LKP] Re: [PCI/MSI] a4fc4cf388: dmesg.genirq:Flags_mismatch_irq##(mei_me)vs.#(xhci_hcd)

On Sat, Oct 16, 2021 at 3:46 AM Carel Si <[email protected]> wrote:
>
> Hi, Barry
>
> On Sun, Oct 03, 2021 at 04:32:28PM +0800, Barry Song wrote:
> > On Fri, Sep 3, 2021 at 7:34 AM Winkler, Tomas <[email protected]> wrote:
> > >
> > > > dmesg.genirq:Flags_mismatch_irq##(mei_me)vs.#(xhci_hcd)
> > > >
> > > > On Tue, Aug 31, 2021 at 1:21 PM Barry Song <[email protected]> wrote:
> > > > >
> > > > > On Mon, Aug 30, 2021 at 2:38 AM kernel test robot
> > > > <[email protected]> wrote:
> > > > > >
> > > > > >
> > > > > >
> > > > > > Greeting,
> > > > > >
> > > > > > FYI, we noticed the following commit (built with gcc-9):
> > > > > >
> > > > > > commit: a4fc4cf388319ea957ffbdab5073bdd267de9082 ("[PATCH v3 3/3]
> > > > > > PCI/MSI: remove msi_attrib.default_irq in msi_desc")
> > > > > > url:
> > > > > > https://github.com/0day-ci/linux/commits/Barry-Song/PCI-MSI-Clarify-
> > > > > > the-IRQ-sysfs-ABI-for-PCI-devices/20210825-183018
> > > > > > base:
> > > > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git
> > > > > > 6e764bcd1cf72a2846c0e53d3975a09b242c04c9
> > > > > >
> > > > > > in testcase: kernel-selftests
> > > > > > version: kernel-selftests-x86_64-ebaa603b-1_20210825
> > > > > > with following parameters:
> > > > > >
> > > > > > group: pidfd
> > > > > > ucode: 0xe2
> > > > > >
> > > > > > test-description: The kernel contains a set of "self tests" under the
> > > > tools/testing/selftests/ directory. These are intended to be small unit tests
> > > > to exercise individual code paths in the kernel.
> > > > > > test-url: https://www.kernel.org/doc/Documentation/kselftest.txt
> > > > > >
> > > > > >
> > > > > > on test machine: 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
> > > > > > with 32G memory
> > > > > >
> > > > > > caused below changes (please refer to attached dmesg/kmsg for entire
> > > > log/backtrace):
> > > > > >
> > > > > >
> > > > > >
> > > > > > If you fix the issue, kindly add following tag
> > > > > > Reported-by: kernel test robot <[email protected]>
> > > > > >
> > > > > >
> > > > > >
> > > > > > [ 179.602028][ T34] genirq: Flags mismatch irq 16. 00002000 (mei_me) vs.
> > > > 00000000 (xhci_hcd)
> > > > > > [ 179.614073][ T34] CPU: 2 PID: 34 Comm: kworker/u8:2 Not tainted
> > > > 5.14.0-rc7-00014-ga4fc4cf38831 #1
> > > > > > [ 179.623225][ T34] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT,
> > > > BIOS 1.8.1 12/05/2017
> > > > > > [ 179.631432][ T34] Workqueue: events_unbound async_run_entry_fn
> > > > > > [ 179.637543][ T34] Call Trace:
> > > > > > [ 179.640789][ T34] dump_stack_lvl+0x45/0x59
> > > > > > [ 179.645253][ T34] __setup_irq.cold+0x50/0xd4
> > > > > > [ 179.649893][ T34] ? mei_me_pg_exit_sync+0x480/0x480 [mei_me]
> > > > > > [ 179.655923][ T34] request_threaded_irq+0x10c/0x180
> > > > > > [ 179.661073][ T34] ? mei_me_irq_quick_handler+0x240/0x240
> > > > [mei_me]
> > > > > > [ 179.667528][ T34] mei_me_probe+0x131/0x300 [mei_me]
> > > > > > [ 179.672767][ T34] local_pci_probe+0x42/0x80
> > > > > > [ 179.677313][ T34] pci_device_probe+0x107/0x1c0
> > > > > > [ 179.682118][ T34] really_probe+0xb6/0x380
> > > > > > [ 179.687094][ T34] __driver_probe_device+0xfe/0x180
> > > > > > [ 179.692242][ T34] driver_probe_device+0x1e/0xc0
> > > > > > [ 179.697133][ T34] __driver_attach_async_helper+0x2b/0x80
> > > > > > [ 179.702802][ T34] async_run_entry_fn+0x30/0x140
> > > > > > [ 179.707693][ T34] process_one_work+0x274/0x5c0
> > > > > > [ 179.712503][ T34] worker_thread+0x50/0x3c0
> > > > > > [ 179.716959][ T34] ? process_one_work+0x5c0/0x5c0
> > > > > > [ 179.721936][ T34] kthread+0x14f/0x180
> > > > > > [ 179.725958][ T34] ? set_kthread_struct+0x40/0x40
> > > > > > [ 179.730935][ T34] ret_from_fork+0x22/0x30
> > > > > > [ 179.735699][ T34] mei_me 0000:00:16.0: request_threaded_irq failure.
> > > > irq = 16
> > > > > > [ 179.743125][ T34] mei_me 0000:00:16.0: initialization failed.
> > > > > > [ 179.749399][ T34] mei_me: probe of 0000:00:16.0 failed with error -16
> > > > > >
> > > > > >
> > > > >
> > > > > it seems there is a direct reference to pdev->irq.
> > > > > Hi Oliver, would you try if the below patch can fix the problem:
> > > >
> > > > + Tomas
> > > >
> > > > sorry. after second looking, drivers/misc/mei/pci-me.c has many places using
> > > > pdev->irq directly. We really need this driver's maintainers to address the
> > > > problem.
> > >
> > > Will look at that.
> >
> > Hi Tomas,
> >
> > I assume using hw->irq or not is a separate topic, does vim command
> > %s/pdev->irq/pci_irq_vector(pdev, 0)/g
> > as below fix the current crash problem because of directly dereferencing
> > pdev->irq?
>
> We tested your fix, it can solve "Flags_mismatch_irq##(mei_me)vs.#(xhci_hcd)"

Thanks for testing and for the update.

> issue, but it still has "Flags_mismatch_irq##(i915)vs.#(xhci_hcd)" and

Can you post the backtrace of i915?

> "Flags_mismatch_irq##(i801_smbus)vs.#(xhci_hcd)" issue, could you help on them?

I assume the below can fix i801_smbus:

diff --git a/drivers/i2c/busses/i2c-i801.c b/drivers/i2c/busses/i2c-i801.c
index 89ae78ef1a1c..88d96e3ca268 100644
--- a/drivers/i2c/busses/i2c-i801.c
+++ b/drivers/i2c/busses/i2c-i801.c
@@ -1827,7 +1827,7 @@ static int i801_probe(struct pci_dev *dev, const struct pci_device_id *id)
/* Default timeout in interrupt mode: 200 ms */
priv->adapter.timeout = HZ / 5;

- if (dev->irq == IRQ_NOTCONNECTED)
+ if (pci_irq_vector(dev, 0) == IRQ_NOTCONNECTED)
priv->features &= ~FEATURE_IRQ;

if (priv->features & FEATURE_IRQ) {
@@ -1849,11 +1849,11 @@ static int i801_probe(struct pci_dev *dev, const struct pci_device_id *id)
if (priv->features & FEATURE_IRQ) {
init_completion(&priv->done);

- err = devm_request_irq(&dev->dev, dev->irq, i801_isr,
+ err = devm_request_irq(&dev->dev, pci_irq_vector(dev, 0), i801_isr,
IRQF_SHARED, DRV_NAME, priv);
if (err) {
dev_err(&dev->dev, "Failed to allocate irq %d: %d\n",
- dev->irq, err);
+ pci_irq_vector(dev, 0), err);
priv->features &= ~FEATURE_IRQ;
}
}



> thanks!

Thanks
barry

2021-10-19 06:57:21

by Carel Si

Subject: Re: [LKP] Re: [PCI/MSI] a4fc4cf388: dmesg.genirq:Flags_mismatch_irq##(mei_me)vs.#(xhci_hcd)

Hi Barry,

On Sat, Oct 16, 2021 at 08:08:43AM +0800, Barry Song wrote:
> > issue, but it still has "Flags_mismatch_irq##(i915)vs.#(xhci_hcd)" and
>
> Can you post the backtrace of i915?

Sure. The dmesg log is attached as dmesg.xz.

[ 186.776123][ T198] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[ 186.776390][ T12] i915 0000:00:02.0: Direct firmware load for i915/skl_dmc_ver1_27.bin failed with error -2
[ 186.776407][ T12] i915 0000:00:02.0: [drm] Failed to load DMC firmware i915/skl_dmc_ver1_27.bin. Disabling runtime power management.
[ 186.776409][ T12] i915 0000:00:02.0: [drm] DMC firmware homepage: https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/i915
[ 186.776697][ T198] genirq: Flags mismatch irq 16. 00000080 (i915) vs. 00000000 (xhci_hcd)
[ 186.776701][ T198] CPU: 0 PID: 198 Comm: systemd-udevd Not tainted 5.14.0-rc7-00016-gd41aee79717a #1
[ 186.776703][ T198] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.8.1 12/05/2017
[ 186.776704][ T198] Call Trace:
[ 186.776706][ T198] dump_stack_lvl+0x45/0x59
[ 186.776712][ T198] __setup_irq.cold+0x106/0x13e
[ 186.776717][ T198] ? dg1_irq_handler+0x100/0x100 [i915]
[ 186.776800][ T198] request_threaded_irq+0x10c/0x180
[ 186.776807][ T198] intel_irq_install+0x93/0x100 [i915]
[ 186.776873][ T198] i915_driver_probe+0x18f/0x440 [i915]
[ 186.776943][ T198] i915_pci_probe+0x54/0x140 [i915]
[ 186.777008][ T198] local_pci_probe+0x42/0x80
[ 186.777014][ T198] pci_device_probe+0x16b/0x200
[ 186.777017][ T198] ? sysfs_do_create_link_sd+0x69/0x100
[ 186.777025][ T198] really_probe+0xb6/0x380
[ 186.777028][ T198] __driver_probe_device+0xfe/0x180
[ 186.777032][ T198] driver_probe_device+0x1e/0xc0
[ 186.777035][ T198] __driver_attach+0x9e/0x180
[ 186.777036][ T198] ? __device_attach_driver+0x100/0x100
[ 186.777038][ T198] ? __device_attach_driver+0x100/0x100
[ 186.777040][ T198] bus_for_each_dev+0x7b/0xc0
[ 186.777046][ T198] bus_add_driver+0x150/0x200
[ 186.777051][ T198] driver_register+0x6c/0xc0
[ 186.777052][ T198] ? 0xffffffffc065d000
[ 186.777055][ T198] i915_init+0x62/0x81 [i915]
[ 186.777126][ T198] do_one_initcall+0x5b/0x340
[ 186.777129][ T198] ? do_init_module+0x23/0x280
[ 186.777132][ T198] ? kmem_cache_alloc_trace+0x533/0x640
[ 186.777135][ T198] ? lock_is_held_type+0xd5/0x140
[ 186.777142][ T198] do_init_module+0x5c/0x280
[ 186.777145][ T198] load_module+0x11b7/0x1540
[ 186.777166][ T198] ? __do_sys_finit_module+0xae/0x140
[ 186.777168][ T198] __do_sys_finit_module+0xae/0x140
[ 186.777182][ T198] do_syscall_64+0x5c/0x80
[ 186.777184][ T198] ? do_syscall_64+0x69/0x80
[ 186.777187][ T198] ? asm_exc_page_fault+0x1e/0x30
[ 186.777189][ T198] ? asm_exc_page_fault+0x8/0x30
[ 186.777191][ T198] ? lockdep_hardirqs_on+0x79/0x100
[ 186.777194][ T198] entry_SYSCALL_64_after_hwframe+0x44/0xae
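
For reading the "Flags mismatch" lines, the flag words can be decoded with the
IRQF_* values from include/linux/interrupt.h: 00002000 is IRQF_ONESHOT and
00000080 is IRQF_SHARED, so the holder of irq 16 (xhci_hcd, 00000000) did not
request the line as shared. The following is a simplified userspace model of
the sharing check, given as a sketch only: the function name
irq_flags_compatible is hypothetical, and the real check in the kernel covers
more cases (per-CPU handlers, for example) that are left out here.

```c
#include <assert.h>

/* Values as defined in include/linux/interrupt.h. */
#define IRQF_TRIGGER_MASK 0x0000000fUL
#define IRQF_SHARED       0x00000080UL
#define IRQF_ONESHOT      0x00002000UL

/*
 * Simplified model: returns 1 if a handler requested with new_flags may be
 * added to an IRQ line already held with old_flags, 0 otherwise.
 */
int irq_flags_compatible(unsigned long old_flags, unsigned long new_flags)
{
	/* Both the existing and the new handler must ask for sharing. */
	if (!(old_flags & new_flags & IRQF_SHARED))
		return 0;
	/* Trigger type must be the same for all handlers on the line. */
	if ((old_flags ^ new_flags) & IRQF_TRIGGER_MASK)
		return 0;
	/* All handlers must agree on ONESHOT. */
	if ((old_flags ^ new_flags) & IRQF_ONESHOT)
		return 0;
	return 1;
}

/* The three flag pairs reported in this thread all fail the check. */
void example_thread_flags(void)
{
	assert(!irq_flags_compatible(0x00000000UL, 0x00002000UL)); /* mei_me */
	assert(!irq_flags_compatible(0x00000000UL, 0x00000080UL)); /* i915, i801_smbus */
}
```

Under this model, requesting IRQF_SHARED alone is not enough: the line must
already be held as shared, which is not the case for irq 16 here.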

>
> > "Flags_mismatch_irq##(i801_smbus)vs.#(xhci_hcd)" issue, could you help on them?
>
> I assume the below can fix i801_smbus:

We tested that patch. It does not completely fix
"Flags_mismatch_irq##(i801_smbus)vs.#(xhci_hcd)": the result is unstable, and
in 21 of 31 runs we still hit the issue.

Could you help with this? Thanks! The dmesg log is also attached in dmesg.xz.

=========================================================================================
compiler/group/kconfig/rootfs/tbox_group/testcase/ucode:
gcc-9/pidfd/x86_64-rhel-8.3-kselftests/debian-10.4-x86_64-20200603.cgz/lkp-skl-d05/kernel-selftests/0xe2

commit:
86c19983f1 ("PCI/sysfs: Don't depend on pci_dev.irq for IRQ entry") <<< parent
a4fc4cf388 ("PCI/MSI: remove msi_attrib.default_irq in msi_desc") <<< fbc
29368adf4c ("fixup-for-a4fc4cf388") <<< patch to fix "mei_me"
d41aee7971 ("fixup-for-29368adf4c") <<< patch to fix "i801_smbus"

86c19983f1808cea a4fc4cf388319ea957ffbdab507 29368adf4c2b598c3e13dbd9603 d41aee79717a7f4ac7f1888e488
---------------- --------------------------- --------------------------- ---------------------------
       fail:runs %reproduction     fail:runs %reproduction     fail:runs %reproduction     fail:runs
               |             |             |             |             |             |             |
             :31           68%         21:31           71%         22:31           68%         21:31  dmesg.genirq:Flags_mismatch_irq##(i801_smbus)vs.#(xhci_hcd)
             :31           84%         26:31           94%         29:31           87%         27:31  dmesg.genirq:Flags_mismatch_irq##(i915)vs.#(xhci_hcd)
             :31           77%         24:31            0%           :31            0%           :31  dmesg.genirq:Flags_mismatch_irq##(mei_me)vs.#(xhci_hcd)
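
For reading the table above: the %reproduction figures are consistent with fail/runs rounded to the nearest percent (for example 21:31 gives 68%). The helper below is only a sketch of that arithmetic; the name repro_percent() is made up for illustration and is not taken from the test tooling.

```c
/*
 * Hypothetical helper, not part of any test framework: percentage of
 * failing runs, rounded to the nearest integer with integer arithmetic.
 * Assumes fails >= 0 and fails <= runs.
 */
static int repro_percent(int fails, int runs)
{
	if (runs <= 0)
		return 0;	/* no runs recorded, nothing to report */

	/* Adding runs / 2 before dividing rounds to nearest. */
	return (fails * 100 + runs / 2) / runs;
}
```

Under that assumption, repro_percent(21, 31) gives 68 and repro_percent(29, 31) gives 94, which matches the i801_smbus and i915 rows.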


[ 185.967634][ T230] i801_smbus 0000:00:1f.4: SPD Write Disable is set
[ 185.969376][ T220] ahci 0000:00:17.0: flags: 64bit ncq pm led clo only pio slum part ems deso sadm sds apst
[ 185.974251][ T230] genirq: Flags mismatch irq 16. 00000080 (i801_smbus) vs. 00000000 (xhci_hcd)
[ 185.993159][ T230] CPU: 0 PID: 230 Comm: systemd-udevd Not tainted 5.14.0-rc7-00016-gd41aee79717a #1
[ 185.993164][ T230] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.8.1 12/05/2017
[ 185.993165][ T230] Call Trace:
[ 186.020986][ T230] __setup_irq.cold+0x106/0x13e
[ 186.027042][ T230] ? i801_remove+0x80/0x80 [i2c_i801]
[ 186.033616][ T230] request_threaded_irq+0x10c/0x180
[ 186.039480][ T230] ? i801_remove+0x80/0x80 [i2c_i801]
[ 186.044791][ T230] devm_request_threaded_irq+0x72/0x100
[ 186.044799][ T230] i801_probe+0x576/0x680 [i2c_i801]
[ 186.055535][ T230] ? preempt_count_sub+0xa1/0x100
[ 186.068795][ T230] local_pci_probe+0x42/0x80
[ 186.074591][ T230] pci_device_probe+0x16b/0x200
[ 186.080277][ T230] ? sysfs_do_create_link_sd+0x69/0x100
[ 186.086373][ T230] really_probe+0xb6/0x380
[ 186.086379][ T230] __driver_probe_device+0xfe/0x180
[ 186.086382][ T230] driver_probe_device+0x1e/0xc0
[ 186.086385][ T230] __driver_attach+0x9e/0x180
[ 186.086389][ T230] ? __device_attach_driver+0x100/0x100
[ 186.086391][ T230] bus_for_each_dev+0x7b/0xc0
[ 186.086397][ T230] bus_add_driver+0x150/0x200
[ 186.086402][ T230] driver_register+0x6c/0xc0
[ 186.086404][ T230] ? 0xffffffffc0085000
[ 186.086407][ T230] i2c_i801_init+0xb3/0x1000 [i2c_i801]
[ 186.086413][ T230] ? 0xffffffffc0085000
[ 186.086415][ T230] do_one_initcall+0x5b/0x340
[ 186.086418][ T230] ? do_init_module+0x23/0x280
[ 186.086421][ T230] ? kmem_cache_alloc_trace+0x533/0x640
[ 186.086424][ T230] ? lock_is_held_type+0xd5/0x140
[ 186.086431][ T230] do_init_module+0x5c/0x280
[ 186.086435][ T230] load_module+0x11b7/0x1540
[ 186.086455][ T230] ? __do_sys_finit_module+0xae/0x140
[ 186.086457][ T230] __do_sys_finit_module+0xae/0x140
[ 186.086471][ T230] do_syscall_64+0x5c/0x80
[ 186.086474][ T230] ? do_syscall_64+0x69/0x80
[ 186.086476][ T230] ? lockdep_hardirqs_on+0x79/0x100
[ 186.086480][ T230] ? do_syscall_64+0x69/0x80
[ 186.086480][ T230] ? do_syscall_64+0x69/0x80
[ 186.086482][ T230] ? do_syscall_64+0x69/0x80
[ 186.086484][ T230] ? do_syscall_64+0x69/0x80
[ 186.086488][ T230] ? do_syscall_64+0x69/0x80
[ 186.086490][ T230] ? lockdep_hardirqs_on+0x79/0x100
[ 186.086493][ T230] ? do_syscall_64+0x69/0x80
[ 186.086495][ T230] ? asm_exc_page_fault+0x1e/0x30
[ 186.086497][ T230] ? asm_exc_page_fault+0x8/0x30
[ 186.086498][ T230] ? lockdep_hardirqs_on+0x79/0x100
[ 186.086502][ T230] entry_SYSCALL_64_after_hwframe+0x44/0xae
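
A note on the "Flags mismatch irq 16. 00000080 (i801_smbus) vs. 00000000 (xhci_hcd)" line in the trace above: 0x80 is IRQF_SHARED, so i801_smbus is asking to share IRQ 16 while the existing xhci_hcd action was registered without IRQF_SHARED. The userspace sketch below models only that one condition. It is a simplification and an assumption about what matters here: the real __setup_irq() also compares the trigger type and IRQF_ONESHOT, and can_share_irq() is a made-up name.

```c
#include <stdbool.h>

/* Same value as IRQF_SHARED in include/linux/interrupt.h. */
#define IRQF_SHARED 0x00000080u

/*
 * Simplified model (assumption, not the kernel implementation): a second
 * handler may be added to an already-requested IRQ line only if both the
 * existing action and the new one were requested with IRQF_SHARED.
 */
static bool can_share_irq(unsigned int old_flags, unsigned int new_flags)
{
	return ((old_flags & new_flags) & IRQF_SHARED) != 0;
}
```

With the flags from the log, can_share_irq(0x00000000, 0x00000080) is false, which corresponds to the mismatch reported for i801_smbus.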

>
> diff --git a/drivers/i2c/busses/i2c-i801.c b/drivers/i2c/busses/i2c-i801.c
> index 89ae78ef1a1c..88d96e3ca268 100644
> --- a/drivers/i2c/busses/i2c-i801.c
> +++ b/drivers/i2c/busses/i2c-i801.c
> @@ -1827,7 +1827,7 @@ static int i801_probe(struct pci_dev *dev, const struct pci_device_id *id)
> /* Default timeout in interrupt mode: 200 ms */
> priv->adapter.timeout = HZ / 5;
>
> - if (dev->irq == IRQ_NOTCONNECTED)
> + if (pci_irq_vector(dev, 0) == IRQ_NOTCONNECTED)
> priv->features &= ~FEATURE_IRQ;
>
> if (priv->features & FEATURE_IRQ) {
> @@ -1849,11 +1849,11 @@ static int i801_probe(struct pci_dev *dev, const struct pci_device_id *id)
> if (priv->features & FEATURE_IRQ) {
> init_completion(&priv->done);
>
> - err = devm_request_irq(&dev->dev, dev->irq, i801_isr,
> + err = devm_request_irq(&dev->dev, pci_irq_vector(dev, 0), i801_isr,
> IRQF_SHARED, DRV_NAME, priv);
> if (err) {
> dev_err(&dev->dev, "Failed to allocate irq %d: %d\n",
> - dev->irq, err);
> + pci_irq_vector(dev, 0), err);
> priv->features &= ~FEATURE_IRQ;
> }
> }
>
>
>
> > thanks!
> >
> >
> > =========================================================================================
> > compiler/group/kconfig/rootfs/tbox_group/testcase/ucode:
> > gcc-9/pidfd/x86_64-rhel-8.3-kselftests/debian-10.4-x86_64-20200603.cgz/lkp-skl-d05/kernel-selftests/0xe2
> >
> > commit:
> > 86c19983f1 ("PCI/sysfs: Don't depend on pci_dev.irq for IRQ entry") <<< parent
> > a4fc4cf388 ("PCI/MSI: remove msi_attrib.default_irq in msi_desc") <<< fbc
> > 29368adf4c ("fixup-for-a4fc4cf388")
> >
> > 86c19983f1808cea a4fc4cf388319ea957ffbdab507 29368adf4c2b598c3e13dbd9603
> > ---------------- --------------------------- ---------------------------
> >        fail:runs %reproduction     fail:runs %reproduction     fail:runs
> >                |             |             |             |             |
> >              :31           68%         21:31           71%         22:31  dmesg.genirq:Flags_mismatch_irq##(i801_smbus)vs.#(xhci_hcd)
> >              :31           84%         26:31           94%         29:31  dmesg.genirq:Flags_mismatch_irq##(i915)vs.#(xhci_hcd)
> >              :31           77%         24:31            0%           :31  dmesg.genirq:Flags_mismatch_irq##(mei_me)vs.#(xhci_hcd)
> >
> > >
> > > diff --git a/drivers/misc/mei/pci-me.c b/drivers/misc/mei/pci-me.c
> > > index c3393b383e59..97495931fadd 100644
> > > --- a/drivers/misc/mei/pci-me.c
> > > +++ b/drivers/misc/mei/pci-me.c
> > > @@ -216,18 +216,18 @@ static int mei_me_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> > >
> > > pci_enable_msi(pdev);
> > >
> > > - hw->irq = pdev->irq;
> > > + hw->irq = pci_irq_vector(pdev, 0);
> > >
> > > /* request and enable interrupt */
> > > irqflags = pci_dev_msi_enabled(pdev) ? IRQF_ONESHOT : IRQF_SHARED;
> > >
> > > - err = request_threaded_irq(pdev->irq,
> > > + err = request_threaded_irq(pci_irq_vector(pdev, 0),
> > > mei_me_irq_quick_handler,
> > > mei_me_irq_thread_handler,
> > > irqflags, KBUILD_MODNAME, dev);
> > > if (err) {
> > > dev_err(&pdev->dev, "request_threaded_irq failure. irq = %d\n",
> > > - pdev->irq);
> > > + pci_irq_vector(pdev, 0));
> > > goto end;
> > > }
> > >
> > > @@ -278,7 +278,7 @@ static int mei_me_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> > > release_irq:
> > > mei_cancel_work(dev);
> > > mei_disable_interrupts(dev);
> > > - free_irq(pdev->irq, dev);
> > > + free_irq(pci_irq_vector(pdev, 0), dev);
> > > end:
> > > dev_err(&pdev->dev, "initialization failed.\n");
> > > return err;
> > > @@ -307,7 +307,7 @@ static void mei_me_shutdown(struct pci_dev *pdev)
> > > mei_me_unset_pm_domain(dev);
> > >
> > > mei_disable_interrupts(dev);
> > > - free_irq(pdev->irq, dev);
> > > + free_irq(pci_irq_vector(pdev, 0), dev);
> > > }
> > >
> > > /**
> > > @@ -336,7 +336,7 @@ static void mei_me_remove(struct pci_dev *pdev)
> > >
> > > mei_disable_interrupts(dev);
> > >
> > > - free_irq(pdev->irq, dev);
> > > + free_irq(pci_irq_vector(pdev, 0), dev);
> > >
> > > mei_deregister(dev);
> > > }
> > > @@ -356,7 +356,7 @@ static int mei_me_pci_suspend(struct device *device)
> > >
> > > mei_disable_interrupts(dev);
> > >
> > > - free_irq(pdev->irq, dev);
> > > + free_irq(pci_irq_vector(pdev, 0), dev);
> > > pci_disable_msi(pdev);
> > >
> > > return 0;
> > > @@ -378,14 +378,14 @@ static int mei_me_pci_resume(struct device *device)
> > > irqflags = pci_dev_msi_enabled(pdev) ? IRQF_ONESHOT : IRQF_SHARED;
> > >
> > > /* request and enable interrupt */
> > > - err = request_threaded_irq(pdev->irq,
> > > + err = request_threaded_irq(pci_irq_vector(pdev, 0),
> > > mei_me_irq_quick_handler,
> > > mei_me_irq_thread_handler,
> > > irqflags, KBUILD_MODNAME, dev);
> > >
> > > if (err) {
> > > dev_err(&pdev->dev, "request_threaded_irq failed: irq = %d.\n",
> > > - pdev->irq);
> > > + pci_irq_vector(pdev, 0));
> > > return err;
> > > }
> > >
> > >
> > > Thanks
> > > barry
>
> Thanks
> barry


Attachments:
(No filename) (13.17 kB)
dmesg.xz (22.88 kB)
config-5.14.0-rc7-00016-gd41aee79717a (178.15 kB)