2015-05-23 01:53:50

by Boris Ostrovsky

[permalink] [raw]
Subject: Re: [Xen-devel] Regression due to "device property: Make it possible to use secondary firmware nodes" Re: Xen-unstable + linux 4.1-mergewindow: problems with PV guest pci passthrough: pcifront pci-0: pciback not responding!!!

On 05/22/2015 04:11 AM, Sander Eikelenboom wrote:
> Hello Sander,
>
> Friday, May 15, 2015, 12:47:27 AM, you wrote:
>
>> Sorry for the resend, i messed up the to's en from's.
>
>> Hi Konrad / David,
>
>> One big snip on this thread, got some more debug info, hopefully this will
>> lead to something:
>
>> On a working kernel (with the two seemingly non related patches reverted) i get:
>
>> [ 0.717796] pcifront pci-0: Allocated pdev @ 0xffff880019e11780 pdev->sh_info @ 0xffff880018f58000
>> [ 0.717848] pcifront pci-0: ?!?!? before alloc gntref: 0
>> [ 0.717871] pcifront pci-0: ?!?!? after alloc gntref: 8
>> [ 0.717892] pcifront pci-0: ?!?!? before alloc evtchn: -1
>> [ 0.717915] pcifront pci-0: ?!?!? after alloc evtchn: 17
>> [ 0.717984] pcifront pci-0: ?!?!? bound evtchn:17 to irqhandler:-1 err:31
>> [ 0.721640] pcifront pci-0: publishing successful!
>> [ 0.723684] usbcore: registered new interface driver udlfb
>> [ 0.724664] xen:xen_evtchn: Event-channel device installed
>> [ 0.726597] pcifront pci-0: Installing PCI frontend
>> [ 0.726853] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
>> [ 0.727059] pcifront pci-0: Creating PCI Frontend Bus 0000:00
>> [ 0.727363] pcifront pci-0: PCI host bridge to bus 0000:00
>> [ 0.727391] pci_bus 0000:00: root bus resource [io 0x0000-0xffff]
>> [ 0.727417] pci_bus 0000:00: root bus resource [mem 0x00000000-0xffffffffffff]
>> [ 0.727452] pci_bus 0000:00: root bus resource [bus 00-ff]
>> [ 0.727475] pci_bus 0000:00: scanning bus
>> [ 0.727503] pcifront pci-0: read dev=0000:00:00.0 - offset 0 size 4
>> [ 0.728253] Linux agpgart interface v0.103
>> [ 0.728387] Hangcheck: starting hangcheck timer 0.9.1 (tick is 180 seconds, margin is 60 seconds).
>> [ 0.728474] [drm] Initialized drm 1.1.0 20060810
>> [ 0.728551] [drm] radeon kernel modesetting enabled.
>> [ 0.730319] pcifront pci-0: ?!?!? pciback responded !!! irq:31 irq_flags:ffff880019e100a8 ns: 1431641785551700000 ns_timeout: 1431641787541235000 evtchn:17 gnt_ref:8
>> [ 0.730319] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:0 size:4
>> [ 0.730319] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:0 size:4
>> [ 0.730319] pcifront pci-0: read got back value 11113f6
>> [ 0.738845] pcifront pci-0: read dev=0000:00:00.0 - offset e size 1
>> [ 0.744976] brd: module loaded
>> [ 0.745204] pcifront pci-0: ?!?!? pciback responded !!! irq:31 irq_flags:ffff880019e100a8 ns: 1431641785562852000 ns_timeout: 1431641787552580000 evtchn:17 gnt_ref:8
>> [ 0.745204] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:14 size:1
>> [ 0.745204] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:14 size:1
>> [ 0.745204] pcifront pci-0: read got back value 0
>> [ 0.749204] pcifront pci-0: read dev=0000:00:00.0 - offset 6 size 2
>> [ 0.750155] loop: module loaded
>> [ 0.752527] pcifront pci-0: ?!?!? pciback responded !!! irq:31 irq_flags:ffff880019e100a8 ns: 1431641785570841000 ns_timeout: 1431641787562917000 evtchn:17 gnt_ref:8
>> [ 0.752527] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:6 size:2
>> [ 0.752527] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:6 size:2
>> [ 0.752527] pcifront pci-0: read got back value 210
>> [ 0.757187] pcifront pci-0: read dev=0000:00:00.0 - offset 34 size 1
>
>
>> Were as in the non-working situation i get:
>
>> [ 0.751244] pcifront pci-0: Allocated pdev @ 0xffff880019ec2e00 pdev->sh_info @ 0xffff88001aa51000
>> [ 0.751295] pcifront pci-0: ?!?!? before alloc gntref: 0
>> [ 0.751315] pcifront pci-0: ?!?!? after alloc gntref: 8
>> [ 0.751334] pcifront pci-0: ?!?!? before alloc evtchn: -1
>> [ 0.751355] pcifront pci-0: ?!?!? after alloc evtchn: 17
>> [ 0.751422] pcifront pci-0: ?!?!? bound evtchn:17 to irqhandler:-1 err:31
>> [ 0.755215] pcifront pci-0: publishing successful!
>> [ 0.757341] usbcore: registered new interface driver udlfb
>> [ 0.758365] xen:xen_evtchn: Event-channel device installed
>> [ 0.760419] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
>> [ 0.760819] pcifront pci-0: Installing PCI frontend
>> [ 0.761518] pcifront pci-0: Creating PCI Frontend Bus 0000:00
>> [ 0.761684] pcifront pci-0: PCI host bridge to bus 0000:00
>> [ 0.761710] pci_bus 0000:00: root bus resource [io 0x0000-0xffff]
>> [ 0.761733] pci_bus 0000:00: root bus resource [mem 0x00000000-0xffffffffffff]
>> [ 0.761763] pci_bus 0000:00: root bus resource [bus 00-ff]
>> [ 0.761783] pci_bus 0000:00: scanning bus
>> [ 0.761805] pcifront pci-0: read dev=0000:00:00.0 - offset 0 size 4
>> [ 0.767207] Linux agpgart interface v0.103
>> [ 0.767362] Hangcheck: starting hangcheck timer 0.9.1 (tick is 180 seconds, margin is 60 seconds).
>> [ 0.767439] [drm] Initialized drm 1.1.0 20060810
>> [ 0.767515] [drm] radeon kernel modesetting enabled.
>> [ 0.766948] pcifront pci-0: pciback not responding!!! irq:31 irq_flags:ffff880019ec0028 ns: 1431641983026498000 ns_timeout: 1431641983026497000 evtchn:0 gnt_ref:0
>> [ 0.766948] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:0 size:4
>> [ 0.766948] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:0 size:4
>> [ 0.766948] pcifront pci-0: other err read got back err: ffffffff value: 0
>> [ 2.762062] pcifront pci-0: read dev=0000:00:01.0 - offset 0 size 4
>> [ 2.765203] pcifront pci-0: pciback not responding!!! irq:31 irq_flags:ffff880019ec0028 ns: 1431641985026742000 ns_timeout: 1431641985026741000 evtchn:0 gnt_ref:0
>> [ 2.765203] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:0 size:4
>> [ 2.765203] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:0 size:4
>> [ 2.765203] pcifront pci-0: other err read got back err: ffffffff value: 0
>> [ 4.762172] pcifront pci-0: read dev=0000:00:02.0 - offset 0 size 4
>> [ 4.764231] brd: module loaded
>> [ 4.765508] loop: module loaded
>> [ 4.766748] pcifront pci-0: pciback not responding!!! irq:31 irq_flags:ffff880019ec0028 ns: 1431641987026850000 ns_timeout: 1431641987026849000 evtchn:0 gnt_ref:0
>> [ 4.766748] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:0 size:4
>> [ 4.766748] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:0 size:4
>> [ 4.766748] pcifront pci-0: other err read got back err: ffffffff value: 0
>> [ 6.762248] pcifront pci-0: read dev=0000:00:03.0 - offset 0 size 4
>> [ 6.765545] pcifront pci-0: pciback not responding!!! irq:31 irq_flags:ffff880019ec0028 ns: 1431641989026930000 ns_timeout: 1431641989026929000 evtchn:0 gnt_ref:0
>> [ 6.765545] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:0 size:4
>> [ 6.765545] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:0 size:4
>> [ 6.765545] pcifront pci-0: other err read got back err: ffffffff value: 0
>> [ 8.762329] pcifront pci-0: read dev=0000:00:04.0 - offset 0 size 4
>> [ 8.765626] pcifront pci-0: pciback not responding!!! irq:31 irq_flags:ffff880019ec0028 ns: 1431641991027006000 ns_timeout: 1431641991027005000 evtchn:0 gnt_ref:0
>> [ 8.765626] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:0 size:4
>> [ 8.765626] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:0 size:4
>> [ 8.765626] pcifront pci-0: other err read got back err: ffffffff value: 0
>> [ 10.762410] pcifront pci-0: read dev=0000:00:05.0 - offset 0 size 4
>> [ 10.765701] pcifront pci-0: pciback not responding!!! irq:31 irq_flags:ffff880019ec0028 ns: 1431641993027087000 ns_timeout: 1431641993027086000 evtchn:0 gnt_ref:0
>> [ 10.765701] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:0 size:4
>> [ 10.765701] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:0 size:4
>> [ 10.765701] pcifront pci-0: other err read got back err: ffffffff value: 0
>> [ 12.762472] pcifront pci-0: read dev=0000:00:06.0 - offset 0 size 4
>
>
>> So somehow in the non-working situation, pdev->evtchn and pdev->gnt_ref are 0 in
>> xen-pcifront.c:do_pci_op(), so no wonder it's not getting a response back ...
>
>> Question is .. why ?
>
>> --
>> Sander
>
>
> Ping ?
>
> David / Boris,
>
> Any idea, since Konrad seems to be off for 2 weeks and we are at rc4 now.
>

(+Rafael again)

So the immediate cause of those errors is that pdev->evtchn is 0.
Backend is not notified and things not go well then.

And it is indeed caused by 97badf873ab60e841243b66133ff9eff2a46ef29:

We allocate pcifront_sd in pcifront_scan_root() and then pass it to
pci_scan_bus_parented() as sysdata. Eventually this sysdata is used in
pcibios_root_bridge_prepare() as pci_sysdata. It is dereferenced as
pci_sysdata->companion (which I believe is aliased to pcifront_sd->pdev)
and then set_primary_fwnode() writes it, thus corrupting
pcifront_sd->pdev (and I think this is what sets evtchn to zero).

I don't have a fix for that. I will see what we can do on Tuesday since
I am out on Monday.

Question to Rafael about commit 97badf873ab60e84124: is it really safe
to assume that bridge->bus->sysdata is a pointer to pci_sysdata in
pcibios_root_bridge_prepare()? It is declared as 'void *'.

-boris


2015-05-23 07:58:15

by Sander Eikelenboom

[permalink] [raw]
Subject: Re: [Xen-devel] Regression due to "device property: Make it possible to use secondary firmware nodes" Re: Xen-unstable + linux 4.1-mergewindow: problems with PV guest pci passthrough: pcifront pci-0: pciback not responding!!!


Saturday, May 23, 2015, 3:53:37 AM, you wrote:

> On 05/22/2015 04:11 AM, Sander Eikelenboom wrote:
>> Hello Sander,
>>
>> Friday, May 15, 2015, 12:47:27 AM, you wrote:
>>
>>> Sorry for the resend, i messed up the to's en from's.
>>
>>> Hi Konrad / David,
>>
>>> One big snip on this thread, got some more debug info, hopefully this will
>>> lead to something:
>>
>>> On a working kernel (with the two seemingly non related patches reverted) i get:
>>
>>> [ 0.717796] pcifront pci-0: Allocated pdev @ 0xffff880019e11780 pdev->sh_info @ 0xffff880018f58000
>>> [ 0.717848] pcifront pci-0: ?!?!? before alloc gntref: 0
>>> [ 0.717871] pcifront pci-0: ?!?!? after alloc gntref: 8
>>> [ 0.717892] pcifront pci-0: ?!?!? before alloc evtchn: -1
>>> [ 0.717915] pcifront pci-0: ?!?!? after alloc evtchn: 17
>>> [ 0.717984] pcifront pci-0: ?!?!? bound evtchn:17 to irqhandler:-1 err:31
>>> [ 0.721640] pcifront pci-0: publishing successful!
>>> [ 0.723684] usbcore: registered new interface driver udlfb
>>> [ 0.724664] xen:xen_evtchn: Event-channel device installed
>>> [ 0.726597] pcifront pci-0: Installing PCI frontend
>>> [ 0.726853] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
>>> [ 0.727059] pcifront pci-0: Creating PCI Frontend Bus 0000:00
>>> [ 0.727363] pcifront pci-0: PCI host bridge to bus 0000:00
>>> [ 0.727391] pci_bus 0000:00: root bus resource [io 0x0000-0xffff]
>>> [ 0.727417] pci_bus 0000:00: root bus resource [mem 0x00000000-0xffffffffffff]
>>> [ 0.727452] pci_bus 0000:00: root bus resource [bus 00-ff]
>>> [ 0.727475] pci_bus 0000:00: scanning bus
>>> [ 0.727503] pcifront pci-0: read dev=0000:00:00.0 - offset 0 size 4
>>> [ 0.728253] Linux agpgart interface v0.103
>>> [ 0.728387] Hangcheck: starting hangcheck timer 0.9.1 (tick is 180 seconds, margin is 60 seconds).
>>> [ 0.728474] [drm] Initialized drm 1.1.0 20060810
>>> [ 0.728551] [drm] radeon kernel modesetting enabled.
>>> [ 0.730319] pcifront pci-0: ?!?!? pciback responded !!! irq:31 irq_flags:ffff880019e100a8 ns: 1431641785551700000 ns_timeout: 1431641787541235000 evtchn:17 gnt_ref:8
>>> [ 0.730319] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:0 size:4
>>> [ 0.730319] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:0 size:4
>>> [ 0.730319] pcifront pci-0: read got back value 11113f6
>>> [ 0.738845] pcifront pci-0: read dev=0000:00:00.0 - offset e size 1
>>> [ 0.744976] brd: module loaded
>>> [ 0.745204] pcifront pci-0: ?!?!? pciback responded !!! irq:31 irq_flags:ffff880019e100a8 ns: 1431641785562852000 ns_timeout: 1431641787552580000 evtchn:17 gnt_ref:8
>>> [ 0.745204] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:14 size:1
>>> [ 0.745204] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:14 size:1
>>> [ 0.745204] pcifront pci-0: read got back value 0
>>> [ 0.749204] pcifront pci-0: read dev=0000:00:00.0 - offset 6 size 2
>>> [ 0.750155] loop: module loaded
>>> [ 0.752527] pcifront pci-0: ?!?!? pciback responded !!! irq:31 irq_flags:ffff880019e100a8 ns: 1431641785570841000 ns_timeout: 1431641787562917000 evtchn:17 gnt_ref:8
>>> [ 0.752527] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:6 size:2
>>> [ 0.752527] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:6 size:2
>>> [ 0.752527] pcifront pci-0: read got back value 210
>>> [ 0.757187] pcifront pci-0: read dev=0000:00:00.0 - offset 34 size 1
>>
>>
>>> Were as in the non-working situation i get:
>>
>>> [ 0.751244] pcifront pci-0: Allocated pdev @ 0xffff880019ec2e00 pdev->sh_info @ 0xffff88001aa51000
>>> [ 0.751295] pcifront pci-0: ?!?!? before alloc gntref: 0
>>> [ 0.751315] pcifront pci-0: ?!?!? after alloc gntref: 8
>>> [ 0.751334] pcifront pci-0: ?!?!? before alloc evtchn: -1
>>> [ 0.751355] pcifront pci-0: ?!?!? after alloc evtchn: 17
>>> [ 0.751422] pcifront pci-0: ?!?!? bound evtchn:17 to irqhandler:-1 err:31
>>> [ 0.755215] pcifront pci-0: publishing successful!
>>> [ 0.757341] usbcore: registered new interface driver udlfb
>>> [ 0.758365] xen:xen_evtchn: Event-channel device installed
>>> [ 0.760419] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
>>> [ 0.760819] pcifront pci-0: Installing PCI frontend
>>> [ 0.761518] pcifront pci-0: Creating PCI Frontend Bus 0000:00
>>> [ 0.761684] pcifront pci-0: PCI host bridge to bus 0000:00
>>> [ 0.761710] pci_bus 0000:00: root bus resource [io 0x0000-0xffff]
>>> [ 0.761733] pci_bus 0000:00: root bus resource [mem 0x00000000-0xffffffffffff]
>>> [ 0.761763] pci_bus 0000:00: root bus resource [bus 00-ff]
>>> [ 0.761783] pci_bus 0000:00: scanning bus
>>> [ 0.761805] pcifront pci-0: read dev=0000:00:00.0 - offset 0 size 4
>>> [ 0.767207] Linux agpgart interface v0.103
>>> [ 0.767362] Hangcheck: starting hangcheck timer 0.9.1 (tick is 180 seconds, margin is 60 seconds).
>>> [ 0.767439] [drm] Initialized drm 1.1.0 20060810
>>> [ 0.767515] [drm] radeon kernel modesetting enabled.
>>> [ 0.766948] pcifront pci-0: pciback not responding!!! irq:31 irq_flags:ffff880019ec0028 ns: 1431641983026498000 ns_timeout: 1431641983026497000 evtchn:0 gnt_ref:0
>>> [ 0.766948] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:0 size:4
>>> [ 0.766948] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:0 size:4
>>> [ 0.766948] pcifront pci-0: other err read got back err: ffffffff value: 0
>>> [ 2.762062] pcifront pci-0: read dev=0000:00:01.0 - offset 0 size 4
>>> [ 2.765203] pcifront pci-0: pciback not responding!!! irq:31 irq_flags:ffff880019ec0028 ns: 1431641985026742000 ns_timeout: 1431641985026741000 evtchn:0 gnt_ref:0
>>> [ 2.765203] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:0 size:4
>>> [ 2.765203] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:0 size:4
>>> [ 2.765203] pcifront pci-0: other err read got back err: ffffffff value: 0
>>> [ 4.762172] pcifront pci-0: read dev=0000:00:02.0 - offset 0 size 4
>>> [ 4.764231] brd: module loaded
>>> [ 4.765508] loop: module loaded
>>> [ 4.766748] pcifront pci-0: pciback not responding!!! irq:31 irq_flags:ffff880019ec0028 ns: 1431641987026850000 ns_timeout: 1431641987026849000 evtchn:0 gnt_ref:0
>>> [ 4.766748] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:0 size:4
>>> [ 4.766748] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:0 size:4
>>> [ 4.766748] pcifront pci-0: other err read got back err: ffffffff value: 0
>>> [ 6.762248] pcifront pci-0: read dev=0000:00:03.0 - offset 0 size 4
>>> [ 6.765545] pcifront pci-0: pciback not responding!!! irq:31 irq_flags:ffff880019ec0028 ns: 1431641989026930000 ns_timeout: 1431641989026929000 evtchn:0 gnt_ref:0
>>> [ 6.765545] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:0 size:4
>>> [ 6.765545] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:0 size:4
>>> [ 6.765545] pcifront pci-0: other err read got back err: ffffffff value: 0
>>> [ 8.762329] pcifront pci-0: read dev=0000:00:04.0 - offset 0 size 4
>>> [ 8.765626] pcifront pci-0: pciback not responding!!! irq:31 irq_flags:ffff880019ec0028 ns: 1431641991027006000 ns_timeout: 1431641991027005000 evtchn:0 gnt_ref:0
>>> [ 8.765626] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:0 size:4
>>> [ 8.765626] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:0 size:4
>>> [ 8.765626] pcifront pci-0: other err read got back err: ffffffff value: 0
>>> [ 10.762410] pcifront pci-0: read dev=0000:00:05.0 - offset 0 size 4
>>> [ 10.765701] pcifront pci-0: pciback not responding!!! irq:31 irq_flags:ffff880019ec0028 ns: 1431641993027087000 ns_timeout: 1431641993027086000 evtchn:0 gnt_ref:0
>>> [ 10.765701] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:0 size:4
>>> [ 10.765701] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:0 size:4
>>> [ 10.765701] pcifront pci-0: other err read got back err: ffffffff value: 0
>>> [ 12.762472] pcifront pci-0: read dev=0000:00:06.0 - offset 0 size 4
>>
>>
>>> So somehow in the non-working situation, pdev->evtchn and pdev->gnt_ref are 0 in
>>> xen-pcifront.c:do_pci_op(), so no wonder it's not getting a response back ...
>>
>>> Question is .. why ?
>>
>>> --
>>> Sander
>>
>>
>> Ping ?
>>
>> David / Boris,
>>
>> Any idea, since Konrad seems to be off for 2 weeks and we are at rc4 now.
>>

> (+Rafael again)

> So the immediate cause of those errors is that pdev->evtchn is 0.
> Backend is not notified and things not go well then.

> And it is indeed caused by 97badf873ab60e841243b66133ff9eff2a46ef29:

> We allocate pcifront_sd in pcifront_scan_root() and then pass it to
> pci_scan_bus_parented() as sysdata. Eventually this sysdata is used in
> pcibios_root_bridge_prepare() as pci_sysdata. It is dereferenced as
pci_sysdata->>companion (which I believe is aliased to pcifront_sd->pdev)
> and then set_primary_fwnode() writes it, thus corrupting
pcifront_sd->>pdev (and I think this is what sets evtchn to zero).

> I don't have a fix for that. I will see what we can do on Tuesday since
> I am out on Monday.
So the bisection wasn't a red herring after all :-).
Thanks for the fast analysis and moving this forward Boris, have a nice weekend !

--
Sander

> Question to Rafael about commit 97badf873ab60e84124: is it really safe
> to assume that bridge->bus->sysdata is a pointer to pci_sysdata in
> pcibios_root_bridge_prepare()? It is declared as 'void *'.

> -boris



2015-05-25 22:56:48

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [Xen-devel] Regression due to "device property: Make it possible to use secondary firmware nodes" Re: Xen-unstable + linux 4.1-mergewindow: problems with PV guest pci passthrough: pcifront pci-0: pciback not responding!!!

On Friday, May 22, 2015 09:53:37 PM Boris Ostrovsky wrote:
> On 05/22/2015 04:11 AM, Sander Eikelenboom wrote:
> > Hello Sander,
> >
> > Friday, May 15, 2015, 12:47:27 AM, you wrote:
> >
> >> Sorry for the resend, i messed up the to's en from's.
> >
> >> Hi Konrad / David,
> >
> >> One big snip on this thread, got some more debug info, hopefully this will
> >> lead to something:
> >
> >> On a working kernel (with the two seemingly non related patches reverted) i get:
> >
> >> [ 0.717796] pcifront pci-0: Allocated pdev @ 0xffff880019e11780 pdev->sh_info @ 0xffff880018f58000
> >> [ 0.717848] pcifront pci-0: ?!?!? before alloc gntref: 0
> >> [ 0.717871] pcifront pci-0: ?!?!? after alloc gntref: 8
> >> [ 0.717892] pcifront pci-0: ?!?!? before alloc evtchn: -1
> >> [ 0.717915] pcifront pci-0: ?!?!? after alloc evtchn: 17
> >> [ 0.717984] pcifront pci-0: ?!?!? bound evtchn:17 to irqhandler:-1 err:31
> >> [ 0.721640] pcifront pci-0: publishing successful!
> >> [ 0.723684] usbcore: registered new interface driver udlfb
> >> [ 0.724664] xen:xen_evtchn: Event-channel device installed
> >> [ 0.726597] pcifront pci-0: Installing PCI frontend
> >> [ 0.726853] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> >> [ 0.727059] pcifront pci-0: Creating PCI Frontend Bus 0000:00
> >> [ 0.727363] pcifront pci-0: PCI host bridge to bus 0000:00
> >> [ 0.727391] pci_bus 0000:00: root bus resource [io 0x0000-0xffff]
> >> [ 0.727417] pci_bus 0000:00: root bus resource [mem 0x00000000-0xffffffffffff]
> >> [ 0.727452] pci_bus 0000:00: root bus resource [bus 00-ff]
> >> [ 0.727475] pci_bus 0000:00: scanning bus
> >> [ 0.727503] pcifront pci-0: read dev=0000:00:00.0 - offset 0 size 4
> >> [ 0.728253] Linux agpgart interface v0.103
> >> [ 0.728387] Hangcheck: starting hangcheck timer 0.9.1 (tick is 180 seconds, margin is 60 seconds).
> >> [ 0.728474] [drm] Initialized drm 1.1.0 20060810
> >> [ 0.728551] [drm] radeon kernel modesetting enabled.
> >> [ 0.730319] pcifront pci-0: ?!?!? pciback responded !!! irq:31 irq_flags:ffff880019e100a8 ns: 1431641785551700000 ns_timeout: 1431641787541235000 evtchn:17 gnt_ref:8
> >> [ 0.730319] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:0 size:4
> >> [ 0.730319] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:0 size:4
> >> [ 0.730319] pcifront pci-0: read got back value 11113f6
> >> [ 0.738845] pcifront pci-0: read dev=0000:00:00.0 - offset e size 1
> >> [ 0.744976] brd: module loaded
> >> [ 0.745204] pcifront pci-0: ?!?!? pciback responded !!! irq:31 irq_flags:ffff880019e100a8 ns: 1431641785562852000 ns_timeout: 1431641787552580000 evtchn:17 gnt_ref:8
> >> [ 0.745204] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:14 size:1
> >> [ 0.745204] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:14 size:1
> >> [ 0.745204] pcifront pci-0: read got back value 0
> >> [ 0.749204] pcifront pci-0: read dev=0000:00:00.0 - offset 6 size 2
> >> [ 0.750155] loop: module loaded
> >> [ 0.752527] pcifront pci-0: ?!?!? pciback responded !!! irq:31 irq_flags:ffff880019e100a8 ns: 1431641785570841000 ns_timeout: 1431641787562917000 evtchn:17 gnt_ref:8
> >> [ 0.752527] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:6 size:2
> >> [ 0.752527] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:6 size:2
> >> [ 0.752527] pcifront pci-0: read got back value 210
> >> [ 0.757187] pcifront pci-0: read dev=0000:00:00.0 - offset 34 size 1
> >
> >
> >> Were as in the non-working situation i get:
> >
> >> [ 0.751244] pcifront pci-0: Allocated pdev @ 0xffff880019ec2e00 pdev->sh_info @ 0xffff88001aa51000
> >> [ 0.751295] pcifront pci-0: ?!?!? before alloc gntref: 0
> >> [ 0.751315] pcifront pci-0: ?!?!? after alloc gntref: 8
> >> [ 0.751334] pcifront pci-0: ?!?!? before alloc evtchn: -1
> >> [ 0.751355] pcifront pci-0: ?!?!? after alloc evtchn: 17
> >> [ 0.751422] pcifront pci-0: ?!?!? bound evtchn:17 to irqhandler:-1 err:31
> >> [ 0.755215] pcifront pci-0: publishing successful!
> >> [ 0.757341] usbcore: registered new interface driver udlfb
> >> [ 0.758365] xen:xen_evtchn: Event-channel device installed
> >> [ 0.760419] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> >> [ 0.760819] pcifront pci-0: Installing PCI frontend
> >> [ 0.761518] pcifront pci-0: Creating PCI Frontend Bus 0000:00
> >> [ 0.761684] pcifront pci-0: PCI host bridge to bus 0000:00
> >> [ 0.761710] pci_bus 0000:00: root bus resource [io 0x0000-0xffff]
> >> [ 0.761733] pci_bus 0000:00: root bus resource [mem 0x00000000-0xffffffffffff]
> >> [ 0.761763] pci_bus 0000:00: root bus resource [bus 00-ff]
> >> [ 0.761783] pci_bus 0000:00: scanning bus
> >> [ 0.761805] pcifront pci-0: read dev=0000:00:00.0 - offset 0 size 4
> >> [ 0.767207] Linux agpgart interface v0.103
> >> [ 0.767362] Hangcheck: starting hangcheck timer 0.9.1 (tick is 180 seconds, margin is 60 seconds).
> >> [ 0.767439] [drm] Initialized drm 1.1.0 20060810
> >> [ 0.767515] [drm] radeon kernel modesetting enabled.
> >> [ 0.766948] pcifront pci-0: pciback not responding!!! irq:31 irq_flags:ffff880019ec0028 ns: 1431641983026498000 ns_timeout: 1431641983026497000 evtchn:0 gnt_ref:0
> >> [ 0.766948] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:0 size:4
> >> [ 0.766948] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:0 size:4
> >> [ 0.766948] pcifront pci-0: other err read got back err: ffffffff value: 0
> >> [ 2.762062] pcifront pci-0: read dev=0000:00:01.0 - offset 0 size 4
> >> [ 2.765203] pcifront pci-0: pciback not responding!!! irq:31 irq_flags:ffff880019ec0028 ns: 1431641985026742000 ns_timeout: 1431641985026741000 evtchn:0 gnt_ref:0
> >> [ 2.765203] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:0 size:4
> >> [ 2.765203] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:0 size:4
> >> [ 2.765203] pcifront pci-0: other err read got back err: ffffffff value: 0
> >> [ 4.762172] pcifront pci-0: read dev=0000:00:02.0 - offset 0 size 4
> >> [ 4.764231] brd: module loaded
> >> [ 4.765508] loop: module loaded
> >> [ 4.766748] pcifront pci-0: pciback not responding!!! irq:31 irq_flags:ffff880019ec0028 ns: 1431641987026850000 ns_timeout: 1431641987026849000 evtchn:0 gnt_ref:0
> >> [ 4.766748] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:0 size:4
> >> [ 4.766748] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:0 size:4
> >> [ 4.766748] pcifront pci-0: other err read got back err: ffffffff value: 0
> >> [ 6.762248] pcifront pci-0: read dev=0000:00:03.0 - offset 0 size 4
> >> [ 6.765545] pcifront pci-0: pciback not responding!!! irq:31 irq_flags:ffff880019ec0028 ns: 1431641989026930000 ns_timeout: 1431641989026929000 evtchn:0 gnt_ref:0
> >> [ 6.765545] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:0 size:4
> >> [ 6.765545] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:0 size:4
> >> [ 6.765545] pcifront pci-0: other err read got back err: ffffffff value: 0
> >> [ 8.762329] pcifront pci-0: read dev=0000:00:04.0 - offset 0 size 4
> >> [ 8.765626] pcifront pci-0: pciback not responding!!! irq:31 irq_flags:ffff880019ec0028 ns: 1431641991027006000 ns_timeout: 1431641991027005000 evtchn:0 gnt_ref:0
> >> [ 8.765626] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:0 size:4
> >> [ 8.765626] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:0 size:4
> >> [ 8.765626] pcifront pci-0: other err read got back err: ffffffff value: 0
> >> [ 10.762410] pcifront pci-0: read dev=0000:00:05.0 - offset 0 size 4
> >> [ 10.765701] pcifront pci-0: pciback not responding!!! irq:31 irq_flags:ffff880019ec0028 ns: 1431641993027087000 ns_timeout: 1431641993027086000 evtchn:0 gnt_ref:0
> >> [ 10.765701] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:0 size:4
> >> [ 10.765701] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:0 size:4
> >> [ 10.765701] pcifront pci-0: other err read got back err: ffffffff value: 0
> >> [ 12.762472] pcifront pci-0: read dev=0000:00:06.0 - offset 0 size 4
> >
> >
> >> So somehow in the non-working situation, pdev->evtchn and pdev->gnt_ref are 0 in
> >> xen-pcifront.c:do_pci_op(), so no wonder it's not getting a response back ...
> >
> >> Question is .. why ?
> >
> >> --
> >> Sander
> >
> >
> > Ping ?
> >
> > David / Boris,
> >
> > Any idea, since Konrad seems to be off for 2 weeks and we are at rc4 now.
> >
>
> (+Rafael again)
>
> So the immediate cause of those errors is that pdev->evtchn is 0.
> Backend is not notified and things not go well then.
>
> And it is indeed caused by 97badf873ab60e841243b66133ff9eff2a46ef29:
>
> We allocate pcifront_sd in pcifront_scan_root() and then pass it to
> pci_scan_bus_parented() as sysdata. Eventually this sysdata is used in
> pcibios_root_bridge_prepare() as pci_sysdata. It is dereferenced as
> pci_sysdata->companion (which I believe is aliased to pcifront_sd->pdev)
> and then set_primary_fwnode() writes it, thus corrupting
> pcifront_sd->pdev (and I think this is what sets evtchn to zero).

Thanks for the analysis!

OK, so the pcibios_root_bridge_prepare() in arch/x86/pci/acpi.c assumes
that bridge->bus->sysdata points to a struct pci_sysdata which has a
'companion' of type struct acpi_device.

This is supposed to come from pci_acpi_scan_root() and not something else.

> I don't have a fix for that. I will see what we can do on Tuesday since
> I am out on Monday.
>
> Question to Rafael about commit 97badf873ab60e84124: is it really safe
> to assume that bridge->bus->sysdata is a pointer to pci_sysdata in
> pcibios_root_bridge_prepare()? It is declared as 'void *'.

That's because other architectures pass different things through it IIRC.

It should be a struct pci_sysdata pointer on x86 and ia64 at least. It is a bug
otherwise and things only worked by accident before. In particular, the ACPI
companion of bridge->dev was set to something random located at the end of
the struct pcifront_sd or behind it (on 64 bit). If referenced, that would
crash the kernel.

Padding struct pcifront_sd to match the layout of struct pci_sysdata should
make it work. Of course, a real fix would be to use a different
pcibios_root_bridge_prepare() for Xen.

Sander, can you please check if the patch below (untested) makes any difference?

---
drivers/pci/xen-pcifront.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux-pm/drivers/pci/xen-pcifront.c
===================================================================
--- linux-pm.orig/drivers/pci/xen-pcifront.c
+++ linux-pm/drivers/pci/xen-pcifront.c
@@ -53,6 +53,8 @@ struct pcifront_device {

struct pcifront_sd {
int domain;
+ int node;
+ void *padding[2];
struct pcifront_device *pdev;
};

@@ -465,7 +467,7 @@ static int pcifront_scan_root(struct pci
domain, bus);

bus_entry = kmalloc(sizeof(*bus_entry), GFP_KERNEL);
- sd = kmalloc(sizeof(*sd), GFP_KERNEL);
+ sd = kzalloc(sizeof(*sd), GFP_KERNEL);
if (!bus_entry || !sd) {
err = -ENOMEM;
goto err_out;

2015-05-25 23:16:49

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [Xen-devel] Regression due to "device property: Make it possible to use secondary firmware nodes" Re: Xen-unstable + linux 4.1-mergewindow: problems with PV guest pci passthrough: pcifront pci-0: pciback not responding!!!

On Tuesday, May 26, 2015 01:22:12 AM Rafael J. Wysocki wrote:
> On Friday, May 22, 2015 09:53:37 PM Boris Ostrovsky wrote:
> > On 05/22/2015 04:11 AM, Sander Eikelenboom wrote:
> > > Hello Sander,
> > >

[cut]

> > (+Rafael again)
> >
> > So the immediate cause of those errors is that pdev->evtchn is 0.
> > Backend is not notified and things not go well then.
> >
> > And it is indeed caused by 97badf873ab60e841243b66133ff9eff2a46ef29:
> >
> > We allocate pcifront_sd in pcifront_scan_root() and then pass it to
> > pci_scan_bus_parented() as sysdata. Eventually this sysdata is used in
> > pcibios_root_bridge_prepare() as pci_sysdata. It is dereferenced as
> > pci_sysdata->companion (which I believe is aliased to pcifront_sd->pdev)

Well, there is an int node field between them, so I'm not sure.

> > and then set_primary_fwnode() writes it, thus corrupting
> > pcifront_sd->pdev (and I think this is what sets evtchn to zero).

So the corruption happens when set_primary_fwnode() writes NULL to the
'secondary' field of object pointed to by 'fwnode'.

This isn't strictly necessary and we might avoid the crash by only
writing to fwnode->secondary if fn is not NULL.

So, Sander please test the patch below too if possible.

Of course, that doesn't solve a problem of passing an incorrect pointer
to ACPI_COMPANION_SET() in pcibios_root_bridge_prepare().

---
drivers/base/core.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -2167,7 +2167,9 @@ void set_primary_fwnode(struct device *d
if (fwnode_is_primary(fn))
fn = fn->secondary;

- fwnode->secondary = fn;
+ if (fn)
+ fwnode->secondary = fn;
+
dev->fwnode = fwnode;
} else {
dev->fwnode = fwnode_is_primary(dev->fwnode) ?

2015-05-26 00:42:50

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [Xen-devel] Regression due to "device property: Make it possible to use secondary firmware nodes" Re: Xen-unstable + linux 4.1-mergewindow: problems with PV guest pci passthrough: pcifront pci-0: pciback not responding!!!

On Tuesday, May 26, 2015 01:42:16 AM Rafael J. Wysocki wrote:
> On Tuesday, May 26, 2015 01:22:12 AM Rafael J. Wysocki wrote:
> > On Friday, May 22, 2015 09:53:37 PM Boris Ostrovsky wrote:
> > > On 05/22/2015 04:11 AM, Sander Eikelenboom wrote:
> > > > Hello Sander,
> > > >
>
> [cut]
>
> > > (+Rafael again)
> > >
> > > So the immediate cause of those errors is that pdev->evtchn is 0.
> > > Backend is not notified and things not go well then.
> > >
> > > And it is indeed caused by 97badf873ab60e841243b66133ff9eff2a46ef29:
> > >
> > > We allocate pcifront_sd in pcifront_scan_root() and then pass it to
> > > pci_scan_bus_parented() as sysdata. Eventually this sysdata is used in
> > > pcibios_root_bridge_prepare() as pci_sysdata. It is dereferenced as
> > > pci_sysdata->companion (which I believe is aliased to pcifront_sd->pdev)
>
> Well, there is an int node field between them, so I'm not sure.
>
> > > and then set_primary_fwnode() writes it, thus corrupting
> > > pcifront_sd->pdev (and I think this is what sets evtchn to zero).
>
> So the corruption happens when set_primary_fwnode() writes NULL to the
> 'secondary' field of object pointed to by 'fwnode'.
>
> This isn't strictly necessary and we might avoid the crash by only
> writing to fwnode->secondary if fn is not NULL.
>
> So, Sander please test the patch below too if possible.
>
> Of course, that doesn't solve a problem of passing an incorrect pointer
> to ACPI_COMPANION_SET() in pcibios_root_bridge_prepare().

And here's one more thing to test.

Please let me know (a) if you get the stack trace from the WARN_ON() in the
patch below and (b) whether or not things work again with this patch applied.


---
arch/x86/pci/acpi.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

Index: linux-pm/arch/x86/pci/acpi.c
===================================================================
--- linux-pm.orig/arch/x86/pci/acpi.c
+++ linux-pm/arch/x86/pci/acpi.c
@@ -483,8 +483,12 @@ struct pci_bus *pci_acpi_scan_root(struc
int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge)
{
struct pci_sysdata *sd = bridge->bus->sysdata;
+ struct acpi_device *companion = sd->companion;
+
+ /* Protect against passing pointers of an incorrect type via sysdata. */
+ if (!WARN_ON(companion && companion->fwnode.type != FWNODE_ACPI))
+ ACPI_COMPANION_SET(&bridge->dev, companion);

- ACPI_COMPANION_SET(&bridge->dev, sd->companion);
return 0;
}

2015-05-26 01:51:40

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH] PCI / ACPI: Do not set ACPI companions for host bridges with parents

On Tuesday, May 26, 2015 03:08:17 AM Rafael J. Wysocki wrote:
> On Tuesday, May 26, 2015 01:42:16 AM Rafael J. Wysocki wrote:
> > On Tuesday, May 26, 2015 01:22:12 AM Rafael J. Wysocki wrote:
> > > On Friday, May 22, 2015 09:53:37 PM Boris Ostrovsky wrote:
> > > > On 05/22/2015 04:11 AM, Sander Eikelenboom wrote:
> > > > > Hello Sander,
> > > > >
> >
> > [cut]
> >
> > > > (+Rafael again)
> > > >
> > > > So the immediate cause of those errors is that pdev->evtchn is 0.
> > > > Backend is not notified and things not go well then.
> > > >
> > > > And it is indeed caused by 97badf873ab60e841243b66133ff9eff2a46ef29:
> > > >
> > > > We allocate pcifront_sd in pcifront_scan_root() and then pass it to
> > > > pci_scan_bus_parented() as sysdata. Eventually this sysdata is used in
> > > > pcibios_root_bridge_prepare() as pci_sysdata. It is dereferenced as
> > > > pci_sysdata->companion (which I believe is aliased to pcifront_sd->pdev)
> >
> > Well, there is an int node field between them, so I'm not sure.
> >
> > > > and then set_primary_fwnode() writes it, thus corrupting
> > > > pcifront_sd->pdev (and I think this is what sets evtchn to zero).
> >
> > So the corruption happens when set_primary_fwnode() writes NULL to the
> > 'secondary' field of object pointed to by 'fwnode'.
> >
> > This isn't strictly necessary and we might avoid the crash by only
> > writing to fwnode->secondary if fn is not NULL.
> >
> > So, Sander please test the patch below too if possible.
> >
> > Of course, that doesn't solve a problem of passing an incorrect pointer
> > to ACPI_COMPANION_SET() in pcibios_root_bridge_prepare().
>
> And here's one more thing to test.

And the below is how I'd fix it, so you can simply test this patch and skip the
previous ones.

---
From: Rafael J. Wysocki <[email protected]>
Subject: PCI / ACPI: Do not set ACPI companions for host bridges with parents

Commit 97badf873ab6 (device property: Make it possible to use
secondary firmware nodes) uncovered a bug in the x86 (and ia64) PCI
host bridge initialization code that assumes bridge->bus->sysdata
to always point to a struct pci_sysdata object which need not be
the case (in particular, the Xen PCI frontend driver sets it to point
to a different data type). If it is not the case, an incorrect
pointer (or a piece of data that is not a pointer at all) will be
passed to ACPI_COMPANION_SET() and that may cause interesting
breakage to happen going forward.

To work around this problem use the observation that the ACPI
host bridge initialization always passes NULL as parent to
pci_create_root_bus(), so if pcibios_root_bridge_prepare() sees
a non-NULL parent of the bridge, it should not attempt to set
an ACPI companion for it, because that means that
pci_create_root_bus() has been called by someone else.

Reported-by: Sander Eikelenboom <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
---
arch/ia64/pci/pci.c | 13 ++++++++++---
arch/x86/pci/acpi.c | 13 ++++++++++---
2 files changed, 20 insertions(+), 6 deletions(-)

Index: linux-pm/arch/x86/pci/acpi.c
===================================================================
--- linux-pm.orig/arch/x86/pci/acpi.c
+++ linux-pm/arch/x86/pci/acpi.c
@@ -482,9 +482,16 @@ struct pci_bus *pci_acpi_scan_root(struc

int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge)
{
- struct pci_sysdata *sd = bridge->bus->sysdata;
-
- ACPI_COMPANION_SET(&bridge->dev, sd->companion);
+ /*
+ * We pass NULL as parent to pci_create_root_bus(), so if it is not NULL
+ * here, pci_create_root_bus() has been called by someone else and
+ * sysdata is likely to be different from what we expect. Let it go in
+ * that case.
+ */
+ if (!bridge->dev.parent) {
+ struct pci_sysdata *sd = bridge->bus->sysdata;
+ ACPI_COMPANION_SET(&bridge->dev, sd->companion);
+ }
return 0;
}

Index: linux-pm/arch/ia64/pci/pci.c
===================================================================
--- linux-pm.orig/arch/ia64/pci/pci.c
+++ linux-pm/arch/ia64/pci/pci.c
@@ -478,9 +478,16 @@ struct pci_bus *pci_acpi_scan_root(struc

int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge)
{
- struct pci_controller *controller = bridge->bus->sysdata;
-
- ACPI_COMPANION_SET(&bridge->dev, controller->companion);
+ /*
+ * We pass NULL as parent to pci_create_root_bus(), so if it is not NULL
+ * here, pci_create_root_bus() has been called by someone else and
+ * sysdata is likely to be different from what we expect. Let it go in
+ * that case.
+ */
+ if (!bridge->dev.parent) {
+ struct pci_controller *controller = bridge->bus->sysdata;
+ ACPI_COMPANION_SET(&bridge->dev, controller->companion);
+ }
return 0;
}

2015-05-26 07:55:10

by Sander Eikelenboom

[permalink] [raw]
Subject: Re: [PATCH] PCI / ACPI: Do not set ACPI companions for host bridges with parents


Tuesday, May 26, 2015, 4:17:05 AM, you wrote:

> On Tuesday, May 26, 2015 03:08:17 AM Rafael J. Wysocki wrote:
>> On Tuesday, May 26, 2015 01:42:16 AM Rafael J. Wysocki wrote:
>> > On Tuesday, May 26, 2015 01:22:12 AM Rafael J. Wysocki wrote:
>> > > On Friday, May 22, 2015 09:53:37 PM Boris Ostrovsky wrote:
>> > > > On 05/22/2015 04:11 AM, Sander Eikelenboom wrote:
>> > > > > Hello Sander,
>> > > > >
>> >
>> > [cut]
>> >
>> > > > (+Rafael again)
>> > > >
>> > > > So the immediate cause of those errors is that pdev->evtchn is 0.
>> > > > Backend is not notified and things not go well then.
>> > > >
>> > > > And it is indeed caused by 97badf873ab60e841243b66133ff9eff2a46ef29:
>> > > >
>> > > > We allocate pcifront_sd in pcifront_scan_root() and then pass it to
>> > > > pci_scan_bus_parented() as sysdata. Eventually this sysdata is used in
>> > > > pcibios_root_bridge_prepare() as pci_sysdata. It is dereferenced as
>> > > > pci_sysdata->companion (which I believe is aliased to pcifront_sd->pdev)
>> >
>> > Well, there is an int node field between them, so I'm not sure.
>> >
>> > > > and then set_primary_fwnode() writes it, thus corrupting
>> > > > pcifront_sd->pdev (and I think this is what sets evtchn to zero).
>> >
>> > So the corruption happens when set_primary_fwnode() writes NULL to the
>> > 'secondary' field of object pointed to by 'fwnode'.
>> >
>> > This isn't strictly necessary and we might avoid the crash by only
>> > writing to fwnode->secondary if fn is not NULL.
>> >
>> > So, Sander please test the patch below too if possible.
>> >
>> > Of course, that doesn't solve a problem of passing an incorrect pointer
>> > to ACPI_COMPANION_SET() in pcibios_root_bridge_prepare().
>>
>> And here's one more thing to test.

> And the below is how I'd fix it, so you can simply test this patch and skip the
> previous ones.

Hi Rafael,

Just tested it, works for me, thanks !

--
Sander


> ---
> From: Rafael J. Wysocki <[email protected]>
> Subject: PCI / ACPI: Do not set ACPI companions for host bridges with parents

> Commit 97badf873ab6 (device property: Make it possible to use
> secondary firmware nodes) uncovered a bug in the x86 (and ia64) PCI
> host bridge initialization code that assumes bridge->bus->sysdata
> to always point to a struct pci_sysdata object which need not be
> the case (in particular, the Xen PCI frontend driver sets it to point
> to a different data type). If it is not the case, an incorrect
> pointer (or a piece of data that is not a pointer at all) will be
> passed to ACPI_COMPANION_SET() and that may cause interesting
> breakage to happen going forward.

> To work around this problem use the observation that the ACPI
> host bridge initialization always passes NULL as parent to
> pci_create_root_bus(), so if pcibios_root_bridge_prepare() sees
> a non-NULL parent of the bridge, it should not attempt to set
> an ACPI companion for it, because that means that
> pci_create_root_bus() has been called by someone else.

> Reported-by: Sander Eikelenboom <[email protected]>
> Signed-off-by: Rafael J. Wysocki <[email protected]>
> ---
> arch/ia64/pci/pci.c | 13 ++++++++++---
> arch/x86/pci/acpi.c | 13 ++++++++++---
> 2 files changed, 20 insertions(+), 6 deletions(-)

> Index: linux-pm/arch/x86/pci/acpi.c
> ===================================================================
> --- linux-pm.orig/arch/x86/pci/acpi.c
> +++ linux-pm/arch/x86/pci/acpi.c
> @@ -482,9 +482,16 @@ struct pci_bus *pci_acpi_scan_root(struc
>
> int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge)
> {
> - struct pci_sysdata *sd = bridge->bus->sysdata;
> -
> - ACPI_COMPANION_SET(&bridge->dev, sd->companion);
> + /*
> + * We pass NULL as parent to pci_create_root_bus(), so if it is not NULL
> + * here, pci_create_root_bus() has been called by someone else and
> + * sysdata is likely to be different from what we expect. Let it go in
> + * that case.
> + */
> + if (!bridge->dev.parent) {
> + struct pci_sysdata *sd = bridge->bus->sysdata;
> + ACPI_COMPANION_SET(&bridge->dev, sd->companion);
> + }
> return 0;
> }
>
> Index: linux-pm/arch/ia64/pci/pci.c
> ===================================================================
> --- linux-pm.orig/arch/ia64/pci/pci.c
> +++ linux-pm/arch/ia64/pci/pci.c
> @@ -478,9 +478,16 @@ struct pci_bus *pci_acpi_scan_root(struc
>
> int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge)
> {
> - struct pci_controller *controller = bridge->bus->sysdata;
> -
> - ACPI_COMPANION_SET(&bridge->dev, controller->companion);
> + /*
> + * We pass NULL as parent to pci_create_root_bus(), so if it is not NULL
> + * here, pci_create_root_bus() has been called by someone else and
> + * sysdata is likely to be different from what we expect. Let it go in
> + * that case.
> + */
> + if (!bridge->dev.parent) {
> + struct pci_controller *controller = bridge->bus->sysdata;
> + ACPI_COMPANION_SET(&bridge->dev, controller->companion);
> + }
> return 0;
> }
>


2015-05-26 14:32:39

by Boris Ostrovsky

[permalink] [raw]
Subject: Re: [PATCH] PCI / ACPI: Do not set ACPI companions for host bridges with parents

On 05/25/2015 10:17 PM, Rafael J. Wysocki wrote:
> On Tuesday, May 26, 2015 03:08:17 AM Rafael J. Wysocki wrote:
>> On Tuesday, May 26, 2015 01:42:16 AM Rafael J. Wysocki wrote:
>>> On Tuesday, May 26, 2015 01:22:12 AM Rafael J. Wysocki wrote:
>>>> On Friday, May 22, 2015 09:53:37 PM Boris Ostrovsky wrote:
>>>>> On 05/22/2015 04:11 AM, Sander Eikelenboom wrote:
>>>>>> Hello Sander,
>>>>>>
>>>
>>> [cut]
>>>
>>>>> (+Rafael again)
>>>>>
>>>>> So the immediate cause of those errors is that pdev->evtchn is 0.
>>>>> Backend is not notified and things not go well then.
>>>>>
>>>>> And it is indeed caused by 97badf873ab60e841243b66133ff9eff2a46ef29:
>>>>>
>>>>> We allocate pcifront_sd in pcifront_scan_root() and then pass it to
>>>>> pci_scan_bus_parented() as sysdata. Eventually this sysdata is used in
>>>>> pcibios_root_bridge_prepare() as pci_sysdata. It is dereferenced as
>>>>> pci_sysdata->companion (which I believe is aliased to pcifront_sd->pdev)
>>>
>>> Well, there is an int node field between them, so I'm not sure.
>>>
>>>>> and then set_primary_fwnode() writes it, thus corrupting
>>>>> pcifront_sd->pdev (and I think this is what sets evtchn to zero).
>>>
>>> So the corruption happens when set_primary_fwnode() writes NULL to the
>>> 'secondary' field of object pointed to by 'fwnode'.
>>>
>>> This isn't strictly necessary and we might avoid the crash by only
>>> writing to fwnode->secondary if fn is not NULL.
>>>
>>> So, Sander please test the patch below too if possible.
>>>
>>> Of course, that doesn't solve a problem of passing an incorrect pointer
>>> to ACPI_COMPANION_SET() in pcibios_root_bridge_prepare().
>>
>> And here's one more thing to test.
>
> And the below is how I'd fix it, so you can simply test this patch and skip the
> previous ones.
>
> ---
> From: Rafael J. Wysocki <[email protected]>
> Subject: PCI / ACPI: Do not set ACPI companions for host bridges with parents
>
> Commit 97badf873ab6 (device property: Make it possible to use
> secondary firmware nodes) uncovered a bug in the x86 (and ia64) PCI
> host bridge initialization code that assumes bridge->bus->sysdata
> to always point to a struct pci_sysdata object which need not be
> the case (in particular, the Xen PCI frontend driver sets it to point
> to a different data type). If it is not the case, an incorrect
> pointer (or a piece of data that is not a pointer at all) will be
> passed to ACPI_COMPANION_SET() and that may cause interesting
> breakage to happen going forward.
>
> To work around this problem use the observation that the ACPI
> host bridge initialization always passes NULL as parent to
> pci_create_root_bus(), so if pcibios_root_bridge_prepare() sees
> a non-NULL parent of the bridge, it should not attempt to set
> an ACPI companion for it, because that means that
> pci_create_root_bus() has been called by someone else.


I am wondering whether at some point we might want to use companion
information in Xen as well.

Another option might be to require that whoever creates their own
sysdata structure to have pci_sysdata as its first element, e.g.

--- a/drivers/pci/xen-pcifront.c
+++ b/drivers/pci/xen-pcifront.c
@@ -52,7 +52,12 @@ struct pcifront_device {
};

struct pcifront_sd {
+ /*
+ * Must be the first element of this structure since pcifront_sd
+ * is assigned to bus' sysdata and may later be dereferenced as
+ * pci_sysdata.
+ */
+ struct pci_sysdata sd;
struct pcifront_device *pdev;
};


-boris


>
> Reported-by: Sander Eikelenboom <[email protected]>
> Signed-off-by: Rafael J. Wysocki <[email protected]>
> ---
> arch/ia64/pci/pci.c | 13 ++++++++++---
> arch/x86/pci/acpi.c | 13 ++++++++++---
> 2 files changed, 20 insertions(+), 6 deletions(-)
>
> Index: linux-pm/arch/x86/pci/acpi.c
> ===================================================================
> --- linux-pm.orig/arch/x86/pci/acpi.c
> +++ linux-pm/arch/x86/pci/acpi.c
> @@ -482,9 +482,16 @@ struct pci_bus *pci_acpi_scan_root(struc
>
> int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge)
> {
> - struct pci_sysdata *sd = bridge->bus->sysdata;
> -
> - ACPI_COMPANION_SET(&bridge->dev, sd->companion);
> + /*
> + * We pass NULL as parent to pci_create_root_bus(), so if it is not NULL
> + * here, pci_create_root_bus() has been called by someone else and
> + * sysdata is likely to be different from what we expect. Let it go in
> + * that case.
> + */
> + if (!bridge->dev.parent) {
> + struct pci_sysdata *sd = bridge->bus->sysdata;
> + ACPI_COMPANION_SET(&bridge->dev, sd->companion);
> + }
> return 0;
> }
>
> Index: linux-pm/arch/ia64/pci/pci.c
> ===================================================================
> --- linux-pm.orig/arch/ia64/pci/pci.c
> +++ linux-pm/arch/ia64/pci/pci.c
> @@ -478,9 +478,16 @@ struct pci_bus *pci_acpi_scan_root(struc
>
> int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge)
> {
> - struct pci_controller *controller = bridge->bus->sysdata;
> -
> - ACPI_COMPANION_SET(&bridge->dev, controller->companion);
> + /*
> + * We pass NULL as parent to pci_create_root_bus(), so if it is not NULL
> + * here, pci_create_root_bus() has been called by someone else and
> + * sysdata is likely to be different from what we expect. Let it go in
> + * that case.
> + */
> + if (!bridge->dev.parent) {
> + struct pci_controller *controller = bridge->bus->sysdata;
> + ACPI_COMPANION_SET(&bridge->dev, controller->companion);
> + }
> return 0;
> }
>
>

2015-05-26 23:41:50

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] PCI / ACPI: Do not set ACPI companions for host bridges with parents

On Tuesday, May 26, 2015 10:32:03 AM Boris Ostrovsky wrote:
> On 05/25/2015 10:17 PM, Rafael J. Wysocki wrote:
> > On Tuesday, May 26, 2015 03:08:17 AM Rafael J. Wysocki wrote:
> >> On Tuesday, May 26, 2015 01:42:16 AM Rafael J. Wysocki wrote:
> >>> On Tuesday, May 26, 2015 01:22:12 AM Rafael J. Wysocki wrote:
> >>>> On Friday, May 22, 2015 09:53:37 PM Boris Ostrovsky wrote:
> >>>>> On 05/22/2015 04:11 AM, Sander Eikelenboom wrote:
> >>>>>> Hello Sander,
> >>>>>>
> >>>
> >>> [cut]
> >>>
> >>>>> (+Rafael again)
> >>>>>
> >>>>> So the immediate cause of those errors is that pdev->evtchn is 0.
> >>>>> Backend is not notified and things not go well then.
> >>>>>
> >>>>> And it is indeed caused by 97badf873ab60e841243b66133ff9eff2a46ef29:
> >>>>>
> >>>>> We allocate pcifront_sd in pcifront_scan_root() and then pass it to
> >>>>> pci_scan_bus_parented() as sysdata. Eventually this sysdata is used in
> >>>>> pcibios_root_bridge_prepare() as pci_sysdata. It is dereferenced as
> >>>>> pci_sysdata->companion (which I believe is aliased to pcifront_sd->pdev)
> >>>
> >>> Well, there is an int node field between them, so I'm not sure.
> >>>
> >>>>> and then set_primary_fwnode() writes it, thus corrupting
> >>>>> pcifront_sd->pdev (and I think this is what sets evtchn to zero).
> >>>
> >>> So the corruption happens when set_primary_fwnode() writes NULL to the
> >>> 'secondary' field of object pointed to by 'fwnode'.
> >>>
> >>> This isn't strictly necessary and we might avoid the crash by only
> >>> writing to fwnode->secondary if fn is not NULL.
> >>>
> >>> So, Sander please test the patch below too if possible.
> >>>
> >>> Of course, that doesn't solve a problem of passing an incorrect pointer
> >>> to ACPI_COMPANION_SET() in pcibios_root_bridge_prepare().
> >>
> >> And here's one more thing to test.
> >
> > And the below is how I'd fix it, so you can simply test this patch and skip the
> > previous ones.
> >
> > ---
> > From: Rafael J. Wysocki <[email protected]>
> > Subject: PCI / ACPI: Do not set ACPI companions for host bridges with parents
> >
> > Commit 97badf873ab6 (device property: Make it possible to use
> > secondary firmware nodes) uncovered a bug in the x86 (and ia64) PCI
> > host bridge initialization code that assumes bridge->bus->sysdata
> > to always point to a struct pci_sysdata object which need not be
> > the case (in particular, the Xen PCI frontend driver sets it to point
> > to a different data type). If it is not the case, an incorrect
> > pointer (or a piece of data that is not a pointer at all) will be
> > passed to ACPI_COMPANION_SET() and that may cause interesting
> > breakage to happen going forward.
> >
> > To work around this problem use the observation that the ACPI
> > host bridge initialization always passes NULL as parent to
> > pci_create_root_bus(), so if pcibios_root_bridge_prepare() sees
> > a non-NULL parent of the bridge, it should not attempt to set
> > an ACPI companion for it, because that means that
> > pci_create_root_bus() has been called by someone else.
>
>
> I am wondering whether at some point we might want to use companion
> information in Xen as well.
>
> Another option might be to require that whoever creates their own
> sysdata structure to have pci_sysdata as its first element, e.g.

Well, I was thinking about that, but it would be x86-specific and I believe
that Xen is supposed to work on other architectures too (or at least will be
in the future).

> --- a/drivers/pci/xen-pcifront.c
> +++ b/drivers/pci/xen-pcifront.c
> @@ -52,7 +52,12 @@ struct pcifront_device {
> };
>
> struct pcifront_sd {
> + /*
> + * Must be the first element of this structure since pcifront_sd
> + * is assigned to bus' sysdata and may later be dereferenced as
> + * pci_sysdata.
> + */
> + struct pci_sysdata sd;
> struct pcifront_device *pdev;
> };

Also if you want to use an ACPI companion, it will be a good question *which* one.

My half-way educated guess is that will be the ACPI companion of the parent,
in which case it's better to modify pcibios_root_bridge_prepare(). In any
case, we need to talk about that beforehand, so please let me know when you
have more specific plans.

As for 4.1 (which currently is broken for Sander), I think the $subject patch
is the cleanest and least intrusive thing we can do.

Thanks,
Rafael

2015-05-27 02:16:17

by Boris Ostrovsky

[permalink] [raw]
Subject: Re: [PATCH] PCI / ACPI: Do not set ACPI companions for host bridges with parents



On 05/26/2015 08:07 PM, Rafael J. Wysocki wrote:
> On Tuesday, May 26, 2015 10:32:03 AM Boris Ostrovsky wrote:
>> On 05/25/2015 10:17 PM, Rafael J. Wysocki wrote:
>>> On Tuesday, May 26, 2015 03:08:17 AM Rafael J. Wysocki wrote:
>>>> On Tuesday, May 26, 2015 01:42:16 AM Rafael J. Wysocki wrote:
>>>>> On Tuesday, May 26, 2015 01:22:12 AM Rafael J. Wysocki wrote:
>>>>>> On Friday, May 22, 2015 09:53:37 PM Boris Ostrovsky wrote:
>>>>>>> On 05/22/2015 04:11 AM, Sander Eikelenboom wrote:
>>>>>>>> Hello Sander,
>>>>>>>>
>>>>> [cut]
>>>>>
>>>>>>> (+Rafael again)
>>>>>>>
>>>>>>> So the immediate cause of those errors is that pdev->evtchn is 0.
>>>>>>> Backend is not notified and things not go well then.
>>>>>>>
>>>>>>> And it is indeed caused by 97badf873ab60e841243b66133ff9eff2a46ef29:
>>>>>>>
>>>>>>> We allocate pcifront_sd in pcifront_scan_root() and then pass it to
>>>>>>> pci_scan_bus_parented() as sysdata. Eventually this sysdata is used in
>>>>>>> pcibios_root_bridge_prepare() as pci_sysdata. It is dereferenced as
>>>>>>> pci_sysdata->companion (which I believe is aliased to pcifront_sd->pdev)
>>>>> Well, there is an int node field between them, so I'm not sure.
>>>>>
>>>>>>> and then set_primary_fwnode() writes it, thus corrupting
>>>>>>> pcifront_sd->pdev (and I think this is what sets evtchn to zero).
>>>>> So the corruption happens when set_primary_fwnode() writes NULL to the
>>>>> 'secondary' field of object pointed to by 'fwnode'.
>>>>>
>>>>> This isn't strictly necessary and we might avoid the crash by only
>>>>> writing to fwnode->secondary if fn is not NULL.
>>>>>
>>>>> So, Sander please test the patch below too if possible.
>>>>>
>>>>> Of course, that doesn't solve a problem of passing an incorrect pointer
>>>>> to ACPI_COMPANION_SET() in pcibios_root_bridge_prepare().
>>>> And here's one more thing to test.
>>> And the below is how I'd fix it, so you can simply test this patch and skip the
>>> previous ones.
>>>
>>> ---
>>> From: Rafael J. Wysocki <[email protected]>
>>> Subject: PCI / ACPI: Do not set ACPI companions for host bridges with parents
>>>
>>> Commit 97badf873ab6 (device property: Make it possible to use
>>> secondary firmware nodes) uncovered a bug in the x86 (and ia64) PCI
>>> host bridge initialization code that assumes bridge->bus->sysdata
>>> to always point to a struct pci_sysdata object which need not be
>>> the case (in particular, the Xen PCI frontend driver sets it to point
>>> to a different data type). If it is not the case, an incorrect
>>> pointer (or a piece of data that is not a pointer at all) will be
>>> passed to ACPI_COMPANION_SET() and that may cause interesting
>>> breakage to happen going forward.
>>>
>>> To work around this problem use the observation that the ACPI
>>> host bridge initialization always passes NULL as parent to
>>> pci_create_root_bus(), so if pcibios_root_bridge_prepare() sees
>>> a non-NULL parent of the bridge, it should not attempt to set
>>> an ACPI companion for it, because that means that
>>> pci_create_root_bus() has been called by someone else.
>>
>> I am wondering whether at some point we might want to use companion
>> information in Xen as well.
>>
>> Another option might be to require that whoever creates their own
>> sysdata structure to have pci_sysdata as its first element, e.g.
> Well, I was thinking about that, but it would be x86-specific and I believe
> that Xen is supposed to work on other architectures too (or at least will be
> in the future).
>
>> --- a/drivers/pci/xen-pcifront.c
>> +++ b/drivers/pci/xen-pcifront.c
>> @@ -52,7 +52,12 @@ struct pcifront_device {
>> };
>>
>> struct pcifront_sd {
>> + /*
>> + * Must be the first element of this structure since pcifront_sd
>> + * is assigned to bus' sysdata and may later be dereferenced as
>> + * pci_sysdata.
>> + */
>> + struct pci_sysdata sd;
>> struct pcifront_device *pdev;
>> };
> Also if you want to use an ACPI companion, it will be a good question *which* one.
>
> My half-way educated guess is that will be the ACPI companion of the parent,
> in which case it's better to modify pcibios_root_bridge_prepare(). In any
> case, we need to talk about that beforehand, so please let me know when you
> have more specific plans.

I don't have anything specific. And after looking at this code some more
I am not sure pcifront will ever want to use companions since ACPI is
not part of device initialization here (it's done via xenbus, which is a
purely software construct).


>
> As for 4.1 (which currently is broken for Sander), I think the $subject patch
> is the cleanest and least intrusive thing we can do.

OK. Thank you for fixing this.

-boris