2018-09-12 16:12:07

by Jan Kundrát

[permalink] [raw]
Subject: [BISECTED] Regression: Solidrun Clearfog Base won't boot since "PCI: mvebu: Only remap I/O space if configured"

Hi,
since commit ee1604381a371b3ea6aec7d5e43b6e3f5e153854 ("PCI: mvebu: Only
remap I/O space if configured"), my board (Solidrun Clearfog Base) won't
finish booting with 4.18-rc3 won't boot:

> [ 1.741458] Internal error: Oops - undefined instruction: 0 [#1] SMP ARM
> [ 1.748182] CPU: 1 PID: 72 Comm: kworker/1:2 Tainted: G W 4.19.0-rc3 #1
> [ 1.756120] Hardware name: Marvell Armada 380/385 (Device Tree)
> [ 1.762063] Workqueue: events deferred_probe_work_func
> [ 1.767222] PC is at ioremap_page_range+0x114/0x1a4
> [ 1.772117] LR is at mvebu_pcie_probe+0x134/0xa60
> [ 1.776832] pc : [<c06e643c>] lr : [<c03e9d5c>] psr: a0000013
> [ 1.783115] sp : ee30dd88 ip : 000f0000 fp : c0837b38
> [ 1.788352] r10: 00000243 r9 : e9200000 r8 : 00000103
> [ 1.793590] r7 : ef6f8000 r6 : fee10000 r5 : fee00000 r4 : ef6f8004
> [ 1.800133] r3 : ef6f8000 r2 : e8000243 r1 : fee0ffff r0 : c0004000
> [ 1.806678] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
> [ 1.813832] Control: 10c5387d Table: 0000404a DAC: 00000051
> [ 1.819592] Process kworker/1:2 (pid: 72, stack limit = 0x(ptrval))
> [ 1.825874] Stack: (0xee30dd88 to 0xee30e000)
> [ 1.830244] dd80: fee10000 e9200000 fee0ffff ffe00000 c0007fb8 ee33e224
> [ 1.838445] dda0: 00000000 ee33e010 00000000 ef575610 ee33e224 00000000 c0a30ac4 00000002
> [ 1.846646] ddc0: ef9f827c c03e9d5c c0ab6d30 ef9ed434 ee342898 ee342898 ee33e1d0 ee33e1dc
> [ 1.854848] dde0: ee342898 00000000 c0a03bc8 c07fcabc 00000001 ef575610 ef577000 c02a72b4
> [ 1.863050] de00: 00000000 ee342898 ef6744d0 ef6744d0 ee342898 ef577000 c07fcabc 5d6fd1e4
> [ 1.871252] de20: c0a30ac4 ef575610 00000000 c0a30ac4 00000000 00000000 c0a30ac4 00000002
> [ 1.879454] de40: ef9e40c0 c043dba4 c0ab3144 ef575610 c0ab3150 c043c4e4 ef575610 c0a30ac4
> [ 1.887656] de60: c043cb2c c0a03bc8 00000001 c0a5cf50 00000000 c043c958 ef575610 c0a30ac4
> [ 1.895858] de80: ef575610 00000000 ee30ded4 c043cb2c c0a03bc8 00000001 c0a5cf50 00000000
> [ 1.904059] dea0: ef9e40c0 c043af7c c0a5cf50 ef484d6c ef66d8b8 5d6fd1e4 ef575610 ef575610
> [ 1.912261] dec0: c0a03bc8 ef575644 c0a371c4 c043c7c0 00000003 ef575610 00000001 5d6fd1e4
> [ 1.920462] dee0: ef575610 ef575610 c0a373f0 c0a371c4 00000000 c043b160 ef575610 c0a371a8
> [ 1.928663] df00: c0a371a8 c043bdf8 c0a371e0 ee2f9880 ef9e40c0 ef9e5300 00000000 c01388b4
> [ 1.936864] df20: ef9e40c0 ef9e40c0 ee30c000 ee2f9880 ef9e40c0 ee2f9894 c0a02d00 ef9e40d8
> [ 1.945064] df40: ffffe000 00000008 ef9e40c0 c0138e4c ffffe000 c0a5ca32 c07d2f68 00000000
> [ 1.953265] df60: ffffe000 eee5e0c0 ee2fa4c0 00000000 ee30c000 ee2f9880 c0138ba4 ef4d5ea4
> [ 1.961466] df80: eee5e0dc c013e448 00000001 ee2fa4c0 c013e2fc 00000000 00000000 00000000
> [ 1.969668] dfa0: 00000000 00000000 00000000 c01010e8 00000000 00000000 00000000 00000000
> [ 1.977869] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [ 1.986069] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
> [ 1.994277] [<c06e643c>] (ioremap_page_range) from [<c03e9d5c>] (mvebu_pcie_probe+0x134/0xa60)
> [ 2.002917] [<c03e9d5c>] (mvebu_pcie_probe) from [<c043dba4>] (platform_drv_probe+0x48/0x98)
> [ 2.011382] [<c043dba4>] (platform_drv_probe) from [<c043c4e4>] (really_probe+0x1d0/0x2bc)
> [ 2.019671] [<c043c4e4>] (really_probe) from [<c043c958>] (driver_probe_device+0x60/0x160)
> [ 2.027961] [<c043c958>] (driver_probe_device) from [<c043af7c>] (bus_for_each_drv+0x58/0xb8)
> [ 2.036511] [<c043af7c>] (bus_for_each_drv) from [<c043c7c0>] (__device_attach+0xd0/0x138)
> [ 2.044800] [<c043c7c0>] (__device_attach) from [<c043b160>] (bus_probe_device+0x84/0x8c)
> [ 2.053002] [<c043b160>] (bus_probe_device) from [<c043bdf8>] (deferred_probe_work_func+0x60/0x8c)
> [ 2.061991] [<c043bdf8>] (deferred_probe_work_func) from [<c01388b4>] (process_one_work+0x218/0x508)
> [ 2.071152] [<c01388b4>] (process_one_work) from [<c0138e4c>] (worker_thread+0x2a8/0x5c0)
> [ 2.079355] [<c0138e4c>] (worker_thread) from [<c013e448>] (kthread+0x14c/0x154)
> [ 2.086774] [<c013e448>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
> [ 2.094015] Exception stack(0xee30dfb0 to 0xee30dff8)
> [ 2.099079] dfa0: 00000000 00000000 00000000 00000000
> [ 2.107280] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [ 2.115481] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
> [ 2.122115] Code: e2844004 e5972000 e3520000 0affffee (e7f001f2)
> [ 2.128225] ---[ end trace 7a77412d00a47123 ]---
> [ 2.132853] Kernel panic - not syncing: Fatal exception
> [ 2.138094] CPU0: stopping
> [ 2.140811] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G D W 4.19.0-rc3 #1
> [ 2.148489] Hardware name: Marvell Armada 380/385 (Device Tree)
> [ 2.154430] [<c01109fc>] (unwind_backtrace) from [<c010c894>] (show_stack+0x10/0x14)
> [ 2.162198] [<c010c894>] (show_stack) from [<c06e29c4>] (dump_stack+0x88/0x9c)
> [ 2.169443] [<c06e29c4>] (dump_stack) from [<c010f71c>] (handle_IPI+0x38c/0x3ac)
> [ 2.176863] [<c010f71c>] (handle_IPI) from [<c03b4af4>] (gic_handle_irq+0x8c/0x90)
> [ 2.184457] [<c03b4af4>] (gic_handle_irq) from [<c0101a0c>] (__irq_svc+0x6c/0x90)
> [ 2.191960] Exception stack(0xc0a01f18 to 0xc0a01f60)
> [ 2.197024] 1f00: 00000000 00001db4
> [ 2.205226] 1f20: ef9d1398 c0119260 ffffe000 c0a03bf0 c0a03c30 00000001 00000000 c0a03bc8
> [ 2.213426] 1f40: c0968980 00000000 00000000 c0a01f68 c0108e68 c0108e6c 60000013 ffffffff
> [ 2.221631] [<c0101a0c>] (__irq_svc) from [<c0108e6c>] (arch_cpu_idle+0x38/0x3c)
> [ 2.229051] [<c0108e6c>] (arch_cpu_idle) from [<c014cf4c>] (do_idle+0x1e0/0x210)
> [ 2.236469] [<c014cf4c>] (do_idle) from [<c014d218>] (cpu_startup_entry+0x18/0x20)
> [ 2.244065] [<c014d218>] (cpu_startup_entry) from [<c0900f48>] (start_kernel+0x42c/0x454)
> [ 2.252268] [<c0900f48>] (start_kernel) from [<00000000>] ( (null))
> [ 2.258642] Rebooting in 10 seconds..

I cannot easily revert that commit now due to some followup changes. I'll
be happy to test patches which attempt to fix this.

The DTS file in use on this board is armada-388-clearfog-base.dts in case
it matters (we have some DT add-on patches on top of that, but nothing PCI-
or MBUS-related).

With kind regards,
Jan


2018-09-12 18:52:33

by Thomas Petazzoni

[permalink] [raw]
Subject: Re: [BISECTED] Regression: Solidrun Clearfog Base won't boot since "PCI: mvebu: Only remap I/O space if configured"

Jan, Baruch,

On Wed, 12 Sep 2018 21:49:41 +0300, Baruch Siach wrote:

> Jan Kundrát writes:
> > since commit ee1604381a371b3ea6aec7d5e43b6e3f5e153854 ("PCI: mvebu: Only
> > remap I/O space if configured"), my board (Solidrun Clearfog Base) won't
> > finish booting with 4.18-rc3 won't boot:
>
> You mean '4.19-rc3', right?
>
> >> [ 1.741458] Internal error: Oops - undefined instruction: 0 [#1] SMP ARM
> >> [ 1.748182] CPU: 1 PID: 72 Comm: kworker/1:2 Tainted: G W 4.19.0-rc3 #1
>
> The 'W' taint means that there was a kernel warning before. Which
> warning was that?
>
> I reproduced the same Oops on Clearfog Base without any taint:

Thanks for the report, I'll have a look tomorrow when I have to
ClearFog hardware.

Thanks,

Thomas
--
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

2018-09-12 18:52:50

by Baruch Siach

[permalink] [raw]
Subject: Re: [BISECTED] Regression: Solidrun Clearfog Base won't boot since "PCI: mvebu: Only remap I/O space if configured"

Hi Jan,

Jan Kundrát writes:
> since commit ee1604381a371b3ea6aec7d5e43b6e3f5e153854 ("PCI: mvebu: Only
> remap I/O space if configured"), my board (Solidrun Clearfog Base) won't
> finish booting with 4.18-rc3 won't boot:

You mean '4.19-rc3', right?

>> [ 1.741458] Internal error: Oops - undefined instruction: 0 [#1] SMP ARM
>> [ 1.748182] CPU: 1 PID: 72 Comm: kworker/1:2 Tainted: G W 4.19.0-rc3 #1

The 'W' taint means that there was a kernel warning before. Which
warning was that?

I reproduced the same Oops on Clearfog Base without any taint:

[ 1.476401] Internal error: Oops - undefined instruction: 0 [#1] SMP ARM
[ 1.483124] CPU: 1 PID: 1241 Comm: kworker/1:2 Not tainted 4.19.0-rc3 #145
[ 1.490013] Hardware name: Marvell Armada 380/385 (Device Tree)
[ 1.495953] Workqueue: events deferred_probe_work_func
[ 1.501108] PC is at ioremap_page_range+0x114/0x1a4
[ 1.505999] LR is at mvebu_pcie_probe+0x138/0x7d8
[ 1.510711] pc : [<c083552c>] lr : [<c043980c>] psr: a0000013
[ 1.516990] sp : ef257d80 ip : 000f0000 fp : c0a6d0bc
[ 1.522225] r10: 00000243 r9 : e9200000 r8 : 00000103
[ 1.527460] r7 : ee87f000 r6 : fee10000 r5 : fee00000 r4 : ee87f004
[ 1.534001] r3 : ee87f000 r2 : e8000243 r1 : fee0ffff r0 : c0004000
[ 1.540543] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[ 1.547693] Control: 10c5387d Table: 0000404a DAC: 00000051
[ 1.553450] Process kworker/1:2 (pid: 1241, stack limit = 0x(ptrval))
[ 1.559903] Stack: (0xef257d80 to 0xef258000)
[ 1.564269] 7d80: fee10000 e9200000 fee0ffff ffe00000 c0007fb8 00000000 00000000 ee743010
[ 1.572466] 7da0: ef200410 ee743264 00000000 00000000 c0e39474 00000001 00000000 c043980c
[ 1.580662] 7dc0: ef257e00 c0e03c08 ef7f7574 c0a26120 ef7f7574 ee743210 ee74321c f2a8b0d2
[ 1.588858] 7de0: c0e03c08 c0e03c08 c0e39474 ef200410 00000000 c063ce84 00000000 c0296ad4
[ 1.597054] 7e00: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 f2a8b0d2
[ 1.605250] 7e20: 00000000 ef200410 00000000 c0e39474 00000000 00000000 c0e39474 00000001
[ 1.613446] 7e40: ef7e52c0 c048c4c0 c0e9461c ef200410 c0e94628 c048a8fc ef200410 c0e39474
[ 1.621642] 7e60: c048af44 c0e03c08 00000001 c0e73c30 00000000 c048ad70 ef200410 c0e39474
[ 1.629838] 7e80: ef200410 00000000 ef257ed4 c048af44 c0e03c08 00000001 c0e73c30 00000000
[ 1.638034] 7ea0: ef7e52c0 c0489354 c0e73c30 ef088d6c ef3116b8 f2a8b0d2 ef200410 ef200410
[ 1.646230] 7ec0: c0e03c08 ef200444 c0e3fae8 c048abd8 c0e03c08 ef200410 00000001 f2a8b0d2
[ 1.654426] 7ee0: ef200410 ef200410 c0e3fd30 c0e3fae8 00000000 c0489538 ef200410 c0e3facc
[ 1.662622] 7f00: c0e3facc c048a1d0 c0e3fb04 ee4b7f00 ef7e52c0 ef7e6400 00000000 c0139974
[ 1.670818] 7f20: ef7e52c0 ef7e52c0 c0e02d00 ee4b7f00 ef7e52c0 ee4b7f14 c0e02d00 ef7e52d8
[ 1.679014] 7f40: ffffe000 00000008 ef7e52c0 c0139f0c 00000000 c0e73696 c09efbe0 ee952d40
[ 1.687209] 7f60: 00000000 ee75f680 ef256000 ee952d40 00000000 ee4b7f00 c0139c64 ef0d9ea4
[ 1.695405] 7f80: ee75f69c c013f75c 0000004e ee952d40 c013f62c 00000000 00000000 00000000
[ 1.703601] 7fa0: 00000000 00000000 00000000 c01010e8 00000000 00000000 00000000 00000000
[ 1.711796] 7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 1.719992] 7fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[ 1.728195] [<c083552c>] (ioremap_page_range) from [<c043980c>] (mvebu_pcie_probe+0x138/0x7d8)
[ 1.736830] [<c043980c>] (mvebu_pcie_probe) from [<c048c4c0>] (platform_drv_probe+0x48/0x98)
[ 1.745290] [<c048c4c0>] (platform_drv_probe) from [<c048a8fc>] (really_probe+0x1d0/0x2bc)
[ 1.753575] [<c048a8fc>] (really_probe) from [<c048ad70>] (driver_probe_device+0x60/0x160)
[ 1.761860] [<c048ad70>] (driver_probe_device) from [<c0489354>] (bus_for_each_drv+0x58/0xb8)
[ 1.770404] [<c0489354>] (bus_for_each_drv) from [<c048abd8>] (__device_attach+0xd0/0x138)
[ 1.778688] [<c048abd8>] (__device_attach) from [<c0489538>] (bus_probe_device+0x84/0x8c)
[ 1.786884] [<c0489538>] (bus_probe_device) from [<c048a1d0>] (deferred_probe_work_func+0x60/0x8c)
[ 1.795868] [<c048a1d0>] (deferred_probe_work_func) from [<c0139974>] (process_one_work+0x218/0x508)
[ 1.805023] [<c0139974>] (process_one_work) from [<c0139f0c>] (worker_thread+0x2a8/0x5c0)
[ 1.813221] [<c0139f0c>] (worker_thread) from [<c013f75c>] (kthread+0x130/0x138)
[ 1.820635] [<c013f75c>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
[ 1.827872] Exception stack(0xef257fb0 to 0xef257ff8)
[ 1.832933] 7fa0: 00000000 00000000 00000000 00000000
[ 1.841129] 7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 1.849324] 7fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 1.855954] Code: e2844004 e5972000 e3520000 0affffee (e7f001f2)
[ 1.862059] ---[ end trace 4067cdcfd86589a5 ]---

>> [ 1.756120] Hardware name: Marvell Armada 380/385 (Device Tree)
>> [ 1.762063] Workqueue: events deferred_probe_work_func
>> [ 1.767222] PC is at ioremap_page_range+0x114/0x1a4
>> [ 1.772117] LR is at mvebu_pcie_probe+0x134/0xa60
>> [ 1.776832] pc : [<c06e643c>] lr : [<c03e9d5c>] psr: a0000013
>> [ 1.783115] sp : ee30dd88 ip : 000f0000 fp : c0837b38
>> [ 1.788352] r10: 00000243 r9 : e9200000 r8 : 00000103
>> [ 1.793590] r7 : ef6f8000 r6 : fee10000 r5 : fee00000 r4 : ef6f8004
>> [ 1.800133] r3 : ef6f8000 r2 : e8000243 r1 : fee0ffff r0 : c0004000
>> [ 1.806678] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
>> [ 1.813832] Control: 10c5387d Table: 0000404a DAC: 00000051
>> [ 1.819592] Process kworker/1:2 (pid: 72, stack limit = 0x(ptrval))
>> [ 1.825874] Stack: (0xee30dd88 to 0xee30e000)
>> [ 1.830244] dd80: fee10000 e9200000 fee0ffff ffe00000 c0007fb8 ee33e224
>> [ 1.838445] dda0: 00000000 ee33e010 00000000 ef575610 ee33e224 00000000 c0a30ac4 00000002
>> [ 1.846646] ddc0: ef9f827c c03e9d5c c0ab6d30 ef9ed434 ee342898 ee342898 ee33e1d0 ee33e1dc
>> [ 1.854848] dde0: ee342898 00000000 c0a03bc8 c07fcabc 00000001 ef575610 ef577000 c02a72b4
>> [ 1.863050] de00: 00000000 ee342898 ef6744d0 ef6744d0 ee342898 ef577000 c07fcabc 5d6fd1e4
>> [ 1.871252] de20: c0a30ac4 ef575610 00000000 c0a30ac4 00000000 00000000 c0a30ac4 00000002
>> [ 1.879454] de40: ef9e40c0 c043dba4 c0ab3144 ef575610 c0ab3150 c043c4e4 ef575610 c0a30ac4
>> [ 1.887656] de60: c043cb2c c0a03bc8 00000001 c0a5cf50 00000000 c043c958 ef575610 c0a30ac4
>> [ 1.895858] de80: ef575610 00000000 ee30ded4 c043cb2c c0a03bc8 00000001 c0a5cf50 00000000
>> [ 1.904059] dea0: ef9e40c0 c043af7c c0a5cf50 ef484d6c ef66d8b8 5d6fd1e4 ef575610 ef575610
>> [ 1.912261] dec0: c0a03bc8 ef575644 c0a371c4 c043c7c0 00000003 ef575610 00000001 5d6fd1e4
>> [ 1.920462] dee0: ef575610 ef575610 c0a373f0 c0a371c4 00000000 c043b160 ef575610 c0a371a8
>> [ 1.928663] df00: c0a371a8 c043bdf8 c0a371e0 ee2f9880 ef9e40c0 ef9e5300 00000000 c01388b4
>> [ 1.936864] df20: ef9e40c0 ef9e40c0 ee30c000 ee2f9880 ef9e40c0 ee2f9894 c0a02d00 ef9e40d8
>> [ 1.945064] df40: ffffe000 00000008 ef9e40c0 c0138e4c ffffe000 c0a5ca32 c07d2f68 00000000
>> [ 1.953265] df60: ffffe000 eee5e0c0 ee2fa4c0 00000000 ee30c000 ee2f9880 c0138ba4 ef4d5ea4
>> [ 1.961466] df80: eee5e0dc c013e448 00000001 ee2fa4c0 c013e2fc 00000000 00000000 00000000
>> [ 1.969668] dfa0: 00000000 00000000 00000000 c01010e8 00000000 00000000 00000000 00000000
>> [ 1.977869] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>> [ 1.986069] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
>> [ 1.994277] [<c06e643c>] (ioremap_page_range) from [<c03e9d5c>] (mvebu_pcie_probe+0x134/0xa60)
>> [ 2.002917] [<c03e9d5c>] (mvebu_pcie_probe) from [<c043dba4>] (platform_drv_probe+0x48/0x98)
>> [ 2.011382] [<c043dba4>] (platform_drv_probe) from [<c043c4e4>] (really_probe+0x1d0/0x2bc)
>> [ 2.019671] [<c043c4e4>] (really_probe) from [<c043c958>] (driver_probe_device+0x60/0x160)
>> [ 2.027961] [<c043c958>] (driver_probe_device) from [<c043af7c>] (bus_for_each_drv+0x58/0xb8)
>> [ 2.036511] [<c043af7c>] (bus_for_each_drv) from [<c043c7c0>] (__device_attach+0xd0/0x138)
>> [ 2.044800] [<c043c7c0>] (__device_attach) from [<c043b160>] (bus_probe_device+0x84/0x8c)
>> [ 2.053002] [<c043b160>] (bus_probe_device) from [<c043bdf8>] (deferred_probe_work_func+0x60/0x8c)
>> [ 2.061991] [<c043bdf8>] (deferred_probe_work_func) from [<c01388b4>] (process_one_work+0x218/0x508)
>> [ 2.071152] [<c01388b4>] (process_one_work) from [<c0138e4c>] (worker_thread+0x2a8/0x5c0)
>> [ 2.079355] [<c0138e4c>] (worker_thread) from [<c013e448>] (kthread+0x14c/0x154)
>> [ 2.086774] [<c013e448>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
>> [ 2.094015] Exception stack(0xee30dfb0 to 0xee30dff8)
>> [ 2.099079] dfa0: 00000000 00000000 00000000 00000000
>> [ 2.107280] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>> [ 2.115481] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
>> [ 2.122115] Code: e2844004 e5972000 e3520000 0affffee (e7f001f2)
>> [ 2.128225] ---[ end trace 7a77412d00a47123 ]---
>> [ 2.132853] Kernel panic - not syncing: Fatal exception
>> [ 2.138094] CPU0: stopping
>> [ 2.140811] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G D W 4.19.0-rc3 #1
>> [ 2.148489] Hardware name: Marvell Armada 380/385 (Device Tree)
>> [ 2.154430] [<c01109fc>] (unwind_backtrace) from [<c010c894>] (show_stack+0x10/0x14)
>> [ 2.162198] [<c010c894>] (show_stack) from [<c06e29c4>] (dump_stack+0x88/0x9c)
>> [ 2.169443] [<c06e29c4>] (dump_stack) from [<c010f71c>] (handle_IPI+0x38c/0x3ac)
>> [ 2.176863] [<c010f71c>] (handle_IPI) from [<c03b4af4>] (gic_handle_irq+0x8c/0x90)
>> [ 2.184457] [<c03b4af4>] (gic_handle_irq) from [<c0101a0c>] (__irq_svc+0x6c/0x90)
>> [ 2.191960] Exception stack(0xc0a01f18 to 0xc0a01f60)
>> [ 2.197024] 1f00: 00000000 00001db4
>> [ 2.205226] 1f20: ef9d1398 c0119260 ffffe000 c0a03bf0 c0a03c30 00000001 00000000 c0a03bc8
>> [ 2.213426] 1f40: c0968980 00000000 00000000 c0a01f68 c0108e68 c0108e6c 60000013 ffffffff
>> [ 2.221631] [<c0101a0c>] (__irq_svc) from [<c0108e6c>] (arch_cpu_idle+0x38/0x3c)
>> [ 2.229051] [<c0108e6c>] (arch_cpu_idle) from [<c014cf4c>] (do_idle+0x1e0/0x210)
>> [ 2.236469] [<c014cf4c>] (do_idle) from [<c014d218>] (cpu_startup_entry+0x18/0x20)
>> [ 2.244065] [<c014d218>] (cpu_startup_entry) from [<c0900f48>] (start_kernel+0x42c/0x454)
>> [ 2.252268] [<c0900f48>] (start_kernel) from [<00000000>] ( (null))
>> [ 2.258642] Rebooting in 10 seconds..
>
> I cannot easily revert that commit now due to some followup changes. I'll
> be happy to test patches which attempt to fix this.
>
> The DTS file in use on this board is armada-388-clearfog-base.dts in case
> it matters (we have some DT add-on patches on top of that, but nothing PCI-
> or MBUS-related).

baruch

--
http://baruch.siach.name/blog/ ~. .~ Tk Open Systems
=}------------------------------------------------ooO--U--Ooo------------{=
- [email protected] - tel: +972.52.368.4656, http://www.tkos.co.il -

2018-09-12 19:01:20

by Jan Kundrát

[permalink] [raw]
Subject: Re: [BISECTED] Regression: Solidrun Clearfog Base won't boot since "PCI: mvebu: Only remap I/O space if configured"

>> You mean '4.19-rc3', right?

Right, sorry.

>>>> [ 1.741458] Internal error: Oops - undefined instruction:
>>>> 0 [#1] SMP ARM
>>>> [ 1.748182] CPU: 1 PID: 72 Comm: kworker/1:2 Tainted: G
>>>> W 4.19.0-rc3 #1
>>
>> The 'W' taint means that there was a kernel warning before. Which
>> warning was that?

I have "spidev" listed in my DT, and the kernel complains about that;
that's a harmelss warning. (I sent a patch which adds my particular device
into the spidev driver some time ago.)

With kind regards,
Jan

2018-09-12 23:12:17

by Russell King (Oracle)

[permalink] [raw]
Subject: Re: [BISECTED] Regression: Solidrun Clearfog Base won't boot since "PCI: mvebu: Only remap I/O space if configured"

On Wed, Sep 12, 2018 at 09:49:41PM +0300, Baruch Siach wrote:
> I reproduced the same Oops on Clearfog Base without any taint:
>
> [ 1.476401] Internal error: Oops - undefined instruction: 0 [#1] SMP ARM
...
> [ 1.855954] Code: e2844004 e5972000 e3520000 0affffee (e7f001f2)

That is a BUG(). Please turn on verbose bug reporting to get more
information about the cause.

There are two possibilities:

BUG_ON(addr >= end);

and

BUG_ON(!pte_none(*pte));

It's probably the latter - the region is probably already mapped, that
being the PCI IO region.

The original driver was setup to call pci_ioremap_io() as the very
last thing - and as the driver is non-removable, we were guaranteed
to never tear down this mapping (which is sensible, it's published
to userspace.)

However, the current code calls pci_ioremap_io() much earlier, in
a path where probe failures can occur. This breaks pci_ioremap_io()'s
requirements - it must not be called more than once. So:

ee1604381a37 ("PCI: mvebu: Only remap I/O space if configured")

is basically incorrect - pci_ioremap_io() needs to move back to a
place where it is only called in a path which will never fail.
However, looking at the generic host bits, I'm not sure such a place
exists in the new effort to make stuff more generic.

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 13.8Mbps down 630kbps up
According to speedtest.net: 13Mbps down 490kbps up

2018-09-13 03:20:20

by Baruch Siach

[permalink] [raw]
Subject: Re: [BISECTED] Regression: Solidrun Clearfog Base won't boot since "PCI: mvebu: Only remap I/O space if configured"

Hi Russell,

Russell King - ARM Linux writes:
> On Wed, Sep 12, 2018 at 09:49:41PM +0300, Baruch Siach wrote:
>> I reproduced the same Oops on Clearfog Base without any taint:
>>
>> [ 1.476401] Internal error: Oops - undefined instruction: 0 [#1] SMP ARM
> ...
>> [ 1.855954] Code: e2844004 e5972000 e3520000 0affffee (e7f001f2)
>
> That is a BUG(). Please turn on verbose bug reporting to get more
> information about the cause.
>
> There are two possibilities:
>
> BUG_ON(addr >= end);
>
> and
>
> BUG_ON(!pte_none(*pte));
>
> It's probably the latter - the region is probably already mapped, that
> being the PCI IO region.

That is the one. Enabling CONFIG_DEBUG_BUGVERBOSE shows:

[ 1.481927] kernel BUG at lib/ioremap.c:72!
[ 1.486118] Internal error: Oops - BUG: 0 [#1] SMP ARM
[ 1.491269] CPU: 0 PID: 1246 Comm: kworker/0:2 Not tainted 4.19.0-rc3 #146
...

baruch

> The original driver was setup to call pci_ioremap_io() as the very
> last thing - and as the driver is non-removable, we were guaranteed
> to never tear down this mapping (which is sensible, it's published
> to userspace.)
>
> However, the current code calls pci_ioremap_io() much earlier, in
> a path where probe failures can occur. This breaks pci_ioremap_io()'s
> requirements - it must not be called more than once. So:
>
> ee1604381a37 ("PCI: mvebu: Only remap I/O space if configured")
>
> is basically incorrect - pci_ioremap_io() needs to move back to a
> place where it is only called in a path which will never fail.
> However, looking at the generic host bits, I'm not sure such a place
> exists in the new effort to make stuff more generic.

--
http://baruch.siach.name/blog/ ~. .~ Tk Open Systems
=}------------------------------------------------ooO--U--Ooo------------{=
- [email protected] - tel: +972.52.368.4656, http://www.tkos.co.il -

2018-09-13 07:45:39

by Thomas Petazzoni

[permalink] [raw]
Subject: Re: [BISECTED] Regression: Solidrun Clearfog Base won't boot since "PCI: mvebu: Only remap I/O space if configured"

Russell, Baruch, Jan,

On Thu, 13 Sep 2018 00:10:51 +0100, Russell King - ARM Linux wrote:

> It's probably the latter - the region is probably already mapped, that
> being the PCI IO region.
>
> The original driver was setup to call pci_ioremap_io() as the very
> last thing - and as the driver is non-removable, we were guaranteed
> to never tear down this mapping (which is sensible, it's published
> to userspace.)
>
> However, the current code calls pci_ioremap_io() much earlier, in
> a path where probe failures can occur. This breaks pci_ioremap_io()'s
> requirements - it must not be called more than once. So:
>
> ee1604381a37 ("PCI: mvebu: Only remap I/O space if configured")
>
> is basically incorrect - pci_ioremap_io() needs to move back to a
> place where it is only called in a path which will never fail.
> However, looking at the generic host bits, I'm not sure such a place
> exists in the new effort to make stuff more generic.

What about something like the below. I tested it, including the error
case by forcing an -EPROBE_DEFER. The new pci_unmap_io() is modeled
after pci_unmap_iospace(). Actually, I would prefer to use
pci_remap_iospace() and pci_unmap_iospace() but for now this API
doesn't allow overloading the memory type used for the mapping.

diff --git a/arch/arm/include/asm/io.h b/arch/arm/include/asm/io.h
index 2cfbc531f63b..cee0e2be5ac7 100644
--- a/arch/arm/include/asm/io.h
+++ b/arch/arm/include/asm/io.h
@@ -185,6 +185,7 @@ static inline void pci_ioremap_set_mem_type(int mem_type) {}
#endif

extern int pci_ioremap_io(unsigned int offset, phys_addr_t phys_addr);
+extern void pci_unmap_io(unsigned int offset);

/*
* PCI configuration space mapping function.
diff --git a/arch/arm/mm/ioremap.c b/arch/arm/mm/ioremap.c
index fc91205ff46c..f3a4df44bb27 100644
--- a/arch/arm/mm/ioremap.c
+++ b/arch/arm/mm/ioremap.c
@@ -482,6 +482,14 @@ int pci_ioremap_io(unsigned int offset, phys_addr_t phys_addr)
}
EXPORT_SYMBOL_GPL(pci_ioremap_io);

+void pci_unmap_io(unsigned int offset)
+{
+ BUG_ON(offset + SZ_64K > IO_SPACE_LIMIT);
+
+ unmap_kernel_range(PCI_IO_VIRT_BASE + offset, SZ_64K);
+}
+EXPORT_SYMBOL_GPL(pci_unmap_io);
+
void __iomem *pci_remap_cfgspace(resource_size_t res_cookie, size_t size)
{
return arch_ioremap_caller(res_cookie, size, MT_UNCACHED,
diff --git a/drivers/pci/controller/pci-mvebu.c b/drivers/pci/controller/pci-mvebu.c
index 50eb0729385b..772cff19c2ce 100644
--- a/drivers/pci/controller/pci-mvebu.c
+++ b/drivers/pci/controller/pci-mvebu.c
@@ -1145,7 +1145,6 @@ static int mvebu_pcie_parse_request_resources(struct mvebu_pcie *pcie)
{
struct device *dev = &pcie->pdev->dev;
struct device_node *np = dev->of_node;
- unsigned int i;
int ret;

INIT_LIST_HEAD(&pcie->resources);
@@ -1179,15 +1178,34 @@ static int mvebu_pcie_parse_request_resources(struct mvebu_pcie *pcie)
resource_size(&pcie->io) - 1);
pcie->realio.name = "PCI I/O";

- for (i = 0; i < resource_size(&pcie->realio); i += SZ_64K)
- pci_ioremap_io(i, pcie->io.start + i);
-
pci_add_resource(&pcie->resources, &pcie->realio);
}

return devm_request_pci_bus_resources(dev, &pcie->resources);
}

+static void mvebu_pcie_map_io(struct mvebu_pcie *pcie)
+{
+ int i;
+
+ if (resource_size(&pcie->io) == 0)
+ return;
+
+ for (i = 0; i < resource_size(&pcie->realio); i += SZ_64K)
+ pci_ioremap_io(i, pcie->io.start + i);
+}
+
+static void mvebu_pcie_unmap_io(struct mvebu_pcie *pcie)
+{
+ int i;
+
+ if (resource_size(&pcie->io) == 0)
+ return;
+
+ for (i = 0; i < resource_size(&pcie->realio); i += SZ_64K)
+ pci_unmap_io(i);
+}
+
static int mvebu_pcie_probe(struct platform_device *pdev)
{
struct device *dev = &pdev->dev;
@@ -1258,6 +1276,8 @@ static int mvebu_pcie_probe(struct platform_device *pdev)

pcie->nports = i;

+ mvebu_pcie_map_io(pcie);
+
list_splice_init(&pcie->resources, &bridge->windows);
bridge->dev.parent = dev;
bridge->sysdata = pcie;
@@ -1268,7 +1288,13 @@ static int mvebu_pcie_probe(struct platform_device *pdev)
bridge->align_resource = mvebu_pcie_align_resource;
bridge->msi = pcie->msi;

- return pci_host_probe(bridge);
+ ret = pci_host_probe(bridge);
+ if (ret) {
+ mvebu_pcie_unmap_io(pcie);
+ return ret;
+ }
+
+ return 0;
}

static const struct of_device_id mvebu_pcie_of_match_table[] = {


--
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

2018-09-13 08:21:13

by Jan Kundrát

[permalink] [raw]
Subject: Re: [BISECTED] Regression: Solidrun Clearfog Base won't boot since "PCI: mvebu: Only remap I/O space if configured"

On čtvrtek 13. září 2018 9:45:15 CEST, Thomas Petazzoni wrote:
> What about something like the below. I tested it, including the error
> case by forcing an -EPROBE_DEFER. The new pci_unmap_io() is modeled
> after pci_unmap_iospace(). Actually, I would prefer to use
> pci_remap_iospace() and pci_unmap_iospace() but for now this API
> doesn't allow overloading the memory type used for the mapping.

Thanks for providing this fix so quickly, Thomas. I can confirm that this
patch -- tested on top of 54eda9df17f3215b9ed16629ee71ea07413efdaf ("Merge
tag 'pci-v4.19-fixes-1' of
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci"). Disclaimer: I
have zero familiarity with Linux' PCI code.

Tested-by: Jan Kundrát <[email protected]>

With kind regards,
Jan

2018-09-13 08:43:12

by Thomas Petazzoni

[permalink] [raw]
Subject: Re: [BISECTED] Regression: Solidrun Clearfog Base won't boot since "PCI: mvebu: Only remap I/O space if configured"

Hello,

On Thu, 13 Sep 2018 10:20:45 +0200, Jan Kundrát wrote:
> On čtvrtek 13. září 2018 9:45:15 CEST, Thomas Petazzoni wrote:
> > What about something like the below. I tested it, including the error
> > case by forcing an -EPROBE_DEFER. The new pci_unmap_io() is modeled
> > after pci_unmap_iospace(). Actually, I would prefer to use
> > pci_remap_iospace() and pci_unmap_iospace() but for now this API
> > doesn't allow overloading the memory type used for the mapping.
>
> Thanks for providing this fix so quickly, Thomas. I can confirm that this
> patch -- tested on top of 54eda9df17f3215b9ed16629ee71ea07413efdaf ("Merge
> tag 'pci-v4.19-fixes-1' of
> git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci"). Disclaimer: I
> have zero familiarity with Linux' PCI code.
>
> Tested-by: Jan Kundrát <[email protected]>

Thanks for the testing. I'll wait for Russell to say if he is happy
(or not) with the addition of pci_unmap_io() in the ARM code, if that's
the case, I'll send a proper patch to fix the issue.

Best regards,

Thomas
--
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

2018-09-24 10:07:25

by Jan Kundrát

[permalink] [raw]
Subject: Re: [BISECTED] Regression: Solidrun Clearfog Base won't boot since "PCI: mvebu: Only remap I/O space if configured"

>>> What about something like the below. I tested it, including the error
>>> case by forcing an -EPROBE_DEFER. The new pci_unmap_io() is modeled
>>> after pci_unmap_iospace(). Actually, I would prefer to use
>>> pci_remap_iospace() and pci_unmap_iospace() but for now this API
>>> doesn't allow overloading the memory type used for the mapping.
>>
>> Thanks for providing this fix so quickly, Thomas. I can confirm that this
>> patch -- tested on top of
>> 54eda9df17f3215b9ed16629ee71ea07413efdaf ("Merge
>> tag 'pci-v4.19-fixes-1' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci").
>> Disclaimer: I
>> have zero familiarity with Linux' PCI code.
>>
>> Tested-by: Jan Kundrát <[email protected]>
>
> Thanks for the testing. I'll wait for Russell to say if he is happy
> (or not) with the addition of pci_unmap_io() in the ARM code, if that's
> the case, I'll send a proper patch to fix the issue.

Hi Thomas, Russell,
is there a proper patch for this? I've just verified that 4.19-rc5 won't
boot for me either. Thomas' quick patch applies and makes the problem go
away.

With kind regards,
Jan

2018-09-24 10:13:48

by Thomas Petazzoni

[permalink] [raw]
Subject: Re: [BISECTED] Regression: Solidrun Clearfog Base won't boot since "PCI: mvebu: Only remap I/O space if configured"

Hello,

On Mon, 24 Sep 2018 12:02:33 +0200, Jan Kundrát wrote:

> is there a proper patch for this? I've just verified that 4.19-rc5 won't
> boot for me either. Thomas' quick patch applies and makes the problem go
> away.

I was waiting for a quick review from Russell on my proposal, but since
it didn't happen so far, I will send a proper patch series, hopefully
today.

Thanks for your reminder!

Best regards,

Thomas
--
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

2018-09-24 10:14:21

by Russell King (Oracle)

[permalink] [raw]
Subject: Re: [BISECTED] Regression: Solidrun Clearfog Base won't boot since "PCI: mvebu: Only remap I/O space if configured"

On Thu, Sep 13, 2018 at 10:42:41AM +0200, Thomas Petazzoni wrote:
> Hello,
>
> On Thu, 13 Sep 2018 10:20:45 +0200, Jan Kundrát wrote:
> > On čtvrtek 13. září 2018 9:45:15 CEST, Thomas Petazzoni wrote:
> > > What about something like the below. I tested it, including the error
> > > case by forcing an -EPROBE_DEFER. The new pci_unmap_io() is modeled
> > > after pci_unmap_iospace(). Actually, I would prefer to use
> > > pci_remap_iospace() and pci_unmap_iospace() but for now this API
> > > doesn't allow overloading the memory type used for the mapping.
> >
> > Thanks for providing this fix so quickly, Thomas. I can confirm that this
> > patch -- tested on top of 54eda9df17f3215b9ed16629ee71ea07413efdaf ("Merge
> > tag 'pci-v4.19-fixes-1' of
> > git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci"). Disclaimer: I
> > have zero familiarity with Linux' PCI code.
> >
> > Tested-by: Jan Kundrát <[email protected]>
>
> Thanks for the testing. I'll wait for Russell to say if he is happy
> (or not) with the addition of pci_unmap_io() in the ARM code, if that's
> the case, I'll send a proper patch to fix the issue.

I'd prefer not to provide an unmapping API, because it gives the
impression that it's okay to unmap the IO space mapping, and we'll
end up with drivers that decide to call it in their cleanup path.
IO space isn't expected to ever go away - eg, on a PC, it's always
present.

I've been toying with the idea of making pci_map_io() ignore
subsequent attempts to overwrite the mapping with an identical
mapping, and WARN() if there is an attempt to overwrite an existing
mapping with other physical address, but I'm not entirely happy
with that either (which is why I haven't responded sooner.)

At the moment, I don't have a way forward that I'm happy with for
this.

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 13.8Mbps down 630kbps up
According to speedtest.net: 13Mbps down 490kbps up

2018-09-24 10:29:30

by Thomas Petazzoni

[permalink] [raw]
Subject: Re: [BISECTED] Regression: Solidrun Clearfog Base won't boot since "PCI: mvebu: Only remap I/O space if configured"

Hello,

On Mon, 24 Sep 2018 11:12:13 +0100, Russell King - ARM Linux wrote:

> > Thanks for the testing. I'll wait for Russell to say if he is happy
> > (or not) with the addition of pci_unmap_io() in the ARM code, if that's
> > the case, I'll send a proper patch to fix the issue.
>
> I'd prefer not to provide an unmapping API, because it gives the
> impression that it's okay to unmap the IO space mapping, and we'll
> end up with drivers that decide to call it in their cleanup path.
> IO space isn't expected to ever go away - eg, on a PC, it's always
> present.

But being able to unmap it would also be needed to be able to remove
PCI host controller drivers, and therefore compile them as module, and
make them more like any other drivers.

I'm not sure why we need to guarantee that the I/O space is always
mapped:

- It isn't mapped before the PCI controller driver does the mapping.

- There is no reason for it to be accessed when the PCI controller
driver is not initialized: PCI devices can only be probed and
initialized when the PCI controller driver is probed/initialized.

Also, in general, pci_ioremap_io() is ARM specific, and is now only
used by very few drivers:

- Dove (Marvell platform)
- IOP13xx
- MV78xx0 (Marvell platform, should be moved to use pci-mvebu)
- Orion5x (Marvell platform, should be moved to use pci-mvebu)
- pci-mvebu
- at91_cf

All other drivers, including on ARM, use pci_remap_iospace(), which
does provide the pci_unmap_iospace() counter part.

The only reason I have not changed pci-mvebu to use
pci_{remap,unmap}_iospace() is because it doesn't allow to change the
memory attributes.

But to me, the general direction is that the ARM-specific
pci_remap_io() API is fading away, and its replacement already provides
an unmapping capability. So why not add the same unmapping capability
to pci_remap_io() ?

> I've been toying with the idea of making pci_map_io() ignore
> subsequent attempts to overwrite the mapping with an identical
> mapping, and WARN() if there is an attempt to overwrite an existing
> mapping with other physical address, but I'm not entirely happy
> with that either (which is why I haven't responded sooner.)
>
> At the moment, I don't have a way forward that I'm happy with for
> this.

But we have a regression and we need to fix it. Do you suggest to not
use the new pci_host_probe() API ?

Thomas
--
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

2018-09-24 11:14:44

by Russell King (Oracle)

[permalink] [raw]
Subject: Re: [BISECTED] Regression: Solidrun Clearfog Base won't boot since "PCI: mvebu: Only remap I/O space if configured"

On Mon, Sep 24, 2018 at 12:26:14PM +0200, Thomas Petazzoni wrote:
> Hello,
>
> On Mon, 24 Sep 2018 11:12:13 +0100, Russell King - ARM Linux wrote:
>
> > > Thanks for the testing. I'll wait for Russell to say if he is happy
> > > (or not) with the addition of pci_unmap_io() in the ARM code, if that's
> > > the case, I'll send a proper patch to fix the issue.
> >
> > I'd prefer not to provide an unmapping API, because it gives the
> > impression that it's okay to unmap the IO space mapping, and we'll
> > end up with drivers that decide to call it in their cleanup path.
> > IO space isn't expected to ever go away - eg, on a PC, it's always
> > present.
>
> But being able to unmap it would also be needed to be able to remove
> PCI host controller drivers, and therefore compile them as module, and
> make them more like any other drivers.
>
> I'm not sure why we need to guarantee that the I/O space is always
> mapped:
>
> - It isn't mapped before the PCI controller driver does the mapping.
>
> - There is no reason for it to be accessed when the PCI controller
> driver is not initialized: PCI devices can only be probed and
> initialized when the PCI controller driver is probed/initialized.

There are historic reasons. PCI provides ISA IO space, and when you
have a machine with ISA peripherals present, the PCI IO space must
never be unmapped - if it is, ISA drivers will oops the kernel. There
is no way for a vanishing PCI controller to cause ISA drivers to be
unbound.

If you have a host controller that does unmap PCI IO space and you have
ISA peripherals with drivers present, unbinding the PCI host controller
will remove the IO space mapping, and next time an ISA peripheral
touches IO space, the kernel will oops.

> All other drivers, including on ARM, use pci_remap_iospace(), which
> does provide the pci_unmap_iospace() counter part.

... which has been created in PCI land just to deal with PCI without
regard for the above issue.

However, there's another issue I missed - if you _do_ have ISA
peripherals, you likely want the IO space setup from very early on,
and you won't be using the new fangled PCI host driver support anyway.
That uses pci_map_io_early() rather than pci_ioremap_io() or
pci_remap_io().

> But to me, the general direction is that the ARM-specific
> pci_remap_io() API is fading away, and its replacement already provides
> an unmapping capability. So why not add the same unmapping capability
> to pci_remap_io() ?

Yes, that would be a good longer term plan - we don't need three
different ways to map PCI IO space, but it is development.

> But we have a regression and we need to fix it. Do you suggest to not
> use the new pci_host_probe() API ?

Well, arguably, the patch that caused the regression is the buggy patch,
_not_ the lack of unmapping API for pci_ioremap_io(). Trying to address
a regression with further development means that _that_ development
needs thought and review, which is a slower process.

I do understand the desire to keep moving forward and never take a step
backwards, but sometimes backwards steps are the best way to resolve a
regression. But I also do appreciate that a simple revert in this case
is not possible.

I'll accept your patch on the condition that the ARM private
pci_ioremap_io() will go away in the very near future (please _try_
to get agreement on that before this patch is merged.)

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 13.8Mbps down 630kbps up
According to speedtest.net: 13Mbps down 490kbps up

2018-09-24 12:47:47

by Lorenzo Pieralisi

[permalink] [raw]
Subject: Re: [BISECTED] Regression: Solidrun Clearfog Base won't boot since "PCI: mvebu: Only remap I/O space if configured"

On Mon, Sep 24, 2018 at 02:12:03PM +0200, Thomas Petazzoni wrote:

[...]

> > Trying to address a regression with further development means that
> > _that_ development needs thought and review, which is a slower
> > process.
> >
> > I do understand the desire to keep moving forward and never take a
> > step backwards, but sometimes backwards steps are the best way to
> > resolve a regression. But I also do appreciate that a simple revert
> > in this case is not possible.
>
> Well, I can revert:
>
> 42342073e38b50113354944cd51dcfed28d857a1 PCI: mvebu: Convert to use
> pci_host_bridge directly ee1604381a371b3ea6aec7d5e43b6e3f5e153854 PCI:
> mvebu: Only remap I/O space if configured
>
> so it's not a big deal either. I can revert those, and then resubmit a
> more complete series later on that moves pci-mvebu to use
> pci_remap_iospace().
>
> > I'll accept your patch on the condition that the ARM private
> > pci_ioremap_io() will go away in the very near future (please _try_
> > to get agreement on that before this patch is merged.)
>
> Bjorn, Lorenzo, what do you prefer ?
>
> If we want to get rid of pci_ioremap_io(), then we need a way to tell
> pci_remap_iospace() the memory attributes that should be used for the
> mapping, because on Armada 38x, we need to map the I/O space mapped
> MT_UNCACHED instead of MT_DEVICE. I'm not sure how to achieve this yet.
> Should pgprot_device() be changed to return MT_UNCACHED on a
> platform-specific basis ? Any other idea ?

I think we should address Russell's concern, he has more insights into
code that predates the PCI host developments.

What I think you can do short term, given that AFAICS MVEBU is not
removable, instead of using pci_host_probe() you move part of its code
into the driver and make sure that you remap IO as last operation before
probe completion (ie after scanning the host bridge) so that you do not
need to unmap it on failure; write a commit log summarising/linking this
thread please and when v4.20 lands we will give this a more thorough
look as Russell requested.

How does that sound ?

Thanks,
Lorenzo

2018-09-24 12:58:00

by Thomas Petazzoni

[permalink] [raw]
Subject: Re: [BISECTED] Regression: Solidrun Clearfog Base won't boot since "PCI: mvebu: Only remap I/O space if configured"

Hello,

On Mon, 24 Sep 2018 12:13:41 +0100, Russell King - ARM Linux wrote:

> > But being able to unmap it would also be needed to be able to remove
> > PCI host controller drivers, and therefore compile them as module, and
> > make them more like any other drivers.
> >
> > I'm not sure why we need to guarantee that the I/O space is always
> > mapped:
> >
> > - It isn't mapped before the PCI controller driver does the mapping.
> >
> > - There is no reason for it to be accessed when the PCI controller
> > driver is not initialized: PCI devices can only be probed and
> > initialized when the PCI controller driver is probed/initialized.
>
> There are historic reasons. PCI provides ISA IO space, and when you
> have a machine with ISA peripherals present, the PCI IO space must
> never be unmapped - if it is, ISA drivers will oops the kernel. There
> is no way for a vanishing PCI controller to cause ISA drivers to be
> unbound.
>
> If you have a host controller that does unmap PCI IO space and you have
> ISA peripherals with drivers present, unbinding the PCI host controller
> will remove the IO space mapping, and next time an ISA peripheral
> touches IO space, the kernel will oops.

Thanks for sharing some additional technical context on this, very
useful.

I have another question though: shouldn't those ISA devices be child
devices of the PCI controller, if they use some resources of the PCI
controller ? Could you give an example of such an ISA device driver ?
This is just to understand better the issue, because there seems to be
a kind of hidden dependency between those ISA drivers and the setup of
the PCI controller.

> > All other drivers, including on ARM, use pci_remap_iospace(), which
> > does provide the pci_unmap_iospace() counter part.
>
> ... which has been created in PCI land just to deal with PCI without
> regard for the above issue.
>
> However, there's another issue I missed - if you _do_ have ISA
> peripherals, you likely want the IO space setup from very early on,
> and you won't be using the new fangled PCI host driver support anyway.
> That uses pci_map_io_early() rather than pci_ioremap_io() or
> pci_remap_io().

OK. There's today a single platform (Footbridge) that uses
pci_map_io_early(), and it is indeed called through the ->map_io()
hook, which is very early in the boot process.

BTW, look at drivers/pcmcia/at91_cf.c. It has ->probe() and ->remove(),
and does a pci_ioremap_io() in its ->probe(), and nothing in its
->remove(). I don't think this driver, compiled as a module, will work
well after a insmod/rmmod/insmod cycle.

> > But to me, the general direction is that the ARM-specific
> > pci_remap_io() API is fading away, and its replacement already provides
> > an unmapping capability. So why not add the same unmapping capability
> > to pci_remap_io() ?
>
> Yes, that would be a good longer term plan - we don't need three
> different ways to map PCI IO space, but it is development.

Absolutely. Glad to hear that you agree on the longer term plan.

> > But we have a regression and we need to fix it. Do you suggest to not
> > use the new pci_host_probe() API ?
>
> Well, arguably, the patch that caused the regression is the buggy patch,
> _not_ the lack of unmapping API for pci_ioremap_io().

Totally true.

> Trying to address a regression with further development means that
> _that_ development needs thought and review, which is a slower
> process.
>
> I do understand the desire to keep moving forward and never take a
> step backwards, but sometimes backwards steps are the best way to
> resolve a regression. But I also do appreciate that a simple revert
> in this case is not possible.

Well, I can revert:

42342073e38b50113354944cd51dcfed28d857a1 PCI: mvebu: Convert to use
pci_host_bridge directly ee1604381a371b3ea6aec7d5e43b6e3f5e153854 PCI:
mvebu: Only remap I/O space if configured

so it's not a big deal either. I can revert those, and then resubmit a
more complete series later on that moves pci-mvebu to use
pci_remap_iospace().

> I'll accept your patch on the condition that the ARM private
> pci_ioremap_io() will go away in the very near future (please _try_
> to get agreement on that before this patch is merged.)

Bjorn, Lorenzo, what do you prefer ?

If we want to get rid of pci_ioremap_io(), then we need a way to tell
pci_remap_iospace() the memory attributes that should be used for the
mapping, because on Armada 38x, we need to map the I/O space mapped
MT_UNCACHED instead of MT_DEVICE. I'm not sure how to achieve this yet.
Should pgprot_device() be changed to return MT_UNCACHED on a
platform-specific basis ? Any other idea ?

Best regards,

Thomas
--
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

2018-09-24 13:12:50

by Thomas Petazzoni

[permalink] [raw]
Subject: Re: [BISECTED] Regression: Solidrun Clearfog Base won't boot since "PCI: mvebu: Only remap I/O space if configured"

Hello,

On Mon, 24 Sep 2018 13:46:29 +0100, Lorenzo Pieralisi wrote:

> What I think you can do short term, given that AFAICS MVEBU is not
> removable, instead of using pci_host_probe() you move part of its code
> into the driver and make sure that you remap IO as last operation before
> probe completion (ie after scanning the host bridge) so that you do not
> need to unmap it on failure; write a commit log summarising/linking this
> thread please and when v4.20 lands we will give this a more thorough
> look as Russell requested.
>
> How does that sound ?

The only thing that can fail in pci_host_probe() is:

ret = pci_scan_root_bus_bridge(bridge);
if (ret < 0) {
dev_err(bridge->dev.parent, "Scanning root bridge
failed"); return ret;
}

In the pci-mvebu driver prior to the conversion to pci_host_probe(),
the code flow at the end of ->probe() was:

mvebu_pcie_enable()
pci_common_init_dev()
pcibios_init_hw()

and pcibios_init_hw() calls pci_scan_root_bus_bridge(), without doing
much about the return value other than issuing a warning:

ret = pci_scan_root_bus_bridge(bridge);
}

if (WARN(ret < 0, "PCI: unable to scan bus!")) {
pci_free_host_bridge(bridge);
break;
}

I.e, even before the conversion to pci_host_probe(), in case of
failure in pci_scan_root_bus_bridge(), we would have the I/O mapping in
place, but the PCI controller not registered.

We could keep the same (not great) behavior by doing:

diff --git a/drivers/pci/controller/pci-mvebu.c b/drivers/pci/controller/pci-mvebu.c
index 50eb0729385b..487492f0c5f7 100644
--- a/drivers/pci/controller/pci-mvebu.c
+++ b/drivers/pci/controller/pci-mvebu.c
@@ -1179,9 +1179,6 @@ static int mvebu_pcie_parse_request_resources(struct mvebu_pcie *pcie)
resource_size(&pcie->io) - 1);
pcie->realio.name = "PCI I/O";

- for (i = 0; i < resource_size(&pcie->realio); i += SZ_64K)
- pci_ioremap_io(i, pcie->io.start + i);
-
pci_add_resource(&pcie->resources, &pcie->realio);
}

@@ -1197,7 +1194,7 @@ static int mvebu_pcie_probe(struct platform_device *pdev)
struct device_node *child;
int num, i, ret;

- bridge = devm_pci_alloc_host_bridge(dev, sizeof(struct mvebu_pcie));
+ bridge = pci_alloc_host_bridge(sizeof(struct mvebu_pcie));
if (!bridge)
return -ENOMEM;

@@ -1212,8 +1209,10 @@ static int mvebu_pcie_probe(struct platform_device *pdev)
num = of_get_available_child_count(np);

pcie->ports = devm_kcalloc(dev, num, sizeof(*pcie->ports), GFP_KERNEL);
- if (!pcie->ports)
- return -ENOMEM;
+ if (!pcie->ports) {
+ ret = -ENOMEM;
+ goto free_host_bridge;
+ }

i = 0;
for_each_available_child_of_node(np, child) {
@@ -1222,7 +1221,7 @@ static int mvebu_pcie_probe(struct platform_device *pdev)
ret = mvebu_pcie_parse_port(pcie, port, child);
if (ret < 0) {
of_node_put(child);
- return ret;
+ goto free_host_bridge;
} else if (ret == 0) {
continue;
}
@@ -1268,7 +1267,21 @@ static int mvebu_pcie_probe(struct platform_device *pdev)
bridge->align_resource = mvebu_pcie_align_resource;
bridge->msi = pcie->msi;

- return pci_host_probe(bridge);
+ if (resource_size(&pcie->io) != 0) {
+ for (i = 0; i < resource_size(&pcie->realio); i += SZ_64K)
+ pci_ioremap_io(i, pcie->io.start + i);
+ }
+
+ ret = pci_host_probe(bridge);
+ if (ret)
+ pci_free_host_bridge(bridge);
+
+ /* Yes, when pci_host_probe() returns a failure, we don't care */
+ return 0;
+
+free_host_bridge:
+ pci_free_host_bridge(bridge);
+ return ret;
}

static const struct of_device_id mvebu_pcie_of_match_table[] = {

I.e, we simply ignore the failure of pci_host_probe().

To be honest, I really prefer the option of introducing pci_unmap_io().

Best regards,

Thomas
--
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

2018-09-24 14:59:07

by Lorenzo Pieralisi

[permalink] [raw]
Subject: Re: [BISECTED] Regression: Solidrun Clearfog Base won't boot since "PCI: mvebu: Only remap I/O space if configured"

On Mon, Sep 24, 2018 at 03:10:40PM +0200, Thomas Petazzoni wrote:
> Hello,
>
> On Mon, 24 Sep 2018 13:46:29 +0100, Lorenzo Pieralisi wrote:
>
> > What I think you can do short term, given that AFAICS MVEBU is not
> > removable, instead of using pci_host_probe() you move part of its code
> > into the driver and make sure that you remap IO as last operation before
> > probe completion (ie after scanning the host bridge) so that you do not
> > need to unmap it on failure; write a commit log summarising/linking this
> > thread please and when v4.20 lands we will give this a more thorough
> > look as Russell requested.
> >
> > How does that sound ?
>
> The only thing that can fail in pci_host_probe() is:
>
> ret = pci_scan_root_bus_bridge(bridge);
> if (ret < 0) {
> dev_err(bridge->dev.parent, "Scanning root bridge
> failed"); return ret;
> }
>
> In the pci-mvebu driver prior to the conversion to pci_host_probe(),
> the code flow at the end of ->probe() was:
>
> mvebu_pcie_enable()
> pci_common_init_dev()
> pcibios_init_hw()
>
> and pcibios_init_hw() calls pci_scan_root_bus_bridge(), without doing
> much about the return value other than issuing a warning:
>
> ret = pci_scan_root_bus_bridge(bridge);
> }
>
> if (WARN(ret < 0, "PCI: unable to scan bus!")) {
> pci_free_host_bridge(bridge);
> break;
> }
>
> I.e, even before the conversion to pci_host_probe(), in case of
> failure in pci_scan_root_bus_bridge(), we would have the I/O mapping in
> place, but the PCI controller not registered.
>
> We could keep the same (not great) behavior by doing:
>
> diff --git a/drivers/pci/controller/pci-mvebu.c b/drivers/pci/controller/pci-mvebu.c
> index 50eb0729385b..487492f0c5f7 100644
> --- a/drivers/pci/controller/pci-mvebu.c
> +++ b/drivers/pci/controller/pci-mvebu.c
> @@ -1179,9 +1179,6 @@ static int mvebu_pcie_parse_request_resources(struct mvebu_pcie *pcie)
> resource_size(&pcie->io) - 1);
> pcie->realio.name = "PCI I/O";
>
> - for (i = 0; i < resource_size(&pcie->realio); i += SZ_64K)
> - pci_ioremap_io(i, pcie->io.start + i);
> -
> pci_add_resource(&pcie->resources, &pcie->realio);
> }
>
> @@ -1197,7 +1194,7 @@ static int mvebu_pcie_probe(struct platform_device *pdev)
> struct device_node *child;
> int num, i, ret;
>
> - bridge = devm_pci_alloc_host_bridge(dev, sizeof(struct mvebu_pcie));
> + bridge = pci_alloc_host_bridge(sizeof(struct mvebu_pcie));
> if (!bridge)
> return -ENOMEM;
>
> @@ -1212,8 +1209,10 @@ static int mvebu_pcie_probe(struct platform_device *pdev)
> num = of_get_available_child_count(np);
>
> pcie->ports = devm_kcalloc(dev, num, sizeof(*pcie->ports), GFP_KERNEL);
> - if (!pcie->ports)
> - return -ENOMEM;
> + if (!pcie->ports) {
> + ret = -ENOMEM;
> + goto free_host_bridge;
> + }
>
> i = 0;
> for_each_available_child_of_node(np, child) {
> @@ -1222,7 +1221,7 @@ static int mvebu_pcie_probe(struct platform_device *pdev)
> ret = mvebu_pcie_parse_port(pcie, port, child);
> if (ret < 0) {
> of_node_put(child);
> - return ret;
> + goto free_host_bridge;
> } else if (ret == 0) {
> continue;
> }
> @@ -1268,7 +1267,21 @@ static int mvebu_pcie_probe(struct platform_device *pdev)
> bridge->align_resource = mvebu_pcie_align_resource;
> bridge->msi = pcie->msi;
>
> - return pci_host_probe(bridge);
> + if (resource_size(&pcie->io) != 0) {
> + for (i = 0; i < resource_size(&pcie->realio); i += SZ_64K)
> + pci_ioremap_io(i, pcie->io.start + i);
> + }
> +
> + ret = pci_host_probe(bridge);
> + if (ret)
> + pci_free_host_bridge(bridge);
> +
> + /* Yes, when pci_host_probe() returns a failure, we don't care */
> + return 0;
> +
> +free_host_bridge:
> + pci_free_host_bridge(bridge);
> + return ret;
> }
>
> static const struct of_device_id mvebu_pcie_of_match_table[] = {
>
> I.e, we simply ignore the failure of pci_host_probe().
>
> To be honest, I really prefer the option of introducing pci_unmap_io().

I understand that, I wanted to make sure we come up with a fix asap
and what I put forward would cover everything discussed in this thread,
at least temporarily, giving us time to check ISA related issues while
unmapping IO space.

Lorenzo

2018-09-24 15:04:44

by Thomas Petazzoni

[permalink] [raw]
Subject: Re: [BISECTED] Regression: Solidrun Clearfog Base won't boot since "PCI: mvebu: Only remap I/O space if configured"

Hello,

On Mon, 24 Sep 2018 15:15:12 +0100, Lorenzo Pieralisi wrote:

> I understand that, I wanted to make sure we come up with a fix asap
> and what I put forward would cover everything discussed in this thread,
> at least temporarily, giving us time to check ISA related issues while
> unmapping IO space.

Something like this should implemented what you suggest I guess:

diff --git a/drivers/pci/controller/pci-mvebu.c b/drivers/pci/controller/pci-mvebu.c
index 50eb0729385b..a41d79b8d46a 100644
--- a/drivers/pci/controller/pci-mvebu.c
+++ b/drivers/pci/controller/pci-mvebu.c
@@ -1145,7 +1145,6 @@ static int mvebu_pcie_parse_request_resources(struct mvebu_pcie *pcie)
{
struct device *dev = &pcie->pdev->dev;
struct device_node *np = dev->of_node;
- unsigned int i;
int ret;

INIT_LIST_HEAD(&pcie->resources);
@@ -1179,13 +1178,58 @@ static int mvebu_pcie_parse_request_resources(struct mvebu_pcie *pcie)
resource_size(&pcie->io) - 1);
pcie->realio.name = "PCI I/O";

+ pci_add_resource(&pcie->resources, &pcie->realio);
+ }
+
+ return devm_request_pci_bus_resources(dev, &pcie->resources);
+}
+
+/*
+ * This is a copy of pci_host_probe(), except that it does the I/O
+ * remap as the last step, once we are sure we won't fail.
+ *
+ * It should be removed once the I/O remap error handling issue has
+ * been sorted out.
+ */
+static int mvebu_pci_host_probe(struct pci_host_bridge *bridge)
+{
+ struct mvebu_pcie *pcie;
+ struct pci_bus *bus, *child;
+ int ret;
+
+ ret = pci_scan_root_bus_bridge(bridge);
+ if (ret < 0) {
+ dev_err(bridge->dev.parent, "Scanning root bridge failed");
+ return ret;
+ }
+
+ pcie = pci_host_bridge_priv(bridge);
+ if (resource_size(&pcie->io) != 0) {
+ unsigned int i;
+
for (i = 0; i < resource_size(&pcie->realio); i += SZ_64K)
pci_ioremap_io(i, pcie->io.start + i);
+ }

- pci_add_resource(&pcie->resources, &pcie->realio);
+ bus = bridge->bus;
+
+ /*
+ * We insert PCI resources into the iomem_resource and
+ * ioport_resource trees in either pci_bus_claim_resources()
+ * or pci_bus_assign_resources().
+ */
+ if (pci_has_flag(PCI_PROBE_ONLY)) {
+ pci_bus_claim_resources(bus);
+ } else {
+ pci_bus_size_bridges(bus);
+ pci_bus_assign_resources(bus);
+
+ list_for_each_entry(child, &bus->children, node)
+ pcie_bus_configure_settings(child);
}

- return devm_request_pci_bus_resources(dev, &pcie->resources);
+ pci_bus_add_devices(bus);
+ return 0;
}

static int mvebu_pcie_probe(struct platform_device *pdev)
@@ -1268,7 +1312,7 @@ static int mvebu_pcie_probe(struct platform_device *pdev)
bridge->align_resource = mvebu_pcie_align_resource;
bridge->msi = pcie->msi;

- return pci_host_probe(bridge);
+ return mvebu_pci_host_probe(bridge);
}

static const struct of_device_id mvebu_pcie_of_match_table[] = {

If that's what you meant, I'll go ahead and test on actual hardware and
submit as a proper patch.

Thanks!

Thomas
--
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

2018-09-24 16:43:09

by Lorenzo Pieralisi

[permalink] [raw]
Subject: Re: [BISECTED] Regression: Solidrun Clearfog Base won't boot since "PCI: mvebu: Only remap I/O space if configured"

On Mon, Sep 24, 2018 at 04:52:18PM +0200, Thomas Petazzoni wrote:
> Hello,
>
> On Mon, 24 Sep 2018 15:15:12 +0100, Lorenzo Pieralisi wrote:
>
> > I understand that, I wanted to make sure we come up with a fix asap
> > and what I put forward would cover everything discussed in this thread,
> > at least temporarily, giving us time to check ISA related issues while
> > unmapping IO space.
>
> Something like this should implemented what you suggest I guess:
>
> diff --git a/drivers/pci/controller/pci-mvebu.c b/drivers/pci/controller/pci-mvebu.c
> index 50eb0729385b..a41d79b8d46a 100644
> --- a/drivers/pci/controller/pci-mvebu.c
> +++ b/drivers/pci/controller/pci-mvebu.c
> @@ -1145,7 +1145,6 @@ static int mvebu_pcie_parse_request_resources(struct mvebu_pcie *pcie)
> {
> struct device *dev = &pcie->pdev->dev;
> struct device_node *np = dev->of_node;
> - unsigned int i;
> int ret;
>
> INIT_LIST_HEAD(&pcie->resources);
> @@ -1179,13 +1178,58 @@ static int mvebu_pcie_parse_request_resources(struct mvebu_pcie *pcie)
> resource_size(&pcie->io) - 1);
> pcie->realio.name = "PCI I/O";
>
> + pci_add_resource(&pcie->resources, &pcie->realio);
> + }
> +
> + return devm_request_pci_bus_resources(dev, &pcie->resources);
> +}
> +
> +/*
> + * This is a copy of pci_host_probe(), except that it does the I/O
> + * remap as the last step, once we are sure we won't fail.
> + *
> + * It should be removed once the I/O remap error handling issue has
> + * been sorted out.
> + */
> +static int mvebu_pci_host_probe(struct pci_host_bridge *bridge)
> +{
> + struct mvebu_pcie *pcie;
> + struct pci_bus *bus, *child;
> + int ret;
> +
> + ret = pci_scan_root_bus_bridge(bridge);
> + if (ret < 0) {
> + dev_err(bridge->dev.parent, "Scanning root bridge failed");
> + return ret;
> + }
> +
> + pcie = pci_host_bridge_priv(bridge);
> + if (resource_size(&pcie->io) != 0) {
> + unsigned int i;
> +
> for (i = 0; i < resource_size(&pcie->realio); i += SZ_64K)
> pci_ioremap_io(i, pcie->io.start + i);
> + }
>
> - pci_add_resource(&pcie->resources, &pcie->realio);
> + bus = bridge->bus;
> +
> + /*
> + * We insert PCI resources into the iomem_resource and
> + * ioport_resource trees in either pci_bus_claim_resources()
> + * or pci_bus_assign_resources().
> + */
> + if (pci_has_flag(PCI_PROBE_ONLY)) {
> + pci_bus_claim_resources(bus);
> + } else {
> + pci_bus_size_bridges(bus);
> + pci_bus_assign_resources(bus);
> +
> + list_for_each_entry(child, &bus->children, node)
> + pcie_bus_configure_settings(child);
> }
>
> - return devm_request_pci_bus_resources(dev, &pcie->resources);
> + pci_bus_add_devices(bus);
> + return 0;
> }
>
> static int mvebu_pcie_probe(struct platform_device *pdev)
> @@ -1268,7 +1312,7 @@ static int mvebu_pcie_probe(struct platform_device *pdev)
> bridge->align_resource = mvebu_pcie_align_resource;
> bridge->msi = pcie->msi;
>
> - return pci_host_probe(bridge);
> + return mvebu_pci_host_probe(bridge);
> }
>
> static const struct of_device_id mvebu_pcie_of_match_table[] = {
>
> If that's what you meant, I'll go ahead and test on actual hardware and
> submit as a proper patch.

Yes, that's what I meant, I hope that you will have some bandwidth to
discuss a way forward for v4.21, when this fix is merged (I think that
v4.20 is a tall order but we can try if you will have time to post the
patches - I suspect your fix will go in -rc6 if Bjorn can pull it this
week).

Lorenzo

2018-09-25 08:18:35

by Andrew Murray

[permalink] [raw]
Subject: Re: [BISECTED] Regression: Solidrun Clearfog Base won't boot since "PCI: mvebu: Only remap I/O space if configured"

Hi Thomas,

On Mon, Sep 24, 2018 at 02:12:03PM +0200, Thomas Petazzoni wrote:
> Hello,
>
> On Mon, 24 Sep 2018 12:13:41 +0100, Russell King - ARM Linux wrote:
>
> > > But being able to unmap it would also be needed to be able to remove
> > > PCI host controller drivers, and therefore compile them as module, and
> > > make them more like any other drivers.
> > >
> > > I'm not sure why we need to guarantee that the I/O space is always
> > > mapped:
> > >
> > > - It isn't mapped before the PCI controller driver does the mapping.
> > >
> > > - There is no reason for it to be accessed when the PCI controller
> > > driver is not initialized: PCI devices can only be probed and
> > > initialized when the PCI controller driver is probed/initialized.
> >
> > There are historic reasons. PCI provides ISA IO space, and when you
> > have a machine with ISA peripherals present, the PCI IO space must
> > never be unmapped - if it is, ISA drivers will oops the kernel. There
> > is no way for a vanishing PCI controller to cause ISA drivers to be
> > unbound.
> >
> > If you have a host controller that does unmap PCI IO space and you have
> > ISA peripherals with drivers present, unbinding the PCI host controller
> > will remove the IO space mapping, and next time an ISA peripheral
> > touches IO space, the kernel will oops.
>
> Thanks for sharing some additional technical context on this, very
> useful.
>
> I have another question though: shouldn't those ISA devices be child
> devices of the PCI controller, if they use some resources of the PCI
> controller ? Could you give an example of such an ISA device driver ?

Legacy VGA also falls into this category - for example
drivers/video/console/vgacon.c will happily use outb/inb macros to hard
coded addresses which are hoped to be present on some PCI/ISA bus.

With regards to ISA drivers - take a look for anything that registers with
isa_register_driver - for example:

drivers/input/touchscreen/htcpen.c
drivers/net/ethernet/3com/3c509.c
drivers/watchdog/ebc-c384_wdt.c

None of these drivers do any kind of mapping before attempting to access
these addresses.

Thanks,

Andrew Murray

> This is just to understand better the issue, because there seems to be
> a kind of hidden dependency between those ISA drivers and the setup of
> the PCI controller.
>
> > > All other drivers, including on ARM, use pci_remap_iospace(), which
> > > does provide the pci_unmap_iospace() counter part.
> >
> > ... which has been created in PCI land just to deal with PCI without
> > regard for the above issue.
> >
> > However, there's another issue I missed - if you _do_ have ISA
> > peripherals, you likely want the IO space setup from very early on,
> > and you won't be using the new fangled PCI host driver support anyway.
> > That uses pci_map_io_early() rather than pci_ioremap_io() or
> > pci_remap_io().
>
> OK. There's today a single platform (Footbridge) that uses
> pci_map_io_early(), and it is indeed called through the ->map_io()
> hook, which is very early in the boot process.
>
> BTW, look at drivers/pcmcia/at91_cf.c. It has ->probe() and ->remove(),
> and does a pci_ioremap_io() in its ->probe(), and nothing in its
> ->remove(). I don't think this driver, compiled as a module, will work
> well after a insmod/rmmod/insmod cycle.
>
> > > But to me, the general direction is that the ARM-specific
> > > pci_remap_io() API is fading away, and its replacement already provides
> > > an unmapping capability. So why not add the same unmapping capability
> > > to pci_remap_io() ?
> >
> > Yes, that would be a good longer term plan - we don't need three
> > different ways to map PCI IO space, but it is development.
>
> Absolutely. Glad to hear that you agree on the longer term plan.
>
> > > But we have a regression and we need to fix it. Do you suggest to not
> > > use the new pci_host_probe() API ?
> >
> > Well, arguably, the patch that caused the regression is the buggy patch,
> > _not_ the lack of unmapping API for pci_ioremap_io().
>
> Totally true.
>
> > Trying to address a regression with further development means that
> > _that_ development needs thought and review, which is a slower
> > process.
> >
> > I do understand the desire to keep moving forward and never take a
> > step backwards, but sometimes backwards steps are the best way to
> > resolve a regression. But I also do appreciate that a simple revert
> > in this case is not possible.
>
> Well, I can revert:
>
> 42342073e38b50113354944cd51dcfed28d857a1 PCI: mvebu: Convert to use
> pci_host_bridge directly ee1604381a371b3ea6aec7d5e43b6e3f5e153854 PCI:
> mvebu: Only remap I/O space if configured
>
> so it's not a big deal either. I can revert those, and then resubmit a
> more complete series later on that moves pci-mvebu to use
> pci_remap_iospace().
>
> > I'll accept your patch on the condition that the ARM private
> > pci_ioremap_io() will go away in the very near future (please _try_
> > to get agreement on that before this patch is merged.)
>
> Bjorn, Lorenzo, what do you prefer ?
>
> If we want to get rid of pci_ioremap_io(), then we need a way to tell
> pci_remap_iospace() the memory attributes that should be used for the
> mapping, because on Armada 38x, we need to map the I/O space mapped
> MT_UNCACHED instead of MT_DEVICE. I'm not sure how to achieve this yet.
> Should pgprot_device() be changed to return MT_UNCACHED on a
> platform-specific basis ? Any other idea ?
>
> Best regards,
>
> Thomas
> --
> Thomas Petazzoni, CTO, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com
>
> _______________________________________________
> linux-arm-kernel mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

2018-10-01 10:57:20

by Jan Kundrát

[permalink] [raw]
Subject: Re: [BISECTED] Regression: Solidrun Clearfog Base won't boot since "PCI: mvebu: Only remap I/O space if configured"

On pondělí 24. září 2018 16:52:18 CEST, Thomas Petazzoni wrote:
> Hello,
>
> On Mon, 24 Sep 2018 15:15:12 +0100, Lorenzo Pieralisi wrote:
>
>> I understand that, I wanted to make sure we come up with a fix asap
>> and what I put forward would cover everything discussed in this thread,
>> at least temporarily, giving us time to check ISA related issues while
>> unmapping IO space.
>
> Something like this should implemented what you suggest I guess:
>
> diff --git a/drivers/pci/controller/pci-mvebu.c
> b/drivers/pci/controller/pci-mvebu.c
> index 50eb0729385b..a41d79b8d46a 100644
> --- a/drivers/pci/controller/pci-mvebu.c
> +++ b/drivers/pci/controller/pci-mvebu.c
> @@ -1145,7 +1145,6 @@ static int
> mvebu_pcie_parse_request_resources(struct mvebu_pcie *pcie)
> {
> struct device *dev = &pcie->pdev->dev;
> struct device_node *np = dev->of_node;
> - unsigned int i;
> int ret;
>
> INIT_LIST_HEAD(&pcie->resources);
> @@ -1179,13 +1178,58 @@ static int
> mvebu_pcie_parse_request_resources(struct mvebu_pcie *pcie)
> resource_size(&pcie->io) - 1);
> pcie->realio.name = "PCI I/O";
>
> + pci_add_resource(&pcie->resources, &pcie->realio);
> + }
> +
> + return devm_request_pci_bus_resources(dev, &pcie->resources);
> +}
> +
> +/*
> + * This is a copy of pci_host_probe(), except that it does the I/O
> + * remap as the last step, once we are sure we won't fail.
> + *
> + * It should be removed once the I/O remap error handling issue has
> + * been sorted out.
> + */
> +static int mvebu_pci_host_probe(struct pci_host_bridge *bridge)
> +{
> + struct mvebu_pcie *pcie;
> + struct pci_bus *bus, *child;
> + int ret;
> +
> + ret = pci_scan_root_bus_bridge(bridge);
> + if (ret < 0) {
> + dev_err(bridge->dev.parent, "Scanning root bridge failed");
> + return ret;
> + }
> +
> + pcie = pci_host_bridge_priv(bridge);
> + if (resource_size(&pcie->io) != 0) {
> + unsigned int i;
> +
> for (i = 0; i < resource_size(&pcie->realio); i += SZ_64K)
> pci_ioremap_io(i, pcie->io.start + i);
> + }
>
> - pci_add_resource(&pcie->resources, &pcie->realio);
> + bus = bridge->bus;
> +
> + /*
> + * We insert PCI resources into the iomem_resource and
> + * ioport_resource trees in either pci_bus_claim_resources()
> + * or pci_bus_assign_resources().
> + */
> + if (pci_has_flag(PCI_PROBE_ONLY)) {
> + pci_bus_claim_resources(bus);
> + } else {
> + pci_bus_size_bridges(bus);
> + pci_bus_assign_resources(bus);
> +
> + list_for_each_entry(child, &bus->children, node)
> + pcie_bus_configure_settings(child);
> }
>
> - return devm_request_pci_bus_resources(dev, &pcie->resources);
> + pci_bus_add_devices(bus);
> + return 0;
> }
>
> static int mvebu_pcie_probe(struct platform_device *pdev)
> @@ -1268,7 +1312,7 @@ static int mvebu_pcie_probe(struct
> platform_device *pdev)
> bridge->align_resource = mvebu_pcie_align_resource;
> bridge->msi = pcie->msi;
>
> - return pci_host_probe(bridge);
> + return mvebu_pci_host_probe(bridge);
> }
>
> static const struct of_device_id mvebu_pcie_of_match_table[] = {
>
> If that's what you meant, I'll go ahead and test on actual hardware and
> submit as a proper patch.

Thomas, Russell, Lorenzo,
did you have time to convert this into a patch which can hit 4.19? I don't
see anything related in 4.19-rc6, but perhaps I missed something. Is there
something that I should test or otherwise help?

With kind regards,
Jan

2018-10-01 12:52:21

by Thomas Petazzoni

[permalink] [raw]
Subject: Re: [BISECTED] Regression: Solidrun Clearfog Base won't boot since "PCI: mvebu: Only remap I/O space if configured"

Hello,

On Mon, 01 Oct 2018 12:56:37 +0200, Jan Kundrát wrote:

> Thomas, Russell, Lorenzo,
> did you have time to convert this into a patch which can hit 4.19? I don't
> see anything related in 4.19-rc6, but perhaps I missed something. Is there
> something that I should test or otherwise help?

Sorry, I suddenly got busy (my second son arrived a few days earlier
than expected).

I just sent a proper patch with the proposal I made last week, after
testing on ClearFog and Armada XP GP. Note that on ClearFog, I only
tested that it fixes the panic at boot, since I didn't had any
mini-PCIe devices at hand. On Armada XP GP, I verified that an E1000E
NIC was still working as expected. Therefore, it would be useful if
you could test on your ClearFog platform with PCI devices connected.

Thanks a lot and sorry for the delay.

Best regards,

Thomas
--
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

2018-10-01 21:02:11

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [BISECTED] Regression: Solidrun Clearfog Base won't boot since "PCI: mvebu: Only remap I/O space if configured"

On Mon, Oct 01, 2018 at 02:51:48PM +0200, Thomas Petazzoni wrote:
> Hello,
>
> On Mon, 01 Oct 2018 12:56:37 +0200, Jan Kundr?t wrote:
>
> > Thomas, Russell, Lorenzo,
> > did you have time to convert this into a patch which can hit 4.19? I don't
> > see anything related in 4.19-rc6, but perhaps I missed something. Is there
> > something that I should test or otherwise help?
>
> Sorry, I suddenly got busy (my second son arrived a few days earlier
> than expected).

Congratulations!

> I just sent a proper patch with the proposal I made last week, after
> testing on ClearFog and Armada XP GP. Note that on ClearFog, I only
> tested that it fixes the panic at boot, since I didn't had any
> mini-PCIe devices at hand. On Armada XP GP, I verified that an E1000E
> NIC was still working as expected. Therefore, it would be useful if
> you could test on your ClearFog platform with PCI devices connected.
>
> Thanks a lot and sorry for the delay.

And thanks for the patch! No need to apologize for having a life :)

Bjorn