Forwarding to NVMe folks, lists for visibility.
----- Forwarded message from [email protected] -----
https://bugzilla.kernel.org/show_bug.cgi?id=217251
...
Created attachment 304031
--> https://bugzilla.kernel.org/attachment.cgi?id=304031&action=edit
the tracing of nvme_pci_enable() during re-insertion
Hi,
There is a JHL7540-based device that may host an NVMe device. After the first
insertion an NVMe drive is properly discovered and handled by the relevant
modules. Once disconnected, any further insertion attempts are unsuccessful.
The device is visible on the PCI bus, but nvme_pci_enable() ends up calling
pci_disable_device() every time; the runtime PM status of the device is
"suspended", and the power state of the 04:01.0 PCI bridge is D3. Preventing
the device from being power managed (writing "on" to
/sys/devices/../power/control), combined with device removal and a PCI rescan,
changes nothing. A host reboot restores the initial state.

I would appreciate any suggestions on how to debug this further.
On Mon, Mar 27, 2023 at 09:33:59AM -0500, Bjorn Helgaas wrote:
> Forwarding to NVMe folks, lists for visibility.
>
> ----- Forwarded message from [email protected] -----
>
> https://bugzilla.kernel.org/show_bug.cgi?id=217251
> ...
>
> Created attachment 304031
> --> https://bugzilla.kernel.org/attachment.cgi?id=304031&action=edit
> the tracing of nvme_pci_enable() during re-insertion
>
> Hi,
>
> There is a JHL7540-based device that may host an NVMe device. After the first
> insertion an NVMe drive is properly discovered and handled by the relevant
> modules. Once disconnected, any further insertion attempts are unsuccessful.
> The device is visible on the PCI bus, but nvme_pci_enable() ends up calling
> pci_disable_device() every time; the runtime PM status of the device is
> "suspended", and the power state of the 04:01.0 PCI bridge is D3. Preventing
> the device from being power managed (writing "on" to
> /sys/devices/../power/control), combined with device removal and a PCI rescan,
> changes nothing. A host reboot restores the initial state.
>
> I would appreciate any suggestions on how to debug this further.
Sounds the same as this report:
http://lists.infradead.org/pipermail/linux-nvme/2023-March/038259.html
The driver is bailing on the device because we can't read its status register
out of the remapped BAR. There's nothing we can do about that from the nvme
driver level. Memory-mapped I/O has to work in order to proceed.
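For reference, the failure boils down to roughly this check early in
nvme_pci_enable() (a simplified sketch with the surrounding setup and error
handling trimmed, not the exact upstream code):

#include <linux/errno.h>
#include <linux/io.h>
#include <linux/pci.h>

#define NVME_REG_CSTS   0x1c    /* controller status register offset */

/*
 * Simplified sketch: after pci_enable_device_mem() and the BAR remap,
 * the driver sanity-checks the controller by reading CSTS through the
 * mapped BAR.  If the link is down or the BAR is no longer decoded, the
 * read comes back as all ones, the check fails, and the error path ends
 * up in pci_disable_device() -- matching what the report describes.
 */
static int nvme_pci_enable_sketch(void __iomem *bar)
{
        if (readl(bar + NVME_REG_CSTS) == -1)
                return -ENODEV; /* MMIO is dead; give up on the device */

        /* ...interrupt and admin queue setup would continue here... */
        return 0;
}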
On Mon, Mar 27, 2023 at 05:43:18PM +0000, Aleksander Trofimowicz wrote:
>
> Keith Busch <[email protected]> writes:
>
> > On Mon, Mar 27, 2023 at 09:33:59AM -0500, Bjorn Helgaas wrote:
> >> Forwarding to NVMe folks, lists for visibility.
> >>
> >> ----- Forwarded message from [email protected] -----
> >>
> >> https://bugzilla.kernel.org/show_bug.cgi?id=217251
> >> ...
> >>
> >> Created attachment 304031
> >> --> https://bugzilla.kernel.org/attachment.cgi?id=304031&action=edit
> >> the tracing of nvme_pci_enable() during re-insertion
> >>
> >> Hi,
> >>
> >> There is a JHL7540-based device that may host an NVMe device. After the first
> >> insertion an NVMe drive is properly discovered and handled by the relevant
> >> modules. Once disconnected, any further insertion attempts are unsuccessful.
> >> The device is visible on the PCI bus, but nvme_pci_enable() ends up calling
> >> pci_disable_device() every time; the runtime PM status of the device is
> >> "suspended", and the power state of the 04:01.0 PCI bridge is D3. Preventing
> >> the device from being power managed (writing "on" to
> >> /sys/devices/../power/control), combined with device removal and a PCI rescan,
> >> changes nothing. A host reboot restores the initial state.
> >>
> >> I would appreciate any suggestions on how to debug this further.
> >
> > Sounds the same as this report:
> >
> > http://lists.infradead.org/pipermail/linux-nvme/2023-March/038259.html
> >
> > The driver is bailing on the device because we can't read its status register
> > out of the remapped BAR. There's nothing we can do about that from the nvme
> > driver level. Memory-mapped I/O has to work in order to proceed.
> >
> Thanks. I can confirm it is the same problem:
>
> a) the platform is Intel Alderlake
> b) readl(dev->bar + NVME_REG_CSTS) in nvme_pci_enable() fails
> c) reading BAR0 via setpci gives 0x00000004
It's strange, too. In your example, the kernel says:

  0000:05:00.0: BAR 0: assigned [mem 0x54000000-0x54003fff 64bit]

There is a check right after that message that ensures the kernel reads back
what it wrote. Since no failure was reported, the device really did have the
expected BAR value at one point.
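The check is essentially a write-then-readback of the BAR in config space,
something along these lines (a simplified sketch of what the PCI core does
when it updates a memory BAR, using the standard config-space accessors; not
the exact upstream code):

#include <linux/pci.h>

/*
 * Simplified sketch of the write-then-verify pattern the PCI core uses
 * when it programs a memory BAR: write the new base, read it back, and
 * complain if the device did not latch the value.
 */
static void pci_update_bar_sketch(struct pci_dev *dev, int bar, u32 val)
{
        int reg = PCI_BASE_ADDRESS_0 + 4 * bar;
        u32 check;

        pci_write_config_dword(dev, reg, val);
        pci_read_config_dword(dev, reg, &check);

        if ((val ^ check) & PCI_BASE_ADDRESS_MEM_MASK)
                pci_err(dev, "BAR %d: error updating (%#010x != %#010x)\n",
                        bar, val, check);
}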
Keith Busch <[email protected]> writes:
> On Mon, Mar 27, 2023 at 09:33:59AM -0500, Bjorn Helgaas wrote:
>> Forwarding to NVMe folks, lists for visibility.
>>
>> ----- Forwarded message from [email protected] -----
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=217251
>> ...
>>
>> Created attachment 304031
>> --> https://bugzilla.kernel.org/attachment.cgi?id=304031&action=edit
>> the tracing of nvme_pci_enable() during re-insertion
>>
>> Hi,
>>
>> There is a JHL7540-based device that may host an NVMe device. After the first
>> insertion an NVMe drive is properly discovered and handled by the relevant
>> modules. Once disconnected, any further insertion attempts are unsuccessful.
>> The device is visible on the PCI bus, but nvme_pci_enable() ends up calling
>> pci_disable_device() every time; the runtime PM status of the device is
>> "suspended", and the power state of the 04:01.0 PCI bridge is D3. Preventing
>> the device from being power managed (writing "on" to
>> /sys/devices/../power/control), combined with device removal and a PCI rescan,
>> changes nothing. A host reboot restores the initial state.
>>
>> I would appreciate any suggestions on how to debug this further.
>
> Sounds the same as this report:
>
> http://lists.infradead.org/pipermail/linux-nvme/2023-March/038259.html
>
> The driver is bailing on the device because we can't read its status register
> out of the remapped BAR. There's nothing we can do about that from the nvme
> driver level. Memory-mapped I/O has to work in order to proceed.
>
Thanks. I can confirm it is the same problem:
a) the platform is Intel Alderlake
b) readl(dev->bar + NVME_REG_CSTS) in nvme_pci_enable() fails
c) reading BAR0 via setpci gives 0x00000004
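Decoding that raw dword with the standard BAR bit masks (a quick user-space
check; it assumes the uapi <linux/pci_regs.h> header is installed) shows only
the 64-bit memory type flag set and all base address bits zero, i.e. the BAR
no longer holds the base it was assigned:

#include <stdio.h>
#include <stdint.h>
#include <linux/pci_regs.h>

/* Decode the raw BAR0 dword that setpci returned after re-insertion. */
int main(void)
{
        uint32_t bar0 = 0x00000004;

        if (bar0 & PCI_BASE_ADDRESS_SPACE_IO) {
                printf("I/O BAR, base %#lx\n",
                       (unsigned long)(bar0 & PCI_BASE_ADDRESS_IO_MASK));
                return 0;
        }

        printf("memory BAR, %s, %sprefetchable, base %#lx\n",
               (bar0 & PCI_BASE_ADDRESS_MEM_TYPE_MASK) ==
                        PCI_BASE_ADDRESS_MEM_TYPE_64 ? "64-bit" : "32-bit",
               (bar0 & PCI_BASE_ADDRESS_MEM_PREFETCH) ? "" : "non-",
               (unsigned long)(bar0 & PCI_BASE_ADDRESS_MEM_MASK));
        return 0;
}

/* prints: memory BAR, 64-bit, non-prefetchable, base 0 */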
--
at