LinuxLists.cc - I/O page faults from 8250_mid PCIe UART after TIOCVHANGUP

[permalink] [raw]

Subject: Re: I/O page faults from 8250_mid PCIe UART after TIOCVHANGUP

+Cc: Ilpo

On Wed, Sep 14, 2022 at 10:15:02AM +0300, Lennert Buytenhek wrote:
> Hi,
>
> On an Intel SoC with several 8250_mid PCIe UARTs built into the CPU, I
> can reliably trigger I/O page faults if I invoke TIOCVHANGUP on any of
> the UARTs and then re-open that UART.
>
> Invoking TIOCVHANGUP appears to clear the MSI address/data registers
> in the UART via tty_ioctl() -> tty_vhangup() -> __tty_hangup() ->
> uart_hangup() -> uart_shutdown() -> uart_port_shutdown() ->
> univ8250_release_irq() -> free_irq() -> irq_domain_deactivate_irq() ->
> __irq_domain_deactivate_irq() -> msi_domain_deactivate() ->
> __pci_write_msi_msg():
>
> [root@icelake ~]# lspci -s 00:1a.0 -vv | grep -A1 MSI:
> Capabilities: [40] MSI: Enable+ Count=1/1 Maskable- 64bit-
> Address: fee00278 Data: 0000
> [root@icelake ~]# cat hangup.c
> #include <stdio.h>
> #include <sys/ioctl.h>
>
> int main(int argc, char *argv[])
> {
> ioctl(1, TIOCVHANGUP);
>
> return 0;
> }
> [root@icelake ~]# gcc -Wall -o hangup hangup.c
> [root@icelake ~]# ./hangup > /dev/ttyS4
> [root@icelake ~]# lspci -s 00:1a.0 -vv | grep -A1 MSI:
> Capabilities: [40] MSI: Enable+ Count=1/1 Maskable- 64bit-
> Address: 00000000 Data: 0000
> [root@icelake ~]#
>
> Opening the serial port device again while the UART is in this state
> then appears to cause the UART to generate an interrupt

The interrupt is ORed three: DMA Tx, DMA Rx and UART itself.
Any of them can be possible, but to be sure, can you add:

dev_info(p->dev, "FISR: %x\n", fisr);

into dnv_handle_irq() before any other code and see which bits we actually got
there before the crash?

(If it floods the logs, dev_info_ratelimited() may help)

> before the
> MSI vector has been set up again, causing a DMA write to I/O virtual
> address zero:
>
> [root@icelake console]# echo > /dev/ttyS4
> [ 979.463307] DMAR: DRHD: handling fault status reg 3
> [ 979.469409] DMAR: [DMA Write NO_PASID] Request device [00:1a.0] fault addr 0x0 [fault reason 0x05] PTE Write access is not set
>
> I'm guessing there's something under tty_open() -> uart_open() ->
> tty_port_open() -> uart_port_activate() -> uart_port_startup() ->
> serial8250_do_startup() that triggers a UART interrupt before the
> MSI vector has been set up again.
>
> I did a quick search but it didn't seem like this is a known issue.

Thanks for your report and reproducer! Yes, I also never heard about such an
issue before. Ilpo, who is doing more UART work nowadays, might have an idea,
I hope.

I'm a bit busy with other stuff right now.

--
With Best Regards,
Andy Shevchenko

2022-09-14 11:23:14

[permalink] [raw]

Subject: Re: I/O page faults from 8250_mid PCIe UART after TIOCVHANGUP

On Wed, Sep 14, 2022 at 01:09:40PM +0300, Andy Shevchenko wrote:

> > On an Intel SoC with several 8250_mid PCIe UARTs built into the CPU, I
> > can reliably trigger I/O page faults if I invoke TIOCVHANGUP on any of
> > the UARTs and then re-open that UART.
> >
> > Invoking TIOCVHANGUP appears to clear the MSI address/data registers
> > in the UART via tty_ioctl() -> tty_vhangup() -> __tty_hangup() ->
> > uart_hangup() -> uart_shutdown() -> uart_port_shutdown() ->
> > univ8250_release_irq() -> free_irq() -> irq_domain_deactivate_irq() ->
> > __irq_domain_deactivate_irq() -> msi_domain_deactivate() ->
> > __pci_write_msi_msg():
> >
> > [root@icelake ~]# lspci -s 00:1a.0 -vv | grep -A1 MSI:
> > Capabilities: [40] MSI: Enable+ Count=1/1 Maskable- 64bit-
> > Address: fee00278 Data: 0000
> > [root@icelake ~]# cat hangup.c
> > #include <stdio.h>
> > #include <sys/ioctl.h>
> >
> > int main(int argc, char *argv[])
> > {
> > ioctl(1, TIOCVHANGUP);
> >
> > return 0;
> > }
> > [root@icelake ~]# gcc -Wall -o hangup hangup.c
> > [root@icelake ~]# ./hangup > /dev/ttyS4
> > [root@icelake ~]# lspci -s 00:1a.0 -vv | grep -A1 MSI:
> > Capabilities: [40] MSI: Enable+ Count=1/1 Maskable- 64bit-
> > Address: 00000000 Data: 0000
> > [root@icelake ~]#
> >
> > Opening the serial port device again while the UART is in this state
> > then appears to cause the UART to generate an interrupt
>
> The interrupt is ORed three: DMA Tx, DMA Rx and UART itself.
> Any of them can be possible, but to be sure, can you add:
>
> dev_info(p->dev, "FISR: %x\n", fisr);
>
> into dnv_handle_irq() before any other code and see which bits we
> actually got there before the crash?
>
> (If it floods the logs, dev_info_ratelimited() may help)

I think that that wouldn't report anything because when the UART is
triggering an interrupt here, the MSI address/data are zero, so the
IRQ handler is not actually invoked.

If Ilpo doesn't beat me to it, I'll try adding some debug code to see
exactly which UART register write in the tty open path is causing the
UART to signal an interrupt before the IRQ handler is set up.

(The IOMMU stops the write in this case, so the machine doesn't crash,
we just get an I/O page fault warning in dmesg every time this happens.)

> > before the
> > MSI vector has been set up again, causing a DMA write to I/O virtual
> > address zero:
> >
> > [root@icelake console]# echo > /dev/ttyS4
> > [ 979.463307] DMAR: DRHD: handling fault status reg 3
> > [ 979.469409] DMAR: [DMA Write NO_PASID] Request device [00:1a.0] fault addr 0x0 [fault reason 0x05] PTE Write access is not set
> >
> > I'm guessing there's something under tty_open() -> uart_open() ->
> > tty_port_open() -> uart_port_activate() -> uart_port_startup() ->
> > serial8250_do_startup() that triggers a UART interrupt before the
> > MSI vector has been set up again.
> >
> > I did a quick search but it didn't seem like this is a known issue.
>
> Thanks for your report and reproducer! Yes, I also never heard about
> such an issue before. Ilpo, who is doing more UART work nowadays, might
> have an idea, I hope.

Thank you!

2022-09-14 13:17:43

[permalink] [raw]

Subject: Re: I/O page faults from 8250_mid PCIe UART after TIOCVHANGUP

On Wed, Sep 14, 2022 at 02:10:44PM +0300, Lennert Buytenhek wrote:
> On Wed, Sep 14, 2022 at 01:09:40PM +0300, Andy Shevchenko wrote:
> > > On an Intel SoC with several 8250_mid PCIe UARTs built into the CPU, I
> > > can reliably trigger I/O page faults if I invoke TIOCVHANGUP on any of
> > > the UARTs and then re-open that UART.
> > >
> > > Invoking TIOCVHANGUP appears to clear the MSI address/data registers
> > > in the UART via tty_ioctl() -> tty_vhangup() -> __tty_hangup() ->
> > > uart_hangup() -> uart_shutdown() -> uart_port_shutdown() ->
> > > univ8250_release_irq() -> free_irq() -> irq_domain_deactivate_irq() ->
> > > __irq_domain_deactivate_irq() -> msi_domain_deactivate() ->
> > > __pci_write_msi_msg():
> > >
> > > [root@icelake ~]# lspci -s 00:1a.0 -vv | grep -A1 MSI:
> > > Capabilities: [40] MSI: Enable+ Count=1/1 Maskable- 64bit-
> > > Address: fee00278 Data: 0000
> > > [root@icelake ~]# cat hangup.c
> > > #include <stdio.h>
> > > #include <sys/ioctl.h>
> > >
> > > int main(int argc, char *argv[])
> > > {
> > > ioctl(1, TIOCVHANGUP);
> > >
> > > return 0;
> > > }
> > > [root@icelake ~]# gcc -Wall -o hangup hangup.c
> > > [root@icelake ~]# ./hangup > /dev/ttyS4
> > > [root@icelake ~]# lspci -s 00:1a.0 -vv | grep -A1 MSI:
> > > Capabilities: [40] MSI: Enable+ Count=1/1 Maskable- 64bit-
> > > Address: 00000000 Data: 0000
> > > [root@icelake ~]#
> > >
> > > Opening the serial port device again while the UART is in this state
> > > then appears to cause the UART to generate an interrupt
> >
> > The interrupt is ORed three: DMA Tx, DMA Rx and UART itself.
> > Any of them can be possible, but to be sure, can you add:
> >
> > dev_info(p->dev, "FISR: %x\n", fisr);
> >
> > into dnv_handle_irq() before any other code and see which bits we
> > actually got there before the crash?
> >
> > (If it floods the logs, dev_info_ratelimited() may help)
>
> I think that that wouldn't report anything because when the UART is
> triggering an interrupt here, the MSI address/data are zero, so the
> IRQ handler is not actually invoked.

Ah, indeed. Then you may disable MSI (in 8250_mid) and see that anyway?

> If Ilpo doesn't beat me to it, I'll try adding some debug code to see
> exactly which UART register write in the tty open path is causing the
> UART to signal an interrupt before the IRQ handler is set up.
>
> (The IOMMU stops the write in this case, so the machine doesn't crash,
> we just get an I/O page fault warning in dmesg every time this happens.)

And I believe you are not using that UART as debug console, so it won't
dead lock itself. It's then better than I assumed.

> > > before the
> > > MSI vector has been set up again, causing a DMA write to I/O virtual
> > > address zero:
> > >
> > > [root@icelake console]# echo > /dev/ttyS4
> > > [ 979.463307] DMAR: DRHD: handling fault status reg 3
> > > [ 979.469409] DMAR: [DMA Write NO_PASID] Request device [00:1a.0] fault addr 0x0 [fault reason 0x05] PTE Write access is not set
> > >
> > > I'm guessing there's something under tty_open() -> uart_open() ->
> > > tty_port_open() -> uart_port_activate() -> uart_port_startup() ->
> > > serial8250_do_startup() that triggers a UART interrupt before the
> > > MSI vector has been set up again.
> > >
> > > I did a quick search but it didn't seem like this is a known issue.
> >
> > Thanks for your report and reproducer! Yes, I also never heard about
> > such an issue before. Ilpo, who is doing more UART work nowadays, might
> > have an idea, I hope.
>
> Thank you!

--
With Best Regards,
Andy Shevchenko

2022-09-15 16:48:17

[permalink] [raw]

Subject: Re: I/O page faults from 8250_mid PCIe UART after TIOCVHANGUP

On Wed, 14 Sep 2022, Andy Shevchenko wrote:

> On Wed, Sep 14, 2022 at 02:10:44PM +0300, Lennert Buytenhek wrote:
> > On Wed, Sep 14, 2022 at 01:09:40PM +0300, Andy Shevchenko wrote:
> > > > On an Intel SoC with several 8250_mid PCIe UARTs built into the CPU, I
> > > > can reliably trigger I/O page faults if I invoke TIOCVHANGUP on any of
> > > > the UARTs and then re-open that UART.
> > > >
> > > > Invoking TIOCVHANGUP appears to clear the MSI address/data registers
> > > > in the UART via tty_ioctl() -> tty_vhangup() -> __tty_hangup() ->
> > > > uart_hangup() -> uart_shutdown() -> uart_port_shutdown() ->
> > > > univ8250_release_irq() -> free_irq() -> irq_domain_deactivate_irq() ->
> > > > __irq_domain_deactivate_irq() -> msi_domain_deactivate() ->
> > > > __pci_write_msi_msg():
> > > >
> > > > [root@icelake ~]# lspci -s 00:1a.0 -vv | grep -A1 MSI:
> > > > Capabilities: [40] MSI: Enable+ Count=1/1 Maskable- 64bit-
> > > > Address: fee00278 Data: 0000
> > > > [root@icelake ~]# cat hangup.c
> > > > #include <stdio.h>
> > > > #include <sys/ioctl.h>
> > > >
> > > > int main(int argc, char *argv[])
> > > > {
> > > > ioctl(1, TIOCVHANGUP);
> > > >
> > > > return 0;
> > > > }
> > > > [root@icelake ~]# gcc -Wall -o hangup hangup.c
> > > > [root@icelake ~]# ./hangup > /dev/ttyS4
> > > > [root@icelake ~]# lspci -s 00:1a.0 -vv | grep -A1 MSI:
> > > > Capabilities: [40] MSI: Enable+ Count=1/1 Maskable- 64bit-
> > > > Address: 00000000 Data: 0000
> > > > [root@icelake ~]#
> > > >
> > > > Opening the serial port device again while the UART is in this state
> > > > then appears to cause the UART to generate an interrupt
> > >
> > > The interrupt is ORed three: DMA Tx, DMA Rx and UART itself.
> > > Any of them can be possible, but to be sure, can you add:
> > >
> > > dev_info(p->dev, "FISR: %x\n", fisr);
> > >
> > > into dnv_handle_irq() before any other code and see which bits we
> > > actually got there before the crash?
> > >
> > > (If it floods the logs, dev_info_ratelimited() may help)
> >
> > I think that that wouldn't report anything because when the UART is
> > triggering an interrupt here, the MSI address/data are zero, so the
> > IRQ handler is not actually invoked.
>
> Ah, indeed. Then you may disable MSI (in 8250_mid) and see that anyway?
>
> > If Ilpo doesn't beat me to it, I'll try adding some debug code to see
> > exactly which UART register write in the tty open path is causing the
> > UART to signal an interrupt before the IRQ handler is set up.
> >
> > (The IOMMU stops the write in this case, so the machine doesn't crash,
> > we just get an I/O page fault warning in dmesg every time this happens.)
>
> And I believe you are not using that UART as debug console, so it won't
> dead lock itself. It's then better than I assumed.
>
> > > > before the
> > > > MSI vector has been set up again, causing a DMA write to I/O virtual
> > > > address zero:
> > > >
> > > > [root@icelake console]# echo > /dev/ttyS4
> > > > [ 979.463307] DMAR: DRHD: handling fault status reg 3
> > > > [ 979.469409] DMAR: [DMA Write NO_PASID] Request device [00:1a.0] fault addr 0x0 [fault reason 0x05] PTE Write access is not set
> > > >
> > > > I'm guessing there's something under tty_open() -> uart_open() ->
> > > > tty_port_open() -> uart_port_activate() -> uart_port_startup() ->
> > > > serial8250_do_startup() that triggers a UART interrupt before the
> > > > MSI vector has been set up again.
> > > >
> > > > I did a quick search but it didn't seem like this is a known issue.
> > >
> > > Thanks for your report and reproducer! Yes, I also never heard about
> > > such an issue before. Ilpo, who is doing more UART work nowadays, might
> > > have an idea, I hope.

The patch below seems to avoid the faults. I'm far from sure if it's the
best fix though as I don't fully understand what causes the faults during
the THRE tests because the port->irq is disabled by the THRE test block.

--
From: Ilpo J?rvinen <[email protected]>
[PATCH] serial: 8250: Turn IER bits on only after irq has been set up

Invoking TIOCVHANGUP on 8250_mid port and then reopening the
port triggers these faults during serial8250_do_startup():

DMAR: DRHD: handling fault status reg 3
DMAR: [DMA Write NO_PASID] Request device [00:1a.0] fault addr 0x0 [fault reason 0x05] PTE Write access is not set

The faults are triggered by the THRE test that temporarily toggles THRI
in IER before UART's irq is set up.

Refactor serial8250_do_startup() such that irq is setup before the THRE
test. The current irq setup code is intermixed with the timer setup
code. As THRE test must be performed prior to the timer setup, extract
it into own function and call it only after the THRE test.

Reported-by: Lennert Buytenhek <[email protected]>
Fixes: 40b36daad0ac ("[PATCH] 8250 UART backup timer")
Signed-off-by: Ilpo J?rvinen <[email protected]>

---
drivers/tty/serial/8250/8250.h | 2 ++
drivers/tty/serial/8250/8250_core.c | 23 ++++++++++++-----------
drivers/tty/serial/8250/8250_port.c | 8 +++++---
3 files changed, 19 insertions(+), 14 deletions(-)

diff --git a/drivers/tty/serial/8250/8250.h b/drivers/tty/serial/8250/8250.h
index 287153d32536..dbf4c1204bf3 100644
--- a/drivers/tty/serial/8250/8250.h
+++ b/drivers/tty/serial/8250/8250.h
@@ -403,3 +403,5 @@ static inline int serial_index(struct uart_port *port)
{
return port->minor - 64;
}
+
+void univ8250_setup_timer(struct uart_8250_port *up);
diff --git a/drivers/tty/serial/8250/8250_core.c b/drivers/tty/serial/8250/8250_core.c
index 2e83e7367441..e81a9cab6960 100644
--- a/drivers/tty/serial/8250/8250_core.c
+++ b/drivers/tty/serial/8250/8250_core.c
@@ -298,21 +298,15 @@ static void serial8250_backup_timeout(struct timer_list *t)
jiffies + uart_poll_timeout(&up->port) + HZ / 5);
}

-static int univ8250_setup_irq(struct uart_8250_port *up)
+void univ8250_setup_timer(struct uart_8250_port *up)
{
struct uart_port *port = &up->port;
- int retval = 0;

- /*
- * The above check will only give an accurate result the first time
- * the port is opened so this value needs to be preserved.
- */
if (up->bugs & UART_BUG_THRE) {
pr_debug("%s - using backup timer\n", port->name);

up->timer.function = serial8250_backup_timeout;
- mod_timer(&up->timer, jiffies +
- uart_poll_timeout(port) + HZ / 5);
+ mod_timer(&up->timer, jiffies + uart_poll_timeout(port) + HZ / 5);
}

/*
@@ -322,10 +316,17 @@ static int univ8250_setup_irq(struct uart_8250_port *up)
*/
if (!port->irq)
mod_timer(&up->timer, jiffies + uart_poll_timeout(port));
- else
- retval = serial_link_irq_chain(up);
+}
+EXPORT_SYMBOL_GPL(univ8250_setup_timer);

- return retval;
+static int univ8250_setup_irq(struct uart_8250_port *up)
+{
+ struct uart_port *port = &up->port;
+
+ if (port->irq)
+ return serial_link_irq_chain(up);
+
+ return 0;
}

static void univ8250_release_irq(struct uart_8250_port *up)
diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
index 39b35a61958c..6e8e16227a3a 100644
--- a/drivers/tty/serial/8250/8250_port.c
+++ b/drivers/tty/serial/8250/8250_port.c
@@ -2294,6 +2294,10 @@ int serial8250_do_startup(struct uart_port *port)
if (port->irq && (up->port.flags & UPF_SHARE_IRQ))
up->port.irqflags |= IRQF_SHARED;

+ retval = up->ops->setup_irq(up);
+ if (retval)
+ goto out;
+
if (port->irq && !(up->port.flags & UPF_NO_THRE_TEST)) {
unsigned char iir1;

@@ -2336,9 +2340,7 @@ int serial8250_do_startup(struct uart_port *port)
}
}

- retval = up->ops->setup_irq(up);
- if (retval)
- goto out;
+ univ8250_setup_timer(up);

/*
* Now, initialize the UART

--
tg: (1d10cd4da593..) 8250/setup-irq-fix (depends on: tty-linus)

2022-09-16 11:49:33

[permalink] [raw]

Subject: Re: I/O page faults from 8250_mid PCIe UART after TIOCVHANGUP

On Thu, Sep 15, 2022 at 07:27:45PM +0300, Ilpo J?rvinen wrote:

> > > > > On an Intel SoC with several 8250_mid PCIe UARTs built into the CPU, I
> > > > > can reliably trigger I/O page faults if I invoke TIOCVHANGUP on any of
> > > > > the UARTs and then re-open that UART.
> > > > >
> > > > > Invoking TIOCVHANGUP appears to clear the MSI address/data registers
> > > > > in the UART via tty_ioctl() -> tty_vhangup() -> __tty_hangup() ->
> > > > > uart_hangup() -> uart_shutdown() -> uart_port_shutdown() ->
> > > > > univ8250_release_irq() -> free_irq() -> irq_domain_deactivate_irq() ->
> > > > > __irq_domain_deactivate_irq() -> msi_domain_deactivate() ->
> > > > > __pci_write_msi_msg():
> > > > >
> > > > > [root@icelake ~]# lspci -s 00:1a.0 -vv | grep -A1 MSI:
> > > > > Capabilities: [40] MSI: Enable+ Count=1/1 Maskable- 64bit-
> > > > > Address: fee00278 Data: 0000
> > > > > [root@icelake ~]# cat hangup.c
> > > > > #include <stdio.h>
> > > > > #include <sys/ioctl.h>
> > > > >
> > > > > int main(int argc, char *argv[])
> > > > > {
> > > > > ioctl(1, TIOCVHANGUP);
> > > > >
> > > > > return 0;
> > > > > }
> > > > > [root@icelake ~]# gcc -Wall -o hangup hangup.c
> > > > > [root@icelake ~]# ./hangup > /dev/ttyS4
> > > > > [root@icelake ~]# lspci -s 00:1a.0 -vv | grep -A1 MSI:
> > > > > Capabilities: [40] MSI: Enable+ Count=1/1 Maskable- 64bit-
> > > > > Address: 00000000 Data: 0000
> > > > > [root@icelake ~]#
> > > > >
> > > > > Opening the serial port device again while the UART is in this state
> > > > > then appears to cause the UART to generate an interrupt
> > > >
> > > > The interrupt is ORed three: DMA Tx, DMA Rx and UART itself.
> > > > Any of them can be possible, but to be sure, can you add:
> > > >
> > > > dev_info(p->dev, "FISR: %x\n", fisr);
> > > >
> > > > into dnv_handle_irq() before any other code and see which bits we
> > > > actually got there before the crash?
> > > >
> > > > (If it floods the logs, dev_info_ratelimited() may help)
> > >
> > > I think that that wouldn't report anything because when the UART is
> > > triggering an interrupt here, the MSI address/data are zero, so the
> > > IRQ handler is not actually invoked.
> >
> > Ah, indeed. Then you may disable MSI (in 8250_mid) and see that anyway?
> >
> > > If Ilpo doesn't beat me to it, I'll try adding some debug code to see
> > > exactly which UART register write in the tty open path is causing the
> > > UART to signal an interrupt before the IRQ handler is set up.
> > >
> > > (The IOMMU stops the write in this case, so the machine doesn't crash,
> > > we just get an I/O page fault warning in dmesg every time this happens.)
> >
> > And I believe you are not using that UART as debug console, so it won't
> > dead lock itself. It's then better than I assumed.
> >
> > > > > before the
> > > > > MSI vector has been set up again, causing a DMA write to I/O virtual
> > > > > address zero:
> > > > >
> > > > > [root@icelake console]# echo > /dev/ttyS4
> > > > > [ 979.463307] DMAR: DRHD: handling fault status reg 3
> > > > > [ 979.469409] DMAR: [DMA Write NO_PASID] Request device [00:1a.0] fault addr 0x0 [fault reason 0x05] PTE Write access is not set
> > > > >
> > > > > I'm guessing there's something under tty_open() -> uart_open() ->
> > > > > tty_port_open() -> uart_port_activate() -> uart_port_startup() ->
> > > > > serial8250_do_startup() that triggers a UART interrupt before the
> > > > > MSI vector has been set up again.
> > > > >
> > > > > I did a quick search but it didn't seem like this is a known issue.
> > > >
> > > > Thanks for your report and reproducer! Yes, I also never heard about
> > > > such an issue before. Ilpo, who is doing more UART work nowadays, might
> > > > have an idea, I hope.
>
> The patch below seems to avoid the faults. [...]

Thanks for the fix!

> [...] I'm far from sure if it's the
> best fix though as I don't fully understand what causes the faults during
> the THRE tests because the port->irq is disabled by the THRE test block.

If the IRQ hasn't been set up yet, the UART will have zeroes in its MSI
address/data registers. Disabling the IRQ at the interrupt controller
won't stop the UART from performing a DMA write to the address programmed
in its MSI address register (zero) when it wants to signal an interrupt.

(These UARTs (in Ice Lake-D) implement PCI 2.1 style MSI without masking
capability, so there is no way to mask the interrupt at the source PCI
function level, except disabling the MSI capability entirely, but that
would cause it to fall back to INTx# assertion, and the PCI specification
prohibits disabling the MSI capability as a way to mask a function's
interrupt service request.)

> Reported-by: Lennert Buytenhek <[email protected]>

Could you make this [email protected] ?

2022-09-16 12:11:41

[permalink] [raw]

Subject: Re: I/O page faults from 8250_mid PCIe UART after TIOCVHANGUP

On Fri, 16 Sep 2022, Lennert Buytenhek wrote:

> On Thu, Sep 15, 2022 at 07:27:45PM +0300, Ilpo J?rvinen wrote:
>
> > > > > > On an Intel SoC with several 8250_mid PCIe UARTs built into the CPU, I
> > > > > > can reliably trigger I/O page faults if I invoke TIOCVHANGUP on any of
> > > > > > the UARTs and then re-open that UART.
> > > > > >
> > > > > > Invoking TIOCVHANGUP appears to clear the MSI address/data registers
> > > > > > in the UART via tty_ioctl() -> tty_vhangup() -> __tty_hangup() ->
> > > > > > uart_hangup() -> uart_shutdown() -> uart_port_shutdown() ->
> > > > > > univ8250_release_irq() -> free_irq() -> irq_domain_deactivate_irq() ->
> > > > > > __irq_domain_deactivate_irq() -> msi_domain_deactivate() ->
> > > > > > __pci_write_msi_msg():
> > > > > >
> > > > > > [root@icelake ~]# lspci -s 00:1a.0 -vv | grep -A1 MSI:
> > > > > > Capabilities: [40] MSI: Enable+ Count=1/1 Maskable- 64bit-
> > > > > > Address: fee00278 Data: 0000
> > > > > > [root@icelake ~]# cat hangup.c
> > > > > > #include <stdio.h>
> > > > > > #include <sys/ioctl.h>
> > > > > >
> > > > > > int main(int argc, char *argv[])
> > > > > > {
> > > > > > ioctl(1, TIOCVHANGUP);
> > > > > >
> > > > > > return 0;
> > > > > > }
> > > > > > [root@icelake ~]# gcc -Wall -o hangup hangup.c
> > > > > > [root@icelake ~]# ./hangup > /dev/ttyS4
> > > > > > [root@icelake ~]# lspci -s 00:1a.0 -vv | grep -A1 MSI:
> > > > > > Capabilities: [40] MSI: Enable+ Count=1/1 Maskable- 64bit-
> > > > > > Address: 00000000 Data: 0000
> > > > > > [root@icelake ~]#
> > > > > >
> > > > > > Opening the serial port device again while the UART is in this state
> > > > > > then appears to cause the UART to generate an interrupt
> > > > >
> > > > > The interrupt is ORed three: DMA Tx, DMA Rx and UART itself.
> > > > > Any of them can be possible, but to be sure, can you add:
> > > > >
> > > > > dev_info(p->dev, "FISR: %x\n", fisr);
> > > > >
> > > > > into dnv_handle_irq() before any other code and see which bits we
> > > > > actually got there before the crash?
> > > > >
> > > > > (If it floods the logs, dev_info_ratelimited() may help)
> > > >
> > > > I think that that wouldn't report anything because when the UART is
> > > > triggering an interrupt here, the MSI address/data are zero, so the
> > > > IRQ handler is not actually invoked.
> > >
> > > Ah, indeed. Then you may disable MSI (in 8250_mid) and see that anyway?
> > >
> > > > If Ilpo doesn't beat me to it, I'll try adding some debug code to see
> > > > exactly which UART register write in the tty open path is causing the
> > > > UART to signal an interrupt before the IRQ handler is set up.
> > > >
> > > > (The IOMMU stops the write in this case, so the machine doesn't crash,
> > > > we just get an I/O page fault warning in dmesg every time this happens.)
> > >
> > > And I believe you are not using that UART as debug console, so it won't
> > > dead lock itself. It's then better than I assumed.
> > >
> > > > > > before the
> > > > > > MSI vector has been set up again, causing a DMA write to I/O virtual
> > > > > > address zero:
> > > > > >
> > > > > > [root@icelake console]# echo > /dev/ttyS4
> > > > > > [ 979.463307] DMAR: DRHD: handling fault status reg 3
> > > > > > [ 979.469409] DMAR: [DMA Write NO_PASID] Request device [00:1a.0] fault addr 0x0 [fault reason 0x05] PTE Write access is not set
> > > > > >
> > > > > > I'm guessing there's something under tty_open() -> uart_open() ->
> > > > > > tty_port_open() -> uart_port_activate() -> uart_port_startup() ->
> > > > > > serial8250_do_startup() that triggers a UART interrupt before the
> > > > > > MSI vector has been set up again.
> > > > > >
> > > > > > I did a quick search but it didn't seem like this is a known issue.
> > > > >
> > > > > Thanks for your report and reproducer! Yes, I also never heard about
> > > > > such an issue before. Ilpo, who is doing more UART work nowadays, might
> > > > > have an idea, I hope.
> >
> > The patch below seems to avoid the faults. [...]
>
> Thanks for the fix!
>
>
> > [...] I'm far from sure if it's the
> > best fix though as I don't fully understand what causes the faults during
> > the THRE tests because the port->irq is disabled by the THRE test block.
>
> If the IRQ hasn't been set up yet, the UART will have zeroes in its MSI
> address/data registers. Disabling the IRQ at the interrupt controller
> won't stop the UART from performing a DMA write to the address programmed
> in its MSI address register (zero) when it wants to signal an interrupt.
>
> (These UARTs (in Ice Lake-D) implement PCI 2.1 style MSI without masking
> capability, so there is no way to mask the interrupt at the source PCI
> function level, except disabling the MSI capability entirely, but that
> would cause it to fall back to INTx# assertion, and the PCI specification
> prohibits disabling the MSI capability as a way to mask a function's
> interrupt service request.)
>
> > Reported-by: Lennert Buytenhek <[email protected]>
>
> Could you make this [email protected] ?

Sure. Should I add Tested-by as well?

--
i.

2022-09-16 13:41:29

[permalink] [raw]

Subject: Re: I/O page faults from 8250_mid PCIe UART after TIOCVHANGUP

On Fri, Sep 16, 2022 at 03:02:04PM +0300, Ilpo J?rvinen wrote:

> > > > > > > On an Intel SoC with several 8250_mid PCIe UARTs built into the CPU, I
> > > > > > > can reliably trigger I/O page faults if I invoke TIOCVHANGUP on any of
> > > > > > > the UARTs and then re-open that UART.
> > > > > > >
> > > > > > > Invoking TIOCVHANGUP appears to clear the MSI address/data registers
> > > > > > > in the UART via tty_ioctl() -> tty_vhangup() -> __tty_hangup() ->
> > > > > > > uart_hangup() -> uart_shutdown() -> uart_port_shutdown() ->
> > > > > > > univ8250_release_irq() -> free_irq() -> irq_domain_deactivate_irq() ->
> > > > > > > __irq_domain_deactivate_irq() -> msi_domain_deactivate() ->
> > > > > > > __pci_write_msi_msg():
> > > > > > >
> > > > > > > [root@icelake ~]# lspci -s 00:1a.0 -vv | grep -A1 MSI:
> > > > > > > Capabilities: [40] MSI: Enable+ Count=1/1 Maskable- 64bit-
> > > > > > > Address: fee00278 Data: 0000
> > > > > > > [root@icelake ~]# cat hangup.c
> > > > > > > #include <stdio.h>
> > > > > > > #include <sys/ioctl.h>
> > > > > > >
> > > > > > > int main(int argc, char *argv[])
> > > > > > > {
> > > > > > > ioctl(1, TIOCVHANGUP);
> > > > > > >
> > > > > > > return 0;
> > > > > > > }
> > > > > > > [root@icelake ~]# gcc -Wall -o hangup hangup.c
> > > > > > > [root@icelake ~]# ./hangup > /dev/ttyS4
> > > > > > > [root@icelake ~]# lspci -s 00:1a.0 -vv | grep -A1 MSI:
> > > > > > > Capabilities: [40] MSI: Enable+ Count=1/1 Maskable- 64bit-
> > > > > > > Address: 00000000 Data: 0000
> > > > > > > [root@icelake ~]#
> > > > > > >
> > > > > > > Opening the serial port device again while the UART is in this state
> > > > > > > then appears to cause the UART to generate an interrupt
> > > > > >
> > > > > > The interrupt is ORed three: DMA Tx, DMA Rx and UART itself.
> > > > > > Any of them can be possible, but to be sure, can you add:
> > > > > >
> > > > > > dev_info(p->dev, "FISR: %x\n", fisr);
> > > > > >
> > > > > > into dnv_handle_irq() before any other code and see which bits we
> > > > > > actually got there before the crash?
> > > > > >
> > > > > > (If it floods the logs, dev_info_ratelimited() may help)
> > > > >
> > > > > I think that that wouldn't report anything because when the UART is
> > > > > triggering an interrupt here, the MSI address/data are zero, so the
> > > > > IRQ handler is not actually invoked.
> > > >
> > > > Ah, indeed. Then you may disable MSI (in 8250_mid) and see that anyway?
> > > >
> > > > > If Ilpo doesn't beat me to it, I'll try adding some debug code to see
> > > > > exactly which UART register write in the tty open path is causing the
> > > > > UART to signal an interrupt before the IRQ handler is set up.
> > > > >
> > > > > (The IOMMU stops the write in this case, so the machine doesn't crash,
> > > > > we just get an I/O page fault warning in dmesg every time this happens.)
> > > >
> > > > And I believe you are not using that UART as debug console, so it won't
> > > > dead lock itself. It's then better than I assumed.
> > > >
> > > > > > > before the
> > > > > > > MSI vector has been set up again, causing a DMA write to I/O virtual
> > > > > > > address zero:
> > > > > > >
> > > > > > > [root@icelake console]# echo > /dev/ttyS4
> > > > > > > [ 979.463307] DMAR: DRHD: handling fault status reg 3
> > > > > > > [ 979.469409] DMAR: [DMA Write NO_PASID] Request device [00:1a.0] fault addr 0x0 [fault reason 0x05] PTE Write access is not set
> > > > > > >
> > > > > > > I'm guessing there's something under tty_open() -> uart_open() ->
> > > > > > > tty_port_open() -> uart_port_activate() -> uart_port_startup() ->
> > > > > > > serial8250_do_startup() that triggers a UART interrupt before the
> > > > > > > MSI vector has been set up again.
> > > > > > >
> > > > > > > I did a quick search but it didn't seem like this is a known issue.
> > > > > >
> > > > > > Thanks for your report and reproducer! Yes, I also never heard about
> > > > > > such an issue before. Ilpo, who is doing more UART work nowadays, might
> > > > > > have an idea, I hope.
> > >
> > > The patch below seems to avoid the faults. [...]
> >
> > Thanks for the fix!
> >
> >
> > > [...] I'm far from sure if it's the
> > > best fix though as I don't fully understand what causes the faults during
> > > the THRE tests because the port->irq is disabled by the THRE test block.
> >
> > If the IRQ hasn't been set up yet, the UART will have zeroes in its MSI
> > address/data registers. Disabling the IRQ at the interrupt controller
> > won't stop the UART from performing a DMA write to the address programmed
> > in its MSI address register (zero) when it wants to signal an interrupt.
> >
> > (These UARTs (in Ice Lake-D) implement PCI 2.1 style MSI without masking
> > capability, so there is no way to mask the interrupt at the source PCI
> > function level, except disabling the MSI capability entirely, but that
> > would cause it to fall back to INTx# assertion, and the PCI specification
> > prohibits disabling the MSI capability as a way to mask a function's
> > interrupt service request.)

(In other words, disabling the IRQ at the interrupt controller doesn't
prevent the device from signaling an interrupt, and signaling an
interrupt without a proper MSI target address configured in the device's
MSI address register is what is causing the I/O page fault.)

> > > Reported-by: Lennert Buytenhek <[email protected]>
> >
> > Could you make this [email protected] ?
>
> Sure. Should I add Tested-by as well?

OK!

Tested-by: Lennert Buytenhek <[email protected]>

2022-09-19 13:57:13

[permalink] [raw]

Subject: Re: I/O page faults from 8250_mid PCIe UART after TIOCVHANGUP

On Thu, Sep 15, 2022 at 07:27:45PM +0300, Ilpo J?rvinen wrote:
> On Wed, 14 Sep 2022, Andy Shevchenko wrote:
> > On Wed, Sep 14, 2022 at 02:10:44PM +0300, Lennert Buytenhek wrote:
> > > On Wed, Sep 14, 2022 at 01:09:40PM +0300, Andy Shevchenko wrote:

...

> - /*
> - * The above check will only give an accurate result the first time
> - * the port is opened so this value needs to be preserved.
> - */

Side note: I haven't got why you removed this comment (it may be some staled
info, but shouldn't be done in the separate change then?).

--
With Best Regards,
Andy Shevchenko

2022-09-19 14:22:41

[permalink] [raw]

Subject: Re: I/O page faults from 8250_mid PCIe UART after TIOCVHANGUP

On Fri, Sep 16, 2022 at 02:47:08PM +0300, Lennert Buytenhek wrote:
> On Thu, Sep 15, 2022 at 07:27:45PM +0300, Ilpo J?rvinen wrote:

...

> Thanks for the fix!
>
> > [...] I'm far from sure if it's the
> > best fix though as I don't fully understand what causes the faults during
> > the THRE tests because the port->irq is disabled by the THRE test block.
>
> If the IRQ hasn't been set up yet, the UART will have zeroes in its MSI
> address/data registers. Disabling the IRQ at the interrupt controller
> won't stop the UART from performing a DMA write to the address programmed
> in its MSI address register (zero) when it wants to signal an interrupt.
>
> (These UARTs (in Ice Lake-D) implement PCI 2.1 style MSI without masking
> capability, so there is no way to mask the interrupt at the source PCI
> function level, except disabling the MSI capability entirely, but that
> would cause it to fall back to INTx# assertion, and the PCI specification
> prohibits disabling the MSI capability as a way to mask a function's
> interrupt service request.)

This sounds to me like a good part to be injected into commit message of
the proposed fix.

--
With Best Regards,
Andy Shevchenko

2022-09-19 14:27:59

[permalink] [raw]

Subject: Re: I/O page faults from 8250_mid PCIe UART after TIOCVHANGUP

On Mon, 19 Sep 2022, Andy Shevchenko wrote:

> On Thu, Sep 15, 2022 at 07:27:45PM +0300, Ilpo J?rvinen wrote:
> > On Wed, 14 Sep 2022, Andy Shevchenko wrote:
> > > On Wed, Sep 14, 2022 at 02:10:44PM +0300, Lennert Buytenhek wrote:
> > > > On Wed, Sep 14, 2022 at 01:09:40PM +0300, Andy Shevchenko wrote:
>
> ...
>
> > - /*
> > - * The above check will only give an accurate result the first time
> > - * the port is opened so this value needs to be preserved.
> > - */
>
> Side note: I haven't got why you removed this comment (it may be some staled
> info, but shouldn't be done in the separate change then?).

I cleaned up this part in v2 (as you probably noticed). I was just an
artifact of how the fix got initially made.

I've also located the place where the comment belongs to. "The above
check" refers to the THRE test. However, I don't fully understand the
comment itself, that is, why the test is claimed to only work for for the
first time. As long as FIFO is cleared beforehand, I think it should work
on other times too.

--
i.

2022-09-19 14:50:48

[permalink] [raw]

Subject: Re: I/O page faults from 8250_mid PCIe UART after TIOCVHANGUP

On Mon, Sep 19, 2022 at 05:19:35PM +0300, Ilpo J?rvinen wrote:

> > > > [...] I'm far from sure if it's the
> > > > best fix though as I don't fully understand what causes the faults during
> > > > the THRE tests because the port->irq is disabled by the THRE test block.
> > >
> > > If the IRQ hasn't been set up yet, the UART will have zeroes in its MSI
> > > address/data registers. Disabling the IRQ at the interrupt controller
> > > won't stop the UART from performing a DMA write to the address programmed
> > > in its MSI address register (zero) when it wants to signal an interrupt.
> > >
> > > (These UARTs (in Ice Lake-D) implement PCI 2.1 style MSI without masking
> > > capability, so there is no way to mask the interrupt at the source PCI
> > > function level, except disabling the MSI capability entirely, but that
> > > would cause it to fall back to INTx# assertion, and the PCI specification
> > > prohibits disabling the MSI capability as a way to mask a function's
> > > interrupt service request.)
> >
> > This sounds to me like a good part to be injected into commit message of
> > the proposed fix.
>
> I added my own wording already but I could adds of Lennert's far superior
> descriptions verbatim if he is OK with that?

No objections from me.

2022-09-19 14:51:54