For AMD Milan platforms, the delay of 15ms is insufficient to avoid racing
of reboot mechanisms. That said, the AMD Milan processors don't reboot
in 15ms after invoking acpi_reset().
The proposed 50ms delay can effectively work around this issue.
This extended delay aligns better with ACPI v6.4 (i.e., sec. 4.8.4.6),
which indicates that ideally OSPM should execute spin loops on the CPUs
in the system following a write to this register.
Signed-off-by: James Liu <[email protected]>
---
drivers/acpi/reboot.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/acpi/reboot.c b/drivers/acpi/reboot.c
index b79b7c99c237..002f7c7814a1 100644
--- a/drivers/acpi/reboot.c
+++ b/drivers/acpi/reboot.c
@@ -78,5 +78,5 @@ void acpi_reboot(void)
* The 15ms delay has been found to be long enough for the system
* to reboot on the affected platforms.
*/
- mdelay(15);
+ mdelay(50);
}
--
2.40.1
On Wed, Jun 7, 2023 at 5:44 AM James Liu <[email protected]> wrote:
>
> For AMD Milan platforms, the delay of 15ms is insufficient to avoid racing
> of reboot mechanisms. That said, the AMD Milan processors don't reboot
> in 15ms after invoking acpi_reset().
>
> The proposed 50ms delay can effectively work around this issue.
> This extended delay aligns better with ACPI v6.4 (i.e., sec. 4.8.4.6),
> which indicates that ideally OSPM should execute spin loops on the CPUs
> in the system following a write to this register.
>
> Signed-off-by: James Liu <[email protected]>
Why do you want to affect everyone (including guest kernels running in
virtual machines AFAICS) in order to address a problem specific to one
platform?
Wouldn't it be better to quirk that platform and document the quirk properly?
> ---
> drivers/acpi/reboot.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/acpi/reboot.c b/drivers/acpi/reboot.c
> index b79b7c99c237..002f7c7814a1 100644
> --- a/drivers/acpi/reboot.c
> +++ b/drivers/acpi/reboot.c
> @@ -78,5 +78,5 @@ void acpi_reboot(void)
> * The 15ms delay has been found to be long enough for the system
> * to reboot on the affected platforms.
> */
> - mdelay(15);
> + mdelay(50);
> }
> --
> 2.40.1
>
On Wed, Jun 07, 2023 at 01:19:42PM +0200, Rafael J. Wysocki wrote:
> On Wed, Jun 7, 2023 at 5:44 AM James Liu <[email protected]> wrote:
> >
> > For AMD Milan platforms, the delay of 15ms is insufficient to avoid racing
> > of reboot mechanisms. That said, the AMD Milan processors don't reboot
> > in 15ms after invoking acpi_reset().
> >
> > The proposed 50ms delay can effectively work around this issue.
> > This extended delay aligns better with ACPI v6.4 (i.e., sec. 4.8.4.6),
> > which indicates that ideally OSPM should execute spin loops on the CPUs
> > in the system following a write to this register.
> >
> > Signed-off-by: James Liu <[email protected]>
>
> Why do you want to affect everyone (including guest kernels running in
> virtual machines AFAICS) in order to address a problem specific to one
> platform?
I hoped to address this issue for any platform requiring a longer delay to
complete ACPI reset in time for any (maybe silicon-level) reasons. Some AMD Milan
platforms were the cases that we've found so far.
Except that, since ACPI spec indicates there should be a spin loop (long enough)
after the write instruction to Reset Register, I thought it should be no harms to
the other systems which well consider this spin loop when they claim to support
ACPI reboot.
Btw, I am just curious, why is the virtual machine mentioned here?
is the 50ms delay in acpi_reboot() for a guest OS on VM so long that some
unexpected behavior might happen?
> Wouldn't it be better to quirk that platform and document the quirk properly?
Yeah, it could be. Actually we considered this, and we will consider it again.
> > ---
> > drivers/acpi/reboot.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/acpi/reboot.c b/drivers/acpi/reboot.c
> > index b79b7c99c237..002f7c7814a1 100644
> > --- a/drivers/acpi/reboot.c
> > +++ b/drivers/acpi/reboot.c
> > @@ -78,5 +78,5 @@ void acpi_reboot(void)
> > * The 15ms delay has been found to be long enough for the system
> > * to reboot on the affected platforms.
> > */
> > - mdelay(15);
> > + mdelay(50);
> > }
> > --
> > 2.40.1
> >
On Thu, Jun 8, 2023 at 11:14 AM James Liu <[email protected]> wrote:
>
> On Wed, Jun 07, 2023 at 01:19:42PM +0200, Rafael J. Wysocki wrote:
> > On Wed, Jun 7, 2023 at 5:44 AM James Liu <[email protected]> wrote:
> > >
> > > For AMD Milan platforms, the delay of 15ms is insufficient to avoid racing
> > > of reboot mechanisms. That said, the AMD Milan processors don't reboot
> > > in 15ms after invoking acpi_reset().
> > >
> > > The proposed 50ms delay can effectively work around this issue.
> > > This extended delay aligns better with ACPI v6.4 (i.e., sec. 4.8.4.6),
> > > which indicates that ideally OSPM should execute spin loops on the CPUs
> > > in the system following a write to this register.
> > >
> > > Signed-off-by: James Liu <[email protected]>
> >
> > Why do you want to affect everyone (including guest kernels running in
> > virtual machines AFAICS) in order to address a problem specific to one
> > platform?
>
> I hoped to address this issue for any platform requiring a longer delay to
> complete ACPI reset in time for any (maybe silicon-level) reasons. Some AMD Milan
> platforms were the cases that we've found so far.
Do you know about any other?
> Except that, since ACPI spec indicates there should be a spin loop (long enough)
> after the write instruction to Reset Register, I thought it should be no harms to
> the other systems which well consider this spin loop when they claim to support
> ACPI reboot.
>
> Btw, I am just curious, why is the virtual machine mentioned here?
The new delay would be over 3 times larger, so some users might be
surprised by it at least potentially.
> is the 50ms delay in acpi_reboot() for a guest OS on VM so long that some
> unexpected behavior might happen?
>
> > Wouldn't it be better to quirk that platform and document the quirk properly?
>
> Yeah, it could be. Actually we considered this, and we will consider it again.
>
> > > ---
> > > drivers/acpi/reboot.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/acpi/reboot.c b/drivers/acpi/reboot.c
> > > index b79b7c99c237..002f7c7814a1 100644
> > > --- a/drivers/acpi/reboot.c
> > > +++ b/drivers/acpi/reboot.c
> > > @@ -78,5 +78,5 @@ void acpi_reboot(void)
> > > * The 15ms delay has been found to be long enough for the system
> > > * to reboot on the affected platforms.
> > > */
> > > - mdelay(15);
> > > + mdelay(50);
> > > }
> > > --
On Tue, Aug 15, 2023 at 9:51 PM James Liu <[email protected]> wrote:
>
> On Mon, Jun 12, 2023 at 06:57:01PM +0200, Rafael J. Wysocki wrote:
> > On Thu, Jun 8, 2023 at 11:14 AM James Liu <[email protected]> wrote:
> > >
> > > On Wed, Jun 07, 2023 at 01:19:42PM +0200, Rafael J. Wysocki wrote:
> > > > On Wed, Jun 7, 2023 at 5:44 AM James Liu <[email protected]> wrote:
> > > > >
> > > > > For AMD Milan platforms, the delay of 15ms is insufficient to avoid racing
> > > > > of reboot mechanisms. That said, the AMD Milan processors don't reboot
> > > > > in 15ms after invoking acpi_reset().
> > > > >
> > > > > The proposed 50ms delay can effectively work around this issue.
> > > > > This extended delay aligns better with ACPI v6.4 (i.e., sec. 4.8.4.6),
> > > > > which indicates that ideally OSPM should execute spin loops on the CPUs
> > > > > in the system following a write to this register.
> > > > >
> > > > > Signed-off-by: James Liu <[email protected]>
> > > >
> > > > Why do you want to affect everyone (including guest kernels running in
> > > > virtual machines AFAICS) in order to address a problem specific to one
> > > > platform?
> > >
> > > I hoped to address this issue for any platform requiring a longer delay to
> > > complete ACPI reset in time for any (maybe silicon-level) reasons. Some AMD Milan
> > > platforms were the cases that we've found so far.
> >
> > Do you know about any other?
>
> So far, no. Thus, I wont' proceed with anything until I find the same syndrome
> on next-gen platforms (e.g., AMD Genoa). Now, as you say, it is satisfying to
> document this quirk properly.
>
> >
> > > Except that, since ACPI spec indicates there should be a spin loop (long enough)
> > > after the write instruction to Reset Register, I thought it should be no harms to
> > > the other systems which well consider this spin loop when they claim to support
> > > ACPI reboot.
> > >
> > > Btw, I am just curious, why is the virtual machine mentioned here?
> >
> > The new delay would be over 3 times larger, so some users might be
> > surprised by it at least potentially.
>
> Got it. Just in case, if we really need to increase the delay to address it for
> certain amount of platforms, in experience, how long is the delay acceptable so
> that VM users will not be surprised?
Honestly, I don't know.
Also, it is not quite clear to me from the changelog what the problem
with the Milan platforms vs the reboot delay is and I don't think that
the second paragraph (regarding the ACPI specification compliance) is
relevant for this patch at all.