2024-04-11 00:50:49

by Chen Yu

Subject: [PATCH v2] efi/unaccepted: touch soft lockup during memory accept

Commit 50e782a86c98 ("efi/unaccepted: Fix soft lockups caused
by parallel memory acceptance") released the spinlock so that
other CPUs can accept memory in parallel without triggering
soft lockups on those CPUs.

However, the soft lockup still showed up intermittently when
the memory of the TD guest is large and the softlockup timeout
is set to 1 second.

The symptom is:
When local irqs are re-enabled at the end of accept_memory(),
the softlockup detector sees that the watchdog on a single CPU
has not been fed for a while. That is to say, even though other
CPUs are no longer blocked by the spinlock, the current CPU
might be stuck with local irqs disabled for a while, which
hurts not only the NMI watchdog but also the softlockup
detector.

Chao Gao pointed out that memory acceptance can be
time-consuming and that there was a similar report before.
Thus, to avoid any softlockup detection during this stage, give
the softlockup detector a flag to skip the timeout check at the
end of accept_memory(), by invoking touch_softlockup_watchdog().

Fixes: 50e782a86c98 ("efi/unaccepted: Fix soft lockups caused by parallel memory acceptance")
Reported-by: "Hossain, Md Iqbal" <[email protected]>
Reviewed-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Chen Yu <[email protected]>
---
v1 -> v2:
Refine the commit log and add fixes tag/reviewed-by tag from Kirill.
---
drivers/firmware/efi/unaccepted_memory.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/drivers/firmware/efi/unaccepted_memory.c b/drivers/firmware/efi/unaccepted_memory.c
index 5b439d04079c..50f6503fe49f 100644
--- a/drivers/firmware/efi/unaccepted_memory.c
+++ b/drivers/firmware/efi/unaccepted_memory.c
@@ -4,6 +4,7 @@
 #include <linux/memblock.h>
 #include <linux/spinlock.h>
 #include <linux/crash_dump.h>
+#include <linux/nmi.h>
 #include <asm/unaccepted_memory.h>

 /* Protects unaccepted memory bitmap and accepting_list */
@@ -149,6 +150,9 @@ void accept_memory(phys_addr_t start, phys_addr_t end)
 	}

 	list_del(&range.list);
+
+	touch_softlockup_watchdog();
+
 	spin_unlock_irqrestore(&unaccepted_memory_lock, flags);
 }

--
2.25.1
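
For readers unfamiliar with the mechanism, the effect described in the commit log can be illustrated with a small userspace model. This is only a sketch under simplifying assumptions: the struct and the model_* functions are hypothetical names, not kernel APIs, and "touching" is modelled as simply restarting the measurement window, which is a simplification of what touch_softlockup_watchdog() does in the kernel. The 1000 ms threshold in the usage note mirrors the 1-second timeout mentioned in the commit log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct watchdog_model {
	uint64_t last_touch_ms;	/* last time the watchdog was fed */
	uint64_t threshold_ms;	/* a gap larger than this counts as a lockup */
};

/* Model of feeding the watchdog: restart the measurement window. */
static void model_touch(struct watchdog_model *wd, uint64_t now_ms)
{
	wd->last_touch_ms = now_ms;
}

/* Model of the periodic check: true means a soft lockup would be reported. */
static bool model_check(const struct watchdog_model *wd, uint64_t now_ms)
{
	assert(now_ms >= wd->last_touch_ms);	/* model time never goes backwards */
	return now_ms - wd->last_touch_ms > wd->threshold_ms;
}

/*
 * Model of a long section that runs with interrupts off: time advances by
 * busy_ms and no check can run in between. Returns what the first check
 * after the section would report.
 */
static bool model_long_section(struct watchdog_model *wd, uint64_t start_ms,
			       uint64_t busy_ms, bool touch_at_end)
{
	uint64_t now_ms = start_ms + busy_ms;

	if (touch_at_end)
		model_touch(wd, now_ms);
	return model_check(wd, now_ms);
}
```

With a 1000 ms threshold, a 1500 ms section reports a lockup when the watchdog is not touched, and reports nothing when it is touched just before the check runs. That also illustrates the concern raised later in the thread: the touch hides the long section rather than shortening it.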



2024-04-22 14:40:32

by Chen Yu

Subject: Re: [PATCH v2] efi/unaccepted: touch soft lockup during memory accept

On 2024-04-11 at 08:49:07 +0800, Chen Yu wrote:
> [...]

Gently pinging about this patch.

thanks,
Chenyu

2024-04-24 17:12:51

by Ard Biesheuvel

Subject: Re: [PATCH v2] efi/unaccepted: touch soft lockup during memory accept

On Mon, 22 Apr 2024 at 16:40, Chen Yu <[email protected]> wrote:
>
> On 2024-04-11 at 08:49:07 +0800, Chen Yu wrote:
> > [...]
>
> Gently pinging about this patch.
>

Queued up in efi/urgent now, thanks.

2024-05-03 10:31:37

by Ard Biesheuvel

Subject: Re: [PATCH v2] efi/unaccepted: touch soft lockup during memory accept

On Wed, 24 Apr 2024 at 19:12, Ard Biesheuvel <[email protected]> wrote:
>
> On Mon, 22 Apr 2024 at 16:40, Chen Yu <[email protected]> wrote:
> >
> > On 2024-04-11 at 08:49:07 +0800, Chen Yu wrote:
> > > [...]
> >
> > Gently pinging about this patch.
> >
>
> Queued up in efi/urgent now, thanks.

OK, I was about to send this patch to Linus (and I am still going to).

However, I do wonder if sprinkling touch_softlockup_watchdog() left
and right is really the right solution here.

Looking at the backtrace, this is a page fault originating in user
space. So why do we end up calling into the hypervisor to accept a
chunk of memory large enough to trigger the softlockup watchdog? Or is
the hypercall simply taking a disproportionate amount of time?

And AIUI, touch_softlockup_watchdog() hides the fact that we are
hogging the CPU for way too long - is there any way we can at least
yield the CPU on this condition?

2024-05-03 13:48:08

by Kirill A. Shutemov

Subject: Re: [PATCH v2] efi/unaccepted: touch soft lockup during memory accept

On Fri, May 03, 2024 at 12:31:12PM +0200, Ard Biesheuvel wrote:
> On Wed, 24 Apr 2024 at 19:12, Ard Biesheuvel <[email protected]> wrote:
> >
> > On Mon, 22 Apr 2024 at 16:40, Chen Yu <[email protected]> wrote:
> > >
> > > On 2024-04-11 at 08:49:07 +0800, Chen Yu wrote:
> > > > [...]
> > >
> > > Gently pinging about this patch.
> > >
> >
> > Queued up in efi/urgent now, thanks.
>
> OK, I was about to send this patch to Linus (and I am still going to).
>
> However, I do wonder if sprinkling touch_softlockup_watchdog() left
> and right is really the right solution here.
>
> Looking at the backtrace, this is a page fault originating in user
> space. So why do we end up calling into the hypervisor to accept a
> chunk of memory large enough to trigger the softlockup watchdog? Or is
> the hypercall simply taking a disproportionate amount of time?

Note that the softlockup timeout was set to 1 second to trigger this. So
this is an exaggerated case.

> And AIUI, touch_softlockup_watchdog() hides the fact that we are
> hogging the CPU for way too long - is there any way we can at least
> yield the CPU on this condition?

Not really. There's no magic entity that handles accept. It is done by
the CPU.

There's a feature in the pipeline that makes page accept interruptible in
a TDX guest. It can help in some cases. But if we end up in this codepath
from a non-preemptible context, it won't help.

--
Kiryl Shutsemau / Kirill A. Shutemov

2024-05-03 15:00:53

by Chen Yu

Subject: Re: [PATCH v2] efi/unaccepted: touch soft lockup during memory accept

On 2024-05-03 at 16:47:49 +0300, Kirill A. Shutemov wrote:
> On Fri, May 03, 2024 at 12:31:12PM +0200, Ard Biesheuvel wrote:
> > On Wed, 24 Apr 2024 at 19:12, Ard Biesheuvel <[email protected]> wrote:
> > >
> > > On Mon, 22 Apr 2024 at 16:40, Chen Yu <[email protected]> wrote:
> > > >
> > > > On 2024-04-11 at 08:49:07 +0800, Chen Yu wrote:
> > > > > [...]
> > > >
> > > > Gently pinging about this patch.
> > > >
> > >
> > > Queued up in efi/urgent now, thanks.
> >
> > OK, I was about to send this patch to Linus (and I am still going to).
> >
> > However, I do wonder if sprinkling touch_softlockup_watchdog() left
> > and right is really the right solution here.
> >
> > Looking at the backtrace, this is a page fault originating in user
> > space. So why do we end up calling into the hypervisor to accept a
> > chunk of memory large enough to trigger the softlockup watchdog? Or is
> > the hypercall simply taking a disproportionate amount of time?
>
> Note that softlockup timeout was set to 1 second to trigger this. So this
> is exaggerated case.
>
> > And AIUI, touch_softlockup_watchdog() hides the fact that we are
> > hogging the CPU for way too long - is there any way we can at least
> > yield the CPU on this condition?
>
> Not really. There's no magic entity that handles accept. It is done by
> CPU.
>
> There's a feature in pipeline that makes page accept interruptable in TDX
> guest. It can help some cases. But if ended up in this codepath from
> non-preemptable context, it won't help.
>

Is it possible to enable the local irq for a little while after
each arch_accept_memory(phys_start, phys_end), and even to split
[phys_start, phys_end] into smaller regions, so that the watchdog
can be fed on time and the tick runs normally? For now, though,
touching the softlockup watchdog at the end seems easier to implement.

thanks,
Chenyu
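
The splitting idea raised in this message can be sketched in plain userspace C. This is only an illustration of walking [start, end) in bounded pieces; accept_in_chunks(), accept_fn, and count_bytes() are hypothetical names, the callback stands in for whatever would accept one sub-range, and nothing here is the kernel's accept_memory() implementation. Whether interrupts could safely be re-enabled between pieces is a separate locking question that the sketch does not address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical callback type: accept the sub-range [start, end). */
typedef void (*accept_fn)(uint64_t start, uint64_t end, void *ctx);

/*
 * Walk [start, end) in pieces of at most chunk bytes and call fn on each
 * piece. Returns the number of pieces processed. The gap between two calls
 * is where a real implementation would have its chance to let other work
 * run (assumption: this model does nothing there).
 */
static unsigned int accept_in_chunks(uint64_t start, uint64_t end,
				     uint64_t chunk, accept_fn fn, void *ctx)
{
	unsigned int calls = 0;

	assert(chunk > 0);
	while (start < end) {
		uint64_t len = end - start;

		if (len > chunk)
			len = chunk;	/* clamp the last piece to the range */
		fn(start, start + len, ctx);
		start += len;
		calls++;
	}
	return calls;
}

/* Example callback: add up the number of bytes it was asked to accept. */
static void count_bytes(uint64_t start, uint64_t end, void *ctx)
{
	*(uint64_t *)ctx += end - start;
}
```

Calling accept_in_chunks(0, 10, 4, count_bytes, &total) with total initialised to 0 processes three pieces (4, 4, and 2 bytes) and leaves total at 10. Bounding each piece bounds how long any single call can run.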

2024-05-06 12:32:43

by Kirill A. Shutemov

Subject: Re: [PATCH v2] efi/unaccepted: touch soft lockup during memory accept

On Fri, May 03, 2024 at 11:00:18PM +0800, Chen Yu wrote:
> On 2024-05-03 at 16:47:49 +0300, Kirill A. Shutemov wrote:
> > On Fri, May 03, 2024 at 12:31:12PM +0200, Ard Biesheuvel wrote:
> > > On Wed, 24 Apr 2024 at 19:12, Ard Biesheuvel <[email protected]> wrote:
> > > >
> > > > On Mon, 22 Apr 2024 at 16:40, Chen Yu <[email protected]> wrote:
> > > > >
> > > > > On 2024-04-11 at 08:49:07 +0800, Chen Yu wrote:
> > > > > > [...]
> > > > >
> > > > > Gently pinging about this patch.
> > > > >
> > > >
> > > > Queued up in efi/urgent now, thanks.
> > >
> > > OK, I was about to send this patch to Linus (and I am still going to).
> > >
> > > However, I do wonder if sprinkling touch_softlockup_watchdog() left
> > > and right is really the right solution here.
> > >
> > > Looking at the backtrace, this is a page fault originating in user
> > > space. So why do we end up calling into the hypervisor to accept a
> > > chunk of memory large enough to trigger the softlockup watchdog? Or is
> > > the hypercall simply taking a disproportionate amount of time?
> >
> > Note that softlockup timeout was set to 1 second to trigger this. So this
> > is exaggerated case.
> >
> > > And AIUI, touch_softlockup_watchdog() hides the fact that we are
> > > hogging the CPU for way too long - is there any way we can at least
> > > yield the CPU on this condition?
> >
> > Not really. There's no magic entity that handles accept. It is done by
> > CPU.
> >
> > There's a feature in pipeline that makes page accept interruptable in TDX
> > guest. It can help some cases. But if ended up in this codepath from
> > non-preemptable context, it won't help.
> >
>
> Is it possible to enable the local irq for a little while after
> each arch_accept_memory(phys_start, phys_end),
> and even split the [phys_start,phys_end] to smaller regions?
> so the watchdog can be fed on time/tick is normal. But currently
> the softlock fed at the end seems to be more easier to implement.

That's what I did initially. But Vlastimil correctly pointed out that it
would lead to a deadlock.

https://lore.kernel.org/all/[email protected]/

--
Kiryl Shutsemau / Kirill A. Shutemov