2022-10-07 16:46:58

by Hao Peng

[permalink] [raw]
Subject: [PATCH] kvm: x86: keep srcu writer side operation mutually exclusive

From: Peng Hao <[email protected]>

Synchronization operations on the writer side of SRCU should be
invoked within the mutex.

Signed-off-by: Peng Hao <[email protected]>
---
arch/x86/kvm/pmu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 8a7dbe2c469a..619151849980 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -602,9 +602,9 @@ int kvm_vm_ioctl_set_pmu_event_filter(struct kvm
*kvm, void __user *argp)

mutex_lock(&kvm->lock);
filter = rcu_replace_pointer(kvm->arch.pmu_event_filter, filter, 1);
+ synchronize_srcu_expedited(&kvm->srcu);
mutex_unlock(&kvm->lock);

- synchronize_srcu_expedited(&kvm->srcu);
r = 0;
cleanup:
kfree(filter);
--
2.27.0


2022-10-07 17:18:14

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH] kvm: x86: keep srcu writer side operation mutually exclusive

On Sat, Oct 08, 2022, Hao Peng wrote:
> From: Peng Hao <[email protected]>
>
> Synchronization operations on the writer side of SRCU should be
> invoked within the mutex.

Why? Synchronizing SRCU is necessary only to ensure that all previous readers go
away before the old filter is freed. There's no need to serialize synchronization
between writers. The mutex ensures each writer operates on the "new" filter that's
set by the previous writer, i.e. there's no danger of a double-free. And the next
writer will wait for readers to _its_ "new" filter.

I think it's a moot point though, as this is a subset of patch I posted[*] to fix
other issues with the PMU event filter.

[*] https://lore.kernel.org/all/[email protected]

2022-10-09 11:52:05

by Hao Peng

[permalink] [raw]
Subject: Re: [PATCH] kvm: x86: keep srcu writer side operation mutually exclusive

On Sat, Oct 8, 2022 at 1:12 AM Sean Christopherson <[email protected]> wrote:
>
> On Sat, Oct 08, 2022, Hao Peng wrote:
> > From: Peng Hao <[email protected]>
> >
> > Synchronization operations on the writer side of SRCU should be
> > invoked within the mutex.
>
> Why? Synchronizing SRCU is necessary only to ensure that all previous readers go
> away before the old filter is freed. There's no need to serialize synchronization
> between writers. The mutex ensures each writer operates on the "new" filter that's
> set by the previous writer, i.e. there's no danger of a double-free. And the next
> writer will wait for readers to _its_ "new" filter.
>
Array srcu_lock_count/srcu_unlock_count[] in srcu_data, which is used
alternately to determine
which readers need to wait to get out of the critical area. If two
synchronize_srcu are initiated concurrently,
there may be a problem with the judgment of gp. But if it is confirmed
that there will be no writer concurrency,
it is not necessary to ensure that synchronize_srcu is executed within
the scope of the mutex lock.
Thanks.
> I think it's a moot point though, as this is a subset of patch I posted[*] to fix
> other issues with the PMU event filter.
>
> [*] https://lore.kernel.org/all/[email protected]

2022-10-10 18:26:53

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH] kvm: x86: keep srcu writer side operation mutually exclusive

On Sun, Oct 09, 2022, Hao Peng wrote:
> On Sat, Oct 8, 2022 at 1:12 AM Sean Christopherson <[email protected]> wrote:
> >
> > On Sat, Oct 08, 2022, Hao Peng wrote:
> > > From: Peng Hao <[email protected]>
> > >
> > > Synchronization operations on the writer side of SRCU should be
> > > invoked within the mutex.
> >
> > Why? Synchronizing SRCU is necessary only to ensure that all previous readers go
> > away before the old filter is freed. There's no need to serialize synchronization
> > between writers. The mutex ensures each writer operates on the "new" filter that's
> > set by the previous writer, i.e. there's no danger of a double-free. And the next
> > writer will wait for readers to _its_ "new" filter.
> >
> Array srcu_lock_count/srcu_unlock_count[] in srcu_data, which is used
> alternately to determine
> which readers need to wait to get out of the critical area. If two
> synchronize_srcu are initiated concurrently,
> there may be a problem with the judgment of gp. But if it is confirmed
> that there will be no writer concurrency,
> it is not necessary to ensure that synchronize_srcu is executed within
> the scope of the mutex lock.

I don't see anything in the RCU documentation or code that suggests that callers
need to serialize synchronization calls. E.g. the "tree" SRCU implementation uses
a dedicated mutex to serialize grace period work

struct mutex srcu_gp_mutex; /* Serialize GP work. */

static void srcu_advance_state(struct srcu_struct *ssp)
{
int idx;

mutex_lock(&ssp->srcu_gp_mutex);

<magic>
}


and its state machine explicitly accounts for "Someone else" starting a grace
period

if (idx != SRCU_STATE_IDLE) {
mutex_unlock(&ssp->srcu_gp_mutex);
return; /* Someone else started the grace period. */
}

and srcu_gp_end() also guards against creating more than 2 grace periods.

/* Prevent more than one additional grace period. */
mutex_lock(&ssp->srcu_cb_mutex);

And if this is a subtle requirement, there is a lot of broken kernel code, e.g.
mmu_notifier, other KVM code, srcu_notifier_chain_unregister(), etc...

2022-10-24 03:36:03

by Hao Peng

[permalink] [raw]
Subject: Re: [PATCH] kvm: x86: keep srcu writer side operation mutually exclusive

On Tue, Oct 11, 2022 at 1:38 AM Sean Christopherson <[email protected]> wrote:
>
> On Sun, Oct 09, 2022, Hao Peng wrote:
> > On Sat, Oct 8, 2022 at 1:12 AM Sean Christopherson <[email protected]> wrote:
> > >
> > > On Sat, Oct 08, 2022, Hao Peng wrote:
> > > > From: Peng Hao <[email protected]>
> > > >
> > > > Synchronization operations on the writer side of SRCU should be
> > > > invoked within the mutex.
> > >
> > > Why? Synchronizing SRCU is necessary only to ensure that all previous readers go
> > > away before the old filter is freed. There's no need to serialize synchronization
> > > between writers. The mutex ensures each writer operates on the "new" filter that's
> > > set by the previous writer, i.e. there's no danger of a double-free. And the next
> > > writer will wait for readers to _its_ "new" filter.
> > >
> > Array srcu_lock_count/srcu_unlock_count[] in srcu_data, which is used
> > alternately to determine
> > which readers need to wait to get out of the critical area. If two
> > synchronize_srcu are initiated concurrently,
> > there may be a problem with the judgment of gp. But if it is confirmed
> > that there will be no writer concurrency,
> > it is not necessary to ensure that synchronize_srcu is executed within
> > the scope of the mutex lock.
>
> I don't see anything in the RCU documentation or code that suggests that callers
> need to serialize synchronization calls. E.g. the "tree" SRCU implementation uses
> a dedicated mutex to serialize grace period work
>
> struct mutex srcu_gp_mutex; /* Serialize GP work. */
>
> static void srcu_advance_state(struct srcu_struct *ssp)
> {
> int idx;
>
> mutex_lock(&ssp->srcu_gp_mutex);
>
> <magic>
> }
>
>
> and its state machine explicitly accounts for "Someone else" starting a grace
> period
>
> if (idx != SRCU_STATE_IDLE) {
> mutex_unlock(&ssp->srcu_gp_mutex);
> return; /* Someone else started the grace period. */
> }
>
> and srcu_gp_end() also guards against creating more than 2 grace periods.
>
> /* Prevent more than one additional grace period. */
> mutex_lock(&ssp->srcu_cb_mutex);
>
> And if this is a subtle requirement, there is a lot of broken kernel code, e.g.
> mmu_notifier, other KVM code, srcu_notifier_chain_unregister(), etc...

srcu_gp_mutex is meaningless because the workqueue already guarantees
that the same work_struct will not be reentrant.
If synchronize_srcu is not mutually exclusive on the update side, it may cause
a GP to fail for a long time. I will continue to analyze when I have time.