2023-01-16 04:16:32

by Gavin Shan

Subject: [PATCH 0/4] Improve dirty ring warning report

It is a known case that no running VCPU context exists when the
vgic/its tables are saved. There are two other, previously unhandled,
cases where we don't have a running VCPU context: (a) saving the vgic3
LPI pending status and (b) saving the vgic3 pending tables. Besides,
the warnings in mark_page_dirty_in_slot() are triggered even when the
dirty ring hasn't been enabled by user space, which is not the
expected behaviour.

PATCH[1 - 2] fix the no-running-VCPU-context issue when the vgic3 LPI
pending status and the vgic3 pending tables are saved.
PATCH[3 - 4] improve the warning reports by raising them only when the
dirty ring has been enabled by user space.
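
For reference, the dirty ring and its backup bitmap are enabled from
user space via KVM_ENABLE_CAP on the VM fd, before any VCPU is created.
A rough, illustrative sketch only (the helper name is made up; on arm64
the ACQ_REL flavour of the ring capability is the one available):

#include <linux/kvm.h>
#include <sys/ioctl.h>

static int enable_dirty_ring(int vm_fd, unsigned int bytes)
{
	struct kvm_enable_cap cap = {
		.cap = KVM_CAP_DIRTY_LOG_RING_ACQ_REL,
		.args[0] = bytes,	/* ring size in bytes, power of two */
	};

	if (ioctl(vm_fd, KVM_ENABLE_CAP, &cap))
		return -1;

	/* writes without a running VCPU then land in the backup bitmap */
	cap.cap = KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP;
	cap.args[0] = 0;
	return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}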

Gavin Shan (4):
KVM: arm64: Allow saving vgic3 LPI pending status in no running vcpu
context
KVM: arm64: Allow saving vgic3 pending tables in no running vcpu
context
KVM: Refactor mark_page_dirty_in_slot()
KVM: Improve warning report in mark_page_dirty_in_slot()

Documentation/virt/kvm/api.rst | 8 ++++++--
arch/arm64/kvm/vgic/vgic-its.c | 3 ++-
arch/arm64/kvm/vgic/vgic-v3.c | 5 +++++
include/kvm/arm_vgic.h | 1 +
include/linux/kvm_dirty_ring.h | 5 +++++
virt/kvm/kvm_main.c | 30 ++++++++++++++++++------------
6 files changed, 37 insertions(+), 15 deletions(-)

--
2.23.0


2023-01-16 04:20:41

by Gavin Shan

Subject: [PATCH 2/4] KVM: arm64: Allow saving vgic3 pending tables in no running vcpu context

It's possible for the vgic3 pending tables to be saved without a running
VCPU context. This is another previously unhandled case, detected by
'kvm-unit-tests'.

# ./kvm-unit-tests/tests/its-pending-migration
WARNING: CPU: 120 PID: 7973 at arch/arm64/kvm/../../../virt/kvm/kvm_main.c:3325 \
mark_page_dirty_in_slot+0x60/0xe0
:
mark_page_dirty_in_slot+0x60/0xe0
__kvm_write_guest_page+0xcc/0x100
kvm_write_guest+0x7c/0xb0
vgic_v3_save_pending_tables+0x148/0x2a0
vgic_set_common_attr+0x158/0x240
vgic_v3_set_attr+0x4c/0x5c
kvm_device_ioctl+0x100/0x160
__arm64_sys_ioctl+0xa8/0xf0
invoke_syscall.constprop.0+0x7c/0xd0
el0_svc_common.constprop.0+0x144/0x160
do_el0_svc+0x34/0x60
el0_svc+0x3c/0x1a0
el0t_64_sync_handler+0xb4/0x130
el0t_64_sync+0x178/0x17c

Fix it by allowing the vgic3 pending tables to be saved without a
running VCPU context.

Reported-by: Zenghui Yu <[email protected]>
Signed-off-by: Gavin Shan <[email protected]>
---
Documentation/virt/kvm/api.rst | 3 +++
arch/arm64/kvm/vgic/vgic-v3.c | 2 ++
2 files changed, 5 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 18b245a0ba02..7cf3d4b77703 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -8074,6 +8074,9 @@ NOTE: One example of using the backup bitmap is saving arm64 vgic/its
tables and vgic3 LPI pending status through KVM_DEV_ARM_{VGIC_GRP_CTRL,
ITS_SAVE_TABLES} and KVM_DEV_ARM_{VGIC_GRP_CTRL, ITS_RESTORE_TABLES}
command on KVM device "kvm-arm-vgic-its" when dirty ring is enabled.
+The backup bitmap is also used when vgic3 pending table is saved
+through KVM_DEV_ARM_{VGIC_GRP_CTRL, VGIC_SAVE_PENDING_TABLES} command
+on KVM device "kvm-arm-vgic-v3".

8.30 KVM_CAP_XEN_HVM
--------------------
diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
index 32998c8587a8..1e6b5f19d524 100644
--- a/arch/arm64/kvm/vgic/vgic-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-v3.c
@@ -440,7 +440,9 @@ int vgic_v3_save_pending_tables(struct kvm *kvm)
 		else
 			val &= ~(1 << bit_nr);
 
+		dist->save_vgic_v3_tables_in_progress = true;
 		ret = kvm_write_guest_lock(kvm, ptr, &val, 1);
+		dist->save_vgic_v3_tables_in_progress = false;
 		if (ret)
 			goto out;
 	}
--
2.23.0

2023-01-16 04:21:21

by Gavin Shan

Subject: [PATCH 3/4] KVM: Refactor mark_page_dirty_in_slot()

Refactor mark_page_dirty_in_slot() to bail out early if the memory slot
doesn't exist or dirty page tracking is disabled on it. This is
preparatory work for the forthcoming fixes.

No functional change intended.

Signed-off-by: Gavin Shan <[email protected]>
---
virt/kvm/kvm_main.c | 19 +++++++++++--------
1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 9c60384b5ae0..90f538433916 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3317,6 +3317,8 @@ void mark_page_dirty_in_slot(struct kvm *kvm,
 			     gfn_t gfn)
 {
 	struct kvm_vcpu *vcpu = kvm_get_running_vcpu();
+	unsigned long rel_gfn;
+	u32 slot;
 
 #ifdef CONFIG_HAVE_KVM_DIRTY_RING
 	if (WARN_ON_ONCE(vcpu && vcpu->kvm != kvm))
@@ -3325,15 +3327,16 @@ void mark_page_dirty_in_slot(struct kvm *kvm,
 	WARN_ON_ONCE(!vcpu && !kvm_arch_allow_write_without_running_vcpu(kvm));
 #endif
 
-	if (memslot && kvm_slot_dirty_track_enabled(memslot)) {
-		unsigned long rel_gfn = gfn - memslot->base_gfn;
-		u32 slot = (memslot->as_id << 16) | memslot->id;
+	if (!memslot || !kvm_slot_dirty_track_enabled(memslot))
+		return;
 
-		if (kvm->dirty_ring_size && vcpu)
-			kvm_dirty_ring_push(vcpu, slot, rel_gfn);
-		else if (memslot->dirty_bitmap)
-			set_bit_le(rel_gfn, memslot->dirty_bitmap);
-	}
+	rel_gfn = gfn - memslot->base_gfn;
+	slot = (memslot->as_id << 16) | memslot->id;
+
+	if (kvm->dirty_ring_size && vcpu)
+		kvm_dirty_ring_push(vcpu, slot, rel_gfn);
+	else if (memslot->dirty_bitmap)
+		set_bit_le(rel_gfn, memslot->dirty_bitmap);
 }
 EXPORT_SYMBOL_GPL(mark_page_dirty_in_slot);

--
2.23.0

2023-01-16 04:38:18

by Gavin Shan

Subject: [PATCH 4/4] KVM: Improve warning report in mark_page_dirty_in_slot()

There are two warning reports about the dirty ring in this function.
They rest on the wrong assumption that the dirty ring is always enabled
when CONFIG_HAVE_KVM_DIRTY_RING is selected. This leads to the warning
messages being reported even when the dirty ring isn't enabled by user
space. The expected behaviour is to report the warnings only when the
dirty ring is enabled, not merely configured.

Fix it by performing the checks and reporting the warnings only when
the dirty ring has been enabled by user space.

Signed-off-by: Gavin Shan <[email protected]>
---
include/linux/kvm_dirty_ring.h | 5 +++++
virt/kvm/kvm_main.c | 25 ++++++++++++++-----------
2 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/include/linux/kvm_dirty_ring.h b/include/linux/kvm_dirty_ring.h
index 4862c98d80d3..3fda0aa42858 100644
--- a/include/linux/kvm_dirty_ring.h
+++ b/include/linux/kvm_dirty_ring.h
@@ -42,6 +42,11 @@ static inline bool kvm_use_dirty_bitmap(struct kvm *kvm)
 	return true;
 }
 
+static inline bool kvm_arch_allow_write_without_running_vcpu(struct kvm *kvm)
+{
+	return false;
+}
+
 static inline int kvm_dirty_ring_alloc(struct kvm_dirty_ring *ring,
 				       int index, u32 size)
 {
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 90f538433916..a35c32bc84e1 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3316,26 +3316,29 @@ void mark_page_dirty_in_slot(struct kvm *kvm,
 			     const struct kvm_memory_slot *memslot,
 			     gfn_t gfn)
 {
-	struct kvm_vcpu *vcpu = kvm_get_running_vcpu();
+	struct kvm_vcpu *vcpu;
 	unsigned long rel_gfn;
 	u32 slot;
 
-#ifdef CONFIG_HAVE_KVM_DIRTY_RING
-	if (WARN_ON_ONCE(vcpu && vcpu->kvm != kvm))
-		return;
-
-	WARN_ON_ONCE(!vcpu && !kvm_arch_allow_write_without_running_vcpu(kvm));
-#endif
-
 	if (!memslot || !kvm_slot_dirty_track_enabled(memslot))
 		return;
 
 	rel_gfn = gfn - memslot->base_gfn;
 	slot = (memslot->as_id << 16) | memslot->id;
 
-	if (kvm->dirty_ring_size && vcpu)
-		kvm_dirty_ring_push(vcpu, slot, rel_gfn);
-	else if (memslot->dirty_bitmap)
+	if (kvm->dirty_ring_size) {
+		vcpu = kvm_get_running_vcpu();
+		if (vcpu) {
+			if (!WARN_ON_ONCE(vcpu->kvm != kvm))
+				kvm_dirty_ring_push(vcpu, slot, rel_gfn);
+
+			return;
+		}
+
+		WARN_ON_ONCE(!kvm_arch_allow_write_without_running_vcpu(kvm));
+	}
+
+	if (memslot->dirty_bitmap)
 		set_bit_le(rel_gfn, memslot->dirty_bitmap);
 }
 EXPORT_SYMBOL_GPL(mark_page_dirty_in_slot);
--
2.23.0

2023-01-16 05:21:51

by Gavin Shan

Subject: [PATCH 1/4] KVM: arm64: Allow saving vgic3 LPI pending status in no running vcpu context

When the dirty ring is enabled, dirty page information is pushed to
the dirty ring if there is a running VCPU context. Otherwise, the
dirty page information is still tracked by the backup dirty bitmap.
In order to detect whether there is a running VCPU context when a
guest page becomes dirty, kvm_arch_allow_write_without_running_vcpu()
was introduced so that a warning is raised when no running VCPU
context exists in an unexpected case.

Besides the site where the ITS tables are saved, the vgic3 LPI pending
status can also be saved without a running VCPU context: it happens
when an ITS ITE is restored through the KVM_DEV_ARM_ITS_RESTORE_TABLES
command on the 'kvm-arm-vgic-its' device.

Fix it by allowing the vgic3 LPI pending status to be saved without
a running VCPU context.

Signed-off-by: Gavin Shan <[email protected]>
---
Documentation/virt/kvm/api.rst | 5 +++--
arch/arm64/kvm/vgic/vgic-its.c | 3 ++-
arch/arm64/kvm/vgic/vgic-v3.c | 3 +++
include/kvm/arm_vgic.h | 1 +
4 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 9807b05a1b57..18b245a0ba02 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -8071,8 +8071,9 @@ state is final and avoid missing dirty pages from another ioctl ordered
after the bitmap collection.

NOTE: One example of using the backup bitmap is saving arm64 vgic/its
-tables through KVM_DEV_ARM_{VGIC_GRP_CTRL, ITS_SAVE_TABLES} command on
-KVM device "kvm-arm-vgic-its" when dirty ring is enabled.
+tables and vgic3 LPI pending status through KVM_DEV_ARM_{VGIC_GRP_CTRL,
+ITS_SAVE_TABLES} and KVM_DEV_ARM_{VGIC_GRP_CTRL, ITS_RESTORE_TABLES}
+command on KVM device "kvm-arm-vgic-its" when dirty ring is enabled.

8.30 KVM_CAP_XEN_HVM
--------------------
diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
index 94a666dd1443..119a9c7a0a52 100644
--- a/arch/arm64/kvm/vgic/vgic-its.c
+++ b/arch/arm64/kvm/vgic/vgic-its.c
@@ -2792,7 +2792,8 @@ bool kvm_arch_allow_write_without_running_vcpu(struct kvm *kvm)
 {
 	struct vgic_dist *dist = &kvm->arch.vgic;
 
-	return dist->save_its_tables_in_progress;
+	return dist->save_vgic_v3_tables_in_progress ||
+	       dist->save_its_tables_in_progress;
 }
 
 static int vgic_its_set_attr(struct kvm_device *dev,
diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
index 2074521d4a8c..32998c8587a8 100644
--- a/arch/arm64/kvm/vgic/vgic-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-v3.c
@@ -304,6 +304,7 @@ void vgic_v3_enable(struct kvm_vcpu *vcpu)
 int vgic_v3_lpi_sync_pending_status(struct kvm *kvm, struct vgic_irq *irq)
 {
 	struct kvm_vcpu *vcpu;
+	struct vgic_dist *dist = &kvm->arch.vgic;
 	int byte_offset, bit_nr;
 	gpa_t pendbase, ptr;
 	bool status;
@@ -339,7 +340,9 @@ int vgic_v3_lpi_sync_pending_status(struct kvm *kvm, struct vgic_irq *irq)
 	if (status) {
 		/* clear consumed data */
 		val &= ~(1 << bit_nr);
+		dist->save_vgic_v3_tables_in_progress = true;
 		ret = kvm_write_guest_lock(kvm, ptr, &val, 1);
+		dist->save_vgic_v3_tables_in_progress = false;
 		if (ret)
 			return ret;
 	}
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 9270cd87da3f..0485b4e82b00 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -264,6 +264,7 @@ struct vgic_dist {

 	bool has_its;
 	bool save_its_tables_in_progress;
+	bool save_vgic_v3_tables_in_progress;
 
 	/*
 	 * Contains the attributes and gpa of the LPI configuration table.
--
2.23.0

2023-01-17 16:33:25

by Sean Christopherson

Subject: Re: [PATCH 4/4] KVM: Improve warning report in mark_page_dirty_in_slot()

On Mon, Jan 16, 2023, Gavin Shan wrote:
> There are two warning reports about the dirty ring in this function.
> They rest on the wrong assumption that the dirty ring is always enabled
> when CONFIG_HAVE_KVM_DIRTY_RING is selected.

No, it's not a wrong assumption, because it's not an assumption. The intent is
to warn irrespective of dirty ring/log enabling. The original code actually warned
irrespective of dirty ring support[1], again intentionally. The
CONFIG_HAVE_KVM_DIRTY_RING check was added because s390 can mark pages dirty from
a worker thread[2] and s390 has no plans to support the dirty ring.

The reason for warning even if dirty ring isn't enabled is so that bots can catch
potential KVM bugs without having to set up a dirty ring or enable dirty logging.

[1] 2efd61a608b0 ("KVM: Warn if mark_page_dirty() is called without an active vCPU")
[2] e09fccb5435d ("KVM: avoid warning on s390 in mark_page_dirty")

2023-01-18 00:26:20

by Oliver Upton

Subject: Re: [PATCH 1/4] KVM: arm64: Allow saving vgic3 LPI pending status in no running vcpu context

Hi Gavin,

On Mon, Jan 16, 2023 at 12:04:02PM +0800, Gavin Shan wrote:
> When the dirty ring is enabled, dirty page information is pushed to
> the dirty ring if there is a running VCPU context. Otherwise, the
> dirty page information is still tracked by the backup dirty bitmap.
> In order to detect whether there is a running VCPU context when a
> guest page becomes dirty, kvm_arch_allow_write_without_running_vcpu()
> was introduced so that a warning is raised when no running VCPU
> context exists in an unexpected case.
>
> Besides the site where the ITS tables are saved, the vgic3 LPI pending
> status can also be saved without a running VCPU context: it happens
> when an ITS ITE is restored through the KVM_DEV_ARM_ITS_RESTORE_TABLES
> command on the 'kvm-arm-vgic-its' device.
>
> Fix it by allowing the vgic3 LPI pending status to be saved without
> a running VCPU context.
>
> Signed-off-by: Gavin Shan <[email protected]>
> ---
> Documentation/virt/kvm/api.rst | 5 +++--
> arch/arm64/kvm/vgic/vgic-its.c | 3 ++-
> arch/arm64/kvm/vgic/vgic-v3.c | 3 +++
> include/kvm/arm_vgic.h | 1 +
> 4 files changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 9807b05a1b57..18b245a0ba02 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -8071,8 +8071,9 @@ state is final and avoid missing dirty pages from another ioctl ordered
> after the bitmap collection.
>
> NOTE: One example of using the backup bitmap is saving arm64 vgic/its
> -tables through KVM_DEV_ARM_{VGIC_GRP_CTRL, ITS_SAVE_TABLES} command on
> -KVM device "kvm-arm-vgic-its" when dirty ring is enabled.
> +tables and vgic3 LPI pending status through KVM_DEV_ARM_{VGIC_GRP_CTRL,
> +ITS_SAVE_TABLES} and KVM_DEV_ARM_{VGIC_GRP_CTRL, ITS_RESTORE_TABLES}
> +command on KVM device "kvm-arm-vgic-its" when dirty ring is enabled.
>
> 8.30 KVM_CAP_XEN_HVM
> --------------------
> diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
> index 94a666dd1443..119a9c7a0a52 100644
> --- a/arch/arm64/kvm/vgic/vgic-its.c
> +++ b/arch/arm64/kvm/vgic/vgic-its.c
> @@ -2792,7 +2792,8 @@ bool kvm_arch_allow_write_without_running_vcpu(struct kvm *kvm)
>  {
>  	struct vgic_dist *dist = &kvm->arch.vgic;
> 
> -	return dist->save_its_tables_in_progress;
> +	return dist->save_vgic_v3_tables_in_progress ||
> +	       dist->save_its_tables_in_progress;

I'd much prefer using a single bool to keep track of this, i.e.:

	return dist->save_tables_in_progress;

>  }
> 
>  static int vgic_its_set_attr(struct kvm_device *dev,
> diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
> index 2074521d4a8c..32998c8587a8 100644
> --- a/arch/arm64/kvm/vgic/vgic-v3.c
> +++ b/arch/arm64/kvm/vgic/vgic-v3.c
> @@ -304,6 +304,7 @@ void vgic_v3_enable(struct kvm_vcpu *vcpu)
>  int vgic_v3_lpi_sync_pending_status(struct kvm *kvm, struct vgic_irq *irq)
>  {
>  	struct kvm_vcpu *vcpu;
> +	struct vgic_dist *dist = &kvm->arch.vgic;
>  	int byte_offset, bit_nr;
>  	gpa_t pendbase, ptr;
>  	bool status;
> @@ -339,7 +340,9 @@ int vgic_v3_lpi_sync_pending_status(struct kvm *kvm, struct vgic_irq *irq)
>  	if (status) {
>  		/* clear consumed data */
>  		val &= ~(1 << bit_nr);
> +		dist->save_vgic_v3_tables_in_progress = true;
>  		ret = kvm_write_guest_lock(kvm, ptr, &val, 1);
> +		dist->save_vgic_v3_tables_in_progress = false;

With the above suggestion of using a bool, this should become a helper
used at all the affected callsites:

static int vgic_write_guest_lock(struct kvm *kvm, gpa_t gpa,
				 const void *data, unsigned long len)
{
	struct vgic_dist *dist = &kvm->arch.vgic;
	int ret;

	dist->save_tables_in_progress = true;
	ret = kvm_write_guest_lock(kvm, gpa, data, len);
	dist->save_tables_in_progress = false;

	return ret;
}
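
At each affected callsite, the write would then collapse to a single
line along the lines of:

	ret = vgic_write_guest_lock(kvm, ptr, &val, 1);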

--
Thanks,
Oliver

2023-01-19 01:29:49

by Gavin Shan

Subject: Re: [PATCH 1/4] KVM: arm64: Allow saving vgic3 LPI pending status in no running vcpu context

Hi Oliver,

On 1/18/23 7:51 AM, Oliver Upton wrote:
> On Mon, Jan 16, 2023 at 12:04:02PM +0800, Gavin Shan wrote:
>> When the dirty ring is enabled, dirty page information is pushed to
>> the dirty ring if there is a running VCPU context. Otherwise, the
>> dirty page information is still tracked by the backup dirty bitmap.
>> In order to detect whether there is a running VCPU context when a
>> guest page becomes dirty, kvm_arch_allow_write_without_running_vcpu()
>> was introduced so that a warning is raised when no running VCPU
>> context exists in an unexpected case.
>>
>> Besides the site where the ITS tables are saved, the vgic3 LPI pending
>> status can also be saved without a running VCPU context: it happens
>> when an ITS ITE is restored through the KVM_DEV_ARM_ITS_RESTORE_TABLES
>> command on the 'kvm-arm-vgic-its' device.
>>
>> Fix it by allowing the vgic3 LPI pending status to be saved without
>> a running VCPU context.
>>
>> Signed-off-by: Gavin Shan <[email protected]>
>> ---
>> Documentation/virt/kvm/api.rst | 5 +++--
>> arch/arm64/kvm/vgic/vgic-its.c | 3 ++-
>> arch/arm64/kvm/vgic/vgic-v3.c | 3 +++
>> include/kvm/arm_vgic.h | 1 +
>> 4 files changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
>> index 9807b05a1b57..18b245a0ba02 100644
>> --- a/Documentation/virt/kvm/api.rst
>> +++ b/Documentation/virt/kvm/api.rst
>> @@ -8071,8 +8071,9 @@ state is final and avoid missing dirty pages from another ioctl ordered
>> after the bitmap collection.
>>
>> NOTE: One example of using the backup bitmap is saving arm64 vgic/its
>> -tables through KVM_DEV_ARM_{VGIC_GRP_CTRL, ITS_SAVE_TABLES} command on
>> -KVM device "kvm-arm-vgic-its" when dirty ring is enabled.
>> +tables and vgic3 LPI pending status through KVM_DEV_ARM_{VGIC_GRP_CTRL,
>> +ITS_SAVE_TABLES} and KVM_DEV_ARM_{VGIC_GRP_CTRL, ITS_RESTORE_TABLES}
>> +command on KVM device "kvm-arm-vgic-its" when dirty ring is enabled.
>>
>> 8.30 KVM_CAP_XEN_HVM
>> --------------------
>> diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
>> index 94a666dd1443..119a9c7a0a52 100644
>> --- a/arch/arm64/kvm/vgic/vgic-its.c
>> +++ b/arch/arm64/kvm/vgic/vgic-its.c
>> @@ -2792,7 +2792,8 @@ bool kvm_arch_allow_write_without_running_vcpu(struct kvm *kvm)
>>  {
>>  	struct vgic_dist *dist = &kvm->arch.vgic;
>> 
>> -	return dist->save_its_tables_in_progress;
>> +	return dist->save_vgic_v3_tables_in_progress ||
>> +	       dist->save_its_tables_in_progress;
>
> I'd much prefer using a single bool to keep track of this, i.e.:
>

Yes, it's cleaner to have 'dist->save_tables_in_progress' for all
3 cases. One more concern below.

> 	return dist->save_tables_in_progress;
>
>>  }
>>
>>  static int vgic_its_set_attr(struct kvm_device *dev,
>> diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
>> index 2074521d4a8c..32998c8587a8 100644
>> --- a/arch/arm64/kvm/vgic/vgic-v3.c
>> +++ b/arch/arm64/kvm/vgic/vgic-v3.c
>> @@ -304,6 +304,7 @@ void vgic_v3_enable(struct kvm_vcpu *vcpu)
>>  int vgic_v3_lpi_sync_pending_status(struct kvm *kvm, struct vgic_irq *irq)
>>  {
>>  	struct kvm_vcpu *vcpu;
>> +	struct vgic_dist *dist = &kvm->arch.vgic;
>>  	int byte_offset, bit_nr;
>>  	gpa_t pendbase, ptr;
>>  	bool status;
>> @@ -339,7 +340,9 @@ int vgic_v3_lpi_sync_pending_status(struct kvm *kvm, struct vgic_irq *irq)
>>  	if (status) {
>>  		/* clear consumed data */
>>  		val &= ~(1 << bit_nr);
>> +		dist->save_vgic_v3_tables_in_progress = true;
>>  		ret = kvm_write_guest_lock(kvm, ptr, &val, 1);
>> +		dist->save_vgic_v3_tables_in_progress = false;
>
> With the above suggestion of using a bool, this should become a helper
> used at all the affected callsites:
>
> static int vgic_write_guest_lock(struct kvm *kvm, gpa_t gpa,
> 				 const void *data, unsigned long len)
> {
> 	struct vgic_dist *dist = &kvm->arch.vgic;
> 	int ret;
>
> 	dist->save_tables_in_progress = true;
> 	ret = kvm_write_guest_lock(kvm, gpa, data, len);
> 	dist->save_tables_in_progress = false;
>
> 	return ret;
> }
>

I will have vgic_write_guest_lock() in v2. Note that those 3 paths can't
run in parallel, since one switch is shared by them. Alternatively, we
could extend struct vgic_dist::save_tables_in_progress from 'bool' to
'unsigned long', with one bit defined for each site as below. In this
way, the 3 paths could run in parallel:

unsigned long struct vgic_dist::save_tables_in_progress

#define VGIC_DIST_SAVE_ITS_ITE 0 /* ITS Translation Entry */
#define VGIC_DIST_SAVE_ITS_DTE 1 /* ITS Device Table Entry */
#define VGIC_DIST_SAVE_ITS_CTE 2 /* ITS Collection Table Entry */
#define VGIC_DIST_SAVE_ITS_CT 3 /* ITS Collection Table */
#define VGIC_DIST_SAVE_VGIC3_LPI 4 /* VGIC3 LPI Pending Status */
#define VGIC_DIST_SAVE_VGIC3_PENDING_TABLE 5 /* VGIC3 Pending Table */

The drawback is that the number of sites is limited to 64. If those 3
paths never run in parallel, we don't need the extension at all.
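
For illustration, an untested sketch of the per-site variant (the
helper signature and the use of set_bit()/clear_bit() are just one
possible shape):

static int vgic_write_guest_lock(struct kvm *kvm, int site, gpa_t gpa,
				 const void *data, unsigned long len)
{
	struct vgic_dist *dist = &kvm->arch.vgic;
	int ret;

	/* flag this site so the no-running-VCPU warning is suppressed */
	set_bit(site, &dist->save_tables_in_progress);
	ret = kvm_write_guest_lock(kvm, gpa, data, len);
	clear_bit(site, &dist->save_tables_in_progress);

	return ret;
}

bool kvm_arch_allow_write_without_running_vcpu(struct kvm *kvm)
{
	return !!READ_ONCE(kvm->arch.vgic.save_tables_in_progress);
}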

Thanks,
Gavin

2023-01-19 02:05:38

by Gavin Shan

Subject: Re: [PATCH 4/4] KVM: Improve warning report in mark_page_dirty_in_slot()

Hi Sean,

On 1/18/23 2:42 AM, Sean Christopherson wrote:
> On Mon, Jan 16, 2023, Gavin Shan wrote:
>> There are two warning reports about the dirty ring in this function.
>> They rest on the wrong assumption that the dirty ring is always enabled
>> when CONFIG_HAVE_KVM_DIRTY_RING is selected.
>
> No, it's not a wrong assumption, because it's not an assumption. The intent is
> to warn irrespective of dirty ring/log enabling. The original code actually warned
> irrespective of dirty ring support[1], again intentionally. The
> CONFIG_HAVE_KVM_DIRTY_RING check was added because s390 can mark pages dirty from
> a worker thread[2] and s390 has no plans to support the dirty ring.
>
> The reason for warning even if dirty ring isn't enabled is so that bots can catch
> potential KVM bugs without having to set up a dirty ring or enable dirty logging.
>
> [1] 2efd61a608b0 ("KVM: Warn if mark_page_dirty() is called without an active vCPU")
> [2] e09fccb5435d ("KVM: avoid warning on s390 in mark_page_dirty")
>

Thanks for the links. I was confused when looking at the code, but now it's clear
to me. Thanks for your explanation. How about adding a comment there?

/*
 * The warning is expected when the dirty ring is configured,
 * but not enabled.
 */

Thanks,
Gavin

2023-01-19 16:12:19

by Marc Zyngier

Subject: Re: [PATCH 1/4] KVM: arm64: Allow saving vgic3 LPI pending status in no running vcpu context

On Thu, 19 Jan 2023 01:11:44 +0000,
Gavin Shan <[email protected]> wrote:
>
> I will have vgic_write_guest_lock() in v2. Note that those 3 paths can't
> run in parallel, since one switch is shared by them. Alternatively, we
> could extend struct vgic_dist::save_tables_in_progress from 'bool' to
> 'unsigned long', with one bit defined for each site as below. In this
> way, the 3 paths could run in parallel:
>
> unsigned long struct vgic_dist::save_tables_in_progress
>
> #define VGIC_DIST_SAVE_ITS_ITE 0 /* ITS Translation Entry */
> #define VGIC_DIST_SAVE_ITS_DTE 1 /* ITS Device Table Entry */
> #define VGIC_DIST_SAVE_ITS_CTE 2 /* ITS Collection Table Entry */
> #define VGIC_DIST_SAVE_ITS_CT 3 /* ITS Collection Table */
> #define VGIC_DIST_SAVE_VGIC3_LPI 4 /* VGIC3 LPI Pending Status */
> #define VGIC_DIST_SAVE_VGIC3_PENDING_TABLE 5 /* VGIC3 Pending Table */
>
> The drawback is that the number of sites is limited to 64. If those 3
> paths never run in parallel, we don't need the extension at all.

It should all be completely sequential. KVM_DEV_ARM_ITS_SAVE_TABLES
runs in a context where everything is locked, and so is
VGIC_DIST_SAVE_VGIC3_PENDING_TABLE.

M.

--
Without deviation from the norm, progress is not possible.

2023-01-19 16:42:43

by Sean Christopherson

Subject: Re: [PATCH 4/4] KVM: Improve warning report in mark_page_dirty_in_slot()

On Thu, Jan 19, 2023, Gavin Shan wrote:
> Hi Sean,
>
> On 1/18/23 2:42 AM, Sean Christopherson wrote:
> > On Mon, Jan 16, 2023, Gavin Shan wrote:
> > > There are two warning reports about the dirty ring in this function.
> > > They rest on the wrong assumption that the dirty ring is always enabled
> > > when CONFIG_HAVE_KVM_DIRTY_RING is selected.
> >
> > No, it's not a wrong assumption, because it's not an assumption. The intent is
> > to warn irrespective of dirty ring/log enabling. The original code actually warned
> > irrespective of dirty ring support[1], again intentionally. The
> > CONFIG_HAVE_KVM_DIRTY_RING check was added because s390 can mark pages dirty from
> > a worker thread[2] and s390 has no plans to support the dirty ring.
> >
> > The reason for warning even if dirty ring isn't enabled is so that bots can catch
> > potential KVM bugs without having to set up a dirty ring or enable dirty logging.
> >
> > [1] 2efd61a608b0 ("KVM: Warn if mark_page_dirty() is called without an active vCPU")
> > [2] e09fccb5435d ("KVM: avoid warning on s390 in mark_page_dirty")
> >
>
> Thanks for the links. I was confused when looking at the code, but now it's clear
> to me. Thanks for your explanation. How about adding a comment there?
>
> /*
>  * The warning is expected when the dirty ring is configured,
>  * but not enabled.
>  */

That's not correct either. By design, the warning can also fire if the dirty ring
is enabled. KVM's rule is that writes to guest memory always need to be done in
the context of a running vCPU, with the recently added exception of
kvm_arch_allow_write_without_running_vcpu(). The intent of the warning is to
enforce that rule regardless of the state of the VM.

Concretely, I think you can just drop patches 3 and 4, and just fix the arm64 issues.

2023-01-19 23:38:22

by Gavin Shan

Subject: Re: [PATCH 4/4] KVM: Improve warning report in mark_page_dirty_in_slot()

Hi Sean,

On 1/20/23 2:19 AM, Sean Christopherson wrote:
> On Thu, Jan 19, 2023, Gavin Shan wrote:
>> On 1/18/23 2:42 AM, Sean Christopherson wrote:
>>> On Mon, Jan 16, 2023, Gavin Shan wrote:
>>>> There are two warning reports about the dirty ring in this function.
>>>> They rest on the wrong assumption that the dirty ring is always enabled
>>>> when CONFIG_HAVE_KVM_DIRTY_RING is selected.
>>>
>>> No, it's not a wrong assumption, because it's not an assumption. The intent is
>>> to warn irrespective of dirty ring/log enabling. The original code actually warned
>>> irrespective of dirty ring support[1], again intentionally. The
>>> CONFIG_HAVE_KVM_DIRTY_RING check was added because s390 can mark pages dirty from
>>> a worker thread[2] and s390 has no plans to support the dirty ring.
>>>
>>> The reason for warning even if dirty ring isn't enabled is so that bots can catch
>>> potential KVM bugs without having to set up a dirty ring or enable dirty logging.
>>>
>>> [1] 2efd61a608b0 ("KVM: Warn if mark_page_dirty() is called without an active vCPU")
>>> [2] e09fccb5435d ("KVM: avoid warning on s390 in mark_page_dirty")
>>>
>>
>> Thanks for the links. I was confused when looking at the code, but now it's clear
>> to me. Thanks for your explanation. How about adding a comment there?
>>
>> /*
>>  * The warning is expected when the dirty ring is configured,
>>  * but not enabled.
>>  */
>
> That's not correct either. By design, the warning can also fire if the dirty ring
> is enabled. KVM's rule is that writes to guest memory always need to be done in
> the context of a running vCPU, with the recently added exception of
> kvm_arch_allow_write_without_running_vcpu(). The intent of the warning is to
> enforce that rule regardless of the state of the VM.
>
> Concretely, I think you can just drop patches 3 and 4, and just fix the arm64 issues.
>

Right, the warning is still expected when the dirty ring is enabled. My intention
was to add a comment for the confusing case. Anyway, it's not a big deal. I will
drop PATCH[3] and PATCH[4] in v2.

Thanks,
Gavin

2023-01-19 23:42:34

by Gavin Shan

Subject: Re: [PATCH 1/4] KVM: arm64: Allow saving vgic3 LPI pending status in no running vcpu context

Hi Marc,

On 1/20/23 2:47 AM, Marc Zyngier wrote:
> On Thu, 19 Jan 2023 01:11:44 +0000,
> Gavin Shan <[email protected]> wrote:
>>
>> I will have vgic_write_guest_lock() in v2. Note that those 3 paths can't
>> run in parallel, since one switch is shared by them. Alternatively, we
>> could extend struct vgic_dist::save_tables_in_progress from 'bool' to
>> 'unsigned long', with one bit defined for each site as below. In this
>> way, the 3 paths could run in parallel:
>>
>> unsigned long struct vgic_dist::save_tables_in_progress
>>
>> #define VGIC_DIST_SAVE_ITS_ITE 0 /* ITS Translation Entry */
>> #define VGIC_DIST_SAVE_ITS_DTE 1 /* ITS Device Table Entry */
>> #define VGIC_DIST_SAVE_ITS_CTE 2 /* ITS Collection Table Entry */
>> #define VGIC_DIST_SAVE_ITS_CT 3 /* ITS Collection Table */
>> #define VGIC_DIST_SAVE_VGIC3_LPI 4 /* VGIC3 LPI Pending Status */
>> #define VGIC_DIST_SAVE_VGIC3_PENDING_TABLE 5 /* VGIC3 Pending Table */
>>
>> The drawback is that the number of sites is limited to 64. If those 3
>> paths never run in parallel, we don't need the extension at all.
>
> It should all be completely sequential. KVM_DEV_ARM_ITS_SAVE_TABLES
> runs in a context where everything is locked, and so is
> VGIC_DIST_SAVE_VGIC3_PENDING_TABLE.
>

Thanks for your confirmation. Yeah, it's sequential because 'kvm->lock' is
held for KVM_DEV_ARM_ITS_SAVE_TABLES and VGIC_DIST_SAVE_VGIC3_PENDING_TABLE.
So it's all good to have one shared switch. v2 will be posted pretty soon.

Thanks,
Gavin