While writing the vgic v3 init sequence KVM selftests, I noticed some
relatively minor issues. This was also an opportunity to try to
fix the issue recently reported by Zenghui, related to the emulation
of the GICR_TYPER Last bit. The final patch is a first batch of VGIC
init sequence selftests. They can of course be extended with many more
register access tests, but let's try to move forward incrementally.
Best Regards
Eric
This series can be found at:
https://github.com/eauger/linux/tree/vgic_kvmselftests_v4
History:
v3 -> v4:
- took into account Drew's comments on the KVM selftests. No
change to the KVM-related patches compared to v3
v2 -> v3:
- reworked last bit read accessor to handle contiguous redist
regions and rdist not registered in ascending order
- removed [PATCH 5/9] KVM: arm: move has_run_once after the
map_resources
v1 -> v2:
- Took into account all comments from Marc and Alexandru, except
that has_run_once is still set after map_resources (changing this
would require me to revisit the selftests in depth)
Eric Auger (8):
KVM: arm64: vgic-v3: Fix some error codes when setting RDIST base
KVM: arm64: Fix KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION read
KVM: arm64: vgic-v3: Fix error handling in vgic_v3_set_redist_base()
KVM: arm/arm64: vgic: Reset base address on kvm_vgic_dist_destroy()
docs: kvm: devices/arm-vgic-v3: enhance KVM_DEV_ARM_VGIC_CTRL_INIT doc
KVM: arm64: Simplify argument passing to vgic_uaccess_[read|write]
KVM: arm64: vgic-v3: Expose GICR_TYPER.Last for userspace
KVM: selftests: aarch64/vgic-v3 init sequence tests
.../virt/kvm/devices/arm-vgic-v3.rst | 2 +-
arch/arm64/kvm/vgic/vgic-init.c | 13 +-
arch/arm64/kvm/vgic/vgic-kvm-device.c | 3 +
arch/arm64/kvm/vgic/vgic-mmio-v3.c | 116 +++-
arch/arm64/kvm/vgic/vgic-mmio.c | 10 +-
arch/arm64/kvm/vgic/vgic.h | 1 +
include/kvm/arm_vgic.h | 3 +
tools/testing/selftests/kvm/.gitignore | 1 +
tools/testing/selftests/kvm/Makefile | 1 +
.../testing/selftests/kvm/aarch64/vgic_init.c | 652 ++++++++++++++++++
.../testing/selftests/kvm/include/kvm_util.h | 9 +
tools/testing/selftests/kvm/lib/kvm_util.c | 77 +++
12 files changed, 838 insertions(+), 50 deletions(-)
create mode 100644 tools/testing/selftests/kvm/aarch64/vgic_init.c
--
2.26.3
The doc says:
"The characteristics of a specific redistributor region can
be read by presetting the index field in the attr data.
Only valid for KVM_DEV_TYPE_ARM_VGIC_V3"
Unfortunately the existing code fails to read the input attr data.
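To illustrate the read path described above: userspace presets the index
in the 64-bit attr data and expects the remaining fields to be filled in
on return. The sketch below packs and unpacks such a value using the same
layout as the REDIST_REGION_ATTR_ADDR macro in the selftest of this
series (count in bits 63:52, base in bits 51:16, flags in bits 15:12,
index in bits 11:0). The struct and helper names are made up for
illustration and are not kernel or selftest API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container; field layout mirrors REDIST_REGION_ATTR_ADDR. */
struct redist_region_attr {
	uint64_t base;   /* 64kB-aligned guest physical base address */
	uint32_t count;  /* number of redistributors in the region */
	uint32_t flags;  /* expected to be 0 */
	uint32_t index;  /* region index */
};

#define REDIST_BASE_MASK 0x000FFFFFFFFF0000ULL /* bits 51:16 */

/* Build the 64-bit attr data value from its fields. */
static uint64_t redist_region_pack(struct redist_region_attr r)
{
	assert(r.count < (1u << 12) && r.flags < (1u << 4) &&
	       r.index < (1u << 12));
	return ((uint64_t)r.count << 52) | (r.base & REDIST_BASE_MASK) |
	       ((uint64_t)r.flags << 12) | r.index;
}

/* Split a 64-bit attr data value back into its fields. */
static struct redist_region_attr redist_region_unpack(uint64_t v)
{
	struct redist_region_attr r = {
		.base  = v & REDIST_BASE_MASK,
		.count = (uint32_t)(v >> 52),
		.flags = (uint32_t)((v >> 12) & 0xF),
		.index = (uint32_t)(v & 0xFFF),
	};

	return r;
}
```

For a read, userspace would pack a value with only the index set, pass
its address in kvm_device_attr.addr, and unpack the value written back.
Without the fix, the preset index never reaches the kernel-side lookup.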
Fixes: 04c110932225 ("KVM: arm/arm64: Implement KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION")
Cc: [email protected] # v4.17+
Signed-off-by: Eric Auger <[email protected]>
Reviewed-by: Alexandru Elisei <[email protected]>
---
v1 -> v2:
- in the commit message, remove the statement that the index is always 0
- add Alexandru's R-b
---
arch/arm64/kvm/vgic/vgic-kvm-device.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/arm64/kvm/vgic/vgic-kvm-device.c b/arch/arm64/kvm/vgic/vgic-kvm-device.c
index 44419679f91ad..2f66cf2472825 100644
--- a/arch/arm64/kvm/vgic/vgic-kvm-device.c
+++ b/arch/arm64/kvm/vgic/vgic-kvm-device.c
@@ -226,6 +226,9 @@ static int vgic_get_common_attr(struct kvm_device *dev,
u64 addr;
unsigned long type = (unsigned long)attr->attr;
+ if (copy_from_user(&addr, uaddr, sizeof(addr)))
+ return -EFAULT;
+
r = kvm_vgic_addr(dev->kvm, type, &addr, false);
if (r)
return (r == -ENODEV) ? -ENXIO : r;
--
2.26.3
vgic_v3_insert_redist_region() may succeed while
vgic_register_all_redist_iodevs() fails. For example, this happens
when adding a redistributor region that overlaps a dist region. The
failure is only detected in vgic_register_all_redist_iodevs(), when
vgic_v3_check_base() gets called from vgic_register_redist_iodev().
In such a case, remove the newly added redistributor region and free
it.
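The rollback pattern described above can be sketched in isolation. This
is not the kernel code: the list is a plain singly linked list, all names
are hypothetical, and the register_ok parameter merely stands in for the
outcome of the later registration step.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-in for a redistributor region list entry. */
struct region {
	unsigned int index;
	struct region *next;
};

static struct region *regions; /* list head, newest entry first */

/* Step 1: add a region to the list (stand-in for the insert step). */
static int insert_region(unsigned int index)
{
	struct region *r = malloc(sizeof(*r));

	if (!r)
		return -ENOMEM;
	r->index = index;
	r->next = regions;
	regions = r;
	return 0;
}

/* Undo step 1: unlink the region with this index and free it. */
static void remove_region(unsigned int index)
{
	struct region **pp = &regions;

	while (*pp) {
		if ((*pp)->index == index) {
			struct region *victim = *pp;

			*pp = victim->next;
			free(victim);
			return;
		}
		pp = &(*pp)->next;
	}
}

/*
 * register_ok is a hypothetical knob: it models the registration step
 * failing after the insert has already succeeded.
 */
static int set_region(unsigned int index, bool register_ok)
{
	int ret = insert_region(index);

	if (ret)
		return ret;
	if (!register_ok) {
		remove_region(index); /* roll back so no stale entry remains */
		return -EINVAL;
	}
	return 0;
}

/* Count the regions currently on the list. */
static unsigned int region_count(void)
{
	unsigned int n = 0;

	for (struct region *r = regions; r; r = r->next)
		n++;
	return n;
}
```

The point of the rollback is that a failed call leaves the list as it was
before the call, so the same index can be registered again afterwards.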
Signed-off-by: Eric Auger <[email protected]>
---
v1 -> v2:
- fix the commit message and split declaration/assignment of rdreg
---
arch/arm64/kvm/vgic/vgic-mmio-v3.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
index 013b737b658f8..987e366c80008 100644
--- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
@@ -860,8 +860,14 @@ int vgic_v3_set_redist_base(struct kvm *kvm, u32 index, u64 addr, u32 count)
* afterwards will register the iodevs when needed.
*/
ret = vgic_register_all_redist_iodevs(kvm);
- if (ret)
+ if (ret) {
+ struct vgic_redist_region *rdreg;
+
+ rdreg = vgic_v3_rdist_region_from_index(kvm, index);
+ list_del(&rdreg->list);
+ kfree(rdreg);
return ret;
+ }
return 0;
}
--
2.26.3
On kvm_vgic_dist_destroy(), the base addresses are not reset. However,
for KVM selftest purposes, resetting them would allow test execution to
continue even after a KVM_RUN failure. So let's reset the
base addresses.
Signed-off-by: Eric Auger <[email protected]>
---
v1 -> v2:
- use dist-> in the else and add braces
---
arch/arm64/kvm/vgic/vgic-init.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/vgic/vgic-init.c b/arch/arm64/kvm/vgic/vgic-init.c
index 052917deb1495..cf6faa0aeddb2 100644
--- a/arch/arm64/kvm/vgic/vgic-init.c
+++ b/arch/arm64/kvm/vgic/vgic-init.c
@@ -335,13 +335,16 @@ static void kvm_vgic_dist_destroy(struct kvm *kvm)
kfree(dist->spis);
dist->spis = NULL;
dist->nr_spis = 0;
+ dist->vgic_dist_base = VGIC_ADDR_UNDEF;
- if (kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3) {
+ if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3) {
list_for_each_entry_safe(rdreg, next, &dist->rd_regions, list) {
list_del(&rdreg->list);
kfree(rdreg);
}
INIT_LIST_HEAD(&dist->rd_regions);
+ } else {
+ dist->vgic_cpu_base = VGIC_ADDR_UNDEF;
}
if (vgic_has_its(kvm))
@@ -362,6 +365,7 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
vgic_flush_pending_lpis(vcpu);
INIT_LIST_HEAD(&vgic_cpu->ap_list_head);
+ vgic_cpu->rd_iodev.base_addr = VGIC_ADDR_UNDEF;
}
/* To be called with kvm->lock held */
--
2.26.3
kvm_arch_vcpu_precreate() returns -EBUSY if the vgic is
already initialized. So let's document that KVM_DEV_ARM_VGIC_CTRL_INIT
must be called after all VCPUs have been created.
Signed-off-by: Eric Auger <[email protected]>
---
v1 -> v2:
- Must be called after all vcpu creations ->
Must be called after all VCPUs have been created as per
Alexandru's suggestion
---
Documentation/virt/kvm/devices/arm-vgic-v3.rst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Documentation/virt/kvm/devices/arm-vgic-v3.rst b/Documentation/virt/kvm/devices/arm-vgic-v3.rst
index 5dd3bff519783..51e5e57625716 100644
--- a/Documentation/virt/kvm/devices/arm-vgic-v3.rst
+++ b/Documentation/virt/kvm/devices/arm-vgic-v3.rst
@@ -228,7 +228,7 @@ Groups:
KVM_DEV_ARM_VGIC_CTRL_INIT
request the initialization of the VGIC, no additional parameter in
- kvm_device_attr.addr.
+ kvm_device_attr.addr. Must be called after all VCPUs have been created.
KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES
save all LPI pending bits into guest RAM pending tables.
--
2.26.3
The tests exercise the VGIC_V3 device creation including the
associated KVM_DEV_ARM_VGIC_GRP_ADDR group attributes:
- KVM_VGIC_V3_ADDR_TYPE_DIST/REDIST
- KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION
Some other tests are dedicated to the KVM_DEV_ARM_VGIC_GRP_REDIST_REGS
group, especially the GICR_TYPER read. The goal was to test the case
recently fixed by commit 23bde34771f1
("KVM: arm64: vgic-v3: Drop the reporting of GICR_TYPER.Last for userspace").
The API under test can be found at
Documentation/virt/kvm/devices/arm-vgic-v3.rst
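As a reading aid for the expected values used in the TYPER tests (for
example 0x310 or 0x510): the low 32 bits of GICR_TYPER carry
Processor_Number in bits [23:8] and Last in bit 4. The helpers below are
an illustrative sketch with made-up names; they are not part of the
selftest.

```c
#include <stdbool.h>
#include <stdint.h>

#define TYPER_LAST           (1u << 4) /* GICR_TYPER.Last */
#define TYPER_PROC_NUM_SHIFT 8         /* GICR_TYPER.Processor_Number */
#define TYPER_PROC_NUM_MASK  0xFFFFu

/* Compose the low word of GICR_TYPER from its two fields. */
static uint32_t typer_make(uint32_t proc_num, bool last)
{
	return ((proc_num & TYPER_PROC_NUM_MASK) << TYPER_PROC_NUM_SHIFT) |
	       (last ? TYPER_LAST : 0);
}

/* Extract Processor_Number from a GICR_TYPER low word. */
static uint32_t typer_processor_number(uint32_t typer)
{
	return (typer >> TYPER_PROC_NUM_SHIFT) & TYPER_PROC_NUM_MASK;
}

/* Tell whether the Last bit is set. */
static bool typer_is_last(uint32_t typer)
{
	return typer & TYPER_LAST;
}
```

With these, 0x310 reads as "processor number 3, last redistributor in
its region", while 0x200 reads as "processor number 2, not last".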
Signed-off-by: Eric Auger <[email protected]>
---
v3 -> v4:
- update .gitignore
- Move the vgic-mmio-v3.c change into the previous patch
- rename fuzz_dist_rdist into test_dist_rdist
- cleanup in run_vcpu and guest_code
- max_ipa_bits is global
- s/fuzz/subtest
- added test_kvm_device
- moved ucall_init() just before the cpu run
- use vm_create_default_with_vcpus
- use vm_gic struct, vm_gic_create, vm_gic_destroy
- rewrite util.c helpers to comply with the usual style
---
tools/testing/selftests/kvm/.gitignore | 1 +
tools/testing/selftests/kvm/Makefile | 1 +
.../testing/selftests/kvm/aarch64/vgic_init.c | 652 ++++++++++++++++++
.../testing/selftests/kvm/include/kvm_util.h | 9 +
tools/testing/selftests/kvm/lib/kvm_util.c | 77 +++
5 files changed, 740 insertions(+)
create mode 100644 tools/testing/selftests/kvm/aarch64/vgic_init.c
diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore
index 7bd7e776c266a..bb862f91f6409 100644
--- a/tools/testing/selftests/kvm/.gitignore
+++ b/tools/testing/selftests/kvm/.gitignore
@@ -1,6 +1,7 @@
# SPDX-License-Identifier: GPL-2.0-only
/aarch64/get-reg-list
/aarch64/get-reg-list-sve
+/aarch64/vgic_init
/s390x/memop
/s390x/resets
/s390x/sync_regs_test
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 67eebb53235fd..2fd4801de9ca7 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -78,6 +78,7 @@ TEST_GEN_PROGS_x86_64 += steal_time
TEST_GEN_PROGS_aarch64 += aarch64/get-reg-list
TEST_GEN_PROGS_aarch64 += aarch64/get-reg-list-sve
+TEST_GEN_PROGS_aarch64 += aarch64/vgic_init
TEST_GEN_PROGS_aarch64 += demand_paging_test
TEST_GEN_PROGS_aarch64 += dirty_log_test
TEST_GEN_PROGS_aarch64 += dirty_log_perf_test
diff --git a/tools/testing/selftests/kvm/aarch64/vgic_init.c b/tools/testing/selftests/kvm/aarch64/vgic_init.c
new file mode 100644
index 0000000000000..04e29c4d3e065
--- /dev/null
+++ b/tools/testing/selftests/kvm/aarch64/vgic_init.c
@@ -0,0 +1,652 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * vgic init sequence tests
+ *
+ * Copyright (C) 2020, Red Hat, Inc.
+ */
+#define _GNU_SOURCE
+#include <linux/kernel.h>
+#include <sys/syscall.h>
+#include <asm/kvm.h>
+#include <asm/kvm_para.h>
+
+#include "test_util.h"
+#include "kvm_util.h"
+#include "processor.h"
+
+#define NR_VCPUS 4
+
+#define REDIST_REGION_ATTR_ADDR(count, base, flags, index) (((uint64_t)(count) << 52) | \
+ ((uint64_t)((base) >> 16) << 16) | ((uint64_t)(flags) << 12) | (index))
+#define REG_OFFSET(vcpu, offset) (((uint64_t)(vcpu) << 32) | (offset))
+
+#define GICR_TYPER 0x8
+
+struct vm_gic {
+ struct kvm_vm *vm;
+ int gic_fd;
+};
+
+int max_ipa_bits;
+
+/* helper to access a redistributor register */
+static int access_redist_reg(int gicv3_fd, int vcpu, int offset,
+ uint32_t *val, bool write)
+{
+ uint64_t attr = REG_OFFSET(vcpu, offset);
+
+ return _kvm_device_access(gicv3_fd, KVM_DEV_ARM_VGIC_GRP_REDIST_REGS,
+ attr, val, write);
+}
+
+/* dummy guest code */
+static void guest_code(void)
+{
+ GUEST_SYNC(0);
+ GUEST_SYNC(1);
+ GUEST_SYNC(2);
+ GUEST_DONE();
+}
+
+/* we don't want to assert on run execution, hence that helper */
+static int run_vcpu(struct kvm_vm *vm, uint32_t vcpuid)
+{
+ int ret;
+
+ vcpu_args_set(vm, vcpuid, 1);
+ ret = _vcpu_ioctl(vm, vcpuid, KVM_RUN, NULL);
+ get_ucall(vm, vcpuid, NULL);
+
+ if (ret)
+ return -errno;
+ return 0;
+}
+
+static struct vm_gic vm_gic_create(void)
+{
+ struct vm_gic v;
+
+ v.vm = vm_create_default_with_vcpus(NR_VCPUS, 0, 0, guest_code, NULL);
+ v.gic_fd = kvm_create_device(v.vm, KVM_DEV_TYPE_ARM_VGIC_V3, false);
+ TEST_ASSERT(v.gic_fd > 0, "GICv3 device created");
+
+ return v;
+}
+
+static void vm_gic_destroy(struct vm_gic *v)
+{
+ close(v->gic_fd);
+ kvm_vm_free(v->vm);
+}
+
+/**
+ * Helper routine that performs KVM device tests in general and
+ * especially ARM_VGIC_V3 ones. Eventually the ARM_VGIC_V3
+ * device gets created, a legacy RDIST region is set at @0x0
+ * and a DIST region is set @0x60000
+ */
+static void subtest_dist_rdist(struct vm_gic *v)
+{
+ int ret;
+ uint64_t addr;
+
+ /* Check existing group/attributes */
+ ret = _kvm_device_check_attr(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_DIST);
+ TEST_ASSERT(!ret, "KVM_DEV_ARM_VGIC_GRP_ADDR/KVM_VGIC_V3_ADDR_TYPE_DIST supported");
+
+ ret = _kvm_device_check_attr(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST);
+ TEST_ASSERT(!ret, "KVM_DEV_ARM_VGIC_GRP_ADDR/KVM_VGIC_V3_ADDR_TYPE_REDIST supported");
+
+ /* check non existing attribute */
+ ret = _kvm_device_check_attr(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR, 0);
+ TEST_ASSERT(ret == -ENXIO, "attribute not supported");
+
+ /* misaligned DIST and REDIST address settings */
+ addr = 0x1000;
+ ret = _kvm_device_access(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_DIST, &addr, true);
+ TEST_ASSERT(ret == -EINVAL, "GICv3 dist base not 64kB aligned");
+
+ ret = _kvm_device_access(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST, &addr, true);
+ TEST_ASSERT(ret == -EINVAL, "GICv3 redist base not 64kB aligned");
+
+ /* out of range address */
+ if (max_ipa_bits) {
+ addr = 1ULL << max_ipa_bits;
+ ret = _kvm_device_access(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_DIST, &addr, true);
+ TEST_ASSERT(ret == -E2BIG, "dist address beyond IPA limit");
+
+ ret = _kvm_device_access(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST, &addr, true);
+ TEST_ASSERT(ret == -E2BIG, "redist address beyond IPA limit");
+ }
+
+ /* set REDIST base address @0x0 */
+ addr = 0x00000;
+ ret = _kvm_device_access(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST, &addr, true);
+ TEST_ASSERT(!ret, "GICv3 redist base set");
+
+ /* Attempt to create a second legacy redistributor region */
+ addr = 0xE0000;
+ ret = _kvm_device_access(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST, &addr, true);
+ TEST_ASSERT(ret == -EEXIST, "GICv3 redist base set again");
+
+ /* Attempt to mix legacy and new redistributor regions */
+ addr = REDIST_REGION_ATTR_ADDR(NR_VCPUS, 0x100000, 0, 0);
+ ret = _kvm_device_access(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION, &addr, true);
+ TEST_ASSERT(ret == -EINVAL, "attempt to mix GICv3 REDIST and REDIST_REGION");
+
+ /*
+ * Set overlapping DIST / REDIST, cannot be detected here. Will be detected
+ * on first vcpu run instead.
+ */
+ addr = 3 * 2 * 0x10000;
+ ret = _kvm_device_access(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR, KVM_VGIC_V3_ADDR_TYPE_DIST,
+ &addr, true);
+ TEST_ASSERT(!ret, "dist overlapping rdist");
+}
+
+/* Test the new REDIST region API */
+static void subtest_redist_regions(struct vm_gic *v)
+{
+ uint64_t addr, expected_addr;
+ int ret;
+
+ ret = kvm_device_check_attr(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION);
+ TEST_ASSERT(!ret, "Multiple redist regions advertised");
+
+ addr = REDIST_REGION_ATTR_ADDR(NR_VCPUS, 0x100000, 2, 0);
+ ret = _kvm_device_access(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION, &addr, true);
+ TEST_ASSERT(ret == -EINVAL, "redist region attr value with flags != 0");
+
+ addr = REDIST_REGION_ATTR_ADDR(0, 0x100000, 0, 0);
+ ret = _kvm_device_access(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION, &addr, true);
+ TEST_ASSERT(ret == -EINVAL, "redist region attr value with count == 0");
+
+ addr = REDIST_REGION_ATTR_ADDR(2, 0x200000, 0, 1);
+ ret = _kvm_device_access(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION, &addr, true);
+ TEST_ASSERT(ret == -EINVAL, "attempt to register the first rdist region with index != 0");
+
+ addr = REDIST_REGION_ATTR_ADDR(2, 0x201000, 0, 1);
+ ret = _kvm_device_access(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION, &addr, true);
+ TEST_ASSERT(ret == -EINVAL, "rdist region with misaligned address");
+
+ addr = REDIST_REGION_ATTR_ADDR(2, 0x200000, 0, 0);
+ ret = _kvm_device_access(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION, &addr, true);
+ TEST_ASSERT(!ret, "First valid redist region with 2 rdist @ 0x200000, index 0");
+
+ addr = REDIST_REGION_ATTR_ADDR(2, 0x200000, 0, 1);
+ ret = _kvm_device_access(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION, &addr, true);
+ TEST_ASSERT(ret == -EINVAL, "register an rdist region with already used index");
+
+ addr = REDIST_REGION_ATTR_ADDR(1, 0x210000, 0, 2);
+ ret = _kvm_device_access(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION, &addr, true);
+ TEST_ASSERT(ret == -EINVAL, "register an rdist region overlapping with another one");
+
+ addr = REDIST_REGION_ATTR_ADDR(1, 0x240000, 0, 2);
+ ret = _kvm_device_access(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION, &addr, true);
+ TEST_ASSERT(ret == -EINVAL, "register redist region with index not +1");
+
+ addr = REDIST_REGION_ATTR_ADDR(1, 0x240000, 0, 1);
+ ret = _kvm_device_access(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION, &addr, true);
+ TEST_ASSERT(!ret, "register valid redist region with 1 rdist @ 0x240000, index 1");
+
+ addr = REDIST_REGION_ATTR_ADDR(1, 1ULL << max_ipa_bits, 0, 2);
+ ret = _kvm_device_access(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION, &addr, true);
+ TEST_ASSERT(ret == -E2BIG, "register redist region with base address beyond IPA range");
+
+ addr = 0x260000;
+ ret = _kvm_device_access(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST, &addr, true);
+ TEST_ASSERT(ret == -EINVAL, "Mix KVM_VGIC_V3_ADDR_TYPE_REDIST and REDIST_REGION");
+
+ /*
+ * Now there are 2 redist regions:
+ * region 0 @ 0x200000 2 redists
+ * region 1 @ 0x240000 1 redist
+ * now attempt to read their characteristics
+ */
+
+ addr = REDIST_REGION_ATTR_ADDR(0, 0, 0, 0);
+ expected_addr = REDIST_REGION_ATTR_ADDR(2, 0x200000, 0, 0);
+ ret = _kvm_device_access(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION, &addr, false);
+ TEST_ASSERT(!ret && addr == expected_addr, "read characteristics of region #0");
+
+ addr = REDIST_REGION_ATTR_ADDR(0, 0, 0, 1);
+ expected_addr = REDIST_REGION_ATTR_ADDR(1, 0x240000, 0, 1);
+ ret = _kvm_device_access(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION, &addr, false);
+ TEST_ASSERT(!ret && addr == expected_addr, "read characteristics of region #1");
+
+ addr = REDIST_REGION_ATTR_ADDR(0, 0, 0, 2);
+ ret = _kvm_device_access(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION, &addr, false);
+ TEST_ASSERT(ret == -ENOENT, "read characteristics of non existing region");
+
+ addr = 0x260000;
+ ret = _kvm_device_access(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_DIST, &addr, true);
+ TEST_ASSERT(!ret, "set dist region");
+
+ addr = REDIST_REGION_ATTR_ADDR(1, 0x260000, 0, 2);
+ ret = _kvm_device_access(v->gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION, &addr, true);
+ TEST_ASSERT(ret == -EINVAL, "register redist region colliding with dist");
+}
+
+/*
+ * VGIC KVM device is created and initialized before the secondary CPUs
+ * get created
+ */
+static void test_vgic_then_vcpus(void)
+{
+ struct vm_gic v;
+ int ret, i;
+
+ v.vm = vm_create_default(0, 0, guest_code);
+ v.gic_fd = kvm_create_device(v.vm, KVM_DEV_TYPE_ARM_VGIC_V3, false);
+ TEST_ASSERT(v.gic_fd > 0, "GICv3 device created");
+
+ subtest_dist_rdist(&v);
+
+ /* Add the rest of the VCPUs */
+ for (i = 1; i < NR_VCPUS; ++i)
+ vm_vcpu_add_default(v.vm, i, guest_code);
+
+ ucall_init(v.vm, NULL);
+ ret = run_vcpu(v.vm, 3);
+ TEST_ASSERT(ret == -EINVAL, "dist/rdist overlap detected on 1st vcpu run");
+
+ vm_gic_destroy(&v);
+}
+
+
+/* All the VCPUs are created before the VGIC KVM device gets initialized */
+static void test_vcpus_then_vgic(void)
+{
+ struct vm_gic v;
+ int ret;
+
+ v = vm_gic_create();
+
+ subtest_dist_rdist(&v);
+
+ ucall_init(v.vm, NULL);
+ ret = run_vcpu(v.vm, 3);
+ TEST_ASSERT(ret == -EINVAL, "dist/rdist overlap detected on 1st vcpu run");
+
+ vm_gic_destroy(&v);
+}
+
+static void test_new_redist_regions(void)
+{
+ void *dummy = NULL;
+ struct vm_gic v;
+ uint64_t addr;
+ int ret;
+
+ v = vm_gic_create();
+ subtest_redist_regions(&v);
+ ret = _kvm_device_access(v.gic_fd, KVM_DEV_ARM_VGIC_GRP_CTRL,
+ KVM_DEV_ARM_VGIC_CTRL_INIT, NULL, true);
+ TEST_ASSERT(!ret, "init the vgic");
+
+ ucall_init(v.vm, NULL);
+ ret = run_vcpu(v.vm, 3);
+ TEST_ASSERT(ret == -ENXIO, "running without sufficient number of rdists");
+ vm_gic_destroy(&v);
+
+ /* step 2 */
+
+ v = vm_gic_create();
+ subtest_redist_regions(&v);
+
+ addr = REDIST_REGION_ATTR_ADDR(1, 0x280000, 0, 2);
+ ret = _kvm_device_access(v.gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION, &addr, true);
+ TEST_ASSERT(!ret, "register a third region allowing to cover the 4 vcpus");
+
+ ucall_init(v.vm, NULL);
+ ret = run_vcpu(v.vm, 3);
+ TEST_ASSERT(ret == -EBUSY, "running without vgic explicit init");
+
+ vm_gic_destroy(&v);
+
+ /* step 3 */
+
+ v = vm_gic_create();
+ subtest_redist_regions(&v);
+
+ ret = _kvm_device_access(v.gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION, dummy, true);
+ TEST_ASSERT(ret == -EFAULT, "register a redist region with a NULL attr address");
+
+ addr = REDIST_REGION_ATTR_ADDR(1, 0x280000, 0, 2);
+ ret = _kvm_device_access(v.gic_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION, &addr, true);
+ TEST_ASSERT(!ret, "register a third region allowing to cover the 4 vcpus");
+
+ ret = _kvm_device_access(v.gic_fd, KVM_DEV_ARM_VGIC_GRP_CTRL,
+ KVM_DEV_ARM_VGIC_CTRL_INIT, NULL, true);
+ TEST_ASSERT(!ret, "init the vgic");
+
+ ucall_init(v.vm, NULL);
+ ret = run_vcpu(v.vm, 3);
+ TEST_ASSERT(!ret, "vcpu run");
+
+ vm_gic_destroy(&v);
+}
+
+static void test_typer_accesses(void)
+{
+ int ret, i, gicv3_fd = -1;
+ uint64_t addr;
+ struct kvm_vm *vm;
+ uint32_t val;
+
+ vm = vm_create_default(0, 0, guest_code);
+
+ gicv3_fd = kvm_create_device(vm, KVM_DEV_TYPE_ARM_VGIC_V3, false);
+ TEST_ASSERT(gicv3_fd >= 0, "VGIC_V3 device created");
+
+ vm_vcpu_add_default(vm, 3, guest_code);
+
+ ret = access_redist_reg(gicv3_fd, 1, GICR_TYPER, &val, false);
+ TEST_ASSERT(ret == -EINVAL, "attempting to read GICR_TYPER of non created vcpu");
+
+ vm_vcpu_add_default(vm, 1, guest_code);
+
+ ret = access_redist_reg(gicv3_fd, 1, GICR_TYPER, &val, false);
+ TEST_ASSERT(ret == -EBUSY, "read GICR_TYPER before GIC initialized");
+
+ vm_vcpu_add_default(vm, 2, guest_code);
+
+ ret = _kvm_device_access(gicv3_fd, KVM_DEV_ARM_VGIC_GRP_CTRL,
+ KVM_DEV_ARM_VGIC_CTRL_INIT, NULL, true);
+ TEST_ASSERT(!ret, "init the vgic after the vcpu creations");
+
+ for (i = 0; i < NR_VCPUS ; i++) {
+ ret = access_redist_reg(gicv3_fd, i, GICR_TYPER, &val, false);
+ TEST_ASSERT(!ret && val == i * 0x100, "read GICR_TYPER before rdist region setting");
+ }
+
+ addr = REDIST_REGION_ATTR_ADDR(2, 0x200000, 0, 0);
+ ret = _kvm_device_access(gicv3_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION, &addr, true);
+ TEST_ASSERT(!ret, "first rdist region with a capacity of 2 rdists");
+
+ /* The 2 first rdists should be put there (vcpu 0 and 3) */
+ ret = access_redist_reg(gicv3_fd, 0, GICR_TYPER, &val, false);
+ TEST_ASSERT(!ret && !val, "read typer of rdist #0");
+
+ ret = access_redist_reg(gicv3_fd, 3, GICR_TYPER, &val, false);
+ TEST_ASSERT(!ret && val == 0x310, "read typer of rdist #1");
+
+ addr = REDIST_REGION_ATTR_ADDR(10, 0x100000, 0, 1);
+ ret = _kvm_device_access(gicv3_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION, &addr, true);
+ TEST_ASSERT(ret == -EINVAL, "collision with previous rdist region");
+
+ ret = access_redist_reg(gicv3_fd, 1, GICR_TYPER, &val, false);
+ TEST_ASSERT(!ret && val == 0x100,
+ "no redist region attached to vcpu #1 yet, last cannot be returned");
+
+ ret = access_redist_reg(gicv3_fd, 2, GICR_TYPER, &val, false);
+ TEST_ASSERT(!ret && val == 0x200,
+ "no redist region attached to vcpu #2, last cannot be returned");
+
+ addr = REDIST_REGION_ATTR_ADDR(10, 0x20000, 0, 1);
+ ret = _kvm_device_access(gicv3_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION, &addr, true);
+ TEST_ASSERT(!ret, "second rdist region");
+
+ ret = access_redist_reg(gicv3_fd, 1, GICR_TYPER, &val, false);
+ TEST_ASSERT(!ret && val == 0x100, "read typer of rdist #1");
+
+ ret = access_redist_reg(gicv3_fd, 2, GICR_TYPER, &val, false);
+ TEST_ASSERT(!ret && val == 0x210,
+ "read typer of rdist #2, last properly returned");
+
+ close(gicv3_fd);
+ kvm_vm_free(vm);
+}
+
+/**
+ * Test GICR_TYPER last bit with new redist regions
+ * 2 rdist regions that are contiguous
+ * rdist region #0 @0x200000 3 rdist capacity
+ * rdists: 0, 2 (Last), 1
+ * rdist region #1 @0x260000 10 rdist capacity
+ * rdists: 3, 5 (Last), 4 (Last)
+ */
+static void test_last_bit_1(void)
+{
+ uint32_t vcpuids[] = { 0, 2, 1, 3, 5, 4 };
+ int ret, gicv3_fd = -1;
+ uint64_t addr;
+ struct kvm_vm *vm;
+ uint32_t val;
+
+ vm = vm_create_default_with_vcpus(6, 0, 0, guest_code, vcpuids);
+
+ gicv3_fd = kvm_create_device(vm, KVM_DEV_TYPE_ARM_VGIC_V3, false);
+ TEST_ASSERT(gicv3_fd >= 0, "VGIC_V3 device created");
+
+ ret = access_redist_reg(gicv3_fd, 0, GICR_TYPER, &val, false);
+ TEST_ASSERT(ret, "read typer of rdist #0 before redist reg creation");
+
+ ret = _kvm_device_access(gicv3_fd, KVM_DEV_ARM_VGIC_GRP_CTRL,
+ KVM_DEV_ARM_VGIC_CTRL_INIT, NULL, true);
+ TEST_ASSERT(!ret, "init the vgic after the vcpu creations");
+
+ addr = REDIST_REGION_ATTR_ADDR(3, 0x200000, 0, 0);
+ ret = _kvm_device_access(gicv3_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION, &addr, true);
+ TEST_ASSERT(!ret, "rdist region #0 with a capacity of 3 rdists");
+
+ addr = REDIST_REGION_ATTR_ADDR(10, 0x260000, 0, 1);
+ ret = _kvm_device_access(gicv3_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION, &addr, true);
+ TEST_ASSERT(!ret, "rdist region #1 (10 rdists) contiguous with the 1st one");
+
+ /*
+ * rdist_region #0 should contain rdists 0, 2, 1
+ * rdist region #1 should contain rdists 3, 5, 4
+ */
+ ret = access_redist_reg(gicv3_fd, 0, GICR_TYPER, &val, false);
+ TEST_ASSERT(!ret && !val, "read typer of rdist #0");
+
+ ret = access_redist_reg(gicv3_fd, 2, GICR_TYPER, &val, false);
+ TEST_ASSERT(!ret && val == 0x210, "read typer of rdist #2");
+
+ ret = access_redist_reg(gicv3_fd, 1, GICR_TYPER, &val, false);
+ TEST_ASSERT(!ret && val == 0x100, "read typer of rdist #1");
+
+ ret = access_redist_reg(gicv3_fd, 3, GICR_TYPER, &val, false);
+ TEST_ASSERT(!ret && val == 0x300, "read typer of rdist #3");
+
+ ret = access_redist_reg(gicv3_fd, 5, GICR_TYPER, &val, false);
+ TEST_ASSERT(!ret && val == 0x510, "read typer of rdist #5");
+
+ ret = access_redist_reg(gicv3_fd, 4, GICR_TYPER, &val, false);
+ TEST_ASSERT(!ret && val == 0x410, "read typer of rdist #4");
+
+ close(gicv3_fd);
+ kvm_vm_free(vm);
+}
+
+/**
+ * Test GICR_TYPER last bit with new redist regions
+ * rdist regions #1 and #2 are contiguous
+ * rdist region #0 @0x100000 1 rdist capacity
+ * rdists: 0 (Last)
+ * rdist region #1 @0x240000 3 rdist capacity
+ * rdists: 3, 5 (Last), 4 (Last)
+ * rdist region #2 @0x200000 2 rdist capacity
+ * rdists: 1, 2
+ */
+static void test_last_bit_2(void)
+{
+ uint32_t vcpuids[] = { 0, 3, 5, 4, 1, 2 };
+ int ret, gicv3_fd;
+ uint64_t addr;
+ struct kvm_vm *vm;
+ uint32_t val;
+
+ vm = vm_create_default_with_vcpus(6, 0, 0, guest_code, vcpuids);
+
+ gicv3_fd = kvm_create_device(vm, KVM_DEV_TYPE_ARM_VGIC_V3, false);
+ TEST_ASSERT(gicv3_fd >= 0, "VGIC_V3 device created");
+
+ ret = _kvm_device_access(gicv3_fd, KVM_DEV_ARM_VGIC_GRP_CTRL,
+ KVM_DEV_ARM_VGIC_CTRL_INIT, NULL, true);
+ TEST_ASSERT(!ret, "init the vgic after the vcpu creations");
+
+ addr = REDIST_REGION_ATTR_ADDR(1, 0x100000, 0, 0);
+ ret = _kvm_device_access(gicv3_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION, &addr, true);
+ TEST_ASSERT(!ret, "rdist region #0 (1 rdist)");
+
+ addr = REDIST_REGION_ATTR_ADDR(3, 0x240000, 0, 1);
+ ret = _kvm_device_access(gicv3_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION, &addr, true);
+ TEST_ASSERT(!ret, "rdist region #1 (3 rdists) contiguous with #2");
+
+ addr = REDIST_REGION_ATTR_ADDR(2, 0x200000, 0, 2);
+ ret = _kvm_device_access(gicv3_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION, &addr, true);
+ TEST_ASSERT(!ret, "rdist region #2 with a capacity of 2 rdists");
+
+
+ ret = access_redist_reg(gicv3_fd, 0, GICR_TYPER, &val, false);
+ TEST_ASSERT(!ret && val == 0x010, "read typer of rdist #0");
+
+ ret = access_redist_reg(gicv3_fd, 1, GICR_TYPER, &val, false);
+ TEST_ASSERT(!ret && val == 0x100, "read typer of rdist #1");
+
+ ret = access_redist_reg(gicv3_fd, 2, GICR_TYPER, &val, false);
+ TEST_ASSERT(!ret && val == 0x200, "read typer of rdist #2");
+
+ ret = access_redist_reg(gicv3_fd, 3, GICR_TYPER, &val, false);
+ TEST_ASSERT(!ret && val == 0x300, "read typer of rdist #3");
+
+ ret = access_redist_reg(gicv3_fd, 5, GICR_TYPER, &val, false);
+ TEST_ASSERT(!ret && val == 0x510, "read typer of rdist #5");
+
+ ret = access_redist_reg(gicv3_fd, 4, GICR_TYPER, &val, false);
+ TEST_ASSERT(!ret && val == 0x410, "read typer of rdist #4");
+
+ close(gicv3_fd);
+ kvm_vm_free(vm);
+}
+
+/* Test last bit with legacy region */
+static void test_last_bit_3(void)
+{
+ uint32_t vcpuids[] = { 0, 3, 5, 4, 1, 2 };
+ int ret, gicv3_fd;
+ uint64_t addr;
+ struct kvm_vm *vm;
+ uint32_t val;
+
+ vm = vm_create_default_with_vcpus(6, 0, 0, guest_code, vcpuids);
+
+ gicv3_fd = kvm_create_device(vm, KVM_DEV_TYPE_ARM_VGIC_V3, false);
+ TEST_ASSERT(gicv3_fd >= 0, "VGIC_V3 device created");
+
+ ret = _kvm_device_access(gicv3_fd, KVM_DEV_ARM_VGIC_GRP_CTRL,
+ KVM_DEV_ARM_VGIC_CTRL_INIT, NULL, true);
+ TEST_ASSERT(!ret, "init the vgic after the vcpu creations");
+
+ addr = 0x10000;
+ ret = _kvm_device_access(gicv3_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+ KVM_VGIC_V3_ADDR_TYPE_REDIST, &addr, true);
+
+ ret = access_redist_reg(gicv3_fd, 0, GICR_TYPER, &val, false);
+ TEST_ASSERT(!ret && val == 0x000, "read typer of rdist #0");
+
+ ret = access_redist_reg(gicv3_fd, 3, GICR_TYPER, &val, false);
+ TEST_ASSERT(!ret && val == 0x300, "read typer of rdist #3");
+
+ ret = access_redist_reg(gicv3_fd, 5, GICR_TYPER, &val, false);
+ TEST_ASSERT(!ret && val == 0x510, "read typer of rdist #5");
+
+ ret = access_redist_reg(gicv3_fd, 1, GICR_TYPER, &val, false);
+ TEST_ASSERT(!ret && val == 0x100, "read typer of rdist #1");
+
+ ret = access_redist_reg(gicv3_fd, 2, GICR_TYPER, &val, false);
+ TEST_ASSERT(!ret && val == 0x210, "read typer of rdist #2");
+
+ close(gicv3_fd);
+ kvm_vm_free(vm);
+}
+
+void test_kvm_device(void)
+{
+ struct vm_gic v;
+ int ret;
+
+ v.vm = vm_create_default_with_vcpus(NR_VCPUS, 0, 0, guest_code, NULL);
+
+ /* try to create a non existing KVM device */
+ ret = _kvm_create_device(v.vm, 0, true);
+ TEST_ASSERT(ret == -ENODEV, "unsupported device");
+
+ /* trial mode with VGIC_V3 device */
+ ret = kvm_create_device(v.vm, KVM_DEV_TYPE_ARM_VGIC_V3, true);
+ if (ret) {
+ print_skip("GICv3 not supported");
+ exit(KSFT_SKIP);
+ }
+ v.gic_fd = kvm_create_device(v.vm, KVM_DEV_TYPE_ARM_VGIC_V3, false);
+ TEST_ASSERT(v.gic_fd, "create the GICv3 device");
+
+ ret = _kvm_create_device(v.vm, KVM_DEV_TYPE_ARM_VGIC_V3, false);
+ TEST_ASSERT(ret == -EEXIST, "create GICv3 device twice");
+
+ ret = kvm_create_device(v.vm, KVM_DEV_TYPE_ARM_VGIC_V3, true);
+ TEST_ASSERT(!ret, "create GICv3 in test mode while the same already is created");
+
+ if (!_kvm_create_device(v.vm, KVM_DEV_TYPE_ARM_VGIC_V2, true)) {
+ ret = kvm_create_device(v.vm, KVM_DEV_TYPE_ARM_VGIC_V2, false);
+ TEST_ASSERT(ret == -EINVAL, "create GICv2 while v3 exists");
+ }
+
+ vm_gic_destroy(&v);
+}
+
+int main(int ac, char **av)
+{
+ max_ipa_bits = kvm_check_cap(KVM_CAP_ARM_VM_IPA_SIZE);
+
+ test_kvm_device();
+ test_vcpus_then_vgic();
+ test_vgic_then_vcpus();
+ test_new_redist_regions();
+ test_typer_accesses();
+ test_last_bit_1();
+ test_last_bit_2();
+ test_last_bit_3();
+
+ return 0;
+}
+
diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h
index 0f4258eaa629e..2b4b325cde010 100644
--- a/tools/testing/selftests/kvm/include/kvm_util.h
+++ b/tools/testing/selftests/kvm/include/kvm_util.h
@@ -225,6 +225,15 @@ int vcpu_nested_state_set(struct kvm_vm *vm, uint32_t vcpuid,
#endif
void *vcpu_map_dirty_ring(struct kvm_vm *vm, uint32_t vcpuid);
+int _kvm_device_check_attr(int dev_fd, uint32_t group, uint64_t attr);
+int kvm_device_check_attr(int dev_fd, uint32_t group, uint64_t attr);
+int _kvm_create_device(struct kvm_vm *vm, uint64_t type, bool test);
+int kvm_create_device(struct kvm_vm *vm, uint64_t type, bool test);
+int _kvm_device_access(int dev_fd, uint32_t group, uint64_t attr,
+ void *val, bool write);
+int kvm_device_access(int dev_fd, uint32_t group, uint64_t attr,
+ void *val, bool write);
+
const char *exit_reason_str(unsigned int exit_reason);
void virt_pgd_alloc(struct kvm_vm *vm, uint32_t pgd_memslot);
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index b8849a1aca792..db2a252be9179 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -1733,6 +1733,83 @@ int _kvm_ioctl(struct kvm_vm *vm, unsigned long cmd, void *arg)
return ioctl(vm->kvm_fd, cmd, arg);
}
+/*
+ * Device Ioctl
+ */
+
+int _kvm_device_check_attr(int dev_fd, uint32_t group, uint64_t attr)
+{
+ struct kvm_device_attr attribute = {
+ .group = group,
+ .attr = attr,
+ .flags = 0,
+ };
+ int ret = ioctl(dev_fd, KVM_HAS_DEVICE_ATTR, &attribute);
+
+ if (ret == -1)
+ return -errno;
+ return 0;
+}
+
+int kvm_device_check_attr(int dev_fd, uint32_t group, uint64_t attr)
+{
+ int ret = _kvm_device_check_attr(dev_fd, group, attr);
+
+ TEST_ASSERT(ret >= 0, "KVM_HAS_DEVICE_ATTR failed, errno: %i", errno);
+ return ret;
+}
+
+int _kvm_create_device(struct kvm_vm *vm, uint64_t type, bool test)
+{
+ struct kvm_create_device create_dev;
+ int ret;
+
+ create_dev.type = type;
+ create_dev.fd = -1;
+ create_dev.flags = test ? KVM_CREATE_DEVICE_TEST : 0;
+ ret = ioctl(vm_get_fd(vm), KVM_CREATE_DEVICE, &create_dev);
+ if (ret == -1)
+ return -errno;
+ return test ? 0 : create_dev.fd;
+}
+
+int kvm_create_device(struct kvm_vm *vm, uint64_t type, bool test)
+{
+ int ret = _kvm_create_device(vm, type, test);
+
+ TEST_ASSERT(ret >= 0, "KVM_CREATE_DEVICE IOCTL failed,\n"
+ " errno: %i", errno);
+ return ret;
+}
+
+int _kvm_device_access(int dev_fd, uint32_t group, uint64_t attr,
+ void *val, bool write)
+{
+ struct kvm_device_attr kvmattr = {
+ .group = group,
+ .attr = attr,
+ .flags = 0,
+ .addr = (uintptr_t)val,
+ };
+ int ret;
+
+ ret = ioctl(dev_fd, write ? KVM_SET_DEVICE_ATTR : KVM_GET_DEVICE_ATTR,
+ &kvmattr);
+ if (ret < 0)
+ return -errno;
+ return ret;
+}
+
+int kvm_device_access(int dev_fd, uint32_t group, uint64_t attr,
+ void *val, bool write)
+{
+ int ret = _kvm_device_access(dev_fd, group, attr, val, write);
+
+ TEST_ASSERT(ret >= 0, "KVM_SET|GET_DEVICE_ATTR IOCTL failed,\n"
+ " errno: %i", errno);
+ return ret;
+}
+
/*
* VM Dump
*
--
2.26.3
vgic_uaccess() takes a struct vgic_io_device argument, converts it
to a struct kvm_io_device and passes it to the read/write accessor
functions, which convert it back to a struct vgic_io_device.
Avoid the indirection by passing the struct vgic_io_device argument
directly to vgic_uaccess_{read,write}.
Signed-off-by: Eric Auger <[email protected]>
---
v1 -> v2:
- reworded the commit message as suggested by Alexandru
---
arch/arm64/kvm/vgic/vgic-mmio.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/kvm/vgic/vgic-mmio.c b/arch/arm64/kvm/vgic/vgic-mmio.c
index b2d73fc0d1ef4..48c6067fc5ecb 100644
--- a/arch/arm64/kvm/vgic/vgic-mmio.c
+++ b/arch/arm64/kvm/vgic/vgic-mmio.c
@@ -938,10 +938,9 @@ vgic_get_mmio_region(struct kvm_vcpu *vcpu, struct vgic_io_device *iodev,
return region;
}
-static int vgic_uaccess_read(struct kvm_vcpu *vcpu, struct kvm_io_device *dev,
+static int vgic_uaccess_read(struct kvm_vcpu *vcpu, struct vgic_io_device *iodev,
gpa_t addr, u32 *val)
{
- struct vgic_io_device *iodev = kvm_to_vgic_iodev(dev);
const struct vgic_register_region *region;
struct kvm_vcpu *r_vcpu;
@@ -960,10 +959,9 @@ static int vgic_uaccess_read(struct kvm_vcpu *vcpu, struct kvm_io_device *dev,
return 0;
}
-static int vgic_uaccess_write(struct kvm_vcpu *vcpu, struct kvm_io_device *dev,
+static int vgic_uaccess_write(struct kvm_vcpu *vcpu, struct vgic_io_device *iodev,
gpa_t addr, const u32 *val)
{
- struct vgic_io_device *iodev = kvm_to_vgic_iodev(dev);
const struct vgic_register_region *region;
struct kvm_vcpu *r_vcpu;
@@ -986,9 +984,9 @@ int vgic_uaccess(struct kvm_vcpu *vcpu, struct vgic_io_device *dev,
bool is_write, int offset, u32 *val)
{
if (is_write)
- return vgic_uaccess_write(vcpu, &dev->dev, offset, val);
+ return vgic_uaccess_write(vcpu, dev, offset, val);
else
- return vgic_uaccess_read(vcpu, &dev->dev, offset, val);
+ return vgic_uaccess_read(vcpu, dev, offset, val);
}
static int dispatch_mmio_read(struct kvm_vcpu *vcpu, struct kvm_io_device *dev,
--
2.26.3
Commit 23bde34771f1 ("KVM: arm64: vgic-v3: Drop the
reporting of GICR_TYPER.Last for userspace") temporarily fixed
a bug identified when attempting to access the GICR_TYPER
register before the redistributor region setting, but dropped
the support of the LAST bit.
Emulating the GICR_TYPER.Last bit still makes sense for
architecture compliance though. This patch restores its support
(if the redistributor region was set) while keeping the code safe.
We introduce a new helper, vgic_mmio_vcpu_rdist_is_last() which
computes whether a redistributor is the highest one of a series
of contiguous redistributor pages.
The spec says "Indicates whether this Redistributor is the
highest-numbered Redistributor in a series of contiguous
Redistributor pages."
The code is a bit convoluted since there is no guarantee that
redistributors are added to a given redistributor region in
ascending order. In that case the current implementation was
wrong. Also, redistributor regions can be contiguous
and registered in non-increasing base address order.
So the indices of the redistributors are stored in an array within
the redistributor region structure.
With this new implementation we do not need to have a uaccess
read accessor anymore.
Signed-off-by: Eric Auger <[email protected]>
---
arch/arm64/kvm/vgic/vgic-init.c | 7 +--
arch/arm64/kvm/vgic/vgic-mmio-v3.c | 97 ++++++++++++++++++++----------
arch/arm64/kvm/vgic/vgic.h | 1 +
include/kvm/arm_vgic.h | 3 +
4 files changed, 73 insertions(+), 35 deletions(-)
diff --git a/arch/arm64/kvm/vgic/vgic-init.c b/arch/arm64/kvm/vgic/vgic-init.c
index cf6faa0aeddb2..61150c34c268c 100644
--- a/arch/arm64/kvm/vgic/vgic-init.c
+++ b/arch/arm64/kvm/vgic/vgic-init.c
@@ -190,6 +190,7 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu)
int i;
vgic_cpu->rd_iodev.base_addr = VGIC_ADDR_UNDEF;
+ vgic_cpu->index = vcpu->vcpu_id;
INIT_LIST_HEAD(&vgic_cpu->ap_list_head);
raw_spin_lock_init(&vgic_cpu->ap_list_lock);
@@ -338,10 +339,8 @@ static void kvm_vgic_dist_destroy(struct kvm *kvm)
dist->vgic_dist_base = VGIC_ADDR_UNDEF;
if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3) {
- list_for_each_entry_safe(rdreg, next, &dist->rd_regions, list) {
- list_del(&rdreg->list);
- kfree(rdreg);
- }
+ list_for_each_entry_safe(rdreg, next, &dist->rd_regions, list)
+ vgic_v3_free_redist_region(rdreg);
INIT_LIST_HEAD(&dist->rd_regions);
} else {
dist->vgic_cpu_base = VGIC_ADDR_UNDEF;
diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
index 987e366c80008..f6a7eed1d6adb 100644
--- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
@@ -251,45 +251,57 @@ static void vgic_mmio_write_v3r_ctlr(struct kvm_vcpu *vcpu,
vgic_enable_lpis(vcpu);
}
+static bool vgic_mmio_vcpu_rdist_is_last(struct kvm_vcpu *vcpu)
+{
+ struct vgic_dist *vgic = &vcpu->kvm->arch.vgic;
+ struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+ struct vgic_redist_region *rdreg = vgic_cpu->rdreg;
+
+ if (!rdreg)
+ return false;
+
+ if (rdreg->count && vgic_cpu->rdreg_index == (rdreg->count - 1)) {
+ /* check whether there is no other contiguous rdist region */
+ struct list_head *rd_regions = &vgic->rd_regions;
+ struct vgic_redist_region *iter;
+
+ list_for_each_entry(iter, rd_regions, list) {
+ if (iter->base == rdreg->base + rdreg->count * KVM_VGIC_V3_REDIST_SIZE &&
+ iter->free_index > 0) {
+ /* check the first rdist index of this region, if any */
+ if (vgic_cpu->index < iter->rdist_indices[0])
+ return false;
+ }
+ }
+ } else if (vgic_cpu->rdreg_index < rdreg->free_index - 1) {
+ /* look at the index of next rdist */
+ int next_rdist_index = rdreg->rdist_indices[vgic_cpu->rdreg_index + 1];
+
+ if (vgic_cpu->index < next_rdist_index)
+ return false;
+ }
+ return true;
+}
+
static unsigned long vgic_mmio_read_v3r_typer(struct kvm_vcpu *vcpu,
gpa_t addr, unsigned int len)
{
unsigned long mpidr = kvm_vcpu_get_mpidr_aff(vcpu);
- struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
- struct vgic_redist_region *rdreg = vgic_cpu->rdreg;
int target_vcpu_id = vcpu->vcpu_id;
- gpa_t last_rdist_typer = rdreg->base + GICR_TYPER +
- (rdreg->free_index - 1) * KVM_VGIC_V3_REDIST_SIZE;
u64 value;
value = (u64)(mpidr & GENMASK(23, 0)) << 32;
value |= ((target_vcpu_id & 0xffff) << 8);
- if (addr == last_rdist_typer)
+ if (vgic_has_its(vcpu->kvm))
+ value |= GICR_TYPER_PLPIS;
+
+ if (vgic_mmio_vcpu_rdist_is_last(vcpu))
value |= GICR_TYPER_LAST;
- if (vgic_has_its(vcpu->kvm))
- value |= GICR_TYPER_PLPIS;
return extract_bytes(value, addr & 7, len);
}
-static unsigned long vgic_uaccess_read_v3r_typer(struct kvm_vcpu *vcpu,
- gpa_t addr, unsigned int len)
-{
- unsigned long mpidr = kvm_vcpu_get_mpidr_aff(vcpu);
- int target_vcpu_id = vcpu->vcpu_id;
- u64 value;
-
- value = (u64)(mpidr & GENMASK(23, 0)) << 32;
- value |= ((target_vcpu_id & 0xffff) << 8);
-
- if (vgic_has_its(vcpu->kvm))
- value |= GICR_TYPER_PLPIS;
-
- /* reporting of the Last bit is not supported for userspace */
- return extract_bytes(value, addr & 7, len);
-}
-
static unsigned long vgic_mmio_read_v3r_iidr(struct kvm_vcpu *vcpu,
gpa_t addr, unsigned int len)
{
@@ -612,7 +624,7 @@ static const struct vgic_register_region vgic_v3_rd_registers[] = {
VGIC_ACCESS_32bit),
REGISTER_DESC_WITH_LENGTH_UACCESS(GICR_TYPER,
vgic_mmio_read_v3r_typer, vgic_mmio_write_wi,
- vgic_uaccess_read_v3r_typer, vgic_mmio_uaccess_write_wi, 8,
+ NULL, vgic_mmio_uaccess_write_wi, 8,
VGIC_ACCESS_64bit | VGIC_ACCESS_32bit),
REGISTER_DESC_WITH_LENGTH(GICR_WAKER,
vgic_mmio_read_raz, vgic_mmio_write_wi, 4,
@@ -714,6 +726,16 @@ int vgic_register_redist_iodev(struct kvm_vcpu *vcpu)
return -EINVAL;
vgic_cpu->rdreg = rdreg;
+ vgic_cpu->rdreg_index = rdreg->free_index;
+ if (!rdreg->count) {
+ void *p = krealloc(rdreg->rdist_indices,
+ (vgic_cpu->rdreg_index + 1) * sizeof(u32),
+ GFP_KERNEL);
+ if (!p)
+ return -ENOMEM;
+ rdreg->rdist_indices = p;
+ }
+ rdreg->rdist_indices[vgic_cpu->rdreg_index] = vgic_cpu->index;
rd_base = rdreg->base + rdreg->free_index * KVM_VGIC_V3_REDIST_SIZE;
@@ -768,7 +790,7 @@ static int vgic_register_all_redist_iodevs(struct kvm *kvm)
}
/**
- * vgic_v3_insert_redist_region - Insert a new redistributor region
+ * vgic_v3_alloc_redist_region - Allocate a new redistributor region
*
* Performs various checks before inserting the rdist region in the list.
* Those tests depend on whether the size of the rdist region is known
@@ -782,8 +804,8 @@ static int vgic_register_all_redist_iodevs(struct kvm *kvm)
*
* Return 0 on success, < 0 otherwise
*/
-static int vgic_v3_insert_redist_region(struct kvm *kvm, uint32_t index,
- gpa_t base, uint32_t count)
+static int vgic_v3_alloc_redist_region(struct kvm *kvm, uint32_t index,
+ gpa_t base, uint32_t count)
{
struct vgic_dist *d = &kvm->arch.vgic;
struct vgic_redist_region *rdreg;
@@ -839,6 +861,13 @@ static int vgic_v3_insert_redist_region(struct kvm *kvm, uint32_t index,
rdreg->count = count;
rdreg->free_index = 0;
rdreg->index = index;
+ if (count) {
+ rdreg->rdist_indices = kcalloc(count, sizeof(u32), GFP_KERNEL);
+ if (!rdreg->rdist_indices) {
+ ret = -ENOMEM;
+ goto free;
+ }
+ }
list_add_tail(&rdreg->list, rd_regions);
return 0;
@@ -847,11 +876,18 @@ static int vgic_v3_insert_redist_region(struct kvm *kvm, uint32_t index,
return ret;
}
+void vgic_v3_free_redist_region(struct vgic_redist_region *rdreg)
+{
+ list_del(&rdreg->list);
+ kfree(rdreg->rdist_indices);
+ kfree(rdreg);
+}
+
int vgic_v3_set_redist_base(struct kvm *kvm, u32 index, u64 addr, u32 count)
{
int ret;
- ret = vgic_v3_insert_redist_region(kvm, index, addr, count);
+ ret = vgic_v3_alloc_redist_region(kvm, index, addr, count);
if (ret)
return ret;
@@ -864,8 +900,7 @@ int vgic_v3_set_redist_base(struct kvm *kvm, u32 index, u64 addr, u32 count)
struct vgic_redist_region *rdreg;
rdreg = vgic_v3_rdist_region_from_index(kvm, index);
- list_del(&rdreg->list);
- kfree(rdreg);
+ vgic_v3_free_redist_region(rdreg);
return ret;
}
diff --git a/arch/arm64/kvm/vgic/vgic.h b/arch/arm64/kvm/vgic/vgic.h
index 64fcd75111108..bc418c2c12141 100644
--- a/arch/arm64/kvm/vgic/vgic.h
+++ b/arch/arm64/kvm/vgic/vgic.h
@@ -293,6 +293,7 @@ vgic_v3_rd_region_size(struct kvm *kvm, struct vgic_redist_region *rdreg)
struct vgic_redist_region *vgic_v3_rdist_region_from_index(struct kvm *kvm,
u32 index);
+void vgic_v3_free_redist_region(struct vgic_redist_region *rdreg);
bool vgic_v3_rdist_overlap(struct kvm *kvm, gpa_t base, size_t size);
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 3d74f1060bd18..9a3f060ac3547 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -197,6 +197,7 @@ struct vgic_redist_region {
gpa_t base;
u32 count; /* number of redistributors or 0 if single region */
u32 free_index; /* index of the next free redistributor */
+ int *rdist_indices; /* indices of the redistributors */
struct list_head list;
};
@@ -322,6 +323,8 @@ struct vgic_cpu {
*/
struct vgic_io_device rd_iodev;
struct vgic_redist_region *rdreg;
+ u32 rdreg_index;
+ int index; /* vcpu index */
/* Contains the attributes and gpa of the LPI pending tables. */
u64 pendbaser;
--
2.26.3
On Thu, 01 Apr 2021 18:03:25 +0100,
Auger Eric <[email protected]> wrote:
>
> Hi Marc,
>
> On 4/1/21 3:42 PM, Marc Zyngier wrote:
> > Hi Eric,
> >
> > On Thu, 01 Apr 2021 09:52:37 +0100,
> > Eric Auger <[email protected]> wrote:
> >>
> >> Commit 23bde34771f1 ("KVM: arm64: vgic-v3: Drop the
> >> reporting of GICR_TYPER.Last for userspace") temporarily fixed
> >> a bug identified when attempting to access the GICR_TYPER
> >> register before the redistributor region setting, but dropped
> >> the support of the LAST bit.
> >>
> >> Emulating the GICR_TYPER.Last bit still makes sense for
> >> architecture compliance though. This patch restores its support
> >> (if the redistributor region was set) while keeping the code safe.
> >>
> >> We introduce a new helper, vgic_mmio_vcpu_rdist_is_last() which
> >> computes whether a redistributor is the highest one of a series
> >> of redistributor contributor pages.
> >>
> >> The spec says "Indicates whether this Redistributor is the
> >> highest-numbered Redistributor in a series of contiguous
> >> Redistributor pages."
> >>
> >> The code is a bit convulated since there is no guarantee
> >
> > nit: convoluted
> >
> >> redistributors are added in a given reditributor region in
> >> ascending order. In that case the current implementation was
> >> wrong. Also redistributor regions can be contiguous
> >> and registered in non increasing base address order.
> >>
> >> So the index of redistributors are stored in an array within
> >> the redistributor region structure.
> >>
> >> With this new implementation we do not need to have a uaccess
> >> read accessor anymore.
> >>
> >> Signed-off-by: Eric Auger <[email protected]>
> >
> > This patch also hurt my head, a lot more than the first one. See
> > below.
> >
> >> ---
> >> arch/arm64/kvm/vgic/vgic-init.c | 7 +--
> >> arch/arm64/kvm/vgic/vgic-mmio-v3.c | 97 ++++++++++++++++++++----------
> >> arch/arm64/kvm/vgic/vgic.h | 1 +
> >> include/kvm/arm_vgic.h | 3 +
> >> 4 files changed, 73 insertions(+), 35 deletions(-)
> >>
> >> diff --git a/arch/arm64/kvm/vgic/vgic-init.c b/arch/arm64/kvm/vgic/vgic-init.c
> >> index cf6faa0aeddb2..61150c34c268c 100644
> >> --- a/arch/arm64/kvm/vgic/vgic-init.c
> >> +++ b/arch/arm64/kvm/vgic/vgic-init.c
> >> @@ -190,6 +190,7 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu)
> >> int i;
> >>
> >> vgic_cpu->rd_iodev.base_addr = VGIC_ADDR_UNDEF;
> >> + vgic_cpu->index = vcpu->vcpu_id;
> >
> > Is it so that vgic_cpu->index is always equal to vcpu_id? If so, why
> > do we need another field? We can always get to the vcpu using a
> > container_of().
> >
> >>
> >> INIT_LIST_HEAD(&vgic_cpu->ap_list_head);
> >> raw_spin_lock_init(&vgic_cpu->ap_list_lock);
> >> @@ -338,10 +339,8 @@ static void kvm_vgic_dist_destroy(struct kvm *kvm)
> >> dist->vgic_dist_base = VGIC_ADDR_UNDEF;
> >>
> >> if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3) {
> >> - list_for_each_entry_safe(rdreg, next, &dist->rd_regions, list) {
> >> - list_del(&rdreg->list);
> >> - kfree(rdreg);
> >> - }
> >> + list_for_each_entry_safe(rdreg, next, &dist->rd_regions, list)
> >> + vgic_v3_free_redist_region(rdreg);
> >
> > Consider moving the introduction of vgic_v3_free_redist_region() into
> > a separate patch. On its own, that's a good readability improvement.
> >
> >> INIT_LIST_HEAD(&dist->rd_regions);
> >> } else {
> >> dist->vgic_cpu_base = VGIC_ADDR_UNDEF;
> >> diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> >> index 987e366c80008..f6a7eed1d6adb 100644
> >> --- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> >> +++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> >> @@ -251,45 +251,57 @@ static void vgic_mmio_write_v3r_ctlr(struct kvm_vcpu *vcpu,
> >> vgic_enable_lpis(vcpu);
> >> }
> >>
> >> +static bool vgic_mmio_vcpu_rdist_is_last(struct kvm_vcpu *vcpu)
> >> +{
> >> + struct vgic_dist *vgic = &vcpu->kvm->arch.vgic;
> >> + struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> >> + struct vgic_redist_region *rdreg = vgic_cpu->rdreg;
> >> +
> >> + if (!rdreg)
> >> + return false;
> >> +
> >> + if (rdreg->count && vgic_cpu->rdreg_index == (rdreg->count - 1)) {
> >> + /* check whether there is no other contiguous rdist region */
> >> + struct list_head *rd_regions = &vgic->rd_regions;
> >> + struct vgic_redist_region *iter;
> >> +
> >> + list_for_each_entry(iter, rd_regions, list) {
> >> + if (iter->base == rdreg->base + rdreg->count * KVM_VGIC_V3_REDIST_SIZE &&
> >> + iter->free_index > 0) {
> >> + /* check the first rdist index of this region, if any */
> >> + if (vgic_cpu->index < iter->rdist_indices[0])
> >> + return false;
> >
> > rdist_indices[] contains the vcpu_id of the vcpu associated with a
> > given RD in the region. At this stage, you have established that there
> > is another region that is contiguous with the one associated with our
> > vcpu. You also know that this adjacent region has a vcpu mapped in
> > (free_index isn't 0). Isn't that enough to declare that our vcpu isn't
> > last? I definitely don't understand what the index comparison does
> > here.
> Assume the following case:
> 2 RDIST region
> region #0 contains rdist 1, 2, 4
> region #1, adjacent to #0 contains rdist 3
>
> Spec says:
> "Indicates whether this Redistributor is the
> highest-numbered Redistributor in a series of contiguous
> Redistributor pages."
>
> To me 4 is last and 3 is last too.
No, only 3 is last, assuming that region 0 is full. I think the
phrasing in the spec is just really bad. What this describes is that
at the end of a contiguous set of RDs, that last RD has Last
set. If two regions are contiguous, that's indistinguishable from a
single, larger region.
There is no such thing as a "redistributor number" anyway. The closest
thing there is would be "processor number", but that has nothing to do
with the RD itself.
>
>
> >
> > It also seem to me that some of the complexity could be eliminated if
> > the regions were kept ordered at list insertion time.
> yes
> >
> >> + }
> >> + }
> >> + } else if (vgic_cpu->rdreg_index < rdreg->free_index - 1) {
> >> + /* look at the index of next rdist */
> >> + int next_rdist_index = rdreg->rdist_indices[vgic_cpu->rdreg_index + 1];
> >> +
> >> + if (vgic_cpu->index < next_rdist_index)
> >> + return false;
> >
> > Same thing here. We are in the middle of the allocated part of a
> > region, which means we cannot be last. I still don't get the index
> > check.
> Because within a region, nothing hinders rdist from being allocated in
> non ascending order. I exercise those cases in the kvmselftests
>
> one single RDIST region with the following rdists allocated there:
> 1, 3, 2
>
> 3 and 2 are "last", right? Or did I miss something. Yes that's totally
> not natural to do that kind of allocation but the API allows to do that.
No, only 2 is last. I think you got tripped by the bizarre language in
the spec, and the behaviour of this Last bit is much simpler than what
you ended up with.
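For illustration only, here is a minimal sketch of that simpler rule,
modelled outside the kernel. Everything in it is hypothetical (struct
rd_region, adjacent_region(), rdist_is_last() and the RD_SIZE constant
are stand-ins, not the vgic code), and it assumes every region has a
known, non-zero capacity. The point is that only the position of the RD
inside its region matters, never the vcpu_id:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RD_SIZE 0x20000ULL /* assumed: two 64K frames per redistributor */

/* Hypothetical model of a redistributor region; not the kernel struct. */
struct rd_region {
	uint64_t base;       /* guest physical base address */
	uint32_t count;      /* capacity, in redistributors (non-zero here) */
	uint32_t free_index; /* number of redistributors allocated so far */
};

/* Return the region starting right where @r ends, or NULL if there is none. */
static const struct rd_region *adjacent_region(const struct rd_region *regions,
					       size_t nr,
					       const struct rd_region *r)
{
	uint64_t end = r->base + (uint64_t)r->count * RD_SIZE;

	for (size_t i = 0; i < nr; i++)
		if (regions[i].base == end)
			return &regions[i];
	return NULL;
}

/* Last is decided by the position @pos within @r, never by a vcpu_id. */
static bool rdist_is_last(const struct rd_region *regions, size_t nr,
			  const struct rd_region *r, uint32_t pos)
{
	const struct rd_region *next;

	/* Model assumption: @pos is allocated and the capacity is known. */
	assert(pos < r->free_index && r->free_index <= r->count);

	if (pos < r->free_index - 1)
		return false;	/* another RD follows in the same region */
	if (r->free_index < r->count)
		return true;	/* a hole follows the last allocated RD */

	/* Region is full: Last only if no populated region follows it. */
	next = adjacent_region(regions, nr, r);
	return !(next && next->free_index > 0);
}
```

With a single full region holding vcpus 1, 3, 2 (in that order), only
position 2 reports Last. With a full region 0 followed by a contiguous,
populated region 1, nothing in region 0 reports Last, whatever the
order in which the regions were registered.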
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
Hi Marc,
On 4/1/21 3:42 PM, Marc Zyngier wrote:
> Hi Eric,
>
> On Thu, 01 Apr 2021 09:52:37 +0100,
> Eric Auger <[email protected]> wrote:
>>
>> Commit 23bde34771f1 ("KVM: arm64: vgic-v3: Drop the
>> reporting of GICR_TYPER.Last for userspace") temporarily fixed
>> a bug identified when attempting to access the GICR_TYPER
>> register before the redistributor region setting, but dropped
>> the support of the LAST bit.
>>
>> Emulating the GICR_TYPER.Last bit still makes sense for
>> architecture compliance though. This patch restores its support
>> (if the redistributor region was set) while keeping the code safe.
>>
>> We introduce a new helper, vgic_mmio_vcpu_rdist_is_last() which
>> computes whether a redistributor is the highest one of a series
>> of redistributor contributor pages.
>>
>> The spec says "Indicates whether this Redistributor is the
>> highest-numbered Redistributor in a series of contiguous
>> Redistributor pages."
>>
>> The code is a bit convulated since there is no guarantee
>
> nit: convoluted
>
>> redistributors are added in a given reditributor region in
>> ascending order. In that case the current implementation was
>> wrong. Also redistributor regions can be contiguous
>> and registered in non increasing base address order.
>>
>> So the index of redistributors are stored in an array within
>> the redistributor region structure.
>>
>> With this new implementation we do not need to have a uaccess
>> read accessor anymore.
>>
>> Signed-off-by: Eric Auger <[email protected]>
>
> This patch also hurt my head, a lot more than the first one. See
> below.
>
>> ---
>> arch/arm64/kvm/vgic/vgic-init.c | 7 +--
>> arch/arm64/kvm/vgic/vgic-mmio-v3.c | 97 ++++++++++++++++++++----------
>> arch/arm64/kvm/vgic/vgic.h | 1 +
>> include/kvm/arm_vgic.h | 3 +
>> 4 files changed, 73 insertions(+), 35 deletions(-)
>>
>> diff --git a/arch/arm64/kvm/vgic/vgic-init.c b/arch/arm64/kvm/vgic/vgic-init.c
>> index cf6faa0aeddb2..61150c34c268c 100644
>> --- a/arch/arm64/kvm/vgic/vgic-init.c
>> +++ b/arch/arm64/kvm/vgic/vgic-init.c
>> @@ -190,6 +190,7 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu)
>> int i;
>>
>> vgic_cpu->rd_iodev.base_addr = VGIC_ADDR_UNDEF;
>> + vgic_cpu->index = vcpu->vcpu_id;
>
> Is it so that vgic_cpu->index is always equal to vcpu_id? If so, why
> do we need another field? We can always get to the vcpu using a
> container_of().
>
>>
>> INIT_LIST_HEAD(&vgic_cpu->ap_list_head);
>> raw_spin_lock_init(&vgic_cpu->ap_list_lock);
>> @@ -338,10 +339,8 @@ static void kvm_vgic_dist_destroy(struct kvm *kvm)
>> dist->vgic_dist_base = VGIC_ADDR_UNDEF;
>>
>> if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3) {
>> - list_for_each_entry_safe(rdreg, next, &dist->rd_regions, list) {
>> - list_del(&rdreg->list);
>> - kfree(rdreg);
>> - }
>> + list_for_each_entry_safe(rdreg, next, &dist->rd_regions, list)
>> + vgic_v3_free_redist_region(rdreg);
>
> Consider moving the introduction of vgic_v3_free_redist_region() into
> a separate patch. On its own, that's a good readability improvement.
>
>> INIT_LIST_HEAD(&dist->rd_regions);
>> } else {
>> dist->vgic_cpu_base = VGIC_ADDR_UNDEF;
>> diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
>> index 987e366c80008..f6a7eed1d6adb 100644
>> --- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
>> +++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
>> @@ -251,45 +251,57 @@ static void vgic_mmio_write_v3r_ctlr(struct kvm_vcpu *vcpu,
>> vgic_enable_lpis(vcpu);
>> }
>>
>> +static bool vgic_mmio_vcpu_rdist_is_last(struct kvm_vcpu *vcpu)
>> +{
>> + struct vgic_dist *vgic = &vcpu->kvm->arch.vgic;
>> + struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>> + struct vgic_redist_region *rdreg = vgic_cpu->rdreg;
>> +
>> + if (!rdreg)
>> + return false;
>> +
>> + if (rdreg->count && vgic_cpu->rdreg_index == (rdreg->count - 1)) {
>> + /* check whether there is no other contiguous rdist region */
>> + struct list_head *rd_regions = &vgic->rd_regions;
>> + struct vgic_redist_region *iter;
>> +
>> + list_for_each_entry(iter, rd_regions, list) {
>> + if (iter->base == rdreg->base + rdreg->count * KVM_VGIC_V3_REDIST_SIZE &&
>> + iter->free_index > 0) {
>> + /* check the first rdist index of this region, if any */
>> + if (vgic_cpu->index < iter->rdist_indices[0])
>> + return false;
>
> rdist_indices[] contains the vcpu_id of the vcpu associated with a
> given RD in the region. At this stage, you have established that there
> is another region that is contiguous with the one associated with our
> vcpu. You also know that this adjacent region has a vcpu mapped in
> (free_index isn't 0). Isn't that enough to declare that our vcpu isn't
> last? I definitely don't understand what the index comparison does
> here.
Assume the following case:
2 RDIST regions
region #0 contains rdist 1, 2, 4
region #1, adjacent to #0, contains rdist 3
Spec says:
"Indicates whether this Redistributor is the
highest-numbered Redistributor in a series of contiguous
Redistributor pages."
To me 4 is last and 3 is last too.
>
> It also seem to me that some of the complexity could be eliminated if
> the regions were kept ordered at list insertion time.
yes
>
>> + }
>> + }
>> + } else if (vgic_cpu->rdreg_index < rdreg->free_index - 1) {
>> + /* look at the index of next rdist */
>> + int next_rdist_index = rdreg->rdist_indices[vgic_cpu->rdreg_index + 1];
>> +
>> + if (vgic_cpu->index < next_rdist_index)
>> + return false;
>
> Same thing here. We are in the middle of the allocated part of a
> region, which means we cannot be last. I still don't get the index
> check.
Because within a region, nothing prevents rdists from being allocated in
non-ascending order. I exercise those cases in the KVM selftests:
a single RDIST region with the following rdists allocated there:
1, 3, 2
3 and 2 are "last", right? Or did I miss something? Yes, it is totally
unnatural to do that kind of allocation, but the API allows it.
>
>> + }
>> + return true;
>> +}
>> +
>> static unsigned long vgic_mmio_read_v3r_typer(struct kvm_vcpu *vcpu,
>> gpa_t addr, unsigned int len)
>> {
>> unsigned long mpidr = kvm_vcpu_get_mpidr_aff(vcpu);
>> - struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>> - struct vgic_redist_region *rdreg = vgic_cpu->rdreg;
>> int target_vcpu_id = vcpu->vcpu_id;
>> - gpa_t last_rdist_typer = rdreg->base + GICR_TYPER +
>> - (rdreg->free_index - 1) * KVM_VGIC_V3_REDIST_SIZE;
>> u64 value;
>>
>> value = (u64)(mpidr & GENMASK(23, 0)) << 32;
>> value |= ((target_vcpu_id & 0xffff) << 8);
>>
>> - if (addr == last_rdist_typer)
>> + if (vgic_has_its(vcpu->kvm))
>> + value |= GICR_TYPER_PLPIS;
>> +
>> + if (vgic_mmio_vcpu_rdist_is_last(vcpu))
>> value |= GICR_TYPER_LAST;
>> - if (vgic_has_its(vcpu->kvm))
>> - value |= GICR_TYPER_PLPIS;
>>
>> return extract_bytes(value, addr & 7, len);
>> }
>>
>> -static unsigned long vgic_uaccess_read_v3r_typer(struct kvm_vcpu *vcpu,
>> - gpa_t addr, unsigned int len)
>> -{
>> - unsigned long mpidr = kvm_vcpu_get_mpidr_aff(vcpu);
>> - int target_vcpu_id = vcpu->vcpu_id;
>> - u64 value;
>> -
>> - value = (u64)(mpidr & GENMASK(23, 0)) << 32;
>> - value |= ((target_vcpu_id & 0xffff) << 8);
>> -
>> - if (vgic_has_its(vcpu->kvm))
>> - value |= GICR_TYPER_PLPIS;
>> -
>> - /* reporting of the Last bit is not supported for userspace */
>> - return extract_bytes(value, addr & 7, len);
>> -}
>> -
>> static unsigned long vgic_mmio_read_v3r_iidr(struct kvm_vcpu *vcpu,
>> gpa_t addr, unsigned int len)
>> {
>> @@ -612,7 +624,7 @@ static const struct vgic_register_region vgic_v3_rd_registers[] = {
>> VGIC_ACCESS_32bit),
>> REGISTER_DESC_WITH_LENGTH_UACCESS(GICR_TYPER,
>> vgic_mmio_read_v3r_typer, vgic_mmio_write_wi,
>> - vgic_uaccess_read_v3r_typer, vgic_mmio_uaccess_write_wi, 8,
>> + NULL, vgic_mmio_uaccess_write_wi, 8,
>> VGIC_ACCESS_64bit | VGIC_ACCESS_32bit),
>> REGISTER_DESC_WITH_LENGTH(GICR_WAKER,
>> vgic_mmio_read_raz, vgic_mmio_write_wi, 4,
>> @@ -714,6 +726,16 @@ int vgic_register_redist_iodev(struct kvm_vcpu *vcpu)
>> return -EINVAL;
>>
>> vgic_cpu->rdreg = rdreg;
>> + vgic_cpu->rdreg_index = rdreg->free_index;
>> + if (!rdreg->count) {
>> + void *p = krealloc(rdreg->rdist_indices,
>> + (vgic_cpu->rdreg_index + 1) * sizeof(u32),
>> + GFP_KERNEL);
>> + if (!p)
>> + return -ENOMEM;
>> + rdreg->rdist_indices = p;
>> + }
>> + rdreg->rdist_indices[vgic_cpu->rdreg_index] = vgic_cpu->index;
>
> I think I really have a problem with this array, which comes from me
> not understanding the two checks I previously commented on.
I hope the above clarified the array need.
>
> If we stick to the definition of 'Last', all that matters is the
> position of the RD in a region (rdreg_index) and potentially the
> presence of another contiguous region with allocated RDs in it.
>
> IIUC, the checks should read like this:
>
> if (vcpu->rdreg_index < (vcpu->rdreg->free_index - 1))
> last = false;
> else if (vcpu->rdreg_index == (vcpu->rdreg->free_index - 1) &&
> adjacent_region(vcpu->rdreg)->free_index > 0)
> last = false;
> else
> last = true;
>
> So why do we need to track the vcpu_id associated to a region?
Because the redistributors within a region can be in random order.
That's why I need to store their indices.
Does that make more sense?
Thanks
Eric
>
>>
>> rd_base = rdreg->base + rdreg->free_index * KVM_VGIC_V3_REDIST_SIZE;
>>
>> @@ -768,7 +790,7 @@ static int vgic_register_all_redist_iodevs(struct kvm *kvm)
>> }
>>
>> /**
>> - * vgic_v3_insert_redist_region - Insert a new redistributor region
>> + * vgic_v3_alloc_redist_region - Allocate a new redistributor region
>> *
>> * Performs various checks before inserting the rdist region in the list.
>> * Those tests depend on whether the size of the rdist region is known
>> @@ -782,8 +804,8 @@ static int vgic_register_all_redist_iodevs(struct kvm *kvm)
>> *
>> * Return 0 on success, < 0 otherwise
>> */
>> -static int vgic_v3_insert_redist_region(struct kvm *kvm, uint32_t index,
>> - gpa_t base, uint32_t count)
>> +static int vgic_v3_alloc_redist_region(struct kvm *kvm, uint32_t index,
>> + gpa_t base, uint32_t count)
>> {
>> struct vgic_dist *d = &kvm->arch.vgic;
>> struct vgic_redist_region *rdreg;
>> @@ -839,6 +861,13 @@ static int vgic_v3_insert_redist_region(struct kvm *kvm, uint32_t index,
>> rdreg->count = count;
>> rdreg->free_index = 0;
>> rdreg->index = index;
>> + if (count) {
>> + rdreg->rdist_indices = kcalloc(count, sizeof(u32), GFP_KERNEL);
>> + if (!rdreg->rdist_indices) {
>> + ret = -ENOMEM;
>> + goto free;
>> + }
>> + }
>>
>> list_add_tail(&rdreg->list, rd_regions);
>> return 0;
>> @@ -847,11 +876,18 @@ static int vgic_v3_insert_redist_region(struct kvm *kvm, uint32_t index,
>> return ret;
>> }
>>
>> +void vgic_v3_free_redist_region(struct vgic_redist_region *rdreg)
>> +{
>> + list_del(&rdreg->list);
>> + kfree(rdreg->rdist_indices);
>> + kfree(rdreg);
>> +}
>> +
>> int vgic_v3_set_redist_base(struct kvm *kvm, u32 index, u64 addr, u32 count)
>> {
>> int ret;
>>
>> - ret = vgic_v3_insert_redist_region(kvm, index, addr, count);
>> + ret = vgic_v3_alloc_redist_region(kvm, index, addr, count);
>> if (ret)
>> return ret;
>>
>> @@ -864,8 +900,7 @@ int vgic_v3_set_redist_base(struct kvm *kvm, u32 index, u64 addr, u32 count)
>> struct vgic_redist_region *rdreg;
>>
>> rdreg = vgic_v3_rdist_region_from_index(kvm, index);
>> - list_del(&rdreg->list);
>> - kfree(rdreg);
>> + vgic_v3_free_redist_region(rdreg);
>> return ret;
>> }
>>
>> diff --git a/arch/arm64/kvm/vgic/vgic.h b/arch/arm64/kvm/vgic/vgic.h
>> index 64fcd75111108..bc418c2c12141 100644
>> --- a/arch/arm64/kvm/vgic/vgic.h
>> +++ b/arch/arm64/kvm/vgic/vgic.h
>> @@ -293,6 +293,7 @@ vgic_v3_rd_region_size(struct kvm *kvm, struct vgic_redist_region *rdreg)
>>
>> struct vgic_redist_region *vgic_v3_rdist_region_from_index(struct kvm *kvm,
>> u32 index);
>> +void vgic_v3_free_redist_region(struct vgic_redist_region *rdreg);
>>
>> bool vgic_v3_rdist_overlap(struct kvm *kvm, gpa_t base, size_t size);
>>
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index 3d74f1060bd18..9a3f060ac3547 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -197,6 +197,7 @@ struct vgic_redist_region {
>> gpa_t base;
>> u32 count; /* number of redistributors or 0 if single region */
>> u32 free_index; /* index of the next free redistributor */
>> + int *rdist_indices; /* indices of the redistributors */
>
> You are treating it as an array of u32 when allocating it. Please
> choose one type or the other.
>
>> struct list_head list;
>> };
>>
>> @@ -322,6 +323,8 @@ struct vgic_cpu {
>> */
>> struct vgic_io_device rd_iodev;
>> struct vgic_redist_region *rdreg;
>> + u32 rdreg_index;
>> + int index; /* vcpu index */
>>
>> /* Contains the attributes and gpa of the LPI pending tables. */
>> u64 pendbaser;
>
> Thanks,
>
> M.
>
Hi Eric,
On Thu, 01 Apr 2021 09:52:37 +0100,
Eric Auger <[email protected]> wrote:
>
> Commit 23bde34771f1 ("KVM: arm64: vgic-v3: Drop the
> reporting of GICR_TYPER.Last for userspace") temporarily fixed
> a bug identified when attempting to access the GICR_TYPER
> register before the redistributor region setting, but dropped
> the support of the LAST bit.
>
> Emulating the GICR_TYPER.Last bit still makes sense for
> architecture compliance though. This patch restores its support
> (if the redistributor region was set) while keeping the code safe.
>
> We introduce a new helper, vgic_mmio_vcpu_rdist_is_last() which
> computes whether a redistributor is the highest one of a series
> of redistributor contributor pages.
>
> The spec says "Indicates whether this Redistributor is the
> highest-numbered Redistributor in a series of contiguous
> Redistributor pages."
>
> The code is a bit convulated since there is no guarantee
nit: convoluted
> redistributors are added in a given reditributor region in
> ascending order. In that case the current implementation was
> wrong. Also redistributor regions can be contiguous
> and registered in non increasing base address order.
>
> So the index of redistributors are stored in an array within
> the redistributor region structure.
>
> With this new implementation we do not need to have a uaccess
> read accessor anymore.
>
> Signed-off-by: Eric Auger <[email protected]>
This patch also hurt my head, a lot more than the first one. See
below.
> ---
> arch/arm64/kvm/vgic/vgic-init.c | 7 +--
> arch/arm64/kvm/vgic/vgic-mmio-v3.c | 97 ++++++++++++++++++++----------
> arch/arm64/kvm/vgic/vgic.h | 1 +
> include/kvm/arm_vgic.h | 3 +
> 4 files changed, 73 insertions(+), 35 deletions(-)
>
> diff --git a/arch/arm64/kvm/vgic/vgic-init.c b/arch/arm64/kvm/vgic/vgic-init.c
> index cf6faa0aeddb2..61150c34c268c 100644
> --- a/arch/arm64/kvm/vgic/vgic-init.c
> +++ b/arch/arm64/kvm/vgic/vgic-init.c
> @@ -190,6 +190,7 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu)
> int i;
>
> vgic_cpu->rd_iodev.base_addr = VGIC_ADDR_UNDEF;
> + vgic_cpu->index = vcpu->vcpu_id;
Is it so that vgic_cpu->index is always equal to vcpu_id? If so, why
do we need another field? We can always get to the vcpu using a
container_of().
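A minimal userspace sketch of that container_of() lookup, for
illustration only. The structures are simplified stand-ins that keep
just the nesting of the kernel ones, and vgic_cpu_to_vcpu() is a
hypothetical helper name, not an existing kernel function:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; only the nesting matters. */
struct vgic_cpu {
	unsigned int rdreg_index;
};

struct kvm_vcpu_arch {
	struct vgic_cpu vgic_cpu;
};

struct kvm_vcpu {
	int vcpu_id;
	struct kvm_vcpu_arch arch;
};

/* Userspace equivalent of the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical helper: recover the owning vcpu from its embedded vgic_cpu. */
static struct kvm_vcpu *vgic_cpu_to_vcpu(struct vgic_cpu *vgic_cpu)
{
	assert(vgic_cpu);
	return container_of(vgic_cpu, struct kvm_vcpu, arch.vgic_cpu);
}
```

With such a helper, the vcpu_id is reachable from any vgic_cpu pointer
without storing a copy of it.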
>
> INIT_LIST_HEAD(&vgic_cpu->ap_list_head);
> raw_spin_lock_init(&vgic_cpu->ap_list_lock);
> @@ -338,10 +339,8 @@ static void kvm_vgic_dist_destroy(struct kvm *kvm)
> dist->vgic_dist_base = VGIC_ADDR_UNDEF;
>
> if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3) {
> - list_for_each_entry_safe(rdreg, next, &dist->rd_regions, list) {
> - list_del(&rdreg->list);
> - kfree(rdreg);
> - }
> + list_for_each_entry_safe(rdreg, next, &dist->rd_regions, list)
> + vgic_v3_free_redist_region(rdreg);
Consider moving the introduction of vgic_v3_free_redist_region() into
a separate patch. On its own, that's a good readability improvement.
> INIT_LIST_HEAD(&dist->rd_regions);
> } else {
> dist->vgic_cpu_base = VGIC_ADDR_UNDEF;
> diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> index 987e366c80008..f6a7eed1d6adb 100644
> --- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> +++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> @@ -251,45 +251,57 @@ static void vgic_mmio_write_v3r_ctlr(struct kvm_vcpu *vcpu,
> vgic_enable_lpis(vcpu);
> }
>
> +static bool vgic_mmio_vcpu_rdist_is_last(struct kvm_vcpu *vcpu)
> +{
> + struct vgic_dist *vgic = &vcpu->kvm->arch.vgic;
> + struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> + struct vgic_redist_region *rdreg = vgic_cpu->rdreg;
> +
> + if (!rdreg)
> + return false;
> +
> + if (rdreg->count && vgic_cpu->rdreg_index == (rdreg->count - 1)) {
> + /* check whether there is no other contiguous rdist region */
> + struct list_head *rd_regions = &vgic->rd_regions;
> + struct vgic_redist_region *iter;
> +
> + list_for_each_entry(iter, rd_regions, list) {
> + if (iter->base == rdreg->base + rdreg->count * KVM_VGIC_V3_REDIST_SIZE &&
> + iter->free_index > 0) {
> + /* check the first rdist index of this region, if any */
> + if (vgic_cpu->index < iter->rdist_indices[0])
> + return false;
rdist_indices[] contains the vcpu_id of the vcpu associated with a
given RD in the region. At this stage, you have established that there
is another region that is contiguous with the one associated with our
vcpu. You also know that this adjacent region has a vcpu mapped in
(free_index isn't 0). Isn't that enough to declare that our vcpu isn't
last? I definitely don't understand what the index comparison does
here.
It also seems to me that some of the complexity could be eliminated if
the regions were kept ordered at list insertion time.
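A rough sketch of what ordered insertion could look like, assuming a
plain singly-linked list in userspace rather than the kernel's
list_head. struct rd_region and rd_region_insert_sorted() are
hypothetical names used only for this illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, stripped-down region descriptor. */
struct rd_region {
	uint64_t base;			/* guest physical base address */
	struct rd_region *next;
};

/* Insert new_reg so that the list stays sorted by ascending base. */
static void rd_region_insert_sorted(struct rd_region **head,
				    struct rd_region *new_reg)
{
	struct rd_region **pos = head;

	assert(head && new_reg);

	/* Skip every region that starts below the new one. */
	while (*pos && (*pos)->base < new_reg->base)
		pos = &(*pos)->next;

	new_reg->next = *pos;
	*pos = new_reg;
}
```

With the list sorted this way, a region contiguous with a given one can
only be its immediate successor, so no full list walk is needed to find
it.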
> + }
> + }
> + } else if (vgic_cpu->rdreg_index < rdreg->free_index - 1) {
> + /* look at the index of next rdist */
> + int next_rdist_index = rdreg->rdist_indices[vgic_cpu->rdreg_index + 1];
> +
> + if (vgic_cpu->index < next_rdist_index)
> + return false;
Same thing here. We are in the middle of the allocated part of a
region, which means we cannot be last. I still don't get the index
check.
> + }
> + return true;
> +}
> +
> static unsigned long vgic_mmio_read_v3r_typer(struct kvm_vcpu *vcpu,
> gpa_t addr, unsigned int len)
> {
> unsigned long mpidr = kvm_vcpu_get_mpidr_aff(vcpu);
> - struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> - struct vgic_redist_region *rdreg = vgic_cpu->rdreg;
> int target_vcpu_id = vcpu->vcpu_id;
> - gpa_t last_rdist_typer = rdreg->base + GICR_TYPER +
> - (rdreg->free_index - 1) * KVM_VGIC_V3_REDIST_SIZE;
> u64 value;
>
> value = (u64)(mpidr & GENMASK(23, 0)) << 32;
> value |= ((target_vcpu_id & 0xffff) << 8);
>
> - if (addr == last_rdist_typer)
> + if (vgic_has_its(vcpu->kvm))
> + value |= GICR_TYPER_PLPIS;
> +
> + if (vgic_mmio_vcpu_rdist_is_last(vcpu))
> value |= GICR_TYPER_LAST;
> - if (vgic_has_its(vcpu->kvm))
> - value |= GICR_TYPER_PLPIS;
>
> return extract_bytes(value, addr & 7, len);
> }
>
> -static unsigned long vgic_uaccess_read_v3r_typer(struct kvm_vcpu *vcpu,
> - gpa_t addr, unsigned int len)
> -{
> - unsigned long mpidr = kvm_vcpu_get_mpidr_aff(vcpu);
> - int target_vcpu_id = vcpu->vcpu_id;
> - u64 value;
> -
> - value = (u64)(mpidr & GENMASK(23, 0)) << 32;
> - value |= ((target_vcpu_id & 0xffff) << 8);
> -
> - if (vgic_has_its(vcpu->kvm))
> - value |= GICR_TYPER_PLPIS;
> -
> - /* reporting of the Last bit is not supported for userspace */
> - return extract_bytes(value, addr & 7, len);
> -}
> -
> static unsigned long vgic_mmio_read_v3r_iidr(struct kvm_vcpu *vcpu,
> gpa_t addr, unsigned int len)
> {
> @@ -612,7 +624,7 @@ static const struct vgic_register_region vgic_v3_rd_registers[] = {
> VGIC_ACCESS_32bit),
> REGISTER_DESC_WITH_LENGTH_UACCESS(GICR_TYPER,
> vgic_mmio_read_v3r_typer, vgic_mmio_write_wi,
> - vgic_uaccess_read_v3r_typer, vgic_mmio_uaccess_write_wi, 8,
> + NULL, vgic_mmio_uaccess_write_wi, 8,
> VGIC_ACCESS_64bit | VGIC_ACCESS_32bit),
> REGISTER_DESC_WITH_LENGTH(GICR_WAKER,
> vgic_mmio_read_raz, vgic_mmio_write_wi, 4,
> @@ -714,6 +726,16 @@ int vgic_register_redist_iodev(struct kvm_vcpu *vcpu)
> return -EINVAL;
>
> vgic_cpu->rdreg = rdreg;
> + vgic_cpu->rdreg_index = rdreg->free_index;
> + if (!rdreg->count) {
> + void *p = krealloc(rdreg->rdist_indices,
> + (vgic_cpu->rdreg_index + 1) * sizeof(u32),
> + GFP_KERNEL);
> + if (!p)
> + return -ENOMEM;
> + rdreg->rdist_indices = p;
> + }
> + rdreg->rdist_indices[vgic_cpu->rdreg_index] = vgic_cpu->index;
I think I really have a problem with this array, which comes from me
not understanding the two checks I previously commented on.
If we stick to the definition of 'Last', all that matters is the
position of the RD in a region (rdreg_index) and potentially the
presence of another contiguous region with allocated RDs in it.
IIUC, the checks should read like this:
	if (vcpu->rdreg_index < (vcpu->rdreg->free_index - 1))
		last = false;
	else if (vcpu->rdreg_index == (vcpu->rdreg->free_index - 1) &&
		 adjacent_region(vcpu->rdreg)->free_index > 0)
		last = false;
	else
		last = true;
So why do we need to track the vcpu_id associated to a region?
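For illustration, the position-only check above can be written out as
a self-contained userspace sketch. Everything here is an assumption
made for the example: the structures are simplified, REDIST_SIZE stands
in for KVM_VGIC_V3_REDIST_SIZE, regions are assumed to have a non-zero
count, and adjacent_region() is spelled out as a linear search. The
sketch also only consults the neighbouring region once the current one
is full, which is an extra assumption of this sketch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed size of one redistributor: two 64KiB frames. */
#define REDIST_SIZE 0x20000ULL

/* Hypothetical, simplified region descriptor. */
struct rd_region {
	uint64_t base;
	uint32_t count;		/* capacity, in redistributors (non-zero) */
	uint32_t free_index;	/* number of redistributors allocated so far */
};

/* Return the region that starts exactly where reg ends, or NULL. */
static const struct rd_region *
adjacent_region(const struct rd_region *regs, size_t nr,
		const struct rd_region *reg)
{
	uint64_t end = reg->base + (uint64_t)reg->count * REDIST_SIZE;
	size_t i;

	for (i = 0; i < nr; i++)
		if (regs[i].base == end)
			return &regs[i];
	return NULL;
}

/* Decide Last purely from the RD position, never from the vcpu_id. */
static bool rdist_is_last(const struct rd_region *regs, size_t nr,
			  const struct rd_region *reg, uint32_t rdreg_index)
{
	const struct rd_region *adj;

	assert(reg && rdreg_index < reg->free_index);

	/* Another allocated RD follows in the same region. */
	if (rdreg_index < reg->free_index - 1)
		return false;

	/* Final allocated RD: a full region may continue into a neighbour. */
	if (reg->free_index == reg->count) {
		adj = adjacent_region(regs, nr, reg);
		if (adj && adj->free_index > 0)
			return false;
	}
	return true;
}
```

Note that no per-region array of vcpu ids appears anywhere: rdreg_index
and free_index are enough.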
>
> rd_base = rdreg->base + rdreg->free_index * KVM_VGIC_V3_REDIST_SIZE;
>
> @@ -768,7 +790,7 @@ static int vgic_register_all_redist_iodevs(struct kvm *kvm)
> }
>
> /**
> - * vgic_v3_insert_redist_region - Insert a new redistributor region
> + * vgic_v3_alloc_redist_region - Allocate a new redistributor region
> *
> * Performs various checks before inserting the rdist region in the list.
> * Those tests depend on whether the size of the rdist region is known
> @@ -782,8 +804,8 @@ static int vgic_register_all_redist_iodevs(struct kvm *kvm)
> *
> * Return 0 on success, < 0 otherwise
> */
> -static int vgic_v3_insert_redist_region(struct kvm *kvm, uint32_t index,
> - gpa_t base, uint32_t count)
> +static int vgic_v3_alloc_redist_region(struct kvm *kvm, uint32_t index,
> + gpa_t base, uint32_t count)
> {
> struct vgic_dist *d = &kvm->arch.vgic;
> struct vgic_redist_region *rdreg;
> @@ -839,6 +861,13 @@ static int vgic_v3_insert_redist_region(struct kvm *kvm, uint32_t index,
> rdreg->count = count;
> rdreg->free_index = 0;
> rdreg->index = index;
> + if (count) {
> + rdreg->rdist_indices = kcalloc(count, sizeof(u32), GFP_KERNEL);
> + if (!rdreg->rdist_indices) {
> + ret = -ENOMEM;
> + goto free;
> + }
> + }
>
> list_add_tail(&rdreg->list, rd_regions);
> return 0;
> @@ -847,11 +876,18 @@ static int vgic_v3_insert_redist_region(struct kvm *kvm, uint32_t index,
> return ret;
> }
>
> +void vgic_v3_free_redist_region(struct vgic_redist_region *rdreg)
> +{
> + list_del(&rdreg->list);
> + kfree(rdreg->rdist_indices);
> + kfree(rdreg);
> +}
> +
> int vgic_v3_set_redist_base(struct kvm *kvm, u32 index, u64 addr, u32 count)
> {
> int ret;
>
> - ret = vgic_v3_insert_redist_region(kvm, index, addr, count);
> + ret = vgic_v3_alloc_redist_region(kvm, index, addr, count);
> if (ret)
> return ret;
>
> @@ -864,8 +900,7 @@ int vgic_v3_set_redist_base(struct kvm *kvm, u32 index, u64 addr, u32 count)
> struct vgic_redist_region *rdreg;
>
> rdreg = vgic_v3_rdist_region_from_index(kvm, index);
> - list_del(&rdreg->list);
> - kfree(rdreg);
> + vgic_v3_free_redist_region(rdreg);
> return ret;
> }
>
> diff --git a/arch/arm64/kvm/vgic/vgic.h b/arch/arm64/kvm/vgic/vgic.h
> index 64fcd75111108..bc418c2c12141 100644
> --- a/arch/arm64/kvm/vgic/vgic.h
> +++ b/arch/arm64/kvm/vgic/vgic.h
> @@ -293,6 +293,7 @@ vgic_v3_rd_region_size(struct kvm *kvm, struct vgic_redist_region *rdreg)
>
> struct vgic_redist_region *vgic_v3_rdist_region_from_index(struct kvm *kvm,
> u32 index);
> +void vgic_v3_free_redist_region(struct vgic_redist_region *rdreg);
>
> bool vgic_v3_rdist_overlap(struct kvm *kvm, gpa_t base, size_t size);
>
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index 3d74f1060bd18..9a3f060ac3547 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -197,6 +197,7 @@ struct vgic_redist_region {
> gpa_t base;
> u32 count; /* number of redistributors or 0 if single region */
> u32 free_index; /* index of the next free redistributor */
> + int *rdist_indices; /* indices of the redistributors */
You are treating it as an array of u32 when allocating it. Please
choose one type or the other.
> struct list_head list;
> };
>
> @@ -322,6 +323,8 @@ struct vgic_cpu {
> */
> struct vgic_io_device rd_iodev;
> struct vgic_redist_region *rdreg;
> + u32 rdreg_index;
> + int index; /* vcpu index */
>
> /* Contains the attributes and gpa of the LPI pending tables. */
> u64 pendbaser;
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
Hi Marc,
On 4/1/21 7:30 PM, Marc Zyngier wrote:
> On Thu, 01 Apr 2021 18:03:25 +0100,
> Auger Eric <[email protected]> wrote:
>>
>> Hi Marc,
>>
>> On 4/1/21 3:42 PM, Marc Zyngier wrote:
>>> Hi Eric,
>>>
>>> On Thu, 01 Apr 2021 09:52:37 +0100,
>>> Eric Auger <[email protected]> wrote:
>>>>
>>>> Commit 23bde34771f1 ("KVM: arm64: vgic-v3: Drop the
>>>> reporting of GICR_TYPER.Last for userspace") temporarily fixed
>>>> a bug identified when attempting to access the GICR_TYPER
>>>> register before the redistributor region setting, but dropped
>>>> the support of the LAST bit.
>>>>
>>>> Emulating the GICR_TYPER.Last bit still makes sense for
>>>> architecture compliance though. This patch restores its support
>>>> (if the redistributor region was set) while keeping the code safe.
>>>>
>>>> We introduce a new helper, vgic_mmio_vcpu_rdist_is_last() which
>>>> computes whether a redistributor is the highest one of a series
>>>> of redistributor contributor pages.
>>>>
>>>> The spec says "Indicates whether this Redistributor is the
>>>> highest-numbered Redistributor in a series of contiguous
>>>> Redistributor pages."
>>>>
>>>> The code is a bit convulated since there is no guarantee
>>>
>>> nit: convoluted
>>>
>>>> redistributors are added in a given reditributor region in
>>>> ascending order. In that case the current implementation was
>>>> wrong. Also redistributor regions can be contiguous
>>>> and registered in non increasing base address order.
>>>>
>>>> So the index of redistributors are stored in an array within
>>>> the redistributor region structure.
>>>>
>>>> With this new implementation we do not need to have a uaccess
>>>> read accessor anymore.
>>>>
>>>> Signed-off-by: Eric Auger <[email protected]>
>>>
>>> This patch also hurt my head, a lot more than the first one. See
>>> below.
>>>
>>>> ---
>>>> arch/arm64/kvm/vgic/vgic-init.c | 7 +--
>>>> arch/arm64/kvm/vgic/vgic-mmio-v3.c | 97 ++++++++++++++++++++----------
>>>> arch/arm64/kvm/vgic/vgic.h | 1 +
>>>> include/kvm/arm_vgic.h | 3 +
>>>> 4 files changed, 73 insertions(+), 35 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/kvm/vgic/vgic-init.c b/arch/arm64/kvm/vgic/vgic-init.c
>>>> index cf6faa0aeddb2..61150c34c268c 100644
>>>> --- a/arch/arm64/kvm/vgic/vgic-init.c
>>>> +++ b/arch/arm64/kvm/vgic/vgic-init.c
>>>> @@ -190,6 +190,7 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu)
>>>> int i;
>>>>
>>>> vgic_cpu->rd_iodev.base_addr = VGIC_ADDR_UNDEF;
>>>> + vgic_cpu->index = vcpu->vcpu_id;
>>>
>>> Is it so that vgic_cpu->index is always equal to vcpu_id? If so, why
>>> do we need another field? We can always get to the vcpu using a
>>> container_of().
>>>
>>>>
>>>> INIT_LIST_HEAD(&vgic_cpu->ap_list_head);
>>>> raw_spin_lock_init(&vgic_cpu->ap_list_lock);
>>>> @@ -338,10 +339,8 @@ static void kvm_vgic_dist_destroy(struct kvm *kvm)
>>>> dist->vgic_dist_base = VGIC_ADDR_UNDEF;
>>>>
>>>> if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3) {
>>>> - list_for_each_entry_safe(rdreg, next, &dist->rd_regions, list) {
>>>> - list_del(&rdreg->list);
>>>> - kfree(rdreg);
>>>> - }
>>>> + list_for_each_entry_safe(rdreg, next, &dist->rd_regions, list)
>>>> + vgic_v3_free_redist_region(rdreg);
>>>
>>> Consider moving the introduction of vgic_v3_free_redist_region() into
>>> a separate patch. On its own, that's a good readability improvement.
>>>
>>>> INIT_LIST_HEAD(&dist->rd_regions);
>>>> } else {
>>>> dist->vgic_cpu_base = VGIC_ADDR_UNDEF;
>>>> diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
>>>> index 987e366c80008..f6a7eed1d6adb 100644
>>>> --- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
>>>> +++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
>>>> @@ -251,45 +251,57 @@ static void vgic_mmio_write_v3r_ctlr(struct kvm_vcpu *vcpu,
>>>> vgic_enable_lpis(vcpu);
>>>> }
>>>>
>>>> +static bool vgic_mmio_vcpu_rdist_is_last(struct kvm_vcpu *vcpu)
>>>> +{
>>>> + struct vgic_dist *vgic = &vcpu->kvm->arch.vgic;
>>>> + struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>>> + struct vgic_redist_region *rdreg = vgic_cpu->rdreg;
>>>> +
>>>> + if (!rdreg)
>>>> + return false;
>>>> +
>>>> + if (rdreg->count && vgic_cpu->rdreg_index == (rdreg->count - 1)) {
>>>> + /* check whether there is no other contiguous rdist region */
>>>> + struct list_head *rd_regions = &vgic->rd_regions;
>>>> + struct vgic_redist_region *iter;
>>>> +
>>>> + list_for_each_entry(iter, rd_regions, list) {
>>>> + if (iter->base == rdreg->base + rdreg->count * KVM_VGIC_V3_REDIST_SIZE &&
>>>> + iter->free_index > 0) {
>>>> + /* check the first rdist index of this region, if any */
>>>> + if (vgic_cpu->index < iter->rdist_indices[0])
>>>> + return false;
>>>
>>> rdist_indices[] contains the vcpu_id of the vcpu associated with a
>>> given RD in the region. At this stage, you have established that there
>>> is another region that is contiguous with the one associated with our
>>> vcpu. You also know that this adjacent region has a vcpu mapped in
>>> (free_index isn't 0). Isn't that enough to declare that our vcpu isn't
>>> last? I definitely don't understand what the index comparison does
>>> here.
>> Assume the following case:
>> 2 RDIST region
>> region #0 contains rdist 1, 2, 4
>> region #1, adjacent to #0 contains rdist 3
>>
>> Spec says:
>> "Indicates whether this Redistributor is the
>> highest-numbered Redistributor in a series of contiguous
>> Redistributor pages."
>>
>> To me 4 is last and 3 is last too.
>
> No, only 3 is last, assuming that region 0 is full. I think the
> phrasing in the spec is just really bad. What this describes is that
> at the end of a set of contiguous set of RDs, that last RD has Last
> set. If two regions are contiguous, that's undistinguishable from a
> single, larger region.
>
> There is no such thing as a "redistributor number" anyway. The closest
> thing there is would be "processor number", but that has nothing to do
> with the RD itself.
Hmm, OK. That's a different understanding of the spec wording indeed. For
me, the redistributor number was the index of the vcpu.
But well, your understanding is definitely simpler to implement and
also matches what was implemented for a single RDIST region.
>
>>
>>
>>>
>>> It also seem to me that some of the complexity could be eliminated if
>>> the regions were kept ordered at list insertion time.
>> yes
>>>
>>>> + }
>>>> + }
>>>> + } else if (vgic_cpu->rdreg_index < rdreg->free_index - 1) {
>>>> + /* look at the index of next rdist */
>>>> + int next_rdist_index = rdreg->rdist_indices[vgic_cpu->rdreg_index + 1];
>>>> +
>>>> + if (vgic_cpu->index < next_rdist_index)
>>>> + return false;
>>>
>>> Same thing here. We are in the middle of the allocated part of a
>>> region, which means we cannot be last. I still don't get the index
>>> check.
>> Because within a region, nothing hinders rdist from being allocated in
>> non ascending order. I exercise those cases in the kvmselftests
>>
>> one single RDIST region with the following rdists allocated there:
>> 1, 3, 2
>>
>> 3 and 2 are "last", right? Or did I miss something. Yes that's totally
>> not natural to do that kind of allocation but the API allows to do that.
>
> No, only 2 is last. I think you got tripped by the bizarre language in
> the spec, and the behaviour of this Last bit is much simpler than what
> you ended up with.
OK, I will respin according to your suggestion then.
Thanks
Eric
>
> Thanks,
>
> M.
>
Hi Eric,
On Thu, 01 Apr 2021 20:16:53 +0100,
Auger Eric <[email protected]> wrote:
>
> Hi Marc,
>
> On 4/1/21 7:30 PM, Marc Zyngier wrote:
> > On Thu, 01 Apr 2021 18:03:25 +0100,
> > Auger Eric <[email protected]> wrote:
> >>
> >> Hi Marc,
> >>
> >> On 4/1/21 3:42 PM, Marc Zyngier wrote:
> >>> Hi Eric,
> >>>
> >>> On Thu, 01 Apr 2021 09:52:37 +0100,
> >>> Eric Auger <[email protected]> wrote:
> >>>>
> >>>> Commit 23bde34771f1 ("KVM: arm64: vgic-v3: Drop the
> >>>> reporting of GICR_TYPER.Last for userspace") temporarily fixed
> >>>> a bug identified when attempting to access the GICR_TYPER
> >>>> register before the redistributor region setting, but dropped
> >>>> the support of the LAST bit.
> >>>>
> >>>> Emulating the GICR_TYPER.Last bit still makes sense for
> >>>> architecture compliance though. This patch restores its support
> >>>> (if the redistributor region was set) while keeping the code safe.
> >>>>
> >>>> We introduce a new helper, vgic_mmio_vcpu_rdist_is_last() which
> >>>> computes whether a redistributor is the highest one of a series
> >>>> of redistributor contributor pages.
> >>>>
> >>>> The spec says "Indicates whether this Redistributor is the
> >>>> highest-numbered Redistributor in a series of contiguous
> >>>> Redistributor pages."
> >>>>
> >>>> The code is a bit convulated since there is no guarantee
> >>>
> >>> nit: convoluted
> >>>
> >>>> redistributors are added in a given reditributor region in
> >>>> ascending order. In that case the current implementation was
> >>>> wrong. Also redistributor regions can be contiguous
> >>>> and registered in non increasing base address order.
> >>>>
> >>>> So the index of redistributors are stored in an array within
> >>>> the redistributor region structure.
> >>>>
> >>>> With this new implementation we do not need to have a uaccess
> >>>> read accessor anymore.
> >>>>
> >>>> Signed-off-by: Eric Auger <[email protected]>
> >>>
> >>> This patch also hurt my head, a lot more than the first one. See
> >>> below.
> >>>
> >>>> ---
> >>>> arch/arm64/kvm/vgic/vgic-init.c | 7 +--
> >>>> arch/arm64/kvm/vgic/vgic-mmio-v3.c | 97 ++++++++++++++++++++----------
> >>>> arch/arm64/kvm/vgic/vgic.h | 1 +
> >>>> include/kvm/arm_vgic.h | 3 +
> >>>> 4 files changed, 73 insertions(+), 35 deletions(-)
> >>>>
> >>>> diff --git a/arch/arm64/kvm/vgic/vgic-init.c b/arch/arm64/kvm/vgic/vgic-init.c
> >>>> index cf6faa0aeddb2..61150c34c268c 100644
> >>>> --- a/arch/arm64/kvm/vgic/vgic-init.c
> >>>> +++ b/arch/arm64/kvm/vgic/vgic-init.c
> >>>> @@ -190,6 +190,7 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu)
> >>>> int i;
> >>>>
> >>>> vgic_cpu->rd_iodev.base_addr = VGIC_ADDR_UNDEF;
> >>>> + vgic_cpu->index = vcpu->vcpu_id;
> >>>
> >>> Is it so that vgic_cpu->index is always equal to vcpu_id? If so, why
> >>> do we need another field? We can always get to the vcpu using a
> >>> container_of().
> >>>
> >>>>
> >>>> INIT_LIST_HEAD(&vgic_cpu->ap_list_head);
> >>>> raw_spin_lock_init(&vgic_cpu->ap_list_lock);
> >>>> @@ -338,10 +339,8 @@ static void kvm_vgic_dist_destroy(struct kvm *kvm)
> >>>> dist->vgic_dist_base = VGIC_ADDR_UNDEF;
> >>>>
> >>>> if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3) {
> >>>> - list_for_each_entry_safe(rdreg, next, &dist->rd_regions, list) {
> >>>> - list_del(&rdreg->list);
> >>>> - kfree(rdreg);
> >>>> - }
> >>>> + list_for_each_entry_safe(rdreg, next, &dist->rd_regions, list)
> >>>> + vgic_v3_free_redist_region(rdreg);
> >>>
> >>> Consider moving the introduction of vgic_v3_free_redist_region() into
> >>> a separate patch. On its own, that's a good readability improvement.
> >>>
> >>>> INIT_LIST_HEAD(&dist->rd_regions);
> >>>> } else {
> >>>> dist->vgic_cpu_base = VGIC_ADDR_UNDEF;
> >>>> diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> >>>> index 987e366c80008..f6a7eed1d6adb 100644
> >>>> --- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> >>>> +++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> >>>> @@ -251,45 +251,57 @@ static void vgic_mmio_write_v3r_ctlr(struct kvm_vcpu *vcpu,
> >>>> vgic_enable_lpis(vcpu);
> >>>> }
> >>>>
> >>>> +static bool vgic_mmio_vcpu_rdist_is_last(struct kvm_vcpu *vcpu)
> >>>> +{
> >>>> + struct vgic_dist *vgic = &vcpu->kvm->arch.vgic;
> >>>> + struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> >>>> + struct vgic_redist_region *rdreg = vgic_cpu->rdreg;
> >>>> +
> >>>> + if (!rdreg)
> >>>> + return false;
> >>>> +
> >>>> + if (rdreg->count && vgic_cpu->rdreg_index == (rdreg->count - 1)) {
> >>>> + /* check whether there is no other contiguous rdist region */
> >>>> + struct list_head *rd_regions = &vgic->rd_regions;
> >>>> + struct vgic_redist_region *iter;
> >>>> +
> >>>> + list_for_each_entry(iter, rd_regions, list) {
> >>>> + if (iter->base == rdreg->base + rdreg->count * KVM_VGIC_V3_REDIST_SIZE &&
> >>>> + iter->free_index > 0) {
> >>>> + /* check the first rdist index of this region, if any */
> >>>> + if (vgic_cpu->index < iter->rdist_indices[0])
> >>>> + return false;
> >>>
> >>> rdist_indices[] contains the vcpu_id of the vcpu associated with a
> >>> given RD in the region. At this stage, you have established that there
> >>> is another region that is contiguous with the one associated with our
> >>> vcpu. You also know that this adjacent region has a vcpu mapped in
> >>> (free_index isn't 0). Isn't that enough to declare that our vcpu isn't
> >>> last? I definitely don't understand what the index comparison does
> >>> here.
> >> Assume the following case:
> >> 2 RDIST region
> >> region #0 contains rdist 1, 2, 4
> >> region #1, adjacent to #0 contains rdist 3
> >>
> >> Spec says:
> >> "Indicates whether this Redistributor is the
> >> highest-numbered Redistributor in a series of contiguous
> >> Redistributor pages."
> >>
> >> To me 4 is last and 3 is last too.
> >
> > No, only 3 is last, assuming that region 0 is full. I think the
> > phrasing in the spec is just really bad. What this describes is that
> > at the end of a set of contiguous set of RDs, that last RD has Last
> > set. If two regions are contiguous, that's undistinguishable from a
> > single, larger region.
> >
> > There is no such thing as a "redistributor number" anyway. The closest
> > thing there is would be "processor number", but that has nothing to do
> > with the RD itself.
>
> Hum OK. That's a different understanding of the spec wording indeed. For
> me redistributor number was the index of the vcpu.
I think that's the source of the confusion. There really is nothing
like a redistributor number. There is a processor number when
GICR_TYPER.PTA=0 (that the guest uses as the target CPU when moving a
LPI), but that's it. The layout is totally dumb, and the last frame in
a contiguous sequence of frames is, well, last. The content of the
frames doesn't matter in the least.
> But well, you're understanding is definitively simpler to implement and
> also matches what was implemented for single RDIST region.
That's a key insight. There is no reason why the RD layout would differ
between a single region and multiple regions.
Think of it from a HW perspective. You design a SoC that has
"clusters" of CPUs, and you lay down a bunch of RDs, one set per
cluster. Each set has a "Last" RD frame, and that's all there is to
it.
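To make the HW picture concrete, here is a small userspace sketch of
how software might walk one contiguous set of RD frames until it hits
Last. It is a simplification under stated assumptions: the frames are
modelled as an array of GICR_TYPER values rather than real MMIO, and
count_rd_frames() is a hypothetical name. Only the Last bit position
(bit 4 of GICR_TYPER) is taken from the architecture:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GICR_TYPER_LAST (1ULL << 4)

/*
 * Count the RD frames in one contiguous set. typer[] holds one
 * GICR_TYPER value per frame, in address order; the walk stops at the
 * first frame with Last set, or after max_frames as a safety bound.
 */
static size_t count_rd_frames(const uint64_t *typer, size_t max_frames)
{
	size_t n = 0;

	assert(typer);

	while (n < max_frames) {
		uint64_t val = typer[n++];

		if (val & GICR_TYPER_LAST)
			break;
	}
	return n;
}
```

The walk never looks at which CPU a frame belongs to. Only the frame's
position in the sequence decides where the set ends.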
I'll try and see if ARM people are willing to clarify the spec (for
which an update is long overdue).
Thanks,
M.
--
Without deviation from the norm, progress is not possible.