2021-04-20 10:43:06

by Isaku Yamahata

Subject: [RFC PATCH 00/10] KVM: x86/mmu: simplify argument to kvm page fault handler

This is a preliminary cleanup for TDX, which complicates the KVM page
fault execution path. Simplify those paths as preparation.

The current KVM page fault handlers pass many arguments from function
to function. To simplify those arguments and local variables, introduce
a data structure, struct kvm_page_fault, to hold them, and pass around
a pointer to struct kvm_page_fault instead.

struct kvm_page_fault is allocated on the stack by the caller of the KVM
fault handler, kvm_mmu_do_page_fault(), and the pointer is then pushed
down into the inner functions step by step.
The conversion order is as follows:
. kvm_mmu_do_page_fault() with introducing struct kvm_page_fault
. kvm_mmu.page_fault(): kvm_tdp_page_fault(), nonpaging_page_fault(), FNAME(page_fault)
. direct_page_fault()
. try_async_pf()
. page_fault_handle_page_track()
. handle_abnormal_pfn()
. fast_page_fault()
. __direct_map()
. kvm_tdp_mmu_map()
. FNAME(fetch)

More functions could probably be converted, and some perhaps should not
be. This is purely a refactoring; no functional change is intended.
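
For reference, here is a condensed sketch of the pattern (distilled
from patches 01 and 03 below; simplified, not a complete listing):

  /* Before: each helper takes the same long argument list. */
  static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa,
                               u32 error_code, bool prefault,
                               int max_level, bool is_tdp);

  /* After: callers pack the arguments once and pass one pointer. */
  struct kvm_page_fault {
          /* arguments to kvm page fault handler */
          struct kvm_vcpu *vcpu;
          gpa_t cr2_or_gpa;
          u32 error_code;
          bool prefault;

          /* internal state filled in while the fault is handled */
          gfn_t gfn;
          bool is_tdp;
          int max_level;
  };

  static int direct_page_fault(struct kvm_page_fault *kpf);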

Isaku Yamahata (10):
KVM: x86/mmu: make kvm_mmu_do_page_fault() receive single argument
KVM: x86/mmu: make kvm_mmu.page_fault receive single argument
KVM: x86/mmu: make direct_page_fault() receive single argument
KVM: x86/mmu: make try_async_pf() receive single argument
KVM: x86/mmu: make page_fault_handle_page_track() receive single
argument
KVM: x86/mmu: make handle_abnormal_pfn() receive single argument
KVM: x86/mmu: make fast_page_fault() receive single argument
KVM: x86/mmu: make __direct_map() receive single argument
KVM: x86/mmu: make kvm_tdp_mmu_map() receive single argument
KVM: x86/mmu: make FNAME(fetch) receive single argument

arch/x86/include/asm/kvm_host.h | 4 +-
arch/x86/kvm/mmu.h | 49 ++++++++++--
arch/x86/kvm/mmu/mmu.c | 130 +++++++++++++++++---------------
arch/x86/kvm/mmu/paging_tmpl.h | 60 +++++++--------
arch/x86/kvm/mmu/tdp_mmu.c | 21 +++---
arch/x86/kvm/mmu/tdp_mmu.h | 4 +-
arch/x86/kvm/x86.c | 4 +-
7 files changed, 156 insertions(+), 116 deletions(-)

--
2.25.1


2021-04-20 10:43:15

by Isaku Yamahata

Subject: [RFC PATCH 01/10] KVM: x86/mmu: make kvm_mmu_do_page_fault() receive single argument

Introduce struct kvm_page_fault and its initialization function.
Make the caller of the KVM page fault handler allocate and initialize
struct kvm_page_fault, and pass it to kvm_mmu_do_page_fault() instead
of many arguments.

No functional change is intended.

Signed-off-by: Isaku Yamahata <[email protected]>
---
arch/x86/kvm/mmu.h | 29 ++++++++++++++++++++++++-----
arch/x86/kvm/mmu/mmu.c | 6 ++++--
arch/x86/kvm/x86.c | 4 +++-
3 files changed, 31 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index c68bfc3e2402..245c5d7fd3dd 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -106,17 +106,36 @@ static inline void kvm_mmu_load_pgd(struct kvm_vcpu *vcpu)
vcpu->arch.mmu->shadow_root_level);
}

+struct kvm_page_fault {
+ /* arguments to kvm page fault handler */
+ struct kvm_vcpu *vcpu;
+ gpa_t cr2_or_gpa;
+ u32 error_code;
+ bool prefault;
+};
+
+static inline void kvm_page_fault_init(
+ struct kvm_page_fault *kpf, struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
+ u32 error_code, bool prefault)
+{
+ kpf->vcpu = vcpu;
+ kpf->cr2_or_gpa = cr2_or_gpa;
+ kpf->error_code = error_code;
+ kpf->prefault = prefault;
+}
+
int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
bool prefault);

-static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
- u32 err, bool prefault)
+static inline int kvm_mmu_do_page_fault(struct kvm_page_fault *kpf)
{
#ifdef CONFIG_RETPOLINE
- if (likely(vcpu->arch.mmu->page_fault == kvm_tdp_page_fault))
- return kvm_tdp_page_fault(vcpu, cr2_or_gpa, err, prefault);
+ if (likely(kpf->vcpu->arch.mmu->page_fault == kvm_tdp_page_fault))
+ return kvm_tdp_page_fault(kpf->vcpu, kpf->cr2_or_gpa,
+ kpf->error_code, kpf->prefault);
#endif
- return vcpu->arch.mmu->page_fault(vcpu, cr2_or_gpa, err, prefault);
+ return kpf->vcpu->arch.mmu->page_fault(kpf->vcpu, kpf->cr2_or_gpa,
+ kpf->error_code, kpf->prefault);
}

/*
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 951dae4e7175..8ea2afcb528c 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5006,6 +5006,7 @@ int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 error_code,
{
int r, emulation_type = EMULTYPE_PF;
bool direct = vcpu->arch.mmu->direct_map;
+ struct kvm_page_fault kpf;

if (WARN_ON(!VALID_PAGE(vcpu->arch.mmu->root_hpa)))
return RET_PF_RETRY;
@@ -5018,8 +5019,9 @@ int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 error_code,
}

if (r == RET_PF_INVALID) {
- r = kvm_mmu_do_page_fault(vcpu, cr2_or_gpa,
- lower_32_bits(error_code), false);
+ kvm_page_fault_init(&kpf, vcpu, cr2_or_gpa,
+ lower_32_bits(error_code), false);
+ r = kvm_mmu_do_page_fault(&kpf);
if (WARN_ON_ONCE(r == RET_PF_INVALID))
return -EIO;
}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index eca63625aee4..999ed561de64 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11083,6 +11083,7 @@ EXPORT_SYMBOL_GPL(kvm_set_rflags);
void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu, struct kvm_async_pf *work)
{
int r;
+ struct kvm_page_fault kpf;

if ((vcpu->arch.mmu->direct_map != work->arch.direct_map) ||
work->wakeup_all)
@@ -11096,7 +11097,8 @@ void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu, struct kvm_async_pf *work)
work->arch.cr3 != vcpu->arch.mmu->get_guest_pgd(vcpu))
return;

- kvm_mmu_do_page_fault(vcpu, work->cr2_or_gpa, 0, true);
+ kvm_page_fault_init(&kpf, vcpu, work->cr2_or_gpa, 0, true);
+ kvm_mmu_do_page_fault(&kpf);
}

static inline u32 kvm_async_pf_hash_fn(gfn_t gfn)
--
2.25.1

2021-04-20 10:43:32

by Isaku Yamahata

Subject: [RFC PATCH 03/10] KVM: x86/mmu: make direct_page_fault() receive single argument

Convert direct_page_fault() to receive struct kvm_page_fault instead of
many arguments.

No functional change is intended.

Signed-off-by: Isaku Yamahata <[email protected]>
---
arch/x86/kvm/mmu.h | 10 ++++++++++
arch/x86/kvm/mmu/mmu.c | 32 ++++++++++++++++++++------------
2 files changed, 30 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 7fcd9c147e63..fa3b1df502e7 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -112,6 +112,11 @@ struct kvm_page_fault {
gpa_t cr2_or_gpa;
u32 error_code;
bool prefault;
+
+ /* internal state */
+ gfn_t gfn;
+ bool is_tdp;
+ int max_level;
};

static inline void kvm_page_fault_init(
@@ -122,6 +127,11 @@ static inline void kvm_page_fault_init(
kpf->cr2_or_gpa = cr2_or_gpa;
kpf->error_code = error_code;
kpf->prefault = prefault;
+
+ /* default value */
+ kpf->is_tdp = false;
+ kpf->gfn = cr2_or_gpa >> PAGE_SHIFT;
+ kpf->max_level = PG_LEVEL_4K;
}

int kvm_tdp_page_fault(struct kvm_page_fault *kpf);
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 46998cfabfd3..cb90148f90af 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3681,13 +3681,18 @@ static bool try_async_pf(struct kvm_vcpu *vcpu, bool prefault, gfn_t gfn,
return false;
}

-static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
- bool prefault, int max_level, bool is_tdp)
-{
+static int direct_page_fault(struct kvm_page_fault *kpf)
+{
+ struct kvm_vcpu *vcpu = kpf->vcpu;
+ gpa_t gpa = kpf->cr2_or_gpa;
+ u32 error_code = kpf->error_code;
+ bool prefault = kpf->prefault;
+ int max_level = kpf->max_level;
+ bool is_tdp = kpf->is_tdp;
bool write = error_code & PFERR_WRITE_MASK;
bool map_writable;

- gfn_t gfn = gpa >> PAGE_SHIFT;
+ gfn_t gfn = kpf->gfn;
unsigned long mmu_seq;
kvm_pfn_t pfn;
hva_t hva;
@@ -3750,10 +3755,12 @@ static int nonpaging_page_fault(struct kvm_page_fault *kpf)
pgprintk("%s: gva %lx error %x\n", __func__,
kpf->cr2_or_gpa, kpf->error_code);

+ kpf->cr2_or_gpa &= PAGE_MASK;
+ kpf->is_tdp = false;
+ kpf->max_level = PG_LEVEL_2M;
+
/* This path builds a PAE pagetable, we can map 2mb pages at maximum. */
- return direct_page_fault(kpf->vcpu, kpf->cr2_or_gpa & PAGE_MASK,
- kpf->error_code,
- kpf->prefault, PG_LEVEL_2M, false);
+ return direct_page_fault(kpf);
}

int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code,
@@ -3791,21 +3798,22 @@ EXPORT_SYMBOL_GPL(kvm_handle_page_fault);

int kvm_tdp_page_fault(struct kvm_page_fault *kpf)
{
- gpa_t gpa = kpf->cr2_or_gpa;
+ struct kvm_vcpu *vcpu = kpf->vcpu;
int max_level;
+ kpf->is_tdp = true;

for (max_level = KVM_MAX_HUGEPAGE_LEVEL;
max_level > PG_LEVEL_4K;
max_level--) {
int page_num = KVM_PAGES_PER_HPAGE(max_level);
- gfn_t base = (gpa >> PAGE_SHIFT) & ~(page_num - 1);
+ gfn_t base = kpf->gfn & ~(page_num - 1);

- if (kvm_mtrr_check_gfn_range_consistency(kpf->vcpu, base, page_num))
+ if (kvm_mtrr_check_gfn_range_consistency(vcpu, base, page_num))
break;
}
+ kpf->max_level = max_level;

- return direct_page_fault(kpf->vcpu, gpa, kpf->error_code, kpf->prefault,
- max_level, true);
+ return direct_page_fault(kpf);
}

static void nonpaging_init_context(struct kvm_vcpu *vcpu,
--
2.25.1

2021-04-20 10:43:48

by Isaku Yamahata

Subject: [RFC PATCH 02/10] KVM: x86/mmu: make kvm_mmu.page_fault receive single argument

Convert the kvm_mmu.page_fault callback to receive struct kvm_page_fault
instead of many arguments.
The following functions are converted by this patch:
kvm_tdp_page_fault(), nonpaging_page_fault(), and FNAME(page_fault).

No functional change is intended.

Signed-off-by: Isaku Yamahata <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 4 ++--
arch/x86/kvm/mmu.h | 9 +++------
arch/x86/kvm/mmu/mmu.c | 19 ++++++++++---------
arch/x86/kvm/mmu/paging_tmpl.h | 7 +++++--
4 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3768819693e5..97e72076f358 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -351,6 +351,7 @@ struct kvm_mmu_root_info {
#define KVM_HAVE_MMU_RWLOCK

struct kvm_mmu_page;
+struct kvm_page_fault;

/*
* x86 supports 4 paging modes (5-level 64-bit, 4-level 64-bit, 3-level 32-bit,
@@ -360,8 +361,7 @@ struct kvm_mmu_page;
struct kvm_mmu {
unsigned long (*get_guest_pgd)(struct kvm_vcpu *vcpu);
u64 (*get_pdptr)(struct kvm_vcpu *vcpu, int index);
- int (*page_fault)(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u32 err,
- bool prefault);
+ int (*page_fault)(struct kvm_page_fault *kpf);
void (*inject_page_fault)(struct kvm_vcpu *vcpu,
struct x86_exception *fault);
gpa_t (*gva_to_gpa)(struct kvm_vcpu *vcpu, gpa_t gva_or_gpa,
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 245c5d7fd3dd..7fcd9c147e63 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -124,18 +124,15 @@ static inline void kvm_page_fault_init(
kpf->prefault = prefault;
}

-int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
- bool prefault);
+int kvm_tdp_page_fault(struct kvm_page_fault *kpf);

static inline int kvm_mmu_do_page_fault(struct kvm_page_fault *kpf)
{
#ifdef CONFIG_RETPOLINE
if (likely(kpf->vcpu->arch.mmu->page_fault == kvm_tdp_page_fault))
- return kvm_tdp_page_fault(kpf->vcpu, kpf->cr2_or_gpa,
- kpf->error_code, kpf->prefault);
+ return kvm_tdp_page_fault(kpf);
#endif
- return kpf->vcpu->arch.mmu->page_fault(kpf->vcpu, kpf->cr2_or_gpa,
- kpf->error_code, kpf->prefault);
+ return kpf->vcpu->arch.mmu->page_fault(kpf);
}

/*
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 8ea2afcb528c..46998cfabfd3 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3745,14 +3745,15 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
return r;
}

-static int nonpaging_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa,
- u32 error_code, bool prefault)
+static int nonpaging_page_fault(struct kvm_page_fault *kpf)
{
- pgprintk("%s: gva %lx error %x\n", __func__, gpa, error_code);
+ pgprintk("%s: gva %lx error %x\n", __func__,
+ kpf->cr2_or_gpa, kpf->error_code);

/* This path builds a PAE pagetable, we can map 2mb pages at maximum. */
- return direct_page_fault(vcpu, gpa & PAGE_MASK, error_code, prefault,
- PG_LEVEL_2M, false);
+ return direct_page_fault(kpf->vcpu, kpf->cr2_or_gpa & PAGE_MASK,
+ kpf->error_code,
+ kpf->prefault, PG_LEVEL_2M, false);
}

int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code,
@@ -3788,9 +3789,9 @@ int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code,
}
EXPORT_SYMBOL_GPL(kvm_handle_page_fault);

-int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
- bool prefault)
+int kvm_tdp_page_fault(struct kvm_page_fault *kpf)
{
+ gpa_t gpa = kpf->cr2_or_gpa;
int max_level;

for (max_level = KVM_MAX_HUGEPAGE_LEVEL;
@@ -3799,11 +3800,11 @@ int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
int page_num = KVM_PAGES_PER_HPAGE(max_level);
gfn_t base = (gpa >> PAGE_SHIFT) & ~(page_num - 1);

- if (kvm_mtrr_check_gfn_range_consistency(vcpu, base, page_num))
+ if (kvm_mtrr_check_gfn_range_consistency(kpf->vcpu, base, page_num))
break;
}

- return direct_page_fault(vcpu, gpa, error_code, prefault,
+ return direct_page_fault(kpf->vcpu, gpa, kpf->error_code, kpf->prefault,
max_level, true);
}

diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 55d7b473ac44..dc814463a8df 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -789,9 +789,12 @@ FNAME(is_self_change_mapping)(struct kvm_vcpu *vcpu,
* Returns: 1 if we need to emulate the instruction, 0 otherwise, or
* a negative value on error.
*/
-static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gpa_t addr, u32 error_code,
- bool prefault)
+static int FNAME(page_fault)(struct kvm_page_fault *kpf)
{
+ struct kvm_vcpu *vcpu = kpf->vcpu;
+ gpa_t addr = kpf->cr2_or_gpa;
+ u32 error_code = kpf->error_code;
+ bool prefault = kpf->prefault;
bool write_fault = error_code & PFERR_WRITE_MASK;
bool user_fault = error_code & PFERR_USER_MASK;
struct guest_walker walker;
--
2.25.1

2021-04-20 10:44:26

by Isaku Yamahata

Subject: [RFC PATCH 05/10] KVM: x86/mmu: make page_fault_handle_page_track() receive single argument

Convert page_fault_handle_page_track() to receive a single argument,
struct kvm_page_fault, instead of many arguments.

No functional change is intended.

Signed-off-by: Isaku Yamahata <[email protected]>
---
arch/x86/kvm/mmu/mmu.c | 9 +++++----
arch/x86/kvm/mmu/paging_tmpl.h | 4 ++--
2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index a2422bd9f59b..dac022a79c57 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3598,9 +3598,10 @@ static int handle_mmio_page_fault(struct kvm_vcpu *vcpu, u64 addr, bool direct)
return RET_PF_RETRY;
}

-static bool page_fault_handle_page_track(struct kvm_vcpu *vcpu,
- u32 error_code, gfn_t gfn)
+static bool page_fault_handle_page_track(struct kvm_page_fault *kpf)
{
+ u32 error_code = kpf->error_code;
+
if (unlikely(error_code & PFERR_RSVD_MASK))
return false;

@@ -3612,7 +3613,7 @@ static bool page_fault_handle_page_track(struct kvm_vcpu *vcpu,
* guest is writing the page which is write tracked which can
* not be fixed by page fault handler.
*/
- if (kvm_page_track_is_active(vcpu, gfn, KVM_PAGE_TRACK_WRITE))
+ if (kvm_page_track_is_active(kpf->vcpu, kpf->gfn, KVM_PAGE_TRACK_WRITE))
return true;

return false;
@@ -3697,7 +3698,7 @@ static int direct_page_fault(struct kvm_page_fault *kpf)
unsigned long mmu_seq;
int r;

- if (page_fault_handle_page_track(vcpu, error_code, gfn))
+ if (page_fault_handle_page_track(kpf))
return RET_PF_EMULATE;

if (!is_tdp_mmu_root(vcpu->kvm, vcpu->arch.mmu->root_hpa)) {
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index f2beb7f7c378..7965786418af 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -827,7 +827,8 @@ static int FNAME(page_fault)(struct kvm_page_fault *kpf)
return RET_PF_RETRY;
}

- if (page_fault_handle_page_track(vcpu, error_code, walker.gfn)) {
+ kpf->gfn = walker.gfn;
+ if (page_fault_handle_page_track(kpf)) {
shadow_page_table_clear_flood(vcpu, addr);
return RET_PF_EMULATE;
}
@@ -849,7 +850,6 @@ static int FNAME(page_fault)(struct kvm_page_fault *kpf)
mmu_seq = vcpu->kvm->mmu_notifier_seq;
smp_rmb();

- kpf->gfn = walker.gfn;
if (try_async_pf(kpf))
return RET_PF_RETRY;

--
2.25.1

2021-04-20 10:44:38

by Isaku Yamahata

Subject: [RFC PATCH 08/10] KVM: x86/mmu: make __direct_map() receive single argument

Convert __direct_map() to receive a single argument, struct
kvm_page_fault, instead of many arguments.

No functional change is intended.

Signed-off-by: Isaku Yamahata <[email protected]>
---
arch/x86/kvm/mmu/mmu.c | 27 ++++++++++++---------------
1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index ce48416380c3..b58afb58430e 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2856,27 +2856,26 @@ void disallowed_hugepage_adjust(u64 spte, gfn_t gfn, int cur_level,
}
}

-static int __direct_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
- int map_writable, int max_level, kvm_pfn_t pfn,
- bool prefault, bool is_tdp)
+static int __direct_map(struct kvm_page_fault *kpf)
{
+ struct kvm_vcpu *vcpu = kpf->vcpu;
bool nx_huge_page_workaround_enabled = is_nx_huge_page_enabled();
- bool write = error_code & PFERR_WRITE_MASK;
- bool exec = error_code & PFERR_FETCH_MASK;
+ bool exec = kpf->error_code & PFERR_FETCH_MASK;
bool huge_page_disallowed = exec && nx_huge_page_workaround_enabled;
struct kvm_shadow_walk_iterator it;
struct kvm_mmu_page *sp;
int level, req_level, ret;
- gfn_t gfn = gpa >> PAGE_SHIFT;
+ gpa_t gpa = kpf->cr2_or_gpa;
+ gfn_t gfn = kpf->gfn;
gfn_t base_gfn = gfn;

if (WARN_ON(!VALID_PAGE(vcpu->arch.mmu->root_hpa)))
return RET_PF_RETRY;

- level = kvm_mmu_hugepage_adjust(vcpu, gfn, max_level, &pfn,
+ level = kvm_mmu_hugepage_adjust(vcpu, gfn, kpf->max_level, &kpf->pfn,
huge_page_disallowed, &req_level);

- trace_kvm_mmu_spte_requested(gpa, level, pfn);
+ trace_kvm_mmu_spte_requested(gpa, level, kpf->pfn);
for_each_shadow_entry(vcpu, gpa, it) {
/*
* We cannot overwrite existing page tables with an NX
@@ -2884,7 +2883,7 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
*/
if (nx_huge_page_workaround_enabled)
disallowed_hugepage_adjust(*it.sptep, gfn, it.level,
- &pfn, &level);
+ &kpf->pfn, &level);

base_gfn = gfn & ~(KVM_PAGES_PER_HPAGE(it.level) - 1);
if (it.level == level)
@@ -2896,15 +2895,15 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
it.level - 1, true, ACC_ALL);

link_shadow_page(vcpu, it.sptep, sp);
- if (is_tdp && huge_page_disallowed &&
+ if (kpf->is_tdp && huge_page_disallowed &&
req_level >= it.level)
account_huge_nx_page(vcpu->kvm, sp);
}
}

ret = mmu_set_spte(vcpu, it.sptep, ACC_ALL,
- write, level, base_gfn, pfn, prefault,
- map_writable);
+ kpf->write_fault, level, base_gfn, kpf->pfn, kpf->prefault,
+ kpf->map_writable);
if (ret == RET_PF_SPURIOUS)
return ret;

@@ -3697,7 +3696,6 @@ static int direct_page_fault(struct kvm_page_fault *kpf)
u32 error_code = kpf->error_code;
bool prefault = kpf->prefault;
int max_level = kpf->max_level;
- bool is_tdp = kpf->is_tdp;

unsigned long mmu_seq;
int r;
@@ -3742,8 +3740,7 @@ static int direct_page_fault(struct kvm_page_fault *kpf)
r = kvm_tdp_mmu_map(vcpu, gpa, error_code, kpf->map_writable,
max_level, kpf->pfn, prefault);
else
- r = __direct_map(vcpu, gpa, error_code, kpf->map_writable,
- max_level, kpf->pfn, prefault, is_tdp);
+ r = __direct_map(kpf);

out_unlock:
if (is_tdp_mmu_root(vcpu->kvm, vcpu->arch.mmu->root_hpa))
--
2.25.1

2021-04-20 10:44:46

by Isaku Yamahata

Subject: [RFC PATCH 09/10] KVM: x86/mmu: make kvm_tdp_mmu_map() receive single argument

Convert kvm_tdp_mmu_map() to receive a single argument, struct
kvm_page_fault, instead of many arguments.

No functional change is intended.

Signed-off-by: Isaku Yamahata <[email protected]>
---
arch/x86/kvm/mmu/mmu.c | 8 +-------
arch/x86/kvm/mmu/tdp_mmu.c | 21 +++++++++++----------
arch/x86/kvm/mmu/tdp_mmu.h | 4 +---
3 files changed, 13 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index b58afb58430e..ebac766839a9 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3692,11 +3692,6 @@ static bool try_async_pf(struct kvm_page_fault *kpf)
static int direct_page_fault(struct kvm_page_fault *kpf)
{
struct kvm_vcpu *vcpu = kpf->vcpu;
- gpa_t gpa = kpf->cr2_or_gpa;
- u32 error_code = kpf->error_code;
- bool prefault = kpf->prefault;
- int max_level = kpf->max_level;
-
unsigned long mmu_seq;
int r;

@@ -3737,8 +3732,7 @@ static int direct_page_fault(struct kvm_page_fault *kpf)
goto out_unlock;

if (is_tdp_mmu_root(vcpu->kvm, vcpu->arch.mmu->root_hpa))
- r = kvm_tdp_mmu_map(vcpu, gpa, error_code, kpf->map_writable,
- max_level, kpf->pfn, prefault);
+ r = kvm_tdp_mmu_map(kpf);
else
r = __direct_map(kpf);

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 018d82e73e31..13ae4735fc25 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -793,12 +793,11 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu, int write,
* Handle a TDP page fault (NPT/EPT violation/misconfiguration) by installing
* page tables and SPTEs to translate the faulting guest physical address.
*/
-int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
- int map_writable, int max_level, kvm_pfn_t pfn,
- bool prefault)
+int kvm_tdp_mmu_map(struct kvm_page_fault *kpf)
{
+ struct kvm_vcpu *vcpu = kpf->vcpu;
+ u32 error_code = kpf->error_code;
bool nx_huge_page_workaround_enabled = is_nx_huge_page_enabled();
- bool write = error_code & PFERR_WRITE_MASK;
bool exec = error_code & PFERR_FETCH_MASK;
bool huge_page_disallowed = exec && nx_huge_page_workaround_enabled;
struct kvm_mmu *mmu = vcpu->arch.mmu;
@@ -807,7 +806,8 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
u64 *child_pt;
u64 new_spte;
int ret;
- gfn_t gfn = gpa >> PAGE_SHIFT;
+ gpa_t gpa = kpf->cr2_or_gpa;
+ gfn_t gfn = kpf->gfn;
int level;
int req_level;

@@ -816,17 +816,17 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
if (WARN_ON(!is_tdp_mmu_root(vcpu->kvm, vcpu->arch.mmu->root_hpa)))
return RET_PF_RETRY;

- level = kvm_mmu_hugepage_adjust(vcpu, gfn, max_level, &pfn,
+ level = kvm_mmu_hugepage_adjust(vcpu, gfn, kpf->max_level, &kpf->pfn,
huge_page_disallowed, &req_level);

- trace_kvm_mmu_spte_requested(gpa, level, pfn);
+ trace_kvm_mmu_spte_requested(gpa, level, kpf->pfn);

rcu_read_lock();

tdp_mmu_for_each_pte(iter, mmu, gfn, gfn + 1) {
if (nx_huge_page_workaround_enabled)
disallowed_hugepage_adjust(iter.old_spte, gfn,
- iter.level, &pfn, &level);
+ iter.level, &kpf->pfn, &level);

if (iter.level == level)
break;
@@ -875,8 +875,9 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
return RET_PF_RETRY;
}

- ret = tdp_mmu_map_handle_target_level(vcpu, write, map_writable, &iter,
- pfn, prefault);
+ ret = tdp_mmu_map_handle_target_level(
+ vcpu, kpf->write_fault, kpf->map_writable, &iter, kpf->pfn,
+ kpf->prefault);
rcu_read_unlock();

return ret;
diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
index 31096ece9b14..cbf63791603d 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.h
+++ b/arch/x86/kvm/mmu/tdp_mmu.h
@@ -33,9 +33,7 @@ static inline bool kvm_tdp_mmu_zap_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
}
void kvm_tdp_mmu_zap_all(struct kvm *kvm);

-int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
- int map_writable, int max_level, kvm_pfn_t pfn,
- bool prefault);
+int kvm_tdp_mmu_map(struct kvm_page_fault *kpf);

int kvm_tdp_mmu_zap_hva_range(struct kvm *kvm, unsigned long start,
unsigned long end);
--
2.25.1

2021-04-20 10:46:05

by Isaku Yamahata

Subject: [RFC PATCH 04/10] KVM: x86/mmu: make try_async_pf() receive single argument

Convert try_async_pf() to receive a single struct kvm_page_fault instead
of many arguments.

No functional change is intended.

Signed-off-by: Isaku Yamahata <[email protected]>
---
arch/x86/kvm/mmu.h | 9 +++++++
arch/x86/kvm/mmu/mmu.c | 45 +++++++++++++++++-----------------
arch/x86/kvm/mmu/paging_tmpl.h | 23 +++++++++--------
3 files changed, 42 insertions(+), 35 deletions(-)

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index fa3b1df502e7..b60fcad7279c 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -111,12 +111,17 @@ struct kvm_page_fault {
struct kvm_vcpu *vcpu;
gpa_t cr2_or_gpa;
u32 error_code;
+ bool write_fault;
bool prefault;

/* internal state */
gfn_t gfn;
bool is_tdp;
int max_level;
+
+ kvm_pfn_t pfn;
+ hva_t hva;
+ bool map_writable;
};

static inline void kvm_page_fault_init(
@@ -126,12 +131,16 @@ static inline void kvm_page_fault_init(
kpf->vcpu = vcpu;
kpf->cr2_or_gpa = cr2_or_gpa;
kpf->error_code = error_code;
+ kpf->write_fault = error_code & PFERR_WRITE_MASK;
kpf->prefault = prefault;

/* default value */
kpf->is_tdp = false;
kpf->gfn = cr2_or_gpa >> PAGE_SHIFT;
kpf->max_level = PG_LEVEL_4K;
+ kpf->pfn = KVM_PFN_NOSLOT;
+ kpf->hva = KVM_HVA_ERR_BAD;
+ kpf->map_writable = false;
}

int kvm_tdp_page_fault(struct kvm_page_fault *kpf);
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index cb90148f90af..a2422bd9f59b 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3646,27 +3646,29 @@ static bool kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
kvm_vcpu_gfn_to_hva(vcpu, gfn), &arch);
}

-static bool try_async_pf(struct kvm_vcpu *vcpu, bool prefault, gfn_t gfn,
- gpa_t cr2_or_gpa, kvm_pfn_t *pfn, hva_t *hva,
- bool write, bool *writable)
+static bool try_async_pf(struct kvm_page_fault *kpf)
{
+ struct kvm_vcpu *vcpu = kpf->vcpu;
+ gfn_t gfn = kpf->gfn;
+ gpa_t cr2_or_gpa = kpf->cr2_or_gpa;
struct kvm_memory_slot *slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
bool async;

/* Don't expose private memslots to L2. */
if (is_guest_mode(vcpu) && !kvm_is_visible_memslot(slot)) {
- *pfn = KVM_PFN_NOSLOT;
- *writable = false;
+ kpf->pfn = KVM_PFN_NOSLOT;
+ kpf->map_writable = false;
return false;
}

async = false;
- *pfn = __gfn_to_pfn_memslot(slot, gfn, false, &async,
- write, writable, hva);
+ kpf->pfn = __gfn_to_pfn_memslot(slot, gfn, false, &async,
+ kpf->write_fault, &kpf->map_writable,
+ &kpf->hva);
if (!async)
return false; /* kpf->pfn already has the correct page */

- if (!prefault && kvm_can_do_async_pf(vcpu)) {
+ if (!kpf->prefault && kvm_can_do_async_pf(vcpu)) {
trace_kvm_try_async_get_page(cr2_or_gpa, gfn);
if (kvm_find_async_pf_gfn(vcpu, gfn)) {
trace_kvm_async_pf_doublefault(cr2_or_gpa, gfn);
@@ -3676,8 +3678,9 @@ static bool try_async_pf(struct kvm_vcpu *vcpu, bool prefault, gfn_t gfn,
return true;
}

- *pfn = __gfn_to_pfn_memslot(slot, gfn, false, NULL,
- write, writable, hva);
+ kpf->pfn = __gfn_to_pfn_memslot(slot, gfn, false, NULL,
+ kpf->write_fault, &kpf->map_writable,
+ &kpf->hva);
return false;
}

@@ -3689,13 +3692,9 @@ static int direct_page_fault(struct kvm_page_fault *kpf)
bool prefault = kpf->prefault;
int max_level = kpf->max_level;
bool is_tdp = kpf->is_tdp;
- bool write = error_code & PFERR_WRITE_MASK;
- bool map_writable;

gfn_t gfn = kpf->gfn;
unsigned long mmu_seq;
- kvm_pfn_t pfn;
- hva_t hva;
int r;

if (page_fault_handle_page_track(vcpu, error_code, gfn))
@@ -3714,11 +3713,10 @@ static int direct_page_fault(struct kvm_page_fault *kpf)
mmu_seq = vcpu->kvm->mmu_notifier_seq;
smp_rmb();

- if (try_async_pf(vcpu, prefault, gfn, gpa, &pfn, &hva,
- write, &map_writable))
+ if (try_async_pf(kpf))
return RET_PF_RETRY;

- if (handle_abnormal_pfn(vcpu, is_tdp ? 0 : gpa, gfn, pfn, ACC_ALL, &r))
+ if (handle_abnormal_pfn(vcpu, is_tdp ? 0 : gpa, gfn, kpf->pfn, ACC_ALL, &r))
return r;

r = RET_PF_RETRY;
@@ -3728,25 +3726,26 @@ static int direct_page_fault(struct kvm_page_fault *kpf)
else
write_lock(&vcpu->kvm->mmu_lock);

- if (!is_noslot_pfn(pfn) && mmu_notifier_retry_hva(vcpu->kvm, mmu_seq, hva))
+ if (!is_noslot_pfn(kpf->pfn) &&
+ mmu_notifier_retry_hva(vcpu->kvm, mmu_seq, kpf->hva))
goto out_unlock;
r = make_mmu_pages_available(vcpu);
if (r)
goto out_unlock;

if (is_tdp_mmu_root(vcpu->kvm, vcpu->arch.mmu->root_hpa))
- r = kvm_tdp_mmu_map(vcpu, gpa, error_code, map_writable, max_level,
- pfn, prefault);
+ r = kvm_tdp_mmu_map(vcpu, gpa, error_code, kpf->map_writable,
+ max_level, kpf->pfn, prefault);
else
- r = __direct_map(vcpu, gpa, error_code, map_writable, max_level, pfn,
- prefault, is_tdp);
+ r = __direct_map(vcpu, gpa, error_code, kpf->map_writable,
+ max_level, kpf->pfn, prefault, is_tdp);

out_unlock:
if (is_tdp_mmu_root(vcpu->kvm, vcpu->arch.mmu->root_hpa))
read_unlock(&vcpu->kvm->mmu_lock);
else
write_unlock(&vcpu->kvm->mmu_lock);
- kvm_release_pfn_clean(pfn);
+ kvm_release_pfn_clean(kpf->pfn);
return r;
}

diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index dc814463a8df..f2beb7f7c378 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -795,14 +795,12 @@ static int FNAME(page_fault)(struct kvm_page_fault *kpf)
gpa_t addr = kpf->cr2_or_gpa;
u32 error_code = kpf->error_code;
bool prefault = kpf->prefault;
- bool write_fault = error_code & PFERR_WRITE_MASK;
+ bool write_fault = kpf->write_fault;
bool user_fault = error_code & PFERR_USER_MASK;
struct guest_walker walker;
int r;
- kvm_pfn_t pfn;
- hva_t hva;
unsigned long mmu_seq;
- bool map_writable, is_self_change_mapping;
+ bool is_self_change_mapping;
int max_level;

pgprintk("%s: addr %lx err %x\n", __func__, addr, error_code);
@@ -851,11 +849,11 @@ static int FNAME(page_fault)(struct kvm_page_fault *kpf)
mmu_seq = vcpu->kvm->mmu_notifier_seq;
smp_rmb();

- if (try_async_pf(vcpu, prefault, walker.gfn, addr, &pfn, &hva,
- write_fault, &map_writable))
+ kpf->gfn = walker.gfn;
+ if (try_async_pf(kpf))
return RET_PF_RETRY;

- if (handle_abnormal_pfn(vcpu, addr, walker.gfn, pfn, walker.pte_access, &r))
+ if (handle_abnormal_pfn(vcpu, addr, walker.gfn, kpf->pfn, walker.pte_access, &r))
return r;

/*
@@ -864,7 +862,7 @@ static int FNAME(page_fault)(struct kvm_page_fault *kpf)
*/
if (write_fault && !(walker.pte_access & ACC_WRITE_MASK) &&
!is_write_protection(vcpu) && !user_fault &&
- !is_noslot_pfn(pfn)) {
+ !is_noslot_pfn(kpf->pfn)) {
walker.pte_access |= ACC_WRITE_MASK;
walker.pte_access &= ~ACC_USER_MASK;

@@ -880,20 +878,21 @@ static int FNAME(page_fault)(struct kvm_page_fault *kpf)

r = RET_PF_RETRY;
write_lock(&vcpu->kvm->mmu_lock);
- if (!is_noslot_pfn(pfn) && mmu_notifier_retry_hva(vcpu->kvm, mmu_seq, hva))
+ if (!is_noslot_pfn(kpf->pfn) &&
+ mmu_notifier_retry_hva(vcpu->kvm, mmu_seq, kpf->hva))
goto out_unlock;

kvm_mmu_audit(vcpu, AUDIT_PRE_PAGE_FAULT);
r = make_mmu_pages_available(vcpu);
if (r)
goto out_unlock;
- r = FNAME(fetch)(vcpu, addr, &walker, error_code, max_level, pfn,
- map_writable, prefault);
+ r = FNAME(fetch)(vcpu, addr, &walker, error_code, max_level, kpf->pfn,
+ kpf->map_writable, prefault);
kvm_mmu_audit(vcpu, AUDIT_POST_PAGE_FAULT);

out_unlock:
write_unlock(&vcpu->kvm->mmu_lock);
- kvm_release_pfn_clean(pfn);
+ kvm_release_pfn_clean(kpf->pfn);
return r;
}

--
2.25.1

2021-04-20 10:46:26

by Isaku Yamahata

Subject: [RFC PATCH 06/10] KVM: x86/mmu: make handle_abnormal_pfn() receive single argument

Convert handle_abnormal_pfn() to receive a single argument,
struct kvm_page_fault, instead of many arguments.

No functional change is intended.

Signed-off-by: Isaku Yamahata <[email protected]>
---
arch/x86/kvm/mmu/mmu.c | 14 ++++++++------
arch/x86/kvm/mmu/paging_tmpl.h | 2 +-
2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index dac022a79c57..a16e1b228ac2 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2936,18 +2936,21 @@ static int kvm_handle_bad_page(struct kvm_vcpu *vcpu, gfn_t gfn, kvm_pfn_t pfn)
return -EFAULT;
}

-static bool handle_abnormal_pfn(struct kvm_vcpu *vcpu, gva_t gva, gfn_t gfn,
- kvm_pfn_t pfn, unsigned int access,
+static bool handle_abnormal_pfn(struct kvm_page_fault *kpf, unsigned int access,
int *ret_val)
{
+ struct kvm_vcpu *vcpu = kpf->vcpu;
+ gva_t gva = kpf->is_tdp ? 0 : kpf->cr2_or_gpa;
+ kvm_pfn_t pfn = kpf->pfn;
+
/* The pfn is invalid, report the error! */
if (unlikely(is_error_pfn(pfn))) {
- *ret_val = kvm_handle_bad_page(vcpu, gfn, pfn);
+ *ret_val = kvm_handle_bad_page(vcpu, kpf->gfn, pfn);
return true;
}

if (unlikely(is_noslot_pfn(pfn)))
- vcpu_cache_mmio_info(vcpu, gva, gfn,
+ vcpu_cache_mmio_info(vcpu, gva, kpf->gfn,
access & shadow_mmio_access_mask);

return false;
@@ -3694,7 +3697,6 @@ static int direct_page_fault(struct kvm_page_fault *kpf)
int max_level = kpf->max_level;
bool is_tdp = kpf->is_tdp;

- gfn_t gfn = kpf->gfn;
unsigned long mmu_seq;
int r;

@@ -3717,7 +3719,7 @@ static int direct_page_fault(struct kvm_page_fault *kpf)
if (try_async_pf(kpf))
return RET_PF_RETRY;

- if (handle_abnormal_pfn(vcpu, is_tdp ? 0 : gpa, gfn, kpf->pfn, ACC_ALL, &r))
+ if (handle_abnormal_pfn(kpf, ACC_ALL, &r))
return r;

r = RET_PF_RETRY;
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 7965786418af..7df68b5fdd10 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -853,7 +853,7 @@ static int FNAME(page_fault)(struct kvm_page_fault *kpf)
if (try_async_pf(kpf))
return RET_PF_RETRY;

- if (handle_abnormal_pfn(vcpu, addr, walker.gfn, kpf->pfn, walker.pte_access, &r))
+ if (handle_abnormal_pfn(kpf, walker.pte_access, &r))
return r;

/*
--
2.25.1

2021-04-20 10:46:35

by Isaku Yamahata

Subject: [RFC PATCH 07/10] KVM: x86/mmu: make fast_page_fault() receive single argument

Convert fast_page_fault() to receive a single argument, struct
kvm_page_fault, instead of many arguments.

No functional change is intended.

Signed-off-by: Isaku Yamahata <[email protected]>
---
arch/x86/kvm/mmu/mmu.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index a16e1b228ac2..ce48416380c3 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3043,9 +3043,11 @@ static bool is_access_allowed(u32 fault_err_code, u64 spte)
/*
* Returns one of RET_PF_INVALID, RET_PF_FIXED or RET_PF_SPURIOUS.
*/
-static int fast_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
- u32 error_code)
+static int fast_page_fault(struct kvm_page_fault *kpf)
{
+ struct kvm_vcpu *vcpu = kpf->vcpu;
+ gpa_t cr2_or_gpa = kpf->cr2_or_gpa;
+ u32 error_code = kpf->error_code;
struct kvm_shadow_walk_iterator iterator;
struct kvm_mmu_page *sp;
int ret = RET_PF_INVALID;
@@ -3704,7 +3706,7 @@ static int direct_page_fault(struct kvm_page_fault *kpf)
return RET_PF_EMULATE;

if (!is_tdp_mmu_root(vcpu->kvm, vcpu->arch.mmu->root_hpa)) {
- r = fast_page_fault(vcpu, gpa, error_code);
+ r = fast_page_fault(kpf);
if (r != RET_PF_INVALID)
return r;
}
--
2.25.1

2021-04-20 10:46:45

by Isaku Yamahata

Subject: [RFC PATCH 10/10] KVM: x86/mmu: make FNAME(fetch) receive single argument

Convert FNAME(fetch) to receive a single argument, struct kvm_page_fault,
instead of many arguments.

No functional change is intended.

Signed-off-by: Isaku Yamahata <[email protected]>
---
arch/x86/kvm/mmu/paging_tmpl.h | 36 +++++++++++++++-------------------
1 file changed, 16 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 7df68b5fdd10..ad01d933f2b7 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -631,20 +631,19 @@ static void FNAME(pte_prefetch)(struct kvm_vcpu *vcpu, struct guest_walker *gw,
* If the guest tries to write a write-protected page, we need to
* emulate this operation, return 1 to indicate this case.
*/
-static int FNAME(fetch)(struct kvm_vcpu *vcpu, gpa_t addr,
- struct guest_walker *gw, u32 error_code,
- int max_level, kvm_pfn_t pfn, bool map_writable,
- bool prefault)
+static int FNAME(fetch)(struct kvm_page_fault *kpf, struct guest_walker *gw)
{
+ struct kvm_vcpu *vcpu = kpf->vcpu;
+ gpa_t addr = kpf->cr2_or_gpa;
bool nx_huge_page_workaround_enabled = is_nx_huge_page_enabled();
- bool write_fault = error_code & PFERR_WRITE_MASK;
- bool exec = error_code & PFERR_FETCH_MASK;
+ bool exec = kpf->error_code & PFERR_FETCH_MASK;
bool huge_page_disallowed = exec && nx_huge_page_workaround_enabled;
struct kvm_mmu_page *sp = NULL;
struct kvm_shadow_walk_iterator it;
unsigned direct_access, access = gw->pt_access;
int top_level, level, req_level, ret;
gfn_t base_gfn = gw->gfn;
+ WARN_ON(kpf->gfn != gw->gfn);

direct_access = gw->pte_access;

@@ -689,10 +688,10 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gpa_t addr,
link_shadow_page(vcpu, it.sptep, sp);
}

- level = kvm_mmu_hugepage_adjust(vcpu, gw->gfn, max_level, &pfn,
+ level = kvm_mmu_hugepage_adjust(vcpu, gw->gfn, kpf->max_level, &kpf->pfn,
huge_page_disallowed, &req_level);

- trace_kvm_mmu_spte_requested(addr, gw->level, pfn);
+ trace_kvm_mmu_spte_requested(addr, gw->level, kpf->pfn);

for (; shadow_walk_okay(&it); shadow_walk_next(&it)) {
clear_sp_write_flooding_count(it.sptep);
@@ -703,7 +702,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gpa_t addr,
*/
if (nx_huge_page_workaround_enabled)
disallowed_hugepage_adjust(*it.sptep, gw->gfn, it.level,
- &pfn, &level);
+ &kpf->pfn, &level);

base_gfn = gw->gfn & ~(KVM_PAGES_PER_HPAGE(it.level) - 1);
if (it.level == level)
@@ -722,8 +721,9 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gpa_t addr,
}
}

- ret = mmu_set_spte(vcpu, it.sptep, gw->pte_access, write_fault,
- it.level, base_gfn, pfn, prefault, map_writable);
+ ret = mmu_set_spte(vcpu, it.sptep, gw->pte_access, kpf->write_fault,
+ it.level, base_gfn, kpf->pfn, kpf->prefault,
+ kpf->map_writable);
if (ret == RET_PF_SPURIOUS)
return ret;

@@ -794,14 +794,11 @@ static int FNAME(page_fault)(struct kvm_page_fault *kpf)
struct kvm_vcpu *vcpu = kpf->vcpu;
gpa_t addr = kpf->cr2_or_gpa;
u32 error_code = kpf->error_code;
- bool prefault = kpf->prefault;
- bool write_fault = kpf->write_fault;
bool user_fault = error_code & PFERR_USER_MASK;
struct guest_walker walker;
int r;
unsigned long mmu_seq;
bool is_self_change_mapping;
- int max_level;

pgprintk("%s: addr %lx err %x\n", __func__, addr, error_code);

@@ -821,7 +818,7 @@ static int FNAME(page_fault)(struct kvm_page_fault *kpf)
*/
if (!r) {
pgprintk("%s: guest page fault\n", __func__);
- if (!prefault)
+ if (!kpf->prefault)
kvm_inject_emulated_page_fault(vcpu, &walker.fault);

return RET_PF_RETRY;
@@ -843,9 +840,9 @@ static int FNAME(page_fault)(struct kvm_page_fault *kpf)
&walker, user_fault, &vcpu->arch.write_fault_to_shadow_pgtable);

if (is_self_change_mapping)
- max_level = PG_LEVEL_4K;
+ kpf->max_level = PG_LEVEL_4K;
else
- max_level = walker.level;
+ kpf->max_level = walker.level;

mmu_seq = vcpu->kvm->mmu_notifier_seq;
smp_rmb();
@@ -860,7 +857,7 @@ static int FNAME(page_fault)(struct kvm_page_fault *kpf)
* Do not change pte_access if the pfn is a mmio page, otherwise
* we will cache the incorrect access into mmio spte.
*/
- if (write_fault && !(walker.pte_access & ACC_WRITE_MASK) &&
+ if (kpf->write_fault && !(walker.pte_access & ACC_WRITE_MASK) &&
!is_write_protection(vcpu) && !user_fault &&
!is_noslot_pfn(kpf->pfn)) {
walker.pte_access |= ACC_WRITE_MASK;
@@ -886,8 +883,7 @@ static int FNAME(page_fault)(struct kvm_page_fault *kpf)
r = make_mmu_pages_available(vcpu);
if (r)
goto out_unlock;
- r = FNAME(fetch)(vcpu, addr, &walker, error_code, max_level, kpf->pfn,
- kpf->map_writable, prefault);
+ r = FNAME(fetch)(kpf, &walker);
kvm_mmu_audit(vcpu, AUDIT_POST_PAGE_FAULT);

out_unlock:
--
2.25.1

2021-05-27 00:50:12

by Sean Christopherson

Subject: Re: [RFC PATCH 00/10] KVM: x86/mmu: simplify argument to kvm page fault handler

On Tue, Apr 20, 2021, Isaku Yamahata wrote:
> This is a preliminary cleanup for TDX, which complicates the KVM page
> fault execution path.

Ooh, a series to complicate the page fault path! ;-)

Grammatical snarkiness aside, I'm all in favor of adding a struct to collect the
page fault collateral. Overarching feedback:

- Have kvm_mmu_do_page_fault() handle initialization of the struct. That
will allow making most of the fields const, and will avoid the rather painful
kvm_page_fault_init().

- Pass @vcpu separately. Yes, it's associated with the fault, but literally
the first line in every consumer is "struct kvm_vcpu *vcpu = kpf->vcpu;".

- Use "fault" instead of "kpf", mostly because it reads better for people that
aren't intimately familiar with the code, but also to avoid having to refactor
a huge amount of code if we decide to rename kvm_page_fault, e.g. if we decide
to use that name to return fault information to userspace.

- Snapshot anything that is computed in multiple places, even if it is
derivative of existing info. E.g. it probably makes sense to grab
write/fetch (or exec).


E.g. I'm thinking something like

struct kvm_page_fault {
const gpa_t cr2_or_gpa;
const u32 error_code;
const bool write;
const bool read;
const bool fetch;
const bool prefault;
const bool is_tdp;

gfn_t gfn;
hva_t hva;
int max_level;

kvm_pfn_t pfn;
bool map_writable;
};

int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);

static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
u32 err, bool prefault)
{
struct kvm_page_fault fault = {
.cr2_or_gpa = cr2_or_gpa,
.error_code = err,
.write = err & PFERR_WRITE_MASK,
.fetch = err & PFERR_FETCH_MASK,
.perm = ...
.rsvd = err & PFERR_RSVD_MASK,

.is_tdp = vcpu->arch.mmu->page_fault == kvm_tdp_page_fault,

...
};

#ifdef CONFIG_RETPOLINE
if (likely(fault.is_tdp))
return kvm_tdp_page_fault(vcpu, &fault);
#endif
return vcpu->arch.mmu->page_fault(vcpu, &fault);
}
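
Note that since kvm_mmu_do_page_fault() above keeps its current
four-argument signature and builds the struct internally, existing
call sites would not need to change; e.g. the call in
kvm_mmu_page_fault() stays:

	r = kvm_mmu_do_page_fault(vcpu, cr2_or_gpa,
				  lower_32_bits(error_code), false);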

2021-06-10 12:49:11

by Paolo Bonzini

Subject: Re: [RFC PATCH 00/10] KVM: x86/mmu: simplify argument to kvm page fault handler

On 26/05/21 23:10, Sean Christopherson wrote:
> - Have kvm_mmu_do_page_fault() handle initialization of the struct. That
> will allow making most of the fields const, and will avoid the rather painful
> kvm_page_fault_init().
>
> - Pass @vcpu separately. Yes, it's associated with the fault, but literally
> the first line in every consumer is "struct kvm_vcpu *vcpu = kpf->vcpu;".
>
> - Use "fault" instead of "kpf", mostly because it reads better for people that
> aren't intimately familiar with the code, but also to avoid having to refactor
> a huge amount of code if we decide to rename kvm_page_fault, e.g. if we decide
> to use that name to return fault information to userspace.
>
> - Snapshot anything that is computed in multiple places, even if it is
> derivative of existing info. E.g. it probably makes sense to grab

I agree with all of these (especially it was a bit weird not to see vcpu
in the prototypes). Thanks Sean for the review!

Paolo

2021-06-10 22:04:28

by Isaku Yamahata

Subject: Re: [RFC PATCH 00/10] KVM: x86/mmu: simplify argument to kvm page fault handler

Thanks for the feedback. Let me respin it.

On Thu, Jun 10, 2021 at 02:45:55PM +0200,
Paolo Bonzini <[email protected]> wrote:

> On 26/05/21 23:10, Sean Christopherson wrote:
> > - Have kvm_mmu_do_page_fault() handle initialization of the struct. That
> > will allow making most of the fields const, and will avoid the rather painful
> > kvm_page_fault_init().
> >
> > - Pass @vcpu separately. Yes, it's associated with the fault, but literally
> > the first line in every consumer is "struct kvm_vcpu *vcpu = kpf->vcpu;".
> >
> > - Use "fault" instead of "kpf", mostly because it reads better for people that
> > aren't intimately familiar with the code, but also to avoid having to refactor
> > a huge amount of code if we decide to rename kvm_page_fault, e.g. if we decide
> > to use that name to return fault information to userspace.
> >
> > - Snapshot anything that is computed in multiple places, even if it is
> > derivative of existing info. E.g. it probably makes sense to grab
>
> I agree with all of these (especially it was a bit weird not to see vcpu in
> the prototypes). Thanks Sean for the review!
>
> Paolo
>

--
Isaku Yamahata <[email protected]>

2021-07-29 16:53:26

by David Matlack

Subject: Re: [RFC PATCH 00/10] KVM: x86/mmu: simplify argument to kvm page fault handler

On Thu, Jun 10, 2021 at 3:05 PM Isaku Yamahata <[email protected]> wrote:
>
> Thanks for the feedback. Let me respin it.

Hi Isaku,

I'm working on a series to plumb the memslot backing the faulting gfn
through the page fault handling stack to avoid redundant lookups. This
would be much cleaner to implement on top of your struct
kvm_page_fault series than the existing code.
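
Roughly, and purely as a hypothetical sketch on top of your series
(nothing posted yet): kvm_page_fault would grow a cached slot that
try_async_pf() fills in once, so later stages reuse it instead of
repeating the lookup:

	struct kvm_page_fault {
		/* ... fields from your series ... */

		/* hypothetical: memslot backing gfn, looked up once */
		struct kvm_memory_slot *slot;
	};

and in try_async_pf():

	kpf->slot = kvm_vcpu_gfn_to_memslot(kpf->vcpu, kpf->gfn);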

Are you still planning to send another version of this series? Or if
you have decided to drop it or go in a different direction?

>
> On Thu, Jun 10, 2021 at 02:45:55PM +0200,
> Paolo Bonzini <[email protected]> wrote:
>
> > On 26/05/21 23:10, Sean Christopherson wrote:
> > > - Have kvm_mmu_do_page_fault() handle initialization of the struct. That
> > > will allow making most of the fields const, and will avoid the rather painful
> > > kvm_page_fault_init().
> > >
> > > - Pass @vcpu separately. Yes, it's associated with the fault, but literally
> > > the first line in every consumer is "struct kvm_vcpu *vcpu = kpf->vcpu;".
> > >
> > > - Use "fault" instead of "kpf", mostly because it reads better for people that
> > > aren't intimately familiar with the code, but also to avoid having to refactor
> > > a huge amount of code if we decide to rename kvm_page_fault, e.g. if we decide
> > > to use that name to return fault information to userspace.
> > >
> > > - Snapshot anything that is computed in multiple places, even if it is
> > > derivative of existing info. E.g. it probably makes sense to grab
> >
> > I agree with all of these (especially it was a bit weird not to see vcpu in
> > the prototypes). Thanks Sean for the review!
> >
> > Paolo
> >
>
> --
> Isaku Yamahata <[email protected]>

2021-07-29 17:18:49

by Paolo Bonzini

Subject: Re: [RFC PATCH 00/10] KVM: x86/mmu: simplify argument to kvm page fault handler

On 29/07/21 18:48, David Matlack wrote:
> On Thu, Jun 10, 2021 at 3:05 PM Isaku Yamahata <[email protected]> wrote:
>>
>> Thanks for the feedback. Let me respin it.
>
> Hi Isaku,
>
> I'm working on a series to plumb the memslot backing the faulting gfn
> through the page fault handling stack to avoid redundant lookups. This
> would be much cleaner to implement on top of your struct
> kvm_page_fault series than the existing code.
>
> Are you still planning to send another version of this series? Or if
> you have decided to drop it or go in a different direction?

I can work on this and post updated patches next week.

Paolo


2021-07-29 17:25:51

by David Matlack

Subject: Re: [RFC PATCH 00/10] KVM: x86/mmu: simplify argument to kvm page fault handler

On Thu, Jul 29, 2021 at 10:17 AM Paolo Bonzini <[email protected]> wrote:
>
> On 29/07/21 18:48, David Matlack wrote:
> > On Thu, Jun 10, 2021 at 3:05 PM Isaku Yamahata <[email protected]> wrote:
> >>
> >> Thanks for the feedback. Let me respin it.
> >
> > Hi Isaku,
> >
> > I'm working on a series to plumb the memslot backing the faulting gfn
> > through the page fault handling stack to avoid redundant lookups. This
> > would be much cleaner to implement on top of your struct
> > kvm_page_fault series than the existing code.
> >
> > Are you still planning to send another version of this series? Or if
> > you have decided to drop it or go in a different direction?
>
> I can work on this and post updated patches next week.

Sounds good. For the record, I'm also looking at adding a per-vCPU LRU
slot, which *may* obviate the need to pass around the slot. (Isaku's
series is still a nice cleanup regardless.)

>
> Paolo
>

2021-08-06 23:51:20

by Paolo Bonzini

Subject: Re: [RFC PATCH 00/10] KVM: x86/mmu: simplify argument to kvm page fault handler

On 29/07/21 19:24, David Matlack wrote:
> On Thu, Jul 29, 2021 at 10:17 AM Paolo Bonzini <[email protected]> wrote:
>>
>> On 29/07/21 18:48, David Matlack wrote:
>>> On Thu, Jun 10, 2021 at 3:05 PM Isaku Yamahata <[email protected]> wrote:
>>>>
>>>> Thanks for the feedback. Let me respin it.
>>>
>>> Hi Isaku,
>>>
>>> I'm working on a series to plumb the memslot backing the faulting gfn
>>> through the page fault handling stack to avoid redundant lookups. This
>>> would be much cleaner to implement on top of your struct
>>> kvm_page_fault series than the existing code.
>>>
>>> Are you still planning to send another version of this series? Or if
>>> you have decided to drop it or go in a different direction?
>>
>> I can work on this and post updated patches next week.
>
> Sounds good. For the record, I'm also looking at adding a per-vCPU LRU
> slot, which *may* obviate the need to pass around the slot. (Isaku's
> series is still a nice cleanup regardless.)

Backport done, but not tested very well yet (and it's scary :)).

Paolo