From: Leonardo Bras
To: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	kvm-ppc@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-mm@kvack.org
Cc: Leonardo Bras, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Arnd Bergmann, "Aneesh Kumar K.V",
	Christophe Leroy, Andrew Morton, Dan Williams, Nicholas Piggin,
	Mahesh Salgaonkar, Allison Randal, Thomas Gleixner, Ganesh Goudar,
	Mike Rapoport, YueHaibing, Greg Kroah-Hartman, Ira Weiny,
	Jason Gunthorpe, John Hubbard, Keith Busch
Subject: [PATCH v4 08/11] powerpc/kvm/book3s_hv: Applies counting method to monitor lockless pgtbl walks
Date: Fri, 27 Sep 2019 20:40:05 -0300
Message-Id: <20190927234008.11513-9-leonardo@linux.ibm.com>
In-Reply-To: <20190927234008.11513-1-leonardo@linux.ibm.com>
References: <20190927234008.11513-1-leonardo@linux.ibm.com>

Applies the counting-based method for monitoring all book3s_hv related
functions that do lockless pagetable walks.

Adds comments explaining that some lockless pagetable walks don't need
protection due to guest pgd not being a target of THP collapse/split, or
due to being called from Realmode + MSR_EE = 0.

kvmppc_do_h_enter: Fixes where local_irq_restore() must be placed (after
the last usage of ptep).

Signed-off-by: Leonardo Bras
---
 arch/powerpc/kvm/book3s_hv_nested.c | 22 ++++++++++++++++++++--
 arch/powerpc/kvm/book3s_hv_rm_mmu.c | 18 ++++++++++++++----
 2 files changed, 34 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c
index 735e0ac6f5b2..5a641b559de7 100644
--- a/arch/powerpc/kvm/book3s_hv_nested.c
+++ b/arch/powerpc/kvm/book3s_hv_nested.c
@@ -803,7 +803,11 @@ static void kvmhv_update_nest_rmap_rc(struct kvm *kvm, u64 n_rmap,
 	if (!gp)
 		return;
 
-	/* Find the pte */
+	/* Find the pte:
+	 * We are walking the nested guest (partition-scoped) page table here.
+	 * We can do this without disabling irq because the Linux MM
+	 * subsystem doesn't do THP splits and collapses on this tree.
+	 */
 	ptep = __find_linux_pte(gp->shadow_pgtable, gpa, NULL, &shift);
 	/*
 	 * If the pte is present and the pfn is still the same, update the pte.
@@ -853,7 +857,11 @@ static void kvmhv_remove_nest_rmap(struct kvm *kvm, u64 n_rmap,
 	if (!gp)
 		return;
 
-	/* Find and invalidate the pte */
+	/* Find and invalidate the pte:
+	 * We are walking the nested guest (partition-scoped) page table here.
+	 * We can do this without disabling irq because the Linux MM
+	 * subsystem doesn't do THP splits and collapses on this tree.
+	 */
 	ptep = __find_linux_pte(gp->shadow_pgtable, gpa, NULL, &shift);
 	/* Don't spuriously invalidate ptes if the pfn has changed */
 	if (ptep && pte_present(*ptep) && ((pte_val(*ptep) & mask) == hpa))
@@ -921,6 +929,11 @@ static bool kvmhv_invalidate_shadow_pte(struct kvm_vcpu *vcpu,
 	int shift;
 
 	spin_lock(&kvm->mmu_lock);
+	/*
+	 * We are walking the nested guest (partition-scoped) page table here.
+	 * We can do this without disabling irq because the Linux MM
+	 * subsystem doesn't do THP splits and collapses on this tree.
+	 */
 	ptep = __find_linux_pte(gp->shadow_pgtable, gpa, NULL, &shift);
 	if (!shift)
 		shift = PAGE_SHIFT;
@@ -1362,6 +1375,11 @@ static long int __kvmhv_nested_page_fault(struct kvm_run *run,
 	/* See if can find translation in our partition scoped tables for L1 */
 	pte = __pte(0);
 	spin_lock(&kvm->mmu_lock);
+	/*
+	 * We are walking the secondary (partition-scoped) page table here.
+	 * We can do this without disabling irq because the Linux MM
+	 * subsystem doesn't do THP splits and collapses on this tree.
+	 */
 	pte_p = __find_linux_pte(kvm->arch.pgtable, gpa, NULL, &shift);
 	if (!shift)
 		shift = PAGE_SHIFT;
diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index 63e0ce91e29d..2076a7ac230a 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -252,6 +252,7 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
 	 * If we had a page table table change after lookup, we would
 	 * retry via mmu_notifier_retry.
 	 */
+	start_lockless_pgtbl_walk(kvm->mm);
 	if (!realmode)
 		local_irq_save(irq_flags);
 	/*
@@ -287,8 +288,6 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
 			pa |= gpa & ~PAGE_MASK;
 		}
 	}
-	if (!realmode)
-		local_irq_restore(irq_flags);
 
 	ptel &= HPTE_R_KEY | HPTE_R_PP0 | (psize-1);
 	ptel |= pa;
@@ -311,6 +310,9 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
 		ptel &= ~(HPTE_R_W|HPTE_R_I|HPTE_R_G);
 		ptel |= HPTE_R_M;
 	}
+	if (!realmode)
+		local_irq_restore(irq_flags);
+	end_lockless_pgtbl_walk(kvm->mm);
 
 	/* Find and lock the HPTEG slot to use */
  do_insert:
@@ -885,11 +887,19 @@ static int kvmppc_get_hpa(struct kvm_vcpu *vcpu, unsigned long gpa,
 	/* Translate to host virtual address */
 	hva = __gfn_to_hva_memslot(memslot, gfn);
 
-	/* Try to find the host pte for that virtual address */
+	/* Try to find the host pte for that virtual address :
+	 * Called by hcall_real_table (real mode + MSR_EE=0)
+	 * Interrupts are disabled here.
+	 */
+	start_lockless_pgtbl_walk(kvm->mm);
 	ptep = __find_linux_pte(vcpu->arch.pgdir, hva, NULL, &shift);
-	if (!ptep)
+	if (!ptep) {
+		end_lockless_pgtbl_walk(kvm->mm);
 		return H_TOO_HARD;
+	}
 	pte = kvmppc_read_update_linux_pte(ptep, writing);
+	end_lockless_pgtbl_walk(kvm->mm);
+
 	if (!pte_present(pte))
 		return H_TOO_HARD;
 
-- 
2.20.1
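
For readers without the rest of the series at hand, the bracketing pattern
this patch applies can be sketched in user-space C. This is only an
illustration under stated assumptions: start_lockless_pgtbl_walk() and
end_lockless_pgtbl_walk() are defined elsewhere in the series and are not
shown in this patch, so the sketch assumes they amount to incrementing and
decrementing a per-mm atomic counter. The struct, the toy table, and every
name ending in _sketch are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for per-mm state; not the kernel's mm_struct. */
struct mm_sketch {
	atomic_int lockless_walkers;	/* walks currently in flight (assumed) */
};

/* Assumed behaviour of the "start" helper: announce a walk in progress. */
static void start_walk_sketch(struct mm_sketch *mm)
{
	atomic_fetch_add(&mm->lockless_walkers, 1);
}

/* Assumed behaviour of the "end" helper: pairs with exactly one start. */
static void end_walk_sketch(struct mm_sketch *mm)
{
	int before = atomic_fetch_sub(&mm->lockless_walkers, 1);

	assert(before > 0);	/* catches an end without a matching start */
	(void)before;
}

/* What a presumed page-table updater could check before changing the tree. */
static int walkers_running(struct mm_sketch *mm)
{
	return atomic_load(&mm->lockless_walkers);
}

/*
 * Same shape as the kvmppc_get_hpa() hunk: the walk is bracketed, and the
 * early-return path also ends the walk so the counter cannot leak.
 * Returns 0 and stores the entry in *out, or -1 if the slot is empty.
 */
static int lookup_sketch(struct mm_sketch *mm, const int *table,
			 size_t idx, int *out)
{
	const int *entry;

	start_walk_sketch(mm);
	entry = table[idx] ? &table[idx] : NULL;	/* toy "walk" */
	if (!entry) {
		end_walk_sketch(mm);
		return -1;
	}
	*out = *entry;		/* last use of the pointer */
	end_walk_sketch(mm);
	return 0;
}
```

Calling lookup_sketch() on a populated slot and on an empty slot should both
leave the counter at zero. That is the property the kvmppc_get_hpa() hunk
keeps by calling end_lockless_pgtbl_walk() before the early
"return H_TOO_HARD", and the reason kvmppc_do_h_enter() moves
local_irq_restore() to after the last use of ptep.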