From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, David Gibson,
	"Aneesh Kumar K.V", kvm-ppc@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, Nicholas Piggin, Paul Mackerras,
	Sasha Levin
Subject: [PATCH 4.18 086/135] KVM: PPC: Book3S HV: Don't use compound_order to determine host mapping size
Date: Tue, 16 Oct 2018 19:05:16 +0200
Message-Id: <20181016170521.241312038@linuxfoundation.org>
X-Mailer: git-send-email 2.19.1
In-Reply-To: <20181016170515.447235311@linuxfoundation.org>
References: <20181016170515.447235311@linuxfoundation.org>
User-Agent: quilt/0.65
X-stable: review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

4.18-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Nicholas Piggin

[ Upstream commit 71d29f43b6332badc5598c656616a62575e83342 ]

THP paths can defer splitting compound pages until after the actual
remap and TLB flushes to split a huge PMD/PUD. This causes radix
partition scope page table mappings to get out of sync with the host
qemu page table mappings.

This results in random memory corruption in the guest when running
with THP. The easiest way to reproduce is to use the KVM balloon to
free up a lot of memory in the guest and then shrink the balloon to
give the memory back, while some work is being done in the guest.

Cc: David Gibson
Cc: "Aneesh Kumar K.V"
Cc: kvm-ppc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Nicholas Piggin
Signed-off-by: Paul Mackerras
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
---
 arch/powerpc/kvm/book3s_64_mmu_radix.c |   91 +++++++++++++--------------------
 1 file changed, 37 insertions(+), 54 deletions(-)

--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -538,8 +538,8 @@ int kvmppc_book3s_radix_page_fault(struc
 				   unsigned long ea, unsigned long dsisr)
 {
 	struct kvm *kvm = vcpu->kvm;
-	unsigned long mmu_seq, pte_size;
-	unsigned long gpa, gfn, hva, pfn;
+	unsigned long mmu_seq;
+	unsigned long gpa, gfn, hva;
 	struct kvm_memory_slot *memslot;
 	struct page *page = NULL;
 	long ret;
@@ -636,9 +636,10 @@ int kvmppc_book3s_radix_page_fault(struc
 	 */
 	hva = gfn_to_hva_memslot(memslot, gfn);
 	if (upgrade_p && __get_user_pages_fast(hva, 1, 1, &page) == 1) {
-		pfn = page_to_pfn(page);
 		upgrade_write = true;
 	} else {
+		unsigned long pfn;
+
 		/* Call KVM generic code to do the slow-path check */
 		pfn = __gfn_to_pfn_memslot(memslot, gfn, false, NULL,
 					   writing, upgrade_p);
@@ -652,63 +653,45 @@ int kvmppc_book3s_radix_page_fault(struc
 		}
 	}
 
-	/* See if we can insert a 1GB or 2MB large PTE here */
-	level = 0;
-	if (page && PageCompound(page)) {
-		pte_size = PAGE_SIZE << compound_order(compound_head(page));
-		if (pte_size >= PUD_SIZE &&
-		    (gpa & (PUD_SIZE - PAGE_SIZE)) ==
-		    (hva & (PUD_SIZE - PAGE_SIZE))) {
-			level = 2;
-			pfn &= ~((PUD_SIZE >> PAGE_SHIFT) - 1);
-		} else if (pte_size >= PMD_SIZE &&
-			   (gpa & (PMD_SIZE - PAGE_SIZE)) ==
-			   (hva & (PMD_SIZE - PAGE_SIZE))) {
-			level = 1;
-			pfn &= ~((PMD_SIZE >> PAGE_SHIFT) - 1);
-		}
-	}
-
 	/*
-	 * Compute the PTE value that we need to insert.
+	 * Read the PTE from the process' radix tree and use that
+	 * so we get the shift and attribute bits.
 	 */
-	if (page) {
-		pgflags = _PAGE_READ | _PAGE_EXEC | _PAGE_PRESENT | _PAGE_PTE |
-			_PAGE_ACCESSED;
-		if (writing || upgrade_write)
-			pgflags |= _PAGE_WRITE | _PAGE_DIRTY;
-		pte = pfn_pte(pfn, __pgprot(pgflags));
+	local_irq_disable();
+	ptep = __find_linux_pte(vcpu->arch.pgdir, hva, NULL, &shift);
+	pte = *ptep;
+	local_irq_enable();
+
+	/* Get pte level from shift/size */
+	if (shift == PUD_SHIFT &&
+	    (gpa & (PUD_SIZE - PAGE_SIZE)) ==
+	    (hva & (PUD_SIZE - PAGE_SIZE))) {
+		level = 2;
+	} else if (shift == PMD_SHIFT &&
+		   (gpa & (PMD_SIZE - PAGE_SIZE)) ==
+		   (hva & (PMD_SIZE - PAGE_SIZE))) {
+		level = 1;
 	} else {
-		/*
-		 * Read the PTE from the process' radix tree and use that
-		 * so we get the attribute bits.
-		 */
-		local_irq_disable();
-		ptep = __find_linux_pte(vcpu->arch.pgdir, hva, NULL, &shift);
-		pte = *ptep;
-		local_irq_enable();
-		if (shift == PUD_SHIFT &&
-		    (gpa & (PUD_SIZE - PAGE_SIZE)) ==
-		    (hva & (PUD_SIZE - PAGE_SIZE))) {
-			level = 2;
-		} else if (shift == PMD_SHIFT &&
-			   (gpa & (PMD_SIZE - PAGE_SIZE)) ==
-			   (hva & (PMD_SIZE - PAGE_SIZE))) {
-			level = 1;
-		} else if (shift && shift != PAGE_SHIFT) {
-			/* Adjust PFN */
-			unsigned long mask = (1ul << shift) - PAGE_SIZE;
-			pte = __pte(pte_val(pte) | (hva & mask));
-		}
-		pte = __pte(pte_val(pte) | _PAGE_EXEC | _PAGE_ACCESSED);
-		if (writing || upgrade_write) {
-			if (pte_val(pte) & _PAGE_WRITE)
-				pte = __pte(pte_val(pte) | _PAGE_DIRTY);
-		} else {
-			pte = __pte(pte_val(pte) & ~(_PAGE_WRITE | _PAGE_DIRTY));
+		level = 0;
+		if (shift > PAGE_SHIFT) {
+			/*
+			 * If the pte maps more than one page, bring over
+			 * bits from the virtual address to get the real
+			 * address of the specific single page we want.
+			 */
+			unsigned long rpnmask = (1ul << shift) - PAGE_SIZE;
+			pte = __pte(pte_val(pte) | (hva & rpnmask));
 		}
 	}
 
+	pte = __pte(pte_val(pte) | _PAGE_EXEC | _PAGE_ACCESSED);
+	if (writing || upgrade_write) {
+		if (pte_val(pte) & _PAGE_WRITE)
+			pte = __pte(pte_val(pte) | _PAGE_DIRTY);
+	} else {
+		pte = __pte(pte_val(pte) & ~(_PAGE_WRITE | _PAGE_DIRTY));
+	}
+
 	/* Allocate space in the tree and write the PTE */
 	ret = kvmppc_create_pte(kvm, pte, gpa, level, mmu_seq);
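
For reference, the heart of the change is that the mapping level is now
derived from the page-size shift that __find_linux_pte() reports for the
host PTE, instead of from compound_order(), which can be stale while THP
defers splitting a compound page. Below is a minimal userspace sketch of
that level selection, assuming 4kB base pages with 2MB PMD and 1GB PUD
mappings; the constants and the helper name mapping_level() are
illustrative only, not kernel API:

#include <stdio.h>

#define PAGE_SHIFT	12
#define PMD_SHIFT	21			/* 2MB */
#define PUD_SHIFT	30			/* 1GB */
#define PAGE_SIZE	(1ul << PAGE_SHIFT)
#define PMD_SIZE	(1ul << PMD_SHIFT)
#define PUD_SIZE	(1ul << PUD_SHIFT)

/*
 * Pick the radix mapping level for a guest fault: 2 => 1GB, 1 => 2MB,
 * 0 => base page.  "shift" is the page-size shift the host page table
 * reports for hva, as __find_linux_pte() would return it.
 */
static int mapping_level(unsigned long gpa, unsigned long hva,
			 unsigned int shift)
{
	if (shift == PUD_SHIFT &&
	    (gpa & (PUD_SIZE - PAGE_SIZE)) == (hva & (PUD_SIZE - PAGE_SIZE)))
		return 2;
	if (shift == PMD_SHIFT &&
	    (gpa & (PMD_SIZE - PAGE_SIZE)) == (hva & (PMD_SIZE - PAGE_SIZE)))
		return 1;
	return 0;
}

int main(void)
{
	unsigned long gpa = 0x40200000ul;	/* guest physical address */
	unsigned long hva = 0x7f5e40200000ul;	/* host virtual address */

	/* Host maps hva with a 2MB PTE and the offsets agree: level 1 */
	printf("2MB host pte -> level %d\n",
	       mapping_level(gpa, hva, PMD_SHIFT));
	/* Host PTE is only 4kB (e.g. a THP split is pending): level 0 */
	printf("4kB host pte -> level %d\n",
	       mapping_level(gpa, hva, PAGE_SHIFT));
	return 0;
}

A large level is only usable when the host PTE actually covers that size
and gpa and hva are congruent modulo the large page size (the alignment
checks above); otherwise a single large guest PTE could not map the same
bytes the host mapping does, and the fault falls back to a base-page
mapping. Because the shift is read from the live host page table under
local_irq_disable(), it cannot overstate the mapping size the way a
compound page's order can while a THP split is deferred.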