Date: Fri, 01 Apr 2022 17:48:59 +0100
Message-ID: <87tubcbvgk.wl-maz@kernel.org>
From: Marc Zyngier <maz@kernel.org>
To: xieming <xieming@kylinos.cn>
Cc: linux@armlinux.org.uk, catalin.marinas@arm.com, will@kernel.org,
	alex.williamson@redhat.com, sashal@kernel.org,
	kvmarm@lists.cs.columbia.edu, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] kvm/arm64: fixed passthrough gpu into vm on arm64
In-Reply-To: <20220401090828.614167-1-xieming@kylinos.cn>
References: <20220401090828.614167-1-xieming@kylinos.cn>
X-Mailing-List: linux-kernel@vger.kernel.org

Xieming,

This is the second time I fix email addresses for you. Next time, I
simply won't bother replying.

On Fri, 01 Apr 2022 10:08:28 +0100,
xieming <xieming@kylinos.cn> wrote:
>
> when passthrough some pcie device, such as gpus(including
> Nvidia and AMD),kvm will report:"Unsupported FSC: EC=0x24
> xFSC=0x21 ESR_EL2=0x92000061" err.the main reason is vfio

I have asked you to describe how you get there, and you still haven't
bothered replying.

> ioremap vga memory type by DEVICE_nGnRnE, and kvm setting
> memory type to PAGE_S2_DEVICE(DEVICE_nGnRE), but in guestos,
> all of device io memory type when ioremapping (including gpu
> driver TTM memory type) is setting to MT_NORMAL_NC.
>
> according to ARM64 stage1&stage2 conbining rules.
> memory type attributes combining rules:
> Normal-WB DevicenGnRE Normal-WB is weakest,Device-nGnRnE is strongest.
>
> refferring to 'Arm Architecture Reference Manual Armv8,
> for Armv8-A architecture profile' pdf, chapter B2.8
> refferring to 'ARM System Memory Management Unit Architecture
> Specification SMMU architecture version 3.0 and version 3.1' pdf,
> chapter 13.1.5
>
> therefore, the I/O memory attribute of the VM is setting to
> DevicenGnRE maybe is a mistake. it causes all device memory
> accessing in the virtual machine must be aligned.
>
> To summarize: stage2 memory type cannot be stronger than stage1
> in arm64 archtechture.

You are plain wrong. It can, and most of the time, it *must*.

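For reference, the syndrome quoted in the report can be decoded by hand, and the stage 1 / stage 2 combining rule amounts to taking the stronger of the two memory types. Below is a minimal sketch in plain C, assuming the classic combining rules (HCR_EL2.FWB not in use). The names decode_esr, combine_memtype and the SKETCH_* enumerators are hypothetical, chosen for illustration, and are not kernel identifiers or the kernel's MT_* attribute indices.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fields of interest in ESR_EL2 for a data abort (hypothetical struct). */
struct esr_fields {
	unsigned int ec;	/* Exception Class, bits [31:26] */
	unsigned int dfsc;	/* Data Fault Status Code, bits [5:0] */
	bool wnr;		/* Write-not-Read, bit 6 */
};

struct esr_fields decode_esr(uint64_t esr)
{
	struct esr_fields f;

	f.ec = (esr >> 26) & 0x3f;
	f.dfsc = esr & 0x3f;
	f.wnr = (esr >> 6) & 1;
	return f;
}

/*
 * Illustrative ordering only, weakest first. These values are an
 * assumption made for the sketch, not architectural encodings.
 */
enum sketch_memtype {
	SKETCH_NORMAL_WB,
	SKETCH_NORMAL_WT,
	SKETCH_NORMAL_NC,
	SKETCH_DEVICE_GRE,
	SKETCH_DEVICE_nGRE,
	SKETCH_DEVICE_nGnRE,
	SKETCH_DEVICE_nGnRnE,
};

/* Without FWB, the resulting type is the stronger of the two stages. */
enum sketch_memtype combine_memtype(enum sketch_memtype s1,
				    enum sketch_memtype s2)
{
	return s1 > s2 ? s1 : s2;
}

/* Checks the two cases discussed in this thread. */
void sketch_selfcheck(void)
{
	struct esr_fields f = decode_esr(0x92000061);

	assert(f.ec == 0x24);	/* data abort from a lower EL */
	assert(f.dfsc == 0x21);	/* alignment fault */
	assert(f.wnr);		/* the faulting access was a write */
	assert(combine_memtype(SKETCH_NORMAL_NC, SKETCH_DEVICE_nGnRE) ==
	       SKETCH_DEVICE_nGnRE);
}
```

With the reported value, decode_esr(0x92000061) gives EC 0x24 (data abort from a lower exception level), DFSC 0x21 (alignment fault) and WnR set. combine_memtype(SKETCH_NORMAL_NC, SKETCH_DEVICE_nGnRE) gives SKETCH_DEVICE_nGnRE: a guest stage 1 Normal-NC mapping over a stage 2 Device mapping still behaves as Device memory, where unaligned accesses fault.
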
>
> Signed-off-by: xieming
> ---
>  arch/arm/include/asm/kvm_mmu.h        |  3 ++-
>  arch/arm64/include/asm/kvm_mmu.h      |  3 ++-
>  arch/arm64/include/asm/memory.h       |  4 +++-
>  arch/arm64/include/asm/pgtable-prot.h |  2 +-
>  drivers/vfio/pci/vfio_pci.c           |  7 +++++++
>  virt/kvm/arm/mmu.c                    | 19 ++++++++++++++++---
>  virt/kvm/arm/vgic/vgic-v2.c           |  2 +-
>  7 files changed, 32 insertions(+), 8 deletions(-)
>
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index 523c499e42db..5c7869d25b62 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h

This file was removed from the tree *over two years ago*.

> @@ -64,7 +64,8 @@ void stage2_unmap_vm(struct kvm *kvm);
>  int kvm_alloc_stage2_pgd(struct kvm *kvm);
>  void kvm_free_stage2_pgd(struct kvm *kvm);
>  int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
> -			  phys_addr_t pa, unsigned long size, bool writable);
> +			  phys_addr_t pa, unsigned long size,
> +			  bool writable, bool writecombine);
>
>  int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, struct kvm_run *run);
>
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index b2558447c67d..3f98286c7498 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -158,7 +158,8 @@ void stage2_unmap_vm(struct kvm *kvm);
>  int kvm_alloc_stage2_pgd(struct kvm *kvm);
>  void kvm_free_stage2_pgd(struct kvm *kvm);
>  int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
> -			  phys_addr_t pa, unsigned long size, bool writable);
> +			  phys_addr_t pa, unsigned long size,
> +			  bool writable, bool writecombine);

NAK. For a start, there is no such thing as 'write-combine' in the ARM
architecture, and I'm not convinced you can equate WC to Normal-NC.
See the previous discussion at [1].

[1] https://lore.kernel.org/r/20210429162906.32742-1-sdonthineni@nvidia.com

[...]

> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 51b791c750f1..6f66efb71743 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -1452,7 +1452,14 @@ static int vfio_pci_mmap(void *device_data, struct vm_area_struct *vma)
>  	}
>
>  	vma->vm_private_data = vdev;
> +#ifdef CONFIG_ARM64
> +	if (vfio_pci_is_vga(pdev))
> +		vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
> +	else
> +		vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);

No. That's completely unacceptable. Who says that some VGA (who the
hell implements VGA these days?) implies any sort of attribute other
than device memory? This may work for your particular device under
your own circumstances. Can it be generalised? No.

And as Jason pointed out, this is likely to break userspace.

> +#else
>  	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
> +#endif
>  	vma->vm_pgoff = (pci_resource_start(pdev, index) >> PAGE_SHIFT) + pgoff;
>
>  	/*
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index 11103b75c596..a46a58696834 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -206,6 +206,17 @@ static inline void kvm_pgd_populate(pgd_t *pgdp, pud_t *pudp)
>  	dsb(ishst);
>  }
>
> +/**
> + * is_vma_write_combine - check if VMA is mapped with writecombine or not
> + * Return true if VMA mapped with MT_NORMAL_NC otherwise fasle
> + */
> +static inline bool is_vma_write_combine(struct vm_area_struct *vma)
> +{
> +	pteval_t pteval = pgprot_val(vma->vm_page_prot);
> +
> +	return ((pteval & PTE_ATTRINDX_MASK) == PTE_ATTRINDX(MT_NORMAL_NC));
> +}

Again, you are making tons of assumptions here, none of which are
acceptable as is.

	M.

-- 
Without deviation from the norm, progress is not possible.