Subject: Re: [RFC v1 5/8] mshv: add paravirtualized IOMMU support
To: Wei Liu, Linux on Hyper-V List
Cc: virtualization@lists.linux-foundation.org, Linux Kernel List,
 Michael Kelley, Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves,
 pasha.tatashin@soleen.com, Joerg Roedel, Will Deacon,
 "K. Y. Srinivasan", Haiyang Zhang, Stephen Hemminger, Dexuan Cui,
 "open list:IOMMU DRIVERS"
References: <20210709114339.3467637-1-wei.liu@kernel.org>
 <20210709114339.3467637-6-wei.liu@kernel.org>
From: Praveen Kumar
Message-ID: <77670985-2a1b-7bbd-2ede-4b7810c3e220@linux.microsoft.com>
Date: Wed, 4 Aug 2021 00:10:45 +0530
In-Reply-To: <20210709114339.3467637-6-wei.liu@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 09-07-2021 17:13, Wei Liu wrote:
> +static void hv_iommu_domain_free(struct iommu_domain *d)
> +{
> +	struct hv_iommu_domain *domain = to_hv_iommu_domain(d);
> +	unsigned long flags;
> +	u64 status;
> +	struct hv_input_delete_device_domain *input;
> +
> +	if (is_identity_domain(domain) || is_null_domain(domain))
> +		return;
> +
> +	local_irq_save(flags);
> +	input = *this_cpu_ptr(hyperv_pcpu_input_arg);
> +	memset(input, 0, sizeof(*input));
> +
> +	input->device_domain = domain->device_domain;
> +
> +	status = hv_do_hypercall(HVCALL_DELETE_DEVICE_DOMAIN, input, NULL);
> +
> +	local_irq_restore(flags);
> +
> +	if (!hv_result_success(status))
> +		pr_err("%s: hypercall failed, status %lld\n", __func__, status);

Is it OK to deallocate the resources below if the hypercall has failed? Is there a specific error code (something like EBUSY) that we need to wait on before freeing?
> +
> +	ida_free(&domain->hv_iommu->domain_ids, domain->device_domain.domain_id.id);
> +
> +	iommu_put_dma_cookie(d);
> +
> +	kfree(domain);
> +}
> +
> +static int hv_iommu_attach_dev(struct iommu_domain *d, struct device *dev)
> +{
> +	struct hv_iommu_domain *domain = to_hv_iommu_domain(d);
> +	u64 status;
> +	unsigned long flags;
> +	struct hv_input_attach_device_domain *input;
> +	struct pci_dev *pdev;
> +	struct hv_iommu_endpoint *vdev = dev_iommu_priv_get(dev);
> +
> +	/* Only allow PCI devices for now */
> +	if (!dev_is_pci(dev))
> +		return -EINVAL;
> +
> +	pdev = to_pci_dev(dev);
> +
> +	dev_dbg(dev, "Attaching (%strusted) to %d\n", pdev->untrusted ? "un" : "",
> +		domain->device_domain.domain_id.id);
> +
> +	local_irq_save(flags);
> +	input = *this_cpu_ptr(hyperv_pcpu_input_arg);
> +	memset(input, 0, sizeof(*input));
> +
> +	input->device_domain = domain->device_domain;
> +	input->device_id = hv_build_pci_dev_id(pdev);
> +
> +	status = hv_do_hypercall(HVCALL_ATTACH_DEVICE_DOMAIN, input, NULL);
> +	local_irq_restore(flags);
> +
> +	if (!hv_result_success(status))
> +		pr_err("%s: hypercall failed, status %lld\n", __func__, status);

Does it make sense to set vdev->domain = NULL on the failure path here?
> +	else
> +		vdev->domain = domain;
> +
> +	return hv_status_to_errno(status);
> +}
> +
> +static void hv_iommu_detach_dev(struct iommu_domain *d, struct device *dev)
> +{
> +	u64 status;
> +	unsigned long flags;
> +	struct hv_input_detach_device_domain *input;
> +	struct pci_dev *pdev;
> +	struct hv_iommu_domain *domain = to_hv_iommu_domain(d);
> +	struct hv_iommu_endpoint *vdev = dev_iommu_priv_get(dev);
> +
> +	/* See the attach function, only PCI devices for now */
> +	if (!dev_is_pci(dev))
> +		return;
> +
> +	pdev = to_pci_dev(dev);
> +
> +	dev_dbg(dev, "Detaching from %d\n", domain->device_domain.domain_id.id);
> +
> +	local_irq_save(flags);
> +	input = *this_cpu_ptr(hyperv_pcpu_input_arg);
> +	memset(input, 0, sizeof(*input));
> +
> +	input->partition_id = HV_PARTITION_ID_SELF;
> +	input->device_id = hv_build_pci_dev_id(pdev);
> +
> +	status = hv_do_hypercall(HVCALL_DETACH_DEVICE_DOMAIN, input, NULL);
> +	local_irq_restore(flags);
> +
> +	if (!hv_result_success(status))
> +		pr_err("%s: hypercall failed, status %lld\n", __func__, status);
> +
> +	vdev->domain = NULL;
> +}
> +
> +static int hv_iommu_add_mapping(struct hv_iommu_domain *domain, unsigned long iova,
> +		phys_addr_t paddr, size_t size, u32 flags)
> +{
> +	unsigned long irqflags;
> +	struct hv_iommu_mapping *mapping;
> +
> +	mapping = kzalloc(sizeof(*mapping), GFP_ATOMIC);
> +	if (!mapping)
> +		return -ENOMEM;
> +
> +	mapping->paddr = paddr;
> +	mapping->iova.start = iova;
> +	mapping->iova.last = iova + size - 1;
> +	mapping->flags = flags;
> +
> +	spin_lock_irqsave(&domain->mappings_lock, irqflags);
> +	interval_tree_insert(&mapping->iova, &domain->mappings);
> +	spin_unlock_irqrestore(&domain->mappings_lock, irqflags);
> +
> +	return 0;
> +}
> +
> +static size_t hv_iommu_del_mappings(struct hv_iommu_domain *domain,
> +		unsigned long iova, size_t size)
> +{
> +	unsigned long flags;
> +	size_t unmapped = 0;
> +	unsigned long last = iova + size - 1;
> +	struct hv_iommu_mapping *mapping = NULL;
> +	struct interval_tree_node *node, *next;
> +
> +	spin_lock_irqsave(&domain->mappings_lock, flags);
> +	next = interval_tree_iter_first(&domain->mappings, iova, last);
> +	while (next) {
> +		node = next;
> +		mapping = container_of(node, struct hv_iommu_mapping, iova);
> +		next = interval_tree_iter_next(node, iova, last);
> +
> +		/* Trying to split a mapping? Not supported for now. */
> +		if (mapping->iova.start < iova)
> +			break;
> +
> +		unmapped += mapping->iova.last - mapping->iova.start + 1;
> +
> +		interval_tree_remove(node, &domain->mappings);
> +		kfree(mapping);
> +	}
> +	spin_unlock_irqrestore(&domain->mappings_lock, flags);
> +
> +	return unmapped;
> +}
> +
> +static int hv_iommu_map(struct iommu_domain *d, unsigned long iova,
> +		phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
> +{
> +	u32 map_flags;
> +	unsigned long flags, pfn, npages;
> +	int ret, i;
> +	struct hv_iommu_domain *domain = to_hv_iommu_domain(d);
> +	struct hv_input_map_device_gpa_pages *input;
> +	u64 status;
> +
> +	/* Reject size that's not a whole page */
> +	if (size & ~HV_HYP_PAGE_MASK)
> +		return -EINVAL;
> +
> +	map_flags = HV_MAP_GPA_READABLE; /* Always required */
> +	map_flags |= prot & IOMMU_WRITE ? HV_MAP_GPA_WRITABLE : 0;
> +
> +	ret = hv_iommu_add_mapping(domain, iova, paddr, size, map_flags);
> +	if (ret)
> +		return ret;
> +
> +	npages = size >> HV_HYP_PAGE_SHIFT;
> +
> +	local_irq_save(flags);
> +	input = *this_cpu_ptr(hyperv_pcpu_input_arg);
> +	memset(input, 0, sizeof(*input));
> +
> +	input->device_domain = domain->device_domain;
> +	input->map_flags = map_flags;
> +	input->target_device_va_base = iova;
> +
> +	pfn = paddr >> HV_HYP_PAGE_SHIFT;
> +	for (i = 0; i < npages; i++) {
> +		input->gpa_page_list[i] = pfn;
> +		pfn += 1;
> +	}
> +
> +	status = hv_do_rep_hypercall(HVCALL_MAP_DEVICE_GPA_PAGES, npages, 0,
> +			input, NULL);
> +
> +	local_irq_restore(flags);
> +
> +	if (!hv_result_success(status)) {
> +		pr_err("%s: hypercall failed, status %lld\n", __func__, status);
> +		hv_iommu_del_mappings(domain, iova, size);
> +	}
> +
> +	return hv_status_to_errno(status);
> +}
> +
> +static size_t hv_iommu_unmap(struct iommu_domain *d, unsigned long iova,
> +		size_t size, struct iommu_iotlb_gather *gather)
> +{
> +	size_t unmapped;
> +	struct hv_iommu_domain *domain = to_hv_iommu_domain(d);
> +	unsigned long flags, npages;
> +	struct hv_input_unmap_device_gpa_pages *input;
> +	u64 status;
> +
> +	unmapped = hv_iommu_del_mappings(domain, iova, size);
> +	if (unmapped < size)
> +		return 0;

Is there a case where unmapped > 0 && unmapped < size? If so, those mappings have already been removed from the interval tree and freed, but the function returns 0 without undoing anything.

> +
> +	npages = size >> HV_HYP_PAGE_SHIFT;
> +
> +	local_irq_save(flags);
> +	input = *this_cpu_ptr(hyperv_pcpu_input_arg);
> +	memset(input, 0, sizeof(*input));
> +
> +	input->device_domain = domain->device_domain;
> +	input->target_device_va_base = iova;
> +
> +	/* Unmap `npages` pages starting from VA base */
> +	status = hv_do_rep_hypercall(HVCALL_UNMAP_DEVICE_GPA_PAGES, npages,
> +			0, input, NULL);
> +
> +	local_irq_restore(flags);
> +
> +	if (!hv_result_success(status))
> +		pr_err("%s: hypercall failed, status %lld\n", __func__, status);
> +
> +	return hv_result_success(status) ? unmapped : 0;
> +}
> +

Regards,

~Praveen.