Message-ID: <52CCEE80.20001@linux.intel.com>
Date: Wed, 08 Jan 2014 14:21:52 +0800
From: Jiang Liu
Organization: Intel
To: Kai Huang
CC: Joerg Roedel, David Woodhouse, Yinghai Lu, Bjorn Helgaas, Dan Williams, Vinod Koul, Tony Luck, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, dmaengine@vger.kernel.org, iommu@lists.linux-foundation.org
Subject: Re: [RFC Patch Part2 V1 14/14] iommu/vt-d: update IOMMU state when memory hotplug happens
References: <1389085234-22296-1-git-send-email-jiang.liu@linux.intel.com> <1389085234-22296-15-git-send-email-jiang.liu@linux.intel.com> <52CCE9CD.5060404@linux.intel.com>

On 2014/1/8 14:14, Kai Huang wrote:
> On Wed, Jan 8, 2014 at 2:01 PM, Jiang Liu wrote:
>>
>>
>> On 2014/1/8 13:07, Kai Huang wrote:
>>> On Tue, Jan 7, 2014 at 5:00 PM, Jiang Liu wrote:
>>>> If static identity domain is created, IOMMU driver needs to update
>>>> si_domain page table when memory hotplug event happens. Otherwise
>>>> PCI device DMA operations can't access the hot-added memory regions.
>>>>
>>>> Signed-off-by: Jiang Liu
>>>> ---
>>>>  drivers/iommu/intel-iommu.c | 52 ++++++++++++++++++++++++++++++++++++++++++-
>>>>  1 file changed, 51 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>>>> index 83e3ed4..35a987d 100644
>>>> --- a/drivers/iommu/intel-iommu.c
>>>> +++ b/drivers/iommu/intel-iommu.c
>>>> @@ -33,6 +33,7 @@
>>>>  #include
>>>>  #include
>>>>  #include
>>>> +#include
>>>>  #include
>>>>  #include
>>>>  #include
>>>> @@ -3689,6 +3690,54 @@ static struct notifier_block device_nb = {
>>>>  	.notifier_call = device_notifier,
>>>>  };
>>>>
>>>> +static int intel_iommu_memory_notifier(struct notifier_block *nb,
>>>> +				       unsigned long val, void *v)
>>>> +{
>>>> +	struct memory_notify *mhp = v;
>>>> +	unsigned long long start, end;
>>>> +	struct iova *iova;
>>>> +
>>>> +	switch (val) {
>>>> +	case MEM_GOING_ONLINE:
>>>> +		start = mhp->start_pfn << PAGE_SHIFT;
>>>> +		end = ((mhp->start_pfn + mhp->nr_pages) << PAGE_SHIFT) - 1;
>>>> +		if (iommu_domain_identity_map(si_domain, start, end)) {
>>>> +			pr_warn("dmar: failed to build identity map for [%llx-%llx]\n",
>>>> +				start, end);
>>>> +			return NOTIFY_BAD;
>>>> +		}
>>>
>>> Better to use iommu_prepare_identity_map? For si_domain, if
>>> hw_pass_through is used, there's no page table.
>> Hi Kai,
>>	Good catch!
>> Seems function iommu_prepare_identity_map() is designed to handle
>> RMRRs. So how about avoiding of registering memory hotplug notifier
>> if hw_pass_through is true?
>
> I think that's also fine :)
>
> Btw, I have a related question to memory hotplug but not related to
> intel IOMMU specifically. For the devices use DMA remapping, suppose
> the device is already using the memory that we are trying to remove,
> is this case, looks we need to change the existing iova <-> pa
> mappings for the pa that is in the memory range about to be removed,
> and reset the mapping to different pa (iova remains the same). Does
> existing code have this covered?
> Is there a generic IOMMU layer memory
> hotplug notifier to handle memory removal?

That's a big issue: how to reclaim memory that is in use. The current
rule is that memory used for DMA won't be removed until it is released.

>
> -Kai
>>
>> Thanks!
>> Gerry
>>
>>>
>>>> +		break;
>>>> +	case MEM_OFFLINE:
>>>> +	case MEM_CANCEL_ONLINE:
>>>> +		/* TODO: enhance RB-tree and IOVA code to support of splitting iova */
>>>> +		iova = find_iova(&si_domain->iovad, mhp->start_pfn);
>>>> +		if (iova) {
>>>> +			unsigned long start_pfn, last_pfn;
>>>> +			struct dmar_drhd_unit *drhd;
>>>> +			struct intel_iommu *iommu;
>>>> +
>>>> +			start_pfn = mm_to_dma_pfn(iova->pfn_lo);
>>>> +			last_pfn = mm_to_dma_pfn(iova->pfn_hi + 1) - 1;
>>>> +			dma_pte_clear_range(si_domain, start_pfn, last_pfn);
>>>> +			dma_pte_free_pagetable(si_domain, start_pfn, last_pfn);
>>>> +			rcu_read_lock();
>>>> +			for_each_active_iommu(iommu, drhd)
>>>> +				iommu_flush_iotlb_psi(iommu, si_domain->id,
>>>> +					start_pfn, last_pfn - start_pfn + 1, 0);
>>>> +			rcu_read_unlock();
>>>> +			__free_iova(&si_domain->iovad, iova);
>>>> +		}
>>>
>>> The same as above. Looks we need to consider hw_pass_through for the si_domain.
>>>
>>> -Kai
>>>
>>>> +		break;
>>>> +	}
>>>> +
>>>> +	return NOTIFY_OK;
>>>> +}
>>>> +
>>>> +static struct notifier_block intel_iommu_memory_nb = {
>>>> +	.notifier_call = intel_iommu_memory_notifier,
>>>> +	.priority = 0
>>>> +};
>>>> +
>>>>  int __init intel_iommu_init(void)
>>>>  {
>>>>  	int ret = -ENODEV;
>>>> @@ -3761,8 +3810,9 @@ int __init intel_iommu_init(void)
>>>>  	init_iommu_pm_ops();
>>>>
>>>>  	bus_set_iommu(&pci_bus_type, &intel_iommu_ops);
>>>> -
>>>>  	bus_register_notifier(&pci_bus_type, &device_nb);
>>>> +	if (si_domain)
>>>> +		register_memory_notifier(&intel_iommu_memory_nb);
>>>>
>>>>  	intel_iommu_enabled = 1;
>>>>
>>>> --
>>>> 1.7.10.4
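
To make the two points discussed above concrete, here is a minimal userspace sketch. It is not code from the patch or from the kernel. It models (a) the PFN-to-byte-range arithmetic used in the MEM_GOING_ONLINE case and (b) the registration condition agreed on in the thread, namely skipping the notifier when hw_pass_through is in effect. The names SKETCH_PAGE_SHIFT, struct phys_range, pfn_range_to_phys() and want_memory_notifier() are hypothetical, and a 4 KiB page size is assumed. The sketch also widens the PFN to 64 bits before shifting, which the quoted patch does not do; that only matters where unsigned long is 32 bits wide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. In the kernel, PAGE_SHIFT is the real source. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical type: an inclusive physical byte range. */
struct phys_range {
	uint64_t start;	/* first byte of the range */
	uint64_t end;	/* last byte of the range, inclusive */
};

/*
 * Hypothetical helper modelling the MEM_GOING_ONLINE arithmetic:
 *   start = start_pfn << PAGE_SHIFT
 *   end   = ((start_pfn + nr_pages) << PAGE_SHIFT) - 1
 * The PFN is cast to 64 bits before the shift so the result is not
 * truncated when unsigned long is only 32 bits wide.
 */
static struct phys_range pfn_range_to_phys(unsigned long start_pfn,
					   unsigned long nr_pages)
{
	struct phys_range r;

	assert(nr_pages > 0);	/* an empty range has no last byte */
	r.start = (uint64_t)start_pfn << SKETCH_PAGE_SHIFT;
	r.end = (((uint64_t)start_pfn + nr_pages) << SKETCH_PAGE_SHIFT) - 1;
	return r;
}

/*
 * Hypothetical helper modelling the condition discussed in the thread:
 * only register the memory hotplug notifier when a static identity
 * domain exists and it actually has page tables to update, i.e. when
 * hardware pass-through is not used.
 */
static bool want_memory_notifier(bool have_si_domain, bool hw_pass_through)
{
	return have_si_domain && !hw_pass_through;
}
```

For example, under the 4 KiB assumption, pfn_range_to_phys(0x100000, 0x8000) describes a 128 MiB block starting at 4 GiB: start is 0x100000000 and end is 0x107ffffff.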