From: Bjorn Helgaas
Date: Tue, 30 Jul 2013 09:59:16 -0600
Subject: Re: [PATCH v2] PCI: Reset PCIe devices to stop ongoing DMA
To: Takao Indoh
Cc: Vivek Goyal, "linux-kernel@vger.kernel.org", "linux-pci@vger.kernel.org", "open list:INTEL IOMMU (VT-d)", "kexec@lists.infradead.org", "ishii.hironobu@jp.fujitsu.com", Don Dutile, "Sumner, William", "alex.williamson@redhat.com", Haren Myneni

On Tue, Jul 30, 2013 at 12:09 AM, Takao Indoh wrote:
> (2013/07/29 23:17), Bjorn Helgaas wrote:
>> On Sun, Jul 28, 2013 at 6:37 PM, Takao Indoh wrote:
>>> (2013/07/26 2:00), Bjorn Helgaas wrote:
>>>> My point about IOMMU and PCI initialization order doesn't go away just
>>>> because it doesn't fit "kdump policy."  Having system initialization
>>>> occur in a logical order is far more important than making kdump work.
>>>
>>> My next plan is as follows. I think this is matched to logical order
>>> on boot.
>>>
>>> drivers/pci/pci.c
>>> - Add function to reset bus, for example, pci_reset_bus(struct pci_bus *bus)
>>>
>>> drivers/iommu/intel-iommu.c
>>> - On initialization, if IOMMU is already enabled, call this bus reset
>>>   function before disabling and re-enabling IOMMU.
>>
>> I raised this issue because of arches like sparc that enumerate the
>> IOMMU before the PCI devices that use it.  In that situation, I think
>> you're proposing this:
>>
>>   panic kernel
>>     enable IOMMU
>>     panic
>>   kdump kernel
>>     initialize IOMMU (already enabled)
>>       pci_reset_bus
>>       disable IOMMU
>>       enable IOMMU
>>     enumerate PCI devices
>>
>> But the problem is that when you call pci_reset_bus(), you haven't
>> enumerated the PCI devices, so you don't know what to reset.
>
> Right, so my idea is adding reset code into "intel-iommu.c". intel-iommu
> initialization is based on the assumption that enumeration of PCI devices
> is already done. We can find target device from IOMMU page table instead
> of scanning all devices in pci tree.
>
> Therefore, this idea is only for intel-iommu. Other architectures need
> to implement their own reset code.

That's my point.  I'm opposed to adding code to PCI when it only
benefits x86 and we know other arches will need a fundamentally
different design.  I would rather have a design that can work for all
arches.

If your implementation is totally implemented under arch/x86 (or in
intel-iommu.c, I guess), I can't object as much.  However, I think
that eventually even x86 should enumerate the IOMMUs via ACPI before
we enumerate PCI devices.

It's pretty clear that's how BIOS designers expect the OS to work.
For example, sec 8.7.3 of the Intel Virtualization Technology for
Directed I/O spec, rev 1.3, shows the expectation that remapping
hardware (IOMMU) is initialized before discovering the I/O hierarchy
below a hot-added host bridge.  Obviously you're not talking about a
hot-add scenario, but I think the same sequence should apply at
boot-time as well.
Bjorn
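The ordering problem raised in this message (a bus reset issued before PCI enumeration has nothing to act on) can be illustrated with a small user-space sketch. This is not kernel code: `struct toy_bus`, `toy_enumerate()`, `toy_reset_bus()` and `toy_bus_after_panic()` are hypothetical names invented for the illustration, and the model assumes a reset can only reach devices the new kernel has already discovered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEVS 8

/* Hypothetical stand-in for a PCI bus; NOT the kernel's struct pci_bus. */
struct toy_bus {
	bool dma_active[MAX_DEVS]; /* hardware state inherited from the panicked kernel */
	bool enumerated[MAX_DEVS]; /* devices the new kernel has discovered so far */
	size_t ndevs;
};

/* Build a bus as a kdump kernel would find it: DMA running, nothing enumerated. */
static struct toy_bus toy_bus_after_panic(size_t ndevs)
{
	struct toy_bus bus = { .ndevs = ndevs };

	assert(ndevs <= MAX_DEVS);
	for (size_t i = 0; i < ndevs; i++)
		bus.dma_active[i] = true;
	return bus;
}

/* Mark every device on the bus as discovered. */
static void toy_enumerate(struct toy_bus *bus)
{
	assert(bus != NULL);
	for (size_t i = 0; i < bus->ndevs; i++)
		bus->enumerated[i] = true;
}

/*
 * Stop DMA on every device the kernel already knows about.
 * Returns the number of devices that were actually reset.
 */
static size_t toy_reset_bus(struct toy_bus *bus)
{
	size_t nreset = 0;

	assert(bus != NULL);
	for (size_t i = 0; i < bus->ndevs; i++) {
		if (bus->enumerated[i] && bus->dma_active[i]) {
			bus->dma_active[i] = false;
			nreset++;
		}
	}
	return nreset;
}
```

In this model, calling `toy_reset_bus()` on a bus returned by `toy_bus_after_panic(3)` resets nothing, because no device has been enumerated yet. Calling `toy_enumerate()` first and then `toy_reset_bus()` resets all three devices. That is the dependency the quoted boot sequence runs into on arches that initialize the IOMMU before enumerating PCI devices.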