Date: Fri, 23 Jun 2017 19:43:10 +0800
From: Baoquan He
To: Joerg Roedel
Cc: iommu@lists.linux-foundation.org, Joerg Roedel, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] iommu/amd: Suppress IO_PAGE_FAULTs in kdump kernel
Message-ID: <20170623114310.GE20618@x1>
References: <1497600901-8993-1-git-send-email-joro@8bytes.org> <20170623085719.GD20618@x1>
In-Reply-To: <20170623085719.GD20618@x1>

Hi Joerg,

On 06/23/17 at 04:57pm, Baoquan He wrote:
> Hi dear Joerg,
>
> On 06/16/17 at 10:15am, Joerg Roedel wrote:
> > From: Joerg Roedel
> >
> > When booting into a kdump kernel, suppress IO_PAGE_FAULTs by
> > default for all devices. But allow the faults again when a
> > domain is assigned to a device.
>
> I have two bugs at hand reported by customer, saying their system hang
> with amd iommu on. I remember I borrowed the system and found it hang very
> early so that no one knew what's happened.
> One time it printed several lines
> of boot message and I found it's amd iommu system, adding amd_iommu=off
> to make the system boot normally.
>
> And with the kdump fix of amd iommu patchset applied, kdump kernel boots
> well. So maybe suppressing the fault message is not enough.

Do you think it is necessary to continue with my kdump fix patchset for
amd iommu? It seems my last post was in January this year. I know you are
very busy fixing bugs and reviewing tons of patches. Without your guidance
and review I absolutely can't make it, so I would like to hear your
suggestions and ideas.

I have been focused on kaslr issues recently, and most of them are now
fixed. My boss has discussed the next plan with me. If you have other
plans, I can sync the upstream status with our team.

Thanks
Baoquan

> > ---
> >  drivers/iommu/amd_iommu.c       | 3 ++-
> >  drivers/iommu/amd_iommu_init.c  | 9 +++++++++
> >  drivers/iommu/amd_iommu_types.h | 1 +
> >  3 files changed, 12 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> > index 80efa72..623ab53 100644
> > --- a/drivers/iommu/amd_iommu.c
> > +++ b/drivers/iommu/amd_iommu.c
> > @@ -2050,7 +2050,8 @@ static void set_dte_entry(u16 devid, struct protection_domain *domain, bool ats)
> >  		flags |= tmp;
> >  	}
> >
> > -	flags &= ~(0xffffUL);
> > +
> > +	flags &= ~(DTE_FLAG_SA | 0xffffULL);
> >  	flags |= domain->id;
> >
> >  	amd_iommu_dev_table[devid].data[1] = flags;
> > diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
> > index 5a11328..d9f5ddd 100644
> > --- a/drivers/iommu/amd_iommu_init.c
> > +++ b/drivers/iommu/amd_iommu_init.c
> > @@ -29,6 +29,7 @@
> >  #include
> >  #include
> >  #include
> > +#include
> >  #include
> >  #include
> >  #include
> > @@ -1898,6 +1899,14 @@ static void init_device_table_dma(void)
> >  	for (devid = 0; devid <= amd_iommu_last_bdf; ++devid) {
> >  		set_dev_entry_bit(devid, DEV_ENTRY_VALID);
> >  		set_dev_entry_bit(devid,
> >  				  DEV_ENTRY_TRANSLATION);
> > +		/*
> > +		 * In kdump kernels in-flight DMA from the old kernel might
> > +		 * cause IO_PAGE_FAULTs. There are no reports that a kdump
> > +		 * actually failed because of that, so just disable fault
> > +		 * reporting in the hardware to get rid of the messages
> > +		 */
> > +		if (is_kdump_kernel())
> > +			set_dev_entry_bit(devid, DEV_ENTRY_NO_PAGE_FAULT);
> >  	}
> >  }
> >
> > diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
> > index 4de8f41..4cad9b3 100644
> > --- a/drivers/iommu/amd_iommu_types.h
> > +++ b/drivers/iommu/amd_iommu_types.h
> > @@ -322,6 +322,7 @@
> >  #define IOMMU_PTE_IW (1ULL << 62)
> >
> >  #define DTE_FLAG_IOTLB	(1ULL << 32)
> > +#define DTE_FLAG_SA	(1ULL << 34)
> >  #define DTE_FLAG_GV	(1ULL << 55)
> >  #define DTE_FLAG_MASK	(0x3ffULL << 32)
> >  #define DTE_GLX_SHIFT	(56)
> > --
> > 2.7.4
> >
> > _______________________________________________
> > iommu mailing list
> > iommu@lists.linux-foundation.org
> > https://lists.linuxfoundation.org/mailman/listinfo/iommu