Message-ID: <49677856.90807@intel.com>
Date: Sat, 10 Jan 2009 00:16:22 +0800
From: "Zhao, Yu"
To: Dirk Hohndel
CC: "Han, Weidong", "'Grant Grundler'",
 "'linux-pci@vger.kernel.org'", "'linux-kernel@vger.kernel.org'",
 "'Jesse Barnes'", "'iommu@lists.linux-foundation.org'",
 "'Ingo Molnar'", "'Arjan van de Ven'"
Subject: Re: git-latest: kernel oops in IOMMU setup
References: <20090108120538.0176d348@infradead.org>
 <20090108214116.GB20506@colo.lackof.org>
 <715D42877B251141A38726ABF5CABF2C018E8FEA77@pdsmsx503.ccr.corp.intel.com>
 <20090108180515.2f279671@infradead.org>
 <20090108205222.2c89dcde@infradead.org>
 <715D42877B251141A38726ABF5CABF2C018E8FECAA@pdsmsx503.ccr.corp.intel.com>
 <20090109070805.525c0de9@infradead.org>
In-Reply-To: <20090109070805.525c0de9@infradead.org>

Dirk Hohndel wrote:
> On Fri, 9 Jan 2009 14:53:14 +0800
> "Han, Weidong" wrote:
>
>> Dirk Hohndel wrote:
>>> On Thu, 8 Jan 2009 18:05:15 -0800
>>> Dirk Hohndel wrote:
>>>
>>>> On Fri, 9 Jan 2009 08:58:46 +0800 "Han, Weidong"
>>>>
>>>> I updated to Linus' latest git (as your description made me wonder
>>>> if the async stuff might play a role here).
>>>> I still get an oops -
>>>> but at a different spot and the system no longer hangs - it partly
>>>> recovers (but things aren't too well - for example my USB
>>>> keyboard / mouse don't work anymore).
>>> Spoke too soon. Rebooted and had the same hard lockup again. This
>>> time I had my camera within reach, so here's the trace:
>>>
>>> device_to_iommu+0x33/0x73
>>> domain_context_mapping_one+0x37/0x335
>>> domain_context_mapping+0x25/0xa7
>>> iommu_prepare_identity+0xd7/0xf3
>>> intel_iommu_init+0x4e4/0x8f3
>>> ? mutex_lock
>>> ? sysctl_net_init
>>> ? pci_iommu_init
>>> pci_iommu_init
>>>
>>> I also have stack, code and register values. Let me know if you need
>>> them. Or I can just post the picture :-)
>>>
>>> Again, very latest git tree, VT-d enabled.
>>>
>>> /D
>> I tried latest git tree, it works for me. Above call trace looks
>> right.
>
> Spent some more time reading the code. Can't quite claim to understand
> all of it, yet, but I notice that most everywhere else drhd->devices[i]
> is checked to be != NULL before it is accessed. Why is it safe not to
> do that in device_to_iommu()?
>
> Would the patch below be a valid fix? It stops my system from hanging at
> boot. But I wonder if there is an assertion that if drhd->ignored is 0
> then drhd->devices[0..drhd->device_cnt] is known to be != NULL and
> therefore this test is just hiding a bug somewhere else...
>
> /D
>
> Signed-off-by: Dirk Hohndel
> ---
>  drivers/pci/intel-iommu.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
> index 235fb7a..3dfecb2 100644
> --- a/drivers/pci/intel-iommu.c
> +++ b/drivers/pci/intel-iommu.c
> @@ -438,7 +438,8 @@ static struct intel_iommu *device_to_iommu(u8 bus, u8 devfn)
>  			continue;
>
>  		for (i = 0; i < drhd->devices_cnt; i++)
> -			if (drhd->devices[i]->bus->number == bus &&
> +			if (drhd->devices[i] &&
> +			    drhd->devices[i]->bus->number == bus &&
>  			    drhd->devices[i]->devfn == devfn)
>  				return drhd->iommu;
>

Did you see the following in the kernel messages?

	printk(KERN_WARNING PREFIX
		"Device scope device [%04x:%02x:%02x.%02x] not found\n",
		segment, scope->bus, path->dev, path->fn);

If yes, then

Acked-by: Yu Zhao
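
For readers following the thread, here is a minimal user-space sketch of
the guarded lookup the patch proposes. It is not the kernel code: struct
dev_stub, struct drhd_stub, iommu_id and lookup_iommu() are hypothetical
stand-ins invented for illustration, and it assumes (as the "not found"
warning above suggests, but does not prove) that a device scope entry
that could not be resolved leaves a NULL slot in the devices array.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pci_dev: only the fields compared. */
struct dev_stub {
	uint8_t bus;
	uint8_t devfn;
};

/* Hypothetical stand-in for struct dmar_drhd_unit. */
struct drhd_stub {
	struct dev_stub **devices;	/* assumed: a slot may be NULL */
	int devices_cnt;
	int ignored;
	int iommu_id;			/* stands in for drhd->iommu */
};

/* Return the iommu_id of the unit covering (bus, devfn), or -1 if none. */
int lookup_iommu(const struct drhd_stub *units, int n_units,
		 uint8_t bus, uint8_t devfn)
{
	for (int u = 0; u < n_units; u++) {
		const struct drhd_stub *drhd = &units[u];

		if (drhd->ignored)
			continue;

		for (int i = 0; i < drhd->devices_cnt; i++) {
			const struct dev_stub *dev = drhd->devices[i];

			/* Skip empty slots instead of dereferencing them. */
			if (dev && dev->bus == bus && dev->devfn == devfn)
				return drhd->iommu_id;
		}
	}
	return -1;
}
```

With a devices array such as { &a, NULL, &b }, the lookup steps over the
empty slot and still matches b. Without the NULL test, the second
iteration would dereference a NULL pointer, which would be consistent
with an oops in device_to_iommu() as in the trace above.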