Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754839AbdLTK2A (ORCPT ); Wed, 20 Dec 2017 05:28:00 -0500 Received: from mail-lf0-f67.google.com ([209.85.215.67]:36699 "EHLO mail-lf0-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754404AbdLTK14 (ORCPT ); Wed, 20 Dec 2017 05:27:56 -0500 X-Google-Smtp-Source: ACJfBou6cWXNenxyx6sh2ldf8PUgE2GhobMGWovFLiXfJZc9P7QPT82ZDsWQPL2MLktEL+e7rmLajQ== Subject: Re: [PATCH V1 0/1] Fix kernel panic caused by device ID duplication presented to the IOMMU To: Robin Murphy , Tomasz Nowicki , joro@8bytes.org, will.deacon@arm.com, lorenzo.pieralisi@arm.com, bhelgaas@google.com Cc: Jayachandran.Nair@cavium.com, Ganapatrao.Kulkarni@cavium.com, linux-kernel@vger.kernel.org, iommu@lists.linux-foundation.org, linux-arm-kernel@lists.infradead.org, linux-acpi@vger.kernel.org, linux-pci@vger.kernel.org, mw@semihalf.com, stable@vger.kernel.org References: <1513696436-31834-1-git-send-email-tomasz.nowicki@caviumnetworks.com> <7f7fca76-88f8-6ee7-c402-fe4300c62253@arm.com> From: Tomasz Nowicki Message-ID: <68279889-a16d-3c8d-3a01-6b5f992c5744@semihalf.com> Date: Wed, 20 Dec 2017 11:27:52 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0 Thunderbird/52.5.0 MIME-Version: 1.0 In-Reply-To: <7f7fca76-88f8-6ee7-c402-fe4300c62253@arm.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-GB Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3662 Lines: 95 Hi Robin, On 19.12.2017 17:34, Robin Murphy wrote: > Hi Tomasz, > > On 19/12/17 15:13, Tomasz Nowicki wrote: >> Here is my lspci output of ThunderX2 for which I am observing kernel >> panic coming from >> SMMUv3 driver -> arm_smmu_write_strtab_ent() -> BUG_ON(ste_live): >> >> # lspci -vt >> -[0000:00]-+-00.0-[01-1f]--+ [...] >>                             + [...] >>                             \-00.0-[1e-1f]----00.0-[1f]----00.0 >> ASPEED Technology, Inc. ASPEED Graphics Family >> >> ASP device -> 1f:00.0 VGA compatible controller: ASPEED Technology, >> Inc. ASPEED Graphics Family >> PCI-Express to PCI/PCI-X Bridge -> 1e:00.0 PCI bridge: ASPEED >> Technology, Inc. AST1150 PCI-to-PCI Bridge >> While setting up ASP device SID in IORT dirver: >> iort_iommu_configure() -> pci_for_each_dma_alias() >> we need to walk up and iterate over each device which alias >> transaction from >> downstream devices. >> >> AST device (1f:00.0) gets BDF=0x1f00 and corresponding SID=0x1f00 from >> IORT. >> Bridge (1e:00.0) is the first alias. Following PCI Express to >> PCI/PCI-X Bridge >> spec: PCIe-to-PCI/X bridges alias transactions from downstream devices >> using >> the subordinate bus number. For bridge (1e:00.0), the subordinate is >> equal >> to 0x1f. This gives BDF=0x1f00 and SID=1f00 which is the same as >> downstream >> device. So it is possible to have two identical SIDs. The question is >> what we >> should do about such case. Presented patch prevents from registering >> the same >> ID so that SMMUv3 is not complaining later on. > > Ooh, subtle :( There is logic in arm_smmu_attach_device() to tolerate > grouped devices aliasing to the same ID, but I guess I overlooked the > distinction of a device sharing an alias ID with itself. I'm not sure > I really like trying to work around this in generic code, since > fwspec->ids is essentially opaque data in a driver-specific format - in > theory a driver is free to encode a single logical ID into multiple > fwspec elements (I think I did that in an early draft of SMMUv2 SMR > support), at which point this approach might corrupt things massively. I don't have strong favourite here, the fix in SMMUv3 driver would work too. Initially we can fix things for SMMUv3 only and if ever face the same issue for other IOMMU driver, then we can move it to generic layer. > > Does the (untested) diff below suffice? > > Robin. > > ----->8-----diff --git a/drivers/iommu/arm-smmu-v3.c > b/drivers/iommu/arm-smmu-v3.c > index f122071688fd..d8a730d83401 100644 > --- a/drivers/iommu/arm-smmu-v3.c > +++ b/drivers/iommu/arm-smmu-v3.c > @@ -1731,7 +1731,7 @@ static __le64 *arm_smmu_get_step_for_sid(struct > arm_smmu_device *smmu, u32 sid) > >  static void arm_smmu_install_ste_for_dev(struct iommu_fwspec *fwspec) >  { > -    int i; > +    int i, j; >      struct arm_smmu_master_data *master = fwspec->iommu_priv; >      struct arm_smmu_device *smmu = master->smmu; > > @@ -1739,6 +1739,13 @@ static void arm_smmu_install_ste_for_dev(struct > iommu_fwspec *fwspec) >          u32 sid = fwspec->ids[i]; >          __le64 *step = arm_smmu_get_step_for_sid(smmu, sid); > > +        /* Bridged PCI devices may end up with duplicated IDs */ > +        for (j = 0; j < i; j++) > +            if (fwspec->ids[j] == sid) > +                break; > +        if (j < i) > +            continue; > + >          arm_smmu_write_strtab_ent(smmu, sid, step, &master->ste); >      } >  } Yes, worked for me. Thanks, Tomasz