From: Dan Williams
Date: Tue, 8 May 2018 16:00:33 -0700
Subject: Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches
To: Alex Williamson
Cc: Logan Gunthorpe, Stephen Bates, Christian König, Bjorn Helgaas,
 linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
 linux-nvme@lists.infradead.org, linux-rdma@vger.kernel.org,
 linux-nvdimm@lists.01.org, linux-block@vger.kernel.org,
 Christoph Hellwig, Jens Axboe, Keith Busch, Sagi Grimberg,
 Jason Gunthorpe, Max Gurtovoy, Jérôme Glisse, Benjamin Herrenschmidt
In-Reply-To: <20180508163206.7d3bf383@w520.home>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, May 8, 2018 at 3:32 PM, Alex Williamson wrote:
> On Tue, 8 May 2018 16:10:19 -0600
> Logan Gunthorpe wrote:
>
>> On 08/05/18 04:03 PM, Alex Williamson wrote:
>> > If IOMMU grouping implies device assignment (because nobody else uses
>> > it to the same extent as device assignment) then the build-time option
>> > falls to pieces, we need a single kernel that can do both. I think we
>> > need to get more clever about allowing the user to specify exactly at
>> > which points in the topology they want to disable isolation. Thanks,
>>
>>
>> Yeah, so based on the discussion I'm leaning toward just having a
>> command line option that takes a list of BDFs and disables ACS for them.
>> (Essentially as Dan has suggested.) This avoids the shotgun.
>>
>> Then, the pci_p2pdma_distance command needs to check that ACS is
>> disabled for all bridges between the two devices. If this is not the
>> case, it returns -1. Future work can check if the EP has ATS support, in
>> which case it has to check for the ACS direct translated bit.
>>
>> A user then needs to either disable the IOMMU and/or add the command
>> line option to disable ACS for the specific downstream ports in the PCI
>> hierarchy. This means the IOMMU groups will be less granular but
>> presumably the person adding the command line argument understands this.
>>
>> We may also want to do some work so that there's informative dmesgs on
>> which BDFs need to be specified on the command line so it's not so
>> difficult for the user to figure out.
>
> I'd advise caution with a user supplied BDF approach, we have no
> guaranteed persistence for a device's PCI address. Adding a device
> might renumber the buses, replacing a device with one that consumes
> more/less bus numbers can renumber the buses, motherboard firmware
> updates could renumber the buses, pci=assign-buses can renumber the
> buses, etc.
> This is why the VT-d spec makes use of device paths when
> describing PCI hierarchies, firmware can't know what bus number will be
> assigned to a device, but it does know the base bus number and the path
> of devfns needed to get to it. I don't know how we come up with an
> option that's easy enough for a user to understand, but reasonably
> robust against hardware changes. Thanks,

True, but at the same time this feature is for "users with custom
hardware designed for purpose"; I assume they would be willing to take
on the bus renumbering risk. It's already the case that
/sys/bus/pci/drivers//bind takes a BDF, which is why it seemed to make
sense to offer a similar interface on the command line.

Ideally we could later get something into ACPI or other platform
firmware to arrange for bridges to disable ACS by default if we see p2p
becoming a common off-the-shelf feature, i.e. a BIOS switch to enable
p2p in a given PCI-E sub-domain.
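
To make the BDF-versus-device-path trade-off above concrete, here is a
minimal user-space sketch in Python. It is not kernel code and not the
interface proposed in this thread. The names parse_bdf, format_bdf and
resolve_path, and the toy topology tables BEFORE and AFTER, are all
hypothetical and exist only for illustration. The sketch assumes a
topology can be modelled as a map from a bridge's (bus, dev, fn) to its
secondary bus number.

```python
import re
from typing import Dict, List, Tuple

# Matches "[domain:]bus:dev.fn" in hex, e.g. "0000:02:01.0" or "02:01.0".
BDF_RE = re.compile(
    r"^(?:([0-9a-fA-F]{4}):)?([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)

# Hypothetical toy model: bridge (bus, dev, fn) -> secondary bus number.
Topology = Dict[Tuple[int, int, int], int]


def parse_bdf(text: str) -> Tuple[int, int, int, int]:
    """Parse a PCI address string into (domain, bus, dev, fn)."""
    m = BDF_RE.match(text.strip())
    if m is None:
        raise ValueError(f"not a PCI address: {text!r}")
    domain, bus, dev, fn = m.groups()
    dev_n = int(dev, 16)
    if dev_n > 0x1F:  # device numbers are 5 bits wide
        raise ValueError(f"device number out of range: {text!r}")
    return (int(domain or "0", 16), int(bus, 16), dev_n, int(fn))


def format_bdf(domain: int, bus: int, dev: int, fn: int) -> str:
    """Format (domain, bus, dev, fn) as "dddd:bb:dd.f"."""
    return f"{domain:04x}:{bus:02x}:{dev:02x}.{fn:x}"


def resolve_path(topology: Topology, root_bus: int,
                 path: List[Tuple[int, int]]) -> Tuple[int, int, int]:
    """Follow (dev, fn) hops from root_bus and return the final (bus, dev, fn).

    Every hop except the last must be a bridge present in `topology`.
    """
    if not path:
        raise ValueError("empty device path")
    bus = root_bus
    for dev, fn in path[:-1]:
        bus = topology[(bus, dev, fn)]  # descend to the secondary bus
    dev, fn = path[-1]
    return (bus, dev, fn)


# Root port 00:1c.0 -> switch upstream port -> downstream port 1.
PATH = [(0x1C, 0), (0x00, 0), (0x01, 0)]

# Bus numbering before and after some unrelated hardware change.
BEFORE: Topology = {(0, 0x1C, 0): 1, (1, 0x00, 0): 2}
AFTER: Topology = {(0, 0x1C, 0): 3, (3, 0x00, 0): 4}

if __name__ == "__main__":
    print(format_bdf(0, *resolve_path(BEFORE, 0, PATH)))  # 0000:02:01.0
    print(format_bdf(0, *resolve_path(AFTER, 0, PATH)))   # 0000:04:01.0
```

The same path resolves to a different BDF once the buses are renumbered,
so a BDF recorded on the command line would silently point elsewhere,
while the path still identifies the same downstream port.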