Date: Thu, 1 Mar 2018 14:21:55 -0700
From: Alex Williamson
To: "Stephen Bates"
Cc: Bjorn Helgaas, Logan Gunthorpe, "linux-kernel@vger.kernel.org", "linux-pci@vger.kernel.org", "linux-nvme@lists.infradead.org", "linux-rdma@vger.kernel.org", "linux-nvdimm@lists.01.org", "linux-block@vger.kernel.org", "Christoph Hellwig", Jens Axboe, Keith Busch, Sagi Grimberg, Bjorn Helgaas, Jason Gunthorpe, Max Gurtovoy, Dan Williams, Jérôme Glisse, Benjamin Herrenschmidt
Subject: Re: [PATCH v2 04/10] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches
Message-ID: <20180301142155.5966c4c0@w520.home>
In-Reply-To: <0D05579B-789C-4A19-B3A2-C1A630BE31C0@raithlin.com>
References: <20180228234006.21093-1-logang@deltatee.com> <20180228234006.21093-5-logang@deltatee.com> <20180301180257.GH13722@bhelgaas-glaptop.roam.corp.google.com> <0D05579B-789C-4A19-B3A2-C1A630BE31C0@raithlin.com>

On Thu, 1 Mar 2018 18:54:01 +0000
"Stephen Bates" wrote:

> Thanks for the detailed review Bjorn!
>
> >> +	  Enabling this option will also disable ACS on all ports behind
> >> +	  any PCIe switch. This effectively puts all devices behind any
> >> +	  switch into the same IOMMU group.
> >
> > Does this really mean "all devices behind the same Root Port"?
>
> Not necessarily. You might have a cascade of switches (i.e. switches
> below a switch) to achieve a very large fan-out (in an NVMe SSD array
> for example) and we will only disable ACS on the ports below the
> relevant switch.
>
> > What does this mean in terms of device security? I assume it means,
> > at least, that individual devices can't be assigned to separate VMs.
>
> This was discussed during v1 [1]. Disabling ACS on all downstream
> ports of the switch means that all the EPs below it have to be part of
> the same IOMMU grouping. However it was also agreed that as long as
> the ACS disable occurred at boot time (which it does in v2) then the
> virtualization layer will be aware of it and will perform the IOMMU
> group formation correctly.

This is still a pretty terrible solution, though. Your kernel provider needs to decide whether they favor device assignment or p2p, because we can't do both, unless there's a patch I haven't seen yet that allows boot time rather than compile time configuration. There are absolutely supported device assignment cases of switches providing isolation between devices, allowing the downstream EPs to be used independently. I think this is a non-starter for distribution support without boot time or dynamic configuration.

I could imagine dynamic configuration through sysfs that might trigger a soft remove and rescan of the affected devices in order to rebuild the IOMMU group. The hard part might be determining at which points to allow that in order to guarantee correctness. For instance, upstream switch ports don't actually support ACS, but they'd otherwise be an obvious concentration point to trigger a reconfiguration.

Thanks,
Alex
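
For anyone who wants to see how the grouping discussed above looks on a running system, here is a minimal userspace sketch, not part of any posted patch. It assumes the usual Linux sysfs layout, where each IOMMU group appears as a numbered directory under /sys/kernel/iommu_groups with a devices/ subdirectory. The function name iommu_groups and the overridable root argument are hypothetical, chosen only for illustration.

```python
import os


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted device addresses in it.

    `root` is a parameter only so the sketch can be pointed at a copy of
    the tree; the default is the assumed standard sysfs location.
    """
    groups = {}
    if not os.path.isdir(root):
        # No IOMMU groups exposed (IOMMU disabled, or not a Linux sysfs tree).
        return groups
    for name in os.listdir(root):
        devdir = os.path.join(root, name, "devices")
        if name.isdigit() and os.path.isdir(devdir):
            groups[int(name)] = sorted(os.listdir(devdir))
    return groups


if __name__ == "__main__":
    # Print one line per group: the group number, then its member devices.
    for group, devices in sorted(iommu_groups().items()):
        print(group, " ".join(devices))
```

Comparing the output before and after a configuration change would show whether endpoints below a switch ended up in one group or stayed in separate ones.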